I am currently playing around with Hakyll and Pandoc.
I want to create a static HTML website from Markdown sources including inline maths in LaTeX. Using pandoc-katex I was able to do the conversion with the following command:
$ pandoc -f markdown -t html --filter pandoc-katex --css "https://cdn.jsdelivr.net/npm/katex@$(pandoc-katex --katex-version)/dist/katex.min.css" --css "https://pandoc.org/demo/pandoc.css" --standalone -o output.html input.md
However, I want to use the pandoc-katex filter in Hakyll and obtain the exact same result as with the command above (for now), i.e. I want to use Pandoc's standard HTML template, make it load the two CSS files and process any available metadata in input.md in exactly the same way as the command above does.
I exported the standard HTML template as follows:
$ pandoc -D html > default-template.html
Using pandocCompilerWithTransformM, I was able to use the pandoc-katex filter:
    katexCompiler = pandocCompilerWithTransformM defaultHakyllReaderOptions defaultHakyllWriterOptions katexFilter
      where katexFilter = recompilingUnsafeCompiler
                        . runIOorExplode
                        . applyFilters noEngine def [JSONFilter "pandoc-katex"] []
Using this compiler in Hakyll, I only get the body part of the HTML file though. I searched online for solutions to this, but all the information that I find seems to refer to deprecated versions of Pandoc. Apparently there was a writerStandalone option in earlier versions of Pandoc, but it does not exist anymore (even though the command line tool still has opStandalone and the --standalone parameter used above evidently works).
What I currently do is apply the default template with loadAndApplyTemplate "templates/default-template.html" myCtx and then try to manually replicate the default context in myCtx. This is obviously not how it should be done.
Here is a somewhat minimal example of my attempt (sorry that it's still a bit lengthy - exactly that is the problem):
    {-# LANGUAGE OverloadedStrings #-}
    import Text.Pandoc
    import Text.Pandoc.Filter
    import Text.Pandoc.Scripting
    import Hakyll

    css1Item = Item (fromFilePath "css/katex.min.css") "https://cdn.jsdelivr.net/npm/[email protected]/dist/katex.min.css"
    css2Item = Item (fromFilePath "css/pandoc.css") "https://pandoc.org/demo/pandoc.css"
    authorItem = Item (fromFilePath "general") "Jon Doe"
    stylesString = "/* 15 lines of CSS */"

    myCtx :: Context String
    myCtx = dateField "date" "%B %e, %Y"
         <> constField "pagetitle" "My Title"
         <> constField "styles.html" stylesString
         <> listCtx "author" [authorItem]
         <> listCtx "author-meta" [authorItem]
         <> listCtx "css" [css1Item, css2Item]
         <> listCtx "header-includes" []
         <> listCtx "include-before" []
         <> listCtx "include-after" []
         <> defaultContext

    listCtx :: String -> [Item String] -> Context String
    listCtx name lst = listField name ctx (return lst)
      where ctx = field name (return . itemBody)

    katexCompiler = pandocCompilerWithTransformM defaultHakyllReaderOptions defaultHakyllWriterOptions katexFilter
      where katexFilter = recompilingUnsafeCompiler
                        . runIOorExplode
                        . applyFilters noEngine def [JSONFilter "pandoc-katex"] []

    main :: IO ()
    main = hakyll $ do
        match "templates/default-template.html" $ compile templateBodyCompiler
        match "input.md" $ do
            route $ setExtension ".html"
            compile $ katexCompiler
                >>= loadAndApplyTemplate "templates/default-template.html" myCtx
I have two concrete questions:
- The Item data type associates keys of type Identifier with values. The constructors for Identifier suggest that the Identifiers should be file names, but for some of the Items in my context (e.g. for the author field; see variable authorItem), having a file name as a key does not make sense. I think I misinterpreted the purpose of this type. How should I think of these Items?
- Is there a way to obtain the Context that the command line tool uses when making the conversion? The default Context seems to be a lot more involved than my quick draft, e.g. it reads the abstract from the metadata of the Markdown file and puts every paragraph in between separate <p> ... </p> HTML tags. I know there is a metadataField :: Context a, but it does not seem to be what I want.
Apart from these concrete questions, the general question is:
- Am I doing this right at all, or is there a much simpler way of doing what I am trying to do (i.e. replicating the output of the initial pandoc shell command in Haskell with Hakyll)?
1 Answer
The nicest way to do that is probably using writerTemplate in Pandoc's WriterOptions to pass the default template, as given by compileDefaultTemplate:
    main :: IO ()
    main = do
        pandocTmpl <- runIOorExplode $ compileDefaultTemplate "html"
        let katexOpts = defaultHakyllWriterOptions
                { writerTemplate = Just pandocTmpl
                , writerHTMLMathMethod = KaTeX ""
                -- And whatever else you need.
                }
            -- Defining it this way because pandocCompilerWith strips
            -- the metadata block before handing the body to Pandoc.
            --
            -- I'm relying on Pandoc's built-in KaTeX support. If
            -- you'd rather stick with the pandoc-katex filter, you
            -- can use renderPandocWithTransformM to reshape the
            -- compiler you defined in the question in this fashion.
            katexCompiler = do
                fullItem <- getResourceString
                renderPandocWith defaultHakyllReaderOptions katexOpts fullItem
        hakyll $ do
            -- etc.
            match "input.md" $ do
                route $ setExtension ".html"
                compile katexCompiler
See also pandoc issue #10209, which points to a similar approach.
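For readers who prefer to keep the pandoc-katex filter, the reshaping hinted at in the code comments above might look roughly like the following. This is only a sketch, assuming Pandoc 3 and a Hakyll version that exports renderPandocWithTransformM; katexFilter is the filter from the question given an explicit type, and the pandoc-katex executable is assumed to be on the PATH.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
import Text.Pandoc
import Text.Pandoc.Filter (Filter (JSONFilter), applyFilters)
import Text.Pandoc.Scripting (noEngine)

-- The JSON filter from the question, run on the parsed document.
katexFilter :: Pandoc -> Compiler Pandoc
katexFilter = recompilingUnsafeCompiler
            . runIOorExplode
            . applyFilters noEngine def [JSONFilter "pandoc-katex"] []

main :: IO ()
main = do
    -- Compile Pandoc's built-in HTML template once, before Hakyll starts.
    pandocTmpl <- runIOorExplode $ compileDefaultTemplate "html"
    let writerOpts = defaultHakyllWriterOptions
            { writerTemplate = Just pandocTmpl }
        -- getResourceString keeps the metadata block, so Pandoc
        -- (rather than Hakyll) gets to process it.
        katexCompiler =
            getResourceString
                >>= renderPandocWithTransformM
                        defaultHakyllReaderOptions writerOpts katexFilter
    hakyll $
        match "input.md" $ do
            route $ setExtension ".html"
            compile katexCompiler
```

Since the filter does the math rendering here, writerHTMLMathMethod is left at its default in this sketch.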
Side questions:
> How should I think of these Items?
Item indeed is primarily meant for things bound to a file path in your site tree. Occasionally, it makes sense to use a fake path for the identifier, for instance when synthesising some content with a create rule. However, that's not typically something one would want to do for the sake of setting a context field, as there likely are more straightforward ways to do that. (In particular, if, unlike in this answer, you are using Hakyll's templates, you don't have to explicitly define the fields that you include in the metadata headers of your source files, as Hakyll's defaultContext covers that already by including metadataField.)
> Is there a way to obtain the Context that the command line tool uses, when making the conversion?
While Pandoc offers ways to manipulate its own metadata (which I have never used myself; Text.Pandoc.Writers.Shared might be a good place to start browsing), the template systems of Pandoc and Hakyll are similar-looking but distinct, and in particular Hakyll's Context type is not the same as its Pandoc counterpart.
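As one possible starting point on the Pandoc side, template variables such as css can be supplied through the writerVariables field of WriterOptions, which is my understanding of what the command line tool does for --css. The sketch below assumes that behaviour and uses defField from Text.Pandoc.Writers.Shared; standaloneOpts is a hypothetical helper name, and the small Markdown document is made up for the demonstration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text.IO as TIO
import Text.Pandoc
import Text.Pandoc.Writers.Shared (defField)

-- Writer options carrying a compiled template plus a "css" template
-- variable (a list, matching the $for(css)$ loop in the HTML template).
standaloneOpts :: Template Text -> [Text] -> WriterOptions
standaloneOpts tmpl cssUrls = def
    { writerTemplate  = Just tmpl
    , writerVariables = defField "css" cssUrls mempty
    }

main :: IO ()
main = do
    html <- runIOorExplode $ do
        tmpl <- compileDefaultTemplate "html"
        -- A tiny document with a YAML metadata block, parsed by Pandoc.
        doc  <- readMarkdown def { readerExtensions = pandocExtensions }
                    ("---\ntitle: Demo\n---\n\nSome *text*.\n" :: Text)
        writeHtml5String
            (standaloneOpts tmpl ["https://pandoc.org/demo/pandoc.css"]) doc
    TIO.putStrLn html
```

In a Hakyll site the same two fields could be set with a record update on defaultHakyllWriterOptions instead of def.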
On a final note, it is worth mentioning that if you were completely stuck trying to reproduce Pandoc's output within Hakyll, a last resort would be using unixFilter to set up a compiler that shells out to command-line Pandoc.
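A rough sketch of that last resort follows, assuming pandoc and pandoc-katex are both on the PATH and that the process package is available for readProcess. The names pandocArgs and pandocCliCompiler are illustrative, not part of Hakyll; the arguments mirror the command from the question.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
import System.Process (readProcess)

-- Arguments mirroring the command from the question. With no input
-- file and no -o, pandoc reads stdin and writes stdout.
pandocArgs :: String -> [String]
pandocArgs katexVersion =
    [ "-f", "markdown", "-t", "html"
    , "--filter", "pandoc-katex"
    , "--css", "https://cdn.jsdelivr.net/npm/katex@" ++ katexVersion
                   ++ "/dist/katex.min.css"
    , "--css", "https://pandoc.org/demo/pandoc.css"
    , "--standalone"
    ]

-- Pipe the whole source file, metadata block included, through pandoc.
pandocCliCompiler :: String -> Compiler (Item String)
pandocCliCompiler katexVersion =
    getResourceString
        >>= withItemBody (unixFilter "pandoc" (pandocArgs katexVersion))

main :: IO ()
main = do
    -- Stands in for $(pandoc-katex --katex-version) in the shell command;
    -- the trailing newline is dropped.
    katexVersion <- filter (/= '\n')
        <$> readProcess "pandoc-katex" ["--katex-version"] ""
    hakyll $
        match "input.md" $ do
            route $ setExtension ".html"
            compile $ pandocCliCompiler katexVersion
```

The trade-off is that every page spawns an external process, and the site build now depends on the installed command-line tools rather than on the Pandoc library Hakyll was compiled against.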
Source: haskell - Pandoc 3: Obtain and modify the default context for Markdown to HTML conversion in Hakyll 4? - Stack Overflow
Comments:
- [...] writerStandalone no longer existing, you can nonetheless specify a template through the writerTemplate option. It's not clear that would be significantly better than your approach, though. – duplode, Nov 21, 2024 at 21:47
- [...] writerTemplate to this extent in a Hakyll site.) – duplode, Nov 21, 2024 at 22:13
- [...] getDefaultTemplate "html" or something like that to get the template, but I'm still trying to figure out how to do it. My current attempts do not type check yet. – user11718766, Nov 21, 2024 at 22:19
- [...] ghc says "No instance for (PandocMonad Template) arising from a use of 'getDefaultTemplate'" when I try setting {writerTemplate = Just (getDefaultTemplate "html")}. I don't see what the problem could be. The type of the writerTemplate field should be Maybe (Template Text) and getDefaultTemplate has type Text -> m Text. Any ideas? – user11718766, Nov 21, 2024 at 22:51
- [...] getDefaultTemplate is PandocMonad m => Text -> m Text so you need to run that with, for instance, runIOorExplode. – duplode, Nov 21, 2024 at 23:14
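To make the type mismatch from these comments concrete: getDefaultTemplate is a monadic action that yields the raw template source as Text, whereas writerTemplate wants an already compiled Template Text, which compileDefaultTemplate provides. A minimal sketch, assuming Pandoc 3 and using a made-up one-paragraph document:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Text.Pandoc

main :: IO ()
main = do
    -- Running the action gives the template *source*, not a Template.
    rawTmpl <- runIOorExplode $ getDefaultTemplate "html"
    putStrLn $ "raw template: " ++ show (T.length rawTmpl) ++ " characters"

    -- compileDefaultTemplate gives the compiled Template Text that
    -- the writerTemplate field expects.
    tmpl <- runIOorExplode $ compileDefaultTemplate "html"
    let opts = def { writerTemplate = Just tmpl }
    html <- runIOorExplode $
        writeHtml5String opts (Pandoc nullMeta [Para [Str "Hello"]])
    TIO.putStrLn html
```

Pandoc may print a warning about the missing title on stderr for this document; supplying a title in the metadata avoids that.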