pandoc magic

author: ben

pandoc is an incredibly powerful tool for creating and converting documents.

i recently started using it for all kinds of different things. let's look at some of the tricks and tips that i've learned along the way.

format markdown files

this seems like it might not work, but pandoc will format your markdown files for you if you convert the source file from markdown to markdown again. for example, if you wanted to tidy up your README.md, you could run:

pandoc -f markdown -t markdown -o README.md README.md

note that if you want to preserve any yaml frontmatter blocks, you'll need to change the from and to types to markdown+yaml_metadata_block (if the frontmatter still goes missing from the output, try adding -s as well). to preserve github-flavored markdown, use gfm instead of markdown. and if you prefer atx-style headers (the # kind), add the --atx-headers switch; pandoc 2.11.2 and later write atx headers by default and spell the option --markdown-headings=atx.

a complete example:

pandoc \
    -f markdown+yaml_metadata_block \
    -t markdown+yaml_metadata_block \
    --atx-headers \
    -o README.md \
    README.md
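the gfm variant mentioned above isn't part of the complete example, so here's a rough sketch of it. notes.md is a made-up file name, and the printf line only exists to create some sample input so the snippet runs on its own.

```shell
# sketch: tidy a github-flavored markdown file in place.
# notes.md is a hypothetical file name; the printf only creates sample input.
printf '# notes\n\n* one\n* two\n' > notes.md

if command -v pandoc >/dev/null 2>&1; then
    # same round trip as before, with gfm as both the input and output format
    pandoc -f gfm -t gfm -o notes.md notes.md
else
    echo "pandoc not found; notes.md left as-is" >&2
fi
```

on a real project you'd drop the printf line and point the command at your own file.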

templates

pandoc ships with a full set of templates that are used to control how things are displayed when writing to certain formats.

let's look at the templates for html and latex. to get the default template for a given format, use -D. the template is written to stdout, so let's save it to a file.

pandoc -D html > html.template

open up this template in your editor and you'll see what's used when you convert things to html.

important note: these templates are only used when generating standalone documents (the -s or --standalone flag).

the template uses lots of variable substitutions (things like $title$). we can set values for these in yaml frontmatter or on the command line.

for basic customizations that's all it takes: pandoc fills the values into the template when we build.
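here's a small sketch of both ways to set a value. page.md and the two output file names are made up for the example; -s and -M (short for --metadata) are regular pandoc options, and a value given with -M wins over the same key in the frontmatter.

```shell
# sample document with yaml frontmatter (page.md is a hypothetical file name)
cat > page.md <<'EOF'
---
title: pandoc magic
author: ben
---

hello from the frontmatter example.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # title and author from the frontmatter are filled into the default template
    pandoc -s -o page.html page.md
    # -M sets a metadata value on the command line, overriding the frontmatter
    pandoc -s -M title="a different title" -o page-override.html page.md
else
    echo "pandoc not found; skipping the conversion" >&2
fi
```

compare the two html files to see where the title ends up.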

however, some changes go beyond what variables can do. a different page layout or extra markup around the content needs a custom template.

have a look at the wiki.tmpl used on the ~club wiki. some of the notable changes include:

  • table of contents title
  • hardcoded stylesheet
  • link to author's user page

the stylesheet change would be possible on the command line by using the -c or --css option, but the other changes require a custom template.
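to give a feel for what a custom template involves, here's a stripped-down sketch. this is not the real wiki.tmpl: mini.tmpl, about.md, the /style.css path and the /~name/ url scheme are all made up for illustration. it only shows the hardcoded stylesheet and author link ideas, passed to pandoc with --template.

```shell
# a tiny custom html template (hypothetical, not the actual wiki.tmpl)
cat > mini.tmpl <<'EOF'
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>$title$</title>
  <link rel="stylesheet" href="/style.css">
</head>
<body>
<h1>$title$</h1>
$if(author)$
<p>by <a href="/~$author$/">$author$</a></p>
$endif$
$body$
</body>
</html>
EOF

# sample input; the frontmatter feeds $title$ and $author$
cat > about.md <<'EOF'
---
title: about ben
author: ben
---

some page content.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # templates only apply to standalone output, so -s is needed
    pandoc -s --template=mini.tmpl -o about.html about.md
else
    echo "pandoc not found; skipping the conversion" >&2
fi
```

starting from the output of pandoc -D html and editing it down is probably a safer route for a real site, since the default template handles a lot more cases than this one.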

try it out on your own project!

lua filters

sometimes the rendered page still isn't what you're looking for. maybe you need some additional tweaks (css classes, extra html items, etc).

pandoc has long supported filters: programs that receive the document's internal data structures as json on a pipe and can modify them before they're written to the output format.

the main downside of these filters is that they introduce another layer of dependencies (namely the language that the filter's written in, plus that language's library for working with pandoc's json).

as of pandoc 2.0, a lua interpreter with a library for creating filters is built into the pandoc executable. let's go over some basic examples.

a common feature of html generated from markdown is a small link displayed next to each header so that you can deep-link directly to that section of the page. this is frequently seen on github and other documentation sites.

here's a very simple way to create the link:

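-- append a "§" link to every header, pointing at the header's own anchor
function Header(elem)
    table.insert(elem.content, pandoc.Space())
    -- the link text is passed as a list of inlines
    table.insert(elem.content, pandoc.Link({pandoc.Str("§")}, "#" .. elem.identifier))
    return elem
end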

save this to a file and call it in your pandoc conversion with the --lua-filter option.
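one way to try it out, as a sketch: the file names (anchor-links.lua, sample.md, sample.html) are made up, and the header-link filter is written out again via a heredoc only so the script runs on its own.

```shell
# write the header-link filter to a file (anchor-links.lua is a hypothetical name)
cat > anchor-links.lua <<'EOF'
function Header(elem)
    table.insert(elem.content, pandoc.Space())
    table.insert(elem.content, pandoc.Link({pandoc.Str("§")}, "#" .. elem.identifier))
    return elem
end
EOF

# a tiny sample document with a single header
printf '# hello\n\nsome text.\n' > sample.md

if command -v pandoc >/dev/null 2>&1; then
    # run the conversion with the filter applied
    pandoc --lua-filter=anchor-links.lua -o sample.html sample.md
else
    echo "pandoc not found; skipping the conversion" >&2
fi
```

open sample.html and the header should carry a small § link back to itself.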

change header levels

this is an example from this zine. i wanted to decrease the size of headers below h1 level without changing the external css. here's how i did it.

-- push every header below h1 down one level (h2 -> h3, h3 -> h4, ...)
function Header(elem)
    if elem.level > 1 then
        elem.level = elem.level + 1
    end
    return elem
end

this pushes every header below h1 one level down (h2 becomes h3, and so on), so they render smaller. h1 headers are left alone.

there are additional examples on pandoc.org if you'd like to see more.