

Overview

Pandoc is a command-line tool for converting documents between a wide range of formats, including Markdown, HTML, LaTeX (PDF), Word, Jupyter notebooks (.ipynb), and PowerPoint. It also handles more complex content, such as LaTeX math, document metadata, and tables, which makes it suitable for most document conversion tasks.

Summary

Key features are:

  • Easy and quick to use, directly from the command line
  • Converts numerous document formats
  • Free and open-source
  • Highly customizable with extensions
  • Allows custom templates for consistent formatting
  • Supports citations and bibliographies
  • Creates slideshows with LaTeX Beamer or PowerPoint

For detailed information on all available options, refer to the Pandoc User Guide. To quickly find specific topics in this extensive document, use the Search (Ctrl+F) function!

How to install Pandoc

Refer to this TSH Guide for instructions on setting up Pandoc.

How to use it

To demonstrate how to use Pandoc, we’ll convert a basic Markdown file to a PDF.

  1. Save the following content in a file named example.md:
# My First Document

This is a simple markdown document.

## Section 1

Here is some text in section 1.

## Section 2

Here is some text in section 2.
Warning

Pandoc uses LaTeX to create PDFs by default, so you need to have a LaTeX engine installed. Refer to our LaTeX Set up Guide for setup instructions.

If you prefer not to use LaTeX, alternative tools are available. For more information, refer to the Creating a PDF section of the Pandoc User Guide.

  2. Open a terminal in the directory where you saved the file and run the following command:
pandoc example.md -s -o output.pdf 
  • example.md is the input file
  • -o specifies the output file (named output.pdf)
  • -s (or --standalone) tells Pandoc to create a standalone document rather than a fragment. Pandoc applies the default template for the output format and adds the necessary header and footer material.
  3. Check your directory to find the newly created PDF document output.pdf.

The Markdown content is now beautifully formatted into a PDF!
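
The Warning above notes that LaTeX is only the default route to a PDF. As a minimal sketch, the commands below pass a different program to Pandoc's --pdf-engine option. Which engines are installed on your machine is an assumption: the script looks for weasyprint and wkhtmltopdf and prints a message if neither is found. The file name demo-engine.md is hypothetical and used only for this illustration.

```shell
# Hypothetical input file, created only for this illustration.
cat > demo-engine.md <<'EOF'
# Engine test

A short paragraph to render.
EOF

# Look for a non-LaTeX PDF engine on this machine (assumed candidates).
engine=""
for candidate in weasyprint wkhtmltopdf; do
  if command -v "$candidate" >/dev/null 2>&1; then
    engine="$candidate"
    break
  fi
done

if command -v pandoc >/dev/null 2>&1 && [ -n "$engine" ]; then
  # -t html makes the HTML intermediate format explicit;
  # --pdf-engine names the program that turns it into a PDF.
  pandoc demo-engine.md -s --metadata title="Engine test" \
    -t html --pdf-engine="$engine" -o demo-engine.pdf \
    || echo "Conversion with $engine failed; check that engine's installation."
else
  echo "pandoc or an HTML-based PDF engine was not found; nothing converted."
fi
```

Because these engines render HTML, the PDF will look like a printed web page rather than a LaTeX document.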

Tip

You can provide multiple input files. By default, Pandoc concatenates them, with blank lines in between, before parsing them as one document. Use --file-scope to parse each file individually; the result is still a single output document.
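
The difference between the two modes can be sketched with two small files that both reuse the footnote label [^1]. The file names and the title are hypothetical, chosen only for this illustration, and the conversion step is skipped if pandoc is not installed.

```shell
# Two hypothetical chapter files; both reuse the footnote label [^1].
cat > chapter1.md <<'EOF'
# Chapter 1

First claim.[^1]

[^1]: Note for chapter 1.
EOF

cat > chapter2.md <<'EOF'
# Chapter 2

Second claim.[^1]

[^1]: Note for chapter 2.
EOF

if command -v pandoc >/dev/null 2>&1; then
  # Default: the files are joined before parsing, so both [^1] labels
  # share one namespace and the duplicate label is ambiguous.
  pandoc chapter1.md chapter2.md -s --metadata title="Book" -o combined.html
  # --file-scope: each file is parsed on its own, so each [^1] stays
  # attached to its own chapter. The output is still one document.
  pandoc --file-scope chapter1.md chapter2.md -s --metadata title="Book" \
    -o combined-scoped.html
else
  echo "pandoc not found; skipping conversion."
fi
```

Comparing the footnotes in combined.html and combined-scoped.html shows whether --file-scope matters for your own files.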

Useful functions and use cases

Pandoc offers a wide range of options, which you can find in the Options sections of the User Guide. Here are a few of the most useful features:

Customized templates

Pandoc allows custom templates to control the look of your documents.

Run pandoc -D followed by an output format to print the default template for that format. Because PDFs are produced via LaTeX by default, the template behind PDF output is the LaTeX one:

pandoc -D latex

To use a custom template, create mytemplate.tex yourself (you can adjust the default template file, for example). Then, run the following command, specifying the template with --template=:

pandoc -s --template=mytemplate.tex -o example.pdf example.md
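
To show what a template looks like inside, here is a minimal sketch that uses HTML output instead of LaTeX so it runs without a LaTeX engine. The file names mini-template.html and template-demo.md are hypothetical. The $title$, $author$, and $body$ placeholders are template variables that Pandoc fills in from the document's metadata and content.

```shell
# A deliberately tiny HTML template (hypothetical file name).
# Text between dollar signs is replaced by Pandoc during conversion.
cat > mini-template.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>$title$</title>
</head>
<body>
<h1>$title$</h1>
$if(author)$
<p class="author">$for(author)$$author$$sep$, $endfor$</p>
$endif$
$body$
</body>
</html>
EOF

# A small input document with metadata for the template to use.
cat > template-demo.md <<'EOF'
---
title: "Template demo"
author: "Author name"
---

Some body text.
EOF

if command -v pandoc >/dev/null 2>&1; then
  # Save the default HTML template for comparison with the tiny one.
  pandoc -D html > default-template.html
  # Convert using the custom template.
  pandoc template-demo.md -s --template=mini-template.html -o template-demo.html
else
  echo "pandoc not found; skipping conversion."
fi
```

The same workflow applies to LaTeX: save the output of pandoc -D latex to a file, edit it, and pass it with --template.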

Citations and bibliographies

Pandoc supports citations and bibliographies, which are essential in academic writing. You can use --citeproc to process citations. For example, save the following markdown content as example-citations.md.


---
title: "Sample document"
author: "Author name"
bibliography: "references.bib"
---

# Introduction

This is a sample document to demonstrate how Pandoc processes citations. 
Here is a citation to a book [@doe2020book].

# Method

The method used in this research is based on [@smith2019article].

# Results

The results were consistent with those found in earlier studies [@johnson2018study].

# References

Then save the following content in a bibliography file called references.bib:


@book{doe2020book,
  title     = {Example Book Title},
  author    = {John Doe},
  year      = {2020},
  publisher = {Publisher Name},
}

@article{smith2019article,
  title     = {Example Article Title},
  author    = {Jane Smith},
  journal   = {Journal Name},
  year      = {2019},
  volume    = {10},
  number    = {2},
  pages     = {123--456},
}

@article{johnson2018study,
  title     = {Another Study Title},
  author    = {Alex Johnson},
  journal   = {Another Journal},
  year      = {2018},
  volume    = {5},
  number    = {1},
  pages     = {789--1011},
}

Run this command to include the references:

pandoc --citeproc example-citations.md -o output-citations.pdf 

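Pandoc creates a PDF in which each citation is resolved and a reference list, built from the bibliography file named in the YAML header of the Markdown file, appears under the final References heading.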

Tip

To generate LaTeX output with bibtex (when converting to a .tex file instead of a PDF directly), replace --citeproc with --natbib.
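
The --natbib route from the tip above could look roughly like this. It is a sketch, not a definitive workflow: the file names natbib-demo.md and natbib-demo.bib are hypothetical, and the pdflatex and bibtex steps assume a LaTeX installation that provides both programs and the natbib package.

```shell
# Hypothetical one-citation document and bibliography.
cat > natbib-demo.md <<'EOF'
---
title: "Natbib demo"
bibliography: "natbib-demo.bib"
---

A single citation [@doe2020book].
EOF

cat > natbib-demo.bib <<'EOF'
@book{doe2020book,
  title     = {Example Book Title},
  author    = {John Doe},
  year      = {2020},
  publisher = {Publisher Name},
}
EOF

if command -v pandoc >/dev/null 2>&1; then
  # Write LaTeX (not PDF) and leave citation handling to natbib.
  pandoc natbib-demo.md -s --natbib -o natbib-demo.tex
else
  echo "pandoc not found; skipping conversion."
fi

# Usual LaTeX + BibTeX cycle: LaTeX, BibTeX, then LaTeX twice more.
if [ -f natbib-demo.tex ] && command -v pdflatex >/dev/null 2>&1 \
   && command -v bibtex >/dev/null 2>&1; then
  pdflatex -interaction=nonstopmode natbib-demo.tex >/dev/null
  bibtex natbib-demo >/dev/null
  pdflatex -interaction=nonstopmode natbib-demo.tex >/dev/null
  pdflatex -interaction=nonstopmode natbib-demo.tex >/dev/null \
    || echo "LaTeX run failed; inspect natbib-demo.log."
fi
```

Opening natbib-demo.tex in an editor shows the citation commands Pandoc wrote in place of the [@...] syntax.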

Math rendering

Pandoc can render mathematical expressions using LaTeX, MathML, or other methods. To render math with KaTeX when converting a Markdown file to HTML, add the --katex option to the command. For example, save the following content in a file named example-maths.md.

---
title: "Math examples"
author: "Author name"
---

This document contains some math examples.

## Inline Math

Einstein's equation: $E = mc^2$

## Display Math

Integral:

$$
\int_{a}^{b} f(x) \, dx = F(b) - F(a)
$$


Quadratic formula:

$$
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
$$

The following command will convert the markdown file to HTML:

pandoc example-maths.md -s -o output-maths.html --katex
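
KaTeX is one of several HTML math methods. As a sketch of two others, the commands below convert the same small file with --mathml and with --mathjax. The file name math-demo.md and the title are hypothetical, and the conversion is skipped if pandoc is not installed.

```shell
# Hypothetical input with one inline and one display formula.
cat > math-demo.md <<'EOF'
Inline math: $E = mc^2$.

Display math:

$$
\int_{a}^{b} f(x) \, dx = F(b) - F(a)
$$
EOF

if command -v pandoc >/dev/null 2>&1; then
  # --mathml: formulas are embedded as MathML and drawn by the browser.
  pandoc math-demo.md -s --metadata title="Math demo" --mathml \
    -o math-mathml.html
  # --mathjax: like --katex, but the page loads the MathJax script instead.
  pandoc math-demo.md -s --metadata title="Math demo" --mathjax \
    -o math-mathjax.html
else
  echo "pandoc not found; skipping conversion."
fi
```

The MathML version needs no script at all, while the MathJax and KaTeX versions load their script from the web when the page is opened.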

Code blocks and syntax highlighting

Pandoc supports syntax highlighting for various programming languages. You can specify the language for each code block to ensure proper highlighting. Below is a Markdown file with code blocks for R and Python. Let’s see how this file converts to PDF!

Save the following content in a file named example-code.md:

---
title: "Code examples"
author: "Author name"
---

This document contains some code blocks for R and Python.

## Python code example

~~~python
# A simple Python function
def greet(name):
    return f"Hello, {name}!"

print(greet("World"))
~~~

## R code example

~~~R
outcome <- 1 + 1
print(outcome)
~~~

Use the following command to convert the Markdown file to a PDF:

pandoc example-code.md -s -o output-code.pdf

This command produces a PDF in your directory with syntax-highlighted code blocks for both R and Python.
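
The colour scheme can be changed as well. The sketch below lists the available styles and languages, then converts a small file with the tango style. It writes HTML so that no LaTeX engine is needed, and highlight-demo.md is a hypothetical file name. The --highlight-style option is assumed to be available in your Pandoc version; recent releases may print a deprecation notice and suggest a newer option instead.

```shell
# Hypothetical input with one Python and one R code block.
cat > highlight-demo.md <<'EOF'
~~~python
print("Hello, World!")
~~~

~~~r
outcome <- 1 + 1
~~~
EOF

if command -v pandoc >/dev/null 2>&1; then
  # Show the names of the built-in highlighting styles.
  pandoc --list-highlight-styles
  # Check that the two languages used above are supported.
  pandoc --list-highlight-languages | grep -x -e python -e r
  # Convert with a specific style.
  pandoc highlight-demo.md -s --metadata title="Highlight demo" \
    --highlight-style=tango -o highlight-demo.html
else
  echo "pandoc not found; skipping conversion."
fi
```

The same style option can be added to the PDF command above.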

Extensions

Pandoc offers a variety of useful extensions that allow you to tailor and customize its features to suit your needs. Below is a list of some of the most useful extensions for Pandoc, along with brief descriptions. For a complete list of options, refer to the Extensions section of the User Guide and the specific Markdown Extensions section.

| Extension | Description | Formats |
|---|---|---|
| smart | Improves readability by enabling smart typography. Converts straight quotes (") to curly quotes (“ ”), -- to en dashes (–), --- to em dashes (—), and three dots to ellipses (…) | Markdown, CommonMark, LaTeX, and others |
| autolink_bare_uris | Automatically turns bare URIs into clickable links. | Markdown, CommonMark, GFM |
| footnotes | Adds support for footnotes. | Markdown, CommonMark, GFM |
| grid_tables | Adds support for grid tables. | Markdown |
| task_lists | Adds support for creating task lists with interactive checkboxes. | Markdown, GFM |
| table_captions | Enables table captions, making it easy to add descriptive titles to tables. | Markdown |
| tex_math_dollars | Allows LaTeX math between dollar signs, simplifying the inclusion of math formulas in documents. | Markdown |
| pipe_tables | Adds support for pipe tables, providing an easy way to create simple tables. | Markdown |
| implicit_figures | Treats images as figures when they are the only element in a paragraph, adding <figure> and <figcaption> tags for better display. | Markdown |

How to add an extension

To enable an extension, append it to the format name with a + (a - disables one instead). For instance, to use the footnotes extension with the Markdown format, add it to the format given to the -f or --from flag:

pandoc input.md -s -o output.pdf -f markdown+footnotes
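
The effect of switching an extension on or off is easiest to see on a small input. This sketch converts one line containing a bare URL to HTML and prints the result to the terminal. The file name ext-demo.md is hypothetical, and the conversion is skipped if pandoc is not installed.

```shell
# Hypothetical input containing a bare URL and straight quotes.
cat > ext-demo.md <<'EOF'
Visit https://pandoc.org for the "manual".
EOF

if command -v pandoc >/dev/null 2>&1; then
  # Plain Pandoc Markdown.
  pandoc ext-demo.md -f markdown -t html
  # + enables an extension: the bare URL becomes a link.
  pandoc ext-demo.md -f markdown+autolink_bare_uris -t html
  # - disables an extension: quotes are no longer converted to curly quotes.
  pandoc ext-demo.md -f markdown-smart -t html
  # List every extension for a format; + or - marks its default state.
  pandoc --list-extensions=markdown
else
  echo "pandoc not found; skipping conversion."
fi
```

Several extensions can be chained in one format string, for example markdown+autolink_bare_uris-smart.
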
Summary

Pandoc is a free, open-source tool for converting documents across a wide range of formats. It is easy and quick to use directly from the command line, and it is highly customizable. This article highlights some use cases and provides interesting examples, so you are ready to start experimenting yourself!

Contributed by Valerie Vossen