Pandocr Manual
Use Pandoc in Crystal Programs
Huub de Beer
August 2024
Pandocr is a pandoc wrapper written in the crystal programming language. Use pandocr to unleash the power of pandoc in your crystal programs.
Pandocr is free software; pandocr is released under the EUPL 1.2 or later. Find pandocr’s source code hosted on Codeberg at https://codeberg.org/CampineComputing/pandocr.
If you have questions, comments, or suggestions for improvement, or if you found a bug, please create a new issue on Codeberg or write me an email.
Find pandocr’s API documentation at: https://campinecomputing.eu/pandocr/docs/.
1 Installation
Add pandocr to the dependencies section of your crystal program’s
    shard.yml configuration file:
dependencies:
  # ...
  pandocr:
    git: git@codeberg.org:CampineComputing/pandocr.git
Then run shards install to download pandocr and install it into your project’s lib directory. This will not install pandoc; you have to install pandoc separately.
2 Usage
2.1 First example: Hello World!
The next “Hello, World!” example program shows the typical four-step approach to using pandocr in your programs:
require "pandocr"
pandoc = Pandocr::Converter.new
pandoc.from = Pandocr::OptionType::FromFormat::Markdown
pandoc.to = Pandocr::OptionType::ToFormat::Latex
result = pandoc.convert("Hello, **World**!")
puts result
# => Writes "Hello, \textbf{World}!" to STDOUT
- Create a new pandoc converter;
- Configure the converter by setting pandoc options;
- Convert a string or file with pandoc;
- Use the result.
Pandocr has no bearing on step four; that step is up to you! I discuss the first three steps in separate sections next.
2.2 Create a new pandoc converter
Create a new pandoc converter with
    Pandocr::Converter.new.
By default, the Converter uses the pandoc
    program in your PATH. It’ll call that pandoc program when
    you run one of Converter’s convert methods.
Sometimes you want to use another pandoc executable, for example when you want a conversion to be reproducible with a specific version of pandoc. You can change the pandoc executable that the Converter uses by passing its path to the constructor.
Demonstration:
require "pandocr"
pandoc = Pandocr::Converter.new
puts pandoc.command # => pandoc
pandoc = Pandocr::Converter.new "/opt/pandoc/1.2.1-1/pandoc"
puts pandoc.command # => /opt/pandoc/1.2.1-1/pandoc
    On the other hand, if you want all Converters in your
    program to use a different pandoc executable, override getter
    Pandocr::Converter.command instead:
require "pandocr"
def Pandocr::Converter.command
  "/an/alternative/pandoc"
end
pandoc = Pandocr::Converter.new
puts pandoc.command # => /an/alternative/pandoc
2.3 Use pandoc’s command-line options in pandocr
Pandocr::Converter supports all pandoc command-line
    options. See the Options section in
    pandoc’s manual for a complete overview of pandoc’s command-line
    options and their use.
To convert a pandoc command-line option to the corresponding option in pandocr, apply the following procedure:
- When pandoc’s command-line option has both a long and a short version, pick the long version.
- Remove the prefix “--” from the command-line option.
- Replace all dashes (“-”) by underscores (“_”).
For example, “-f” becomes from;
    “--to” becomes to;
    “--base-header-level” becomes
    base_header_level.
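The procedure is mechanical enough to sketch in a few lines of crystal. The helper below is hypothetical and not part of pandocr. Its SHORT_TO_LONG table is an assumption that covers only a handful of pandoc’s short options for illustration.

```crystal
# Hypothetical helper, not part of pandocr: derives the pandocr option
# name from a pandoc command-line option.

# Illustrative subset of pandoc's short options and their long versions.
SHORT_TO_LONG = {
  "-f" => "--from",
  "-t" => "--to",
  "-o" => "--output",
  "-s" => "--standalone",
  "-c" => "--css",
}

def pandocr_option_name(option : String) : String
  # Step 1: prefer the long version when a known short one was given.
  long = SHORT_TO_LONG.fetch(option, option)
  # Step 2: remove the "--" prefix. Step 3: replace dashes by underscores.
  long.lchop("--").tr("-", "_")
end

puts pandocr_option_name("-f")                  # => from
puts pandocr_option_name("--to")                # => to
puts pandocr_option_name("--base-header-level") # => base_header_level
```

In practice you apply this procedure by hand when writing your program; the sketch only makes the three steps explicit.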
All options share the following behavior:
- To see if an option has been set or not, use query #set?.
- Unset, or remove, an option with #delete!.
- Assign a value to an option with setter #value=; get an option’s value with getter #value.
Pandocr distinguishes four types of pandoc command-line options, each with a slightly different interface. I discuss each type and its programming interface next.
2.3.1 Switch options
Switch options: command-line options that toggle a feature. For
    example, “--version” or “--standalone”.
In pandocr, these options have methods #on and
    #off to switch them on or off. When switched on, their
    #value method responds with true; otherwise
    false.
Use in pandocr:
require "pandocr"
pandoc = Pandocr::Converter.new
puts pandoc.standalone.set?  # => false
pandoc.standalone.on
puts pandoc.standalone.set?  # => true
puts pandoc.standalone.value # => true
pandoc.standalone.off
puts pandoc.standalone.set?  # => false
puts pandoc.standalone.value # => false
pandoc.standalone.delete!
puts pandoc.standalone.set?  # => false
puts pandoc.standalone.value # => false
2.3.2 Value options
Value options: command-line options that set a value. For example, “--pdf-engine=lualatex” or “--data-dir ~/my/data/dir”.
require "pandocr"
pandoc = Pandocr::Converter.new
puts pandoc.data_dir.set?  # => false
pandoc.data_dir = Path.new "~/my/data/dir"
puts pandoc.data_dir.set?  # => true
puts pandoc.data_dir.value # => ~/my/data/dir
pandoc.data_dir = Path.new "/an/other/dir"
puts pandoc.data_dir.value # => /an/other/dir
    When you reassign an option’s value, the earlier value is overwritten.
2.3.3 Array options
Array options: command-line options that you can use multiple times to set a value. For example, in pandoc you can include multiple CSS files in an HTML document by using the “--css” option multiple times, like “--css=my-style.css --css=assets/header.css”.
In pandocr, these array options have some array-like behavior:
require "pandocr"
pandoc = Pandocr::Converter.new
puts pandoc.css.set?  # => false
pandoc.css << "my-style.css"
pandoc.css << "assets/header.css"
puts pandoc.css.set?  # => true
puts pandoc.css.value # => ["my-style.css", "assets/header.css"]
pandoc.css = ["other_style.css"]
puts pandoc.css.value # => ["other_style.css"]
pandoc.css << "assets/footer.css"
pandoc.css << "assets/tables.css"
pandoc.css.remove! "assets/footer.css"
puts pandoc.css.includes? "assets/footer.css"  # => false
puts pandoc.css.value # => ["other_style.css", "assets/tables.css"]
Set an array option either by assigning an array of values or by appending a single value to the option with #<<. To see if an array option has already been set with a specific value, query with #includes?. Remove a specific value with method #remove!.
2.3.4 Hash options
Hash options: command-line options that you can use multiple times to set key-value pairs. For example, “--metadata=author:"Huub de Beer" --metadata=title:"Pandocr manual"”.
In pandocr, these options have some hash-like behavior:
require "pandocr"
pandoc = Pandocr::Converter.new
puts pandoc.metadata.set?  # => false
pandoc.metadata["author"] = "Huub de Beer"
pandoc.metadata["title"] = "Pandocr manual"
puts pandoc.metadata.set?  # => true
puts pandoc.metadata.value # => {"author" => "Huub de Beer", "title" => "Pandocr manual"}
puts pandoc.metadata.has_key? "title" # => true
pandoc.metadata.remove! "title"
puts pandoc.metadata.value # => {"author" => "Huub de Beer"}
    Put key-value pairs into a hash option with #[]=. To see
    if a key has been set, query with #has_key?. Remove a key
    with method #remove!.
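To see how the four option types relate to pandoc’s command line, the following sketch renders a value of each type as command-line arguments. It is an assumption about the mapping, based only on the command-line examples in this section, and not pandocr’s actual implementation; to_arguments is a hypothetical helper.

```crystal
# Hypothetical helper, not part of pandocr: renders an option's value as
# pandoc command-line arguments, one case per option type.
def to_arguments(name : String, value : Bool | String | Array(String) | Hash(String, String)) : Array(String)
  # Turn the pandocr-style name back into a long command-line option.
  flag = "--#{name.tr("_", "-")}"
  case value
  in Bool
    # Switch option: present when on, absent when off.
    value ? [flag] : [] of String
  in String
    # Value option: a single "--option=value" argument.
    ["#{flag}=#{value}"]
  in Array(String)
    # Array option: the option is repeated once per value.
    value.map { |item| "#{flag}=#{item}" }
  in Hash(String, String)
    # Hash option: the option is repeated once per "key:value" pair.
    value.map { |key, val| "#{flag}=#{key}:#{val}" }
  end
end

p to_arguments("standalone", true)
# => ["--standalone"]
p to_arguments("data_dir", "/my/data/dir")
# => ["--data-dir=/my/data/dir"]
p to_arguments("css", ["my-style.css", "assets/header.css"])
# => ["--css=my-style.css", "--css=assets/header.css"]
p to_arguments("metadata", {"author" => "Huub de Beer"})
# => ["--metadata=author:Huub de Beer"]
```

Because the arguments are passed as an array rather than through a shell, values with spaces need no extra quoting in this sketch.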
2.4 Conversion variations
Pandocr offers different methods to convert your source input to target output. Pick one that suits your situation best.
For conversion between an input source string in one format to an
    output target string in another format, use #convert. For
    example:
require "pandocr"
pandoc = Pandocr::Converter.new
pandoc.from = Pandocr::OptionType::FromFormat::Latex
pandoc.to = Pandocr::OptionType::ToFormat::Html
pandoc.standalone.on
tex_input = "Some \\LaTeX \\emph{input} string" # escape backslashes: "\e" is the escape character in crystal strings
html_output = pandoc.convert tex_input
puts html_output
Alternatively, you can let pandoc write the output to a file by using method #convert!. This conversion method is convenient when your program does not postprocess the output any further. The above example becomes:
require "pandocr"
pandoc = Pandocr::Converter.new
pandoc.from = Pandocr::OptionType::FromFormat::Latex
pandoc.to = Pandocr::OptionType::ToFormat::Html
pandoc.standalone.on
tex_input = "Some \\LaTeX \\emph{input} string" # escape backslashes: "\e" is the escape character in crystal strings
pandoc.convert!(
    tex_input, 
    target_file: Path.new("~/Documents/My_Document.html")
)
Both conversion methods also have a variant that lets pandoc read the files to convert. These variants are convenient when your program does not preprocess or generate the input source strings. For example:
require "pandocr"
pandoc = Pandocr::Converter.new
pandoc.from = Pandocr::OptionType::FromFormat::Latex
pandoc.to = Pandocr::OptionType::ToFormat::Html
pandoc.standalone.on
tex_input_file_ch1 = Path.new("~/Documents/My_Document/chapter1.tex")
tex_input_file_ch2 = Path.new("~/Documents/My_Document/chapter2.tex")
html_output = pandoc.convert tex_input_file_ch1, tex_input_file_ch2
puts html_output
    See the API documentation for
    Pandocr::Converter for more information.
2.5 Querying pandoc
You can query pandoc for information about its setup. For example, you
    can discover a user’s default data directory by calling pandoc
    --version. You can also ask pandoc to list its supported output
    formats, input formats, highlighting styles, and so on.
Typically, the result of querying pandoc does not change from one call
    to the next. Because calling pandoc is expensive, pandocr caches these
    pandoc queries via Pandocr::Converter.info. Find API
    documentation at Pandocr::PandocInfo.
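The idea behind such a cache can be illustrated with a small, self-contained sketch. It is not pandocr’s actual implementation: the class CachedQuery and its methods are hypothetical, and run_query merely stands in for an expensive call to the pandoc executable.

```crystal
# Hypothetical sketch of caching an expensive query; not pandocr's
# actual implementation.
class CachedQuery
  @@calls = 0
  @@version : String? = nil

  # Stands in for an expensive call to the pandoc executable and counts
  # how often it is invoked.
  private def self.run_query : String
    @@calls += 1
    "pandoc (placeholder output)"
  end

  # Runs the query on first use only; later calls return the cached value.
  def self.version : String
    @@version ||= run_query
  end

  # Number of times the expensive query actually ran.
  def self.calls : Int32
    @@calls
  end
end

CachedQuery.version
CachedQuery.version
puts CachedQuery.calls # => 1
```

The second call to version does not run the query again, which is the behavior you want when the answer does not change between calls.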
3 Development
Pandocr uses make. Build pandocr with
make build
It will run all tests, generate documentation, and clean up crystal source code.
Tests are both unit tests and system tests. You need a pandoc installation to run the system tests.
3.1 Roadmap
In no particular order:
- Add a facility to log calls to pandoc.
- Add support for extensions to types FromFormat and ToFormat.
- Add all query-like command-line options to PandocInfo, such as --list-input-formats, --list-highlight-styles, etc.
- Add support for writing pandoc filters in crystal.
Note. Typically, I add features when I need them for my own projects. As a result, this roadmap will take years to complete.