Class: Paru::Pandoc

Inherits:

  • Object
Defined in:
lib/paru/pandoc.rb

Overview

Pandoc is a wrapper around the pandoc document converter. See <pandoc.org/README.html> for details about pandoc. The Pandoc class is a straightforward translation of the pandoc command-line program into a Rubyesque API.

For information about writing pandoc filters in Ruby see Filter.

Creating a Paru pandoc converter in Ruby is quite straightforward: you create a new Paru::Pandoc object with a block that configures that Pandoc object with pandoc options. Each command-line option to pandoc is a method on the Pandoc object. Command-line options with dashes in their names, such as “--reference-docx”, become methods by dropping the leading dashes and replacing each remaining dash with an underscore. So, “--reference-docx” becomes the method reference_docx.
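
The naming rule can be illustrated with a small sketch. The helper option_to_method is hypothetical and not part of Paru; it only shows how a pandoc long option maps onto the method name you would call in a configuration block.

```ruby
# Hypothetical helper, not part of Paru: derive the method name that
# corresponds to a pandoc long option.
def option_to_method(option)
  option.sub(/\A--/, "")  # drop the leading dashes
        .tr("-", "_")     # replace the remaining dashes with underscores
        .to_sym
end

puts option_to_method("--reference-docx")
puts option_to_method("--toc")
```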

Pandoc command-line flags, such as “--parse-raw”, “--chapters”, or “--toc”, have been translated to Paru::Pandoc methods that take an optional Boolean parameter; true is the default value. Therefore, to enable a flag, no parameter is needed.

All other pandoc command-line options are translated to Paru::Pandoc methods that take either one String or Number argument, or a list of String arguments if that command-line option can occur more than once (such as “--include-before-header” or “--filter”).
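
The three kinds of option methods described above might behave roughly as in the following sketch. ToyOptions is a stand-in written for this illustration; it is an assumption about the general behaviour, not Paru's actual implementation, and it covers only three hand-picked options.

```ruby
# Toy stand-in for illustration only; Paru's real implementation may differ.
class ToyOptions
  attr_reader :options

  def initialize
    @options = {}
  end

  # A flag: the Boolean parameter is optional and defaults to true.
  def toc(enabled = true)
    @options[:toc] = enabled
    self
  end

  # A single-valued option: a later call overwrites an earlier one.
  def to(format)
    @options[:to] = format
    self
  end

  # A repeatable option: each call adds to a list of values.
  def filter(*names)
    (@options[:filter] ||= []).concat(names.flatten)
    self
  end
end

toy = ToyOptions.new
toy.toc.to("html").filter("a.rb").filter(["b.rb", "c.rb"])
p toy.options
```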

Once you have configured a Paru::Pandoc converter, you can call convert or << (which is an alias for convert) with a string to convert. You can call convert as often as you like and, if you like, reconfigure the converter in between!

Examples:

Convert the markdown string 'hello *world*' to HTML

Paru::Pandoc.new do
    from 'markdown'
    to 'html'
end << 'hello *world*'

Convert an HTML file to DOCX with a reference file

Paru::Pandoc.new do
    from "html"
    to "docx"
    reference_docx "styled_output.docx"
    output "output.docx"
end.convert File.read("input.html")

Convert a markdown file to HTML and add references in APA style

Paru::Pandoc.new do
    from "markdown"
    toc
    bibliography "literature.bib"
    to "html"
    csl "apa.csl"
    output "report_with_references.html"
end << File.read("report.md")

Constant Summary

DEFAULT_OPTION_SEP =

Use a readable option separator on Unix-like systems, but fall back to a space on Windows.

if Gem.win_platform? then " " else " \\\n\t" end
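
As a rough illustration of what this separator does, the sketch below joins a few option strings with it. The option strings and the way the command is assembled are assumptions made for this example; only the separator expression is taken from the constant above.

```ruby
# Same expression as the value of DEFAULT_OPTION_SEP.
option_sep = if Gem.win_platform? then " " else " \\\n\t" end

# Hypothetical option strings, for illustration only.
options = ["--from=markdown", "--to=html", "--toc"]

# On Unix-like systems each option lands on its own line, continued with a
# backslash; on Windows everything stays on one line.
command = (["pandoc"] + options).join(option_sep)
puts command
```
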
PARU_PANDOC_PATH =

Path to the pandoc executable used by paru.

"PARU_PANDOC_PATH"

Constructor Details

#initialize(&block) ⇒ Pandoc

Create a new Pandoc converter, optionally configured by a block with pandoc options. See #configure for how to configure a converter.

Parameters:

  • block (Proc)

    an optional configuration block.



# File 'lib/paru/pandoc.rb', line 105

def initialize(&block)
    @options = {}
    configure(&block) if block_given?
end

Class Method Details

.info ⇒ Info

Gather information about the pandoc installation. It runs pandoc --version and extracts pandoc's version number and default data directory. This method is typically used in scripts that use Paru to automate the use of pandoc.

Returns:

  • (Info)

    Pandoc's version, such as “[2.10.1]”, and the data directory, such as “/home/huub/.pandoc”.



# File 'lib/paru/pandoc.rb', line 97

def self.info()
    @@info
end
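
The extraction step described above could look something like the following sketch. The sample text only imitates the general shape of pandoc --version output, which varies between pandoc versions and platforms, and the regular expressions are assumptions for this illustration rather than the ones Paru uses. Representing the version as a list of integers is likewise a choice made here.

```ruby
# Sample text in the general shape of `pandoc --version` output; the real
# output varies by pandoc version and platform.
sample = <<~TEXT
  pandoc 2.10.1
  Default user data directory: /home/huub/.pandoc
TEXT

# Version number from the first line, split into integers.
version = sample[/\Apandoc\s+(\d+(?:\.\d+)*)/, 1].split(".").map(&:to_i)

# Data directory from the line that announces it.
data_dir = sample[/^Default user data directory:\s*(.+)$/, 1]

p version      # prints [2, 10, 1]
puts data_dir  # prints /home/huub/.pandoc
```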

Instance Method Details

#configure(&block) ⇒ Pandoc

Configure this Pandoc converter with block. In the block you can call all pandoc options as methods on this converter. In multi-word options the dash (-) is replaced by an underscore (_).

Pandoc has a number of command-line options. Most are simple options, like flags, that can be set only once. Other options can occur more than once, such as the css option: to add more than one CSS file to a generated standalone HTML file, use the css option once for each stylesheet to include. Still other options take a key, optionally followed by a value (KEY[:VALUE]), and can also occur multiple times, such as metadata.

All options are specified in the file pandoc_options.yaml. If an option can occur only once, its value in that YAML file is its default value. If the option can occur multiple times, its value is an array with one element, the default value.

Examples:

Configure converting HTML to LaTeX with a LaTeX engine

converter.configure do
    from 'html'
    to 'latex'
    latex_engine 'lualatex'
end

Parameters:

  • block (Proc)

    the options to pandoc

Returns:

  • (Pandoc)

    this Pandoc converter



# File 'lib/paru/pandoc.rb', line 136

def configure(&block)
    instance_eval(&block)
    self
end
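
To see why option methods can be called without a receiver inside the block, and why configure calls can be chained, consider this minimal sketch. ToyConverter and its two option methods are written for this illustration; only the bodies of initialize and configure mirror the source shown above.

```ruby
# Toy converter for illustration; only initialize and configure mirror the
# source shown above.
class ToyConverter
  attr_reader :options

  def initialize(&block)
    @options = {}
    configure(&block) if block_given?
  end

  def configure(&block)
    instance_eval(&block)  # inside the block, self is this converter
    self                   # returning self allows chaining
  end

  def from(format)
    @options[:from] = format
  end

  def to(format)
    @options[:to] = format
  end
end

converter = ToyConverter.new { from "html" }
# Reconfigure the same converter; later values replace earlier ones.
converter.configure { to "latex" }.configure { from "markdown" }
p converter.options
```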

#convert(input) ⇒ String Also known as: <<

Converts an input string to an output string using the pandoc invocation configured in this Pandoc instance.

Note: for some output formats, writing to STDOUT is not supported (see pandoc's manual), and the result string will be empty.

The following two examples are equivalent:

Examples:

Using convert

output = converter.convert 'this is a *strong* word'

Using <<

output = converter << 'this is a *strong* word'

Parameters:

  • input (String)

    the input string to convert

Returns:

  • (String)

    the converted output as a string



# File 'lib/paru/pandoc.rb', line 156

def convert(input)
    run_converter to_command, input
end
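
How an operator such as << can stand in for convert is sketched below. EchoConverter is a toy class for this illustration in which upcasing replaces the real pandoc call; it makes no claim about how Paru itself defines the alias.

```ruby
# Toy class: upcasing stands in for the real pandoc call.
class EchoConverter
  def convert(input)
    input.upcase
  end

  # Make << another name for convert.
  alias_method :<<, :convert
end

converter = EchoConverter.new
a = converter.convert "hello"
b = converter << "hello"
puts a == b
```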

#convert_file(input_file) ⇒ String

Converts an input file to an output string using the pandoc invocation configured in this Pandoc instance. The path to the input file is appended to that invocation.

Note: for some output formats, writing to STDOUT is not supported (see pandoc's manual), and the result string will be empty.

Examples:

Using convert_file

output = converter.convert_file 'files/document.md'

Parameters:

  • input_file (String)

    the path to the input file to convert

Returns:

  • (String)

    the converted output as a string



# File 'lib/paru/pandoc.rb', line 172

def convert_file(input_file)
    run_converter "#{to_command} #{input_file}"
end
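
Because the source above interpolates the path into the command string as-is, a path containing spaces may need escaping before it is passed in. Whether Paru escapes the path elsewhere is not shown here, so treat the following as a precaution under that assumption. The command prefix and file name are hypothetical.

```ruby
require "shellwords"

# Hypothetical command prefix and file name, for illustration only.
command = "pandoc --from=markdown --to=html"
input_file = "files/my document.md"

# Unescaped: a shell would split the path into two arguments.
puts "#{command} #{input_file}"

# Escaped: a shell would keep the path together as one argument.
puts "#{command} #{Shellwords.escape(input_file)}"
```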

#to_command(option_sep = DEFAULT_OPTION_SEP) ⇒ String

Create a string representation of this converter's pandoc command line invocation. This is useful for debugging purposes.

Parameters:

  • option_sep (String) (defaults to: DEFAULT_OPTION_SEP)

    the string to separate options with

Returns:

  • (String)

    This converter's command line invocation string.



# File 'lib/paru/pandoc.rb', line 181

def to_command(option_sep = DEFAULT_OPTION_SEP)
    "#{escape(@@pandoc_exec)}\t#{to_option_string option_sep}"
end