Class: Paru::Filter
- Inherits: Object
  - Object
    - Paru::Filter
- Defined in: lib/paru/filter.rb
Overview
Filter is used to write your own pandoc filter in Ruby. A Filter is almost always created and immediately executed via the run method. The simplest filter you can write in paru is the so-called “identity” filter:
#!/usr/bin/env ruby
# Identity filter
require 'paru/filter'

Paru::Filter.run do
  # nothing
end
It runs the filter, but it neither selects any nodes nor performs an action. This is of course pretty useless, apart from being a great way to test the filter functionality, but it shows the general setup of a filter well.
Writing a simple filter: numbering figures
Inside a Filter.run block, you specify selectors with actions. For example, to number all figures in a document and prefix their captions with “Figure”, the following filter would work:
#!/usr/bin/env ruby
# Number all figures in a document and prefix the caption with "Figure".
require 'paru/filter'

figure_counter = 0

Paru::Filter.run do
  with 'Image' do |image|
    figure_counter += 1
    image.inner_markdown = "Figure #{figure_counter}. #{image.inner_markdown}"
  end
end
This filter selects all PandocFilter::Image nodes. For each PandocFilter::Image node it increments the figure counter figure_counter and then sets the figure’s caption to “Figure” followed by the figure count and the original caption. In other words, the following input document


will be transformed into


The method PandocFilter::InnerMarkdown#inner_markdown and its counterpart PandocFilter::Node#markdown are a great way to manipulate the contents of a selected PandocFilter::Node. There is no need to create and fill PandocFilter::Node objects by hand; you can just use pandoc’s own markdown format!
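The numbering logic of the figure filter can be tried out without pandoc. The following is only a stand-alone sketch: the caption strings are hypothetical and stand in for the value of each image's inner_markdown, and no paru classes are involved.

```ruby
# Stand-alone illustration of the caption numbering logic.
# The captions are hypothetical; in a real filter each one would be the
# inner_markdown of a selected Image node.
captions = ["A cat", "A dog"]

# The counter lives outside the block, just like in the filter above.
figure_counter = 0

numbered = captions.map do |caption|
  figure_counter += 1
  "Figure #{figure_counter}. #{caption}"
end

puts numbered
```

Keeping the counter outside the block matters in the filter as well, because the counter has to survive from one selected node to the next.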
Writing more involved filters
Using the “follows” selector: Numbering figures and chapters
The previous example can be extended to also number chapters and to start numbering figures anew per chapter. As you would expect, we need two counters, one for the figures and one for the chapters:
#!/usr/bin/env ruby
# Number figures per chapter; number chapters as well
require 'paru/filter'

current_chapter = 0
current_figure = 0

Paru::Filter.run do
  with 'Header' do |header|
    if header.level == 1
      current_chapter += 1
      current_figure = 0
      header.inner_markdown = "Chapter #{current_chapter}. #{header.inner_markdown}"
    end
  end

  with 'Header + Image' do |image|
    current_figure += 1
    image.inner_markdown = "Figure #{current_chapter}.#{current_figure}. #{image.inner_markdown}"
  end
end
What is new in this filter, however, is the selector “Header + Image”, which selects all PandocFilter::Image nodes that follow a PandocFilter::Header node. Documents in pandoc have a flat structure in which chapters do not exist as separate concepts. Instead, a chapter is implied by a header of a certain level and everything that follows until the next header of that level.
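To see how a chapter is implied by a level 1 header in a flat sequence, consider the sketch below. It walks a hypothetical flat list of nodes with the same two counters as the filter above. The Node struct and the sample texts are made up for the illustration and are not paru classes.

```ruby
# Hypothetical flat node list; Node is an illustration, not a paru class.
Node = Struct.new(:type, :level, :text)

nodes = [
  Node.new(:header, 1, "Introduction"),
  Node.new(:image, nil, "Overview"),
  Node.new(:image, nil, "Detail"),
  Node.new(:header, 2, "Background"),
  Node.new(:image, nil, "Timeline"),
  Node.new(:header, 1, "Methods"),
  Node.new(:image, nil, "Setup")
]

current_chapter = 0
current_figure = 0

labels = nodes.map do |node|
  case node.type
  when :header
    if node.level == 1
      # A level 1 header starts a new chapter and resets the figure count.
      current_chapter += 1
      current_figure = 0
      "Chapter #{current_chapter}. #{node.text}"
    else
      node.text
    end
  when :image
    current_figure += 1
    "Figure #{current_chapter}.#{current_figure}. #{node.text}"
  end
end

puts labels
```

Note that the level 2 header does not reset the figure counter: only a header of the chapter level ends the implied chapter.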
Using the “child of” selector: Annotate custom blocks
Hierarchical structures do exist in a pandoc document, however. For example, the contents of a paragraph (PandocFilter::Para), which itself is a PandocFilter::Block level node, are PandocFilter::Inline level nodes. Another example is custom blocks, or PandocFilter::Div nodes. You select a child node by using the > selector, as in the example below:
#!/usr/bin/env ruby
# Annotate custom blocks: example blocks and important blocks
require 'paru/filter'

example_count = 0

Paru::Filter.run do
  with 'Div.example > Header' do |header|
    if header.level == 3
      example_count += 1
      header.inner_markdown = "Example #{example_count}: #{header.inner_markdown}"
    end
  end

  with 'Div.important' do |d|
    d.inner_markdown = d.inner_markdown + "\n\n*(important)*"
  end
end
Here all PandocFilter::Header nodes that are inside an “example” PandocFilter::Div node are selected. Furthermore, if these headers are of level 3, they are prefixed by the string “Example” followed by a count.
In this example, “important” PandocFilter::Div nodes are annotated by appending the emphasized string “(important)” after the contents of the node.
Using a distance in a selector: Capitalize the first N characters of a paragraph
Given the flat structure of a pandoc document, the “follows” selector has quite a reach. For example, “Header + Para” selects all paragraphs that follow a header. In most well-structured documents, this would select basically all paragraphs.
But what if you need to be more specific? For example, if you would like to capitalize the first few characters of the first paragraph after each header, you need a way to specify a sequence number of sorts. To that end, paru filter selectors take an optional distance parameter. A filter for this example could look like:
#!/usr/bin/env ruby
# Capitalize the first N characters of a paragraph
require 'paru/filter'

END_CAPITAL = 10

Paru::Filter.run do
  with 'Header +1 Para' do |p|
    text = p.inner_markdown
    first_line = text.slice(0, END_CAPITAL).upcase
    # slice returns nil when the paragraph is shorter than END_CAPITAL
    rest = text.slice(END_CAPITAL, text.size).to_s
    p.inner_markdown = first_line + rest
  end
end
The distance is denoted by an integer after the selector. In this case, “Header +1 Para” selects all PandocFilter::Para nodes that directly follow a PandocFilter::Header node. You can use a distance with any selector.
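The string handling in this filter can be checked in isolation. The sketch below wraps it in a hypothetical helper, capitalize_prefix, which is not part of paru. It also guards against texts shorter than the prefix length, for which String#slice returns nil.

```ruby
END_CAPITAL = 10

# Hypothetical helper: upcase the first `length` characters of `text`.
def capitalize_prefix(text, length = END_CAPITAL)
  first = text.slice(0, length).to_s.upcase
  # slice(start, len) returns nil when start lies beyond the end of the
  # string, so convert to an empty string before concatenating.
  rest = text.slice(length, text.size).to_s
  first + rest
end

puts capitalize_prefix("Hello world, this is it")
puts capitalize_prefix("Hi")
```

Because inner_markdown is a markdown string, a real filter would upcase markup characters' neighbours too; this sketch only covers the plain slicing step.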
Manipulating nodes: Removing horizontal lines
Although PandocFilter::InnerMarkdown#inner_markdown and PandocFilter::Node#markdown work in most situations, sometimes direct manipulation of the pandoc document AST is useful. These PandocFilter::ASTManipulation methods are mixed into PandocFilter::Node and can be used on any node in your filter. For example, to delete all PandocFilter::HorizontalRule nodes, you can use a filter like:
#!/usr/bin/env ruby
require 'paru/filter'

Paru::Filter.run do
  with 'HorizontalRule' do |rule|
    rule.parent.delete rule if rule.has_parent?
  end
end
Note that you could have arrived at the same effect by using:
rule.markdown = ""
Manipulating metadata
One of the interesting features of the pandoc markdown format is the ability to add metadata to a document via a YAML block or command line options. For example, if you use a template that uses the metadata property $date$ to write a date on a title page, it is quite useful to automatically add the date of today to the metadata. You can do so with a filter like:
#!/usr/bin/env ruby
# Add today's date to the metadata
require 'paru/filter'
require 'date'

Paru::Filter.run do
  before do
    metadata['date'] = Date.today.to_s
  end
end
In a filter, the metadata property is a Ruby Hash of Strings, Numbers, Booleans, Arrays, and Hashes. You can manipulate it like any other Ruby Hash.
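As a rough illustration of what manipulating the metadata like any other Hash can look like, the sketch below uses a plain Ruby Hash as a stand-in. The keys and values are hypothetical, and it assumes that the filter's metadata object responds to the usual Hash methods, as the text above states.

```ruby
require 'date'

# A plain Hash standing in for the filter's metadata property.
# All keys and values here are hypothetical.
metadata = { 'title' => 'My report', 'keywords' => ['pandoc'] }

# Add a String value.
metadata['date'] = Date.today.to_s

# Extend an Array value in place.
metadata['keywords'] << 'filter'

# Only set a Boolean default when the key is not present yet.
metadata['draft'] = true unless metadata.key?('draft')

puts metadata
```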
Instance Attribute Summary
- #current_node ⇒ Node
  The node in the AST of the document being filtered that is currently being inspected by the filter.
- #document ⇒ Document
  The document being filtered.
- #metadata ⇒ Hash
  The metadata of the document being filtered as a Ruby Hash.
Class Method Summary
- .run(treat_metadata_strings_as_plain_strings: false, &block) ⇒ Object
  Run the filter specified by block.
Instance Method Summary
- #after {|Document| ... } ⇒ Object
  After running the filter on all nodes, the document is passed to the block to this after method.
- #before {|Document| ... } ⇒ Object
  Before running the filter on all nodes, the document is passed to the block to this before method.
- #filter(&block) ⇒ JSON
  Create a filter using block.
- #initialize(input = $stdin, output = $stdout, treat_metadata_strings_as_plain_strings: false) ⇒ Filter (constructor)
  Create a new Filter instance.
- #stop! ⇒ Object
  Stop processing the document any further and output it as it is now.
- #with(selector) {|Node| ... } ⇒ Object
  Specify what nodes to filter with a selector.
Constructor Details
#initialize(input = $stdin, output = $stdout, treat_metadata_strings_as_plain_strings: false) ⇒ Filter
Create a new Filter instance. For convenience, run creates a new Paru::Filter and runs it immediately. Use this constructor if you want to run a filter on input and output streams other than STDIN and STDOUT, respectively.
Parameters:
- input (IO, defaults to $stdin): the input stream to read.
- output (IO, defaults to $stdout): the output stream to write.
- treat_metadata_strings_as_plain_strings (Boolean, defaults to false): feature toggle to treat metadata string values as plain strings instead of markdown strings if all AST leaf metadata string values have pandoc type “MetaString”. This option is only relevant when you only set metadata string values via the command-line option `--metadata` and not also via a YAML or title block. Using this option improves performance in this specific situation because metadata values don’t have to be converted to strings by pandoc in a separate process but can be collected as is.
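The benefit of injectable streams is that a filter can be exercised in memory, for instance in a test. The sketch below does not use paru: identity_filter is a hypothetical stand-in that only mirrors the (input, output) defaults of the constructor, and the JSON document is a small hand-written pandoc AST.

```ruby
require 'json'
require 'stringio'

# Hypothetical identity filter over explicit streams; not paru code.
# It reads a pandoc JSON AST from `input` and writes it back to `output`.
def identity_filter(input = $stdin, output = $stdout)
  ast = JSON.parse(input.read)
  output.write(JSON.generate(ast))
end

# Hand-written AST for a document with the single paragraph "Hello".
source = '{"pandoc-api-version":[1,23],"meta":{},' \
         '"blocks":[{"t":"Para","c":[{"t":"Str","c":"Hello"}]}]}'

# In-memory streams replace STDIN and STDOUT.
input = StringIO.new(source)
output = StringIO.new

identity_filter(input, output)
puts output.string
```

With a real Paru::Filter, the same two StringIO objects would presumably be passed as the first two constructor arguments.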
# File 'lib/paru/filter.rb', line 242
def initialize(input = $stdin, output = $stdout, treat_metadata_strings_as_plain_strings: false)
  @input = input
  @output = output
  @treat_metadata_strings_as_plain_strings = treat_metadata_strings_as_plain_strings
end
Instance Attribute Details
#current_node ⇒ Node
Returns the node in the AST of the document being filtered that is currently being inspected by the filter.
# File 'lib/paru/filter.rb', line 222
class Filter
  attr_reader :metadata, :document, :current_node

  # Create a new Filter instance. For convenience, {run} creates a new
  # {Filter} and runs it immediately. Use this constructor if you want
  # to run a filter on different input and output streams than STDIN and
  # STDOUT respectively.
  #
  # @param input [IO = $stdin] the input stream to read, defaults to
  #   STDIN
  # @param output [IO = $stdout] the output stream to write, defaults to
  #   STDOUT
  # @param treat_metadata_strings_as_plain_strings [Boolean = false] feature
  #   toggle to treat metadata string values as plain strings instead of
  #   markdown strings if all AST leaf metadata string values have pandoc type
  #   "MetaString". This option is only relevant when you **only** set metadata
  #   string values via command-line option `--metadata` and not also via a
  #   YAML or title block. Using this option improves performance in this
  #   specific situation because metadata values don't have to be converted to
  #   string by pandoc in a separate process but can be collected as is.
  def initialize(input = $stdin, output = $stdout, treat_metadata_strings_as_plain_strings: false)
    @input = input
    @output = output
    @treat_metadata_strings_as_plain_strings = treat_metadata_strings_as_plain_strings
  end

  # Run the filter specified by block. This is a convenience method that
  # creates a new {Filter} using input stream STDIN and output stream
  # STDOUT and immediately runs {filter} with the block supplied.
  #
  # @param treat_metadata_strings_as_plain_strings [Boolean = false] feature
  #   toggle to treat metadata string values as plain strings instead of
  #   markdown strings if all AST leaf metadata string values have pandoc type
  #   "MetaString". This option is only relevant when you **only** set metadata
  #   string values via command-line option `--metadata` and not also via a
  #   YAML or title block. Using this option improves performance in this
  #   specific situation because metadata values don't have to be converted to
  #   string by pandoc in a separate process but can be collected as is.
  # @param block [Proc] the filter specification
  #
  # @example Add 'Figure' to each image's caption
  #   Paru::Filter.run do
  #     with "Image" do |image|
  #       image.inner_markdown = "Figure. #{image.inner_markdown}"
  #     end
  #   end
  def self.run(treat_metadata_strings_as_plain_strings: false, &block)
    Filter.new(
      $stdin,
      $stdout,
      treat_metadata_strings_as_plain_strings:
    ).filter(&block)
  end

  # Create a filter using +block+. In the block you specify
  # selectors and actions to be performed on selected nodes. In the
  # example below, the selector is "Image", which selects all image
  # nodes. The action is to prepend the contents of the image's caption
  # by the string "Figure. ".
  #
  # @param block [Proc] the filter specification
  #
  # @return [JSON] a JSON string with the filtered pandoc AST
  #
  # @example Add 'Figure' to each image's caption
  #   input = StringIO.new(File.read("my_report.md"))
  #   output = StringIO.new
  #
  #   Paru::Filter.new(input, output).filter do
  #     with "Image" do |image|
  #       image.inner_markdown = "Figure. #{image.inner_markdown}"
  #     end
  #   end
  #
  def filter(&block)
    @selectors = {}
    @filtered_nodes = []
    @document = read_document

    @metadata = PandocFilter::Metadata.new(
      @document.,
      treat_metadata_strings_as_plain_strings: @treat_metadata_strings_as_plain_strings
    )

    nodes_to_filter = Enumerator.new do |node_list|
      @document.each_depth_first do |node|
        node_list << node
      end
    end

    @current_node = @document

    @ran_before = false
    @ran_after = false
    instance_eval(&block) # run filter with before block
    @ran_before = true

    nodes_to_filter.each do |node|
      if @current_node.has_been_replaced?
        @current_node = @current_node.get_replacement
        @filtered_nodes.pop
      else
        @current_node = node
      end

      @filtered_nodes.push @current_node

      instance_eval(&block) # run the actual filter code
    end

    @ran_after = true
    instance_eval(&block) # run filter with after block

    write_document
  end

  # Specify what nodes to filter with a +selector+. If the +current_node+
  # matches that selector, it is passed to the block to this +with+ method.
  #
  # @param selector [String] a selector string
  # @yield [Node] the current node if it matches the selector
  def with(selector)
    return unless @ran_before && !@ran_after

    @selectors[selector] = Selector.new selector unless @selectors.key? selector
    yield @current_node if @selectors[selector].matches? @current_node, @filtered_nodes
  end

  # Before running the filter on all nodes, the +document+ is passed to
  # the block to this +before+ method. This method is run exactly once.
  #
  # @yield [Document] the document
  def before
    yield @document unless @ran_before
  end

  # After running the filter on all nodes, the +document+ is passed to
  # the block to this +after+ method. This method is run exactly once.
  #
  # @yield [Document] the document
  def after
    yield @document if @ran_after
  end

  # Stop processing the document any further and output it as it is now.
  # This is a great timesaver for filters that only act on a small
  # number of nodes in a large document, or when you only want to set
  # the metadata.
  #
  # Note, stop will break off the filter immediately after outputting
  # the document in its current state.
  def stop!
    write_document
    exit
  end

  private

  # The Document node from JSON formatted pandoc document structure
  # on STDIN that is being filtered
  #
  # @return [Document] create a new Document node from a pandoc AST from
  #   JSON from STDIN
  def read_document
    PandocFilter::Document.from_JSON @input.read
  end

  # Write the document being filtered to STDOUT
  def write_document
    @document. = @metadata.
    @output.write @document.to_JSON
  end
end
#document ⇒ Document
Returns the document being filtered.
#metadata ⇒ Hash
Returns the metadata of the document being filtered as a Ruby Hash.
Class Method Details
.run(treat_metadata_strings_as_plain_strings: false, &block) ⇒ Object
Run the filter specified by block. This is a convenience method that creates a new Paru::Filter using input stream STDIN and output stream STDOUT and immediately runs #filter with the block supplied.
Parameters:
- treat_metadata_strings_as_plain_strings (Boolean, defaults to false): feature toggle to treat metadata string values as plain strings instead of markdown strings if all AST leaf metadata string values have pandoc type “MetaString”. This option is only relevant when you only set metadata string values via the command-line option `--metadata` and not also via a YAML or title block. Using this option improves performance in this specific situation because metadata values don’t have to be converted to strings by pandoc in a separate process but can be collected as is.
- block (Proc): the filter specification.
# File 'lib/paru/filter.rb', line 268
def self.run(treat_metadata_strings_as_plain_strings: false, &block)
  Filter.new(
    $stdin,
    $stdout,
    treat_metadata_strings_as_plain_strings:
  ).filter(&block)
end
Instance Method Details
#after {|Document| ... } ⇒ Object
After running the filter on all nodes, the document is passed to the block to this after method. This method is run exactly once.
# File 'lib/paru/filter.rb', line 362
def after
  yield @document if @ran_after
end
#before {|Document| ... } ⇒ Object
Before running the filter on all nodes, the document is passed to the block to this before method. This method is run exactly once.
# File 'lib/paru/filter.rb', line 354
def before
  yield @document unless @ran_before
end
#filter(&block) ⇒ JSON
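Create a filter using block. In the block you specify selectors and actions to be performed on selected nodes. In the example below, the selector is “Image”, which selects all image nodes. The action is to prepend the string “Figure. ” to the contents of the image’s caption.
Example: Add 'Figure' to each image's caption
input = StringIO.new(File.read("my_report.md"))
output = StringIO.new

Paru::Filter.new(input, output).filter do
  with "Image" do |image|
    image.inner_markdown = "Figure. #{image.inner_markdown}"
  end
end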
# File 'lib/paru/filter.rb', line 296
def filter(&block)
  @selectors = {}
  @filtered_nodes = []
  @document = read_document

  @metadata = PandocFilter::Metadata.new(
    @document.,
    treat_metadata_strings_as_plain_strings: @treat_metadata_strings_as_plain_strings
  )

  nodes_to_filter = Enumerator.new do |node_list|
    @document.each_depth_first do |node|
      node_list << node
    end
  end

  @current_node = @document

  @ran_before = false
  @ran_after = false
  instance_eval(&block) # run filter with before block
  @ran_before = true

  nodes_to_filter.each do |node|
    if @current_node.has_been_replaced?
      @current_node = @current_node.get_replacement
      @filtered_nodes.pop
    else
      @current_node = node
    end

    @filtered_nodes.push @current_node

    instance_eval(&block) # run the actual filter code
  end

  @ran_after = true
  instance_eval(&block) # run filter with after block

  write_document
end
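The listing above evaluates the same block several times: once so that before can fire, once per node so that with can fire, and once more so that after can fire. The sketch below models only that control flow. MiniFilter is a hypothetical, stripped-down class, its "selector" is a plain equality check, and none of it is paru's actual implementation.

```ruby
# MiniFilter is a hypothetical model of the control flow in #filter.
class MiniFilter
  def initialize(nodes)
    @nodes = nodes
    @log = []
  end

  def filter(&block)
    @ran_before = false
    @ran_after = false
    @current_node = nil

    instance_eval(&block) # first pass: only `before` yields
    @ran_before = true

    @nodes.each do |node|
      @current_node = node
      instance_eval(&block) # one pass per node: only `with` can yield
    end

    @ran_after = true
    instance_eval(&block) # last pass: only `after` yields
    @log
  end

  def before
    yield unless @ran_before
  end

  def after
    yield if @ran_after
  end

  # Simplified selector: a node matches when it equals the selector string.
  def with(selector)
    return unless @ran_before && !@ran_after

    yield @current_node if selector == @current_node
  end
end

evaluations = 0
log = MiniFilter.new(%w[Header Para Para]).filter do
  evaluations += 1 # counts how often the whole block is evaluated
  before { @log << "before" }
  with("Para") { |node| @log << "with #{node}" }
  after { @log << "after" }
end

puts log.inspect
puts evaluations
```

Because the whole block runs again for every node, state that must persist across nodes, such as the counters in the numbering examples, has to be declared outside the block.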
#stop! ⇒ Object
Stop processing the document any further and output it as it is now. This is a great timesaver for filters that only act on a small number of nodes in a large document, or when you only want to set the metadata.
Note that stop! breaks off the filter immediately after outputting the document in its current state.
# File 'lib/paru/filter.rb', line 373
def stop!
  write_document
  exit
end
#with(selector) {|Node| ... } ⇒ Object
Specify what nodes to filter with a selector. If the current_node matches that selector, it is passed to the block to this with method.
# File 'lib/paru/filter.rb', line 343
def with(selector)
  return unless @ran_before && !@ran_after

  @selectors[selector] = Selector.new selector unless @selectors.key? selector
  yield @current_node if @selectors[selector].matches? @current_node, @filtered_nodes
end