Burning Highlights Into PDF

This highlighting method creates a new PDF document on the fly by injecting highlights and navigation annotations into the original document. The resulting document can be opened in any PDF viewer; term highlights are shown as long as the viewer supports PDF annotations.



PDF Bookmarks

A PDF bookmark is created for every matched term or phrase under the "Highlights" bookmark node. The title of each bookmark is the matching text.

The following options control bookmark creation:

highlighter.pdf {
  bookmarksForMatches {
    enabled = true
    title = "Highlights"
    sectionTitle = "{tag}"
    item = "{match} (pg {page})"
  }
}

If query tags are used, as in multi-query highlighting requests, bookmarks to hits are grouped into sections named after the tags. To create the tag sections at the top level, set the "title" option to an empty string.
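As a minimal sketch of the top-level tag sections described above, the fragment below overrides only the "title" option and leaves the other bookmark options at the values listed earlier. It assumes the override is placed in application.conf like any other Highlighter setting.

```
# application.conf (sketch)
highlighter.pdf {
  bookmarksForMatches {
    # An empty title drops the "Highlights" parent node,
    # so tag sections appear at the top level.
    title = ""
  }
}
```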


Customizing Navigation Elements

The settings section below lists configuration properties that define the style and text of messages and links that Highlighter adds to PDF documents:

highlighter.pdf.nav {
  noteFontSize = 12
  noteColor = AAAAAA
  linkFontSize = 12
  linkColor = 0000FF
  # Note: Placeholder {linkedPage} can be used in link labels
  prevLinkText = "< Previous Hit (pg.{linkedPage})"
  nextLinkText = "Next Hit (pg.{linkedPage}) >"
  firstMatchingPageLinkText = "Go to First Hit (pg.{linkedPage})"
  searchMatchingPage = "Page {currentMatchingPage}/{totalMatchingPages} with Hits"
}

To override a property, add it to your application.conf file.
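For illustration, an override might look like the sketch below. The color and label values are arbitrary examples rather than recommended settings; only the property names and the {linkedPage} placeholder come from the list above.

```
# application.conf (sketch; values are illustrative)
highlighter.pdf.nav {
  # Red links instead of the default blue
  linkColor = FF0000
  # Custom label; {linkedPage} is replaced with the target page number
  nextLinkText = "Next match (page {linkedPage}) >"
}
```

Properties that are not listed in the override keep the values shown above.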

Post-processing PDF

After burning highlights and navigation into the PDF, Highlighter runs the PDF through a post-processing phase. By default, this phase includes conversion to the linearized PDF format required for so-called "fast web view", but you may extend it by calling an external command for additional PDF filtering.

Post-processing options are grouped under the highlighter.pdf.postProcessing config section.


Linearized PDF is an optimized PDF format that allows PDF viewers to show the first document page as soon as it is loaded, without having to wait for the whole document to download. PDF Highlighter enables linearization for all PDF files above a certain size (e.g. 100KB). To disable this option or change the threshold, use the following options:

highlighter.pdf.postProcessing.linearizeInternal = true
highlighter.pdf.postProcessing.linearizeInternalMinFileSize = 100k
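A sketch of both adjustments, written in the grouped form of the same two options. The 500k value is only an example and assumes the size suffix works the same way as in the 100k setting above.

```
# application.conf (sketch)
highlighter.pdf.postProcessing {
  # Option A: turn internal linearization off entirely
  linearizeInternal = false

  # Option B: keep it on, but only for larger files
  # (assumed to use the same size notation as 100k)
  # linearizeInternal = true
  # linearizeInternalMinFileSize = 500k
}
```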

Running external command

You can run any external (command-line) program for additional PDF processing. The command and its parameters are given in the cmd option, which expects an array.

In the example below, we show how to linearize a PDF using the QPDF tool.

As of Highlighter v1.1, PDF linearization is supported internally, and there is no need to run an external tool to do it.

Adding QPDF to highlighting pipeline

From a system shell, we would use the following command to linearize a PDF:

qpdf --linearize input.pdf output.pdf

To do this automatically after a PDF is highlighted, add the following to Highlighter's application.conf (on Linux):

highlighter.pdf.postProcessing {
  cmd = [ "/usr/bin/qpdf", "--linearize", "{inputFile}", "{outputFile}" ]
}

On Windows it would look just a bit different:

highlighter.pdf.postProcessing {
  cmd = [ "C:/qpdf-5.1.2/bin/qpdf", "--linearize", "{inputFile}", "{outputFile}" ]
}

Placeholders {inputFile} and {outputFile} should be used as specified. Highlighter will replace them with temporary file paths.
