Creating Pipelines
Voyager's Indexing Pipeline provides functions to transform and manipulate the properties (metadata) of data records as they are added to the Index. When adding or editing a Repository, the pipeline section lets you choose an existing pipeline (if any) or create a new pipeline to apply to the Repository.
NOTE: A Repository can only be associated with one pipeline at any given time, though a given pipeline may be associated with multiple Repositories.
Creating a New Pipeline
You can assign different functions to different points in the Indexing process. Indexing has two phases, Scanning and Extraction (see this article for more information).
There are four places you can add Pipeline steps, shown in the following diagram:

- Post-Scan - steps here are executed immediately after Scanning is complete
- Pre-Extraction - steps here are executed just prior to Extraction
- Post-Extraction - steps here are executed immediately after Extraction is complete
- Pre-Index - steps here are executed just prior to the information being written to the Index
NOTE: The Post-Scan and Pre-Index steps are always executed, while Pre-Extraction and Post-Extraction steps may or may not be executed, depending on the settings for a particular Repository.
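The ordering of the four points, and the rule that only Post-Scan and Pre-Index always run, can be sketched in Python. This is an illustrative model only and not Voyager's API: the names `Stage`, `run_pipeline`, `tag`, and the `extraction_enabled` flag are assumptions introduced here, and a single boolean stands in for whatever Repository settings actually control the extraction steps.

```python
from enum import Enum
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Step = Callable[[Record], Record]


class Stage(Enum):
    """The four points where steps can be attached (hypothetical names)."""
    POST_SCAN = "post-scan"
    PRE_EXTRACTION = "pre-extraction"
    POST_EXTRACTION = "post-extraction"
    PRE_INDEX = "pre-index"


def run_pipeline(record: Record,
                 steps: Dict[Stage, List[Step]],
                 extraction_enabled: bool) -> Record:
    """Apply the configured steps in stage order.

    Post-Scan and Pre-Index always run. The two extraction stages run
    only when extraction_enabled is True (an assumed stand-in for the
    Repository settings). Scanning and Extraction themselves are not
    modeled here, only the points around them.
    """
    order = [Stage.POST_SCAN]
    if extraction_enabled:
        order += [Stage.PRE_EXTRACTION, Stage.POST_EXTRACTION]
    order.append(Stage.PRE_INDEX)

    for stage in order:
        for step in steps.get(stage, []):
            record = step(record)
    return record


def tag(name: str) -> Step:
    """Build a step that records its own name in a 'trace' field."""
    def step(record: Record) -> Record:
        return {**record, "trace": record.get("trace", []) + [name]}
    return step


# One tracing step at every point, run against a Repository
# for which extraction steps are skipped.
steps = {stage: [tag(stage.value)] for stage in Stage}
print(run_pipeline({}, steps, extraction_enabled=False)["trace"])
# prints ['post-scan', 'pre-index']
```

With `extraction_enabled=True`, the same configuration would run all four points in the order listed above.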
Adding a Pipeline Step
To add one or more steps to a Pipeline:
On the Create Pipeline or Edit Pipeline page, click Add Pipeline Steps Here.

This displays the list of all available Pipeline Steps. You can add steps to more than one point in the Pipeline.
Click Save when you are done.
Pipeline Steps
You can add one or more of the following steps to different points in the Pipeline sequence:
- S3 Blobs Upload
Uploads thumbnails to Amazon S3
- Calculate MD5 Checksum
Calculates an MD5 checksum for the source file content
- Copy Field
Copies the value of a field to another field in a document
- Rename Field
Renames a field by moving it to a new field name
- Remove Field
Removes a field from a document
- Transform Field Value
Transforms a field value
- Set Value for Field
Sets a field to a specific value
- Append Value to Field
Appends a value to a field. If the field is empty, this step behaves like Set Value for Field
- Extract Entities with NLP
Uses Natural Language Processing to extract categorized entities from the text content
- Create Thumbnail with Base Map
Creates a thumbnail using a specific, customizable base map
- GeoTag Standard
Geotags the document content with the standard Gazetteer
- Geotag Custom
Geotags the document content with a custom Gazetteer
- Calculate Centroid
Calculates the centroid of a record's geometry. Useful for generating a cluster map of points when indexing data with polygonal geometry
- Set Extent from PRJ File
Transforms spatial extent based on projection information found in a component .prj file
- Add meta XML tags
Adds meta XML tags
- Convert EXIF GPS data
Uses EXIF GPS coordinates to assign a location and spatial representation to the record
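To make the field-oriented steps in the list above more concrete, here is a minimal Python sketch that treats a record as a plain dictionary of fields. It is not Voyager's implementation: the function names and the record shape are hypothetical, and the choice to turn an appended field into a list of values is an assumption made for illustration. The MD5 helper uses the standard `hashlib` module and takes raw bytes in place of a source file.

```python
import hashlib
from typing import Any, Dict

Record = Dict[str, Any]


def copy_field(record: Record, source: str, target: str) -> Record:
    """Copy Field: duplicate the source value under the target name."""
    out = dict(record)
    if source in out:
        out[target] = out[source]
    return out


def rename_field(record: Record, source: str, target: str) -> Record:
    """Rename Field: move the value to a new field name."""
    out = dict(record)
    if source in out:
        out[target] = out.pop(source)
    return out


def remove_field(record: Record, name: str) -> Record:
    """Remove Field: drop the field if it is present."""
    out = dict(record)
    out.pop(name, None)
    return out


def set_field(record: Record, name: str, value: Any) -> Record:
    """Set Value for Field: overwrite the field with a specific value."""
    return {**record, name: value}


def append_to_field(record: Record, name: str, value: Any) -> Record:
    """Append Value to Field: behaves like set_field when the field is empty.

    Assumption: a non-empty field becomes a list of values.
    """
    current = record.get(name)
    if current in (None, "", []):
        return set_field(record, name, value)
    values = current if isinstance(current, list) else [current]
    return {**record, name: values + [value]}


def md5_checksum(content: bytes) -> str:
    """Calculate MD5 Checksum: hex digest of the source content."""
    return hashlib.md5(content).hexdigest()


record = {"title": "roads.shp"}
record = copy_field(record, "title", "name")
record = append_to_field(record, "keywords", "transport")  # empty, so it is set
record = append_to_field(record, "keywords", "roads")      # now appended
record = set_field(record, "md5", md5_checksum(b""))
print(record["keywords"])  # prints ['transport', 'roads']
print(record["md5"])       # prints d41d8cd98f00b204e9800998ecf8427e
```

Each helper returns a new dictionary instead of modifying its input, which keeps the steps easy to chain in any order.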