Vose System Requirements

System Requirements

There are three factors to take into account when determining system recommendations:

  • CPU - The number of processors (i.e., cores) on the machine
  • Memory - The RAM capacity of the machine
  • Disk - The amount of storage available to the machine

Voyager HQ

The system requirements for Voyager HQ depend directly on usage and configuration. An HQ instance contains a local Agent that can perform indexing, and if only the local Agent is used, the requirements are the same as those of Voyager Agent. If an HQ instance is not being used for indexing, system requirements can be relaxed.

CPU

Same as Agent, but a minimum of 2 CPUs when using only remote Agents.

Memory

Same as Agent, but 2-4 GB of RAM when using only remote Agents.

Disk

Storage requirements for HQ are modest. Most of the data created by HQ is transient and will be deleted after indexing is complete. Systems that index larger numbers of files (resulting in large extraction queues) will have increased storage requirements.

Voyager Agent

In Vose, Voyager Agent performs file indexing (although HQ also has the capacity to index data separately from Agents). The indexing process is highly CPU-dependent, so on machines running Agent the emphasis is primarily on CPU.

CPU

Agent is designed to use all of the CPU capacity on a machine to maximize indexing throughput and performance, so in general more CPUs are better.

  • At minimum, a machine should have 2 cores dedicated to the Agent application (i.e., not doing other work on the system).
  • For high-throughput configurations, 8 cores is optimal.

Memory

On a machine running Agent, indexing data such as Microsoft Office files requires it to process large amounts of text and other data in memory.

  • On a Java Virtual Machine running Agent that is indexing very large amounts of text, 4-6 GB of RAM is recommended.
  • On a machine that is not indexing massive amounts of text and doesn't require high throughput, 1-2 GB of RAM should suffice.

Disk

Storage requirements for Agent are modest. Most of the data created by Agent is transient and will be deleted after indexing is complete. Systems that index larger numbers of files (resulting in large extraction queues) will have increased storage requirements. 

Flex Index Node

For a Flex Index Node, the emphasis is on memory and storage. In a Flex deployment, the index is divided into subsets called shards, which are distributed across multiple servers. While Flex Index storage requirements can be significant, they are not as extreme as for Voyager Server when it is running a local index.

CPU

CPU requirements are roughly the same as those of a Voyager Server running a local index.

Memory

Memory requirements are roughly the same as those of a Voyager Server running a local index.

Disk

Disk usage for an index is hard to estimate because it is highly dependent on a number of variables, such as the type of data being indexed, file sizes, etc. The best way to determine disk requirements is to:

  1. Index a small subset (1000 documents) of your data and note the disk usage.
  2. Generously estimate the total number of documents in your “final” index and extrapolate out from the disk usage calculated in step 1.
  3. Double or triple that number to allow for the index to grow.
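
The three steps above can be sketched as a small calculation. This is only an illustration: the function name, its parameters, and the sample figures are hypothetical and are not part of Voyager, and the growth factor of 2 or 3 simply mirrors the "double or triple" advice.

```python
import math


def estimate_index_disk_bytes(sample_bytes, sample_docs,
                              expected_total_docs, growth_factor=2.0):
    """Extrapolate index disk usage from a small sample index.

    sample_bytes        -- disk used by the sample index (step 1)
    sample_docs         -- number of documents in the sample (e.g. 1000)
    expected_total_docs -- generous estimate of the final document count (step 2)
    growth_factor       -- 2 or 3, to leave room for index growth (step 3)
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    bytes_per_doc = sample_bytes / sample_docs
    projected = bytes_per_doc * expected_total_docs
    return math.ceil(projected * growth_factor)


# Hypothetical figures: a 1000-document sample that used 250 MB,
# a final index of 2 million documents, tripled for growth.
estimate = estimate_index_disk_bytes(250_000_000, 1000, 2_000_000, growth_factor=3)
print(f"Plan for roughly {estimate / 1e12:.1f} TB of index storage")
```

The estimate is only as good as the sample, so the sample should contain the same mix of data formats as the final index.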

Running out of disk space is a catastrophic event for a search index and can result in data loss. It is important to estimate disk requirements properly up front and to monitor disk usage routinely.
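
As one possible way to monitor disk usage, the following minimal sketch uses Python's standard `shutil.disk_usage` to check how much free space remains on the volume holding the index. The function names, the path, and the 20% threshold are assumptions chosen for illustration; Voyager does not prescribe them.

```python
import shutil


def free_fraction(path):
    """Return the fraction of the volume containing `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_low_on_disk(path, min_free_fraction=0.2):
    """True when free space drops below the (assumed) threshold."""
    return free_fraction(path) < min_free_fraction


# "." stands in for the index directory; substitute the real location.
if is_low_on_disk("."):
    print("Warning: index volume is running low on disk space")
```

A check like this could be run on a schedule, so that a shrinking margin is noticed well before the volume fills up.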

One factor that drastically affects disk space requirements when the index contains a mix of data formats (Office, GIS, imagery, etc.) is whether or not the Text field is stored by default. How much stored text increases index size again depends heavily on the types and sizes of the documents being indexed. The important thing to note is that storing text can increase disk usage by orders of magnitude.

In a Vose installation, the meta folder (thumbnails, metadata) must also be considered when determining disk usage. It is recommended to estimate the size of the meta folder using the same methodology as for the search index. As a rule of thumb, the meta folder takes approximately 16 MB per 1000 documents.
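
The rule of thumb for the meta folder translates into a one-line calculation. The sketch below is illustrative only: the constant and function names are hypothetical, and the 16 MB per 1000 documents figure is the approximation quoted above, not a measured value for any particular data set.

```python
# Approximate meta folder size quoted above: 16 MB per 1000 documents.
META_MB_PER_1000_DOCS = 16


def estimate_meta_folder_mb(total_docs):
    """Rough meta folder size in MB for a given document count."""
    return total_docs / 1000 * META_MB_PER_1000_DOCS


# Hypothetical index of 2 million documents.
print(f"{estimate_meta_folder_mb(2_000_000) / 1000:.0f} GB")  # prints "32 GB"
```

This amount is in addition to the space needed for the search index itself.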

Voyager Server 2.0

CPU

CPU requirements are similar to those of Agent.

Memory

Solr indexing benefits from access to more memory, so RAM requirements can be high. A typical configuration is to have 16 GB of total RAM available on the system, with 6 GB assigned to the Voyager Server JVM process. Voyager Server uses the remaining memory to store the index files.

Disk

Disk requirements for a Voyager Server are the same as those of a Flex Index Node. See the Flex Index Node section above for details.

The approximate size of the meta folder (thumbnails, metadata) is the same for both settings: 16 MB per 1000 documents.

Other Software Requirements

See Vose Software Requirements for a list of the software that Vose requires for some repository connectors, extractors, pipeline steps, and processing tasks.
