Voyager supports metadata extraction from standard XML documents using XPath queries. It supports many standard metadata specifications out of the box, and it also lets you enter your own XPath queries for specific metadata elements and map them to searchable field names within Voyager's index. These field names can already exist or be created on the fly. This topic provides an overview of the Voyager Metadata Extraction page and explains how to define XPath queries for metadata elements and how to specify field mapping parameters.
To access the Metadata Extraction page, go to Manage Voyager > Discovery > Metadata.
To map the fields, configure these parameters:
6. Properties
The XML box lets you enter an XML document against which you can test your XPath queries and their paired field names.
Step 1: Click the XML tab and paste the contents of a valid XML document here. Click Save to save the XML contents.
In this case, the element we want extracted from the XML tab is City.
Step 2: Specify values for Selector, Field Name and Action.
Since we want to extract the City element, we enter the XPath query for that element, following its path in the XML document, into the Selector box: "/metadata/metainfo/metc/cntinfo/cntaddr/city"
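To see what this selector resolves to outside of Voyager, the lookup can be sketched in Python with the standard library. This is only an illustration, not Voyager's extractor: the sample document is hypothetical (only the element path and the city value come from this example), and the helper name extract_text is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the element path and the city
# value are taken from the example in this topic.
SAMPLE_XML = """
<metadata>
  <metainfo>
    <metc>
      <cntinfo>
        <cntaddr>
          <city>Washington D.C.</city>
        </cntaddr>
      </cntinfo>
    </metc>
  </metainfo>
</metadata>
"""


def extract_text(xml_text, selector):
    """Return the text of the first element matching an absolute path, or None."""
    root = ET.fromstring(xml_text.strip())
    steps = selector.strip("/").split("/")
    # ElementTree paths are evaluated relative to the root element, so the
    # first step must name the root itself and is then dropped.
    if steps[0] != root.tag:
        return None
    if len(steps) == 1:
        return root.text
    node = root.find("/".join(steps[1:]))
    return None if node is None else node.text


city = extract_text(SAMPLE_XML, "/metadata/metainfo/metc/cntinfo/cntaddr/city")
print(city)
```

Note that ElementTree implements only a subset of XPath, so more advanced selectors (functions, axes) may need a fuller XPath library.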
Step 3: Specify the corresponding Field Name that the queried element is mapped to. Voyager automatically detects the (Data) Type for the Field Name.
For example, here the Field Name is City, whose data type is String.
Note: When selecting a field name, either select an existing field name or enter a custom field name that begins with the prefix "meta_" or "id_".
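The prefix rule for custom field names can be checked ahead of time with a small helper. This is a minimal sketch, assuming the only requirement is the prefix itself; the function name is_valid_custom_field is hypothetical and is not part of Voyager.

```python
# Prefixes that a custom field name must start with, per the note above.
ALLOWED_PREFIXES = ("meta_", "id_")


def is_valid_custom_field(name):
    """Return True if the name starts with one of the allowed prefixes."""
    # str.startswith accepts a tuple and matches any of its entries.
    return name.startswith(ALLOWED_PREFIXES)


for candidate in ("meta_city", "id_parcel", "city"):
    print(candidate, is_valid_custom_field(candidate))
```

Existing field names such as City are selected rather than typed in, so this check applies only to names you create yourself.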
Step 4: Click Test. The extractor searches the XML document for the queried metadata element, and retrieves the value for the field City. The results are presented in the Output tab.
In this example, "Washington D.C.", the value returned by the City query, is retrieved from the XML tab and displayed in the Output tab. Once a value is included in the index in this way, users can use it to search for XML documents through Voyager's search UI.
Step 5: Click Save to add the XPath query to the list.
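Conceptually, the saved list pairs each selector with a field name, and every pair is applied to a document to produce field values. The sketch below illustrates that idea in Python under stated assumptions: the sample XML, the second selector, the meta_state field name, and the apply_queries helper are all hypothetical, and how Voyager applies the list internally is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document containing only the city element.
SAMPLE_XML = (
    "<metadata><metainfo><metc><cntinfo><cntaddr>"
    "<city>Washington D.C.</city>"
    "</cntaddr></cntinfo></metc></metainfo></metadata>"
)

# Each saved entry pairs a selector with the field name it maps to.
SAVED_QUERIES = [
    ("/metadata/metainfo/metc/cntinfo/cntaddr/city", "City"),
    # Hypothetical entry with no matching element in the sample.
    ("/metadata/metainfo/metc/cntinfo/cntaddr/state", "meta_state"),
]


def apply_queries(xml_text, queries):
    """Return a dict of field name -> extracted text for selectors that match."""
    root = ET.fromstring(xml_text)
    fields = {}
    for selector, field_name in queries:
        steps = selector.strip("/").split("/")
        # Skip selectors that do not start at this document's root element
        # or that name only the root.
        if steps[0] != root.tag or len(steps) < 2:
            continue
        node = root.find("/".join(steps[1:]))
        # Elements that are missing or empty contribute no field.
        if node is not None and node.text:
            fields[field_name] = node.text.strip()
    return fields


fields = apply_queries(SAMPLE_XML, SAVED_QUERIES)
print(fields)
```

In this sketch, selectors that match nothing are skipped, so the resulting record contains only the fields that were actually found.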