Creating a Custom Python Connector in HQ

This tutorial demonstrates how to create a new Connector for HQ using Python. Connectors are required to connect to content repositories to extract fields, metadata and other pieces of information. This example will connect to a file folder and extract information from CSV files. Each row in a CSV file will be an entry in the index. In addition, this example will show how to create one-to-many relationships where the CSV file entry will be linked to each row entry.

Before getting started, it is important to know how to add and manage repositories in HQ. To learn more about repositories, see Managing Repositories.

Tutorial

Step 1

Ensure the following are installed:

  • HQ
  • Python 2.7 or higher
  • A Python IDE (recommended)

Step 2

Go to your HQ home directory:

  • On Windows, the home directory defaults to the AppData directory, typically at C:\Users\<user>\AppData\Roaming\Voyager\hq.
  • On Linux systems, the home directory defaults to the user home directory at ~/.voyager/hq.
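
As a small illustration of the two defaults above, the following Python 3 sketch resolves the default home directory for the current platform. The helper name default_hq_home is hypothetical and not part of HQ, and it only covers the default locations listed here. Any non-Windows platform is treated like Linux, which is an assumption.

```python
import ntpath
import os
import posixpath
import sys


def default_hq_home(platform=sys.platform, environ=os.environ):
    """Return the default HQ home directory (hypothetical helper, not an HQ API)."""
    if platform.startswith("win"):
        # %APPDATA% normally resolves to C:\Users\<user>\AppData\Roaming
        appdata = environ.get("APPDATA") or ntpath.join(
            os.path.expanduser("~"), "AppData", "Roaming")
        return ntpath.join(appdata, "Voyager", "hq")
    # Linux default: ~/.voyager/hq (assumed for every non-Windows platform)
    home = environ.get("HOME") or os.path.expanduser("~")
    return posixpath.join(home, ".voyager", "hq")
```

Calling default_hq_home() with no arguments uses the running platform and environment. The two parameters exist only so the logic can be checked without changing machines.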

Step 3

Create the new Python script:

  • In the home directory, open the py directory (on Windows, AppData\Roaming\Voyager\hq\py), which contains a Python module that includes a base class for Voyager connectors.
  • In the voyager directory, open the connectors directory and create a new folder named CSV.
  • Create a new Python script named csv_connector.py in the CSV directory.

Step 4

Edit the csv_connector.py script and add the following code:

Step 5

Test the connector by adding a new Repository:

  1. In HQ, go to the Repositories page and click Add New.
  2. Choose File for the Repository type and click Next. You should now see CSV Files as a file type in the drop-down menu.
  3. Enter a name.
  4. Browse for a directory containing CSV files.
  5. Click File Names to list the CSV files. This list is populated by calling the list() method in your connector class. You may select all files or choose individual files to index.
  6. In addition, you can choose to map fields. When mapping fields, you will be prompted with a list of fields to include or exclude, as well as an option to map field names to new names. In this example, CNTRY_name is mapped to name.
  7. The field mapping is populated by calling the info() method in your connector class.
  8. When you have entered all of the relevant information, click Next, then follow the rest of the workflow for creating a Repository.  
  9. When the Repository has been added, click Index Now. This will begin indexing the information by calling the scan() method. If you have one CSV file with 50 records, you should get 51 items in the index for that repository (one for the CSV file and one for each record).
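
As a rough sketch of how the three calls mentioned in the steps above relate to each other, the following self-contained Python 3 class mimics list(), info() and scan() for a folder of CSV files. It is not the tutorial's connector: the class name CsvConnectorSketch is hypothetical, it does not inherit from the Voyager base class, and the method signatures and the shape of the yielded items are assumptions chosen for illustration.

```python
import csv
import os


class CsvConnectorSketch(object):
    """Stand-in class; it does NOT inherit from the Voyager connector base class."""

    def __init__(self, folder):
        self.folder = folder

    def list(self):
        """Return the CSV file names in the folder (compare the File Names list)."""
        return sorted(name for name in os.listdir(self.folder)
                      if name.lower().endswith(".csv"))

    def info(self, file_name):
        """Return the column names of one CSV file (compare the field mapping)."""
        with open(os.path.join(self.folder, file_name), newline="") as handle:
            return next(csv.reader(handle), [])

    def scan(self, file_names=None):
        """Yield one item per CSV file, then one item per data row in that file."""
        for file_name in (file_names or self.list()):
            yield {"kind": "file", "name": file_name}
            with open(os.path.join(self.folder, file_name), newline="") as handle:
                for row_number, row in enumerate(csv.DictReader(handle), start=1):
                    # Each record keeps a reference to its parent file (one-to-many).
                    yield {"kind": "record", "parent": file_name,
                           "row": row_number, "values": dict(row)}
```

With a folder holding a single CSV file of 50 data rows, iterating over scan() in this sketch produces 51 items, matching the count described in the last step.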

Step 6

Learn more about the Python code:

Take some time to examine the Python code and read the documentation strings and comments. Each indexed item is a Python dictionary with two required keys, entry and id. The entry value also includes a fields dictionary. An item would look like this:

{
    'entry': {  
        'fields': {  
            'meta_table_name': 'world_countries.csv',  
            'name': 'Vanuatu',  
            'repository': 'r16524da57d1',  
            'format': 'text/csv-record',  
            'format_category': 'Office',  
            'fs_SQMI': '3265.07',  
            'fs_FIPS_CNTRY': 'NH',  
            'fs_STATUS': 'UNMemberState',  
            'fs_POP2005': '205754',  
            'format_type': 'Record'  
        }  
    },  
    'id': 'r16524da57d1_world_countries_248'  
}
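
To make the structure above concrete, here is a minimal sketch of how a dictionary of this shape could be assembled from one CSV row. The function build_entry and its parameters are hypothetical. The fs_ prefix for unmapped columns and the id pattern (repository id, file name without extension, row number) are inferred from the example above and are not a documented rule.

```python
import os


def build_entry(repository_id, csv_file_name, row_number, row, field_map=None):
    """Build one index entry for a CSV row, shaped like the example above."""
    field_map = field_map or {}
    fields = {
        "meta_table_name": csv_file_name,
        "repository": repository_id,
        "format": "text/csv-record",
        "format_category": "Office",
        "format_type": "Record",
    }
    for column, value in row.items():
        if column in field_map:
            # Mapped columns use the new name as-is (e.g. CNTRY_name -> name).
            fields[field_map[column]] = value
        else:
            # Assumption: unmapped columns get an "fs_" prefix, as in the example.
            fields["fs_" + column] = value
    table = os.path.splitext(csv_file_name)[0]
    return {
        "entry": {"fields": fields},
        "id": "%s_%s_%d" % (repository_id, table, row_number),
    }
```

For instance, build_entry('r16524da57d1', 'world_countries.csv', 248, {'CNTRY_name': 'Vanuatu', 'SQMI': '3265.07'}, {'CNTRY_name': 'name'}) gives an id of 'r16524da57d1_world_countries_248' and a fields dictionary containing 'name' and 'fs_SQMI'.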

Additional Information

  • After creating and saving a repository, you can view the configuration by opening the JSON file located in the config directory in your HQ home location.
  • A log file is generated each time you index your repository. This log file is created in the HQ home directory under logs/py/connector/<date>/…
  • To add status messages to this log file, call self.report(message) within your connector script. This is also helpful for debugging.
  • For a more advanced connector, please examine the code in the Geodatabase connector located in the gdb directory.
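
As a convenience for the first point above, the following sketch loads every JSON file found in the config directory so that a saved repository configuration can be inspected from Python. The helper name load_repository_configs is hypothetical, and it assumes the JSON files sit directly inside config rather than in subfolders.

```python
import glob
import json
import os


def load_repository_configs(hq_home):
    """Return {file name: parsed JSON} for each .json file in <hq_home>/config."""
    configs = {}
    for path in sorted(glob.glob(os.path.join(hq_home, "config", "*.json"))):
        with open(path) as handle:
            configs[os.path.basename(path)] = json.load(handle)
    return configs
```

Passing the result to json.dumps with indent=2 prints a readable copy of each configuration.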

 
