Using Custom JAR Files with Search

Search for CDH supports custom plug-in code. You load classes into JAR files and then configure Search to find these files. To correctly deploy custom JARs, ensure that:
  • Custom JARs are pushed to the same location on all hosts in your cluster that are hosting Cloudera Search (Solr Service).
  • Supporting configuration files direct Search to find the custom JAR files.
  • Any required configuration files such as schema.xml or solrconfig.xml reference the custom JAR code.

The following procedure describes how to use custom JARs. Some cases may not require completion of every step. For example, indexer tools that support passing JARs as arguments may not require modifying xml files. However, completing all configuration steps helps ensure the custom JARs are used correctly in all cases.

  1. Place your custom JAR in the same location on all hosts in your cluster.
  2. For all collections where custom JARs will be used, modify solrconfig.xml to include references to the new JAR files.

    These directives can include explicit or relative references and can use wildcards. In the solrconfig.xml file, add <lib> directives to indicate the JAR file locations or <path> directives for specific jar files:

    <lib path="/usr/lib/solr/lib/MyCustom.jar" />

    or

    <lib dir="/usr/lib/solr/lib" />

    or

    <lib dir="../../../myProject/lib" regex=".*\.jar" />
  3. For all collections in which custom JARs will be used, reference custom JAR code in the appropriate Solr configuration file. The two configuration files that most commonly reference code in custom JARs are solrconfig.xml and schema.xml.
  4. For all collections in which custom JARs will be used, use solrctl to update ZooKeeper's copies of configuration files such as solrconfig.xml and schema.xml:
    solrctl instancedir --update name path
    • name specifies the instancedir associated with the collection using solrctl instancedir --create.
    • path specifies the directory containing the collection's configuration files.
    For example:
    solrctl instancedir --update collection1 $HOME/solr_configs
  5. For all collections in which custom JARs will be used, use RELOAD to refresh information:
    http://<hostname>:<port>/solr/admin/collections?action=RELOAD&name=collectionname
    For example:
    http://example.com:8983/solr/admin/collections?action=RELOAD&name=collection1

    When the RELOAD command is issued to any host that hosts a collection, that host sends subcommands to all replicas in the collection. All relevant hosts refresh their information, so this command must be issued once per collection.

  6. Ensure that the class path includes the location of the custom JAR file:
    1. For example, if you store the custom JAR file in /opt/myProject/lib/, add that path as a line to the ~/.profile for the Solr user.
    2. Restart the Solr service to reload the PATH variable.
    3. Repeat this process of updating the PATH variable for all hosts.
The system is now configured to find custom JAR files. Some command-line tools included with Cloudera Search support specifying JAR files. For example, when using MapReduceIndexerTool, use the --libjars option to specify JAR files to use. Tools that support specifying custom JARs include:
  • MapReduceIndexerTool
  • Lily HBase Indexer
  • HDFSFindTool
  • CrunchIndexerTool
  • Flume indexing