Navigator Metadata Server Tuning
This page can help you tune your Navigator Metadata Server instance for peak performance. See also Setting up Navigator Role Instances on Different Hosts.
Memory Sizing Considerations for Navigator Metadata Server
Two activities drive Navigator Metadata Server memory use:
- Extracting metadata from the cluster and creating relationships among the metadata entities (facilitating lineage)
- Querying to find entities
Navigator Metadata Server uses Solr to index, store, and query metadata. Indexing occurs during the extraction process, and the resulting Solr documents (the data structure that Solr uses for indexing and search) are stored in the specified Navigator Metadata Server Storage Dir. Because the metadata is indexed, querying is fast and efficient. However, Solr indexing runs in-process with Navigator Metadata Server, so the amount of memory configured for the Java heap is critical.
The Java heap size that Navigator Metadata Server requires correlates directly with the number of Solr documents in the index. As the index grows, the heap setting may need to be increased. To calculate the optimal Java heap setting for your system, see Estimating Optimal Java Heap Size Using Solr Document Counts.
Estimating Optimal Java Heap Size Using Solr Document Counts
The Navigator Metadata Server log reports the number of Solr documents in each core, as in these entries:
2016-11-11 09:24:58,013 INFO com.cloudera.nav.server.NavServerUtil: Found 68813088 documents in solr core nav_elements
2016-11-11 09:24:58,705 INFO com.cloudera.nav.server.NavServerUtil: Found 78813930 documents in solr core nav_relations
These counts can be used to estimate optimal Java heap size for the server, as detailed below. If your normal setup provides less than 8 GB for the Navigator Metadata Server heap, consider increasing the heap before performing an upgrade. See Setting the Navigator Metadata Server Java Heap Size for details about using Cloudera Manager Admin Console to modify this setting when needed.
- Open the log file for Navigator Metadata Server. By default, logs are located in /var/log/cloudera-scm-navigator.
- Find the line that reports the number of documents in solr core nav_elements.
- Find the line that reports the number of documents in solr core nav_relations.
- Multiply the total number of element documents by 200 bytes per document and add to a baseline of 2 GB:
(num_nav_elements * 200 bytes) + 2 GB
For example, using the log shown above, the recommended Java heap size is approximately 14–15 GiB:
(68813088 * 200 bytes) + 2 GB
= 13762617600 bytes + 2 GB
= ~12.8 GiB + ~1.8 GiB
= ~14–15 GiB
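The estimate above can also be scripted. The following is a minimal sketch, not a Cloudera-provided tool: it assumes the log lines have exactly the format shown in the excerpt, treats the 2 GB baseline as decimal gigabytes (about 1.86 GiB), and uses hypothetical helper names (`solr_counts`, `heap_bytes`).

```python
import re

# Matches the document-count lines shown in the log excerpt.
COUNT_LINE = re.compile(r"Found (\d+) documents in solr core (\w+)")

BYTES_PER_DOC = 200         # per-document cost from the formula
BASELINE_BYTES = 2 * 10**9  # 2 GB baseline, assumed decimal (about 1.86 GiB)
GIB = 2**30


def solr_counts(log_text):
    """Return {core_name: document_count}, keeping the last value seen per core."""
    counts = {}
    for match in COUNT_LINE.finditer(log_text):
        counts[match.group(2)] = int(match.group(1))
    return counts


def heap_bytes(num_nav_elements):
    """(num_nav_elements * 200 bytes) + 2 GB"""
    return num_nav_elements * BYTES_PER_DOC + BASELINE_BYTES


sample_log = """\
2016-11-11 09:24:58,013 INFO com.cloudera.nav.server.NavServerUtil: Found 68813088 documents in solr core nav_elements
2016-11-11 09:24:58,705 INFO com.cloudera.nav.server.NavServerUtil: Found 78813930 documents in solr core nav_relations
"""

counts = solr_counts(sample_log)
estimate = heap_bytes(counts["nav_elements"])
print(f"{estimate} bytes = {estimate / GIB:.2f} GiB")
```

For the sample counts, the script reports about 14.68 GiB, which falls in the 14–15 GiB range given in the worked example. To use it on a real system, read the log file from the log directory into `log_text` instead of using `sample_log`.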
Reducing the Metadata collected by Navigator Metadata Server
You can choose to leave some file system paths out of the scope of information tracked in Cloudera Navigator. Cloudera Manager provides a blacklist where you can specify file system paths that should be filtered out of metadata extracted from HDFS and S3.
To filter file system paths from tracked metadata:
- Log in to Cloudera Manager Admin Console.
- Select .
- Click the Configuration tab.
- Select Navigator Metadata Server for the Scope filter.
- Select Extractor Filter for the Category filter.
- Enable the filter:
- HDFS Filter Enable
- S3 Filter Enable
- In the appropriate filter list, include the file system path that you want to exclude from Navigator Metadata Server tracking:
- HDFS Filter Blacklist
- S3 Filter list
The entry can be a specific path or a Java regular expression specifying a path. For example, to specify a directory and all subdirectories, use an expression such as
/path/to/dir(?:/.*)?
- To exclude more paths, add further entries to the filter list.
- For S3, set the S3 Filter Default Action to DISCARD.
- Click Save Changes.
- Click the Instances tab.
- Restart the role.
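Before saving a filter entry, it can help to check which paths an expression would cover. The sketch below tests the example expression from the procedure against a few paths. It rests on two assumptions: that the filter must match the entire path (the text does not say whether matching is full or partial), and that Python's `re` module handles this simple expression the same way a Java regular expression engine would. The paths and the `is_filtered` helper are hypothetical.

```python
import re

# Example expression from the procedure; /path/to/dir is a placeholder path.
EXCLUDE = re.compile(r"/path/to/dir(?:/.*)?")


def is_filtered(path):
    # Assumption: the expression must match the whole path, not just part of it.
    return EXCLUDE.fullmatch(path) is not None


for candidate in [
    "/path/to/dir",
    "/path/to/dir/sub/file.csv",
    "/path/to/dir2",
    "/other/path/to/dir",
]:
    print(candidate, is_filtered(candidate))
```

Under these assumptions, the directory itself and everything beneath it match, while a sibling such as /path/to/dir2 does not, because the optional group requires a `/` immediately after `dir`.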
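Java heap size formulas (see Estimating Optimal Java Heap Size Using Solr Document Counts), by scenario:
- Normal operation
(num_nav_elements * 200 bytes) + 2 GB
- Upgrade between CM 5.9 and 5.10
((num_nav_elements + num_nav_relations) * 200 bytes) + 2 GB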
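To see how much the upgrade formula raises the requirement, the two formulas can be evaluated side by side. This is a small sketch using the document counts from the sample log; the function names are hypothetical and the 2 GB baseline is assumed to be decimal gigabytes.

```python
BYTES_PER_DOC = 200
BASELINE_BYTES = 2 * 10**9  # 2 GB baseline, assumed decimal
GIB = 2**30


def normal_heap_bytes(num_nav_elements):
    """(num_nav_elements * 200 bytes) + 2 GB"""
    return num_nav_elements * BYTES_PER_DOC + BASELINE_BYTES


def upgrade_heap_bytes(num_nav_elements, num_nav_relations):
    """((num_nav_elements + num_nav_relations) * 200 bytes) + 2 GB"""
    return (num_nav_elements + num_nav_relations) * BYTES_PER_DOC + BASELINE_BYTES


# Counts taken from the sample log entries.
elements, relations = 68813088, 78813930

normal = normal_heap_bytes(elements)
upgrade = upgrade_heap_bytes(elements, relations)
print(f"normal:  {normal / GIB:.2f} GiB")
print(f"upgrade: {upgrade / GIB:.2f} GiB")
```

With these counts the upgrade estimate is roughly twice the normal one (about 29.36 GiB against about 14.68 GiB), because relation documents are counted as well.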
Purging the Navigator Metadata Server of Deleted and Stale Metadata
Administrators can manage the Java heap requirements by clearing the Navigator Metadata Server of stale and deleted metadata prior to an upgrade or whenever system performance seems slow. Purging stale and deleted metadata also helps speed up display of lineage diagrams. Purge fully removes metadata that has been deleted.
For Cloudera Navigator 2.11 and higher: the Cloudera Navigator console (Administration tab) provides a fully configurable Purge Settings page. See Managing Metadata Storage with Purge for details.
For Cloudera Navigator 2.10 and earlier: the Purge capability can be invoked directly using the Cloudera Navigator APIs. See Using the Purge APIs for Metadata Maintenance Tasks for details.