Setting Up HCatalog Using the Command Line
As of CDH 5, HCatalog is part of Apache Hive.
HCatalog is a table and storage management layer for Hadoop that makes the same table information available to Hive, Pig, MapReduce, and Sqoop. Table definitions are maintained in the Hive metastore, which HCatalog requires. WebHCat allows you to access HCatalog through an HTTP (REST-style) interface.
This page explains how to install and configure HCatalog and WebHCat. For Sqoop, see Sqoop-HCatalog Integration in the Sqoop User Guide.
HCatalog Prerequisites
- An operating system supported by CDH 5.
- Oracle JDK.
- The Hive metastore and its database. The Hive metastore must be running in remote mode (as a service).
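Before continuing, it can help to confirm that a remote metastore service is actually listening. The following is a minimal sketch, not part of the CDH tooling: the function name metastore_is_reachable is hypothetical, and port 9083 is assumed because it is the port used in the hive.metastore.uris example on this page. A successful TCP connection only shows that something is listening on that port, not that it is a healthy metastore.

```python
import socket


def metastore_is_reachable(host, port=9083, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port 9083 is an assumption taken from the example URI on
    this page; pass the port your metastore service actually uses.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, metastore_is_reachable("hive.examples.com") should return True from any host that will run WebHCat, Pig, or MapReduce with HCatalog, assuming the metastore runs on that host.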
Configuration Change on Hosts Used with HCatalog
You must update /etc/hive/conf/hive-site.xml on every host where WebHCat will run, and on every host where Pig or MapReduce will be used with HCatalog, so that metastore clients know where to find the metastore.
Add or edit the hive.metastore.uris property as follows:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://<hostname>:9083</value>
</property>
where <hostname> is the host where the HCatalog server components are running, for example hive.examples.com.
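Because the same change has to be made on several hosts, a small check can catch a missing or mistyped value. The sketch below is one possible approach under stated assumptions: the helper names read_metastore_uris and check_metastore_uris are hypothetical, and the file is assumed to follow the usual Hadoop layout in which property elements sit directly under the root element. Note that a file still containing the literal <hostname> placeholder is not well-formed XML, so parsing it raises an error.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_metastore_uris(hive_site_path):
    """Return the URIs listed in hive.metastore.uris, or [] if it is unset."""
    root = ET.parse(hive_site_path).getroot()
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        if name == "hive.metastore.uris":
            value = prop.findtext("value", default="")
            # The value may be a comma-separated list of metastore URIs.
            return [u.strip() for u in value.split(",") if u.strip()]
    return []


def check_metastore_uris(uris):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not uris:
        problems.append("hive.metastore.uris is missing or empty")
    for uri in uris:
        parsed = urlparse(uri)
        if parsed.scheme != "thrift":
            problems.append(f"{uri}: expected the thrift:// scheme")
            continue
        if not parsed.hostname:
            problems.append(f"{uri}: no host name")
            continue
        try:
            port = parsed.port  # raises ValueError if the port is not numeric
        except ValueError:
            port = None
        if port is None:
            problems.append(f"{uri}: missing or invalid port")
    return problems
```

Calling check_metastore_uris(read_metastore_uris("/etc/hive/conf/hive-site.xml")) on each host returns an empty list when the property is present and every URI has the thrift scheme, a host name, and a numeric port. It does not verify that the metastore is actually running at that address.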