Upgrading Hive

Upgrade Hive on every host on which it runs, including both servers and clients.

Checklist to Help Ensure Smooth Upgrades

The following best practices for configuring and maintaining Hive will help ensure that upgrades go smoothly.
  • Configure periodic backups of the metastore database. Use mysqldump, or the equivalent for your vendor if you are not using MySQL.
  • Make sure that datanucleus.autoCreateSchema is set to false (for all database types) and that datanucleus.fixedDatastore is set to true (for MySQL and Oracle) in all hive-site.xml files. See the configuration instructions for more information about setting the properties in hive-site.xml.

  • Insulate the metastore database from users by running the metastore service in Remote mode. If you do not follow this recommendation, make sure you remove DROP, ALTER, and CREATE privileges from the Hive user configured in hive-site.xml. See Configuring the Hive Metastore for CDH for complete instructions for each type of supported database.
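
As a rough sketch of the first two checklist items, the shell functions below read a property value out of a hive-site.xml file and dump a MySQL metastore database with mysqldump. The function names, the dated output file name, and the assumption that each <value> element directly follows its <name> element are illustrative choices, not part of Hive or CDH. The sed-based lookup is not a real XML parser.

```shell
# Print the <value> of one property from a hive-site.xml style file.
# Assumption: <value> directly follows <name> (only whitespace between them).
# $1 = path to the file, $2 = property name
get_hive_prop() {
  { tr -d '\n' < "$1"; echo; } |
    sed -n "s:.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*:\1:p"
}

# Dump a MySQL metastore database to a dated file in the current directory.
# mysqldump prompts for the password because -p is given without a value.
# $1 = database user, $2 = metastore database name (both are placeholders)
backup_metastore() {
  out="$2-$(date +%Y%m%d).sql"
  mysqldump -u "$1" -p "$2" > "$out" && echo "$out"
}
```

For example, get_hive_prop /etc/hive/conf/hive-site.xml datanucleus.autoCreateSchema should print false on a host configured as recommended. The path /etc/hive/conf/hive-site.xml is an assumed location, so substitute your own.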

Upgrading Hive from a Lower Version of CDH 5

The instructions that follow assume that you are upgrading Hive as part of a CDH 5 upgrade, and have already performed the steps under Upgrading from an Earlier CDH 5 Release to the Latest Release.

To upgrade Hive from a lower version of CDH 5, proceed as follows.

Step 1: Stop all Hive Processes and Daemons

  1. Stop any HiveServer processes that are running:
    $ sudo service hive-server stop 
  2. Stop any HiveServer2 processes that are running:
    $ sudo service hive-server2 stop 
  3. Stop the metastore:
    $ sudo service hive-metastore stop 
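
The three stop commands above can be wrapped in a small loop so that a host running only some of the daemons does not abort the procedure. This is a minimal sketch. The function name and the messages are made up for illustration, and a non-zero exit status from service is only reported, not diagnosed.

```shell
# Stop each Hive daemon named in Step 1 on the local host.
# A failed stop (for example, the service is not installed here) is
# reported and the loop continues with the next service.
stop_hive_services() {
  for svc in hive-server hive-server2 hive-metastore; do
    if sudo service "$svc" stop; then
      echo "stopped: $svc"
    else
      echo "could not stop: $svc"
    fi
  done
}
```

Run stop_hive_services on each server host before installing the new packages.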

Step 2: Install the new Hive version on all hosts (Hive servers and clients)

See Installing Hive.

Step 3: Verify that the Hive Metastore is Properly Configured

See Configuring the Hive Metastore for CDH for detailed instructions.

Step 4: Upgrade the Metastore Schema

To upgrade the Hive metastore schema, you can use either the Hive schematool or the schema upgrade scripts that are provided with the Hive package. Cloudera recommends that you use the schematool.

Using Hive schematool (Recommended):

The Hive distribution includes schematool, a command-line tool for manipulating the Hive metastore schema. The tool can initialize the metastore schema for the current Hive version, and it can upgrade the schema from an older version to the current one. You must add properties to hive-site.xml before you can use it. See Using the Hive Schema Tool in CDH for information about how to set the tool up and for usage examples.

To upgrade the schema, use the -upgradeSchemaFrom option to specify the version of the schema you are currently using. For example, if you are upgrading a MySQL metastore schema from Hive 0.13.1, use the following syntax:
$ schematool -dbType mysql -passWord <db_user_pswd> -upgradeSchemaFrom 0.13.1 \
  -userName <db_user_name>
Metastore connection URL: jdbc:mysql://<cluster_address>:3306/<user_name>?useUnicode=true&characterEncoding=UTF-8
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: <user_name>
Starting upgrade metastore schema from version 0.13.1 to <new_version>
Upgrade script upgrade-0.13.1-to-<new_version>.mysql.sql
Completed pre-0-upgrade-0.13.1-to-<new_version>.mysql.sql
Completed upgrade-0.13.1-to-<new_version>.mysql.sql
schemaTool completed
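
Before applying an upgrade, it can help to preview it. The wrapper below is a sketch that passes an optional extra flag through to schematool. It assumes the -dryRun option, which as far as I know lists the upgrade scripts without executing them; confirm with schematool -help for your Hive version. The function name and argument order are hypothetical. Note that a password given on the command line may be visible to other users of the host.

```shell
# Run a schematool upgrade, optionally with one extra flag such as -dryRun.
# $1 = database type (for example mysql)
# $2 = schema version currently in the metastore
# $3 = database user, $4 = database password
# $5 = optional extra flag; omitted from the command line when empty
run_schema_upgrade() {
  schematool -dbType "$1" -userName "$3" -passWord "$4" \
    -upgradeSchemaFrom "$2" ${5:+"$5"}
}
```

A cautious sequence is to call run_schema_upgrade mysql 0.13.1 <db_user_name> <db_user_pswd> -dryRun first, review the scripts it lists, and then repeat the call without the last argument.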
          

Using Schema Upgrade Scripts:

Navigate to the directory where the schema upgrade scripts are located:

  • If you installed CDH with parcels, the scripts are in the following location:
    /opt/cloudera/parcels/CDH/lib/hive/scripts/metastore/upgrade/<database_name>
                      
  • If you installed CDH with packages, the scripts are in the following location:
    /usr/lib/hive/scripts/metastore/upgrade/<database_name>
                      

For example, if your Hive metastore is MySQL and you installed CDH with packages, navigate to /usr/lib/hive/scripts/metastore/upgrade/mysql.

Run the appropriate schema upgrade scripts in order. Start with the script for your database type and Hive version, and run all subsequent scripts.

For example, if you are currently running Hive 0.13.1 with MySQL and upgrading to Hive 1.1.0, start with the script for 0.13.0 to 0.14.0 for MySQL, and then run the script for Hive 0.14.0 to 1.1.0.

For more information about using the scripts to upgrade the schema, see the README in the directory with the scripts.
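
The rule "start with the script for your version and run all subsequent scripts" amounts to following a chain of file names. The sketch below only lists the scripts in the order they would be run. It assumes the scripts are named upgrade-<from>-to-<to>.<database_name>.sql, as the schematool output earlier in this section suggests; the function name is hypothetical. How each script is executed is left to the README.

```shell
# List the upgrade scripts to run, in order, starting from a schema version.
# $1 = script directory, $2 = database name (for example mysql),
# $3 = schema version to start from (for example 0.13.0)
list_upgrade_scripts() {
  dir=$1 db=$2 cur=$3
  while :; do
    next=""
    # Find the script that upgrades from the current version, if any.
    for f in "$dir"/upgrade-"$cur"-to-*."$db".sql; do
      [ -e "$f" ] || continue
      next=$f
      break
    done
    [ -n "$next" ] || break          # no further script: the chain ends
    echo "$next"
    base=${next##*/}                 # strip the directory part
    cur=${base#upgrade-"$cur"-to-}   # drop the "upgrade-<from>-to-" prefix
    cur=${cur%."$db".sql}            # drop the ".<db>.sql" suffix
  done
}
```

For the example above, list_upgrade_scripts /usr/lib/hive/scripts/metastore/upgrade/mysql mysql 0.13.0 would print the 0.13.0 to 0.14.0 script followed by the 0.14.0 to 1.1.0 script, provided both files are present.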

Step 5: Start the Metastore, HiveServer2, and Beeline

See the instructions for starting the metastore, starting HiveServer2, and using Beeline.
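
As a sketch of Step 5, the function below starts the services that Step 1 stopped, using the same service names with start in place of stop. Starting the metastore first is an assumption that suits a remote-metastore setup; the function name is hypothetical.

```shell
# Start the Hive daemons in dependency order and stop at the first failure.
start_hive_services() {
  # Metastore first, so that HiveServer2 can connect to it when it comes up.
  for svc in hive-metastore hive-server2; do
    sudo service "$svc" start || {
      echo "failed to start: $svc" >&2
      return 1
    }
  done
}
```

Once HiveServer2 is up, a Beeline session can be opened with a command such as beeline -u jdbc:hive2://localhost:10000, where the host and port are assumptions (10000 is the usual HiveServer2 default).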

The upgrade is now complete.

Troubleshooting: If you failed to upgrade the metastore

If you failed to upgrade the metastore as instructed above, proceed as follows.

  1. Identify the problem.
    The symptoms are as follows:
    • Hive stops accepting queries.
    • In a cluster managed by Cloudera Manager, the Hive Metastore canary fails.
    • An error such as the following appears in the Hive Metastore Server logs:
      Hive Schema version 0.13.0 does not match metastore's schema version 0.12.0 Metastore is not upgraded or corrupt.
  2. Resolve the problem.
    If the problem you are having matches the symptoms just described, do the following:
    1. Stop all Hive services; for example:
      $ sudo service hive-server2 stop
      $ sudo service hive-metastore stop
    2. Run the Hive schematool, as instructed under Using Hive schematool (Recommended) in Step 4.
      Make sure the value you use for the -upgradeSchemaFrom option matches the schema version currently in the metastore (not the new version). For example, if the error message in the log is
      Hive Schema version 0.13.0 does not match metastore's schema version 0.12.0 Metastore is not upgraded or corrupt.
      then the value of -upgradeSchemaFrom must be 0.12.0.
    3. Restart the Hive services you stopped.
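
To avoid misreading the two version numbers in the error message, the value for -upgradeSchemaFrom can be pulled out of the log mechanically. This is a minimal sketch that assumes the message wording shown above; the function name is hypothetical, and the log file location depends on your installation.

```shell
# Print the metastore's schema version from the most recent matching
# error line. Reads the named files, or standard input if none are given.
schema_version_from_log() {
  sed -n "s/.*does not match metastore's schema version \([0-9][0-9.]*\).*/\1/p" "$@" |
    tail -n 1
}
```

For the message shown above, the function prints 0.12.0, which is the value to pass to -upgradeSchemaFrom.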