Overview of Apache Hive Security in CDH
Securing Hive involves configuring or enabling:
-
Authentication for Hive Metastore, HiveServer2, and all Hive clients with your deployment of LDAP and Kerberos for your cluster.
See Hive Authentication, HiveServer2 Security Configuration, Hive Metastore Server Security Configuration, and Using Hive to Run Queries on a Secure HBase Server for details.
-
Authorization for HiveServer2 using role-based, fine-grained authorization that is implemented with Apache Sentry policies. You must configure HiveServer2 authentication before you configure authorization because Apache Sentry depends on an underlying authentication framework to reliably identify the requesting user.
See Sentry Policy File Authorization, User to Group Mapping, and Authorization Privilege Model for Hive and Impala for details. Configure Sentry permissions using GRANT and REVOKE statements using the HiveServer2 client, the Beeline CLI. See Hive SQL Syntax for Use with Sentry for details. -
Encryption to secure the network connection between HiveServer2 and Hive clients.
Starting with CDH 5.5, encryption for HiveServer2 clients has been decoupled from the authentication mechanism. This means you can use either SASL QOP or TLS/SSL to encrypt traffic between HiveServer2 and its clients, irrespective of whether Kerberos is being used for authentication. Previously, the JDBC client drivers only supported SASL QOP encryption on Kerberos-authenticated connections.
SASL QOP encryption is better suited for encrypting RPC communication and may result in performance issues when dealing with large amounts of data. Move to using TLS/SSL encryption to avoid such issues.
This topic describes how to set up encrypted communication between HiveServer2 and its JDBC/ODBC client drivers.
See Configuring Encrypted Communication Between HiveServer2 and Client Drivers for details.