Step 1: Install Cloudera Manager and CDH
Cloudera strongly recommends that you install and configure the Cloudera Manager Server, the Cloudera Manager Agents, and CDH to set up a fully functional CDH cluster before you try to configure Kerberos authentication for the cluster.
Required user:group Settings for Security
This User | Runs These Roles |
---|---|
hdfs | NameNode, DataNodes, and Secondary NameNode |
mapred | JobTracker and TaskTrackers (MR1) and Job History Server (YARN) |
yarn | ResourceManager and NodeManagers (YARN) |
oozie | Oozie Server |
hue | Hue Server, Beeswax Server, Authorization Manager, and Job Designer |
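As a quick sanity check, you can look up the accounts from the table on a cluster host. The following is a minimal sketch that uses Python's standard `pwd` module; the `missing_users` helper and its `lookup` parameter are illustrative names, not part of Cloudera Manager or CDH. Which accounts actually exist depends on the components installed on that host.

```python
import pwd

# Accounts listed in the table above.
REQUIRED_USERS = ["hdfs", "mapred", "yarn", "oozie", "hue"]


def missing_users(users, lookup=pwd.getpwnam):
    """Return the names in `users` that have no account on this host."""
    missing = []
    for name in users:
        try:
            lookup(name)  # pwd.getpwnam raises KeyError for unknown accounts
        except KeyError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_users(REQUIRED_USERS)
    if absent:
        print("Missing accounts:", ", ".join(absent))
    else:
        print("All required accounts exist.")
```

The `lookup` parameter exists only so that the check can be exercised without touching the real account database.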
When you install the Cloudera Manager Server on the server host, a new Unix user account called cloudera-scm is created automatically to support security. The Cloudera Manager Server uses this account to create host principals and deploy the keytabs on your cluster.
Depending on whether you installed CDH and Cloudera Manager at the same time, see one of the following sections for information on configuring directory ownership on cluster hosts:
New Installation, Cloudera Manager and CDH Together
Installing a new Cloudera Manager cluster with CDH components at the same time saves you some of the user:group configuration that is required when you install them separately. The installation process creates the necessary user accounts on the Linux host system for the service daemons. At the end of the installation process, when each cluster node starts up, the Cloudera Manager Agent process on the host automatically configures the directory ownership as shown in the table below, and the Hadoop daemons can then automatically set permissions for their respective directories. Do not change the directory owners on the cluster. They must be configured exactly as shown below.
Directory Specified in this Property | Owner |
---|---|
dfs.name.dir | hdfs:hadoop |
dfs.data.dir | hdfs:hadoop |
mapred.local.dir | mapred:hadoop |
mapred.system.dir in HDFS | mapred:hadoop |
yarn.nodemanager.local-dirs | yarn:yarn |
yarn.nodemanager.log-dirs | yarn:yarn |
oozie.service.StoreService.jdbc.url (if using Derby) | oozie:oozie |
name in the [[database]] section of hue.ini | hue:hue |
javax.jdo.option.ConnectionURL | hue:hue |
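To confirm that local directories match the table, you can compare their actual owners with the expected user:group values. The following is a minimal sketch that uses only the Python standard library; the helper names and the example paths are hypothetical, so substitute the directories configured for each property on your hosts. It covers the local filesystem only, so `mapred.system.dir` in HDFS has to be checked separately.

```python
import grp
import os
import pwd


def owner_of(path):
    """Return the owner of `path` as a 'user:group' string."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)  # no account entry: fall back to the numeric ID
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return f"{user}:{group}"


def find_mismatches(expected):
    """Map each path whose owner differs from expected[path] to what was found."""
    mismatches = {}
    for path, want in expected.items():
        try:
            have = owner_of(path)
        except FileNotFoundError:
            have = "(missing)"
        if have != want:
            mismatches[path] = have
    return mismatches


if __name__ == "__main__":
    # Hypothetical paths, shown only for illustration.
    expected = {
        "/data/1/dfs/nn": "hdfs:hadoop",       # dfs.name.dir
        "/data/1/dfs/dn": "hdfs:hadoop",       # dfs.data.dir
        "/data/1/mapred/local": "mapred:hadoop",  # mapred.local.dir
    }
    for path, found in find_mismatches(expected).items():
        print(f"{path}: expected {expected[path]}, found {found}")
```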
Existing CDH Cluster Installed Before Cloudera Manager Installation
If you were using HDFS and running MapReduce jobs in an existing installation of CDH before you installed Cloudera Manager, you must manually configure the directory ownership so that the Hadoop daemons can set the appropriate permissions on each directory. Configure the user:group ownership of each directory exactly as shown in the table above.
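One way to prepare the manual change is to generate the ownership commands first and review them before running anything. The following is a minimal sketch under stated assumptions: the `chown_commands` helper and the example paths are hypothetical, the recursive `-R` option is an assumption about what an existing data directory needs, and the HDFS command is assumed to run as the `hdfs` user through `sudo`. The script only prints the commands; it does not change any ownership.

```python
import shlex


def chown_commands(local_dirs, hdfs_dirs=None):
    """Build shell commands that set user:group ownership.

    local_dirs and hdfs_dirs map a directory path to its 'user:group' owner.
    """
    commands = []
    for path, owner in local_dirs.items():
        # Local filesystem directory (assumption: apply recursively).
        commands.append(f"chown -R {shlex.quote(owner)} {shlex.quote(path)}")
    for path, owner in (hdfs_dirs or {}).items():
        # Directory stored in HDFS, such as mapred.system.dir.
        commands.append(
            f"sudo -u hdfs hadoop fs -chown -R {shlex.quote(owner)} {shlex.quote(path)}"
        )
    return commands


if __name__ == "__main__":
    # Hypothetical paths: use the values of your own configuration properties.
    local = {
        "/data/1/dfs/nn": "hdfs:hadoop",          # dfs.name.dir
        "/data/1/mapred/local": "mapred:hadoop",  # mapred.local.dir
    }
    in_hdfs = {"/tmp/mapred/system": "mapred:hadoop"}  # mapred.system.dir
    for command in chown_commands(local, in_hdfs):
        print(command)
```

Printing the commands rather than executing them keeps the sketch safe to run on any host and leaves the decision to apply each change with the administrator.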