Creating a CDH Cluster Using a Cloudera Manager Template
- Duplicate clusters for use in developer, test, and production environments.
- Quickly create a cluster for a specific workload.
- Reproduce a production cluster for testing and debugging.
- Export the cluster configuration from the source cluster. The exported configuration is a JSON file that details all of the configurations of the cluster. The JSON file includes an instantiator section that contains some values you must provide before creating the new cluster.
- Set up the hosts for the new cluster by installing Cloudera Manager agents and the JDK on all hosts. For secure clusters, also configure a Kerberos key distribution center (KDC) in Cloudera Manager.
- Create any local repositories required for the cluster.
- Complete the instantiator section of the cluster configuration JSON document to create a template.
- Import the cluster template to the new cluster.
Exporting the Cluster Configuration
To create a cluster template, you begin by exporting the configuration from the source cluster. The cluster must be running and managed by Cloudera Manager 5.7 or higher.
- Any Host Templates you have created are used to export the configuration. If you do not want to use those templates in the new cluster, delete them. In Cloudera Manager, go to and click Delete next to the Host Template you want to delete.
- Delete any Host Templates created by the Cloudera Manager Installation Wizard. They typically have a name like Template - 1.
- Run the following command to download the JSON configuration file to a convenient location for editing:
curl -u admin_username:admin_user_password "http://Cloudera Manager URL/api/v12/clusters/Cluster name/export" > path_to_file/file_name.json
For example:
curl -u adminuser:adminpass "http://myCluster-1.myDomain.com:7180/api/v12/clusters/Cluster1/export" > myCluster1-template.json
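If the cluster name contains spaces or other reserved characters, it has to be percent-encoded in the export URL. The following is a minimal sketch of building that URL in Python. The helper name export_url and its default port and API version are assumptions taken from the example above; they are not part of any Cloudera library.

```python
from urllib.parse import quote


def export_url(cm_host, cluster_name, port=7180, api_version=12):
    """Build the cluster export URL (hypothetical helper, not a Cloudera API)."""
    # safe="" makes quote() percent-encode spaces and slashes in the name.
    encoded_name = quote(cluster_name, safe="")
    return "http://{}:{}/api/v{}/clusters/{}/export".format(
        cm_host, port, api_version, encoded_name
    )


# A cluster name with a space, which curl would otherwise reject or misparse.
print(export_url("myCluster-1.myDomain.com", "Cluster 1"))
```

The resulting string can be passed to curl in place of the hand-written URL.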
Preparing a New Cluster
- Database for Cloudera Manager is installed and configured.
- Cloudera Manager 5.7 or higher is installed and running.
- All required databases for CDH services are installed. See Cloudera Manager and Managed Service Datastores.
- The JDK is installed on all cluster hosts.
- The Cloudera Manager Agent is installed and configured on all cluster hosts.
- If the source cluster uses Kerberos, the new cluster must have KDC properties and privileges configured in Cloudera Manager.
- If the source cluster used packages to install CDH and managed services, install those packages manually before importing the template. See Managing Software Installation Using Cloudera Manager.
Creating the Template
"instantiator" : {
  "clusterName" : "<changeme>",
  "hosts" : [
    { "hostName" : "<changeme>", "hostTemplateRefName" : "<changeme>", "roleRefNames" : [ "HDFS-1-NAMENODE-0be88b55f5dedbf7bc74d61a86c0253e" ] },
    { "hostName" : "<changeme>", "hostTemplateRefName" : "<changeme>" },
    { "hostNameRange" : "<HOST[0001-0002]>", "hostTemplateRefName" : "<changeme>" }
  ],
  "variables" : [
    { "name" : "HDFS-1-NAMENODE-BASE-dfs_name_dir_list", "value" : "/dfs/nn" },
    { "name" : "HDFS-1-SECONDARYNAMENODE-BASE-fs_checkpoint_dir_list", "value" : "/dfs/snn" },
    { "name" : "HIVE-1-hive_metastore_database_host", "value" : "myCluster-1.myDomain.com" },
    { "name" : "HIVE-1-hive_metastore_database_name", "value" : "hive1" },
    { "name" : "HIVE-1-hive_metastore_database_password", "value" : "<changeme>" },
    { "name" : "HIVE-1-hive_metastore_database_port", "value" : "3306" },
    { "name" : "HIVE-1-hive_metastore_database_type", "value" : "mysql" },
    { "name" : "HIVE-1-hive_metastore_database_user", "value" : "hive1" },
    { "name" : "HUE-1-database_host", "value" : "myCluster-1.myDomain.com" },
    { "name" : "HUE-1-database_name", "value" : "hueserver0be88b55f5dedbf7bc74d61a86c0253e" },
    { "name" : "HUE-1-database_password", "value" : "<changeme>" },
    { "name" : "HUE-1-database_port", "value" : "3306" },
    { "name" : "HUE-1-database_type", "value" : "mysql" },
    { "name" : "HUE-1-database_user", "value" : "hueserver0be88b5" },
    { "name" : "IMPALA-1-IMPALAD-BASE-scratch_dirs", "value" : "/impala/impalad" },
    { "name" : "KUDU-1-KUDU_MASTER-BASE-fs_data_dirs", "value" : "/var/lib/kudu/master" },
    { "name" : "KUDU-1-KUDU_MASTER-BASE-fs_wal_dir", "value" : "/var/lib/kudu/master" },
    { "name" : "KUDU-1-KUDU_TSERVER-BASE-fs_data_dirs", "value" : "/var/lib/kudu/tserver" },
    { "name" : "KUDU-1-KUDU_TSERVER-BASE-fs_wal_dir", "value" : "/var/lib/kudu/tserver" },
    { "name" : "MAPREDUCE-1-JOBTRACKER-BASE-jobtracker_mapred_local_dir_list", "value" : "/mapred/jt" },
    { "name" : "MAPREDUCE-1-TASKTRACKER-BASE-tasktracker_mapred_local_dir_list", "value" : "/mapred/local" },
    { "name" : "OOZIE-1-OOZIE_SERVER-BASE-oozie_database_host", "value" : "myCluster-1.myDomain.com:3306" },
    { "name" : "OOZIE-1-OOZIE_SERVER-BASE-oozie_database_name", "value" : "oozieserver0be88b55f5dedbf7bc74d61a86c0253e" },
    { "name" : "OOZIE-1-OOZIE_SERVER-BASE-oozie_database_password", "value" : "<changeme>" },
    { "name" : "OOZIE-1-OOZIE_SERVER-BASE-oozie_database_type", "value" : "mysql" },
    { "name" : "OOZIE-1-OOZIE_SERVER-BASE-oozie_database_user", "value" : "oozieserver0be88" },
    { "name" : "YARN-1-NODEMANAGER-BASE-yarn_nodemanager_local_dirs", "value" : "/yarn/nm" },
    { "name" : "YARN-1-NODEMANAGER-BASE-yarn_nodemanager_log_dirs", "value" : "/yarn/container-logs" }
  ]
}
- Update the hosts section.
If you have host templates defined in the source cluster, they appear in the hostTemplates section of the JSON template. For hosts that do not use host templates, the export process creates host templates based on role assignments to facilitate creating the new cluster. In either case, you must match the items in the hostTemplates section with the hosts sections in the instantiator section.
Here is a sample of the hostTemplates section from the same JSON file as the instantiator section, above:
"hostTemplates" : [
  {
    "refName" : "HostTemplate-0-from-myCluster-1.myDomain.com",
    "cardinality" : 1,
    "roleConfigGroupsRefNames" : [
      "FLUME-1-AGENT-BASE", "HBASE-1-GATEWAY-BASE", "HBASE-1-HBASETHRIFTSERVER-BASE", "HBASE-1-MASTER-BASE",
      "HDFS-1-BALANCER-BASE", "HDFS-1-GATEWAY-BASE", "HDFS-1-NAMENODE-BASE", "HDFS-1-NFSGATEWAY-BASE",
      "HDFS-1-SECONDARYNAMENODE-BASE", "HIVE-1-GATEWAY-BASE", "HIVE-1-HIVEMETASTORE-BASE", "HIVE-1-HIVESERVER2-BASE",
      "HUE-1-HUE_LOAD_BALANCER-BASE", "HUE-1-HUE_SERVER-BASE", "IMPALA-1-CATALOGSERVER-BASE", "IMPALA-1-STATESTORE-BASE",
      "KAFKA-1-KAFKA_BROKER-BASE", "KS_INDEXER-1-HBASE_INDEXER-BASE", "KUDU-1-KUDU_MASTER-BASE", "MAPREDUCE-1-GATEWAY-BASE",
      "MAPREDUCE-1-JOBTRACKER-BASE", "OOZIE-1-OOZIE_SERVER-BASE", "SOLR-1-SOLR_SERVER-BASE", "SPARK_ON_YARN-1-GATEWAY-BASE",
      "SPARK_ON_YARN-1-SPARK_YARN_HISTORY_SERVER-BASE", "SQOOP-1-SQOOP_SERVER-BASE", "SQOOP_CLIENT-1-GATEWAY-BASE",
      "YARN-1-GATEWAY-BASE", "YARN-1-JOBHISTORY-BASE", "YARN-1-RESOURCEMANAGER-BASE", "ZOOKEEPER-1-SERVER-BASE"
    ]
  },
  {
    "refName" : "HostTemplate-1-from-myCluster-4.myDomain.com",
    "cardinality" : 1,
    "roleConfigGroupsRefNames" : [
      "FLUME-1-AGENT-BASE", "HBASE-1-REGIONSERVER-BASE", "HDFS-1-DATANODE-BASE", "HIVE-1-GATEWAY-BASE",
      "IMPALA-1-IMPALAD-BASE", "KUDU-1-KUDU_TSERVER-BASE", "MAPREDUCE-1-TASKTRACKER-BASE",
      "SPARK_ON_YARN-1-GATEWAY-BASE", "SQOOP_CLIENT-1-GATEWAY-BASE", "YARN-1-NODEMANAGER-BASE"
    ]
  },
  {
    "refName" : "HostTemplate-2-from-myCluster-[2-3].myDomain.com",
    "cardinality" : 2,
    "roleConfigGroupsRefNames" : [
      "FLUME-1-AGENT-BASE", "HBASE-1-REGIONSERVER-BASE", "HDFS-1-DATANODE-BASE", "HIVE-1-GATEWAY-BASE",
      "IMPALA-1-IMPALAD-BASE", "KAFKA-1-KAFKA_BROKER-BASE", "KUDU-1-KUDU_TSERVER-BASE", "MAPREDUCE-1-TASKTRACKER-BASE",
      "SPARK_ON_YARN-1-GATEWAY-BASE", "SQOOP_CLIENT-1-GATEWAY-BASE", "YARN-1-NODEMANAGER-BASE"
    ]
  }
]
The value of cardinality indicates how many hosts are assigned to the host template in the source cluster.
The value of roleConfigGroupsRefNames indicates which role groups are assigned to the host(s).
Do the following for each host template in the hostTemplates section:
- Locate the entry in the hosts section of the instantiator where you want the roles to be installed.
- Copy the value of the refName to the value for hostTemplateRefName.
- Enter the hostname in the new cluster as the value for hostName. Some host sections might instead use hostNameRange for clusters with multiple hosts that have the same set of roles. Indicate a range of hosts by using one of the following:
- Brackets; for example, myhost[1-4].foo.com
- A comma-delimited string of hostnames; for example, host-1.domain, host-2.domain, host-3.domain
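To preview which hosts a hostNameRange value is meant to cover, the two notations above can be expanded with a few lines of Python. This is an illustrative sketch only: the function name expand_host_range is hypothetical, and the handling of zero-padded numbers (as in HOST[0001-0002]) is an assumption about the intended meaning, not a description of how Cloudera Manager parses the value.

```python
import re


def expand_host_range(pattern):
    """Expand a bracket range or comma-delimited string into hostnames.

    Hypothetical helper for previewing a hostNameRange value.
    """
    # Comma-delimited form: host-1.domain, host-2.domain, host-3.domain
    if "," in pattern:
        return [name.strip() for name in pattern.split(",") if name.strip()]

    # Bracket form: myhost[1-4].foo.com
    match = re.search(r"\[(\d+)-(\d+)\]", pattern)
    if match is None:
        return [pattern]  # a single plain hostname

    start, end = match.group(1), match.group(2)
    # Assumption: a leading zero in the start value means fixed-width numbers.
    width = len(start) if start.startswith("0") else 0
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [
        prefix + str(number).zfill(width) + suffix
        for number in range(int(start), int(end) + 1)
    ]


print(expand_host_range("myhost[1-4].foo.com"))
print(expand_host_range("host-1.domain, host-2.domain, host-3.domain"))
```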
"hostTemplates" : [
  {
    "refName" : "HostTemplate-0-from-myCluster-1.myDomain.com",
    "cardinality" : 1,
    "roleConfigGroupsRefNames" : [
      "FLUME-1-AGENT-BASE", "HBASE-1-GATEWAY-BASE", "HBASE-1-HBASETHRIFTSERVER-BASE", "HBASE-1-MASTER-BASE",
      "HDFS-1-BALANCER-BASE", "HDFS-1-GATEWAY-BASE", "HDFS-1-NAMENODE-BASE", "HDFS-1-NFSGATEWAY-BASE",
      "HDFS-1-SECONDARYNAMENODE-BASE", "HIVE-1-GATEWAY-BASE", "HIVE-1-HIVEMETASTORE-BASE", "HIVE-1-HIVESERVER2-BASE",
      "HUE-1-HUE_LOAD_BALANCER-BASE", "HUE-1-HUE_SERVER-BASE", "IMPALA-1-CATALOGSERVER-BASE", "IMPALA-1-STATESTORE-BASE",
      "KAFKA-1-KAFKA_BROKER-BASE", "KS_INDEXER-1-HBASE_INDEXER-BASE", "KUDU-1-KUDU_MASTER-BASE", "MAPREDUCE-1-GATEWAY-BASE",
      "MAPREDUCE-1-JOBTRACKER-BASE", "OOZIE-1-OOZIE_SERVER-BASE", "SOLR-1-SOLR_SERVER-BASE", "SPARK_ON_YARN-1-GATEWAY-BASE",
      "SPARK_ON_YARN-1-SPARK_YARN_HISTORY_SERVER-BASE", "SQOOP-1-SQOOP_SERVER-BASE", "SQOOP_CLIENT-1-GATEWAY-BASE",
      "YARN-1-GATEWAY-BASE", "YARN-1-JOBHISTORY-BASE", "YARN-1-RESOURCEMANAGER-BASE", "ZOOKEEPER-1-SERVER-BASE"
    ]
  },
  {
    "refName" : "HostTemplate-1-from-myCluster-4.myDomain.com",
    "cardinality" : 1,
    "roleConfigGroupsRefNames" : [
      "FLUME-1-AGENT-BASE", "HBASE-1-REGIONSERVER-BASE", "HDFS-1-DATANODE-BASE", "HIVE-1-GATEWAY-BASE",
      "IMPALA-1-IMPALAD-BASE", "KUDU-1-KUDU_TSERVER-BASE", "MAPREDUCE-1-TASKTRACKER-BASE",
      "SPARK_ON_YARN-1-GATEWAY-BASE", "SQOOP_CLIENT-1-GATEWAY-BASE", "YARN-1-NODEMANAGER-BASE"
    ]
  },
  {
    "refName" : "HostTemplate-2-from-myCluster-[2-3].myDomain.com",
    "cardinality" : 2,
    "roleConfigGroupsRefNames" : [
      "FLUME-1-AGENT-BASE", "HBASE-1-REGIONSERVER-BASE", "HDFS-1-DATANODE-BASE", "HIVE-1-GATEWAY-BASE",
      "IMPALA-1-IMPALAD-BASE", "KAFKA-1-KAFKA_BROKER-BASE", "KUDU-1-KUDU_TSERVER-BASE", "MAPREDUCE-1-TASKTRACKER-BASE",
      "SPARK_ON_YARN-1-GATEWAY-BASE", "SQOOP_CLIENT-1-GATEWAY-BASE", "YARN-1-NODEMANAGER-BASE"
    ]
  }
],
"instantiator" : {
  "clusterName" : "myCluster_new",
  "hosts" : [
    {
      "hostName" : "myNewCluster-1.myDomain.com",
      "hostTemplateRefName" : "HostTemplate-0-from-myCluster-1.myDomain.com",
      "roleRefNames" : [ "HDFS-1-NAMENODE-c975a0b51fd36e914896cd5e0adb1b5b" ]
    },
    {
      "hostName" : "myNewCluster-5.myDomain.com",
      "hostTemplateRefName" : "HostTemplate-1-from-myCluster-4.myDomain.com"
    },
    {
      "hostNameRange" : "myNewCluster-[3-4].myDomain.com",
      "hostTemplateRefName" : "HostTemplate-2-from-myCluster-[2-3].myDomain.com"
    }
  ],
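Because every hostTemplateRefName in the instantiator has to match a refName in the hostTemplates section, a quick cross-check of an edited template can catch copy-and-paste mistakes before the import. The sketch below assumes the template has already been loaded into a Python dictionary (for example with json.load); the function name check_host_template_refs is hypothetical, and treating an unassigned host template as a problem is an assumption based on the instruction to handle each host template.

```python
def check_host_template_refs(template):
    """Return a list of mismatches between hostTemplates and instantiator hosts.

    Hypothetical helper; an empty list means every reference lines up.
    """
    defined = {entry["refName"] for entry in template.get("hostTemplates", [])}
    hosts = template.get("instantiator", {}).get("hosts", [])
    used = {host.get("hostTemplateRefName") for host in hosts}

    problems = []
    # Each host entry must point at a host template that exists.
    for host in hosts:
        ref = host.get("hostTemplateRefName")
        if ref not in defined:
            label = host.get("hostName") or host.get("hostNameRange")
            problems.append(
                "host {!r} references unknown template {!r}".format(label, ref)
            )
    # Assumption: every exported host template should be assigned somewhere.
    for ref in sorted(defined - used):
        problems.append("host template {!r} is not assigned to any host".format(ref))
    return problems


# Tiny stand-in for an edited template file.
sample = {
    "hostTemplates": [{"refName": "HostTemplate-0"}, {"refName": "HostTemplate-1"}],
    "instantiator": {
        "hosts": [
            {"hostName": "new-1.example.com", "hostTemplateRefName": "HostTemplate-0"},
            {"hostName": "new-2.example.com", "hostTemplateRefName": "<changeme>"},
        ]
    },
}
for problem in check_host_template_refs(sample):
    print(problem)
```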
- For host sections that have a roleRefNames line, determine the role type and assign the appropriate host for the role. If there are multiple instances of a role, you must select the correct hosts. To determine the role type, search the template file for the value of roleRefNames.
For example: For a role ref named HDFS-1-NAMENODE-0be88b55f5dedbf7bc74d61a86c0253e, if you search for that string, you find a section similar to the following:
"roles": [ { "refName": "HDFS-1-NAMENODE-0be88b55f5dedbf7bc74d61a86c0253e", "roleType": "NAMENODE" } ]
In this case, the role type is NAMENODE.
- Modify the variables section. This section contains various properties from the source cluster. You can change any of these values to be different in the new cluster, or you can leave the values as copied from the source. For any values shown as <changeme>, you must provide the correct value.
- Enter the internal name of the new cluster on the line with "clusterName" : "<changeme>". For example:
"clusterName" : "QE_test_cluster"
- (Optional) Change the display name for the cluster. Edit the line that begins with "displayName" (near the top of the JSON file); for example:
"displayName" : "myNewCluster",
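Before importing, it can help to confirm that no placeholder values are left in the edited file. The following sketch walks a template that has been loaded into a Python dictionary and reports the location of every string wrapped in angle brackets. The function name find_placeholders and the path notation are hypothetical, and the angle-bracket test is an assumption based on the <changeme> and <HOST[0001-0002]> values in the sample instantiator section.

```python
def find_placeholders(node, path="$"):
    """Return the paths of all string values that look like <placeholder>.

    Hypothetical helper; 'path' is only a readable locator for the report.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            found.extend(find_placeholders(value, "{}.{}".format(path, key)))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(find_placeholders(value, "{}[{}]".format(path, index)))
    elif isinstance(node, str) and node.startswith("<") and node.endswith(">"):
        found.append(path)
    return found


# Tiny stand-in for an instantiator section that is only partly filled in.
sample = {
    "instantiator": {
        "clusterName": "<changeme>",
        "hosts": [
            {"hostNameRange": "<HOST[0001-0002]>", "hostTemplateRefName": "HostTemplate-0"}
        ],
        "variables": [{"name": "HUE-1-database_port", "value": "3306"}],
    }
}
for location in find_placeholders(sample):
    print("unfilled value at", location)
```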
Importing the Template to a New Cluster
- Log in to the Cloudera Manager server as root.
- Run the following command to import the template. If you have remote repository URLs configured in the source cluster, append ?addRepositories=true to the URL in the command.
curl -X POST -H "Content-Type: application/json" -d @path_to_template/template_filename.json http://admin_user:admin_password@cloudera_manager_url:cloudera_manager_port/api/v12/cm/importClusterTemplate
You should see a response similar to the following:
{
  "id" : 17,
  "name" : "ClusterTemplateImport",
  "startTime" : "2016-03-09T23:44:38.491Z",
  "active" : true,
  "children" : { "items" : [ ] }
}
Examples:
curl -X POST -H "Content-Type: application/json" -d @myTemplate.json "http://admin:admin@myNewCluster-1.mydomain.com:7180/api/v12/cm/importClusterTemplate"
curl -X POST -H "Content-Type: application/json" -d @myTemplate.json "http://admin:admin@myNewCluster-1.mydomain.com:7180/api/v12/cm/importClusterTemplate?addRepositories=true"
If there is no response, or you receive an error message, the JSON file may be malformed, or the template may have invalid hostnames or invalid references. Inspect the JSON file, correct any errors, and then re-run the command.
- Open Cloudera Manager for the new cluster in a web browser and click the Cloudera Manager logo to go to the home page.
- Click the All Recent Commands tab.
If the import is proceeding, you should see a link labeled Import Cluster Template. Click the link to view the progress of the import.
If any of the commands fail, correct the problem and click Retry. You may need to edit some properties in Cloudera Manager.
After you import the template, Cloudera Manager applies the Autoconfiguration rules that set properties such as memory and CPU allocations for various roles. If the new cluster has different hardware or operational requirements, you may need to modify these values.
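Since a malformed JSON file is one of the failure causes listed above, a local syntax check before running the import command can save a round trip. This is a minimal sketch using only the Python standard library; the function name check_json_text is hypothetical, and the check covers JSON syntax only, not hostnames or references inside the template.

```python
import json


def check_json_text(text):
    """Return None if text is well-formed JSON, else a short error description.

    Hypothetical helper; it does not validate the template's contents.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the position where parsing failed.
        return "line {}, column {}: {}".format(err.lineno, err.colno, err.msg)
    return None


# A response-like document with its closing brace missing.
broken = '{ "id" : 17, "children" : { "items" : [ ] }'
print(check_json_text(broken))
print(check_json_text('{ "id" : 17 }'))
```

To check a template file, read it with open(path).read() and pass the text to the function.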
Sample Python Code
You can perform the steps to export and import a cluster template programmatically using a client written in Python or other languages. (You can also use the curl commands provided above.)
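# Export example: retrieve the cluster template from the source cluster.
from pprint import pprint
from cm_api.api_client import ApiResource

resource = ApiResource("myCluster-1.myDomain.com", 7180, "admin", "admin", version=12)
cluster = resource.get_cluster("Cluster1")
template = cluster.export(False)  # False: do not export autoconfigured values
pprint(template)

# Import example: load an edited template file and import it into the new cluster.
import json
import os
from cm_api.api_client import ApiResource
from cm_api.endpoints.cms import ClouderaManager
from cm_api.endpoints.types import ApiClusterTemplate

resource = ApiResource("localhost", 8180, "admin", "admin", version=12)
# open() does not expand "~", so expand it explicitly.
with open(os.path.expanduser('~/cluster-template.json')) as data_file:
    data = json.load(data_file)
template = ApiClusterTemplate(resource).from_json_dict(data, resource)
cms = ClouderaManager(resource)
cms.import_cluster_template(template)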