Using Sqoop Actions with Oozie
Sqoop 1 does not ship with third-party JDBC drivers. You must download them separately and save them to the /var/lib/sqoop/ directory on the Oozie server. For more information, see Sqoop 1 Installation.
Recommendations
- Cloudera recommends that you not use Sqoop CLI commands with an Oozie Shell Action. Such deployments are unreliable and prone to breaking during upgrades and configuration changes.
- To import data into Hive, use a combination of a Sqoop Action and a Hive2 Action:
  - A Sqoop Action that ingests the data into HDFS.
  - A Hive2 Action that loads the data from HDFS into Hive.
Deploying and Configuring Oozie Sqoop1 Action JDBC Drivers
Before you begin this process, confirm that your Sqoop1 JDBC drivers are present in /var/lib/sqoop.
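The check above can be sketched as a small shell helper. This is only an illustration, not part of the documented procedure: the function name has_jdbc_jars is hypothetical, and it reports only whether at least one .jar file exists in the directory, not whether the right drivers are there.

```shell
# Hypothetical helper (not from the Cloudera documentation): succeed only
# if the given directory contains at least one .jar file.
has_jdbc_jars() {
    dir="${1:-/var/lib/sqoop}"
    for jar in "$dir"/*.jar; do
        # If the glob matched nothing, $jar is the literal pattern and -e fails.
        [ -e "$jar" ] && return 0
    done
    return 1
}

if has_jdbc_jars /var/lib/sqoop; then
    echo "JDBC driver JARs found in /var/lib/sqoop"
else
    echo "No JARs found in /var/lib/sqoop; download the drivers first" >&2
fi
```

A successful result still needs a manual look at the file names to confirm that the drivers for your databases are among them.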
SSH to the Oozie server host and execute the following commands to deploy and configure the drivers on HDFS:
cd /var/lib/sqoop
sudo -u hdfs hdfs dfs -mkdir /user/oozie/libext
sudo -u hdfs hdfs dfs -chown oozie:oozie /user/oozie/libext
sudo -u hdfs hdfs dfs -put /opt/cloudera/parcels/SQOOP_NETEZZA_CONNECTOR/sqoop-nz-connector*.jar /user/oozie/libext/
sudo -u hdfs hdfs dfs -put /opt/cloudera/parcels/SQOOP_TERADATA_CONNECTOR/lib/*.jar /user/oozie/libext/
sudo -u hdfs hdfs dfs -put /opt/cloudera/parcels/SQOOP_TERADATA_CONNECTOR/sqoop-connector-teradata*.jar /user/oozie/libext/
sudo -u hdfs hdfs dfs -put /var/lib/sqoop/*.jar /user/oozie/libext/
sudo -u hdfs hdfs dfs -chown oozie:oozie /user/oozie/libext/*.jar
sudo -u hdfs hdfs dfs -chmod 755 /user/oozie/libext/*.jar
sudo -u hdfs hdfs dfs -ls /user/oozie/libext
# [sample contents of /user/oozie/libext]
-rwxr-xr-x 3 oozie oozie 959987 2016-05-29 09:58 /user/oozie/libext/mysql-connector-java.jar
-rwxr-xr-x 3 oozie oozie 358437 2016-05-29 09:58 /user/oozie/libext/nzjdbc3.jar
-rwxr-xr-x 3 oozie oozie 2739670 2016-05-29 09:58 /user/oozie/libext/ojdbc6.jar
-rwxr-xr-x 3 oozie oozie 3973162 2016-05-29 09:58 /user/oozie/libext/sqoop-connector-teradata-1.5c5.jar
-rwxr-xr-x 3 oozie oozie 41691 2016-05-29 09:58 /user/oozie/libext/sqoop-nz-connector-1.3c5.jar
-rwxr-xr-x 3 oozie oozie 2405 2016-05-29 09:58 /user/oozie/libext/tdgssconfig.jar
-rwxr-xr-x 3 oozie oozie 873860 2016-05-29 09:58 /user/oozie/libext/terajdbc4.jar
Configuring Oozie Sqoop1 Action Workflow JDBC Drivers
Use the following steps to configure Oozie Sqoop1 Action Workflows:
- Confirm that the Sqoop1 JDBC drivers are present in HDFS. To do this, SSH to the Oozie Server host and run the following command:
sudo -u hdfs hdfs dfs -ls /user/oozie/libext
- Set the following Oozie Sqoop1 Action workflow variables in the Oozie job.properties file:
oozie.use.system.libpath = true
oozie.libpath = /user/oozie/libext
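As a sanity check before submitting a workflow, the two settings above can be verified with a short shell function. This is a minimal sketch under stated assumptions: the name check_libpath_properties is hypothetical, and it assumes the properties are written in key = value form (with optional whitespace around the equals sign), as shown above.

```shell
# Hypothetical helper (not part of Oozie): succeed only if the given
# job.properties file sets both libpath properties to the values above.
# Assumes "key = value" syntax; other separators are not recognized.
check_libpath_properties() {
    file="$1"
    grep -Eq '^[[:space:]]*oozie\.use\.system\.libpath[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$file" &&
    grep -Eq '^[[:space:]]*oozie\.libpath[[:space:]]*=[[:space:]]*/user/oozie/libext/?[[:space:]]*$' "$file"
}
```

For example, running check_libpath_properties job.properties && echo "libpath settings OK" in the workflow's directory prints the message only when both lines are present.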