Custom Installation Solutions
Cloudera hosts two types of software repositories that you can use to install products such as Cloudera Manager or CDH: parcel repositories and package repositories. A custom installation solution is required in cases such as the following:
- You need to install older product versions. For example, in a CDH cluster, all hosts must run the same CDH version. After completing an initial installation, you may want to add hosts. This could be to increase the size of your cluster to handle larger tasks or to replace older hardware.
- The hosts on which you want to install Cloudera products are not connected to the Internet, so they cannot reach the Cloudera repository. (For a parcel installation, only the Cloudera Manager Server needs Internet access, but for a package installation, all cluster hosts require access to the Cloudera repository). Most organizations partition parts of their network from outside access. Isolating network segments improves security, but can add complexity to the installation process.
Introduction to Parcels
Parcels are a packaging format that facilitates upgrading software from within Cloudera Manager. You can download, distribute, and activate a new software version all from within Cloudera Manager. Cloudera Manager downloads a parcel to a local directory. Once the parcel is downloaded to the Cloudera Manager Server host, an Internet connection is no longer needed to deploy the parcel. For detailed information about parcels, see Parcels.
If your Cloudera Manager Server does not have Internet access, you can obtain the required parcel files and put them into a parcel repository. For more information, see Using an Internal Parcel Repository.
Understanding Package Management
Before configuring a custom package management solution in your environment, it helps to understand package management tools, package repositories, repository configuration files, and how to list the repositories configured on a host.
Package Management Tools
Packages (rpm or deb files) help ensure that installations complete successfully by satisfying package dependencies. When you install a particular package, all other required packages are installed at the same time. For example, hadoop-0.20-hive depends on hadoop-0.20.
Package management tools such as yum (RHEL), zypper (SLES), and apt-get (Ubuntu) find and install required packages. For example, on a RHEL compatible system, you might run the command yum install hadoop-0.20-hive. The yum utility informs you that the Hive package requires hadoop-0.20 and offers to install it for you. zypper and apt-get provide similar functionality.
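To inspect a package's dependencies before installing it, each tool offers a query command. The following is a minimal sketch, not part of any Cloudera tooling: dep_query_cmd is a hypothetical helper name, and it only prints the command to run, so it can be tried on any host.

```shell
#!/bin/sh
# dep_query_cmd is a hypothetical helper (an assumption for illustration).
# It prints the dependency query command for the given package tool.
dep_query_cmd() {
    tool=$1
    pkg=$2
    case "$tool" in
        yum)     echo "yum deplist $pkg" ;;
        zypper)  echo "zypper info --requires $pkg" ;;
        apt-get) echo "apt-cache depends $pkg" ;;
        *)       echo "unsupported package tool: $tool" >&2; return 1 ;;
    esac
}

# Show the query for the Hive package on a RHEL compatible host.
dep_query_cmd yum hadoop-0.20-hive
```

Running the printed command on a host with the matching tool lists the packages that the tool would install alongside the requested one.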
Package Repositories
Package management tools rely on package repositories to install software and resolve any dependency requirements. For information on creating an internal repository, see Using an Internal Package Repository.
Repository Configuration Files
Information about package repositories is stored in configuration files, whose location depends on the package management tool:
- RHEL compatible (yum): /etc/yum.repos.d
- SLES (zypper): /etc/zypp/zypper.conf
- Ubuntu (apt-get): /etc/apt/apt.conf (Additional repositories are specified using .list files in the /etc/apt/sources.list.d/ directory.)
For example, on a CentOS system, listing the yum configuration directory might show:
ls -l /etc/yum.repos.d/
total 36
-rw-r--r--. 1 root root 1664 Dec  9  2015 CentOS-Base.repo
-rw-r--r--. 1 root root 1309 Dec  9  2015 CentOS-CR.repo
-rw-r--r--. 1 root root  649 Dec  9  2015 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root  290 Dec  9  2015 CentOS-fasttrack.repo
-rw-r--r--. 1 root root  630 Dec  9  2015 CentOS-Media.repo
-rw-r--r--. 1 root root 1331 Dec  9  2015 CentOS-Sources.repo
-rw-r--r--. 1 root root 1952 Dec  9  2015 CentOS-Vault.repo
-rw-r--r--. 1 root root  951 Jun 24  2017 epel.repo
-rw-r--r--. 1 root root 1050 Jun 24  2017 epel-testing.repo
Each .repo file defines one or more repositories. The following excerpt from CentOS-Base.repo defines two, base and updates:
[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

#released updates
[updates]
name=CentOS-$releasever - Updates
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
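A definition for an internal repository uses the same keys as the [base] entry. The sketch below generates one under stated assumptions: write_repo_file is a hypothetical helper, and the repository ID, name, URL, and key path are placeholders rather than real Cloudera locations. It writes to a temporary directory so that it can be run without root privileges.

```shell
#!/bin/sh
# write_repo_file is a hypothetical helper (an assumption for illustration).
# Arguments: repo ID, display name, base URL, GPG key URL, output directory.
write_repo_file() {
    repo_id=$1; repo_name=$2; base_url=$3; gpg_key=$4; out_dir=$5
    # baseurl points at a fixed location, unlike the mirrorlist key above.
    cat > "$out_dir/$repo_id.repo" <<EOF
[$repo_id]
name=$repo_name
baseurl=$base_url
enabled=1
gpgcheck=1
gpgkey=$gpg_key
EOF
}

# Placeholder values; substitute the details of your own internal repository.
demo_dir=$(mktemp -d)
write_repo_file internal-repo "Internal package repository" \
    "http://repo.example.internal/packages/" \
    "file:///etc/pki/rpm-gpg/RPM-GPG-KEY-example" "$demo_dir"
cat "$demo_dir/internal-repo.repo"
rm -rf "$demo_dir"
```

On a real host the output directory would be /etc/yum.repos.d, which requires root access to write.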
Listing Repositories
You can list the repositories configured on a host. The command depends on the operating system:
- RHEL compatible: yum repolist
- SLES: zypper repos
- Ubuntu: apt-get does not include a command to display sources, but you can determine sources by reviewing the contents of /etc/apt/sources.list and any files contained in /etc/apt/sources.list.d/.
For example, yum repolist on a CentOS 7 system might report:
repo id             repo name                                         status
base/7/x86_64       CentOS-7 - Base                                    9,591
epel/x86_64         Extra Packages for Enterprise Linux 7 - x86_64    12,382
extras/7/x86_64     CentOS-7 - Extras                                    392
updates/7/x86_64    CentOS-7 - Updates                                 1,962
repolist: 24,327
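On Ubuntu, reviewing the source files can be scripted. The following is a minimal sketch: list_apt_sources is a hypothetical helper name, and it reads only the one-line .list format named above. It does not cover the .sources (deb822) files that some newer releases use.

```shell
#!/bin/sh
# list_apt_sources is a hypothetical helper (an assumption for illustration).
# Argument: the apt configuration directory, defaulting to /etc/apt.
list_apt_sources() {
    apt_dir=${1:-/etc/apt}
    # Read the main file plus any drop-in .list files that exist.
    for f in "$apt_dir/sources.list" "$apt_dir"/sources.list.d/*.list; do
        [ -f "$f" ] || continue
        # Print active deb and deb-src entries; comments and blanks are skipped.
        grep -E '^[[:space:]]*deb(-src)?[[:space:]]' "$f" || true
    done
    return 0
}

# Prints nothing on hosts that have no apt source files.
list_apt_sources
```

Passing a directory argument makes it possible to check a staged configuration before copying it into /etc/apt.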