Setting Up Apache Mahout Using the Command Line
Apache Mahout is a machine-learning tool. It lets you build machine-learning libraries that scale to "reasonably large" datasets, with the aim of making intelligent applications easier and faster to build.
The main use cases for Mahout are:
- Recommendation mining, which tries to identify things users will like on the basis of their past behavior (for example, shopping or online-content recommendations)
- Clustering, which groups similar items (for example, documents on similar topics)
- Classification, which learns from existing categories what members of each category have in common, and on that basis tries to categorize new items
- Frequent item-set mining, which takes a set of item-groups (such as terms in a query session, or shopping-cart content) and identifies items that usually appear together
Installing Mahout
Installing from packages is convenient because packages:
- Handle dependencies
- Provide for easy upgrades
- Automatically install resources to conventional locations
These instructions assume that you will install from packages if possible.
To install Mahout on a RHEL system:
$ sudo yum install mahout
To install Mahout on a SLES system:
$ sudo zypper install mahout
To install Mahout on an Ubuntu or Debian system:
$ sudo apt-get install mahout
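If you are not sure which of the commands above applies to a given host, a small shell function can suggest one by checking which package manager is on the PATH. This is a minimal sketch and not part of Mahout: the function name mahout_install_cmd is hypothetical, and it assumes that the first package manager it finds is the one the system actually uses. It only prints the command; it does not run it.

```shell
# Hypothetical helper: print the Mahout install command that matches
# the first supported package manager found on the PATH.
mahout_install_cmd() {
  if command -v yum >/dev/null 2>&1; then
    echo "sudo yum install mahout"        # RHEL
  elif command -v zypper >/dev/null 2>&1; then
    echo "sudo zypper install mahout"     # SLES
  elif command -v apt-get >/dev/null 2>&1; then
    echo "sudo apt-get install mahout"    # Ubuntu or Debian
  else
    echo "no supported package manager found" >&2
    return 1
  fi
}

# Show the suggestion; ignore the failure status on unsupported systems.
mahout_install_cmd || true
```

Review the printed command before running it, since a host can have more than one of these tools installed.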
To access Mahout documentation:
$ sudo apt-get install mahout-doc
The contents of this package are saved under /usr/share/doc/mahout*.
The Mahout Executable
The Mahout executable is installed in /usr/bin/mahout. Use this executable to run your analysis.
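To confirm that the installation put the executable where this section says, you can test the path from the shell. The following is a minimal sketch: the function name check_mahout is hypothetical, and the default path is the /usr/bin/mahout location given above. It checks only that the file exists and is executable, not that Mahout itself runs correctly.

```shell
# Hypothetical helper: report whether a Mahout launcher exists and is
# executable at the given path (defaults to /usr/bin/mahout).
check_mahout() {
  mahout_bin="${1:-/usr/bin/mahout}"
  if [ -x "$mahout_bin" ]; then
    echo "found: $mahout_bin"
  else
    echo "missing: $mahout_bin"
    return 1
  fi
}

# Check the default location; ignore the failure status if it is absent.
check_mahout || true
```

Pass a different path as the first argument if your installation uses a non-default location.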
Getting Started with Mahout
To get started with Mahout, follow the instructions in the Apache Mahout Quickstart.
Viewing the Mahout Documentation
For more information about Mahout, see mahout.apache.org.