This tutorial describes, step by step, how to install Pig (pig-0.17.0.tar.gz) on Hadoop 3.1.2. The operating system used is Ubuntu 18.04.4 LTS (Bionic Beaver). Once the installation is complete, you can start working with Pig.
Platform
- Operating System (OS). You can use Ubuntu 18.04.4 LTS or a later version. You can also use other Linux distributions such as Red Hat, CentOS, etc.
- Hadoop. We have already installed Hadoop 3.1.2, on which we will run Pig. (Please refer to the "Hadoop Installation on Single Node" tutorial and install Hadoop before proceeding with the Pig installation.)
- Pig. We have used Apache Pig version 0.17.0 for this installation.
Download Software
- Pig
https://downloads.apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz
Steps to Install Apache Pig Version 0.17.0 on Ubuntu 18.04.4 LTS
Please follow the steps below to install Apache Pig.
Step 1. Please verify that Hadoop is installed.
Step 2. Please verify that Java is installed.
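The two checks above can be sketched as follows. This is a minimal example, not the only way to do it: the helper name check_tool is hypothetical, and it assumes that the hadoop and java commands are expected to be on your PATH.

```shell
# Hypothetical helper: report whether a command is on the PATH.
# Returns non-zero if the command is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Print the installed versions only when the tools are available.
if check_tool hadoop; then hadoop version; fi
if check_tool java; then java -version; fi
```

If either tool is reported as not found, install it (or fix your PATH) before continuing with the Pig steps.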
Step 3. Please download Pig 0.17.0 from the link below.
On Linux: $wget https://downloads.apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz
On Windows: https://downloads.apache.org/pig/pig-0.17.0/pig-0.17.0.tar.gz
Step 4. Now we will extract the tar file using the commands below and rename the extracted folder to pig to give it a simpler name.
$tar -xzf pig-0.17.0.tar.gz
$mv pig-0.17.0 pig
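A quick way to confirm that the extraction and rename worked is to look for the Pig launcher script inside the new folder. This is a minimal sketch: the helper name check_pig_dir is hypothetical, and it assumes the commands above were run in the current directory.

```shell
# Hypothetical helper: succeeds if the given directory contains the Pig launcher script.
check_pig_dir() {
    if [ -f "$1/bin/pig" ]; then
        echo "Found Pig launcher: $1/bin/pig"
    else
        echo "No bin/pig under $1; re-check the tar and mv commands"
        return 1
    fi
}

# Check the renamed folder in the current directory.
# (|| true keeps a failed check from aborting a script that uses set -e.)
check_pig_dir pig || true
```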
Step 5. Now edit the .bashrc file in your home directory to set the Apache Pig environment variables so that Pig can be run from any directory.
$nano .bashrc
Add the lines below.
export PIG_HOME=/home/cloudduggu/pig
export PATH=$PATH:/home/cloudduggu/pig/bin
export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop
Save the changes by pressing CTRL + O and exit from the nano editor by pressing CTRL + X.
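If you prefer not to use an editor, the same lines can be appended from the command line. The sketch below is one possible approach under a few assumptions: the helper name add_pig_env is hypothetical, Pig lives in /home/cloudduggu/pig as in this tutorial, and the Hadoop configuration directory is etc/hadoop (lowercase, since Linux paths are case-sensitive). The helper skips the append if a PIG_HOME line is already present, so running it twice does not duplicate the entries.

```shell
# Hypothetical helper: append the Pig environment variables to the given file
# unless a PIG_HOME line is already there.
add_pig_env() {
    if grep -q 'PIG_HOME=' "$1" 2>/dev/null; then
        echo "Pig variables already present in $1"
        return 0
    fi
    # The quoted 'EOF' stops the shell from expanding $PATH and $HADOOP_HOME now;
    # they are written literally and expanded when the file is sourced.
    cat >> "$1" <<'EOF'
export PIG_HOME=/home/cloudduggu/pig
export PATH=$PATH:/home/cloudduggu/pig/bin
export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop
EOF
    echo "Pig variables added to $1"
}
```

To apply it to your own shell configuration, call add_pig_env "$HOME/.bashrc".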
Step 6. Run the source command to apply the changes in the current terminal.
$source .bashrc
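After sourcing .bashrc, you can check that the variables took effect and that the pig command is reachable. This is a minimal sketch; the helper name verify_pig is hypothetical, and it relies on the pig -version option to print the installed version.

```shell
# Hypothetical helper: show the Pig-related variables and, if the launcher
# is on the PATH, ask Pig for its version.
verify_pig() {
    # An empty value after "=" means the variable is not set in this terminal.
    echo "PIG_HOME=${PIG_HOME:-}"
    echo "PIG_CLASSPATH=${PIG_CLASSPATH:-}"
    if command -v pig >/dev/null 2>&1; then
        pig -version
    else
        echo "pig not found on PATH; re-check the export lines in .bashrc"
        return 1
    fi
}

verify_pig || echo "Pig setup looks incomplete"
```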