<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Apache Hadoop Archives - Linux Windows and android Tutorials</title>
	<atom:link href="https://www.osradar.com/tag/apache-hadoop/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.osradar.com</link>
	<description>Tutorials, news, and security</description>
	<lastBuildDate>Fri, 17 Jan 2020 13:53:54 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.8.12</generator>
	<item>
		<title>How To Install Apache Hadoop / HBase on Ubuntu 18.04</title>
		<link>https://www.osradar.com/how-to-install-apache-hadoop-hbase-on-ubuntu-18-04/</link>
					<comments>https://www.osradar.com/how-to-install-apache-hadoop-hbase-on-ubuntu-18-04/#respond</comments>
		
		<dc:creator><![CDATA[sabi]]></dc:creator>
		<pubDate>Fri, 17 Jan 2020 13:53:50 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Servers]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[Tutorials]]></category>
		<category><![CDATA[Apache Hadoop]]></category>
		<category><![CDATA[how to install Apache Hadoop on Ubuntu 18.04]]></category>
		<category><![CDATA[Install Hadoop on Ubuntu]]></category>
		<guid isPermaLink="false">https://www.osradar.com/?p=17201</guid>

					<description><![CDATA[<p>HBase is an open-source, distributed, non-relational database developed under the Apache Software Foundation. It is written in Java and runs on top of the Hadoop Distributed File System (HDFS). HBase is one of the dominant databases for big data workloads. It is designed for fast read and write access to huge amounts of structured [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://www.osradar.com/how-to-install-apache-hadoop-hbase-on-ubuntu-18-04/">How To Install Apache Hadoop / HBase on Ubuntu 18.04</a> appeared first on <a rel="nofollow" href="https://www.osradar.com">Linux  Windows and android  Tutorials</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>HBase is an open-source, distributed, non-relational database developed under the Apache Software Foundation. It is written in Java and runs on top of the Hadoop Distributed File System (HDFS). HBase is one of the dominant databases for big data workloads. It is designed for fast read and write access to huge amounts of structured data.</p>



<p>This guide covers installing Hadoop on Ubuntu 18.04 as a single-node cluster, which is the base for the HBase installation. It was done on a bare Ubuntu 18.04 virtual machine with 8 GB of RAM and 4 vCPUs.</p>



<h3><strong>Installing Hadoop on Ubuntu 18.04</strong></h3>



<p>Follow these steps to install a single-node Hadoop cluster on Ubuntu 18.04 LTS.</p>



<h3><strong>Step 1: Update System</strong></h3>



<p>Before deploying Hadoop and HBase on Ubuntu, update the system packages.</p>



<pre class="wp-block-verse">sudo apt update<br>sudo apt -y upgrade<br>sudo reboot</pre>



<h3><strong>Step 2: Install Java</strong></h3>



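<p>Skip this step if Java is already installed. The JDK package is used here because the <strong>JAVA_HOME</strong> setup below relies on <strong>javac</strong>.</p>



<pre class="wp-block-verse">sudo apt update<br>sudo apt install -y openjdk-8-jdk-headless</pre>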



<p>Confirm the Java installation:</p>



<pre class="wp-block-verse">sabi@Ubuntu:~$ java -version<br> openjdk version "1.8.0_232"<br> OpenJDK Runtime Environment (build 1.8.0_232-8u232-b09-0ubuntu1~18.04.1-b09)<br> OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)</pre>



<p>Set up <strong>JAVA_HOME</strong> variable.</p>



<pre class="wp-block-verse">cat &lt;&lt;EOF | sudo tee /etc/profile.d/hadoop_java.sh<br>export JAVA_HOME=$(dirname $(dirname $(readlink $(readlink $(which javac)))))<br>export PATH=\$PATH:\$JAVA_HOME/bin<br>EOF</pre>



<p>Now, load the new PATH and JAVA_HOME settings into the current shell.</p>



<pre class="wp-block-verse">source /etc/profile.d/hadoop_java.sh</pre>



<p><strong>Testing Java</strong></p>



<pre class="wp-block-verse">sabi@Ubuntu:~$ echo $JAVA_HOME<br> /usr/lib/jvm/java-8-openjdk-amd64</pre>



<h3><strong>Step 3: Creating User Account</strong></h3>



<p>Create a dedicated account for Hadoop so that the Hadoop file system is isolated from the Unix file system.</p>



<pre class="wp-block-verse">sabi@Ubuntu:~$ sudo adduser hadoop<br> Adding user `hadoop' ...<br> Adding new group `hadoop' (1001) ...<br> Adding new user `hadoop' (1001) with group `hadoop' ...<br> Creating home directory `/home/hadoop' ...<br> Copying files from `/etc/skel' ...<br> Enter new UNIX password: <br> Retype new UNIX password: <br> passwd: password updated successfully<br> Changing the user information for hadoop<br> Enter the new value, or press ENTER for the default<br>     Full Name []: Sabir Hussain<br>     Room Number []: <br>     Work Phone []: <br>     Home Phone []: <br>     Other []: <br> Is the information correct? [Y/n] <strong>y</strong><br>sabi@Ubuntu:~$ sudo usermod -aG sudo hadoop</pre>



<p>After adding the user, generate an SSH key pair for it.</p>



<pre class="wp-block-verse">sabi@Ubuntu:~$ sudo su - hadoop<br>hadoop@Ubuntu:~$ ssh-keygen -t rsa<br> Generating public/private rsa key pair.<br> Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): <br> Created directory '/home/hadoop/.ssh'.<br> Enter passphrase (empty for no passphrase): <br> Enter same passphrase again: <br> Your identification has been saved in /home/hadoop/.ssh/id_rsa.<br> Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.<br> The key fingerprint is:<br> SHA256:f/lEUTkJyr49dZEHr9xZ7wCD4Lg3+ephloHQ8w8GVlY hadoop@Ubuntu<br> The key's randomart image is:<br> +---[RSA 2048]----+<br> |        +.E  .o +|<br> |     . = ....  B.|<br> |    . * . .oo .o=|<br> |     o * ..  + +*|<br> |      o S  .  =o+|<br> |       o O  oo.o.|<br> |        = +.oo. .|<br> |       o o . o.  |<br> |       .o     .  |<br> +----[SHA256]-----+</pre>



<p><strong>Allow authorization </strong></p>



<p>Add this user&#8217;s key to the list of authorized SSH keys.</p>



<pre class="wp-block-verse">cat ~/.ssh/id_rsa.pub &gt;&gt; ~/.ssh/authorized_keys<br>chmod 0600 ~/.ssh/authorized_keys</pre>



<p>Make sure that you can SSH in using the added key.</p>



<pre class="wp-block-verse">hadoop@Ubuntu:~$ ssh localhost<br> The authenticity of host 'localhost (127.0.0.1)' can't be established.<br> ECDSA key fingerprint is SHA256:jyWPWJLVC9MCHnOAFJjN8c8bwLu0o0U85cWTxHwuHvE.<br> Are you sure you want to continue connecting (yes/no)? y<br> Please type 'yes' or 'no': yes<br> Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.<br> Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 5.0.0-37-generic x86_64)<br> Documentation:  https://help.ubuntu.com<br> Management:     https://landscape.canonical.com<br> Support:        https://ubuntu.com/advantage<br> Canonical Livepatch is available for installation.<br> Reduce system reboots and improve kernel security. Activate at:<br>  https://ubuntu.com/livepatch <br> 0 packages can be updated.<br> 0 updates are security updates.<br> Your Hardware Enablement Stack (HWE) is supported until April 2023.<br> The programs included with the Ubuntu system are free software;<br> the exact distribution terms for each program are described in the<br> individual files in /usr/share/doc/*/copyright.<br> Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by<br> applicable law.<br> hadoop@Ubuntu:~$ exit<br> logout<br> Connection to localhost closed.</pre>



<h3><strong>Step 4: Download &amp; Install Hadoop</strong></h3>



<p>Check the <a href="https://hadoop.apache.org/releases.html">latest release</a> of Hadoop and download it. This guide uses version 2.10.0.</p>



<pre class="wp-block-verse">wget https://www-eu.apache.org/dist/hadoop/common/hadoop-2.10.0/hadoop-2.10.0.tar.gz</pre>



<p>Extract the files.</p>



<pre class="wp-block-verse">tar xzvf hadoop-2.10.0.tar.gz</pre>



<p>Move the resulting directory to <strong>/usr/local/hadoop</strong></p>



<pre class="wp-block-verse">sudo mv hadoop-2.10.0 /usr/local/hadoop</pre>



<p>Set up <strong>HADOOP_HOME</strong> and add the directories containing the Hadoop binaries to your <strong>$PATH</strong></p>



<pre class="wp-block-verse">cat &lt;&lt;EOF | sudo tee /etc/profile.d/hadoop_java.sh<br> export JAVA_HOME=$(dirname $(dirname $(readlink $(readlink $(which javac)))))<br> export HADOOP_HOME=/usr/local/hadoop<br> export HADOOP_HDFS_HOME=$HADOOP_HOME<br> export HADOOP_MAPRED_HOME=$HADOOP_HOME<br> export YARN_HOME=$HADOOP_HOME<br> export HADOOP_COMMON_HOME=$HADOOP_HOME<br> export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native<br> export PATH=\$PATH:\$JAVA_HOME/bin:\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin<br> EOF</pre>



<p>Source the file:</p>



<pre class="wp-block-verse">source /etc/profile.d/hadoop_java.sh</pre>



<p>Confirm your Hadoop version:</p>



<pre class="wp-block-verse">hadoop@Ubuntu:~$ hadoop version<br> Hadoop 2.10.0<br> Subversion ssh://git.corp.linkedin.com:29418/hadoop/hadoop.git -r e2f1f118e465e787d8567dfa6e2f3b72a0eb9194<br> Compiled by jhung on 2019-10-22T19:10Z<br> Compiled with protoc 2.5.0<br> From source with checksum 7b2d8877c5ce8c9a2cca5c7e81aa4026<br> This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.10.0.jar</pre>



<h3><strong>Step 5: Configure Hadoop</strong></h3>



<p>Hadoop configurations are located under <strong>/usr/local/hadoop/etc/hadoop/</strong></p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="710" height="195" src="//1723336065.rsc.cdn77.org/wp-content/uploads/2019/12/hadoop-files.jpg" alt="" class="wp-image-17334" srcset="https://www.osradar.com/wp-content/uploads/2019/12/hadoop-files.jpg 710w, https://www.osradar.com/wp-content/uploads/2019/12/hadoop-files-300x82.jpg 300w, https://www.osradar.com/wp-content/uploads/2019/12/hadoop-files-696x191.jpg 696w" sizes="(max-width: 710px) 100vw, 710px" /></figure></div>



<p>Several files need to be modified to complete the installation on Ubuntu 18.04.</p>



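<p>First, set <strong>JAVA_HOME</strong> in the shell script <strong>hadoop-env.sh</strong>. With the OpenJDK 8 package from Step 2, the path is typically the one shown here; use the value printed by <strong>echo $JAVA_HOME</strong> if yours differs.</p>



<pre class="wp-block-verse">$ sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh<br>export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64</pre>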



<p>Then configure:</p>



<h4>1.<strong>core-site.xml</strong></h4>



<p>The <strong>core-site.xml </strong>file contains Hadoop cluster information used when starting up. These properties include:</p>



<ul><li>The port number used for Hadoop instance</li><li> The memory allocated for file system</li><li> The memory limit for data storage</li><li> The size of Read / Write buffers.</li></ul>



<p>Open core-site.xml</p>



<pre class="wp-block-verse">sudo nano /usr/local/hadoop/etc/hadoop/core-site.xml</pre>



<p>Add the following properties between the <strong>&lt;configuration&gt; and &lt;/configuration&gt;</strong> tags.</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="683" height="506" src="//1723336065.rsc.cdn77.org/wp-content/uploads/2019/12/site.xml-config.jpg" alt="" class="wp-image-17335" srcset="https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config.jpg 683w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-300x222.jpg 300w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-80x60.jpg 80w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-485x360.jpg 485w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-567x420.jpg 567w" sizes="(max-width: 683px) 100vw, 683px" /></figure></div>
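


<p>The properties above appear only as a screenshot. As a text version, here is a minimal sketch of what a single-node <strong>core-site.xml</strong> entry usually looks like. The host <strong>localhost</strong> and port <strong>9000</strong> are assumptions, not values read from the image, so adjust them to match your setup.</p>



<pre class="wp-block-verse">&lt;!-- Assumed values: NameNode on localhost, port 9000 --&gt;<br>&lt;property&gt;<br>  &lt;name&gt;fs.defaultFS&lt;/name&gt;<br>  &lt;value&gt;hdfs://localhost:9000&lt;/value&gt;<br>&lt;/property&gt;</pre>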



<h4>2. <strong>hdfs-site.xml</strong></h4>



<p>Configure this file on each host used in the cluster. It holds the following information:</p>



<ul><li>The namenode &amp; datanode paths on the local filesystem.</li><li>The replication factor for the data.</li></ul>



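<p>I&#8217;m using my primary disk to store the Hadoop data. If you have a secondary disk, you can dedicate it to Hadoop with the procedure that follows; the commands assume it appears as <strong>/dev/sdb</strong>.</p>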



<pre class="wp-block-verse">hadoop@Ubuntu:~$ lsblk<br> NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT<br> loop0    7:0    0 149.9M  1 loop /snap/gnome-3-28-1804/67<br> loop1    7:1    0  54.4M  1 loop /snap/core18/1066<br> loop2    7:2    0   4.2M  1 loop /snap/gnome-calculator/544<br> loop3    7:3    0  14.8M  1 loop /snap/gnome-characters/296<br> loop4    7:4    0     4M  1 loop /snap/gnome-calculator/406<br> loop5    7:5    0   3.7M  1 loop /snap/gnome-system-monitor/123<br> loop6    7:6    0  89.1M  1 loop /snap/core/8268<br> loop7    7:7    0  14.8M  1 loop /snap/gnome-characters/375<br> loop8    7:8    0   3.7M  1 loop /snap/gnome-system-monitor/100<br> loop9    7:9    0  1008K  1 loop /snap/gnome-logs/61<br> loop10   7:10   0  88.5M  1 loop /snap/core/7270<br> loop11   7:11   0 156.7M  1 loop /snap/gnome-3-28-1804/110<br> loop12   7:12   0   956K  1 loop /snap/gnome-logs/81<br> loop13   7:13   0  44.2M  1 loop /snap/gtk-common-themes/1353<br> loop14   7:14   0  42.8M  1 loop /snap/gtk-common-themes/1313<br> sda      8:0    0    20G  0 disk <br> └─sda1   8:1    0    20G  0 part /<br> sr0     11:0    1     2G  0 rom  </pre>



<p>Partition the disk and mount it on the <strong>/hadoop</strong> directory.</p>



<pre class="wp-block-verse">sudo parted -s -- /dev/sdb mklabel gpt<br>sudo parted -s -a optimal -- /dev/sdb mkpart primary 0% 100%<br>sudo parted -s -- /dev/sdb align-check optimal 1<br>sudo mkfs.xfs /dev/sdb1<br>sudo mkdir /hadoop<br>echo "/dev/sdb1 /hadoop xfs defaults 0 0" | sudo tee -a /etc/fstab<br>sudo mount -a</pre>



<p>Check:</p>



<pre class="wp-block-verse">hadoop@Ubuntu:~$ df -hT | grep /dev/sda1<br>
/dev/sda1      ext4       20G  7.4G   12G  40% /</pre>



<p>Create directories for namenode &amp; datanode</p>



<pre class="wp-block-verse">sudo mkdir -p /hadoop/hdfs/{namenode,datanode}</pre>



<p>Now, set ownership to hadoop user &amp; group</p>



<pre class="wp-block-verse">sudo chown -R hadoop:hadoop /hadoop</pre>



<p>Open the file</p>



<pre class="wp-block-verse">sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml</pre>



<p>Then add the data below between the &lt;configuration&gt; &amp; &lt;/configuration&gt; tags.</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="683" height="506" src="//1723336065.rsc.cdn77.org/wp-content/uploads/2019/12/site.xml-config-2.jpg" alt="" class="wp-image-17337" srcset="https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-2.jpg 683w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-2-300x222.jpg 300w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-2-80x60.jpg 80w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-2-485x360.jpg 485w, https://www.osradar.com/wp-content/uploads/2019/12/site.xml-config-2-567x420.jpg 567w" sizes="(max-width: 683px) 100vw, 683px" /></figure></div>
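


<p>Since the settings above are shown only as an image, here is a hedged text sketch of a typical single-node <strong>hdfs-site.xml</strong>. The directories are the ones created earlier under <strong>/hadoop/hdfs</strong>; the replication factor of <strong>1</strong> is an assumption that suits a single node and may differ from the screenshot.</p>



<pre class="wp-block-verse">&lt;!-- Assumed value: one replica, since there is a single datanode --&gt;<br>&lt;property&gt;<br>  &lt;name&gt;dfs.replication&lt;/name&gt;<br>  &lt;value&gt;1&lt;/value&gt;<br>&lt;/property&gt;<br>&lt;!-- Paths created earlier with mkdir -p /hadoop/hdfs/{namenode,datanode} --&gt;<br>&lt;property&gt;<br>  &lt;name&gt;dfs.namenode.name.dir&lt;/name&gt;<br>  &lt;value&gt;file:///hadoop/hdfs/namenode&lt;/value&gt;<br>&lt;/property&gt;<br>&lt;property&gt;<br>  &lt;name&gt;dfs.datanode.data.dir&lt;/name&gt;<br>  &lt;value&gt;file:///hadoop/hdfs/datanode&lt;/value&gt;<br>&lt;/property&gt;</pre>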



<h4>3. <strong>mapred-site.xml</strong></h4>



<p>Use this file to choose which framework MapReduce runs on.</p>



<pre class="wp-block-verse">sudo nano /usr/local/hadoop/etc/hadoop/mapred-site.xml</pre>



<p>Set it as shown below:</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="637" height="215" src="//1723336065.rsc.cdn77.org/wp-content/uploads/2019/12/mapred.site_.jpg" alt="" class="wp-image-17338" srcset="https://www.osradar.com/wp-content/uploads/2019/12/mapred.site_.jpg 637w, https://www.osradar.com/wp-content/uploads/2019/12/mapred.site_-300x101.jpg 300w" sizes="(max-width: 637px) 100vw, 637px" /></figure></div>
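


<p>As a text counterpart to the image, this is a minimal sketch of a <strong>mapred-site.xml</strong> that runs MapReduce on YARN. The value <strong>yarn</strong> is an assumption based on the YARN setup in the next step. Hadoop 2.x releases may ship only <strong>mapred-site.xml.template</strong>, in which case the new file needs the full <strong>&lt;configuration&gt;</strong> wrapper, so it is included here.</p>



<pre class="wp-block-verse">&lt;configuration&gt;<br>  &lt;!-- Assumed value: run MapReduce jobs on YARN --&gt;<br>  &lt;property&gt;<br>    &lt;name&gt;mapreduce.framework.name&lt;/name&gt;<br>    &lt;value&gt;yarn&lt;/value&gt;<br>  &lt;/property&gt;<br>&lt;/configuration&gt;</pre>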



<h4><strong>4. yarn-site.xml</strong></h4>



<p>Settings in this file override the defaults for <a href="https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">Hadoop YARN</a>, which defines the resource management &amp; job scheduling logic.</p>



<pre class="wp-block-verse">sudo nano /usr/local/hadoop/etc/hadoop/yarn-site.xml</pre>



<p>Configure it as shown below:</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="738" height="573" src="//1723336065.rsc.cdn77.org/wp-content/uploads/2019/12/yarn-site.xml_.jpg" alt="" class="wp-image-17339" srcset="https://www.osradar.com/wp-content/uploads/2019/12/yarn-site.xml_.jpg 738w, https://www.osradar.com/wp-content/uploads/2019/12/yarn-site.xml_-300x233.jpg 300w, https://www.osradar.com/wp-content/uploads/2019/12/yarn-site.xml_-696x540.jpg 696w, https://www.osradar.com/wp-content/uploads/2019/12/yarn-site.xml_-541x420.jpg 541w" sizes="(max-width: 738px) 100vw, 738px" /></figure></div>
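


<p>The screenshot is not available as text, so here is a minimal sketch of the one <strong>yarn-site.xml</strong> property a single-node setup generally needs to run MapReduce jobs. It is an assumption rather than a transcription of the image, which shows more properties than this. Place it between the <strong>&lt;configuration&gt;</strong> tags.</p>



<pre class="wp-block-verse">&lt;!-- Assumed minimal setting: enable the MapReduce shuffle service on the NodeManager --&gt;<br>&lt;property&gt;<br>  &lt;name&gt;yarn.nodemanager.aux-services&lt;/name&gt;<br>  &lt;value&gt;mapreduce_shuffle&lt;/value&gt;<br>&lt;/property&gt;</pre>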



<h3><strong>Step 6: Validate Hadoop Configuration</strong></h3>



<p>Initialize the Hadoop infrastructure store by formatting the NameNode.</p>



<pre class="wp-block-verse">sudo su - hadoop<br>hdfs namenode -format</pre>



<h3><strong>Test HDFS Configuration</strong></h3>



<pre class="wp-block-verse">$ start-dfs.sh<br>
Starting namenodes on [localhost]<br>
Starting datanodes<br>
Starting secondary namenodes [hbase]<br>
hbase: Warning: Permanently added 'hbase' (ECDSA) to the list of known hosts.</pre>



<p>Finally, verify the YARN configuration:</p>



<pre class="wp-block-verse">$ start-yarn.sh<br>Starting resourcemanager<br>Starting nodemanagers</pre>



<p>These are the default web UI ports in Hadoop 2.x:</p>



<ul><li><strong>NameNode – </strong>Default HTTP port is 50070.</li><li><strong>ResourceManager – </strong>Default HTTP port is 8088.</li><li><strong>MapReduce JobHistory Server – </strong>Default HTTP port is 19888.</li></ul>



<p>Check the listening ports with:</p>



<pre class="wp-block-verse">ss -tunelp</pre>



<p>Access the Hadoop web dashboard at http://ServerIP:50070</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="692" height="332" src="//1723336065.rsc.cdn77.org/wp-content/uploads/2019/12/Datanode-information.jpg" alt="" class="wp-image-17340" srcset="https://www.osradar.com/wp-content/uploads/2019/12/Datanode-information.jpg 692w, https://www.osradar.com/wp-content/uploads/2019/12/Datanode-information-300x144.jpg 300w" sizes="(max-width: 692px) 100vw, 692px" /></figure></div>



<p>See the Hadoop cluster overview at http://ServerIP:8088</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="828" height="274" src="//1723336065.rsc.cdn77.org/wp-content/uploads/2019/12/hadoop-configure.jpg" alt="" class="wp-image-17341" srcset="https://www.osradar.com/wp-content/uploads/2019/12/hadoop-configure.jpg 828w, https://www.osradar.com/wp-content/uploads/2019/12/hadoop-configure-300x99.jpg 300w, https://www.osradar.com/wp-content/uploads/2019/12/hadoop-configure-768x254.jpg 768w, https://www.osradar.com/wp-content/uploads/2019/12/hadoop-configure-696x230.jpg 696w" sizes="(max-width: 828px) 100vw, 828px" /></figure></div>



<p>Let&#8217;s create a directory in HDFS to test it:</p>



<pre class="wp-block-verse">$ hadoop fs -mkdir /test<br> $ hadoop fs -ls /<br> Found 1 items<br> drwxr-xr-x   - hadoop supergroup          0 2019-12-29 10:23 /test</pre>



<h5><strong>Stopping Hadoop Services</strong></h5>



<p>Run the following commands to stop the Hadoop services.</p>



<pre class="wp-block-verse">$ stop-dfs.sh<br> $ stop-yarn.sh</pre>



<p>See our next article to read <a href="https://www.osradar.com/?p=17342">How To Install HBase on Ubuntu 18.04</a></p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.osradar.com/how-to-install-apache-hadoop-hbase-on-ubuntu-18-04/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
