Configuring Hadoop Cluster using Ansible and restarting httpd server

Poojya Puju
8 min read · Mar 21, 2021

Configure Hadoop and start the cluster services using an Ansible playbook, and deal with the fact that restarting the HTTPD service is not idempotent in nature.

🔰 11.1 Configure Hadoop and start the cluster services using ansible-playbook

🔰 11.3 Restarting the HTTPD service is not idempotent in nature and also consumes more resources. Suggest a way to rectify this challenge in ansible-playbook.

Solution 11.1:

First of all, we have to install Ansible and configure its inventory. Run the following command in your VM (the controller node) to install Ansible:

pip3 install ansible

We can check the Ansible version after installation with:

ansible --version

Next, we create a group of IPs in the file /ip.txt on the controller node. In it we list the IPs of the managed nodes as hosts and give the username and password used to log in to each of them. This file is known as the inventory, and it is what lets Ansible automate every node in a group. You can see that there are two IPs in the group.
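The inventory itself is only shown as a screenshot, so here is a minimal sketch of what such a file could look like in Ansible's YAML inventory format. The group names namenode and datanode match the hosts: lines of the playbook further down, and 192.168.0.142 is the name_ip from var.yml. Everything else is an assumption for illustration: the file name inventory.yml, the data node address, the root user, and the password placeholder. Note that the YAML inventory plugin does not accept a .txt extension by default, so this form would not work if saved as /ip.txt.

```yaml
# inventory.yml (hypothetical file name; placeholder values marked below)
all:
  children:
    namenode:
      hosts:
        192.168.0.142:              # name_ip from var.yml
    datanode:
      hosts:
        192.0.2.10:                 # placeholder: replace with your data node's IP
  vars:
    ansible_connection: ssh
    ansible_user: root              # assumed login user
    ansible_password: "CHANGE_ME"   # placeholder; SSH keys or Ansible Vault are safer
```

Password-based SSH logins need the sshpass program on the controller node. A file like this can be passed explicitly with ansible-playbook -i inventory.yml hadoop.yml.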

Before acting on the managed nodes, we need to check that all of them are connected and pingable. That can be checked with the following command, which shows which of the managed nodes' IPs are reachable from the controller node.
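The command from the screenshot is not reproduced here. As one possible way to run the same check, this sketch uses a tiny playbook built on Ansible's ping module; the file name check.yml is hypothetical.

```yaml
# check.yml (hypothetical file name)
- hosts: all
  gather_facts: false      # skip fact gathering; only connectivity matters here
  tasks:
    - name: Verify that every managed node is reachable
      ping:                # Ansible's ping module: logs in over SSH, it is not ICMP
```

Running ansible-playbook check.yml should report ok for each node that Ansible can log in to, and unreachable for any node it cannot.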

Now that we know the nodes are connected, we configure the ansible.cfg file in the /etc/ansible/ directory. So that the playbook can log in over SSH without stopping to confirm each host key, we set host_key_checking = False. We also disable command warnings with command_warnings = False.

Now we create our own playbook. Ansible playbooks are written in YAML, so the playbook file is hadoop.yml.

Now we are ready with the playbook code. Here is the complete code in detail.

- hosts: namenode
  vars_files:
    - var.yml
  tasks:
    - name: Copy Java Software
      copy:
        src: "/root/jdk-8u171-linux-x64.rpm"
        dest: "/root/"

    - name: Copy Hadoop Software
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root/"

    - name: Install Java Software
      shell: "rpm -i /root/jdk-8u171-linux-x64.rpm"
      register: java_install

    - name: java install information
      debug:
        var: java_install

    # Install Hadoop only if the Java install succeeded
    - name: Install Hadoop Software
      shell: "rpm -i /root/hadoop-1.2.1-1.x86_64.rpm --force"
      register: hadoop_install
      when: java_install.rc == 0

    - name: hadoop install information
      debug:
        var: hadoop_install

    - name: Create Directory
      file:
        state: directory
        path: "{{ name_dir }}"

    - name: Copy hdfs-site.xml file
      template:
        src: "name_hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"

    - name: Copy core-site.xml file
      template:
        src: "name_core-site.xml"
        dest: "/etc/hadoop/core-site.xml"

    # The format command asks for confirmation, so answer Y
    - name: Format the namenode directory
      shell: "echo Y | hadoop namenode -format"

    - name: Start Namenode Service
      shell: "hadoop-daemon.sh start namenode"

- hosts: datanode
  vars_files:
    - var.yml
  tasks:
    - name: Copy Java Software
      copy:
        src: "/root/jdk-8u171-linux-x64.rpm"
        dest: "/root/"

    - name: Copy Hadoop Software
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root/"

    - name: Install Java Software
      shell: "rpm -i /root/jdk-8u171-linux-x64.rpm"
      register: java_install

    - name: java install information
      debug:
        var: java_install

    # Install Hadoop only if the Java install succeeded
    - name: Install Hadoop Software
      shell: "rpm -i /root/hadoop-1.2.1-1.x86_64.rpm --force"
      register: hadoop_install
      when: java_install.rc == 0

    - name: hadoop install information
      debug:
        var: hadoop_install

    - name: Create Directory
      file:
        state: directory
        path: "{{ data_dir }}"

    - name: Copy hdfs-site.xml file
      template:
        src: "data_hdfs-site.xml"
        dest: "/etc/hadoop/hdfs-site.xml"

    - name: Copy core-site.xml file
      template:
        src: "data_core-site.xml"
        dest: "/etc/hadoop/core-site.xml"

    - name: Start Datanode Service
      shell: "hadoop-daemon.sh start datanode"

And here is my variable file, var.yml, where I store the variables.

name_ip: 192.168.0.142
name_port: 9001
name_dir: /nn8
data_dir: /dn8
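One thing to keep in mind with hadoop.yml: its shell tasks run on every invocation, so a second run tries to install the RPMs and format the name node again. Here is a minimal sketch of how such tasks could be guarded. It rests on two assumptions that the original playbook does not state: that the installed package is named hadoop (taken from the RPM file name), and that name_dir is the directory the hdfs-site.xml template uses as dfs.name.dir, so a previous format has left current/VERSION inside it.

```yaml
# Sketch of guarded tasks for the namenode play (assumptions noted inline)
- name: Check whether the hadoop package is already installed
  command: rpm -q hadoop          # assumed package name, from hadoop-1.2.1-1.x86_64.rpm
  register: hadoop_pkg
  failed_when: false              # a non-zero rc here only means "not installed"
  changed_when: false             # a query never changes the node

- name: Install Hadoop Software
  shell: "rpm -i /root/hadoop-1.2.1-1.x86_64.rpm --force"
  when: hadoop_pkg.rc != 0        # skip when the package is already present

- name: Format the namenode directory
  shell: "echo Y | hadoop namenode -format"
  args:
    creates: "{{ name_dir }}/current/VERSION"   # assumed marker left by an earlier format
```

With creates set, Ansible skips the task whenever that path already exists, which keeps a rerun from wiping the name node's metadata. The Java install could be guarded the same way using its own package name.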

To run the playbook, write ansible-playbook hadoop.yml. Before running it, we can check for syntax errors with:

ansible-playbook --syntax-check hadoop.yml

Now I check in the Namenode virtual machine whether everything went well.

In the above image, you can see that at first Java and Hadoop are not installed and the jps command does not work, but after running the playbook everything is configured.

In the above image, you can see that the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files are configured after running the playbook.

Now, let's check in the Datanode VM whether everything went well.

In the above image, you can see that at first Java and Hadoop are not installed and the jps command does not work, but after running the playbook everything is configured.

In the above image, you can see that the /etc/hadoop/hdfs-site.xml and /etc/hadoop/core-site.xml files are configured after running the playbook.

You can check the report of the Hadoop cluster by typing hadoop dfsadmin -report.

hadoop dfsadmin -report

🔰 11.3 Restarting the HTTPD service is not idempotent in nature and also consumes more resources. Suggest a way to rectify this challenge in ansible-playbook.

Solution 11.3:

The setup is the same as in Solution 11.1: install Ansible with pip3 install ansible, create the inventory in /ip.txt, check that the managed nodes are pingable, and set host_key_checking = False and command_warnings = False in /etc/ansible/ansible.cfg.

Now we create our own playbook. Ansible playbooks are written in YAML. We have created the playbook file, webserver.yml, and the index.html page as well.

Now we are ready with the playbook code. Here is the complete code in detail.

- hosts: all
  vars_files:
    - var1.yml

  tasks:
    - name: "Create directory for dvd mount"
      file:
        state: directory
        path: "{{ dvd_dir }}"

    - name: "Mount the dvd to the directory"
      mount:
        src: "/dev/cdrom"
        path: "{{ dvd_dir }}"
        state: mounted
        fstype: "iso9660"

    - name: "Configure AppStream for yum"
      yum_repository:
        baseurl: "{{ dvd_dir }}/AppStream"
        name: "dvd1"
        description: "dvd1 for AppStream packages"
        gpgcheck: no

    - name: "Configure BaseOS for yum"
      yum_repository:
        baseurl: "{{ dvd_dir }}/BaseOS"
        name: "dvd2"
        description: "dvd2 for BaseOS packages"
        gpgcheck: no

    - name: "Install package"
      package:
        name: "httpd"
        state: present
      register: x

    - name: "Create directory for web server"
      file:
        state: directory
        path: "{{ doc_root }}"
      register: y

    # The handler is notified only when this task changes the file
    - name: "Copy the configuration file"
      template:
        dest: "/etc/httpd/conf.d/lw.conf"
        src: "lw.conf"
      when: x.rc == 0
      notify:
        - Start service

    - name: "Copy the web page"
      copy:
        dest: "{{ doc_root }}/index.html"
        content: "this is neeew web page\n"
      when: y.failed == false

    - name: "start httpd service"
      service:
        name: "httpd"
        state: started

    - name: "Create firewall rule"
      firewalld:
        port: "{{ http_port }}/tcp"
        state: enabled
        permanent: yes
        immediate: yes

  handlers:
    - name: Start service
      service:
        name: "httpd"
        state: restarted

And here is my variable file, var1.yml, where I store the variables.

doc_root: "/var/www/arvind"
dvd_dir: "/dvd5"
http_port: 8082

To run the playbook, write ansible-playbook webserver.yml. Before running it, we can check for syntax errors with:

ansible-playbook --syntax-check webserver.yml

Now you can check in the VM whose IP is 192.168.0.101, where I want to deploy the web server.

Now you can check from the browser whether the web server is running.

Now, if you run the playbook again, it will show that the service is already started, so there is no need to restart it. This is possible because of the handlers and notify keywords in Ansible: a handler runs only when a task that notifies it reports a change, so httpd is restarted only when the configuration file actually changes.
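To isolate the pattern, here is a minimal sketch that keeps only the pieces involved: the template task, the notify, and the handler. It reuses lw.conf and var1.yml from the playbook above; the handler name Restart httpd is a hypothetical rename for readability, and the commented-out task at the top shows the non-idempotent form being avoided.

```yaml
# Not idempotent: as a plain task this would restart httpd on every run
# - name: Restart httpd
#   service:
#     name: httpd
#     state: restarted

- hosts: all
  vars_files:
    - var1.yml
  tasks:
    - name: Copy the configuration file
      template:
        src: lw.conf
        dest: /etc/httpd/conf.d/lw.conf
      notify: Restart httpd      # queued only if the rendered file differs

    - name: Make sure httpd is running
      service:
        name: httpd
        state: started           # no change reported when already running

  handlers:
    - name: Restart httpd        # hypothetical name; must match the notify string
      service:
        name: httpd
        state: restarted         # runs once, at the end of the play, if notified
```

On a run where lw.conf renders to the same content, the template task reports ok instead of changed, the handler is never queued, and the running httpd process is left alone.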

Now I change my var file where I store the variables.

doc_root: "/var/www/arvind"
dvd_dir: "/dvd5"
http_port: 8083

Now I run my playbook again with the new variables.

Now you can check in the VM where I want to deploy the web server.

You can check the final output from the browser by typing both port numbers, 8082 as well as 8083.

I hope you found something interesting here.

Thank You, Guys…😊😊😊😊😊😊
