DEV Community

Piyush Bagani

Integrating LVM with Hadoop Cluster

In this post, I am going to integrate LVM with Hadoop to provide elasticity to DataNode storage.

Before we get started, let’s understand some basics.

What is LVM?

Logical Volume Management (LVM) creates a layer of abstraction over physical storage, allowing you to create logical storage volumes. With LVM in place, you no longer have to worry about physical disk sizes: the hardware storage is hidden from the software, so volumes can be resized and moved without stopping applications and, when growing a volume, usually without unmounting the file system. You can think of LVM as dynamic partitions.

For example, if you are running out of disk space on your server, you can just add another disk and extend the logical volume on the fly.

Below are some advantages of using logical volumes over using physical storage directly:

Resize storage pools: You can extend the logical space, or reduce it, without reformatting the disks.
Flexible storage capacity: You can add more space by attaching more disks and adding them to the pool of physical storage.

What is Hadoop?

Hadoop is an open-source, Java-based framework that supports the storage and processing of extremely large data sets in a distributed computing environment. It is an Apache project sponsored by the Apache Software Foundation.

Some Basic Terminology:

NameNode: Also called the master node, it is the main component of the Hadoop cluster. It stores metadata such as block locations, which is used for file read and write operations.

DataNode: Also known as a slave node, it contributes its own storage to the cluster on behalf of the master node. It is the final location where the files are stored. There can be many DataNodes in one cluster.

So let’s get started.

Step 1: Attach new hard disks. Here, we have attached two hard disks.

Note: Here we are using the AWS cloud, where 2 instances have been launched, but you can also do this on local systems.

After attaching the volumes, we can check whether the hard disks are attached using the fdisk -l command.
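As a rough sketch of this check, the commands below list the disks the system can see. The `run` helper is an addition of this sketch, not part of the original steps: it only prints each command unless `APPLY=1` is set, since `fdisk` needs root. `lsblk` is included as an optional second view.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set, because fdisk needs root.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

list_disks() {
  run fdisk -l   # partition tables of all attached disks
  run lsblk      # tree view of block devices and their sizes
}

list_disks
```

The newly attached disks should show up as extra devices without partitions; their names depend on the platform.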

Step 2: Convert the hard disks to Physical Volumes (PV)

The pvcreate command initializes these disks so that they can become part of a volume group.

We can use the pvdisplay command to view information about the PVs.
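A minimal sketch of this step follows. The device names `/dev/xvdf` and `/dev/xvdg` are hypothetical (the original screenshots are not available), so substitute the names reported by fdisk -l. The `run` helper only prints the commands unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set, because these commands need root.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

# Hypothetical device names; use the ones reported by fdisk -l.
DISK1="${DISK1:-/dev/xvdf}"
DISK2="${DISK2:-/dev/xvdg}"

create_pvs() {
  run pvcreate "$DISK1" "$DISK2"    # initialize both disks as physical volumes
  run pvdisplay "$DISK1" "$DISK2"   # show size and allocation details per PV
}

create_pvs
```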

Step 3: Create a volume group

Physical volumes are combined into volume groups (VGs). A VG is a pool of disk space out of which logical volumes can be allocated.

We can view information about the volume group using the vgdisplay command.
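The pooling step could look roughly like this. The group name `vg-01` is the one mentioned in Step 4; the two device names are hypothetical placeholders. As a safety measure of this sketch, `run` only prints the commands unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set, because these commands need root.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

VG_NAME="${VG_NAME:-vg-01}"
# Hypothetical device names of the two physical volumes.
DISK1="${DISK1:-/dev/xvdf}"
DISK2="${DISK2:-/dev/xvdg}"

create_vg() {
  run vgcreate "$VG_NAME" "$DISK1" "$DISK2"   # pool both PVs into one VG
  run vgdisplay "$VG_NAME"                    # show total and free space
}

create_vg
```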

Step 4: Create a Logical Volume

A volume group is divided up into logical volumes. So if you have created vg-01 earlier, you can create logical volumes from that VG.

We can view information about the LV using the lvdisplay command.
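One way to carve out the logical volume is sketched below. The LV name `lv-data` is hypothetical, and the 4 GiB starting size is taken from the result reported at the end of this post. `run` only prints the commands unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set, because these commands need root.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

VG_NAME="${VG_NAME:-vg-01}"
LV_NAME="${LV_NAME:-lv-data}"   # hypothetical LV name
LV_SIZE="${LV_SIZE:-4G}"        # initial size; LVM reads G as GiB

create_lv() {
  run lvcreate --size "$LV_SIZE" --name "$LV_NAME" "$VG_NAME"
  run lvdisplay "$VG_NAME/$LV_NAME"   # confirm the LV path and size
}

create_lv
```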

Step 5: Format the logical volume
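A sketch of the formatting step, assuming an ext4 file system (a later step uses resize2fs, which works on ext2/ext3/ext4). The path `/dev/vg-01/lv-data` uses a hypothetical LV name. Formatting destroys any data on the volume, so `run` only prints the command unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set. mkfs erases the target volume.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

# Hypothetical LV path in the form /dev/<volume group>/<logical volume>.
LV_PATH="${LV_PATH:-/dev/vg-01/lv-data}"

format_lv() {
  run mkfs.ext4 "$LV_PATH"   # create an ext4 file system on the LV
}

format_lv
```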

Step 6: Mount the logical volume
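The mount step might look like the sketch below. The mount point `/dn1` is hypothetical; the assumption here is that it is the directory the DataNode is configured to store its blocks in (the `dfs.datanode.data.dir` property in hdfs-site.xml on Hadoop 2 and later). `run` only prints the commands unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set, because mount needs root.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

LV_PATH="${LV_PATH:-/dev/vg-01/lv-data}"   # hypothetical LV path
MOUNT_POINT="${MOUNT_POINT:-/dn1}"         # hypothetical DataNode directory

mount_lv() {
  run mkdir -p "$MOUNT_POINT"          # create the directory if missing
  run mount "$LV_PATH" "$MOUNT_POINT"  # attach the LV's file system to it
  run df -h "$MOUNT_POINT"             # confirm the mounted size
}

mount_lv
```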

Step 7: To increase the size of the logical volume, use the lvextend command.
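As an illustration, growing the volume by 8 GiB would take it from 4 GiB to the 12 GiB reported at the end of this post; the exact increment used in the original screenshot is an assumption. The LV path is hypothetical, and `run` only prints the command unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set, because lvextend needs root.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

LV_PATH="${LV_PATH:-/dev/vg-01/lv-data}"   # hypothetical LV path
EXTRA="${EXTRA:-+8G}"                      # leading + means "add this much"

extend_lv() {
  run lvextend --size "$EXTRA" "$LV_PATH"  # take free space from the VG
}

extend_lv
```

The volume group must have at least that much free space, which vgdisplay reports.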

Step 8: Now, grow the file system so that it covers the new space added to the LV, using the resize2fs command. This does not reformat the volume, so the existing data is kept.
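A sketch of the resize, again with a hypothetical LV path. Called without a size argument, resize2fs grows the file system to fill the whole device. `run` only prints the command unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set, because resize2fs needs root.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

LV_PATH="${LV_PATH:-/dev/vg-01/lv-data}"   # hypothetical LV path

grow_fs() {
  run resize2fs "$LV_PATH"   # no size given: expand to the full LV size
}

grow_fs
```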

Step 9: Check the size of the storage

We have increased the storage of the DataNode from 4 GiB to 12 GiB on the fly.
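The check could be done along these lines. Which command the original screenshot used is not known, so both an OS-level and a Hadoop-level view are sketched; `/dn1` is a hypothetical mount point, and `hdfs dfsadmin -report` assumes Hadoop 2 or later (older releases use `hadoop dfsadmin -report`). `run` only prints the commands unless `APPLY=1` is set.

```shell
# Dry-run helper (an assumption of this sketch): prints each command and
# only executes it when APPLY=1 is set.
run() { echo "+ $*"; if [ "${APPLY:-0}" = 1 ]; then "$@"; fi; }

MOUNT_POINT="${MOUNT_POINT:-/dn1}"   # hypothetical DataNode directory

check_storage() {
  run df -h "$MOUNT_POINT"      # file system size as the OS sees it
  run hdfs dfsadmin -report     # capacity each DataNode reports to HDFS
}

check_storage
```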

That’s all. We did it!

Thanks For Reading😊.

Keep Learning, Keep Hustling😊
