Limiting The Storage In Hadoop Cluster By Data Node

Hello Connections!! Welcome, everyone, to my article on limiting the storage a Data Node contributes to a Hadoop cluster. Along the way we will also get an overview of Hadoop clusters, storage partitioning, and the AWS services involved. Finally, we will put it into practice: limiting Data Node storage on AWS EBS volumes using Linux partitioning commands.

COMPLETION OF THE TASK: Limiting The Storage In Hadoop Cluster By Data Node

1) I created a Hadoop cluster on the AWS Cloud.

So, let’s get started with Hadoop’s Name Node and Data Node services.
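With the instances up, the daemons can be started from the shell. A minimal sketch, assuming a Hadoop 1.x installation (matching the `hadoop dfsadmin` syntax used later) with its scripts on PATH; it is guarded so it degrades gracefully where Hadoop is absent:

```shell
# Start the HDFS daemons (Hadoop 1.x style scripts; an assumption here).
if command -v hadoop-daemon.sh >/dev/null 2>&1; then
    hadoop-daemon.sh start namenode   # run this on the Name Node instance
    hadoop-daemon.sh start datanode   # run this on each Data Node instance
    jps                               # should list NameNode / DataNode JVMs
    STATUS="started"
else
    echo "hadoop-daemon.sh not found on PATH; is Hadoop installed?"
    STATUS="hadoop-missing"
fi
```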

2) Ordinarily, any folder or directory sits on a single storage device, such as the C: drive, and can grow to consume the entire drive’s space.

In a Hadoop cluster, Data Node storage is likewise just a folder or directory, and by default it exposes the whole capacity of the disk on which it resides.
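Which directory the Data Node serves is set in hdfs-site.xml. A minimal sketch, assuming the directory is /dn1 as used later in this walkthrough (the property is `dfs.data.dir` in Hadoop 1.x; newer releases call it `dfs.datanode.data.dir`):

```xml
<!-- hdfs-site.xml on the Data Node; /dn1 is this walkthrough's directory -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>
```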

Using the command:

df -h

we can see how much space is available on the drive where the Data Node storage directory resides.

This Linux command displays the total and consumed space of every mounted hard drive or volume.
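If you want just those numbers in a script, they can be pulled out of df directly; a small sketch checking /, where the Data Node directory sits at this point:

```shell
# -P keeps each filesystem on one line so awk sees consistent columns.
DF_LINE=$(df -hP / | awk 'NR==2 {print "size=" $2 " used=" $3 " avail=" $4}')
echo "$DF_LINE"
```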

As we can see, the / drive is 10 GB in size. In my case, the Data Node directory is on the / disk. Let’s confirm that the Data Node shares the storage of / :

hadoop dfsadmin -report
3) Now we need to figure out how to restrict or limit the storage of Data Nodes in a Hadoop cluster.

We’ll use the idea of disk partitions for this. So, let’s create a small EBS volume and attach it to the Data Node.

We can list all the disk partitions present on our Data Node with the Linux command:
fdisk -l
4) The 15 GB EBS volume /dev/xvdf has now been attached to the Data Node. We can create partitions on it with the following command:
fdisk  device_name

We just need to specify the size of the primary partition at its Last sector prompt. In my case, I created an 8 GB partition.
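The same interactive answers can be fed to fdisk non-interactively. A hedged sketch, reusing this setup’s device name /dev/xvdf; it is destructive, so it only runs when that block device actually exists:

```shell
DEV=/dev/xvdf
if [ -b "$DEV" ]; then
    # n = new, p = primary, 1 = partition number,
    # empty line = default first sector, +8G = last sector, w = write table
    printf 'n\np\n1\n\n+8G\nw\n' | fdisk "$DEV"
    RESULT="partitioned"
else
    echo "block device $DEV is not attached; nothing to do"
    RESULT="skipped"
fi
```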

5) Verify that the partition was created successfully. We can see that an 8 GB partition named /dev/xvdf1 now exists.
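lsblk offers a quicker check than fdisk -l here; a sketch that falls back cleanly when the disk is not attached:

```shell
# The new 8 GB /dev/xvdf1 should appear nested under /dev/xvdf.
LSBLK_OUT=$(lsblk -o NAME,SIZE,TYPE 2>/dev/null | grep xvdf || echo "no xvdf devices found")
echo "$LSBLK_OUT"
```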
6) Now we have to format the partition.

To format the partition, we use:

mkfs.ext4  device_name
7) Once the partition has been formatted, we must mount our Data Node directory on the newly created partition.

To mount the partition, use the following command:

mount  device_name  directory_name
8) We can see that we now have a limit on Data Node storage: the /dn1 Data Node directory can use only the storage we supplied.

Finally, the Data Node no longer consumes the entire 15 GB EBS volume. By applying disk partitions, we control exactly how much storage it receives.
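Putting steps 6 to 8 together, the whole limiting sequence is short. A sketch using this walkthrough’s names (/dev/xvdf1, /dn1); mkfs is destructive, so it is guarded on the partition existing:

```shell
DEV=/dev/xvdf1   # the 8 GB partition created earlier
DIR=/dn1         # the Data Node directory
if [ -b "$DEV" ]; then
    mkfs.ext4 "$DEV"      # format the partition
    mkdir -p "$DIR"       # make sure the Data Node directory exists
    mount "$DEV" "$DIR"   # the Data Node is now capped at ~8 GB
    df -h "$DIR"          # confirm the new ceiling
else
    echo "$DEV not present; create the partition first"
fi
```

Note that a plain mount does not survive a reboot; adding a matching entry to /etc/fstab makes the limit persistent.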
