DEV Community

Deependra kushwah

Comparison of Azure Data Lake Storage (ADLS) Gen1 vs Gen2

Azure Data Lake Storage, or Data Lake for short, is a repository that holds a huge quantity of structured or unstructured data in its natural format (i.e., object blobs or files). Data lakes handle the three Vs of big data (Volume, Velocity, and Variety).

Let’s dig deeper into the main topics related to Azure Data Lake. Hopefully you will find it helpful.

What is Azure Data Lake Storage?

It is part of the Microsoft Azure public cloud platform and supports big data storage and analytics of any kind and size.

Data lake storage is a storage solution designed specifically for big data analytics. How it works is quite simple.

Each data lake service has a container underneath. That container is very often called a file system and, just like any file system, it holds folders and files. It is fully scalable and secure, and it supports HDFS semantics when working with the Apache Hadoop ecosystem.

On each data lake, you can have multiple containers (that is, multiple file systems), each containing any structure of files and folders you wish.
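
To make the container and folder layout described above a bit more concrete, here is a minimal sketch in plain Python. It is only a toy in-memory model, not the Azure SDK: the `DataLakeModel` class, the container names (`raw`, `curated`), and the file paths are all made up for illustration.

```python
from pathlib import PurePosixPath


class DataLakeModel:
    """Toy in-memory stand-in for one data lake (hypothetical, not the Azure SDK)."""

    def __init__(self):
        # container (file system) name -> set of file paths stored in it
        self._containers = {}

    def add_file(self, container, path):
        # Store the path in POSIX style without a leading slash.
        clean = str(PurePosixPath(path.lstrip("/")))
        self._containers.setdefault(container, set()).add(clean)

    def containers(self):
        # One data lake can hold several containers (file systems).
        return sorted(self._containers)

    def folders(self, container):
        # Every parent directory of every file counts as a folder.
        found = set()
        for file_path in self._containers.get(container, ()):
            for parent in PurePosixPath(file_path).parents:
                if str(parent) != ".":
                    found.add(str(parent))
        return sorted(found)


lake = DataLakeModel()
lake.add_file("raw", "sales/2023/orders.csv")
lake.add_file("raw", "logs/app.json")
lake.add_file("curated", "reports/summary.parquet")

print(lake.containers())
print(lake.folders("raw"))
```

Each container keeps its own independent folder tree, which is the point of the paragraph above: one data lake, several file systems, any folder structure inside each.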

What are the features of Azure Data Lake?

Some notable features of Azure Data Lake Storage are as follows:

  • Data of virtually unlimited size can be stored in a single repository.
  • Both structured and unstructured data can be stored in their natural formats.
  • It offers high availability, durability, and reliability.

There are two generations of Azure Data Lake Storage: ADLS Gen1 and ADLS Gen2.

Azure Data Lake Gen1 was commonly used earlier, but Azure Data Lake Gen2 is now the more widely used option. Gen1 is also scheduled to be retired on Feb 29, 2024, so anyone using Azure Data Lake Gen1 has to migrate to Azure Data Lake Gen2 by that date.

ADLS Gen1 can be accessed from Hadoop using WebHDFS-compatible REST APIs. It offers enterprise-grade capabilities such as security, manageability, scalability, reliability, and availability.
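
As a rough illustration of what a WebHDFS-compatible call looks like, the sketch below only builds the request URL for a Gen1 account and does not send anything. The account name `mydatalake` and the path are hypothetical, and the helper `gen1_webhdfs_url` is written for this example. A real request would also need an Azure AD bearer token in the `Authorization` header.

```python
from urllib.parse import quote, urlencode


def gen1_webhdfs_url(account, path, op="LISTSTATUS"):
    """Build a WebHDFS-style URL for an ADLS Gen1 account (illustrative helper)."""
    # Drop surrounding slashes and percent-encode the path, keeping "/" separators.
    clean_path = quote(path.strip("/"))
    # WebHDFS selects the operation through the "op" query parameter.
    query = urlencode({"op": op})
    return f"https://{account}.azuredatalakestore.net/webhdfs/v1/{clean_path}?{query}"


# "mydatalake" is an assumed account name, used only for this example.
url = gen1_webhdfs_url("mydatalake", "/sales/2023")
print(url)
```

`LISTSTATUS` is the standard WebHDFS operation for listing a directory. Other operations such as `OPEN` or `MKDIRS` follow the same URL pattern with a different `op` value and HTTP method.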

ADLS Gen2 is designed for big data analytics. It is accessed through the Azure Blob File System (ABFS) driver, and the data it stores is encrypted. This file system is Hadoop-compatible, allowing many of the existing solutions in the market to connect with no hassle.
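
Tools in the Hadoop ecosystem usually address Gen2 data through an ABFS URI. Here is a small sketch of how such a URI is put together. The helper name `abfs_uri` and the container, account, and path values are placeholders chosen for this example, not anything from a real environment.

```python
def abfs_uri(container, account, path="", secure=True):
    """Build an ABFS URI for a file or folder in an ADLS Gen2 container."""
    # "abfss" is the TLS-secured variant of the scheme, "abfs" the plain one.
    scheme = "abfss" if secure else "abfs"
    # Pattern: scheme://<container>@<account>.dfs.core.windows.net/<path>
    return f"{scheme}://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical container "raw" in a hypothetical account "mydatalake".
uri = abfs_uri("raw", "mydatalake", "sales/2023/orders.csv")
print(uri)
```

A URI like this is what you would typically hand to an engine such as Spark when reading from or writing to the lake, once the account credentials are configured.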
