Big data is a term that describes the massive amount of data that is available to organizations and individuals from various sources and devices 📱. This data is so large and complex that traditional data processing tools cannot handle it easily 💥.
But what are the different types of data under big data? How can we classify and organize them in a tabular format? And what are some examples of each type of data? In this article, we will answer these questions and more 🚀.
We will also look at some of the benefits and challenges of each type of data under big data 🔥.
Types of Data Under Big Data 🌈
There are three main types of data under big data: structured, semi-structured, and unstructured data 📄.
Each type of data has its own characteristics, sources, formats, and uses 💯.
Let's look at each type of data in detail and compare them in a tabular format ✨.
Structured Data 💎
Structured data is data that fits a fixed format, such as numbers, dates, or short text fields, and is typically stored in relational databases. It has a predefined schema and can be queried using SQL (Structured Query Language) 💯.
Structured data is often called relational data because it is usually split across multiple tables, with each entity represented by a single record, which helps protect the integrity of the data. Relationships between tables are enforced through table constraints such as primary and foreign keys 🔮.
Structured data is easy to enter, query, and analyze because all of the data follows the same format 💡.
However, structured data has limited flexibility, and this can make it harder to scale: a change to the schema may require migrating every existing record to conform to the new rules 🙅.
Some examples of structured data are customer records, sales transactions, product inventory, and bank accounts 💰.
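To make the idea of a predefined schema and table constraints concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `sales` tables, their columns, and the sample rows are hypothetical and exist only for illustration; a production relational database would be set up differently.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """CREATE TABLE sales (
               id INTEGER PRIMARY KEY,
               customer_id INTEGER NOT NULL REFERENCES customers(id),
               amount REAL NOT NULL,
               sold_on TEXT NOT NULL
           )"""
    )
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO sales (customer_id, amount, sold_on) VALUES (?, ?, ?)",
        [(1, 20.0, "2024-01-05"), (1, 5.5, "2024-01-09"), (2, 12.0, "2024-01-07")],
    )
    conn.commit()
    return conn


def total_sales_by_customer(conn):
    """Join the two tables and sum the sales amount per customer."""
    return conn.execute(
        """SELECT c.name, SUM(s.amount)
           FROM customers AS c
           JOIN sales AS s ON s.customer_id = c.id
           GROUP BY c.id, c.name
           ORDER BY c.name"""
    ).fetchall()


conn = build_demo_db()
print(total_sales_by_customer(conn))
```

Because every row follows the same schema, one SQL query can join and aggregate the data. The foreign key on `sales.customer_id` is an example of a table constraint: with enforcement switched on, inserting a sale for a customer that does not exist raises `sqlite3.IntegrityError`.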
Semi-Structured Data 🌟
Semi-structured data is data that is partially organized, typically in formats such as JSON or XML, and is often stored in non-relational databases. Semi-structured data has some elements of structure, such as tags or keys, but does not follow a rigid schema or structure 🔮.
Semi-structured data is often associated with non-relational (NoSQL) databases, which generally do not require fixed tables to store or query data, although the two terms are not synonymous 💯.
Semi-structured data is more flexible and scalable than structured data because it can accommodate different types and formats of data without changing the schema or structure 💡.
However, semi-structured data is more complex and challenging to query and analyze than structured data because it requires special tools and techniques to handle the variety and variability of data 🙅♀️.
Some examples of semi-structured data are web logs, social media posts, email messages, and sensor data 💰.
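As a rough illustration of what "some structure, but no rigid schema" looks like in practice, the sketch below parses a few JSON records with Python's standard `json` module. The event records and their field names (`user`, `type`, `tags`, `location`) are made up for this example and do not come from any real platform.

```python
import json

# Hypothetical social-media-style events: every record has keys,
# but not every record has the same keys.
RAW_EVENTS = """
[
  {"user": "ada", "type": "post", "text": "Hello", "tags": ["intro"]},
  {"user": "grace", "type": "like", "target_id": 17},
  {"user": "ada", "type": "post", "text": "Second post", "location": {"city": "London"}}
]
"""


def summarize(raw):
    """Flatten records with varying fields into uniform rows, using None or 0 for gaps."""
    rows = []
    for event in json.loads(raw):
        rows.append({
            "user": event["user"],
            "type": event["type"],
            # Optional nested field: fall back to None when it is absent.
            "city": event.get("location", {}).get("city"),
            # Optional list field: treat a missing list as empty.
            "tag_count": len(event.get("tags", [])),
        })
    return rows


for row in summarize(RAW_EVENTS):
    print(row)
```

The keys give each record enough structure to pick fields out by name, but the code has to tolerate missing or extra fields. That tolerance is the source of both the flexibility and the extra query complexity described above.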
Unstructured Data 💫
Unstructured data is data that is free-form and less quantifiable, such as text, audio, video, or images. Unstructured data does not have a predefined schema or structure and cannot be easily queried using SQL 🔥.
Unstructured data is also called non-tabular or raw data because it does not use tables or columns to store or query data 💯.
Unstructured data is more diverse and dynamic than structured or semi-structured data because it can capture and represent any kind of information without any constraints 💡.
However, unstructured data is more difficult and expensive to store, process, and analyze than structured or semi-structured data because it requires more storage space, processing power, and advanced analytics techniques 🙅.
Some examples of unstructured data are documents, books, articles, podcasts, videos, or photos 💰.
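Since unstructured data has no fields to query, any structure has to be derived from the content itself. The sketch below shows one very simple way to do that for text: counting word frequencies with Python's standard library. The sample review and the tiny stopword list are invented for illustration, and real text analytics would normally rely on far more capable NLP tooling.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list for this example.
STOPWORDS = frozenset({"the", "a", "is", "and", "of", "to", "it"})


def top_words(text, n=3):
    """Return the n most frequent non-stopword words in free-form text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return counts.most_common(n)


# Made-up product review standing in for a free-form document.
review = (
    "The battery is great and the screen is great. "
    "Shipping was slow, but the battery life makes up for it."
)

print(top_words(review, n=2))
```

Even this basic step requires tokenizing and filtering before anything can be counted, which hints at why unstructured data needs more processing than a table that can be queried directly.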
Tabular Comparison of Types of Data Under Big Data 📊
Type | Definition | Source | Format | Use | Benefit | Challenge |
---|---|---|---|---|---|---|
Structured | Data that is easily formatted and stored in relational databases | Databases, spreadsheets, surveys | Numbers, dates, text | SQL queries, BI tools | Easy to enter, query, and analyze | Limited flexibility and scalability |
Semi-Structured | Data that is partially formatted and stored in non-relational databases | Web logs, social media posts, email messages | JSON, XML files | NoSQL queries, API calls | Flexible and scalable | Complex and challenging to query and analyze |
Unstructured | Data that is free-form and less quantifiable | Documents, books, articles, podcasts, videos, photos | Text, audio, video, images | Machine learning, NLP, computer vision, sentiment analysis | Diverse and dynamic | Difficult and expensive to store, process, and analyze |
Conclusion 🎉
In this article, we learned about the types of data under big data: structured, semi-structured, and unstructured data 🤔.
We also learned about how to classify and organize them in a tabular format with examples 🚀.
We also learned about some of the benefits and challenges of each type of data under big data 🔥.
I hope you enjoyed this article and learned something new 😊.
If you have any questions or feedback, please feel free to leave a comment below 👇.
Happy learning! 🙌