DEV Community


Dark Data and why it matters in Big Data

Thanh (Bruce) Pham
I am the CEO of Saigon Technology, a leading software development company in Vietnam.

Dark data is data generated by regular business activities but rarely used. It is not analyzed for insights that could inform business decisions. Instead, it tends to be retained, mostly for compliance purposes. Handling and storing it often means additional expense and an elevated risk of manipulation rather than added value for the business.
Liability vs. potential
For most organizations, dark data remains a pressing issue. According to IBM, an estimated 90 percent of the big data collected from multiple streams is dark data that never gets used. In most organizations, dark data lies idle in archives, either because the tools to make use of it are missing or because no use for it has been found.
Collecting vast volumes of dark data has storage and cost implications and increases the risk of the data being compromised. Even so, companies that have learned to handle and use dark data effectively find it pivotal in getting ahead of the competition. Business executives now recognize the role of dark data in guiding business intelligence, innovation and, more generally, the effective functioning of a business.
Tapping the potential of dark data depends on the ability of businesses to incorporate dark analytics into their operations.
Dark analytics
CIOs and business leaders have been experimenting with dark analytics. This kind of analysis focuses on exploring the relationships and patterns hidden in unstructured data. In this way, developers give businesses the ability to unearth customer and operational insights that may not be evident from analysis of structured data.
Dark data makes up the larger proportion of data held in business storage archives. Although it often goes underutilized, businesses can find it useful, especially those seeking to improve customer satisfaction through personalized services. For some businesses, dark data provides a gateway to adopting disruptive technologies such as the Internet of Things (IoT).
How CIOs and CDOs can incorporate dark data into their data analytics strategies
A common theme across most businesses is a limited understanding of how to tap into the power of dark data, despite generating large volumes of it. To them, dark data is always an afterthought. Others simply lack the capacity to employ the necessary dark analytics tools.
CIOs, CDOs and IT teams should explore the value of unstructured business data and incorporate it into their data analytics strategy by:
Identifying the compendium of data under company management
The first step in realizing the value of dark data is comprehensive documentation of all the data under company management. An audit of business systems and structures and how they function is the key to identifying data sources, the types of data they generate and why that data is generated. CIOs and CDOs can then develop a holistic strategic plan for the organization's different data types.
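As a rough illustration of what a first-pass audit could look like for file-based archives, the sketch below walks a directory tree and summarizes files by extension, flagging those not modified for a long time. The function name `inventory`, the one-year staleness threshold and the use of modification time as a proxy for "unused" are all assumptions made for this example, not part of any particular audit method.

```python
import os
import time
from collections import defaultdict
from pathlib import Path


def inventory(root, stale_after_days=365, now=None):
    """Summarize files under `root` by extension.

    Returns {extension: {"files": int, "bytes": int, "stale_files": int}}.
    A file counts as stale if it was last modified more than
    `stale_after_days` days before `now` (an assumed proxy for "unused").
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    summary = defaultdict(lambda: {"files": 0, "bytes": 0, "stale_files": 0})
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.stat()
            # Group case-insensitively; files without an extension share one bucket.
            entry = summary[path.suffix.lower() or "(none)"]
            entry["files"] += 1
            entry["bytes"] += info.st_size
            if info.st_mtime < cutoff:
                entry["stale_files"] += 1
    return dict(summary)
```

Calling `inventory("/path/to/archive")` returns one summary entry per file type, which could serve as a starting point for the documentation described above. Databases, mailboxes and SaaS exports would need their own collectors.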
Make use of what you can
As mentioned earlier, one of the main reasons businesses fail to use dark data is a limited understanding of dark analytics and a lack of tools for it. However, instead of overlooking the available dark data entirely, they can start with the parts they can exploit using existing tools while they explore ways of using the rest.
Another strategy is to augment your existing data with outside data sources, such as data from the market or from competitors. This enhances the value of the data under your management.
Demonstrate results
The value of a set of data lies in the ability to draw actionable insights from it. CIOs should be able to make a strong business case for dark business data. One effective way of demonstrating the value of dark data is to adopt new technologies such as artificial intelligence, machine learning and IoT. For instance, call logs, email responses and web server logs can be useful for improving automated and personalized customer services.
When demonstrating results, the case for dark data rests on its ability to deliver immediate value to business operations.
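To make the web server log example concrete, here is a minimal sketch that counts requests and error responses per path. It assumes the logs are in the Common Log Format. The pattern `LOG_LINE` and the function `summarize` are names invented for this illustration, and real logs may need a different pattern.

```python
import re
from collections import Counter

# Assumed format: Common Log Format, e.g.
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(lines):
    """Return (requests per path, 4xx/5xx responses per path, unparsed line count)."""
    paths = Counter()
    errors = Counter()
    skipped = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            skipped += 1  # keep a count so silent data loss is visible
            continue
        paths[match["path"]] += 1
        if match["status"].startswith(("4", "5")):
            errors[match["path"]] += 1
    return paths, errors, skipped
```

Even a summary this simple can surface pages that customers request often or that frequently fail, which is the kind of quick result that helps build a business case.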
Curating data for integrity, privacy and data quality
Transforming unstructured data into digital formats requires thorough quality assurance checks to ensure the quality and integrity of the data. The value of data lies in its accuracy and completeness. From the outset, clear frameworks for detecting and rectifying errors, and for addressing privacy concerns through data encryption, should be in place before the digitized data is put to use.
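A basic quality check of the kind described above might look like the following sketch, which measures completeness and duplicate keys in a list of records. The function `quality_report` and the field names in the usage note are hypothetical. The sketch covers error detection only and does not address encryption.

```python
def quality_report(records, required_fields, key_field):
    """Count missing required values and repeated keys in a list of dict records."""
    missing = {field: 0 for field in required_fields}
    seen = set()
    duplicates = 0
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat absent, None and blank values as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
        key = record.get(key_field)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(records), "missing": missing, "duplicate_keys": duplicates}
```

For example, `quality_report(rows, ["id", "email"], "id")` would report how many rows lack an `id` or `email` and how many reuse an `id` already seen. Thresholds for what counts as acceptable are a business decision and are deliberately left out.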
Periodic audits and database trimming
Some types of data are useful only for a short time. Businesses can adopt guidelines for data retention and disposal. Periodic audits and database trimming ensure optimal use of data storage resources by retaining only data that adds value to business operations.
Dark data remains largely unutilized in most organizations. However, it can be an excellent source of information that adds significant value to how businesses operate.
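Retention guidelines can be expressed as a simple lookup from data category to maximum age. The sketch below is one possible shape for such a check. The categories and periods in `RETENTION_DAYS` are made-up placeholders, not recommendations, and actual periods depend on legal and compliance requirements.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days; real values must come from
# the organization's legal and compliance requirements.
RETENTION_DAYS = {"web_logs": 90, "call_logs": 365, "invoices": 7 * 365}


def due_for_disposal(items, today, retention_days=RETENTION_DAYS):
    """Return the names of items older than the retention period for their category."""
    due = []
    for item in items:
        limit = retention_days.get(item["category"])
        if limit is None:
            continue  # no rule defined: leave the item for manual review
        if today - item["created"] > timedelta(days=limit):
            due.append(item["name"])
    return due
```

Running a check like this during each periodic audit gives a list of disposal candidates for review. Items in categories without a rule are skipped rather than deleted, which errs on the side of keeping data.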
