Different business processes need different data structures due to different needs.
The aim of a data warehouse is to
- make the data simple to understand
- quality assured
- ready for new questions
It should deliver the data to the user
- in an understandable and performant dimensional model
- with conformed dimensions or separate data marts
- to report and visualize -- by interacting directly with the model -- or through an BI application
OLTP - online transactional processing
For running a business it is important to track different things, like for example inventory, financial transactions or different types of movements.
A single relational database might be sufficient for this part, as there is no redundancy and high integrity.
OLAP - online analytical processing
To understand what is going on in a business it is important to do analysis and monitoring. Like examine market segmentation over a period of time or split users into market segmentations.
As single relational database is not sufficient, because it might become too slow for analytics and quite hard to understand.
A data warehouse is a system that enables the support of an analytical process. It gathers data from multiple sources. It is a copy of transaction data specifically structured for queries and analysis.
The data from OLTP will run through an ETL process to provide data for the OLAP.
ETL - Extract Transform Load
- get data from a source
- update old states
- integrate many sources together
- clean data
- produce metadata
- structure the data
- load into a dimensional data model
The dimensional model will help to understand the data easily and perform fast analytical queries.
Fact table - numeric and additive
- business events, like a phone call or a review of a book
- context of the business events in quantifiable metrics. Like duration of a call, the rating of a book
- context of business events: who, what, where, why
- attributes, like: date, time, physical locations, human roles, good sold
Top comments (0)