Enock Kipngetich

Posted on Feb 3

How Statistics Can Be Used To Drive Business Decisions

A Practical Case Study Using Retail Sales Data

1. Introduction

In today’s data-driven world, businesses collect large volumes of data but often struggle to turn it into actionable decisions. While advanced machine learning models are popular, statistics remains the foundation of effective business analysis. Statistical methods help organizations understand performance, manage uncertainty, test assumptions, and make informed decisions with confidence.

2. Problem Statement

The retail company in this case study operates both online and physical stores across multiple regions. Management faces three key challenges:

Understanding revenue performance

What does “typical” revenue look like?
How stable are sales over time?

Evaluating reliability of insights

Are observed patterns meaningful or due to randomness?
Is the data representative of the entire business?

Assessing marketing effectiveness

Does running a marketing campaign increase average revenue per transaction?
Is the observed increase statistically and practically meaningful?

3. Statistical Methods Used

This project applied several core statistical techniques, each serving a specific business purpose.

3.1 Descriptive Statistics

Descriptive statistics were used to summarize and understand revenue data.

Central Tendency

Mean
Median
Mode

These measures helped identify what a “typical” revenue value looks like.

Dispersion

Range
Variance
Standard deviation

These metrics measured how much revenue varies over time and across transactions.
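
As a rough illustration of these measures, the sketch below computes them with Python's standard `statistics` module. The `revenue` list is made-up data chosen to include one extreme value; it is not the case-study dataset.

```python
import statistics

# Hypothetical revenue figures (illustrative only, not the case-study data).
revenue = [120, 135, 150, 150, 160, 175, 190, 210, 240, 900]

summary = {
    "mean": statistics.mean(revenue),
    "median": statistics.median(revenue),
    "mode": statistics.mode(revenue),
    "range": max(revenue) - min(revenue),
    "variance": statistics.variance(revenue),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(revenue),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name:>8}: {value:,.2f}")
```

With this toy data the mean (243) sits well above the median (167.5) because a single 900 transaction pulls it upward, which is the kind of distortion the median resists.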

3.2 Distribution Shape Analysis

Revenue data was visualized using histograms to assess:

Skewness (direction and degree of asymmetry)
Kurtosis (heaviness of the tails, which reflects how prone the data is to extreme values)

Understanding distribution shape helps determine:

Which summary statistics are reliable
Whether standard statistical tests can be applied
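
Skewness and kurtosis can also be computed numerically alongside the histogram. The sketch below uses the plain moment-based definitions (third and fourth standardized moments); the helper names `skewness` and `excess_kurtosis` and the sample values are assumptions made for illustration, and statistical libraries may apply slightly different bias corrections.

```python
import statistics

def skewness(values):
    """Moment-based skewness: the third standardized moment."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return sum((x - mean) ** 3 for x in values) / (n * sd ** 3)

def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((x - mean) ** 4 for x in values) / (n * sd ** 4) - 3

# Hypothetical revenue values with one extreme transaction (illustrative only).
revenue = [120, 135, 150, 150, 160, 175, 190, 210, 240, 900]
symmetric = [1, 2, 3, 4, 5]

print(f"revenue skewness:        {skewness(revenue):.2f}")
print(f"revenue excess kurtosis: {excess_kurtosis(revenue):.2f}")
print(f"symmetric skewness:      {skewness(symmetric):.2f}")
```

A positive skewness means a longer right tail, and a positive excess kurtosis means heavier tails than a normal distribution would have.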

3.3 Data Visualization

Several visual tools were used:

Line charts to analyze revenue trends over time
Bar charts to compare store types
Box plots to compare regions
Scatter plots to explore the relationship between marketing spend and revenue

Why this matters:
Visualizations reveal patterns that raw numbers often hide and help stakeholders understand insights quickly.

3.4 Sampling and Bias Analysis

The project examined:

The difference between population and sample
The impact of sampling bias, especially when only urban stores are included

A stratified random sampling approach was recommended to improve representativeness.
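
One way to sketch stratified random sampling is to group records by a stratum and draw the same fraction from each group. The store list, the `location` field, and the `stratified_sample` helper below are all hypothetical; the case study does not specify which strata were used.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Randomly draw the same fraction from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[record[key]].append(record)

    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))  # keep at least one record per stratum
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical store list: 60 urban, 30 suburban, 10 rural.
stores = (
    [{"store_id": f"U{i}", "location": "urban"} for i in range(60)]
    + [{"store_id": f"S{i}", "location": "suburban"} for i in range(30)]
    + [{"store_id": f"R{i}", "location": "rural"} for i in range(10)]
)

sample = stratified_sample(stores, key="location", fraction=0.2)

counts = defaultdict(int)
for store in sample:
    counts[store["location"]] += 1
print(dict(counts))
```

Because every stratum contributes in proportion to its size, the sample keeps the 60/30/10 mix of the full store list instead of over-representing urban stores.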

3.5 Law of Large Numbers (LLN)

The Law of Large Numbers was demonstrated by showing how sample means stabilize around the population mean as sample size increases.

Business lesson:
Decisions based on small datasets are risky and may lead to misleading conclusions.
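
A small simulation can make this concrete. The sketch below assumes transaction revenue follows an exponential distribution with a true mean of 50, which is a stand-in chosen for illustration and not a property of the case-study data, and prints the sample mean at growing sample sizes.

```python
import random
import statistics

rng = random.Random(42)

# Assumed right-skewed revenue model: exponential with a true mean of 50.
TRUE_MEAN = 50
draws = [rng.expovariate(1 / TRUE_MEAN) for _ in range(100_000)]

running = {}
for n in (10, 100, 1_000, 10_000, 100_000):
    running[n] = statistics.fmean(draws[:n])  # mean of the first n draws
    error = abs(running[n] - TRUE_MEAN)
    print(f"n = {n:>7,}: sample mean = {running[n]:7.2f}  (error {error:.2f})")
```

Small samples can land far from 50, while the largest sample should sit very close to it; the error does not have to shrink at every single step, only in the long run.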

3.6 Central Limit Theorem (CLT)

The CLT was used to show that even when individual revenue values are skewed, the distribution of sample means becomes approximately normal once the sample size is sufficiently large.

This justified the use of parametric statistical tests on mean revenue.
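
The effect can be checked with a quick simulation. The sketch below again assumes an exponential revenue model (an illustrative choice, not the case-study data) and compares the skewness of raw values with the skewness of means taken over samples of 50; the `skewness` helper is a hypothetical moment-based implementation.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness: the third standardized moment."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((x - mean) ** 3 for x in values) / (n * sd ** 3)

rng = random.Random(7)

def draw_revenue():
    # Assumed right-skewed revenue model: exponential with mean 50.
    return rng.expovariate(1 / 50)

SAMPLE_SIZE = 50
raw = [draw_revenue() for _ in range(5_000)]
sample_means = [
    statistics.fmean([draw_revenue() for _ in range(SAMPLE_SIZE)])
    for _ in range(2_000)
]

print(f"skewness of raw revenue:  {skewness(raw):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The raw values stay strongly right-skewed, while the sample means are much closer to symmetric, which is what the CLT predicts as the sample size grows.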

3.7 Hypothesis Testing (t-test)

A one-tailed t-test was conducted to evaluate whether marketing campaigns increased average revenue per transaction.

Null hypothesis (H₀): Average revenue with a campaign is no higher than without one
Alternative hypothesis (H₁): Average revenue with a campaign is higher

The test used a significance level of α = 0.05, which corresponds to a 95% confidence level.
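
The article does not say which t-test variant was run, so the sketch below is only one plausible setup: Welch's two-sample t statistic computed by hand with the standard library, with the one-sided p-value approximated by the standard normal distribution. That approximation is only reasonable for large samples; in practice a call such as `scipy.stats.ttest_ind(campaign, no_campaign, equal_var=False, alternative="greater")` would give the t-based p-value. The revenue values are synthetic.

```python
import math
import random
import statistics
from statistics import NormalDist

def welch_t_one_sided(treatment, control):
    """Welch's t statistic, its degrees of freedom, and an approximate
    one-sided p-value for H1: mean(treatment) > mean(control)."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = statistics.variance(treatment), statistics.variance(control)
    se = math.sqrt(v1 / n1 + v2 / n2)
    t_stat = (statistics.fmean(treatment) - statistics.fmean(control)) / se
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    # Large-sample shortcut: with a large df the t distribution is close to normal.
    p_value = 1 - NormalDist().cdf(t_stat)
    return t_stat, df, p_value

rng = random.Random(1)
# Synthetic revenue per transaction (illustrative only).
no_campaign = [rng.gauss(100, 30) for _ in range(500)]
campaign = [rng.gauss(112, 30) for _ in range(500)]

ALPHA = 0.05
t_stat, df, p_value = welch_t_one_sided(campaign, no_campaign)
print(f"t = {t_stat:.2f}, df = {df:.0f}, one-sided p = {p_value:.4f}")
print("Reject H0" if p_value < ALPHA else "Fail to reject H0")
```

Welch's version is used here because it does not assume the two groups have equal variance; if the variances are known to be similar, the pooled (Student's) t-test is the usual alternative.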

3.8 Effect Size and Power

Beyond statistical significance, the project calculated Cohen’s d to measure the magnitude of the campaign’s effect.
Statistical power was discussed to explain why a real effect can fail to reach statistical significance, particularly when the sample is small.
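
For reference, Cohen's d is the difference between the two group means divided by their pooled standard deviation. The sketch below implements that definition; the `cohens_d` helper and the revenue values are hypothetical and only meant to show the arithmetic.

```python
import math
import statistics

def cohens_d(treatment, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    v1, v2 = statistics.variance(treatment), statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.fmean(treatment) - statistics.fmean(control)) / pooled_sd

# Hypothetical revenue per transaction (illustrative only).
no_campaign = [80, 95, 100, 105, 120]
campaign = [90, 105, 110, 115, 130]

d = cohens_d(campaign, no_campaign)
print(f"Cohen's d = {d:.2f}")  # prints "Cohen's d = 0.69"
```

By Cohen's conventional benchmarks (about 0.2 small, 0.5 medium, 0.8 large), a value near 0.69 would count as a medium effect. These cut-offs are rules of thumb, so the business context still decides whether the lift is worth the campaign cost.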

4. Key Findings

The statistical analysis produced several important findings:

Revenue is positively skewed

Extreme high-revenue transactions exist
The median is a better measure of typical revenue than the mean

Sales variability is high

High standard deviation indicates unstable revenue
External factors (campaigns, seasonality) strongly influence performance

Online stores perform strongly

Online channels contribute significantly to total revenue
They show scalability advantages

Marketing campaigns increase revenue

The t-test showed a statistically significant increase in average revenue
Cohen’s d indicated a medium practical effect

Sampling matters

Urban-only samples lead to biased conclusions
Representative sampling improves decision reliability

5. Business Implications

The findings translate directly into business actions:

Better Performance Measurement

Use median revenue in reports to avoid distortion from extreme values

Improved Forecasting

High variability suggests the need for better demand planning and revenue smoothing strategies

Evidence-Based Marketing Decisions

Marketing campaigns should be continued and optimized based on measured results, not on intuition alone

Smarter Data Collection

Representative sampling ensures insights reflect the entire business
Larger samples increase confidence in decisions

Balanced Decision-Making

Statistical significance should be combined with effect size and business context

6. Why This Matters for Data Learners

This case study highlights an important lesson for beginners and intermediate data learners:

You don’t need machine learning to deliver value.
Strong statistical thinking is often enough.

By mastering:

Descriptive statistics
Sampling concepts
Probability theory
Hypothesis testing

you can solve real business problems and communicate insights clearly to stakeholders.

7. Conclusion

Statistics plays a critical role in transforming raw data into informed business decisions. Through this retail sales case study, we demonstrated how statistical methods help businesses understand performance, reduce uncertainty, test assumptions, and evaluate strategies objectively.

When applied correctly, statistics provides not only answers, but confidence that decisions are based on evidence rather than guesswork.

DEV Community