DEV Community

ram vnet

Statistics - Hypothesis Testing in Data Science

Hypothesis testing is a systematic procedure used in statistics and data science to decide whether a claim about a population is supported by sample data or not.

What is Hypothesis Testing?
Hypothesis testing is a statistical method used to make inferences about a population based on sample data. It involves formulating two competing hypotheses and using the sample to assess whether there is enough evidence to reject the null hypothesis in favor of the alternative.

STEP 1: State the Problem Clearly
First, identify what you want to test.

📌 Example question:

Is the average score of students equal to 70?

STEP 2: Formulate the Hypotheses
(a) Null Hypothesis (H₀)
Assumes no change / no effect

Always contains equality (=, ≤, ≥)

H₀: μ = 70

(b) Alternative Hypothesis (H₁)
The complement of H₀

Represents the claim we are looking for evidence of

H₁: μ ≠ 70 (two-tailed test)
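
For reference, the same example can be written in two-tailed and one-tailed forms. The one-tailed versions are shown only for illustration and are not part of the running example:

$$
\begin{aligned}
\text{Two-tailed:} \quad & H_0: \mu = 70 \quad \text{vs.} \quad H_1: \mu \neq 70 \\
\text{Right-tailed:} \quad & H_0: \mu \le 70 \quad \text{vs.} \quad H_1: \mu > 70 \\
\text{Left-tailed:} \quad & H_0: \mu \ge 70 \quad \text{vs.} \quad H_1: \mu < 70
\end{aligned}
$$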

STEP 3: Choose the Significance Level (α)
Probability of rejecting a true null hypothesis

Common values:

α = 0.05 (5%)

α = 0.01 (1%)

📌 Meaning:
With α = 0.05, there is a 5% risk of rejecting H₀ when it is actually true (a Type I error).
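
One way to see what α means is to simulate many samples from a population where H₀ really is true and count how often a two-tailed Z-test rejects it anyway. The sketch below is illustrative only: the population values (mean 70, standard deviation 10), the sample size of 40, and the function name `type_i_error_rate` are assumptions, not part of the original example.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, mu0=70.0, sigma=10.0, n=40,
                      trials=10_000, seed=0):
    """Estimate how often a two-tailed Z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    # Critical value for a two-tailed test at level alpha (about 1.96 for 0.05)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / n ** 0.5

    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mu = mu0) holds
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    # The estimated rate should land close to the chosen alpha
    print(type_i_error_rate(alpha=0.05))
```

The printed rate should be close to 0.05, which is the sense in which α is the "risk" accepted before looking at the data.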

STEP 4: Select the Appropriate Test
Choose the test based on:

Sample size

Type of data

Known or unknown population variance

| Situation | Test Used |
| --- | --- |
| Large sample, known variance | Z-test |
| Small sample, unknown variance | t-test |
| Categorical data | Chi-square |
| More than two means | ANOVA |

STEP 5: Collect Sample Data
Gather data randomly from the population.

📌 Example:
Sample of 40 students’ scores.

STEP 6: Compute the Test Statistic
This value shows how far the sample result is from the assumed population value.

Examples:

Z statistic

t statistic

χ² statistic

📌 Formula (example – Z-test):

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
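
As a quick worked example of the Z formula, the sketch below plugs in made-up numbers: a sample mean of 72.5, a known population standard deviation of 8, and the sample size of 40 from Step 5. Only the hypothesized mean of 70 and n = 40 come from the example above; the other values and the function name `z_statistic` are assumptions for illustration.

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Hypothetical values: x-bar = 72.5, sigma = 8 (assumed known), n = 40
z = z_statistic(sample_mean=72.5, mu0=70, sigma=8, n=40)
print(round(z, 3))  # 1.976
```

A Z value near 2 means the sample mean sits about two standard errors above the hypothesized mean.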

STEP 7: Determine the p-Value
p-value = Probability of observing a result at least as extreme as the sample result, assuming H₀ is true
📌 Interpretation:

Small p-value → Strong evidence against H₀

Large p-value → Weak evidence against H₀
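
For a two-tailed Z-test, the p-value is the area in both tails of the standard normal distribution beyond the observed |Z|. A minimal sketch using only Python's standard library follows; the function name `two_tailed_p_value` and the input value 1.976 are illustrative assumptions.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """Probability of a Z at least as extreme as |z| in either tail, under H0."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail


# Hypothetical test statistic
p = two_tailed_p_value(1.976)
print(round(p, 3))  # 0.048
```

For a one-tailed test, only the tail in the direction of H₁ would be used, so the factor of 2 would be dropped.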

STEP 8: Make the Decision
Decision Rule
If p-value ≤ α → Reject H₀

If p-value > α → Fail to reject H₀

📌 Example:

p-value = 0.03

α = 0.05
👉 Reject H₀
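
The decision rule is a single comparison, sketched below with the numbers from the example. The function name `decide` is a hypothetical helper, not a library call.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p-value <= alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"


print(decide(0.03, alpha=0.05))  # Reject H0
print(decide(0.03, alpha=0.01))  # Fail to reject H0
```

The same p-value can lead to different decisions depending on α, which is why α should be chosen before the data are examined.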

STEP 9: Draw a Statistical Conclusion
State the result in words, not symbols.

📌 Example:

“There is sufficient statistical evidence that the average score is different from 70.”

STEP 10: Interpret the Result in Context
Relate the conclusion to the real-world problem.

📌 Example:

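If the sample comes from students taught with a new method and 70 is the historical average, the result suggests that the method is associated with a change in students’ performance.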

Flow Summary
1️⃣ Define the problem
2️⃣ State H₀ and H₁
3️⃣ Choose α
4️⃣ Select test
5️⃣ Collect data
6️⃣ Calculate test statistic
7️⃣ Find p-value
8️⃣ Decision (Reject / Fail to reject H₀)
9️⃣ Conclusion
🔟 Real-world interpretation
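
The flow can be sketched end to end as a one-sample, two-tailed Z-test. This is a minimal illustration under stated assumptions: the population standard deviation is treated as known (σ = 8), the scores are randomly generated stand-ins for real data, and `one_sample_z_test` is a hypothetical helper name.

```python
import math
import random
from statistics import NormalDist, mean


def one_sample_z_test(scores, mu0, sigma, alpha=0.05):
    """Two-tailed one-sample Z-test with a known population standard deviation."""
    n = len(scores)
    x_bar = mean(scores)
    z = (x_bar - mu0) / (sigma / math.sqrt(n))        # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
    decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"
    return {"n": n, "mean": x_bar, "z": z, "p_value": p_value,
            "decision": decision}


if __name__ == "__main__":
    # Made-up data: 40 simulated scores, NOT real student results
    rng = random.Random(42)
    scores = [rng.gauss(73, 8) for _ in range(40)]

    # H0: mu = 70, H1: mu != 70, alpha = 0.05, sigma assumed known
    result = one_sample_z_test(scores, mu0=70, sigma=8, alpha=0.05)
    for key, value in result.items():
        print(f"{key}: {value}")
```

When σ is unknown and has to be estimated from the sample, the table in Step 4 points to a t-test instead.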

Important Notes
“Fail to reject H₀” ≠ “Accept H₀”

Statistical significance ≠ Practical importance

Always check assumptions of the test
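
To illustrate why statistical significance is not the same as practical importance, the sketch below keeps a tiny difference fixed (a sample mean of 70.1 against a hypothesized 70) and only increases the sample size. All numbers, including σ = 10, and the helper name `z_test_p_value` are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-tailed p-value of a one-sample Z-test."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same 0.1-point difference every time; only the sample size changes
for n in (40, 4_000, 400_000):
    p = z_test_p_value(sample_mean=70.1, mu0=70, sigma=10, n=n)
    print(f"n = {n:>7}: p-value = {p:.3g}")
```

With a large enough sample, the 0.1-point difference becomes statistically significant even though it is unlikely to matter in practice.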
