DEV Community

ram vnet


STATISTICS - Uni-variate Non-Graphical Exploratory Data Analysis (EDA)

Uni-variate Non-Graphical Exploratory Data Analysis (EDA)

Uni-variate Non-Graphical EDA is the numerical examination of a single variable without using charts or graphs. The goal is to understand the data's central value, spread, position, shape, and quality using statistical measures.

  1. Meaning

Uni-variate → Only one variable is analyzed

Non-Graphical → Uses numbers and statistics, not plots

Exploratory → No prior assumptions; aims to discover patterns, anomalies, and summaries

📌 Example variables: exam marks, age, income, daily sales, temperature.

  2. Objectives

Summarize the data numerically

Identify central tendency

Measure variability (dispersion)

Understand relative position of values

Detect outliers

Assess distribution shape

Check data quality

  3. Techniques Used in Uni-variate Non-Graphical EDA

A. Measures of Central Tendency

Describe the typical or center value.

  1. Mean

$$\bar{x} = \frac{\sum x}{n}$$

Most common average

Highly affected by outliers

  2. Median

Middle value of ordered data

Resistant to extreme values

  3. Mode

Most frequent value

Useful for discrete or categorical data
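As a quick illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The `marks` list is made-up data, with one deliberately extreme value to show how the mean reacts compared with the median.

```python
import statistics

# Hypothetical exam marks; 95 is a deliberately extreme value.
marks = [40, 45, 45, 50, 55, 60, 95]

mean = statistics.mean(marks)      # sum of values / number of values
median = statistics.median(marks)  # middle value of the sorted data
mode = statistics.mode(marks)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)

# Dropping the extreme value shifts the mean noticeably,
# while the median moves only a little.
trimmed = marks[:-1]
print("mean without 95:", statistics.mean(trimmed))
print("median without 95:", statistics.median(trimmed))
```

With this data, removing 95 pulls the mean down by more than six marks while the median changes by 2.5, which is the sense in which the median is resistant to extreme values.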

B. Measures of Dispersion

Describe how spread out the data is.

  1. Range

$$\text{Range} = \text{Max} - \text{Min}$$

  2. Variance

$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$$

  3. Standard Deviation

$$\sigma = \sqrt{\sigma^2}$$

Most widely used spread measure

  4. Inter-quartile Range (IQR)

$$\text{IQR} = Q_3 - Q_1$$

Spread of the middle 50% of the data

Less affected by outliers
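The dispersion measures can be computed in a few lines. The sketch below uses Python's standard `statistics` module on a small illustrative list. It assumes the population formulas (dividing by n), and it assumes quartiles are taken by linear interpolation (`method="inclusive"`); other quartile conventions give slightly different values.

```python
import statistics

data = [10, 12, 15, 18, 20, 25, 40]  # small illustrative data set

# Range: distance between the largest and smallest value.
data_range = max(data) - min(data)

# Population variance and standard deviation (divide by n).
# statistics.variance / statistics.stdev would divide by n - 1 instead.
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

# Quartiles by linear interpolation between data points (an assumed convention).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print("range:", data_range)
print("variance:", round(variance, 3))
print("standard deviation:", round(std_dev, 3))
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
```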

C. Measures of Position

Describe relative standing of values.

Percentiles (P10, P50, P90)

Quartiles (Q1, Q2, Q3)

Deciles (D1 to D9)

📌 Example: The 75th percentile is the value below which 75% of the data lies.
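Percentiles, quartiles, and deciles are all the same idea with a different number of cut points. A minimal sketch with `statistics.quantiles` follows; the `sales` list is hypothetical, and `method="inclusive"` (linear interpolation) is just one of several conventions.

```python
import statistics

# Hypothetical daily sales figures (already sorted for readability).
sales = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40, 50]

# n=100 returns 99 cut points: index 0 is P1, index 9 is P10, index 49 is P50.
percentiles = statistics.quantiles(sales, n=100, method="inclusive")
p10, p50, p90 = percentiles[9], percentiles[49], percentiles[89]

quartiles = statistics.quantiles(sales, n=4, method="inclusive")  # Q1, Q2, Q3
deciles = statistics.quantiles(sales, n=10, method="inclusive")   # D1 .. D9

print("P10, P50, P90:", p10, p50, p90)
print("quartiles:", quartiles)
print("deciles:", deciles)
```

Note that P50, Q2, and D5 are all the median, so they should agree with `statistics.median(sales)`.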

D. Measures of Distribution Shape

  1. Skewness

Positive skew → Right tail longer

Negative skew → Left tail longer

Zero skew → Symmetrical distribution

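  2. Kurtosis

Measures tail thickness (often loosely described as peakedness), relative to a normal distribution

Leptokurtic → Heavy tails, sharp peak (excess kurtosis > 0)

Mesokurtic → Normal-like (excess kurtosis ≈ 0)

Platykurtic → Light tails, flat peak (excess kurtosis < 0)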
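Both shape measures can be computed from central moments. The sketch below assumes the simple population (moment-based) definitions, skewness = m3 / m2^1.5 and excess kurtosis = m4 / m2^2 − 3. The helper names `central_moment`, `skewness`, and `excess_kurtosis` are illustrative, and statistical libraries may apply bias corrections that give slightly different numbers.

```python
import statistics

def central_moment(values, k):
    """k-th central moment: the average of (x - mean) ** k."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (about 0 for a normal)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

right_tailed = [10, 12, 15, 18, 20, 25, 40]  # one large value stretches the right tail
symmetric = [1, 2, 3, 4, 5]

print("skewness (right-tailed):", round(skewness(right_tailed), 3))
print("skewness (symmetric):", round(skewness(symmetric), 3))
print("excess kurtosis (symmetric):", round(excess_kurtosis(symmetric), 3))
```

The right-tailed list gives a positive skewness, the symmetric list gives zero, and the evenly spaced symmetric list comes out platykurtic (negative excess kurtosis).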

  4. Outlier Detection (Non-Graphical)

IQR Method

$$\text{Lower limit} = Q_1 - 1.5 \times \text{IQR}$$

$$\text{Upper limit} = Q_3 + 1.5 \times \text{IQR}$$

Values outside these limits → Outliers

Z-Score Method

$$z = \frac{x - \mu}{\sigma}$$

|z| > 3 → Potential outlier
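The two rules can be sketched as small functions. The names `iqr_outliers` and `zscore_outliers` are illustrative. The sketch assumes quartiles by linear interpolation and the population standard deviation, so results may differ slightly under other conventions.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < lower or x > upper]

def zscore_outliers(values, threshold=3.0):
    """Return values whose |z| exceeds the threshold."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:  # all values identical: nothing can be an outlier
        return []
    return [x for x in values if abs((x - mu) / sigma) > threshold]

data = [10, 12, 15, 18, 20, 25, 40]
print("IQR method:", iqr_outliers(data))
print("Z-score method:", zscore_outliers(data))

# With more observations, the z-score rule can flag an extreme value.
larger = [10] * 20 + [100]
print("Z-score method (larger sample):", zscore_outliers(larger))
```

On the seven-value list the IQR rule flags 40 but the z-score rule flags nothing, because the z-score of 40 is only about 2.13. In very small samples |z| can never reach 3, so the IQR rule is usually the more practical of the two there.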

  5. Data Quality Checks

Uni-variate Non-Graphical EDA helps detect:

Missing values

Invalid values (e.g., a negative age)

Extreme or impossible values

Data entry errors
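These checks are easy to automate for a single variable. Below is a minimal sketch; the function name `check_quality`, the `ages` list, and the valid range of 0 to 120 are all assumptions chosen for illustration, and the right limits depend on the variable being examined.

```python
import math

def check_quality(values, valid_min, valid_max):
    """Split raw values into missing, invalid, and valid entries."""
    report = {"missing": 0, "invalid": [], "valid": []}
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            report["missing"] += 1       # empty cell or NaN
        elif not isinstance(v, (int, float)):
            report["invalid"].append(v)  # e.g. text typed into a numeric field
        elif v < valid_min or v > valid_max:
            report["invalid"].append(v)  # e.g. negative or impossible age
        else:
            report["valid"].append(v)
    return report

# Hypothetical raw ages with typical problems mixed in.
ages = [25, 31, None, -4, 47, 230, "n/a", 52, float("nan")]
report = check_quality(ages, valid_min=0, valid_max=120)

print("missing:", report["missing"])
print("invalid:", report["invalid"])
print("valid:", report["valid"])
```

Running the summary statistics only on `report["valid"]` keeps impossible values from distorting the mean and standard deviation.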

  6. Advantages

✔ Simple and fast
✔ No visualization required
✔ Works well for summaries
✔ Ideal for exam and theory questions

  7. Limitations

✖ No visual insight
✖ Cannot show trends
✖ Less intuitive for large datasets

  8. Example

Data: 10, 12, 15, 18, 20, 25, 40

Mean = 20

Median = 18

Range = 30

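IQR = 9 (Q1 = 13.5, Q3 = 22.5, using linear interpolation between data points; other quartile conventions give slightly different values)

Skewness = Positive (mean 20 > median 18, so the right tail is longer)

Outlier = 40 (above the upper limit Q3 + 1.5 × IQR = 36)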

  9. Conclusion

Uni-variate Non-Graphical Exploratory Data Analysis is a numerical approach to understanding a single variable by analyzing its center, spread, position, shape, and quality, without using graphs. It is a foundational step before advanced statistical analysis.
