DEV Community

seng
seng

Posted on

Normal Distribution with Python

Generating Normally Distributed Data

np.random.normal method

import numpy as np

# Parameters: mean=0.98, standard deviation=4, number of samples=10000
data = np.random.normal(loc=0, scale=1, size=10000)
print(data[:50])  # Print the first 50 data points
Enter fullscreen mode Exit fullscreen mode

Probability Density Function (PDF)

scipy.stats module's norm.pdf

from scipy.stats import norm
import numpy as np

# Parameters: mean=0.98, standard deviation=4, number of samples=10000
mean = 0.98  # Mean
std_dev = 4 # Standard deviation
data = np.random.normal(loc=0, scale=1, size=10000)




# Calculate PDF
pdf = norm.pdf(data, loc=mean, scale=std_dev)
print(pdf) #Probability Density
Enter fullscreen mode Exit fullscreen mode
[0.09838715 0.09937757 0.09967584 ... 0.09653173 0.07867429 0.09311813]
Enter fullscreen mode Exit fullscreen mode

PDF Plot

from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt

# Parameters: mean=0.98, standard deviation=4, number of samples=10000
mean = 0.98  # Mean
std_dev = 4 # Standard deviation
data = np.random.normal(loc=mean , scale=std_dev, size=10000)






# Plot histogram (data distribution)
plt.hist(data, bins=30, density=True, alpha=0.6, color='b', label='Histogram')

# Plot probability density function (PDF)
x = np.linspace(min(data), max(data), 10000)  # Define the x-axis range
pdf = norm.pdf(x, loc=mean, scale=std_dev)  # Calculate the probability density function
plt.plot(x, pdf, 'k', linewidth=2, label='PDF (Normal)')

# Add title and legend
plt.title('Normal Distribution')
plt.xlabel('Value')
plt.ylabel('Density')
plt.legend()

# Display the plot
plt.show()
Enter fullscreen mode Exit fullscreen mode

'k' specifies a black curve. More options are available as shown in the table below:

Color Code Description
'b' Blue blue
'r' Red red
'g' Green green
'c' Cyan cyan
'm' Magenta magenta
'y' Yellow yellow
'k' Black black
'w' White white

Cumulative Distribution

norm.cdf calculates the cumulative distribution function for the normal distribution.

from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt

# Parameters: mean=0.98, standard deviation=4, number of samples=10000
mean = 0.98  # Mean
std_dev = 4 # Standard deviation
data = np.random.normal(loc=mean , scale=std_dev, size=10000)








# Plot histogram (data distribution)
plt.hist(data, bins=30, density=True, alpha=0.6, color='b', label='Histogram')

# Plot cumulative distribution function (CDF)
x = np.linspace(min(data), max(data), 10000)  # Define the x-axis range
cdf = norm.cdf(x, loc=mean, scale=std_dev)  # Calculate the cumulative distribution function
plt.plot(x, cdf, 'k', linewidth=2, label='CDF (Normal)')

# Add title and legend
plt.title('Normal Distribution')
plt.xlabel('Value')
plt.ylabel('Density')
plt.legend()

# Display the plot
plt.show()
Enter fullscreen mode Exit fullscreen mode

Normal Distribution Percent Point Function

norm.ppf calculates the quantile (inverse of CDF) for a given probability.

from scipy.stats import norm
import numpy as np
# Calculate quantiles, for example, for probabilities 0.018 and 0.819
q1 = norm.ppf(0.018, loc=0, scale=1)
q2 = norm.ppf(0.819, loc=0, scale=1)
print(f"0.018 quantile: {q1}, 0.819 quantile: {q2}")
Enter fullscreen mode Exit fullscreen mode
PS E:\learn\learnpy> & "D:/Program Files/Python311/python.exe" e:/learn/learnpy/learn.py
0.018 quantile: -2.0969274291643423, 0.819 quantile: 0.9115607350675405
PS E:\learn\learnpy>
Enter fullscreen mode Exit fullscreen mode

Normal Distribution Fitting

scipy.stats's norm.fit for normal distribution fitting, estimating mean and standard deviation.

from scipy.stats import norm
import numpy as np
# Generate normally distributed data
data = np.random.normal(loc=3, scale=1.67, size=10000)
# Fit the data
mu, sigma = norm.fit(data)
print(f"Fitted mean: {mu}, Fitted standard deviation: {sigma}")
Enter fullscreen mode Exit fullscreen mode
PS E:\learn\learnpy> & "D:/Program Files/Python311/python.exe" e:/learn/learnpy/learn.py
Fitted mean: 3.006577810135438, Fitted standard deviation: 1.672727044555993
Enter fullscreen mode Exit fullscreen mode

Top comments (0)