DEV Community

sofaki000

Data analysis with Matlab: distributions

Introduction:

Explore different distributions (normal, Poisson, Student's t, exponential) using MATLAB.


Normal distribution

  • normrnd = Normal random numbers

r = normrnd(mu,sigma) generates a random number from the normal distribution with mean parameter mu and standard deviation parameter sigma.

r = normrnd(mu,sigma,sz1,...,szN) generates an array of normal random numbers, where sz1,...,szN indicates the size of each dimension.

Arguments:
mu — Mean
sigma — Standard deviation

  • More on the normal distribution in a separate article
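As a quick sanity check of the two normrnd syntaxes described above, here is a minimal sketch. The values mu = 10 and sigma = 2, the sample size, and the seed are illustrative choices and not from the original notes. normrnd requires the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                            % fix the seed so the run is repeatable
mu = 10;                           % mean (illustrative value)
sigma = 2;                         % standard deviation (illustrative value)

r1 = normrnd(mu, sigma);           % a single random number
r  = normrnd(mu, sigma, 1000, 1);  % a 1000-by-1 column of random numbers

sampleMean = mean(r);              % should land close to mu
sampleStd  = std(r);               % should land close to sigma
fprintf('mean = %.3f, std = %.3f\n', sampleMean, sampleStd);
```

With 1000 draws the sample mean and standard deviation should be near, but not exactly equal to, the parameters passed in.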

Poisson distribution

  • poissrnd = Poisson random numbers

  • r = poissrnd(lambda) generates random numbers from the Poisson distribution specified by the rate parameter lambda.

lambda can be a scalar, vector, matrix, or multidimensional array.

  • r = poissrnd(lambda,sz1,...,szN) generates an array of random numbers from the Poisson distribution with the scalar rate parameter lambda, where sz1,...,szN indicates the size of each dimension.

Examples:

1)
Generate a single random number from the Poisson distribution with rate parameter 20.

r_scalar = poissrnd(20)
r_scalar = 9

2)
Generate a 2-by-3 array of random numbers from the same distribution by specifying the required array dimensions.

lambda = 20
r_array = poissrnd(lambda ,2,3)
r_array = 2×3

    13    14    18
    26    16    21


3)
Create a 1-by-6 vector of rate parameters, one for each Poisson distribution.

lambda = 10:2:20
lambda = 1×6

    10    12    14    16    18    20

Generate one random number from each of the Poisson distributions.

r = poissrnd(lambda)
r = 1×6

    14    13    14     9    14    31

  • poisspdf = Poisson probability density function (because the distribution is discrete, each value is the probability of that exact count)

Parameters:

x — Values at which to evaluate Poisson pdf
lambda — Rate parameters

Examples:

1)
Compute the Poisson probabilities for each count from 0 to 10 with rate parameter 2. If a disk has 2 flaws on average, these values are the probabilities that a disk has 0, 1, 2, ..., 10 flaws.

flaws = 0:10;
y = poisspdf(flaws,2);

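2)
Evaluate the Poisson pdf with rate parameter 10 at the integers 0 to 100, stored as a column vector.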

tV = [0:100]';
lambda = 10; 
pois = poisspdf(tV,lambda);
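To see what poisspdf actually returns, it can help to compare it with the textbook formula P(X = k) = lambda^k * exp(-lambda) / k!. The sketch below does that for the disk-flaw example. The variable names yManual, maxDiff and total are mine, not part of the original notes.

```matlab
lambda = 2;                       % rate parameter from the disk-flaw example
x = 0:10;                         % counts at which to evaluate the pdf

y = poisspdf(x, lambda);          % toolbox result

% Same probabilities straight from the formula lambda^k * exp(-lambda) / k!
yManual = lambda.^x .* exp(-lambda) ./ factorial(x);

maxDiff = max(abs(y - yManual));  % largest disagreement between the two
total   = sum(y);                 % just under 1, since counts above 10 are left out
fprintf('max difference = %.2e, sum = %.6f\n', maxDiff, total);
```

The sum is slightly below 1 only because the range stops at 10. The remaining probability belongs to counts above 10.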

Student's t distribution

  • tcdf = Student's t cumulative distribution function

p = tcdf(x,nu) returns the cumulative distribution function (cdf) of the Student's t distribution with nu degrees of freedom, evaluated at the values in x.

Arguments:
x — Values at which to evaluate cdf
nu — Degrees of freedom

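  • tinv = Student's t inverse cumulative distribution function

x = tinv(p,nu) returns the inverse cumulative distribution function (icdf) of the Student's t distribution with nu degrees of freedom, evaluated at the probability values in p.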

Examples:

1)
Find the 95th percentile of the Student's t distribution with 50 degrees of freedom.

p = .95;   
nu = 50;   
x = tinv(p,nu)
x = 1.6759
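The notes list tcdf without an example, so here is a small sketch showing that tcdf and tinv undo each other. It reuses nu = 50 from the tinv example. The other variable names are my own.

```matlab
nu = 50;                   % degrees of freedom, as in the tinv example

pCenter = tcdf(0, nu);     % the t distribution is symmetric about 0, so this is 0.5

x95   = tinv(0.95, nu);    % 95th percentile (about 1.6759)
pBack = tcdf(x95, nu);     % feeding it back into tcdf recovers 0.95

fprintf('tcdf(0) = %.4f, x95 = %.4f, tcdf(x95) = %.4f\n', pCenter, x95, pBack);
```

Reading it the other way round: tcdf answers "what fraction of the distribution lies below x?", while tinv answers "below which x does a fraction p of the distribution lie?".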

Exponential distribution

  • exprnd = Exponential random numbers

Syntax:

1) r = exprnd(mu) generates a random number from the exponential distribution with mean mu.

2) r = exprnd(mu,sz1,...,szN) generates an array of random numbers from the exponential distribution, where sz1,...,szN indicates the size of each dimension.

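A minimal sketch of both exprnd syntaxes follows. The mean mu = 3, the sample size and the seed are illustrative values I picked. Note that MATLAB parameterizes the exponential distribution by its mean mu, not by the rate (the rate would be 1/mu).

```matlab
rng(1);                      % fix the seed so the run is repeatable
mu = 3;                      % mean of the distribution (illustrative value)

r1 = exprnd(mu);             % a single random number
r  = exprnd(mu, 1, 5000);    % a 1-by-5000 row of random numbers

sampleMean = mean(r);        % should land close to mu
fprintf('sample mean = %.3f (target %.1f)\n', sampleMean, mu);
```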
  • exppdf = Exponential probability density function

1) y = exppdf(x) returns the probability density function (pdf) of the standard exponential distribution, evaluated at the values in x.

Compute the density of the value 5 in the standard exponential distribution.

y1 = exppdf(5) 
y1 = 0.0067

2) y = exppdf(x,mu) returns the pdf of the exponential distribution with mean mu, evaluated at the values in x.

Compute the density of the value 5 in the exponential distributions specified by means 1 through 5.

y2 = exppdf(5,1:5)
y2 = 1×5

    0.0067    0.0410    0.0630    0.0716    0.0736

Arguments:

x — Values at which to evaluate pdf
mu — mean
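The numbers in the exppdf examples can be reproduced from the density formula f(x) = (1/mu) * exp(-x/mu). The sketch below recomputes y2 by hand. yManual and maxDiff are names I introduced for the comparison.

```matlab
x  = 5;                            % point at which to evaluate the density
mu = 1:5;                          % means 1 through 5, as in the example above

y2 = exppdf(x, mu);                % toolbox result

% Same densities straight from the formula (1/mu) * exp(-x/mu)
yManual = exp(-x ./ mu) ./ mu;

maxDiff = max(abs(y2 - yManual));  % largest disagreement between the two
disp([y2; yManual]);               % first row: exppdf, second row: formula
```

The first entry (mu = 1) is the standard exponential case, which is why it matches exppdf(5).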

Histograms

1) histogram = Histogram plot

2) histfit = Histogram with a distribution fit

histfit(data) plots a histogram of values in data using the number of bins equal to the square root of the number of elements in data and fits a normal density function.

Construct a histogram with a normal distribution fit.
(The number of bins is the number of bars you will see in the histogram.)

r = normrnd(10,1,100,1); 
histfit(r)

histfit(data,nbins) plots a histogram using nbins bins and fits a normal density function.

Construct a histogram using six bins with a normal distribution fit.

r = normrnd(10,1,100,1);
histfit(r,6)

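The plain histogram function is listed above without an example, so here is a small sketch using the same kind of data as the histfit examples. Choosing the bin count as the square root of the sample size mirrors the rule in the histfit description. Applying it to histogram is my own choice, not something the function does by default.

```matlab
rng(2);                           % fix the seed so the run is repeatable
r = normrnd(10, 1, 100, 1);       % 100 normal random numbers, as in the histfit examples

nbins = round(sqrt(numel(r)));    % square-root rule: 10 bins for 100 points
h = histogram(r, nbins);          % plot the histogram and keep the handle

counts = h.Values;                % number of observations in each bin
fprintf('%d bins, %d observations in total\n', h.NumBins, sum(counts));
```

Unlike histfit, histogram draws only the bars and does not overlay a fitted density.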

Clearing environment

  • clc = clears command window
  • clear = Remove items from workspace, freeing up system memory

General Useful Commands

  • mean = calculates average or mean value of array
  • nargin = returns the number of function input arguments
  • ttest = One-sample and paired-sample t-test

More on ttest

The one-sample t-test is a parametric test of the location parameter when the population standard deviation is unknown.

The test statistic is:

t = (x̄ − μ) / (s / sqrt(n))

where x̄ is the sample mean, μ is the hypothesized population mean, s is the sample standard deviation, and n is the sample size. Under the null hypothesis, the test statistic has Student’s t distribution with n – 1 degrees of freedom.
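The test statistic can be computed by hand with mean, std and sqrt, and the two-sided p-value follows from tcdf. The sketch below does this for a small made-up sample. The data and the hypothesized mean mu0 = 5 are hypothetical values for illustration only.

```matlab
x   = [5.1 4.9 5.6 5.8 6.0 5.3 5.5 4.7];  % hypothetical sample
mu0 = 5;                                  % hypothesized population mean

n    = numel(x);                          % sample size
xbar = mean(x);                           % sample mean
s    = std(x);                            % sample standard deviation (divides by n - 1)

t  = (xbar - mu0) / (s / sqrt(n));        % one-sample t statistic
df = n - 1;                               % degrees of freedom

p = 2 * (1 - tcdf(abs(t), df));           % two-sided p-value from the t distribution
fprintf('t = %.4f, df = %d, p = %.4f\n', t, df, p);
```

Calling ttest(x, mu0) on the same data should report the same statistic and p-value, which is a handy way to confirm the formula.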

Optional name-value arguments of ttest:

1)

Alpha — Significance level

0.05 (default) | scalar value in the range (0,1)

2)

Dim — Dimension

first nonsingleton dimension (default) | positive integer value
Dimension of the input matrix along which to test the means, specified as the comma-separated pair consisting of 'Dim' and a positive integer value. For example, specifying 'Dim',1 tests the column means, while 'Dim',2 tests the row means.

3)

Tail — Type of alternative hypothesis

'both' (default) | 'right' | 'left'
Type of alternative hypothesis to evaluate, specified as the comma-separated pair consisting of 'Tail' and one of the following, where m is the hypothesized population mean:

'both' — Test against the alternative hypothesis that the population mean is not m.

'right' — Test against the alternative hypothesis that the population mean is greater than m.

'left' — Test against the alternative hypothesis that the population mean is less than m.

Output Arguments of ttest:

1)

h — Hypothesis test result

Hypothesis test result, returned as 1 or 0.

If h = 1, this indicates the rejection of the null hypothesis at the Alpha significance level.

If h = 0, this indicates a failure to reject the null hypothesis at the Alpha significance level.

2)

p — p-value

p-value of the test, returned as a scalar value in the range [0,1]. p is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. Small values of p cast doubt on the validity of the null hypothesis.

3)

ci — Confidence interval

vector
Confidence interval for the true population mean, returned as a two-element vector containing the lower and upper boundaries of the 100 × (1 – Alpha)% confidence interval.

4)

stats — Test statistics

structure
Test statistics, returned as a structure containing the following:

tstat — Value of the test statistic.

df — Degrees of freedom of the test.

sd — Estimated population standard deviation. For a paired t-test, sd is the standard deviation of x – y.
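Putting the input and output arguments together, a call might look like the sketch below. The sample, the hypothesized mean of 0, and the choice of 'Alpha' = 0.01 with a right-tailed test are all assumptions made for illustration.

```matlab
rng(3);                               % fix the seed so the run is repeatable
x = normrnd(0.5, 1, 30, 1);           % hypothetical sample of 30 observations

% Right-tailed test of "population mean is 0" at the 1% significance level
[h, p, ci, stats] = ttest(x, 0, 'Alpha', 0.01, 'Tail', 'right');

fprintf('h = %d, p = %.4f\n', h, p);
fprintf('ci = [%.4f, %.4f]\n', ci(1), ci(2));   % for a right-tailed test the upper bound is Inf
fprintf('tstat = %.4f, df = %d, sd = %.4f\n', stats.tstat, stats.df, stats.sd);
```

h is 1 exactly when p does not exceed Alpha, so the two outputs always tell the same story at the chosen significance level.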
