DEV Community

sofaki000

Data analysis with MATLAB: distributions

Introduction:

Explore different distributions (normal, Poisson, Student's t, exponential) using MATLAB.


Normal distribution

  • normrnd = Normal random numbers

r = normrnd(mu,sigma) generates a random number from the normal distribution with mean parameter mu and standard deviation parameter sigma.

r = normrnd(mu,sigma,sz1,...,szN) generates an array of normal random numbers, where sz1,...,szN indicates the size of each dimension.

Arguments:
mu — Mean
sigma — Standard deviation
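
As a quick illustration of the two call forms above, the following sketch draws a single value and then a 1000-by-1 column vector. The seed and the values mu = 10 and sigma = 2 are arbitrary choices for this example, and normrnd is assumed to be available through the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                            % fix the seed so the run is repeatable (arbitrary choice)
mu = 10;                           % mean (example value)
sigma = 2;                         % standard deviation (example value)

r1 = normrnd(mu, sigma);           % a single random number
r  = normrnd(mu, sigma, 1000, 1);  % 1000-by-1 column vector of random numbers

% With 1000 draws, the sample statistics should land close to mu and sigma
sampleMean = mean(r)
sampleStd  = std(r)
```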

  • More on the normal distribution in a separate article

Poisson distribution

  • poissrnd = Poisson random numbers

  • r = poissrnd(lambda) generates random numbers from the Poisson distribution specified by the rate parameter lambda.

lambda can be a scalar, vector, matrix, or multidimensional array.

  • r = poissrnd(lambda,sz1,...,szN) generates an array of random numbers from the Poisson distribution with the scalar rate parameter lambda, where sz1,...,szN indicates the size of each dimension.

Examples:

1)

r_scalar = poissrnd(20)
r_scalar = 9

2)
Generate a 2-by-3 array of random numbers from the same distribution by specifying the required array dimensions.

lambda = 20
r_array = poissrnd(lambda ,2,3)
r_array = 2×3

    13    14    18
    26    16    21


3)

lambda = 10:2:20
lambda = 1×6

    10    12    14    16    18    20

Generate random numbers from the Poisson distributions.

r = poissrnd(lambda)
r = 1×6

    14    13    14     9    14    31

  • poisspdf = Poisson probability density function

Parameters:

x — Values at which to evaluate Poisson pdf
lambda — Rate parameters

Compute the Poisson probability density function at each value from 0 to 10. If the number of flaws on a disk follows a Poisson distribution with rate lambda = 2 (as in example 1 below), these values are the probabilities that a disk has 0, 1, 2, ..., 10 flaws.

Examples:

1)

flaws = 0:10;
y = poisspdf(flaws,2);

2)

tV = [0:100]';
lambda = 10; 
pois = poisspdf(tV,lambda);
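
To make the output of poisspdf more concrete, here is a small sketch that checks a few properties of the values it returns. It reuses lambda = 10 from example 2; the variable names and the choice to plot only the counts 0 to 30 are assumptions made for this illustration.

```matlab
lambda = 10;                       % rate parameter, as in example 2
k = (0:100)';                      % counts at which to evaluate the pdf
pois = poisspdf(k, lambda);

totalProb = sum(pois)              % probabilities over 0..100 add up to ~1
distMean  = sum(k .* pois)         % mean of the distribution, ~lambda

% Same value from the formula lambda^k * exp(-lambda) / k! at k = 3
p3 = lambda^3 * exp(-lambda) / factorial(3);
diff3 = abs(p3 - poisspdf(3, lambda))   % ~0

bar(k(1:31), pois(1:31))           % plot the probabilities for counts 0..30
xlabel('k'); ylabel('P(X = k)');
```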

Student's t distribution

  • tcdf = Student's t cumulative distribution function

p = tcdf(x,nu) returns the cumulative distribution function (cdf) of the Student's t distribution with nu degrees of freedom, evaluated at the values in x.

Arguments:
x — Values at which to evaluate cdf
nu — Degrees of freedom
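
A short sketch of how tcdf can be used, assuming nu = 10 and the evaluation point x = 1.5 purely as example values. It relies on the fact that the t distribution is symmetric about 0.

```matlab
nu = 10;                           % degrees of freedom (example value)

pCenter = tcdf(0, nu)              % 0.5: half of the probability lies below 0

x = 1.5;                           % example evaluation point
pLower = tcdf(-x, nu);             % P(T <= -1.5)
pUpper = 1 - tcdf(x, nu);          % P(T >= 1.5)
symmetryGap = abs(pLower - pUpper) % ~0 because of symmetry

pBetween = tcdf(x, nu) - tcdf(-x, nu)   % P(-1.5 <= T <= 1.5)
```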

  • tinv = Student's t inverse cumulative distribution function

Examples:

1)
Find the 95th percentile of the Student's t distribution with 50 degrees of freedom.

p = .95;   
nu = 50;   
x = tinv(p,nu)
x = 1.6759
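
Since tinv is the inverse of tcdf, feeding the percentile back into tcdf should return the original probability. The sketch below checks this for the example above and also computes a two-sided critical value; alpha = 0.05 is an assumed significance level chosen for illustration.

```matlab
p = 0.95;
nu = 50;
x = tinv(p, nu);                   % 95th percentile, as in the example above

pBack = tcdf(x, nu)                % recovers p up to rounding

% Two-sided critical value for a 95% interval: 2.5% in each tail
alpha = 0.05;                      % assumed significance level
tCrit = tinv(1 - alpha/2, nu)
```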

Exponential distribution

  • exprnd = Exponential random numbers

Syntax:

1) r = exprnd(mu) generates a random number from the exponential distribution with mean mu.

2) r = exprnd(mu,sz1,...,szN) generates an array of random numbers from the exponential distribution, where sz1,...,szN indicates the size of each dimension.

  • exppdf = Exponential probability density function

1) y = exppdf(x) returns the probability density function (pdf) of the standard exponential distribution, evaluated at the values in x.

Compute the density of the value 5 in the standard exponential distribution.

y1 = exppdf(5) 
y1 = 0.0067

2) y = exppdf(x,mu) returns the pdf of the exponential distribution with mean mu, evaluated at the values in x.

Compute the density of the value 5 in the exponential distributions specified by means 1 through 5.

y2 = exppdf(5,1:5)
y2 = 1×5

    0.0067    0.0410    0.0630    0.0716    0.0736

Arguments:

x — Values at which to evaluate pdf
mu — Mean
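
The following sketch compares exppdf with the density formula (1/mu) * exp(-x/mu) and then draws random numbers with exprnd. Note that both functions are parameterized by the mean mu, not by the rate 1/mu. The seed, the sample size, and the mean of 2 are arbitrary values picked for this example.

```matlab
x  = 5;
mu = 1:5;                          % means 1 through 5, as in example 2

yBuiltin = exppdf(x, mu);
yFormula = (1 ./ mu) .* exp(-x ./ mu);    % density formula for x >= 0
maxDiff  = max(abs(yBuiltin - yFormula))  % ~0

% Random draws: the first argument is the mean, not the rate
rng(1);                            % arbitrary seed for repeatability
r = exprnd(2, 10000, 1);           % 10000 draws with mean 2
sampleMean = mean(r)               % close to 2
```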

Histograms

1) histogram = Histogram plot

2) histfit = Histogram with a distribution fit

histfit(data) plots a histogram of the values in data and fits a normal density function. The number of bins (the bars you see in the histogram) is set to the square root of the number of elements in data.

Construct a histogram with a normal distribution fit.

r = normrnd(10,1,100,1); 
histfit(r)

histfit(data,nbins) plots a histogram using nbins bins and fits a normal density function.

Construct a histogram using six bins with a normal distribution fit.

r = normrnd(10,1,100,1);
histfit(r,6)
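
histfit also accepts a third argument naming the distribution to fit, which is handy for the other distributions covered in this post. A minimal sketch, assuming exponential data with mean 2 and 10 bins as example values:

```matlab
rng(2);                            % arbitrary seed for repeatability
r = exprnd(2, 200, 1);             % 200 draws from an exponential with mean 2

% The third argument names the distribution to fit instead of the default normal
h = histfit(r, 10, 'exponential');
title('Exponential fit');

% h(1) is the handle to the histogram bars, h(2) to the fitted density curve
h(2).Color = 'k';                  % draw the fitted curve in black
```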


Clearing environment

  • clc = clears command window
  • clear = Remove items from workspace, freeing up system memory

General Useful Commands

  • mean = calculates average or mean value of array
  • nargin = returns the number of function input arguments
  • ttest = One-sample and paired-sample t-test

More on ttest

The one-sample t-test is a parametric test of the location parameter when the population standard deviation is unknown.

The test statistic is:

t = (x − μ) / (s / sqrt(n))

where x is the sample mean, μ is the hypothesized population mean, s is the sample standard deviation, and n is the sample size. Under the null hypothesis, the test statistic has Student’s t distribution with n – 1 degrees of freedom.
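
To connect the formula to the function, this sketch computes the test statistic by hand and compares it with the value reported by ttest. The sample (30 normal draws with mean 10) and the hypothesized mean m = 10 are made-up values for illustration.

```matlab
rng(3);                            % arbitrary seed for repeatability
x = normrnd(10, 1, 30, 1);         % example sample: 30 draws, true mean 10
m = 10;                            % hypothesized population mean

n = numel(x);
tManual = (mean(x) - m) / (std(x) / sqrt(n));   % the test statistic formula

[h, p, ci, stats] = ttest(x, m);   % one-sample t-test against mean m
tDiff = abs(tManual - stats.tstat) % ~0
df = stats.df                      % n - 1 = 29
```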

Arguments of ttest:

1)

Alpha — Significance level

0.05 (default) | scalar value in the range (0,1)

2)

Dim — Dimension

first nonsingleton dimension (default) | positive integer value
Dimension of the input matrix along which to test the means, specified as the comma-separated pair consisting of 'Dim' and a positive integer value. For example, specifying 'Dim',1 tests the column means, while 'Dim',2 tests the row means.

3)

Tail — Type of alternative hypothesis

'both' (default) | 'right' | 'left'
Type of alternative hypothesis to evaluate, specified as the comma-separated pair consisting of 'Tail' and one of the following, where m is the hypothesized population mean (μ in the test statistic):

'both' — Test against the alternative hypothesis that the population mean is not m.

'right' — Test against the alternative hypothesis that the population mean is greater than m.

'left' — Test against the alternative hypothesis that the population mean is less than m.
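
The name-value arguments above can be combined in a single call. The sketch below uses a made-up sample with true mean 10.5 tested against m = 10; all data and seeds are assumptions for illustration.

```matlab
rng(4);                            % arbitrary seed for repeatability
x = normrnd(10.5, 1, 40, 1);       % example sample, true mean 10.5
m = 10;                            % hypothesized mean

[hBoth,  pBoth]  = ttest(x, m);                    % defaults: Alpha 0.05, two-sided
[hRight, pRight] = ttest(x, m, 'Tail', 'right');   % alternative: mean > m
[hLeft,  pLeft]  = ttest(x, m, 'Tail', 'left');    % alternative: mean < m
[hStrict, pStrict] = ttest(x, m, 'Alpha', 0.01, 'Tail', 'right');

% The two one-sided p-values add up to 1; Alpha can change h but not p
pSum = pRight + pLeft
samePValue = (pStrict == pRight)

% 'Dim',2 runs one test per row of a matrix
X = normrnd(0, 1, 3, 50);          % 3 rows of 50 observations each
hRows = ttest(X, 0, 'Dim', 2);     % 3-by-1 result
```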

Output Arguments of ttest:

1)

h — Hypothesis test result

Hypothesis test result, returned as 1 or 0.

If h = 1, this indicates the rejection of the null hypothesis at the Alpha significance level.

If h = 0, this indicates a failure to reject the null hypothesis at the Alpha significance level.

2)

p — p-value

p-value of the test, returned as a scalar value in the range [0,1]. p is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. Small values of p cast doubt on the validity of the null hypothesis.

3)

ci — Confidence interval

vector
Confidence interval for the true population mean, returned as a two-element vector containing the lower and upper boundaries of the 100 × (1 – Alpha)% confidence interval.

4)

stats — Test statistics

structure
Test statistics, returned as a structure containing the following:

tstat — Value of the test statistic.

df — Degrees of freedom of the test.

sd — Estimated population standard deviation. For a paired t-test, sd is the standard deviation of x – y.
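
As a way to see how the four outputs fit together, the following sketch rebuilds the confidence interval and the two-sided p-value from stats using tinv and tcdf. The sample and the hypothesized mean are made-up values, and the manual formulas are the standard one-sample t-test expressions rather than anything specific to this post.

```matlab
rng(5);                            % arbitrary seed for repeatability
x = normrnd(10, 1, 25, 1);         % example sample of 25 observations
m = 10;                            % hypothesized mean
alpha = 0.05;                      % significance level

[h, p, ci, stats] = ttest(x, m, 'Alpha', alpha);

% Rebuild the confidence interval from the sample and tinv
n = numel(x);
tCrit = tinv(1 - alpha/2, stats.df);
halfWidth = tCrit * stats.sd / sqrt(n);
ciManual = [mean(x) - halfWidth; mean(x) + halfWidth];
ciDiff = max(abs(ci(:) - ciManual))        % ~0

% Rebuild the two-sided p-value from the test statistic and tcdf
pManual = 2 * tcdf(-abs(stats.tstat), stats.df);
pDiff = abs(p - pManual)                   % ~0

% h is 1 when p is at or below alpha, and 0 otherwise
consistent = (h == (p <= alpha))
```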
