sofaki000

# Data analysis with Matlab: distributions

Introduction:

Explore different distributions (normal, Poisson, Student's t, exponential) using MATLAB.

### Normal distribution

• normrnd = Normal random numbers

r = normrnd(mu,sigma) generates a random number from the normal distribution with mean parameter mu and standard deviation parameter sigma.

r = normrnd(mu,sigma,sz1,...,szN) generates an array of normal random numbers, where sz1,...,szN indicates the size of each dimension.

Arguments:
mu — Mean
sigma — Standard deviation
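
As a minimal sketch of both call forms, the snippet below draws a single value and a 3-by-4 array, then checks that a large sample has roughly the requested mean and standard deviation. The values mu = 10 and sigma = 2, the sample size, and the seed passed to rng are illustrative assumptions, not values from the text above.

```matlab
rng(0);                                  % fix the seed so the draws are repeatable
mu = 10;                                 % mean (illustrative value)
sigma = 2;                               % standard deviation (illustrative value)

r_scalar = normrnd(mu, sigma);           % one random number
r_array  = normrnd(mu, sigma, 3, 4);     % 3-by-4 array of random numbers

% With many draws, the sample statistics should be close to mu and sigma.
sample      = normrnd(mu, sigma, 10000, 1);
sample_mean = mean(sample)
sample_std  = std(sample)
```

The exact numbers depend on the seed and the MATLAB release, so no output is shown here.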

• More on the normal distribution in a separate article.

### Poisson distribution

• poissrnd = Generates random numbers from the Poisson distribution.

• r = poissrnd(lambda) generates random numbers from the Poisson distribution specified by the rate parameter lambda.

lambda can be a scalar, vector, matrix, or multidimensional array.

• r = poissrnd(lambda,sz1,...,szN) generates an array of random numbers from the Poisson distribution with the scalar rate parameter lambda, where sz1,...,szN indicates the size of each dimension.

Examples:

1)
Generate a single random number from the Poisson distribution with rate parameter 20.

``````r_scalar = poissrnd(20)
r_scalar = 9
``````

2)
Generate a 2-by-3 array of random numbers from the same distribution by specifying the required array dimensions.

``````lambda = 20
r_array = poissrnd(lambda ,2,3)
r_array = 2×3

13    14    18
26    16    21

``````

3)
Create a vector of rate parameters.

``````lambda = 10:2:20
lambda = 1×6

10    12    14    16    18    20
``````

Generate one random number from each of the Poisson distributions specified by lambda.

``````r = poissrnd(lambda)
r = 1×6

14    13    14     9    14    31
``````
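
A quick way to sanity-check poissrnd is to use the fact that a Poisson distribution has mean and variance both equal to lambda. The sketch below draws a large sample and compares the two statistics; the sample size and seed are arbitrary choices made for illustration.

```matlab
rng(0);                                  % repeatable draws
lambda = 20;                             % rate parameter, as in the examples above
sample = poissrnd(lambda, 50000, 1);     % 50000-by-1 column of Poisson draws

% Both values should be close to lambda.
sample_mean = mean(sample)
sample_var  = var(sample)
```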

• poisspdf = Poisson probability density function (because the Poisson distribution is discrete, the values returned are probabilities)

Parameters:

x — Values at which to evaluate Poisson pdf
lambda — Rate parameters

Examples:

1)
Compute the Poisson pdf at each integer from 0 to 10 for rate parameter 2. These values correspond to the probabilities that a disk has 0, 1, 2, ..., 10 flaws when the average number of flaws per disk is 2.

``````flaws = 0:10;
y = poisspdf(flaws,2);
``````

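2)
Evaluate the Poisson pdf with rate parameter 10 at the integers 0 through 100, passed as a column vector.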

``````tV = [0:100]';
lambda = 10;
pois = poisspdf(tV,lambda);
``````
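
To make clear what poisspdf returns, the following sketch compares it with the textbook formula P(X = x) = exp(-lambda) * lambda^x / x!. The variable names y_manual, max_abs_diff and total are introduced here only for illustration.

```matlab
lambda = 2;                              % rate parameter from example 1
x = 0:10;                                % number of flaws

y        = poisspdf(x, lambda);          % toolbox result
y_manual = exp(-lambda) .* lambda.^x ./ factorial(x);   % direct formula

max_abs_diff = max(abs(y - y_manual))    % should be at rounding-error level
total        = sum(y)                    % slightly below 1, since x stops at 10
```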

### Student's t distribution

• tcdf = Student's t cumulative distribution function

p = tcdf(x,nu) returns the cumulative distribution function (cdf) of the Student's t distribution with nu degrees of freedom, evaluated at the values in x.

Arguments:
x — Values at which to evaluate cdf
nu — Degrees of freedom
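
A short sketch of tcdf in use, relying on the symmetry of the t distribution about zero. The choice of nu = 50 and of the evaluation points is arbitrary.

```matlab
nu = 50;                                 % degrees of freedom (illustrative value)

p_center = tcdf(0, nu)                   % symmetric about 0, so this is 0.5
p_grid   = tcdf(-3:3, nu)                % cdf values increase with x

% Tail probabilities: P(T > 2) equals P(T < -2) by symmetry.
p_upper_tail = 1 - tcdf(2, nu)
p_lower_tail = tcdf(-2, nu)
```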

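• tinv = Student's t inverse cumulative distribution function

x = tinv(p,nu) returns the inverse cdf of the Student's t distribution with nu degrees of freedom, evaluated at the probability values in p.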

Examples:

1)
Find the 95th percentile of the Student's t distribution with 50 degrees of freedom.

``````p = .95;
nu = 50;
x = tinv(p,nu)
x = 1.6759
``````
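
Since tinv is the inverse of tcdf, feeding the percentile back into tcdf should recover the original probability. The sketch below checks that round trip and also computes the critical value for a two-sided test; the significance level alpha = 0.05 is an assumption chosen for illustration.

```matlab
p  = 0.95;
nu = 50;

x      = tinv(p, nu);                    % 95th percentile
p_back = tcdf(x, nu)                     % should recover 0.95

% For a two-sided test at level alpha, split alpha across both tails.
alpha  = 0.05;
t_crit = tinv(1 - alpha/2, nu)           % larger than the one-sided value x
```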

### Exponential distribution

• exprnd = Exponential random numbers

r = exprnd(mu) generates a random number from the exponential distribution with mean mu.

r = exprnd(mu,sz1,...,szN) generates an array of random numbers from the exponential distribution, where sz1,...,szN indicates the size of each dimension.

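
Here is a runnable sketch of both exprnd call forms. The mean mu = 3, the array sizes, and the seed are illustrative assumptions.

```matlab
rng(5);                                  % repeatable draws
mu = 3;                                  % mean of the distribution (illustrative)

r = exprnd(mu)                           % one random number
R = exprnd(mu, 2, 3)                     % 2-by-3 array of random numbers

% Note that mu is the mean, not the rate; a large sample should average near mu.
sample      = exprnd(mu, 20000, 1);
sample_mean = mean(sample)
```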
• exppdf = Exponential probability density function

1) y = exppdf(x) returns the probability density function (pdf) of the standard exponential distribution, evaluated at the values in x.

Compute the density of the value 5 in the standard exponential distribution.

``````y1 = exppdf(5)
y1 = 0.0067
``````

2) y = exppdf(x,mu) returns the pdf of the exponential distribution with mean mu, evaluated at the values in x.

Compute the density of the value 5 in the exponential distributions specified by means 1 through 5.

``````y2 = exppdf(5,1:5)
y2 = 1×5

0.0067    0.0410    0.0630    0.0716    0.0736
``````

Arguments:

x — Values at which to evaluate pdf
mu — Mean
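
Because exppdf is parameterized by the mean, its density is f(x) = (1/mu) * exp(-x/mu) for x >= 0. The sketch below reproduces the second example from that formula; y_manual and max_abs_diff are names introduced here for illustration.

```matlab
x  = 5;
mu = 1:5;                                % means 1 through 5, as in example 2

y        = exppdf(x, mu);                % toolbox result
y_manual = exp(-x ./ mu) ./ mu;          % (1/mu) * exp(-x/mu), elementwise

max_abs_diff = max(abs(y - y_manual))    % should be at rounding-error level
```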

### Histograms

1) histogram = Histogram plot

2) histfit = Histogram with a distribution fit

histfit(data) plots a histogram of values in data using the number of bins equal to the square root of the number of elements in data and fits a normal density function.

Construct a histogram with a normal distribution fit.
(The number of bins is the number of bars you will see in the histogram.)

``````r = normrnd(10,1,100,1);
histfit(r)
``````

histfit(data,nbins) plots a histogram using nbins bins and fits a normal density function.

Construct a histogram using six bins with a normal distribution fit.

``````r = normrnd(10,1,100,1);
histfit(r,6)

``````
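
The histogram function has no example above, and histfit can also fit distributions other than the normal through a third argument. The sketch below shows both on exponential data; the sample size, bin count, and seed are arbitrary choices.

```matlab
rng(1);                                  % repeatable draws
r = exprnd(2, 500, 1);                   % 500 exponential draws with mean 2

figure;
histogram(r, 20);                        % plain histogram with 20 bins
title('histogram');

figure;
histfit(r, 20, 'exponential');           % histogram plus fitted exponential density
title('histfit with an exponential fit');
```

Fitting a normal curve to clearly skewed data like this would be misleading, which is the reason to pass the distribution name explicitly.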

### Clearing environment

• clc = Clears the Command Window
• clear = Removes items from the workspace, freeing up system memory

## General Useful Commands

• mean = Average or mean value of an array
• nargin = Number of function input arguments
• ttest = One-sample and paired-sample t-test
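
As a small illustration of mean and nargin together, the hypothetical helper below (mean_along is a name invented for this sketch, saved as mean_along.m) uses nargin to supply a default when the second argument is omitted.

```matlab
function m = mean_along(x, dim)
% MEAN_ALONG  Mean of x along dimension dim (hypothetical helper).
%   When dim is omitted, nargin is 1 and the mean is taken down the columns.
if nargin < 2
    dim = 1;                             % default: column means
end
m = mean(x, dim);
end
```

Calling mean_along([1 2; 3 4]) returns the column means [2 3], while mean_along([1 2; 3 4], 2) returns the row means [1.5; 3.5].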

#### More on ttest

The one-sample t-test is a parametric test of the location parameter when the population standard deviation is unknown.

The test statistic is:

``````t = (x̄ − μ)/(s/sqrt(n))
``````

where x̄ is the sample mean, μ is the hypothesized population mean, s is the sample standard deviation, and n is the sample size. Under the null hypothesis, the test statistic has a Student's t distribution with n - 1 degrees of freedom.
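
To connect the formula to the function, the sketch below computes the statistic by hand and compares it with what ttest reports. The data are simulated; the sample size, the hypothesized mean mu0, and the seed are assumptions made for illustration.

```matlab
rng(2);                                  % repeatable draws
x   = normrnd(10, 2, 30, 1);             % simulated sample of size 30
mu0 = 10;                                % hypothesized population mean
n   = numel(x);

% Test statistic from the formula (std uses the n-1 normalization by default).
t_manual = (mean(x) - mu0) / (std(x) / sqrt(n));

% Two-sided p-value from the t distribution with n-1 degrees of freedom.
p_manual = 2 * tcdf(-abs(t_manual), n - 1);

[h, p, ci, stats] = ttest(x, mu0);
diff_t = abs(t_manual - stats.tstat)     % should be at rounding-error level
diff_p = abs(p_manual - p)               % should be at rounding-error level
```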

Name-value arguments of ttest:

1)

###### Alpha — Significance level

0.05 (default) | scalar value in the range (0,1)

2)

###### Dim — Dimension

first nonsingleton dimension (default) | positive integer value
Dimension of the input matrix along which to test the means, specified as the comma-separated pair consisting of 'Dim' and a positive integer value. For example, specifying 'Dim',1 tests the column means, while 'Dim',2 tests the row means.

3)

###### Tail — Type of alternative hypothesis

'both' (default) | 'right' | 'left'
Type of alternative hypothesis to evaluate, specified as the comma-separated pair consisting of 'Tail' and one of the following, where m is the hypothesized population mean:

'both' — Test against the alternative hypothesis that the population mean is not m.

'right' — Test against the alternative hypothesis that the population mean is greater than m.

'left' — Test against the alternative hypothesis that the population mean is less than m.
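
The three name-value arguments can be combined in a single call. The sketch below uses simulated data; the hypothesized mean of 10, the Alpha value of 0.01, and the matrix size are illustrative assumptions.

```matlab
rng(3);                                  % repeatable draws
x = normrnd(10.5, 2, 40, 1);             % simulated sample

[h_both,  p_both]  = ttest(x, 10);                                 % defaults
[h_right, p_right] = ttest(x, 10, 'Alpha', 0.01, 'Tail', 'right'); % mean > 10?
[h_left,  p_left]  = ttest(x, 10, 'Tail', 'left');                 % mean < 10?

% The one-sided p-values are complementary, and the two-sided p-value
% is twice the smaller of them.
p_sum = p_right + p_left

X = normrnd(0, 1, 20, 3);                % 20 observations of 3 variables
h_cols = ttest(X, 0, 'Dim', 1);          % one test per column: 1-by-3 result
h_rows = ttest(X, 0, 'Dim', 2);          % one test per row: 20-by-1 result
```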

Output Arguments of ttest:

1)

###### h — Hypothesis test result

Hypothesis test result, returned as 1 or 0.

If h = 1, this indicates the rejection of the null hypothesis at the Alpha significance level.

If h = 0, this indicates a failure to reject the null hypothesis at the Alpha significance level.

2)

###### p — p-value

p-value of the test, returned as a scalar value in the range [0,1]. p is the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis. Small values of p cast doubt on the validity of the null hypothesis.

3)

###### ci — Confidence interval

vector
Confidence interval for the true population mean, returned as a two-element vector containing the lower and upper boundaries of the 100 × (1 – Alpha)% confidence interval.

4)

###### stats — Test statistics

structure
Test statistics, returned as a structure containing the following:

tstat — Value of the test statistic.

df — Degrees of freedom of the test.

sd — Estimated population standard deviation. For a paired t-test, sd is the standard deviation of x – y.
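
To see all four outputs at once, here is a sketch of a paired-sample test on simulated before/after measurements. The variable names and the simulated effect are assumptions for illustration; the confidence interval is rebuilt by hand from stats and tinv to show how the pieces fit together.

```matlab
rng(4);                                  % repeatable draws
before = normrnd(100, 10, 25, 1);        % simulated measurements
after  = before + normrnd(2, 5, 25, 1);  % same subjects, shifted on average

% Paired test: the null hypothesis is that mean(before - after) is 0.
[h, p, ci, stats] = ttest(before, after);

d = before - after;
n = numel(d);
sd_diff = abs(stats.sd - std(d))         % sd is the standard deviation of x - y

% 95% confidence interval for the mean difference, built by hand.
half_width = tinv(0.975, stats.df) * std(d) / sqrt(n);
ci_manual  = mean(d) + [-1; 1] * half_width;
ci_diff    = max(abs(ci(:) - ci_manual)) % should be at rounding-error level
```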