rhyizm


Conjugate Prior Distribution and Non-Informative Prior Distribution in Bayesian Statistics

In Bayesian statistics, certain combinations of a prior distribution and a likelihood function guarantee that the posterior distribution belongs to the same distribution family as the prior. Such a prior is called a conjugate prior for the likelihood function.

If you don't choose a conjugate prior, you may need to take a numerical approach to deal with the posterior distribution. In this context, the numerical approach tends to be more complex than the analytical one, and it also requires more computing resources.
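As a rough sketch of what that numerical route looks like, the posterior for a non-conjugate prior can be approximated on a grid. The Gaussian-shaped prior and the numbers (n = 10, x = 2) below are illustrative assumptions of mine, not part of any particular problem:

```python
# Grid approximation of a posterior when no conjugate prior is available.
from math import comb, exp

n, x = 10, 2
grid = [i / 1000 for i in range(1, 1000)]   # theta values in (0, 1)
dt = 1 / 1000

# Hypothetical non-conjugate prior: a Gaussian-shaped curve centered at 0.3.
prior = [exp(-0.5 * ((t - 0.3) / 0.1) ** 2) for t in grid]
# Binomial likelihood for x successes in n trials.
like = [comb(n, x) * t**x * (1 - t) ** (n - x) for t in grid]

unnorm = [p * l for p, l in zip(prior, like)]
Z = sum(unnorm) * dt                        # numerical normalizing constant
posterior = [u / Z for u in unnorm]

# The grid posterior integrates to (approximately) 1.
print(round(sum(posterior) * dt, 6))  # → 1.0
```

Every evaluation of the posterior requires this kind of numerical normalization, whereas a conjugate prior gives the normalized posterior in closed form.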

Mathematical background of the conjugate prior Beta distribution

$$
\begin{aligned}
f_{Beta}(\theta \mid \alpha, \beta)\ &\cdots\ \text{Prior distribution} \\
f_{Binomial}(x \mid \theta)\ &\cdots\ \text{Likelihood function}
\end{aligned}
$$
$$
\begin{aligned}
f_{Posterior}(\theta \mid x) &= \frac{f_{Binomial}(x \mid \theta)\, f_{Beta}(\theta \mid \alpha, \beta)}{f_{Binomial}(x)} \\
&= \frac{f_{Binomial}(x \mid \theta)\, f_{Beta}(\theta \mid \alpha, \beta)}{\int_{0}^{1} f_{Binomial}(x \mid \theta)\, f_{Beta}(\theta \mid \alpha, \beta)\, d\theta} \\
&= \frac{{}_{n}C_{x}\, \theta^{x}(1-\theta)^{n-x}\, \frac{1}{B(\alpha, \beta)}\, \theta^{\alpha-1}(1-\theta)^{\beta-1}}{\int_{0}^{1} {}_{n}C_{x}\, \theta^{x}(1-\theta)^{n-x}\, \frac{1}{B(\alpha, \beta)}\, \theta^{\alpha-1}(1-\theta)^{\beta-1}\, d\theta} \\
&= \frac{{}_{n}C_{x}\, \frac{1}{B(\alpha, \beta)}\, \theta^{\alpha+x-1}(1-\theta)^{\beta+n-x-1}}{{}_{n}C_{x}\, \frac{1}{B(\alpha, \beta)} \int_{0}^{1} \theta^{\alpha+x-1}(1-\theta)^{\beta+n-x-1}\, d\theta} \\
&= \frac{1}{B(\alpha+x, \beta+n-x)}\, \theta^{\alpha+x-1}(1-\theta)^{\beta+n-x-1}
\end{aligned}
$$

because

$$
\frac{1}{B(m, n)} \int_{0}^{1} \theta^{m-1}(1-\theta)^{n-1}\, d\theta = 1
$$

therefore

$$
\int_{0}^{1} \theta^{\alpha+x-1}(1-\theta)^{\beta+n-x-1}\, d\theta = B(\alpha+x, \beta+n-x)
$$

As shown above, the posterior distribution ends up in the same family as the prior: it is again a Beta distribution, with updated parameters α + x and β + n − x.
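The algebra above can be sanity-checked numerically. A minimal sketch, where the prior parameters and data (Beta(2, 3), n = 10, x = 4) are illustrative choices of mine rather than values from the derivation:

```python
# Conjugacy check: a Beta(a, b) prior with a Binomial(n, theta) likelihood
# should yield the posterior Beta(a + x, b + n - x).
from math import comb, gamma

def beta_pdf(t, a, b):
    """Beta(a, b) density: t^(a-1) (1-t)^(b-1) / B(a, b)."""
    B = gamma(a) * gamma(b) / gamma(a + b)
    return t ** (a - 1) * (1 - t) ** (b - 1) / B

def binom_pmf(x, n, t):
    """Binomial likelihood: C(n, x) t^x (1-t)^(n-x)."""
    return comb(n, x) * t ** x * (1 - t) ** (n - x)

a, b, n, x = 2, 3, 10, 4  # illustrative prior and data

# Posterior via Bayes' rule, normalized numerically on a grid ...
grid = [i / 1000 for i in range(1, 1000)]
dt = 1 / 1000
unnorm = [binom_pmf(x, n, t) * beta_pdf(t, a, b) for t in grid]
Z = sum(unnorm) * dt
numeric = [u / Z for u in unnorm]

# ... agrees with the analytic conjugate update Beta(a + x, b + n - x).
analytic = [beta_pdf(t, a + x, b + n - x) for t in grid]
print(max(abs(p - q) for p, q in zip(numeric, analytic)) < 1e-3)  # → True
```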

Non-informative prior distribution

Non-informative prior distributions are often used in Bayesian analysis when there is little or no prior knowledge about the parameters of interest. For a probability parameter, a common non-informative prior is the Beta(1, 1) distribution, which is the Uniform distribution on [0, 1].

The Uniform distribution assigns equal probability to all values within its support, which in this case is the interval [0, 1]. The probability density function of the Beta(1, 1) distribution is a constant function with value 1, which reflects the fact that all values within the support are equally likely.

$$
f_{Beta(1, 1)}(x) = \frac{1}{B(1, 1)}\, x^{0}(1-x)^{0} = 1
$$

When this non-informative prior is used, the posterior distribution after observing data x is simply the Beta distribution with parameters 1 + x and 1 + n − x. This means that the posterior distribution is updated in a straightforward manner, without being influenced by any strong prior beliefs.
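Concretely, for the n = 10, x = 2 case shown in the graphs below, the update rule works out as:

```python
# Uniform prior Beta(1, 1) updated with binomial data: n trials, x successes.
n, x = 10, 2
a_post, b_post = 1 + x, 1 + n - x
print(a_post, b_post)  # → 3 9

# The posterior mean a / (a + b) moves from the prior mean 0.5
# toward the observed frequency x / n = 0.2.
print(a_post / (a_post + b_post))  # → 0.25
```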

The reason the Uniform distribution is considered non-informative in this Bayesian context is that it does not favor any particular value. Instead, it assigns equal probability to all possible values within its support, which makes it a suitable choice when there is no prior knowledge, or when one wants to avoid introducing bias toward particular values.

The graph below shows the probability density function of the Beta(1, 1) distribution, which is a flat line with value 1 over the interval [0, 1].

Graph image for Beta(1, 1)

The following graph shows the case where the likelihood function has n = 10 and x = 2.

Graph image for changed Beta function
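The two graphs can be reproduced with a short script along these lines; this is a sketch assuming matplotlib is installed, and the output file name `beta_update.png` is an arbitrary choice:

```python
# Plot the flat Beta(1, 1) prior next to the Beta(3, 9) posterior
# obtained from n = 10, x = 2.
from math import gamma

import numpy as np
import matplotlib
matplotlib.use("Agg")          # render without a display
import matplotlib.pyplot as plt

def beta_pdf(t, a, b):
    """Beta(a, b) density: t^(a-1) (1-t)^(b-1) / B(a, b)."""
    return t ** (a - 1) * (1 - t) ** (b - 1) * gamma(a + b) / (gamma(a) * gamma(b))

theta = np.linspace(0.001, 0.999, 500)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(theta, beta_pdf(theta, 1, 1))
ax1.set_title("Prior: Beta(1, 1)")
ax2.plot(theta, beta_pdf(theta, 3, 9))
ax2.set_title("Posterior: Beta(3, 9)")
fig.savefig("beta_update.png")
```

The posterior's peak sits near θ = 0.2, matching the observed frequency x / n.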

The shape of the distribution changes, showing that the non-informative prior has been updated by the likelihood function and the observed data, much like we update our beliefs when new information arrives.

This concludes a brief mathematical explanation of the prior distribution and how it is updated.
