The Gaussian distribution has many interesting properties, several of which make
it useful across a wide range of applications. Before moving further, let us
define the univariate PDF with mean $\mu$ and variance $\sigma^2$:

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
In the general multi-dimensional case, the mean becomes a mean vector $\boldsymbol{\mu}$, and the variance turns into
a covariance matrix $\Sigma$. The PDF then becomes

$$p(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^d \lvert\Sigma\rvert}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$

where $\lvert\Sigma\rvert$ is the determinant of the covariance matrix $\Sigma$. The
term $(\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})$ in the exponent is the squared Mahalanobis distance between $\mathbf{x}$ and $\boldsymbol{\mu}$, and is a useful quantity to study in its own right.
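To make the formula concrete, here is a minimal NumPy sketch that evaluates this density exactly as written above; the function name `mvn_pdf` and the example parameters are illustrative, not a standard API:

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) at x, following the formula above."""
    d = len(mu)
    diff = x - mu
    # Squared Mahalanobis distance: (x - mu)^T Sigma^{-1} (x - mu).
    maha_sq = diff @ np.linalg.solve(Sigma, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * maha_sq) / norm

# Example: evaluate a 2D Gaussian at a single point.
mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
print(mvn_pdf(np.array([0.5, -0.5]), mu, Sigma))
```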
The first property of the Gaussian states that if $X \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)$, then $Y = AX + \mathbf{b}$ is also a Gaussian, specifically $Y \sim \mathcal{N}(A\boldsymbol{\mu} + \mathbf{b}, A \Sigma A^T)$. We can prove this using the definitions of
mean and covariance. The mean of $Y$ (denoted $\boldsymbol{\mu}_Y$) can be derived simply
from the linearity of expectation, that is

$$\boldsymbol{\mu}_Y = E[Y] = E[AX + \mathbf{b}] = A\,E[X] + \mathbf{b} = A\boldsymbol{\mu} + \mathbf{b}.$$

And now for the covariance, we again substitute into the definition
of covariance and get

$$\Sigma_Y = E\left[(Y - \boldsymbol{\mu}_Y)(Y - \boldsymbol{\mu}_Y)^T\right] = E\left[A(X - \boldsymbol{\mu})(X - \boldsymbol{\mu})^T A^T\right] = A\,E\left[(X - \boldsymbol{\mu})(X - \boldsymbol{\mu})^T\right] A^T$$

and thus $\Sigma_Y = A \Sigma A^T$, which gives the final result of

$$Y \sim \mathcal{N}(A\boldsymbol{\mu} + \mathbf{b}, A \Sigma A^T).$$
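As a quick sanity check, here is a small NumPy sketch, with made-up values for $A$, $\mathbf{b}$, $\boldsymbol{\mu}$, and $\Sigma$, verifying the affine property empirically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up example parameters: X ~ N(mu, Sigma) and the map Y = A X + b.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
b = np.array([0.5, -1.0])

# Push many samples of X through the affine map.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T + b

print(Y.mean(axis=0), A @ mu + b)  # empirical mean vs. A mu + b
print(np.cov(Y, rowvar=False))     # should approach A Sigma A^T
print(A @ Sigma @ A.T)
```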
Sampling from a Gaussian
We can immediately make use of the affine property to define how to sample from
a multivariate Gaussian. We’ll make use of the Cholesky decomposition, which for
a positive-definite matrix $\Sigma$ returns a lower triangular matrix $L$, such that $\Sigma = L L^T$.
This together with the affine property defined above gives us

$$\mathcal{N}(\boldsymbol{\mu}, \Sigma) = \boldsymbol{\mu} + L\, \mathcal{N}(\mathbf{0}, I).$$

Sampling from the former is thus equivalent to sampling from the latter. Since
$\boldsymbol{\mu}$ and $L$ are constant factors with respect to sampling, we simply have to
figure out how to draw samples from $\mathcal{N}(\mathbf{0}, I)$ and then do the affine
transform back to our original distribution.
Observe that since the covariance of $\mathcal{N}(\mathbf{0}, I)$ is diagonal, the individual
values in the random vector are independent. Note that this property is special to the
Gaussian and is a little bit tricky: in general, uncorrelated random variables need
not be independent, but random variables that are jointly Gaussian and uncorrelated
are independent, which is exactly our case.
Finally, because the variables are independent, we can sample them independently,
which can be done easily using the Box-Muller transform.
Once we obtain our independent samples, we simply multiply by $L$ and add
$\boldsymbol{\mu}$ to obtain correlated samples from our original distribution.
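Putting the pieces together, here is a minimal sketch of the whole procedure in NumPy; the names `box_muller` and `sample_mvn` and the example parameters are just for illustration:

```python
import numpy as np

def box_muller(n, rng):
    """Turn uniform randoms into n i.i.d. N(0, 1) samples."""
    # Use 1 - U so the argument of log lies in (0, 1] and is never 0.
    u1 = 1.0 - rng.random(n)
    u2 = rng.random(n)
    return np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)

def sample_mvn(mu, Sigma, n, rng):
    """Sample n points from N(mu, Sigma) via x = mu + L z, L L^T = Sigma."""
    L = np.linalg.cholesky(Sigma)             # lower triangular factor
    d = len(mu)
    Z = box_muller(n * d, rng).reshape(n, d)  # independent N(0, 1) entries
    return mu + Z @ L.T                       # affine transform back

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

samples = sample_mvn(mu, Sigma, 200_000, rng)
print(samples.mean(axis=0))           # should be close to mu
print(np.cov(samples, rowvar=False))  # should be close to Sigma
```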
Sum of two independent Gaussians is a Gaussian
If $X$ and $Y$ are independent random variables with Gaussian distributions, where $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$ and $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$, then

$$X + Y \sim \mathcal{N}(\mu_X + \mu_Y,\ \sigma_X^2 + \sigma_Y^2).$$
This can be proven in many different ways, the simplest of which is probably using
moment generating functions. With the moment generating function of a Gaussian

$$M_X(t) = \exp\left(\mu t + \frac{1}{2}\sigma^2 t^2\right)$$

and using the property of moment generating functions which says how to combine
two independent variables $X$ and $Y$, specifically

$$M_{X+Y}(t) = M_X(t)\, M_Y(t),$$

we can simply plug in our moment generating function for the Gaussian and get

$$M_{X+Y}(t) = \exp\left(\mu_X t + \frac{1}{2}\sigma_X^2 t^2\right) \exp\left(\mu_Y t + \frac{1}{2}\sigma_Y^2 t^2\right) = \exp\left((\mu_X + \mu_Y)\, t + \frac{1}{2}\left(\sigma_X^2 + \sigma_Y^2\right) t^2\right),$$

which is exactly the moment generating function of $\mathcal{N}(\mu_X + \mu_Y,\ \sigma_X^2 + \sigma_Y^2)$.
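A quick Monte Carlo sketch (with made-up parameters) confirms the result numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up parameters for two independent Gaussians.
mu_x, sigma_x = 1.0, 2.0
mu_y, sigma_y = -3.0, 0.5

z = rng.normal(mu_x, sigma_x, 500_000) + rng.normal(mu_y, sigma_y, 500_000)

print(z.mean())  # should approach mu_x + mu_y = -2.0
print(z.var())   # should approach sigma_x**2 + sigma_y**2 = 4.25
```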
Deriving the normalizing constant
We can compute the Gaussian integral using polar coordinates. Consider the zero-mean, unit-variance case and let

$$I = \int_{-\infty}^{\infty} e^{-x^2/2}\, dx, \qquad I^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\, dx\, dy.$$

And now comes an important trick: we’ll do a polar coordinate substitution,
since $x^2 + y^2 = r^2$ in polar coordinates (with $dx\, dy = r\, dr\, d\theta$), giving

$$I^2 = \int_0^{2\pi} \int_0^{\infty} e^{-r^2/2}\, r\, dr\, d\theta.$$

Now substituting $u = -\frac{r^2}{2}$ and $du = -r\, dr$, giving us

$$\int_0^{\infty} e^{-r^2/2}\, r\, dr = \int_{-\infty}^{0} e^{u}\, du = 1.$$

Finally, combining this with the initial integral we get

$$I^2 = \int_0^{2\pi} 1\, d\theta = 2\pi,$$

and as a result

$$I = \int_{-\infty}^{\infty} e^{-x^2/2}\, dx = \sqrt{2\pi},$$

which is exactly the normalizing constant of the unit-variance PDF.
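We can also confirm this numerically; here is a one-line quadrature check, assuming SciPy is available:

```python
import numpy as np
from scipy.integrate import quad

# Numerically confirm the Gaussian integral equals sqrt(2 * pi) ~ 2.5066.
value, _ = quad(lambda x: np.exp(-x**2 / 2.0), -np.inf, np.inf)
print(value, np.sqrt(2.0 * np.pi))
```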
Deriving the mean and standard deviation
Lastly, while not necessarily a property of the Gaussian, it is a useful
exercise to derive the mean and standard deviation from the PDF. Once again,
the PDF is

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

and the general formula for $E[X]$ is

$$E[X] = \int_{-\infty}^{\infty} x\, p(x)\, dx = \int_{-\infty}^{\infty} x\, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) dx.$$

We can pull the constant out of the integral and substitute $y = x - \mu$ and $dy = dx$, giving us

$$E[X] = \frac{1}{\sqrt{2\pi\sigma^2}} \left( \int_{-\infty}^{\infty} y\, e^{-y^2/(2\sigma^2)}\, dy + \mu \int_{-\infty}^{\infty} e^{-y^2/(2\sigma^2)}\, dy \right).$$

Here we note that the first function being integrated, $y\, e^{-y^2/(2\sigma^2)}$, is odd, which means its
integral adds up to $0$, and we’re left with only the $\mu$ term, that is

$$E[X] = \frac{\mu}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} e^{-y^2/(2\sigma^2)}\, dy = \frac{\mu}{\sqrt{2\pi\sigma^2}} \sqrt{2\pi\sigma^2} = \mu,$$

which is what we wanted to prove.
Now for the variance, which is defined as

$$\mathrm{var}(X) = E\left[(X - \mu)^2\right],$$

which written again as an integral gives us

$$\mathrm{var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2\, \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) dx.$$

Again pulling out the constant and substituting $y = x - \mu$ and $dy = dx$, we get

$$\mathrm{var}(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} y^2\, e^{-y^2/(2\sigma^2)}\, dy.$$

Integrating by parts using the formula $\int u\, dv = uv - \int v\, du$, where we set

$$u = y, \qquad dv = y\, e^{-y^2/(2\sigma^2)}\, dy.$$

To get $v$ we have to compute the integral of $y\, e^{-y^2/(2\sigma^2)}$, which we can easily do by substituting
$t = -\frac{y^2}{2\sigma^2}$ and $dt = -\frac{y}{\sigma^2}\, dy$, giving us

$$v = \int y\, e^{-y^2/(2\sigma^2)}\, dy = -\sigma^2 \int e^{t}\, dt = -\sigma^2 e^{-y^2/(2\sigma^2)}.$$

Now finishing our integration by parts (the boundary term vanishes at $\pm\infty$) we can write out the final formula

$$\mathrm{var}(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \left( \left[ -\sigma^2 y\, e^{-y^2/(2\sigma^2)} \right]_{-\infty}^{\infty} + \sigma^2 \int_{-\infty}^{\infty} e^{-y^2/(2\sigma^2)}\, dy \right) = \frac{\sigma^2}{\sqrt{2\pi\sigma^2}} \sqrt{2\pi\sigma^2} = \sigma^2.$$

That is, $\mathrm{var}(X) = \sigma^2$, and the standard deviation is $\sigma$.
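Both results are easy to check by direct quadrature; here is a short sketch with made-up parameters, assuming SciPy is available:

```python
import numpy as np
from scipy.integrate import quad

# Made-up parameters; check E[X] = mu and var(X) = sigma^2 numerically.
mu, sigma = 1.5, 2.0

def pdf(x):
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

mean, _ = quad(lambda x: x * pdf(x), -np.inf, np.inf)
var, _ = quad(lambda x: (x - mu) ** 2 * pdf(x), -np.inf, np.inf)
print(mean, var)  # ~ 1.5 and ~ 4.0
```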
If you have any questions, feedback, or suggestions, please do share them in the comments! I'll try to answer each and every one. If something in the article wasn't clear, don't be afraid to mention it. The goal of these articles is to be as informative as possible.