What is a Moment?
The \(n^{th}\) moment of a random variable \(X\) is the expected value of \(X^n\), denoted \(E(X^n)\).
The first moment is \(E(X)\), the mean of the distribution. The second moment, \(E(X^2)\), forms part of the expression for variance: \(\operatorname{Var}(X) = E(X^2)-E(X)^2\).
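(In case that variance formula looks unfamiliar, it is a standard identity that follows from expanding the definition of variance with linearity of expectation; shown here for reference:)
\[\begin{align*} \operatorname{Var}(X) = E\left[(X - E(X))^2\right] = E(X^2) - 2E(X)E(X) + E(X)^2 = E(X^2) - E(X)^2 \end{align*}\]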
They are, in essence, measures of the shape of the distribution. If your distribution is bounded (i.e., it doesn’t stretch infinitely to either side), it has a well-defined sequence of moments \(E(X), E(X^2), E(X^3), \dots\), and that sequence uniquely determines the distribution.
What is a Moment Generating Function?
A Moment Generating Function (MGF) is a single function from which every moment of a distribution can be generated.
The MGF for a continuous random variable is
\( M(t) = E[e^{tx}] = \int_{x} e^{tx} f(x) \; dx \)
and the MGF for a discrete random variable is
\( M(t) = E[e^{tx}] = \sum_{x} e^{tx} f(x) \)
Once you find $M(t)$, take the $n^{th}$ derivative and plug in $0$ for $t$ to get your $n^{th}$ moment: $E(X^n) = M^{(n)}(0)$.
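To make the recipe concrete, here is a minimal sympy sketch (my own addition, not from the original text) that recovers moments by differentiating an MGF. It uses the standard normal MGF \(M(t) = e^{t^2/2}\) as a test case, since its moments \(0, 1, 0, 3, \dots\) are well known:

```python
import sympy as sp

t = sp.symbols('t')
M = sp.exp(t**2 / 2)  # MGF of the standard normal distribution

# The n-th moment is the n-th derivative of M(t) evaluated at t = 0.
for n in range(1, 5):
    moment = sp.diff(M, t, n).subs(t, 0)
    print(f"E(X^{n}) = {moment}")  # prints 0, 1, 0, 3
```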
Proof
So why and how does this work? We can look to the Taylor series for \(e^x\) for our answer.
We know that
\[\begin{align*} e^x &= 1 + x + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \dots\\ \implies e^{tx} &= 1 + tx + \frac{(tx)^2}{2!} + \dots + \frac{(tx)^n}{n!} + \dots \end{align*}\]If we take the expected value of \(e^{tx}\), we get
\[\begin{align*} E(e^{tx}) &= E(1) + E(tx) + E\left(\frac{(tx)^2}{2!}\right) + \dots + E\left(\frac{(tx)^n}{n!}\right) + \dots \\ &= E(1) + tE(X) + \frac{t^2}{2!}E(X^2) + \dots + \frac{t^n}{n!}E(X^n) + \dots\\ \end{align*}\]using the linearity of expectation. Then, if we take the derivative of the expression (with respect to \(t\)), we get
\[\begin{align*} \frac{d}{dt}E(e^{tx}) &= \frac{d}{dt}E(1) + \frac{d}{dt}tE(X) + \frac{d}{dt}\frac{t^2}{2!}E(X^2) + \dots + \frac{d}{dt}\frac{t^n}{n!}E(X^n) + \dots\\ &= 0 + E(X) + tE(X^2) + \dots + \frac{nt^{n-1}}{n!}E(X^n) + \dots\\ & \textit{plug in $t=0$} \\ &= 0 + E(X) + 0 + \dots + 0 + \dots\\ &= E(X) \end{align*}\]Thus proving that \( M'(t)\big|_{t=0} = E(X) \). So, we’ve proven the first derivative; what about the rest? Look at the pattern that occurs as we take the derivative. First, each term’s fraction reduces to that of the previous term. By the power rule, the $n$ in the exponent moves into the numerator of the fraction:
\(\frac{d}{dt}\frac{t^n}{n!} = \frac{nt^{n-1}}{n!}\)
This $n$ now cancels with the leading factor of the factorial in the denominator, reducing it to \((n-1)!\):
\(\frac{t^{n-1}}{(n-1)!}\)
This is now the coefficient of the $(n-1)^{th}$ term.
Second, notice how only one term “survives” after we plug in $t=0$: after taking $n$ derivatives, every earlier term has already been differentiated to $0$, and every later term still carries a factor of $t$, which turns it into $0$ once $t=0$ is plugged in.
So the first few derivatives of the series are
\[\begin{align*} \frac{d}{dt}E(e^{tx}) &= 0 + E(X) + tE(X^2) + \dots + \frac{t^{n-1}}{(n-1)!}E(X^n) + \dots\\ \frac{d^2}{dt^2}E(e^{tx}) &= 0 + 0 + E(X^2) + \dots + \frac{t^{n-2}}{(n-2)!}E(X^n) + \dots\\ \frac{d^3}{dt^3}E(e^{tx}) &= 0 + 0 + 0 + E(X^3) + \dots + \frac{t^{n-3}}{(n-3)!}E(X^n) + \dots \end{align*}\]and then when we plug in $t=0$:
\[\begin{align*} \frac{d}{dt}E(e^{tx}) &= 0 + E(X) + 0 + \dots + 0 + \dots\\ &= E(X)\\ \frac{d^2}{dt^2}E(e^{tx}) &= 0 + 0 + E(X^2) + \dots + 0 + \dots\\ &= E(X^2)\\ \frac{d^3}{dt^3}E(e^{tx}) &= 0 + 0 + 0 + E(X^3) + \dots + 0 + \dots\\ &= E(X^3) \end{align*}\]In general, the \(n^{th}\) derivative evaluated at \(t=0\) leaves only the \(n^{th}\) moment: \(M^{(n)}(0) = E(X^n)\).
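To see this surviving-term pattern concretely, here is a small sympy sketch (my own illustration, not part of the original proof). It builds the truncated series with the moments left as unknown symbols \(m_1, \dots, m_4\) and confirms that each derivative at \(t=0\) picks out exactly one of them:

```python
import sympy as sp

t = sp.symbols('t')
# Leave the moments E(X), E(X^2), E(X^3), E(X^4) as unknown symbols.
m = sp.symbols('m1:5')  # m1, m2, m3, m4

# Truncated series for E(e^{tX}) = 1 + t*E(X) + t^2/2! * E(X^2) + ...
M = 1 + sum(m[k - 1] * t**k / sp.factorial(k) for k in range(1, 5))

# Each derivative, evaluated at t = 0, leaves exactly one surviving moment.
for n in range(1, 5):
    print(f"M^({n})(0) =", sp.diff(M, t, n).subs(t, 0))
# prints m1, m2, m3, m4 in turn
```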
Examples
Example 1: Continuous Random Variable
Let’s take a continuous random variable with density \(f(x)=e^{-x}\) for \(x\ge 0\) and \(0\) otherwise (the exponential distribution with rate \(1\)). The moment generating function is then
\[\begin{align*} M(t) = E[e^{tx}] &= \int_{0}^{\infty} e^{tx} f(x) \; dx \\ &= \int_{0}^{\infty} e^{tx} e^{-x} \; dx \\ &= \lim_{n\to \infty}\int_{0}^n e^{(t-1)x}\;dx\\ &= \lim_{n\to \infty} \left[ \frac{1}{t-1}e^{(t-1)x}\right]_{x=0}^n\\ &= \lim_{n\to \infty} \left(\frac{1}{t-1}e^{(t-1)n}-\frac{1}{t-1}\right)\\ \end{align*}\]From here there are three cases:
- For \(t>1\), the term \(e^{(t-1)n}\) diverges to infinity as \(n\) increases. Because the limit diverges, the function is undefined
- For \(t = 1\), the term \(\frac{1}{t-1}\) is undefined. Thus, the function is undefined here as well
- For \(t<1\), the term \(e^{(t-1)n}\) converges to \(0\), so the function is defined
So, the Moment Generating Function for \(t<1\) is
\( M(t) = \frac{1}{t-1}(0)-\frac{1}{t-1} = (1-t)^{-1} \)
Also for \(t<1\), the first and second derivatives are
\(M'(t) = \frac{d}{dt} (1-t)^{-1} = (1-t)^{-2}\)
\(M''(t) = \frac{d^2}{dt^2} (1-t)^{-1} = 2(1-t)^{-3}\)
If we are trying to find the expectation for this function in the defined range, we plug in \(0\) for \(t\) in \(M'(t)\), which gives us an expectation of \(M'(0) = (1-0)^{-2} = 1\).
For variance, we would use the equation
\( M''(0) - M'(0)^2 \)
which gives us a variance of \(2 - 1 = 1\)
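As a sanity check (my own addition, assuming the MGF derived above), a few lines of sympy confirm these values:

```python
import sympy as sp

t = sp.symbols('t')
M = (1 - t)**-1  # MGF derived above, valid for t < 1

mean = sp.diff(M, t).subs(t, 0)        # M'(0)  = 1
second = sp.diff(M, t, 2).subs(t, 0)   # M''(0) = 2

print("E(X)   =", mean)                # 1
print("Var(X) =", second - mean**2)    # 2 - 1 = 1
```

Both match the known mean and variance of the rate-1 exponential distribution.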
Example 2: Discrete Random Variable
Given a random variable $Y$ such that
| $y$ | $P(Y=y)$ |
|---|---|
| 0 | 0.5 |
| 1 | 0.25 |
| 2 | 0.25 |
The MGF is
\[\begin{align*} M(t) = \sum_y e^{ty}P(Y=y) = 0.5 + 0.25e^{t} + 0.25e^{2t} \end{align*}\]So the expectation is
\[\begin{align*} M'(t)\Big|_{t=0} = \frac{d}{dt}\left[ \sum_y e^{ty}P(Y=y) \right]\Big|_{t=0} = \left[0.25e^t + 0.5e^{2t}\right]\Big|_{t=0} = 0.75 \end{align*}\]and the variance is
\[\begin{align*} M''(t)\Big|_{t=0} - E(Y)^2 &= \frac{d^2}{dt^2}\left[ \sum_y e^{ty}P(Y=y) \right]\Big|_{t=0} - 0.75^2\\ &= \left[0.25e^t + e^{2t}\right]\Big|_{t=0} - 0.5625\\ &= 1.25 - 0.5625 = 0.6875 \end{align*}\]
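As a final check (my own addition, not in the original write-up), the same numbers fall out of a short sympy script built from the table above:

```python
import sympy as sp

t = sp.symbols('t')
# PMF from the table: P(Y=0) = 1/2, P(Y=1) = 1/4, P(Y=2) = 1/4
pmf = {0: sp.Rational(1, 2), 1: sp.Rational(1, 4), 2: sp.Rational(1, 4)}

# Build the MGF as a finite sum: M(t) = sum_y e^{ty} P(Y=y)
M = sum(p * sp.exp(t * y) for y, p in pmf.items())

mean = sp.diff(M, t).subs(t, 0)         # M'(0)  = 3/4  = 0.75
second = sp.diff(M, t, 2).subs(t, 0)    # M''(0) = 5/4  = 1.25

print("E(Y)   =", mean)                 # 3/4
print("Var(Y) =", second - mean**2)     # 11/16 = 0.6875
```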