Pre 1.4 Mean and variance
Course subject(s): Pre-knowledge Mathematics
MEAN OR EXPECTATION
The empirical or sample mean based on \( N \) outcomes can be computed as:
\[ \begin{align} \bar{y} &= \frac{1}{N} \sum_{i=1}^{N} y_i \\ &= \sum_{j=1}^{M} y_j \frac{n_j}{N} \\ &\approx \sum_{j=1}^{M} y_j f_{\underline{y}}(y_j)\,\Delta y \end{align} \]
where \(M\) is the number of intervals (or bins) into which the range of outcomes is divided, \( n_j \) is the number of outcomes falling in bin \(j\), \(y_j\) is the centre of bin \(j\), and \(\Delta y\) is the bin width, so that \( n_j / N \approx f_{\underline{y}}(y_j)\,\Delta y \).
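The equivalence of the direct and the binned computation of the sample mean can be sketched as follows (a minimal illustration; the data set, seed, and bin count are arbitrary choices, not part of the course material):

```python
import random

random.seed(42)

# Hypothetical data set: N outcomes of a random variable.
N = 10_000
y = [random.gauss(5.0, 2.0) for _ in range(N)]

# Direct sample mean: (1/N) * sum over all outcomes.
mean_direct = sum(y) / N

# Binned sample mean: divide the range into M bins and weight each
# bin centre y_j by the relative frequency n_j / N.
M = 50
lo, hi = min(y), max(y)
width = (hi - lo) / M
counts = [0] * M
for v in y:
    j = min(int((v - lo) / width), M - 1)  # clamp the maximum into the last bin
    counts[j] += 1
centers = [lo + (j + 0.5) * width for j in range(M)]
mean_binned = sum(c * n / N for c, n in zip(centers, counts))

print(round(mean_direct, 2), round(mean_binned, 2))
```

The two results differ by at most half a bin width, since binning replaces each outcome by the centre of its bin.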
By analogy, the theoretical mean or expected value of a continuous random variable follows as:
\[ \bar{y} = \int_{-\infty}^{\infty} y f_{\underline{y}}(y)dy \]
The mean is the first moment and is also referred to as the expectation of the random variable, since the mean of a large number of outcomes is expected to be close to \( \bar{y} \).
Notation: \( E\{\underline{y}\} = \bar{y} \).
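The integral defining the expectation can be evaluated numerically, which may help build intuition. A minimal sketch, assuming an exponential pdf \( f_{\underline{y}}(y) = \lambda e^{-\lambda y} \) as an illustrative example (its theoretical mean is \( 1/\lambda \)):

```python
import math

# Illustrative pdf (an assumption, not from the course text):
# exponential with rate lam, whose theoretical mean is 1/lam.
lam = 2.0
def pdf(y):
    return lam * math.exp(-lam * y)

# Midpoint-rule approximation of  integral of y * f(y) dy  over [0, 20];
# the tail of this pdf beyond 20 is negligible.
n, a, b = 200_000, 0.0, 20.0
h = (b - a) / n
mean = sum((a + (k + 0.5) * h) * pdf(a + (k + 0.5) * h) for k in range(n)) * h

print(round(mean, 4))  # close to 1/lam = 0.5
```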
VARIANCE
The outcomes or realizations of a random variable will by definition exhibit a certain spread; they will fluctuate around the mean \( \bar{y} \). The variance \( \sigma_y^2\) or dispersion \( D\{\underline{y}\} \) of a random variable \(\underline{y} \) is a measure of these fluctuations around the mean, and is defined by:
\[ \begin{align} \sigma_y^2 &= D\{\underline{y} \} \\ &= E\{ (\underline{y} -\bar{y} )^2 \} \\ &= \int_{-\infty}^{\infty} (y -\bar{y} )^2 f_{\underline{y}}(y)dy \end{align} \]
Hence, the variance equals the expectation of the squared deviations from the mean value. The variance is the second central moment.
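The variance integral can be evaluated numerically in the same way as the mean. A sketch, again assuming an exponential pdf with rate \( \lambda \) as an illustrative example (its theoretical variance is \( 1/\lambda^2 \)):

```python
import math

# Illustrative pdf (an assumption): exponential with rate lam,
# for which the mean is 1/lam and the variance is 1/lam**2.
lam = 2.0
def pdf(y):
    return lam * math.exp(-lam * y)

# Midpoint-rule approximation over [0, 20]; the tail beyond 20 is negligible.
n, a, b = 200_000, 0.0, 20.0
h = (b - a) / n
mids = [a + (k + 0.5) * h for k in range(n)]

# First moment (the mean), then the second central moment (the variance).
mean = sum(y * pdf(y) for y in mids) * h
var = sum((y - mean) ** 2 * pdf(y) for y in mids) * h

print(round(var, 4))  # close to 1/lam**2 = 0.25
```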
In analogy with the formula for the sample mean, the sample variance \( \hat{\sigma}_y^2 \) is computed as:
\[ \hat{\sigma}_y^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i -\bar{y} )^2 \]
Note that here we divide by \(N-1\) instead of \(N\). Dividing by \(N\) would yield a sample variance that on average deviates from (in fact, underestimates) the true variance. By dividing by \(N-1\) we guarantee that if you would repeatedly determine the sample variance based on new sets of \(N\) observations, the average of these sample variances equals the true variance; such an estimator is called unbiased.
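The effect of dividing by \(N-1\) versus \(N\) can be checked by simulation (a sketch; the distribution, sample size, and number of trials are arbitrary choices):

```python
import random

random.seed(1)

# Repeatedly draw samples of size N from a distribution with known variance
# and average the sample variances computed with the two divisors.
true_var = 4.0           # gauss(0, 2) has variance 2**2 = 4
N, trials = 5, 20_000    # a small N makes the bias of the 1/N divisor visible

def sample_var(y, divisor):
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y) / divisor

unbiased, biased = 0.0, 0.0
for _ in range(trials):
    y = [random.gauss(0.0, 2.0) for _ in range(N)]
    unbiased += sample_var(y, N - 1)   # divide by N-1
    biased += sample_var(y, N)         # divide by N

unbiased /= trials
biased /= trials

# The N-1 version averages close to the true variance of 4.0;
# the N version averages close to (N-1)/N * 4.0 = 3.2, i.e. it underestimates.
print(round(unbiased, 2), round(biased, 2))
```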
The standard deviation \(\sigma_y \) of random variable \( \underline{y} \) is given by the square root of its variance.
Observation Theory: Estimating the Unknown by TU Delft OpenCourseWare is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://ocw.tudelft.nl/courses/observation-theory-estimating-unknown.