3.2 Summary
Course subject(s)
3. Least Squares Estimation (LSE)
Weighted Least Squares Estimation
In ordinary least squares estimation, we assume that all observations are equally important. In many cases this is not realistic, as observations may be obtained by different measurement systems or under different circumstances. We therefore want our least squares methodology to be able to take this into account.
We achieve this goal by introducing a weight matrix in the normal equation. In the unweighted least squares approach, we arrive at the normal equation by pre-multiplying both sides of \(y=Ax\) with the transpose of the design matrix \(A^T\):
\[ y=Ax \; \rightarrow \; A^T\; y = A^T\; A x \]
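As a quick numerical illustration, the sketch below forms and solves the (unweighted) normal equation with NumPy. The design matrix and observation vector are made up purely for this example.

```python
import numpy as np

# Illustrative data (values chosen only for this sketch):
# three observations y and two unknowns x, related by y = A x.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.1, 1.9, 3.2])

# Pre-multiply both sides by A^T to obtain the normal equation
# A^T y = A^T A x, then solve it for the (unweighted) estimate.
N = A.T @ A            # normal matrix
rhs = A.T @ y          # right-hand side
x_hat = np.linalg.solve(N, rhs)
print(x_hat)
```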
In the weighted least squares approach, we add weight matrix \(W\) to this pre-multiplication factor, i.e., \( A^T W\), to obtain the normal equation:
\[ y=Ax \; \rightarrow \; A^T W \; y = A^T W \; A x\]
The normal matrix is now defined as \(N=A^T W A\). Assuming that \(N\) is invertible (non-singular), we find the weighted least squares estimate \( \hat{x} \):
\[ \hat{x} = (A^T W A)^{-1} A^T W y\]
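A minimal sketch of this estimator, using the same illustrative data as above together with a diagonal weight matrix chosen only for demonstration:

```python
import numpy as np

# Same illustrative data as before; the diagonal weight matrix W is an
# assumption for this sketch (the third observation gets twice the weight).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.1, 1.9, 3.2])
W = np.diag([1.0, 1.0, 2.0])

# Weighted normal equation A^T W y = A^T W A x, solved for x_hat.
N = A.T @ W @ A
x_hat = np.linalg.solve(N, A.T @ W @ y)
print(x_hat)
```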
We also find the derived estimates \( \hat{y} \) and \( \hat{e} \):
\[\begin{align} \hat{y} &= A \hat{x} \\ &= A (A^T W A )^{-1} A^T W y\\ \hat{e} &= y - \hat{y}\\ &= y - A \hat{x} = y-A (A^T W A )^{-1} A^T W y\\ &= (I- A (A^T W A )^{-1} A^T W) y\end{align} \]
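Continuing the same illustrative sketch, the derived estimates \( \hat{y} \) and \( \hat{e} \) can be computed directly, and the last expression above (the residuals written as a linear function of \(y\)) can be checked numerically:

```python
import numpy as np

# Same illustrative data and weights as in the previous sketches.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.1, 1.9, 3.2])
W = np.diag([1.0, 1.0, 2.0])

N_inv = np.linalg.inv(A.T @ W @ A)   # (A^T W A)^{-1}
x_hat = N_inv @ A.T @ W @ y

y_hat = A @ x_hat                    # adjusted observations
e_hat = y - y_hat                    # residuals

# Check the equivalent form e_hat = (I - A (A^T W A)^{-1} A^T W) y.
e_hat_alt = (np.eye(len(y)) - A @ N_inv @ A.T @ W) @ y
print(np.allclose(e_hat, e_hat_alt))  # True
```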
Discussion on the weight matrix
The weight matrix \(W\) expresses the (relative) weights between the observations. It is always a square matrix, and its size is determined by the number of observations \(m\): it is an \(m\times m\) matrix. If it is a unit matrix (\(W=I\)), all observations have equal weight; note that in this case the equations reduce to the ordinary least squares solution. If it is a diagonal matrix with different values on the diagonal, the observations with a higher weight are considered more important. If the weight matrix has non-zero elements on the off-diagonal positions, (some of) the observations are correlated.
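These special cases can be illustrated with a short sketch (again using made-up data): with \(W=I\) the weighted estimate coincides with ordinary least squares, while a large diagonal weight pulls the fit towards the heavily weighted observation.

```python
import numpy as np

# Illustrative data (made up for this sketch).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.1, 1.9, 3.2])

def wls(A, y, W):
    """Weighted least squares estimate (A^T W A)^{-1} A^T W y."""
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ y)

# Case 1: W = I gives exactly the ordinary least squares solution.
x_ols = np.linalg.solve(A.T @ A, A.T @ y)
print(np.allclose(wls(A, y, np.eye(3)), x_ols))   # True

# Case 2: a diagonal W with a much larger weight on the third observation
# makes the fit follow that observation closely (its residual shrinks).
W = np.diag([1.0, 1.0, 100.0])
x_w = wls(A, y, W)
print(y - A @ x_w)   # third residual is close to zero
```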
Observation Theory: Estimating the Unknown by TU Delft OpenCourseWare is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://ocw.tudelft.nl/courses/observation-theory-estimating-unknown.