This article is a draft and as such there might be typos and other inaccuracies.
In this article we’ll derive the matrix inversion lemma, also known as the Sherman-Morrison-Woodbury formula. At first it might seem like a very boring piece of linear algebra, but it has a few nifty uses, as we’ll see in one of the follow-up articles.
Let’s start with the following block matrix:
$$ M = \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} $$
We’ll do an LDU decomposition in two different ways, which basically gives us the end formula directly. Eliminating the bottom-left element, we get the following:
$$
\begin{bmatrix}
I & 0 \\\\
-V A^{-1} & I
\end{bmatrix}
\begin{bmatrix}
A & U \\\\
V & B
\end{bmatrix} = \begin{bmatrix}
A & U \\\\
0 & B - V A^{-1} U
\end{bmatrix}
$$
The term $B - V A^{-1} U$
is called the Schur complement of $A$ in $M$ and is generally defined as follows:
$$ M/A := B - V A^{-1} U $$
We’ll use this notation later to make things easier to read. Moving on with the decomposition, we’ll now eliminate $U$.
$$ \begin{bmatrix} A & U \\\\ 0 & B - V A^{-1} U \end{bmatrix} \begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix} = \begin{bmatrix} A & 0 \\\\ 0 & B - V A^{-1} U \end{bmatrix} $$
Putting the two equations above together we get the following:
$$ \underbrace{\begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix}}_{X} \underbrace{\begin{bmatrix} A & U \\\\ V & B \end{bmatrix}}_{M} \underbrace{\begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix}}_{Z} = \underbrace{\begin{bmatrix} A & 0 \\\\ 0 & B - V A^{-1} U \end{bmatrix}}_{W} $$
We could also write the matrix $W$ using the Schur complement notation: $$ W = \begin{bmatrix} A & 0 \\\\ 0 & B - V A^{-1} U \end{bmatrix} = \begin{bmatrix} A & 0 \\\\ 0 & M/A \end{bmatrix} $$
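Before moving on, we can sanity-check the decomposition numerically. Below is a minimal sketch using NumPy; the block sizes and the random blocks are arbitrary choices for illustration, not part of the derivation:

```python
import numpy as np

# Arbitrary block sizes; any conformable blocks work.
rng = np.random.default_rng(0)
n, m = 4, 3
A = rng.standard_normal((n, n))  # must be invertible (true generically)
B = rng.standard_normal((m, m))
U = rng.standard_normal((n, m))
V = rng.standard_normal((m, n))

A_inv = np.linalg.inv(A)
M = np.block([[A, U], [V, B]])
X = np.block([[np.eye(n), np.zeros((n, m))],
              [-V @ A_inv, np.eye(m)]])
Z = np.block([[np.eye(n), -A_inv @ U],
              [np.zeros((m, n)), np.eye(m)]])
W = np.block([[A, np.zeros((n, m))],
              [np.zeros((m, n)), B - V @ A_inv @ U]])

assert np.allclose(X @ M @ Z, W)  # X M Z = W holds
```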
Now we just express $M$ in terms of $X, Z, W$ and take the inverse to get $M^{-1}$.
$$ \begin{align} X M Z &= W \\\\ M Z &= X^{-1} W \\\\ M &= X^{-1} W Z^{-1} \\\\ M^{-1} &= (X^{-1} W Z^{-1})^{-1} \\\\ M^{-1} &= Z W^{-1} X \end{align} $$
Substituting our matrices back in, we get:
$$ \begin{bmatrix} A & U \\\\ V & B \end{bmatrix}^{-1} = \begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\\\ 0 & (M/A)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix} $$
Now comes the fun part: we’ll multiply out the right-hand side of the equation:
$$ \begin{align} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix}^{-1} &= \begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\\\ 0 & (M/A)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix} \\\\ &= \begin{bmatrix} A^{-1} & -A^{-1} U (M/A)^{-1} \\\\ 0 & (M/A)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix} \\\\ &= \begin{bmatrix} A^{-1} + A^{-1} U (M/A)^{-1} V A^{-1} & -A^{-1} U (M/A)^{-1} \\\\ -(M/A)^{-1} VA^{-1} & (M/A)^{-1} \end{bmatrix} \\\\ &= \begin{bmatrix} A^{-1} + A^{-1} U (B - V A^{-1} U)^{-1} V A^{-1} & -A^{-1} U (B - V A^{-1} U)^{-1} \\\\ -(B - V A^{-1} U)^{-1} VA^{-1} & (B - V A^{-1} U)^{-1} \end{bmatrix} \end{align} $$
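We can again verify this formula numerically against a direct inverse. This is only a sketch with random blocks, assuming $A$ and the Schur complement $M/A$ are both invertible (which holds generically):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
U = rng.standard_normal((n, m))
V = rng.standard_normal((m, n))

A_inv = np.linalg.inv(A)
S = np.linalg.inv(B - V @ A_inv @ U)  # (M/A)^{-1}
M = np.block([[A, U], [V, B]])

# The block-inverse formula derived above.
M_inv = np.block([
    [A_inv + A_inv @ U @ S @ V @ A_inv, -A_inv @ U @ S],
    [-S @ V @ A_inv, S],
])

assert np.allclose(M_inv, np.linalg.inv(M))
```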
That’s it for the first part. Now we’ll do the same thing, but this time we’ll eliminate the top-right element first.
$$ \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} = \begin{bmatrix} A - UB^{-1}V & 0 \\\\ V & B \end{bmatrix} $$
Here we get the other Schur complement, which we’ll denote $M/B = A - UB^{-1}V$. We can substitute it in straight away this time.
$$ \begin{bmatrix} M/B & 0 \\\\ V & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} = \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} $$
As before, we’ll write it out as a single equation:
$$ \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} = \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} $$
Now we express the matrix $M$ in terms of the other two (notice the newly added inverse signs):
$$ \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} = \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix}^{-1} \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix}^{-1} $$
Lastly, we just take the inverse of both sides:
$$ \begin{align} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix}^{-1} &= \left( \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix}^{-1} \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix}^{-1} \right)^{-1} \\\\ &= \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix}^{-1} \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} & \text{notice the inverses cancelling out} \\\\ &= \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} \begin{bmatrix} (M/B)^{-1} & 0 \\\\ 0 & B^{-1} \end{bmatrix} \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \\\\ &= \begin{bmatrix} (M/B)^{-1} & 0 \\\\ -B^{-1} V (M/B)^{-1} & B^{-1} \end{bmatrix} \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \\\\ &= \begin{bmatrix} (M/B)^{-1} & -(M/B)^{-1} UB^{-1} \\\\ -B^{-1} V (M/B)^{-1} & B^{-1} V (M/B)^{-1} UB^{-1} + B^{-1} \end{bmatrix} \\\\ &= \begin{bmatrix} (A - UB^{-1}V)^{-1} & -(A - UB^{-1}V)^{-1} UB^{-1} \\\\ -B^{-1} V (A - UB^{-1}V)^{-1} & B^{-1} V (A - UB^{-1}V)^{-1} UB^{-1} + B^{-1} \end{bmatrix} \end{align} $$
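Since the inverse of $M$ is unique, the top-left blocks of the two expressions we derived must be equal, and that equality is exactly the matrix inversion lemma:
$$ (A - U B^{-1} V)^{-1} = A^{-1} + A^{-1} U (B - V A^{-1} U)^{-1} V A^{-1} $$
As a final sanity check, here is a small numerical sketch of the identity (again with arbitrary random blocks, assuming all the relevant inverses exist):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
U = rng.standard_normal((n, m))
V = rng.standard_normal((m, n))

A_inv, B_inv = np.linalg.inv(A), np.linalg.inv(B)

lhs = np.linalg.inv(A - U @ B_inv @ V)
rhs = A_inv + A_inv @ U @ np.linalg.inv(B - V @ A_inv @ U) @ V @ A_inv
assert np.allclose(lhs, rhs)  # the matrix inversion lemma
```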