Matrix Inversion Lemma | Jakub Arnold's Blog

This article is a draft and as such there might be typos and other inaccuracies.

In this article we’ll derive the matrix inversion lemma, also known as the Sherman-Morrisson-Woodbury formula. At first it might seem like a very boring piece of linear algebra, but it has a few nifty uses, as we’ll see in one of the followup articles.

Let’s start with the following block matrix:

$$ M = \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} $$

We’ll do an LDU decomposition in two different ways, which basically direclty gives us the end formula. Eliminating the bototm left element we get the following:

$$ \begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} = \begin{bmatrix} A & U \\\\ 0 & B - V A^{-1} U \end{bmatrix}
$$

The $B - V A^{-1} U$ is called a Schur complement and is generally defined as follows:

$$ M/A := V A^{-1} U $$

We’ll use this notation later to make things easier to read. Moving on with the decomposition, we’ll now eliminate $U$.

$$ \begin{bmatrix} A & U \\\\ 0 & B - V A^{-1} U \end{bmatrix} \begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix} = \begin{bmatrix} A & 0 \\\\ 0 & B - V A^{-1} U \end{bmatrix} $$

Putting the two equations above together we get the following:

$$ \underbrace{\begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix}}_{X} \underbrace{\begin{bmatrix} A & U \\\\ V & B \end{bmatrix}}_{M} \underbrace{\begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix}}_{Z} = \underbrace{\begin{bmatrix} A & 0 \\\\ 0 & B - V A^{-1} U \end{bmatrix}}_{W} $$

We could also write the matrix $W$ using the Schur complement notation: $$ W = \begin{bmatrix} A & 0 \\\\ 0 & B - V A^{-1} U \end{bmatrix} = \begin{bmatrix} A & 0 \\\\ 0 & M/A \end{bmatrix} $$

Now we just express $M$ in terms of $X, Z, W$ and take the inverse to get $M^{-1}$.

$$ \begin{align} X M Z &= W \\\\ M Z &= X^{-1} W \\\\ M &= X^{-1} W Z^{-1} \\\\ M^{-1} &= (X^{-1} W Z^{-1})^{-1} \\\\ M^{-1} &= Z W^{-1} X \end{align} $$

Substituting our matrices back in, we get:

$$ \begin{bmatrix} A & U \\\\ V & B \end{bmatrix}^{-1} = \begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\\\ 0 & (M/A)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix} $$

Now comes the fun part, we’ll multiply out the right side of the equation:

$$ \begin{align} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix}^{-1} &= \begin{bmatrix} I & -A^{-1} U \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A^{-1} & 0 \\\\ 0 & (M/A)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix} \\\\ &= \begin{bmatrix} A^{-1} & -A^{-1} U (M/A)^{-1} \\\\ 0 & (M/A)^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -V A^{-1} & I \end{bmatrix} \\\\ &= \begin{bmatrix} A^{-1} + A^{-1} U (M/A)^{-1} V A^{-1} & -A^{-1} U (M/A)^{-1} \\\\ -(M/A)^{-1} VA^{-1} & (M/A)^{-1} \end{bmatrix} \\\\ &= \begin{bmatrix} A^{-1} + A^{-1} U (B - V A^{-1} U)^{-1} V A^{-1} & -A^{-1} U (B - V A^{-1} U)^{-1} \\\\ -(B - V A^{-1} U)^{-1} VA^{-1} & (B - V A^{-1} U)^{-1} \end{bmatrix} \end{align} $$

That’s it for the first part, now we’ll do the same, but eliminating the top-right element from the left first.

$$ \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} = \begin{bmatrix} A - UB^{-1}V & 0 \\\\ V & B \end{bmatrix} $$

Here we get the other Schur complement, which we’ll note as $M/B = A - UB^{-1}V$. We can substitute it in straight away this time.

$$ \begin{bmatrix} M/B & 0 \\\\ V & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} = \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} $$

As before, we’ll write it out as a single equation:

$$ \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} = \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} $$

Now we express the matrix $M$ in terms of the other two (notice the newly added inverse signs):

$$ \begin{bmatrix} A & U \\\\ V & B \end{bmatrix} = \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix}^{-1} \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix}^{-1} $$

Lastly, we just take the inverse of both sides:

$$ \begin{align} \begin{bmatrix} A & U \\\\ V & B \end{bmatrix}^{-1} &= \left( \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix}^{-1} \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix} \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix}^{-1} \right)^{-1} \\\\ &= \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} \begin{bmatrix} M/B & 0 \\\\ 0 & B \end{bmatrix}^{-1} \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} & \text{notice the inverses cancelling out} \\\\ &= \begin{bmatrix} I & 0 \\\\ -B^{-1}V & I \end{bmatrix} \begin{bmatrix} (M/B)^{-1} & 0 \\\\ 0 & B^{-1} \end{bmatrix} \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \\\\ &= \begin{bmatrix} (M/B)^{-1} & 0 \\\\ -B^{-1} V (M/B)^{-1} & B^{-1} \end{bmatrix} \begin{bmatrix} I & -U B^{-1} \\\\ 0 & I \end{bmatrix} \\\\ &= \begin{bmatrix} (M/B)^{-1} & -(M/B)^{-1} UB^{-1} \\\\ -B^{-1} V (M/B)^{-1} & B^{-1} V (M/B)^{-1} UB^{-1} + B^{-1} \end{bmatrix} \\\\ &= \begin{bmatrix} (A - UB^{-1}V)^{-1} & -(A - UB^{-1}V)^{-1} UB^{-1} \\\\ -B^{-1} V (A - UB^{-1}V)^{-1} & B^{-1} V (A - UB^{-1}V)^{-1} UB^{-1} + B^{-1} \end{bmatrix} \end{align} $$

Previous:
Linear Regression - Least Squares Without Orthogonal Projection

Next:
Linear Regression - least squares with orthogonal projection