Eigenvalues and Eigenvectors: Basic Properties

Published: December 08, 2018

Eigenvalues and eigenvectors of a matrix $\boldsymbol A$ tell us a lot about the matrix. Conversely, if we know our matrix $\boldsymbol A$ is somehow special (say symmetric), that tells us something about what its eigenvalues and eigenvectors look like.

Let us begin with a definition. Given a matrix $\boldsymbol A$, a nonzero vector $x$ is an eigenvector of $\boldsymbol A$ with a corresponding eigenvalue $\lambda$ if

$$\boldsymbol A x = \lambda x.$$

The eigenvectors of a matrix $\boldsymbol A$ are exactly those vectors which, when transformed by the mapping defined by $\boldsymbol A$, are only scaled by $\lambda$; their direction does not change.

Eigenvalues and eigenvectors of a projection matrix

To understand what eigenvectors are and how they behave, let us consider a projection matrix $\boldsymbol P$. What are the $x$'s and $\lambda$'s for a projection matrix?

The key property we'll use is $\boldsymbol P^2 = \boldsymbol P$. This holds because when we project a vector $x$ onto a plane to get $\hat x$, that is $\boldsymbol P x = \hat x$, we expect projecting $\hat x$ again to do nothing, since it already lies in the plane; that is,

$$\hat x = \boldsymbol P \hat x = \boldsymbol P (\boldsymbol P x) = (\boldsymbol P \boldsymbol P) x = \boldsymbol P^2 x.$$

Now thinking about eigenvectors as those vectors which don’t change direction when a projection matrix is applied, we can deduce two cases:

  • Any $x$ already in the plane: $\boldsymbol P x = x$, so $\lambda = 1$.
  • Any $x$ perpendicular to the plane: $\boldsymbol P x = 0x$, so $\lambda = 0$.

As a result, a projection matrix $\boldsymbol P$ has two eigenvalues, $\lambda = 0$ and $\lambda = 1$, and two sets of eigenvectors: those that lie in the projection plane and those that are perpendicular to it.
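
To see this numerically, here's a minimal NumPy sketch (the particular choice of projecting onto the $xy$-plane in $\mathbb{R}^3$ is just an illustrative example) confirming that $\boldsymbol P^2 = \boldsymbol P$ and that the eigenvalues are exactly $0$ and $1$:

```python
import numpy as np

# Projection onto the xy-plane in R^3: zero out the z-component.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

assert np.allclose(P @ P, P)  # the key property P^2 = P

lambdas, vectors = np.linalg.eig(P)
print(lambdas)  # [1. 1. 0.] -- only 0's and 1's, as predicted
```

The eigenvalue $1$ appears twice because the plane we project onto is two-dimensional.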

Eigenvalues of a $2 \times 2$ permutation matrix

One more small example: consider the $2 \times 2$ permutation matrix $\boldsymbol A = \begin{pmatrix}0 & 1 \\ 1 & 0 \end{pmatrix}$.

We can find the first eigenvector straight away: it is simply $x = (1\ 1)^T$, since $\boldsymbol A x = x$, and so its corresponding eigenvalue is $\lambda = 1$.

If we think a little harder, we can guess the second eigenvector to be $x = (-1\ 1)^T$, since $\boldsymbol A x = -x$, with an eigenvalue $\lambda = -1$.
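
Both guesses are easy to verify numerically; here is a small sketch using the vectors from above:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

x1 = np.array([1.0, 1.0])   # eigenvector guess for lambda = 1
x2 = np.array([-1.0, 1.0])  # eigenvector guess for lambda = -1

assert np.allclose(A @ x1, x1)   # A x = x
assert np.allclose(A @ x2, -x2)  # A x = -x
```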

Computing eigenvalues and eigenvectors

We can re-arrange the terms in our definition to get a direct way to compute eigenvalues and eigenvectors of a matrix $\boldsymbol A$. Simply move $\lambda x$ to the left

$$\begin{aligned} \boldsymbol A x &= \lambda x \\ (\boldsymbol A - \lambda \boldsymbol I) x &= 0 \end{aligned}$$

and then notice that $\boldsymbol A - \lambda \boldsymbol I$ must be singular, because the nonzero vector $x$ lies in its nullspace. We know that singular matrices have a zero determinant, and we can use this to compute the eigenvalues $\lambda$ simply by writing

$$\det (\boldsymbol A - \lambda \boldsymbol I) = 0.$$

This is called the characteristic equation. It gives us a polynomial of degree $n$ in $\lambda$, which has $n$ solutions (counted with multiplicity). These need not be distinct, and can even be complex numbers. Once we obtain the $\lambda$'s, we can plug each one back into $(\boldsymbol A - \lambda \boldsymbol I) x = 0$ and obtain its corresponding eigenvectors $x$.
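
As a concrete sketch of this recipe (reusing the permutation matrix from above), NumPy's `np.poly` gives the coefficients of the characteristic polynomial of a matrix and `np.roots` solves it; an eigenvector for each $\lambda$ then spans the nullspace of $\boldsymbol A - \lambda \boldsymbol I$, which we can extract with an SVD:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

coeffs = np.poly(A)         # det(lambda*I - A) = lambda^2 - 1
lambdas = np.roots(coeffs)  # solutions: [ 1. -1.]

for lam in lambdas:
    # The singular vector for the zero singular value spans the
    # (here one-dimensional) nullspace of A - lambda*I.
    _, _, Vt = np.linalg.svd(A - lam * np.eye(2))
    x = Vt[-1]
    print(lam, x, np.allclose(A @ x, lam * x))  # True for both
```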

Eigenvalues and eigenvectors of an upper triangular matrix

For a triangular matrix, the determinant is just the product of the diagonal entries

$$\det(\boldsymbol A) = \prod_{i=1}^n \boldsymbol A_{ii},$$

which means solving the characteristic equation of $\boldsymbol A$ simply amounts to multiplying out the diagonal of $\boldsymbol A - \lambda \boldsymbol I$ (which is itself triangular),

$$\det(\boldsymbol A - \lambda \boldsymbol I) = \prod_{i=1}^n (\boldsymbol A - \lambda \boldsymbol I)_{ii},$$

which gives us a factored polynomial $(\boldsymbol A_{11} - \lambda)(\boldsymbol A_{22} - \lambda)\cdots(\boldsymbol A_{nn} - \lambda)$, from which we immediately see that the eigenvalues are the diagonal elements.
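
A quick numerical check with an arbitrary upper triangular matrix:

```python
import numpy as np

U = np.array([[2.0, 7.0, 1.0],
              [0.0, 3.0, 4.0],
              [0.0, 0.0, 5.0]])

# The eigenvalues are exactly the diagonal entries 2, 3, 5.
print(np.sort(np.linalg.eigvals(U)))  # [2. 3. 5.]
print(np.diag(U))                     # [2. 3. 5.]
```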

Diagonalization $\boldsymbol S^{-1} \boldsymbol A \boldsymbol S = \boldsymbol \Lambda$

Suppose we have $n$ linearly independent eigenvectors of $\boldsymbol A$. Put them into the columns of $\boldsymbol S$. We can now write

$$\boldsymbol A \boldsymbol S = \boldsymbol A \begin{bmatrix} \vert & \vert & & \vert \\ x_1 & x_2 & \cdots & x_n \\ \vert & \vert & & \vert \end{bmatrix} = \begin{bmatrix} \vert & \vert & & \vert \\ \lambda_1 x_1 & \lambda_2 x_2 & \cdots & \lambda_n x_n \\ \vert & \vert & & \vert \end{bmatrix} = \boldsymbol S \boldsymbol \Lambda$$

where $\boldsymbol \Lambda$ is a diagonal matrix of eigenvalues. Thus we get $\boldsymbol A \boldsymbol S = \boldsymbol S \boldsymbol \Lambda$. Since the $n$ eigenvectors are independent, $\boldsymbol S$ is invertible, and we also get

$$\begin{aligned} \boldsymbol A \boldsymbol S &= \boldsymbol S \boldsymbol \Lambda \\ \boldsymbol S^{-1} \boldsymbol A \boldsymbol S &= \boldsymbol \Lambda \\ \boldsymbol A &= \boldsymbol S \boldsymbol \Lambda \boldsymbol S^{-1} \end{aligned}$$
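
Here is a short sketch of the whole diagonalization round-trip on a small matrix with distinct eigenvalues (the matrix itself is just an arbitrary example):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

lambdas, S = np.linalg.eig(A)  # columns of S are the eigenvectors
Lam = np.diag(lambdas)         # Lambda: eigenvalues on the diagonal

S_inv = np.linalg.inv(S)
assert np.allclose(S_inv @ A @ S, Lam)  # S^{-1} A S = Lambda
assert np.allclose(A, S @ Lam @ S_inv)  # A = S Lambda S^{-1}
```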

The matrix $\boldsymbol A$ is sure to have $n$ independent eigenvectors (and be diagonalizable) if all the $\lambda$'s are different (no repeated $\lambda$'s). Repeated eigenvalues mean $\boldsymbol A$ may or may not have $n$ independent eigenvectors.

Proof (ref. G. Strang, Introduction to Linear Algebra): Suppose $c_1 x_1 + c_2 x_2 = 0$. Multiply by $\boldsymbol A$ to find $c_1 \lambda_1 x_1 + c_2 \lambda_2 x_2 = 0$. Multiply the original equation by $\lambda_2$ instead to find $c_1 \lambda_2 x_1 + c_2 \lambda_2 x_2 = 0$. Now subtracting these two equations gives us

$$(\lambda_1 - \lambda_2) c_1 x_1 = 0.$$

Since $\lambda_1 \neq \lambda_2$ and $x_1 \neq 0$, we conclude $c_1 = 0$. We can derive $c_2 = 0$ the same way. Since $c_1 = c_2 = 0$ are the only coefficients for which $c_1 x_1 + c_2 x_2 = 0$, we see that $x_1$ and $x_2$ are linearly independent.

The same argument can be extended to $n$ eigenvectors and eigenvalues.

Sum of eigenvalues equals the trace

Another very useful fact is that the sum of the eigenvalues equals the sum of the main diagonal (called the trace of $\boldsymbol A$), that is

$$\lambda_1 + \lambda_2 + \ldots + \lambda_n = \boldsymbol A_{11} + \boldsymbol A_{22} + \ldots + \boldsymbol A_{nn} = Tr(\boldsymbol A).$$

To prove this, we'll first show that $Tr(\boldsymbol A \boldsymbol B) = Tr(\boldsymbol B \boldsymbol A)$.

To get a single element on the diagonal of $\boldsymbol A \boldsymbol B$ we simply write

$$(\boldsymbol A \boldsymbol B)_{jj} = \sum_{k} \boldsymbol A_{jk} \boldsymbol B_{kj}$$

and to get the trace we just sum over all possible $j$ as

$$Tr(\boldsymbol A \boldsymbol B) = \sum_{j} \sum_{k} \boldsymbol A_{jk} \boldsymbol B_{kj}.$$

On the other hand, the $k$-th element on the diagonal of $\boldsymbol B \boldsymbol A$ is

$$(\boldsymbol B \boldsymbol A)_{kk} = \sum_{j} \boldsymbol B_{kj} \boldsymbol A_{jk}$$

and the trace $Tr(\boldsymbol B \boldsymbol A)$ is

$$Tr(\boldsymbol B \boldsymbol A) = \sum_{k} \sum_{j} \boldsymbol B_{kj} \boldsymbol A_{jk}.$$

But since we can swap the order of summation and also swap the order of multiplication, we get

$$Tr(\boldsymbol B \boldsymbol A) = \sum_{k} \sum_{j} \boldsymbol B_{kj} \boldsymbol A_{jk} = \sum_{j} \sum_{k} \boldsymbol A_{jk} \boldsymbol B_{kj} = Tr(\boldsymbol A \boldsymbol B).$$
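
This identity is easy to check numerically; note that it even holds for non-square factors, as long as both products $\boldsymbol A \boldsymbol B$ and $\boldsymbol B \boldsymbol A$ are defined:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))

# AB is 3x3 while BA is 5x5, yet the traces agree.
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
```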

Now suppose we have $n$ different eigenvalues, so we can diagonalize the matrix:

$$\boldsymbol S^{-1} \boldsymbol A \boldsymbol S = \boldsymbol \Lambda$$

where $\boldsymbol \Lambda$ is a diagonal matrix of the eigenvalues of $\boldsymbol A$. Using our trace trick we can write

$$Tr(\boldsymbol \Lambda) = Tr(\boldsymbol S^{-1} \boldsymbol A \boldsymbol S) = Tr((\boldsymbol S^{-1} \boldsymbol A) \boldsymbol S) = Tr(\boldsymbol S (\boldsymbol S^{-1} \boldsymbol A)) = Tr((\boldsymbol S \boldsymbol S^{-1}) \boldsymbol A) = Tr(\boldsymbol I \boldsymbol A) = Tr(\boldsymbol A),$$

and thus the sum of the eigenvalues equals the trace of $\boldsymbol A$. We've only shown this for the case of $n$ different eigenvalues. The property does hold in general, but the proof requires machinery we haven't developed yet (the Jordan normal form), so we skip the rest of it.
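
A quick numerical sanity check on a random matrix (its eigenvalues may be complex, but the imaginary parts cancel in the sum):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

lambdas = np.linalg.eigvals(A)
# Sum of eigenvalues equals the trace (up to floating point error).
assert np.isclose(lambdas.sum(), np.trace(A))
```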

If you're interested, check out the following article, which shows the whole proof, and possibly the Wikipedia article on Jordan normal form.

Powers of a matrix

If $\boldsymbol A x = \lambda x$, then we multiply by $\boldsymbol A$ and get

$$\boldsymbol A^2 x = \lambda \boldsymbol A x = \lambda^2 x.$$

Continuing in terms of the diagonalization $\boldsymbol A = \boldsymbol S \boldsymbol \Lambda \boldsymbol S^{-1}$, the inner $\boldsymbol S^{-1} \boldsymbol S$ cancels and

$$\boldsymbol A^2 = \boldsymbol S \boldsymbol \Lambda \boldsymbol S^{-1} \boldsymbol S \boldsymbol \Lambda \boldsymbol S^{-1} = \boldsymbol S \boldsymbol \Lambda^2 \boldsymbol S^{-1}$$

or in general

$$\boldsymbol A^k = \boldsymbol S \boldsymbol \Lambda^k \boldsymbol S^{-1}.$$

Theorem: $\boldsymbol A^k \rightarrow 0$ as $k \rightarrow \infty$ if all $|\lambda_i| < 1$.
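
Both facts are easy to demonstrate with a small sketch; the example matrix is triangular, so we can read off its eigenvalues $0.9$ and $0.5$ directly:

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.0, 0.5]])

lambdas, S = np.linalg.eig(A)
Lam = np.diag(lambdas)
S_inv = np.linalg.inv(S)

# A^k via the diagonalization matches repeated multiplication.
k = 10
assert np.allclose(S @ np.linalg.matrix_power(Lam, k) @ S_inv,
                   np.linalg.matrix_power(A, k))

# Since both |lambda_i| < 1, the powers decay toward the zero matrix.
print(np.linalg.matrix_power(A, 200))  # entries around 1e-10 or smaller
```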

More properties

Diagonalizable matrices $\boldsymbol A$ and $\boldsymbol B$ share the same $n$ independent eigenvectors if and only if $\boldsymbol A \boldsymbol B = \boldsymbol B \boldsymbol A$.

To see one direction, suppose $x$ is a shared eigenvector with $\boldsymbol A x = \lambda x$ and $\boldsymbol B x = \beta x$. Then $\boldsymbol A \boldsymbol B x = \lambda \beta x$ and $\boldsymbol B \boldsymbol A x = \lambda \beta x$, since

$$\boldsymbol A \boldsymbol B x = \boldsymbol A \beta x = \beta \boldsymbol A x = \beta \lambda x.$$

But this only holds if $\boldsymbol A$ and $\boldsymbol B$ share the same eigenvectors!
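
One easy way to manufacture a commuting pair for a sketch is to take $\boldsymbol B$ to be a polynomial in $\boldsymbol A$; then $\boldsymbol A \boldsymbol B = \boldsymbol B \boldsymbol A$ automatically, and every eigenvector of $\boldsymbol A$ is an eigenvector of $\boldsymbol B$ with $\beta = p(\lambda)$:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
B = A @ A + 2.0 * A + np.eye(2)  # B = p(A) with p(t) = t^2 + 2t + 1

assert np.allclose(A @ B, B @ A)  # a polynomial in A commutes with A

lambdas, S = np.linalg.eig(A)
for lam, x in zip(lambdas, S.T):  # columns of S are eigenvectors of A
    beta = lam**2 + 2.0 * lam + 1.0
    assert np.allclose(B @ x, beta * x)  # shared eigenvector, beta = p(lambda)
```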

One last interesting fact we can show is what happens to the eigenvalues when we add a constant $c$ (that is, $c \boldsymbol I$) to the matrix $\boldsymbol A$. The proof is rather simple: if $\boldsymbol A x = \lambda x$, then

$$(\boldsymbol A + c \boldsymbol I) x = \boldsymbol A x + c x = (\lambda + c) x.$$

Adding a constant (times the identity) to a matrix causes its eigenvalues to increase by exactly that constant, while the eigenvectors stay the same.
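
A final sketch confirming the shift, reusing the example matrix from before:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
c = 10.0

before = np.sort(np.linalg.eigvals(A))                 # [2. 5.]
after = np.sort(np.linalg.eigvals(A + c * np.eye(2)))  # [12. 15.]
assert np.allclose(after, before + c)
```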
