3 Determinants and Diagonalization
Introduction
With each square matrix we can calculate a number, called the determinant of the matrix, which tells us whether or not the matrix is invertible. In fact, determinants can be used to give a formula for the inverse of a matrix. They also arise in calculating certain numbers (called eigenvalues) associated with the matrix. These eigenvalues are essential to a technique called diagonalization that is used in many applications where it is desired to predict the future behaviour of a system. For example, we use it to predict whether a species will become extinct.
Determinants were first studied by Leibnitz in 1696, and the term “determinant” was first used in 1801 by Gauss in his Disquisitiones Arithmeticae. Determinants are much older than matrices (which were introduced by Cayley in 1878) and were used extensively in the eighteenth and nineteenth centuries, primarily because of their significance in geometry. Although they are somewhat less important today, determinants still play a role in the theory and application of matrix algebra.
3.1 The Cofactor Expansion
In Section 2.4, we defined the determinant of a $2 \times 2$ matrix
$A = \left[
\begin{array}{cc}
a & b \\
c & d
\end{array}
\right]$
as follows:
\begin{equation*}
\func{det } A = \left| \begin{array}{cc}
a & b \\
c & d
\end{array} \right| = ad - bc
\end{equation*}
and showed (in Example 2.4.4) that $A$ has an inverse if and only if $\func{det } A \neq 0$. One objective of this chapter is to do this for any square matrix $A$. There is no difficulty for $1 \times 1$ matrices: If $A = \left[ a \right]$, we define $\func{det} A = \func{det} \left[ a \right] = a$ and note that $A$ is invertible if and only if $a \neq 0$.
If $A$ is $3 \times 3$ and invertible, we look for a suitable definition of $\func{det } A$ by trying to carry $A$ to the identity matrix by row operations. The first column is not zero ($A$ is invertible); suppose the (1, 1)-entry $a$ is not zero. Then row operations give
\begin{equation*}
A = \left[ \begin{array}{ccc}
a & b & c \\
d & e & f \\
g & h & i
\end{array} \right]
\rightarrow
\left[ \begin{array}{ccc}
a & b & c \\
ad & ae & af \\
ag & ah & ai
\end{array} \right]
\rightarrow
\left[ \begin{array}{ccc}
a & b & c \\
0 & ae-bd & af-cd \\
0 & ah-bg & ai-cg
\end{array} \right]
=
\left[ \begin{array}{ccc}
a & b & c \\
0 & u & af-cd \\
0 & v & ai-cg
\end{array} \right]
\end{equation*}
where $u = ae - bd$ and $v = ah - bg$. Since $A$ is invertible, one of $u$ and $v$ is nonzero (by Example 2.4.11); suppose that $u \neq 0$. Then the reduction proceeds
\begin{equation*}
A \rightarrow \left[ \begin{array}{ccc}
a & b & c \\
0 & u & af-cd \\
0 & v & ai-cg
\end{array} \right]
\rightarrow
\left[ \begin{array}{ccc}
a & b & c \\
0 & u & af-cd \\
0 & uv & u(ai-cg)
\end{array} \right]
\rightarrow
\left[ \begin{array}{ccc}
a & b & c \\
0 & u & af-cd \\
0 & 0 & w
\end{array} \right]
\end{equation*}
where $w = u(ai - cg) - v(af - cd) = a(aei + bfg + cdh - ceg - afh - bdi)$. We define
\begin{equation}\tag{3.1}
\func{det } A = aei + bfg + cdh - ceg - afh - bdi
\end{equation}
and observe that $\func{det } A \neq 0$ because $a \func{det } A = w \neq 0$ (as $A$ is invertible).
To motivate the definition below, collect the terms in Equation 3.1 involving the entries $a$, $b$, and $c$ in row 1 of $A$:
\begin{align*}
\func{det } A = \left| \begin{array}{ccc}
a & b & c \\
d & e & f \\
g & h & i
\end{array} \right| &= aei + bfg + cdh - ceg - afh - bdi \\
&= a (ei-fh) - b(di-fg) + c(dh-eg) \\
&= a \left| \begin{array}{cc}
e & f \\
h & i
\end{array} \right| - b \left| \begin{array}{cc}
d & f \\
g & i
\end{array} \right|
+ c \left| \begin{array}{cc}
d & e \\
g & h
\end{array} \right|
\end{align*}
This last expression can be described as follows: To compute the determinant of a $3 \times 3$ matrix $A$, multiply each entry in row 1 by a sign times the determinant of the $2 \times 2$ matrix obtained by deleting the row and column of that entry, and add the results. The signs alternate down row 1, starting with $+$. It is this observation that we generalize below.
Example 3.1.1
\begin{align*}
\func{det}\left[ \begin{array}{rrr}
2 & 3 & 7 \\
-4 & 0 & 6 \\
1 & 5 & 0
\end{array} \right]
&= 2 \left| \begin{array}{rr}
0 & 6 \\
5 & 0
\end{array} \right|
- 3 \left| \begin{array}{rr}
-4 & 6 \\
1 & 0
\end{array} \right|
+ 7 \left| \begin{array}{rr}
-4 & 0 \\
1 & 5
\end{array} \right| \\
&= 2 (-30) - 3(-6) + 7(-20) \\
&= -182
\end{align*}
This suggests an inductive method of defining the determinant of any square matrix in terms of determinants
of matrices one size smaller. The idea is to define determinants of $3 \times 3$ matrices in terms of determinants of $2 \times 2$ matrices,
then determinants of $4 \times 4$ matrices in terms of determinants of $3 \times 3$ matrices, and so on.
To describe this, we need some terminology.
Definition 3.1 Cofactors of a matrix
Assume that determinants of $(n – 1) \times (n – 1)$ matrices have been defined. Given the $n \times n$ matrix $A$, let
$A_{ij}$ denote the $ (n – 1) \times (n – 1) $ matrix obtained from A by deleting row $i$ and column $ j.$
Then the $(i,j)$-cofactor $c_{ij}(A)$ is the scalar defined by
\begin{equation*}
c_{ij}(A) = (-1)^{i+j} \func{det}(A_{ij})
\end{equation*}
Here $(-1)^{i+j}$ is called the sign of the $(i, j)$-position.
The sign of a position is clearly $1$ or $-1$, and the following diagram is useful for remembering it:
\begin{equation*}
\left[ \begin{array}{ccccc}
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
\vdots & \vdots & \vdots & \vdots & \\
\end{array}\right]
\end{equation*}
Note that the signs alternate along each row and column with $+$ in the upper left corner.
Example 3.1.2
Find the cofactors of positions $(1, 2)$, $(3, 1)$, and $(2, 3)$ in the matrix
\begin{equation*}
A = \left[ \begin{array}{rrr}
3 & -1 & 6 \\
5 & 2 & 7 \\
8 & 9 & 4
\end{array} \right] \end{equation*}
Solution:
Here $A_{12}$ is the matrix $\left[ \begin{array}{rr}
5 & 7 \\
8 & 4
\end{array} \right]$
that remains when row $1$ and column $2$ are deleted. The sign of position $(1, 2)$ is $(-1)^{1+2} = -1$ (this is also the $(1, 2)$-entry in the sign diagram), so the $(1, 2)$-cofactor is
\begin{equation*}
c_{12}(A) = (-1)^{1+2} \left| \begin{array}{rr}
5 & 7 \\
8 & 4
\end{array} \right|
=
(-1)(5 \cdot 4 - 7 \cdot 8) = (-1)(-36)=36
\end{equation*}
Turning to position $(3, 1)$, we find
\begin{equation*}
c_{31}(A) = (-1)^{3+1} \func{det}(A_{31}) = (-1)^{3+1} \left| \begin{array}{rr}
-1 & 6 \\
2 & 7
\end{array} \right|
=(+1)(-7-12)=-19
\end{equation*}
Finally, the $(2, 3)$-cofactor is
\begin{equation*}
c_{23}(A) = (-1)^{2+3} \func{det}(A_{23}) = (-1)^{2+3} \left| \begin{array}{rr}
3 & -1 \\
8 & 9
\end{array} \right|
=
(-1)(27+8)=-35
\end{equation*}
Clearly other cofactors can be found—there are nine in all, one for each position in the matrix.
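Cofactors are also easy to compute by machine. The following short sketch (a computational aside, not part of the development in this chapter) is written in Python and assumes the NumPy library is available; the function name cofactor is our own choice. It deletes row $i$ and column $j$, attaches the sign $(-1)^{i+j}$, and reproduces the three cofactors found above.
\begin{verbatim}
import numpy as np

def cofactor(A, i, j):
    """(i, j)-cofactor of A, with rows and columns numbered from 1."""
    # Delete row i and column j to obtain the submatrix A_ij ...
    minor = np.delete(np.delete(A, i - 1, axis=0), j - 1, axis=1)
    # ... then attach the sign (-1)^(i+j) to its determinant.
    return (-1) ** (i + j) * np.linalg.det(minor)

A = np.array([[3, -1, 6],
              [5,  2, 7],
              [8,  9, 4]])

print(round(cofactor(A, 1, 2)))   # 36
print(round(cofactor(A, 3, 1)))   # -19
print(round(cofactor(A, 2, 3)))   # -35
\end{verbatim}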
We can now define $\func{det } A$ for any square matrix $A$.
Definition 3.2 Cofactor Expansion of a Matrix
Assume that determinants of $(n - 1) \times (n - 1)$ matrices have been defined. If $A = \left[ a_{ij} \right]$ is an $n \times n$ matrix, define
\begin{equation*}
\func{det } A = a_{11}c_{11}(A) + a_{12}c_{12}(A) + \cdots + a_{1n}c_{1n}(A)
\end{equation*}
This is called the cofactor expansion of $\func{det } A$ along row $1$.
It asserts that $\func{det } A$ can be computed by multiplying the entries of row $1$ by the corresponding
cofactors, and adding the results. The astonishing thing is that $\func{det } A$ can be computed by taking the cofactor expansion along $\textit{any row or column}$: Simply multiply each entry of that row or column by the corresponding cofactor and add.
Theorem 3.1.1 Cofactor Expansion Theorem
The determinant of an $n \times n$ matrix $A$ can be computed using the cofactor expansion along any row or column of $A$. That is, $\func{det } A$ can be computed by multiplying each entry of the row or column by the corresponding cofactor and adding the results.
Example 3.1.3
Compute $\func{det } A$ where
$
A = \left[ \begin{array}{rrr}
3 & 4 & 5 \\
1 & 7 & 2 \\
9 & 8 & -6
\end{array}
\right]$.
Solution:
The cofactor expansion along the first row is as follows:
\begin{align*}
\func{det } A &= 3c_{11}(A) + 4c_{12}(A) + 5c_{13}(A) \\
&= 3 \left| \begin{array}{rr}
7 & 2 \\
8 & -6
\end{array} \right| - 4 \left| \begin{array}{rr}
1 & 2 \\
9 & -6
\end{array} \right| + 5
\left| \begin{array}{rr}
1 & 7 \\
9 & 8
\end{array} \right| \\
&= 3 (-58) - 4(-24) + 5(-55) \\
&= -353
\end{align*}
Note that the signs alternate along the row (indeed along $\textit{any} $ row or column). Now we compute $\func{det } A$ by expanding along the first column.
\begin{align*}
\func{det } A &= 3c_{11}(A) + 1c_{21}(A) + 9c_{31}(A) \\
&= 3 \left| \begin{array}{rr}
7 & 2 \\
8 & -6
\end{array} \right| - \left| \begin{array}{rr}
4 & 5 \\
8 & -6
\end{array} \right| + 9
\left| \begin{array}{rr}
4 & 5 \\
7 & 2
\end{array} \right| \\
&= 3 (-58) - (-64) + 9(-27) \\
&= -353
\end{align*}
The reader is invited to verify that $\func{det } A$ can be computed by expanding along any other row or column.
The fact that the cofactor expansion along $\textit{any row or column}$ of a matrix $A$ always gives the same result (the determinant of $A$) is remarkable, to say the least. The choice of a particular row or column can simplify the calculation.
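As a computational aside, Definition 3.2 translates directly into a short recursive program. The sketch below (in Python, using only built-in lists; the function name det is our own) expands along row 1 at every stage and reproduces the value $-353$ found in Example 3.1.3; by Theorem 3.1.1, expanding along any other row or column would give the same answer.
\begin{verbatim}
def det(A):
    """Determinant by cofactor expansion along row 1 (Definition 3.2)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Delete row 1 and column j+1 to form the submatrix A_{1,j+1}.
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        # The sign (-1)^(1+(j+1)) alternates +, -, +, ... along row 1.
        total += (-1) ** j * A[0][j] * det(minor)
    return total

A = [[3, 4, 5],
     [1, 7, 2],
     [9, 8, -6]]
print(det(A))   # -353
\end{verbatim}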
Example 3.1.4
Compute $\func{det } A$ where
$A = \left[ \begin{array}{rrrr}
3 & 0 & 0 & 0 \\
5 & 1 & 2 & 0 \\
2 & 6 & 0 & -1 \\
-6 & 3 & 1 & 0
\end{array}
\right]$.
Solution:
The first choice we must make is which row or column to use in the
cofactor expansion. The expansion involves multiplying entries by
cofactors, so the work is minimized when the row or column contains as
many zero entries as possible. Row $1$ is a best choice in this matrix
(column $4$ would do as well), and the expansion is
\begin{align*}
\func{det } A &= 3c_{11}(A) + 0c_{12}(A) + 0c_{13}(A) + 0c_{14}(A) \\
&= 3 \left| \begin{array}{rrr}
1 & 2 & 0 \\
6 & 0 & -1 \\
3 & 1 & 0
\end{array}
\right|
\end{align*}
This is the first stage of the calculation, and we have succeeded in expressing the determinant of the $4 \times 4$ matrix $A$
in terms of the determinant of a $3 \times 3$ matrix. The next stage involves
this $3 \times 3$ matrix. Again, we can use any row or column for the cofactor
expansion. The third column is preferred (with two zeros), so
\begin{align*}
\func{det } A &= 3 \left( 0 \left| \begin{array}{rr}
6 & 0 \\
3 & 1
\end{array}
\right| - (-1)
\left| \begin{array}{rr}
1 & 2 \\
3 & 1
\end{array}
\right|
+ 0
\left| \begin{array}{rr}
1 & 2 \\
6 & 0
\end{array}
\right|
\right) \\
&= 3 [ 0 + 1(-5) + 0] \\
&= -15
\end{align*}
This completes the calculation.
This example shows us that calculating a determinant is simplified a great deal when a row or column consists mostly of zeros. (In fact, when a row or column consists $\textit{entirely}$ of zeros, the determinant is zero—simply expand along that row or column.) Recall that one method of $\textit{creating}$ zeros in a matrix is to apply elementary row operations to it. Hence, a natural question to ask is what effect such a row operation has on the determinant of the matrix. It turns out that the effect is easy to determine and that elementary $\textit{column}$ operations can be used in the same way. These observations lead to a technique for evaluating determinants that greatly reduces the labour involved. The necessary information is given in Theorem 3.1.2.
Theorem 3.1.2
Let $A$ denote an $n \times n$ matrix.
- If $A$ has a row or column of zeros, $\func{det } A = 0$.
- If two distinct rows (or columns) of $A$ are interchanged, the determinant of the resulting matrix is $- \func{det } A$.
- If a row (or column) of $A$ is multiplied by a constant $u$, the determinant of the resulting matrix is $u(\func{det } A)$.
- If two distinct rows (or columns) of $A$ are identical, $\func{det } A = 0$.
- If a multiple of one row of $A$ is added to a different row (or if a multiple of a column is added to a different column), the determinant of
the resulting matrix is $\func{det } A$.
The following four examples illustrate how Theorem 3.1.2 is used to evaluate determinants.
Example 3.1.5
Evaluate $\func{det } A$ where
$A = \left[ \begin{array}{rrr}
1 & -1 & 3 \\
1 & 0 & -1 \\
2 & 1 & 6
\end{array}
\right]$.
Solution:
The matrix does have zero entries, so expansion along (say) the second row would involve somewhat less work. However, a column operation can be
used to get a zero in position $(2, 3)$—namely, add column 1 to column 3. Because this does not change the value of the determinant, we obtain
\begin{equation*}
\func{det } A = \left| \begin{array}{rrr}
1 & -1 & 3 \\
1 & 0 & -1 \\
2 & 1 & 6
\end{array} \right| = \left| \begin{array}{rrr}
1 & -1 & 4 \\
1 & 0 & 0 \\
2 & 1 & 8
\end{array} \right|
= - \left| \begin{array}{rr}
-1 & 4 \\
1 & 8
\end{array} \right|
=12
\end{equation*}
where we expanded the second $3 \times 3$ matrix along row 2.
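The effects described in Theorem 3.1.2 are easy to observe numerically. The following sketch (an aside in Python, assuming the NumPy library) starts from the matrix of Example 3.1.5 and checks the interchange, scaling, and row-addition properties.
\begin{verbatim}
import numpy as np

A = np.array([[1., -1., 3.],
              [1.,  0., -1.],
              [2.,  1.,  6.]])
print(round(np.linalg.det(A)))        # 12, as in Example 3.1.5

B = A[[1, 0, 2], :]                   # interchange rows 1 and 2
print(round(np.linalg.det(B)))        # -12: the sign changes

C = A.copy()
C[0] *= 5                             # multiply row 1 by 5
print(round(np.linalg.det(C)))        # 60: five times det A

E = A.copy()
E[2] += 7 * A[0]                      # add 7 times row 1 to row 3
print(round(np.linalg.det(E)))        # 12: unchanged
\end{verbatim}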
Example 3.1.6
If $\func{det} \left[ \begin{array}{ccc}
a & b & c \\
p & q & r \\
x & y & z
\end{array}
\right] = 6$,
evaluate $\func{det } A$ where $A = \left[ \begin{array}{ccc}
a+x & b+y & c+z \\
3x & 3y & 3z \\
-p & -q & -r
\end{array}\right]$.
Solution:
First take common factors out of rows 2 and 3.
\begin{equation*}
\func{det } A = 3(-1) \func{det} \left[ \begin{array}{ccc}
a+x & b+y & c+z \\
x & y & z \\
p & q & r
\end{array}
\right]
\end{equation*}
Now subtract the second row from the first and interchange the last two rows.
\begin{equation*}
\func{det } A = -3 \func{det} \left[ \begin{array}{ccc}
a & b & c \\
x & y & z \\
p & q & r
\end{array}
\right]
= 3 \func{det} \left[ \begin{array}{ccc}
a & b & c \\
p & q & r \\
x & y & z
\end{array}
\right]
= 3 \cdot 6 = 18
\end{equation*}
The determinant of a matrix is a sum of products of its entries. In particular, if these entries are polynomials in $x$, then the determinant itself is a polynomial in $x$. It is often of interest to determine which values of $x$ make the determinant zero, so it is very useful if the determinant is given in factored form. Theorem 3.1.2 can help.
Example 3.1.7
Find the values of $x$ for which $\func{det } A = 0$, where
$A = \left[ \begin{array}{ccc}
1 & x & x \\
x & 1 & x \\
x & x & 1
\end{array}
\right]$.
Solution:
To evaluate $\func{det } A$, first subtract $x$ times row 1 from rows 2 and 3.
\begin{equation*}
\func{det } A = \left| \begin{array}{ccc}
1 & x & x \\
x & 1 & x \\
x & x & 1
\end{array} \right|
=
\left| \begin{array}{ccc}
1 & x & x \\
0 & 1-x^2 & x-x^2 \\
0 & x-x^2 & 1-x^2
\end{array} \right|
=
\left| \begin{array}{cc}
1-x^2 & x-x^2 \\
x-x^2 & 1-x^2
\end{array} \right|
\end{equation*}
At this stage we could simply evaluate the determinant (the result is $2x^3-3x^2+1$). But then we would have to factor this polynomial to find the values of $x$ that make it zero. However, this factorization can be obtained directly by first factoring each entry in the determinant and taking a common
factor of $(1-x)$ from each row.
\begin{align*}
\func{det } A = \left| \begin{array}{cc}
(1-x)(1+x) & x(1-x) \\
x(1-x) & (1-x)(1+x)
\end{array}
\right| &=
(1-x)^2 \left| \begin{array}{cc}
1+x & x \\
x & 1+x
\end{array} \right| \\
&= (1-x)^2(2x+1)
\end{align*}
Hence, $\func{det } A = 0$ means $(1 - x)^2(2x + 1) = 0$, that is $x = 1$ or $x = -\frac{1}{2}$.
Example 3.1.8
Show that
\begin{equation*}
\func{det}\left[ \begin{array}{ccc}
1 & a_1 & a_1^2 \\
1 & a_2 & a_2^2 \\
1 & a_3 & a_3^2
\end{array}
\right] = (a_3-a_1)(a_3-a_2)(a_2-a_1)
\end{equation*}
Solution:
Begin by subtracting row 1 from rows 2 and 3, and then expand along column 1:
\begin{equation*}
\func{det} \left[ \begin{array}{ccc}
1 & a_1 & a_1^2 \\
1 & a_2 & a_2^2 \\
1 & a_3 & a_3^2
\end{array}
\right] = \func{det} \left[ \begin{array}{ccc}
1 & a_1 & a_1^2 \\
0 & a_2-a_1 & a_2^2-a_1^2 \\
0 & a_3-a_1 & a_3^2-a_1^2
\end{array}
\right]
= \func{det} \left[ \begin{array}{cc}
a_2-a_1 & a_2^2-a_1^2 \\
a_3-a_1 & a_3^2-a_1^2
\end{array}
\right]
\end{equation*}
Now $(a_2 - a_1)$ and $(a_3 - a_1)$ are common factors in rows 1 and 2, respectively, so
\begin{align*}
\func{det} \left[ \begin{array}{ccc}
1 & a_1 & a_1^2 \\
1 & a_2 & a_2^2 \\
1 & a_3 & a_3^2
\end{array}
\right] &= (a_2-a_1)(a_3-a_1)\func{det} \left[ \begin{array}{cc}
1& a_2+a_1 \\
1 & a_3+a_1
\end{array} \right] \\
&= (a_2-a_1)(a_3-a_1)(a_3-a_2)
\end{align*}
The matrix in Example 3.1.8 is called a Vandermonde matrix, and the formula for its determinant can be generalized to the $n \times n$ case.
If $A$ is an $n \times n$ matrix, forming $uA$ means multiplying $\textit{every}$ row of $A$ by $u$. Applying property 3 of Theorem 3.1.2, we can take the common factor $u$ out of each row and so obtain the following useful result.
Theorem 3.1.3
If $A$ is an $n \times n$ matrix, then $\func{det}(uA) = u^n \func{det } A$ for any number $u$.
The next example displays a type of matrix whose determinant is easy to compute.
Example 3.1.9
Evaluate $\func{det } A$ where
$A = \left[ \begin{array}{rrrr}
a & 0 & 0 & 0 \\
u & b & 0 & 0 \\
v & w & c & 0 \\
x & y & z & d
\end{array} \right]$.
Solution:
Expand along row 1 to get $\func{det } A = a \left| \begin{array}{rrr}
b & 0 & 0 \\
w & c & 0 \\
y & z & d
\end{array} \right|$. Now expand this along the top row to get $\func{det } A = ab \left| \begin{array}{cc}
c & 0 \\
z & d
\end{array} \right| = abcd$, the product of the main diagonal entries.
A square matrix is called a $\textbf{lower triangular matrix}$ if all entries above the main diagonal are zero (as in Example 3.1.9). Similarly, an $\textbf{upper triangular matrix}$ is one for which all entries below the main diagonal are zero. A $\textbf{triangular matrix}$ is one that is either upper or lower triangular. Theorem 3.1.4 gives an easy rule for calculating the determinant of any triangular matrix.
Theorem 3.1.4
If $A$ is a square triangular matrix, then $\func{det } A$ is the product of the entries on the main diagonal.
Theorem 3.1.4 is useful in computer calculations because it is a routine matter to carry a matrix to triangular form using row operations.
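To illustrate, here is one way such a computation might be organized; this is a sketch in plain Python (our own illustration, not an algorithm prescribed by the text, and it ignores the rounding issues a serious implementation must face). It reduces the matrix to upper triangular form, flipping the sign at each row interchange (Theorem 3.1.2), and then multiplies the diagonal entries (Theorem 3.1.4).
\begin{verbatim}
def det_by_reduction(A):
    """Determinant via reduction to upper triangular form."""
    A = [row[:] for row in A]          # work on a copy
    n, sign, det = len(A), 1, 1.0
    for k in range(n):
        # Find a nonzero pivot in column k at or below row k.
        p = next((i for i in range(k, n) if A[i][k] != 0), None)
        if p is None:
            return 0.0                 # a column of zeros: det = 0
        if p != k:
            A[k], A[p] = A[p], A[k]    # interchange rows: sign changes
            sign = -sign
        for i in range(k + 1, n):      # clear entries below the pivot;
            m = A[i][k] / A[k][k]      # this leaves the determinant unchanged
            A[i] = [A[i][j] - m * A[k][j] for j in range(n)]
        det *= A[k][k]                 # product of the diagonal entries
    return sign * det

print(round(det_by_reduction([[3, 4, 5], [1, 7, 2], [9, 8, -6]])))   # -353
\end{verbatim}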
3.2 Determinants and Matrix Inverses
In this section, several theorems about determinants are derived. One consequence of these theorems is that a square matrix $A$ is invertible if and only if $\func{det } A \neq 0$. Moreover, determinants are used to give a formula for $A^{-1}$ which, in turn, yields a formula (called Cramer’s rule) for the
solution of any system of linear equations with an invertible coefficient matrix.
We begin with a remarkable theorem (due to Cauchy in 1812) about the determinant of a product of matrices.
Theorem 3.2.1 Product Theorem
If $A$ and $B$ are $n \times n$ matrices, then $\func{det}(AB) = \func{det } A \func{det } B$.
The complexity of matrix multiplication makes the product theorem quite unexpected. Here is an example where it reveals an important numerical identity.
Example 3.2.1
If $A = \left[ \begin{array}{rr}
a & b \\
-b & a
\end{array} \right]$ and $B = \left[ \begin{array}{rr}
c & d \\
-d & c
\end{array} \right]$
then $AB = \left[ \begin{array}{cc}
ac-bd & ad+bc \\
-(ad+bc) & ac-bd
\end{array} \right]$.
Hence $\func{det } A \func{det } B = \func{det}(AB)$ gives the identity
\begin{equation*}
(a^2 + b^2)(c^2+d^2) = (ac-bd)^2 + (ad+bc)^2
\end{equation*}
Theorem 3.2.1 extends easily to $\func{det}(ABC) = \func{det } A \func{det } B \func{det } C$. In fact, induction gives
\begin{equation*}
\func{det}(A_1A_2 \cdots A_{k-1}A_k) = \func{det } A_1 \func{det } A_2 \cdots \func{det } A_{k-1} \func{det } A_k
\end{equation*}
for any square matrices $A_1, \dots, A_k$ of the same size. In particular, if each $A_i = A$, we obtain
\begin{equation*}
\func{det} (A^k) = (\func{det } A)^k, \mbox{ for any } k\geq 1
\end{equation*}
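These identities are easy to spot-check numerically. The brief sketch below (an aside in Python, assuming the NumPy library) compares $\func{det}(AB)$ with $\func{det } A \func{det } B$, and $\func{det}(A^3)$ with $(\func{det } A)^3$, for a pair of small matrices chosen arbitrarily.
\begin{verbatim}
import numpy as np

A = np.array([[2., 1.], [3., 4.]])     # det A = 5
B = np.array([[1., -2.], [0., 3.]])    # det B = 3 (upper triangular)

print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))       # True
print(np.isclose(np.linalg.det(np.linalg.matrix_power(A, 3)),
                 np.linalg.det(A) ** 3))                      # True
\end{verbatim}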
We can now give the invertibility condition.
Theorem 3.2.2
An $n \times n$ matrix $A$ is invertible if and only if $\func{det } A \neq 0$. When this is the case, $\func{det} (A^{-1}) = \frac{1}{\func{det } A}$.
Proof:
If $A$ is invertible, then $AA^{-1}=I$; so the product theorem gives
\begin{equation*}
1 = \func{det } I = \func{det} (AA^{-1}) = \func{det } A \func{det } A^{-1}
\end{equation*}
Hence, $\func{det } A \neq 0$ and also $\func{det } A^{-1} = \frac{1}{\func{det } A}$.
Conversely, if $\func{det } A \neq 0$, we show that $A$ can be carried to $I$ by elementary row operations (and invoke Theorem 2.4.5). Certainly, $A$ can be carried to its reduced row-echelon form $R$, so $R = E_k \cdots E_2E_1A$ where the $E_i$ are elementary matrices (Theorem 2.5.1). Hence the product theorem gives
\begin{equation*}
\func{det } R = \func{det } E_k \cdots \func{det } E_2 \func{det } E_1 \func{det } A
\end{equation*}
Since $\func{det } E \neq 0$ for all elementary matrices $E$, this shows $\func{det } R \neq 0$. In particular, $R$ has no row of zeros, so $R = I$ because $R$ is square and reduced row-echelon. This is what we wanted.
Example 3.2.2
For which values of $c$ does $A = \left[ \begin{array}{rcr}
1 & 0 & -c \\
-1 & 3 & 1 \\
0 & 2c & -4
\end{array} \right]$
have an inverse?
Solution:
Compute $\func{det } A$ by first adding $c$ times column 1 to column 3 and then expanding along row 1.
\begin{equation*}
\func{det } A = \func{det} \left[ \begin{array}{rcr}
1 & 0 & -c \\
-1 & 3 & 1 \\
0 & 2c & -4
\end{array} \right] = \func{det} \left[ \begin{array}{rcc}
1 & 0 & 0 \\
-1 & 3 & 1-c \\
0 & 2c & -4
\end{array} \right] = 2(c+2)(c-3)
\end{equation*}
Hence, $\func{det } A = 0$ if $c = -2$ or $c = 3$, and $A$ has an inverse if $c \neq -2$ and $c \neq 3$.
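A computer algebra system can carry out the same calculation symbolically. The sketch below (an aside, assuming the SymPy library is available) expands $\func{det } A$ as a polynomial in $c$, factors it, and solves for the values of $c$ that make it zero.
\begin{verbatim}
from sympy import Matrix, symbols, factor, solve

c = symbols('c')
A = Matrix([[ 1,   0, -c],
            [-1,   3,  1],
            [ 0, 2*c, -4]])

d = factor(A.det())
print(d)            # 2(c + 2)(c - 3), up to the order of the factors
print(solve(d, c))  # the roots -2 and 3
\end{verbatim}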
Example 3.2.3
If the product $A_1A_2 \cdots A_k$ of square matrices of the same size is invertible, show that each $A_i$ is invertible.
Solution:
We have $\func{det }A_1\func{det }A_2\cdots \func{det }A_k = \func{det}(A_1A_2\cdots A_k)$ by the product theorem, and $\func{det}(A_1A_2\cdots A_k) \neq 0$ by Theorem 3.2.2 because $A_1A_2\cdots A_k$ is invertible. Hence
\begin{equation*}
\func{det } A_1 \func{det } A_2 \cdots \func{det } A_k \neq 0
\end{equation*}
so $\func{det } A_i \neq 0$ for each $i$. This shows that each $A_i$ is invertible, again by Theorem 3.2.2.
Theorem 3.2.3
If $A$ is any square matrix, then $\func{det } A^T = \func{det } A$.
Proof:
Consider first the case of an elementary matrix $E$. If $E$ is of type I or II, then $E^T = E$; so certainly $\func{det } E^T = \func{det } E$. If $E$ is of type III, then $E^T$ is also of type III; so $\func{det } E^T = 1 = \func{det } E$ by Theorem 3.1.2. Hence, $\func{det } E^T = \func{det } E$ for every elementary matrix $E$.
Now let $A$ be any square matrix. If $A$ is not invertible, then neither is $A^T$; so $\func{det } A^T = 0 = \func{det } A$ by Theorem 3.1.2. On the other hand, if $A$ is invertible, then $A = E_k \cdots E_2E_1$, where the $E_i$ are elementary matrices (Theorem 2.5.2). Hence, $A^T = E^T_1E^T_2 \cdots E^T_k$ so the product theorem gives
\begin{align*}
\func{det } A^T = \func{det } E^T_1 \func{det } E^T_2 \cdots \func{det } E^T_k & = \func{det } E_1 \func{det } E_2 \cdots \func{det } E_k \\
&= \func{det } E_k \cdots \func{det } E_2 \func{det } E_1 \\
&= \func{det } A
\end{align*}
This completes the proof.
Example 3.2.4
If $A$ and $B$ are square matrices of the same size with $\func{det } A = 2$ and $\func{det } B = 5$, compute $\func{det}(A^3 B^{-1}A^TB^2)$.
Solution:
We use several of the facts just derived.
\begin{align*}
\func{det}(A^3 B^{-1}A^TB^2) &= \func{det} (A^3) \func{det}(B^{-1}) \func{det}(A^T) \func{det} (B^2)\\
&= (\func{det } A)^3 \frac{1}{\func{det } B} \func{det } A (\func{det } B)^2 \\
&= 2^3 \cdot \frac{1}{5} \cdot 2 \cdot 5^2 \\
&= 80
\end{align*}
Example 3.2.5
A square matrix $A$ is called $\textbf{orthogonal}$ if $AA^T = I$. Show that $\func{det } A = \pm 1$ for every orthogonal matrix $A$.
Solution:
If $A$ is orthogonal, we have $I = AA^T$. Take determinants to obtain
\begin{equation*}
1 = \func{det } I = \func{det}(AA^T) = \func{det } A \func{det } A^T = (\func{det } A)^2
\end{equation*}
Since $\func{det } A$ is a number, this means $\func{det } A = \pm 1$.
Adjugates
In Section 2.4 we defined the adjugate of a $2 \times 2$ matrix $A = \left[ \begin{array}{rr}
a & b \\
c & d
\end{array} \right]$
to be $\func{adj} (A) = \left[ \begin{array}{rr}
d & -b \\
-c & a
\end{array} \right]$.
Then we verified that $A (\func{adj } A) = (\func{det } A)I = (\func{adj } A)A$ and hence that, if $\func{det} A \neq 0$, $A^{-1} = \frac{1}{\func{det } A} \func{adj } A$. We are now able to define the adjugate of an arbitrary square matrix and to show that this formula for the inverse remains valid (when the
inverse exists).
Recall that the $(i, j)$-cofactor $c_{ij}(A)$ of a square matrix $A$ is a number defined for each position $(i, j)$ in the matrix. If $A$ is a square matrix, the $\textbf{cofactor matrix of}$ $A$ is defined to be the matrix $\left[ c_{ij}(A)\right]$ whose $(i, j)$-entry is the $(i, j)$-cofactor of $A$.
Definition 3.3 Adjugate of a Matrix
The $\textbf{adjugate}$ of $A$, denoted $\func{adj}(A)$, is the transpose of the cofactor matrix; in symbols,
\begin{equation*}
\func{adj}(A) = \left[ c_{ij}(A) \right]^T
\end{equation*}
Example 3.2.6
Determine the adjugate of $A = \left[ \begin{array}{rrr}
1 & 3 & -2 \\
0 & 1 & 5 \\
-2 & -6 & 7
\end{array}\right]$
and calculate $A (\func{adj } A)$ and $(\func{adj } A)A$.
Solution:
We first find the cofactor matrix.
\begin{align*}
\left[ \begin{array}{rrr}
c_{11}(A) & c_{12}(A) & c_{13}(A) \\
c_{21}(A) & c_{22}(A) & c_{23}(A) \\
c_{31}(A) & c_{32}(A) & c_{33}(A)
\end{array}\right]
&= \left[ \begin{array}{ccc}
\left| \begin{array}{rr}
1 & 5 \\
-6 & 7
\end{array}\right| & -\left| \begin{array}{rr}
0 & 5 \\
-2 & 7
\end{array}\right| &
\left| \begin{array}{rr}
0 & 1 \\
-2 & -6
\end{array}\right| \\
& & \\
-\left| \begin{array}{rr}
3 & -2 \\
-6 & 7
\end{array}\right| & \left| \begin{array}{rr}
1 & -2 \\
-2 & 7
\end{array}\right| &
-\left| \begin{array}{rr}
1 & 3 \\
-2 & -6
\end{array}\right| \\
& & \\
\left| \begin{array}{rr}
3 & -2 \\
1 & 5
\end{array}\right| & -\left| \begin{array}{rr}
1 & -2 \\
0 & 5
\end{array}\right| &
\left| \begin{array}{rr}
1 & 3 \\
0 & 1
\end{array}\right|
\end{array}\right] \\
&= \left[ \begin{array}{rrr}
37 & -10 & 2 \\
-9 & 3 & 0 \\
17 & -5 & 1
\end{array}\right]
\end{align*}
Then the adjugate of $A$ is the transpose of this cofactor matrix.
\begin{equation*}
\func{adj } A = \left[ \begin{array}{rrr}
37 & -10 & 2 \\
-9 & 3 & 0 \\
17 & -5 & 1
\end{array}\right] ^T = \left[ \begin{array}{rrr}
37 & -9 & 17 \\
-10 & 3 & -5 \\
2 & 0 & 1
\end{array}\right]
\end{equation*}
The computation of $A (\func{adj} A)$ gives
\begin{equation*}
A(\func{adj } A) = \left[ \begin{array}{rrr}
1 & 3 & -2 \\
0 & 1 & 5 \\
-2 & -6 & 7
\end{array}\right] \left[ \begin{array}{rrr}
37 & -9 & 17 \\
-10 & 3 & -5 \\
2 & 0 & 1
\end{array}\right] = \left[ \begin{array}{rrr}
3 & 0 & 0 \\
0 & 3 & 0 \\
0 & 0 & 3
\end{array}\right] = 3I
\end{equation*}
and the reader can verify that also $(\func{adj } A)A = 3I$. Hence, analogy with the $2 \times 2$ case would indicate that $\func{det } A = 3$; this is, in fact, the case.
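Readers who want to check such computations by machine can use a sketch like the one below (an aside in Python, assuming the NumPy library; the function name adjugate is our own). It forms the cofactor matrix, transposes it as in Definition 3.3, and confirms that $A(\func{adj } A) = 3I$ for the matrix above.
\begin{verbatim}
import numpy as np

def adjugate(A):
    """Transpose of the cofactor matrix (Definition 3.3)."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[ 1.,  3., -2.],
              [ 0.,  1.,  5.],
              [-2., -6.,  7.]])

print(np.round(adjugate(A)))          # the adjugate computed above
print(np.round(A @ adjugate(A)))      # 3 I, so det A = 3
\end{verbatim}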
The relationship $A(\func{adj } A) = (\func{det }A)I$ holds for any square matrix $A$.
Theorem 3.2.4 Adjugate Formula
If $A$ is any square matrix, then
\begin{equation*}
A(\func{adj } A) = (\func{det } A)I = (\func{adj } A)A
\end{equation*}
In particular, if $\func{det } A \neq 0$, the inverse of $A$ is given by
\begin{equation*}
A^{-1} =\frac{1}{\func{det } A}\func{adj } A
\end{equation*}
It is important to note that this theorem is $\textit{not}$ an efficient way to find the inverse of the matrix $A$. For example, if $A$ were $10 \times 10$, the calculation of $\func{adj } A$ would require computing $10^{2} = 100$ determinants of $9 \times 9$ matrices! On the other hand, the matrix inversion algorithm would find $A^{-1}$ with about the same effort as finding $\func{det } A$. Clearly, Theorem 3.2.4 is not a $\textit{practical}$ result: its virtue is that it gives a formula for $A^{-1}$ that is useful for $\textit{theoretical}$ purposes.
Example 3.2.7
Find the $(2, 3)$-entry of $A^{-1}$ if $A = \left[ \begin{array}{rrr}
2 & 1 & 3 \\
5 & -7 & 1 \\
3 & 0 & -6
\end{array}\right]$.
Solution:
First compute
\begin{equation*}
\func{det } A = \left| \begin{array}{rrr}
2 & 1 & 3 \\
5 & -7 & 1 \\
3 & 0 & -6
\end{array} \right| = \left| \begin{array}{rrr}
2 & 1 & 7 \\
5 & -7 & 11 \\
3 & 0 & 0
\end{array} \right| =
3 \left| \begin{array}{rr}
1 & 7 \\
-7 & 11
\end{array}\right| = 180
\end{equation*}
Since $A^{-1} = \frac{1}{\func{det } A}\func{adj } A = \frac{1}{180} \left[ c_{ij}(A)\right] ^T$,
the $(2, 3)$-entry of $A^{-1}$ is the $(3, 2)$-entry of the matrix $\frac{1}{180}\left[ c_{ij}(A) \right]$; that is, it equals
$\frac{1}{180} c_{32} (A) = \frac{1}{180} \left( - \left| \begin{array}{rr}
2 & 3 \\
5 & 1
\end{array}\right| \right) = \frac{13}{180}.$
Example 3.2.8
If $A$ is an $n \times n$ matrix, show that $\func{det}(\func{adj } A) = (\func{det } A)^{n-1}$.
Solution:
Write $d = \func{det } A$; we must show that $\func{det}(\func{adj }A) = d^{n-1}$. We have $A(\func{adj }A) = dI$ by Theorem 3.2.4, so taking determinants gives $d\func{det}(\func{adj }A) = d^{n}$. Hence we are done if $d \neq 0$. Assume $d = 0$; we must show that $\func{det}(\func{adj }A) = 0$, that is, $\func{adj }A$ is not invertible. If $A \neq 0$, this follows from $A(\func{adj }A) = dI = 0$; if $A = 0$, it follows because then $\func{adj }A = 0$.
Cramer’s Rule
Theorem 3.2.4 has a nice application to linear equations. Suppose
\begin{equation*}
A\vec{x}=\vec{b}
\end{equation*}
is a system of $n$ equations in $n$ variables $x_{1}, x_{2}, \dots , x_{n}$. Here $A$ is the $n \times n$ coefficient matrix and $\vec{x}$ and $\vec{b}$ are the columns
\begin{equation*}
\vec{x} = \left[ \begin{array}{c}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{array} \right] \mbox{ and }
\vec{b} = \left[ \begin{array}{c}
b_1 \\
b_2 \\
\vdots \\
b_n
\end{array} \right]
\end{equation*}
of variables and constants, respectively. If $\func{det }A \neq 0$, we left multiply by $A^{-1}$ to obtain the solution $\vec{x} = A^{-1}\vec{b}$. When we use the adjugate formula, this becomes
\begin{align*}
\left[ \begin{array}{c}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{array} \right] &= \frac{1}{\func{det } A} (\func{adj } A)\vec{b} \\
&= \frac{1}{\func{det } A}
\left[ \begin{array}{cccc}
c_{11}(A) & c_{21}(A) & \cdots & c_{n1}(A) \\
c_{12}(A) & c_{22}(A) & \cdots & c_{n2}(A) \\
\vdots & \vdots & & \vdots \\
c_{1n}(A) & c_{2n}(A) & \cdots & c_{nn}(A)
\end{array}\right]
\left[ \begin{array}{c}
b_1 \\
b_2 \\
\vdots \\
b_n
\end{array} \right]
\end{align*}
Hence, the variables $x_{1}, x_{2}, \dots, x_{n}$ are given by
\begin{align*}
x_1 &= \frac{1}{\func{det } A} \left[ b_1c_{11}(A) + b_2c_{21}(A) + \cdots + b_nc_{n1}(A)\right]\\
x_2 &= \frac{1}{\func{det } A} \left[ b_1c_{12}(A) + b_2c_{22}(A) + \cdots + b_nc_{n2}(A)\right] \\
& \hspace{5em} \vdots \hspace{5em} \vdots\\
x_n &= \frac{1}{\func{det } A} \left[ b_1c_{1n}(A) + b_2c_{2n}(A) + \cdots + b_nc_{nn}(A)\right]
\end{align*}
Now the quantity $b_{1}c_{11}(A) + b_{2}c_{21}(A) + \cdots + b_{n}c_{n1}(A)$ occurring in the formula for $x_{1}$ looks like the cofactor expansion of the determinant of a matrix. The cofactors involved are $c_{11}(A), c_{21}(A), \dots, c_{n1}(A)$, corresponding to the first column of $A$. If $A_{1}$ is obtained from $A$ by replacing the first column of $A$ by $\vec{b}$, then $c_{i1}(A_{1}) = c_{i1}(A)$ for each $i$ because column $1$ is deleted when computing them. Hence, expanding $\func{det}(A_{1})$ by the first column gives
\begin{align*}
\func{det } A_1 &= b_1c_{11}(A_1) + b_2c_{21}(A_1) + \cdots + b_nc_{n1}(A_1) \\
&= b_1c_{11}(A) + b_2c_{21}(A) + \cdots + b_nc_{n1}(A)\\
&= (\func{det } A)x_1
\end{align*}
Hence, $x_1 = \frac{\func{det } A_1}{\func{det } A}$ and similar results hold for the other variables.
Theorem 3.2.5 Cramer’s Rule
If $A$ is an invertible $n \times n$ matrix, the solution to the system
\begin{equation*}
A\vec{x}=\vec{b}
\end{equation*}
of $n$ equations in the variables $x_{1}, x_{2}, \dots, x_{n}$ is given by
\begin{equation*}
x_1 = \frac{\func{det } A_1}{\func{det } A}, \; x_2 = \frac{\func{det } A_2}{\func{det } A}, \;\cdots, \; x_n = \frac{\func{det } A_n}{\func{det } A}
\end{equation*}
where, for each $k$, $A_k$ is the matrix obtained from $A$ by replacing column $k$ by $\vec{b}$.
Example 3.2.9
Find $x_1$, given the following system of equations:
\begin{equation*}
\arraycolsep=1pt
\begin{array}{rrrrrrr}
5x_1 & + & x_2 & – & x_3 & = & 4\\
9x_1& + & x_2 & – & x_3 & = & 1 \\
x_1& – & x_2 & + & 5x_3 & = & 2 \\
\end{array}
\end{equation*}
Solution:
Compute the determinants of the coefficient matrix $A$ and the matrix $A_{1}$ obtained from it by replacing the first column by the column of constants.
\begin{align*}
\func{det } A &= \func{det} \left[ \begin{array}{rrr}
5 & 1 & -1 \\
9 & 1 & -1 \\
1 & -1 & 5
\end{array}\right] = -16 \\
\func{det } A_1 &= \func{det} \left[ \begin{array}{rrr}
4 & 1 & -1 \\
1 & 1 & -1 \\
2 & -1 & 5
\end{array}\right] = 12
\end{align*}
Hence, $x_1 = \frac{\func{det } A_1}{\func{det } A} = -\frac{3}{4}$ by Cramer’s rule.
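Cramer's rule is also easy to state as a program. The sketch below (an aside in Python, assuming the NumPy library; the function name cramer is our own) replaces each column of $A$ by $\vec{b}$ in turn and reproduces $x_1 = -\frac{3}{4}$ for the system above; the last line obtains the same solution by gaussian elimination for comparison.
\begin{verbatim}
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule (assumes det A is nonzero)."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for k in range(len(b)):
        Ak = A.copy()
        Ak[:, k] = b                 # replace column k by the constants
        x[k] = np.linalg.det(Ak) / d
    return x

A = np.array([[5.,  1., -1.],
              [9.,  1., -1.],
              [1., -1.,  5.]])
b = np.array([4., 1., 2.])

print(cramer(A, b))                  # x1 = -0.75, then x2 and x3
print(np.linalg.solve(A, b))         # the same solution, by elimination
\end{verbatim}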
Cramer’s rule is $ \textit{not}$ an efficient way to solve linear systems or invert matrices. True, it enabled us to calculate $x_{1}$ here without computing $x_{2}$ or $x_{3}$. Although this might seem an advantage, the truth of the matter is that, for large systems of equations, the number of computations needed to find $\textit{all}$ the variables by the gaussian algorithm is comparable to the number required to find $\textit{one}$ of the determinants involved in Cramer’s rule. Furthermore, the algorithm works when the matrix of the system is not invertible and even when the coefficient matrix is not square. Like the adjugate formula, then, Cramer’s rule is $\textit{not}$ a practical numerical technique; its virtue is theoretical.
3.3 Diagonalization and Eigenvalues
The world is filled with examples of systems that evolve in time—the weather in a region, the economy of a nation, the diversity of an ecosystem, etc. Describing such systems is difficult in general and various methods have been developed in special cases. In this section we describe one such method, called $\textit{diagonalization}$, which is one of the most important techniques in linear algebra. A very fertile example of this procedure is in modelling the growth of the population of an animal species. This has attracted more attention in recent years with the ever-increasing awareness that many species are endangered. To motivate the technique, we begin by setting up a simple model of a bird population in which we make assumptions about survival and reproduction rates.
Example 3.3.1
Consider the evolution of the population of a species of birds. Because the numbers of males and females are nearly equal, we count only females. We assume that each female remains a juvenile for one year and then becomes an adult, and that only adults have offspring. We make three assumptions about reproduction and survival rates:
- The number of juvenile females hatched in any year is twice the number of adult females alive the year before (we say the $\textbf{reproduction rate}$ is 2).
- Half of the adult females in any year survive to the next year (the $\textbf{adult survival rate}$ is $\frac{1}{2}$).
- One-quarter of the juvenile females in any year survive into adulthood (the $\textbf{juvenile survival rate}$ is $\frac{1}{4}$).
If there were 100 adult females and 40 juvenile females alive initially, compute the population of females $k$ years later.
Solution:
Let $a_{k}$ and $j_{k}$ denote, respectively, the number of adult and juvenile females after $k$ years, so that the total female population is the sum $a_{k} + j_{k}$. Assumption 1 shows that $j_{k+1} = 2a_{k}$, while assumptions 2 and 3 show that $a_{k+1} = \frac{1}{2}a_{k} + \frac{1}{4}j_{k}$. Hence the numbers $a_{k}$ and $j_{k}$ in successive years are related by the following equations:
\begin{align*}
a_{k+1} & = \frac{1}{2} a_k + \frac{1}{4} j_k \\
j_{k+1} & = 2 a_k
\end{align*}
If we write $\vec{v}_k = \left[ \begin{array}{c}
a_k \\
j_k
\end{array}\right]$
and $A = \left[ \begin{array}{rr}
\frac{1}{2} & \frac{1}{4} \\
2 & 0 \end{array}\right]$
these equations take the matrix form
\begin{equation*}
\vec{v}_{k+1} = A\vec{v}_k, \mbox{ for each } k = 0,1,2,\dots
\end{equation*}
Taking $k = 0$ gives $\vec{v}_{1} = A\vec{v}_{0}$, then taking $k = 1$ gives $\vec{v}_{2} = A\vec{v}_{1} = A^{2}\vec{v}_{0}$, and taking $k = 2$ gives $\vec{v}_{3} = A\vec{v}_{2} = A^{3}\vec{v}_{0}$. Continuing in this way, we get
\begin{equation*}
\vec{v}_k = A^k \vec{v}_0, \mbox{ for each } k=0,1,2, \dots
\end{equation*}
Since $\vec{v}_0 = \left[ \begin{array}{c}
a_0 \\
j_0
\end{array}\right] = \left[ \begin{array}{c}
100 \\
40
\end{array}\right] $
is known, finding the population profile $\vec{v}_{k}$ amounts to computing $A^{k}$ for all $k \geq 0$. We will complete this calculation in Example 3.3.12 after some new techniques have been developed.
Let $A$ be a fixed $n \times n$ matrix. A sequence $\vec{v}_{0}, \vec{v}_{1}, \vec{v}_{2}, \dots$ of column vectors in $\RR^n$ is called a $\textbf{linear dynamical system}$ if $\vec{v}_{0}$ is known and the other $\vec{v}_{k}$ are determined (as in Example 3.3.1) by the conditions
\begin{equation*}
\vec{v}_{k+1} = A\vec{v}_k \mbox{ for each } k = 0, 1, 2, \dots
\end{equation*}
These conditions are called a $\textbf{matrix recurrence}$ for the vectors $\vec{v}_{k}$. As in Example 3.3.1, they imply that
\begin{equation*}
\vec{v}_k = A^k \vec{v}_0 \mbox{ for all }k \ge 0
\end{equation*}
so finding the columns $\vec{v}_{k}$ amounts to calculating $A^{k}$ for $k \geq 0$.
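For any one value of $k$ the recurrence can, of course, simply be iterated. The sketch below (an aside in Python, assuming the NumPy library) does this for the bird population of Example 3.3.1 over the first five years; the diagonalization method developed next gives a formula for $A^k$ instead, and hence for the long-term behaviour.
\begin{verbatim}
import numpy as np

A = np.array([[0.5, 0.25],
              [2.0, 0.0 ]])
v = np.array([100.0, 40.0])          # [adults, juveniles] in year 0

for k in range(1, 6):
    v = A @ v                        # v_{k+1} = A v_k
    print(k, v, "total:", v.sum())
\end{verbatim}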
Direct computation of the powers $A^{k}$ of a square matrix $A$ can be time-consuming, so we adopt an indirect method that is commonly used. The idea is to first $\textbf{diagonalize}$ the matrix $A$, that is, to find an invertible matrix $P$ such that
\begin{equation*} \tag{3.8}
P^{-1}AP=D \mbox{ is a diagonal matrix}
\end{equation*}
This works because the powers $D^{k}$ of the diagonal matrix $D$ are easy to compute, and Equation (3.8) enables us to compute powers $A^{k}$ of the matrix $A$ in terms of powers $D^{k}$ of $D$. Indeed, we can solve Equation (3.8) for $A$ to get $A = PDP^{-1}$. Squaring this gives
\begin{equation*}
A^2 = (PDP^{-1})(PDP^{-1}) = PD^2P^{-1}
\end{equation*}
Using this we can compute $A^{3}$ as follows:
\begin{equation*}
A^3 = AA^2 = (PDP^{-1})(PD^2P^{-1}) = PD^3P^{-1}
\end{equation*}
Continuing in this way we obtain Theorem 3.3.1 (even if $D$ is not diagonal).
Theorem 3.3.1
If $A = PDP^{-1}$, then $A^k = PD^kP^{-1}$ for each $k = 1, 2, \dots$
Hence computing $A^{k}$ comes down to finding an invertible matrix $P$ as in Equation (3.8). To do this it is necessary to first compute certain numbers (called eigenvalues) associated with the matrix $A$.
Eigenvalues and Eigenvectors
Definition 3.4 Eigenvalues and Eigenvectors of a Matrix
If $A$ is an $n \times n$ matrix, a number $\lambda$ is called an $\textbf{eigenvalue}$ of $A$ if
\begin{equation*}
A\vec{x} = \lambda \vec{x} \mbox{ for some column } \vec{x} \neq \vec{0} \mbox{ in } \RR^n
\end{equation*}
In this case, $\vec{x}$ is called an $\textbf{eigenvector}$ of $A$ corresponding to the eigenvalue $\lambda$, or a $\lambda$-$\textbf{eigenvector}$ for short.
Example 3.3.2
If $A = \left[ \begin{array}{rr}
3 & 5 \\
1 & -1
\end{array}\right]$ and $\vec{x} = \left[ \begin{array}{r}
5 \\
1
\end{array}\right]$ then $A\vec{x} = 4 \vec{x}$ so $\lambda = 4$ is an eigenvalue of $A$ with corresponding eigenvector $\vec{x}$.
The matrix $A$ in Example 3.3.2 has another eigenvalue in addition to $\lambda = 4$. To find it, we develop a general procedure for $\textit{any}$ $n \times n$ matrix $A$.
By definition a number $\lambda$ is an eigenvalue of the $n \times n$ matrix $A$ if and only if $A\vec{x} = \lambda\vec{x}$ for some column $\vec{x} \neq \vec{0}$. This is equivalent to asking that the homogeneous system
\begin{equation*}
(\lambda I - A)\vec{x} = \vec{0}
\end{equation*}
of linear equations has a nontrivial solution $\vec{x} \neq \vec{0}$. By Theorem 2.4.5 this happens if and only if the matrix $\lambda I – A$ is not invertible and this, in turn, holds if and only if the determinant of the coefficient matrix is zero:
\begin{equation*}
\func{det}(\lambda I -A)=0
\end{equation*}
This last condition prompts the following definition:
Definition 3.5 Characteristic Polynomial of a Matrix
If $A$ is an $n \times n$ matrix, the $\textbf{characteristic polynomial}$ $c_A(x)$ of $A$ is defined by
\begin{equation*}
c_A(x) = \func{det}(xI - A)
\end{equation*}
Note that $c_{A}(x)$ is indeed a polynomial in the variable $x$, and it has degree $n$ when $A$ is an $n \times n$ matrix (this is illustrated in the examples below). The above discussion shows that a number $\lambda$ is an eigenvalue of $A$ if and only if $c_{A}(\lambda) = 0$, that is if and only if $\lambda$ is a $\textbf{root}$ of the characteristic polynomial $c_{A}(x)$. We record these observations in
Theorem 3.3.2
Let $A$ be an $n \times n$ matrix.
- The eigenvalues $\lambda$ of $A$ are the roots of the characteristic polynomial $c_{A}(x)$ of $A$.
- The $\lambda$-eigenvectors $\vec{x}$ are the nonzero solutions to the homogeneous system
\begin{equation*}
(\lambda I - A)\vec{x} = \vec{0}
\end{equation*}
of linear equations with $\lambda I - A$ as coefficient matrix.
In practice, solving the equations in part 2 of Theorem 3.3.2 is a routine application of gaussian elimination, but finding the eigenvalues can be difficult, often requiring computers. For now, the examples and exercises will be constructed so that the roots of the characteristic polynomials are relatively easy to find
(usually integers). However, the reader should not be misled by this into thinking that eigenvalues are so easily obtained for the matrices that occur in practical applications!
Example 3.3.3
Find the characteristic polynomial of the matrix $A = \left[ \begin{array}{rr}
3 & 5 \\
1 & -1
\end{array}\right]$
discussed in Example 3.3.2, and then find all the eigenvalues and their eigenvectors.
Solution:
Since $xI-A = \left[ \begin{array}{rr}
x & 0 \\
0 & x
\end{array}\right] – \left[ \begin{array}{rr}
3 & 5 \\
1 & -1
\end{array}\right] =
\left[ \begin{array}{cc}
x – 3 & -5 \\
-1 & x+1
\end{array}\right]$
we get
\begin{equation*}
c_A(x) = \func{det} \left[ \begin{array}{cc}
x – 3 & -5 \\
-1 & x+1
\end{array}\right] = x^2 – 2x – 8 = (x-4)(x+2)
\end{equation*}
Hence, the roots of $c_{A}(x)$ are $\lambda_{1} = 4$ and $\lambda_{2} = -2$, so these are the eigenvalues of $A$. Note that $\lambda_{1} = 4$ was the eigenvalue mentioned in Example 3.3.2, but we have found a new one: $\lambda_{2} = -2$.
To find the eigenvectors corresponding to $\lambda_{2} = -2$, observe that in this case
\begin{equation*}
\lambda_2 I - A = \left[ \begin{array}{cc}
\lambda_2 - 3 & -5 \\
-1 & \lambda_2+1
\end{array}\right] = \left[ \begin{array}{rr}
-5 & -5 \\
-1 & -1
\end{array}\right]
\end{equation*}
so the general solution to $(\lambda_{2} I - A)\vec{x} = \vec{0}$ is $\vec{x} = t \left[ \begin{array}{r}
-1 \\
1
\end{array}\right]$
where $t$ is an arbitrary real number. Hence, the eigenvectors $\vec{x}$ corresponding to $\lambda_{2}$ are $\vec{x} = t \left[ \begin{array}{r}
-1 \\
1
\end{array}\right]$ where $t \neq 0$ is arbitrary. Similarly, $\lambda_{1} = 4$ gives rise to the eigenvectors $\vec{x} = t \left[ \begin{array}{r}
5 \\
1
\end{array}\right], t \neq 0$ which includes the observation in Example 3.3.2.
Note that a square matrix $A$ has $\textit{many}$ eigenvectors associated with any given eigenvalue $\lambda$. In fact $\textit{every}$ nonzero solution $\vec{x}$ of $(\lambda I - A)\vec{x} = \vec{0}$ is an eigenvector. Recall that these solutions are all linear combinations of certain basic solutions determined by the gaussian algorithm (see Theorem 1.3.2). Observe that any nonzero multiple of an eigenvector is again an eigenvector, and such multiples are often more convenient. Any set of nonzero multiples of the basic solutions of $(\lambda I - A)\vec{x} = \vec{0}$ will be called a set of basic eigenvectors corresponding to $\lambda$.
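Numerical packages compute eigenvalues and eigenvectors directly (by iterative methods rather than by finding the roots of $c_A(x)$). As a quick check of Example 3.3.3, the sketch below (an aside in Python, assuming the NumPy library) recovers the eigenvalues $4$ and $-2$; the eigenvectors it returns are nonzero multiples of the basic eigenvectors found above, scaled to have length 1.
\begin{verbatim}
import numpy as np

A = np.array([[3.,  5.],
              [1., -1.]])

evals, evecs = np.linalg.eig(A)
print(evals)                              # 4 and -2 (in some order)
for lam, x in zip(evals, evecs.T):
    print(np.allclose(A @ x, lam * x))    # True: A x = lambda x
\end{verbatim}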
GeoGebra Exercise: Eigenvalue and eigenvectors
https://www.geogebra.org/m/DJXTtm2k
Please answer these questions after you open the webpage:
1. Set the matrix to be\[M= \begin{bmatrix}
1 & 0 \\
0 & 2
\end{bmatrix}\]
2. Drag the point $u$ until you see that the vectors $\vec{u}$ and $M\vec{u}$ lie on the same line. Record the value of $\lambda$. How many times do you see $\vec{u}$ and $M\vec{u}$ lying on the same line as $\vec{u}$ travels around the whole circle? Why?
3. Based on your observation, what can we say about the eigenvalue and eigenvector of $M$?
4. Set the matrix to be \[M= \begin{bmatrix}
3 & 5 \\
1 & -1
\end{bmatrix}\] and repeat what you did above.
5. Check your lecture notes about the eigenvalues and eigenvectors of this matrix. Are the results consistent with what you observe?
Example 3.3.4
Find the characteristic polynomial, eigenvalues, and basic eigenvectors for
\begin{equation*}
A = \left[ \begin{array}{rrr}
2 & 0 & 0 \\
1 & 2 & -1 \\
1 & 3 & -2
\end{array}\right]
\end{equation*}
Solution:
Here the characteristic polynomial is given by
\begin{equation*}
c_A(x) = \func{det} \left[ \begin{array}{ccc}
x-2 & 0 & 0 \\
-1 & x-2 & 1 \\
-1 & -3 & x+2
\end{array}\right] = (x-2)(x-1)(x+1)
\end{equation*}
so the eigenvalues are $\lambda_{1} = 2$, $\lambda_{2} = 1$, and $\lambda_{3} = -1$. To find all eigenvectors for $\lambda_{1} = 2$, compute
\begin{equation*}
\lambda_1 I-A = \left[ \begin{array}{ccc}
\lambda_1-2 & 0 & 0 \\
-1 & \lambda_1-2 & 1 \\
-1 & -3 & \lambda_1+2
\end{array}\right] = \left[ \begin{array}{rrr}
0 & 0 & 0 \\
-1 & 0 & 1 \\
-1 & -3 & 4
\end{array}\right]
\end{equation*}
We want the (nonzero) solutions to $(\lambda_{1}I - A)\vec{x} = \vec{0}$. The augmented matrix becomes
\begin{equation*}
\left[ \begin{array}{rrr|r}
0 & 0 & 0 & 0 \\
-1 & 0 & 1 & 0 \\
-1 & -3 & 4 & 0
\end{array}\right] \rightarrow
\left[ \begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\end{equation*}
using row operations. Hence, the general solution $\vec{x}$ to $(\lambda_{1}I - A)\vec{x} = \vec{0}$ is $\vec{x} = t \left[ \begin{array}{r}
1 \\
1 \\
1
\end{array}\right]$
where $t$ is arbitrary, so we can use $\vec{x}_1 = \left[ \begin{array}{r}
1 \\
1 \\
1
\end{array}\right]$
as the basic eigenvector corresponding to $\lambda_{1} = 2$. As the reader can verify, the gaussian algorithm gives basic eigenvectors $\vec{x}_2 = \left[ \begin{array}{r}
0 \\
1 \\
1
\end{array}\right]$
and $\vec{x}_3 = \left[ \begin{array}{r}
0 \\
\frac{1}{3} \\
1
\end{array}\right]$
corresponding to $\lambda_{2} = 1$ and $\lambda_{3} = -1$, respectively. Note that to eliminate fractions, we could instead use $3\vec{x}_3 = \left[ \begin{array}{r}
0 \\
1 \\
3
\end{array}\right]$
as the basic $\lambda_{3}$-eigenvector.
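The eigenvectors computed above are exactly what is needed to build the matrix $P$ of Equation (3.8): taking them as the columns of $P$, with the corresponding eigenvalues on the diagonal of $D$, gives $P^{-1}AP = D$ (the reader can verify this by hand). The sketch below (an aside in Python, assuming the NumPy library, and anticipating that construction) does this for the matrix of Example 3.3.4 and then uses Theorem 3.3.1 to compute $A^5$ as $PD^5P^{-1}$.
\begin{verbatim}
import numpy as np

A = np.array([[2., 0.,  0.],
              [1., 2., -1.],
              [1., 3., -2.]])

# Columns of P: the basic eigenvectors of Example 3.3.4
# (using 3*x3 = [0, 1, 3] to avoid fractions); D holds the
# corresponding eigenvalues 2, 1, -1 on its diagonal.
P = np.array([[1., 0., 0.],
              [1., 1., 1.],
              [1., 1., 3.]])
D = np.diag([2., 1., -1.])

print(np.allclose(np.linalg.inv(P) @ A @ P, D))          # True
A5 = P @ np.linalg.matrix_power(D, 5) @ np.linalg.inv(P)
print(np.allclose(A5, np.linalg.matrix_power(A, 5)))      # True
\end{verbatim}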
Example 3.3.5
If $A$ is a square matrix, show that $A$ and $A^{T}$ have the same characteristic polynomial, and hence the same eigenvalues.
Solution:
We use the fact that $xI - A^{T} = (xI - A)^{T}$. Then
\begin{equation*}
c_{A^T}(x) = \func{det} \left( xI-A^T \right) = \func{det} \left[ (xI-A)^T \right] = \func{det} (xI-A) = c_A(x)
\end{equation*}
by Theorem 3.2.3. Hence $c_{A^{T}}(x)$ and $c_{A}(x)$ have the same roots, and so $A^{T}$ and $A$ have the same eigenvalues (by Theorem 3.3.2).
The eigenvalues of a matrix need not be distinct. For example, if $A = \left[ \begin{array}{rr}
1 & 1 \\
0 & 1
\end{array}\right]$
the characteristic polynomial is $(x - 1)^2$ so the eigenvalue 1 occurs twice. Furthermore, eigenvalues are usually not computed as the roots of the characteristic polynomial. There are iterative, numerical methods that are much more efficient for large matrices.