## Transcript — Introduction

In the last lecture, we proved that Hermitian, skew-Hermitian, and unitary matrices are all unitarily diagonalizable. However, these are only a very small portion of the matrices that are unitarily diagonalizable. So, we want to derive a condition that is equivalent to unitary diagonalizability. How should we go about doing this? Instead of the guess-and-check approach we tried in the last lecture, we really should have been doing the same thing that we did in the real case: work backwards. That is, we should assume that A is unitarily diagonalizable and see what condition that puts on A.

Assume A is unitarily diagonalizable. Then there exists a unitary matrix U such that (U-star)AU = D is diagonal. Recall that the strategy for the three proofs in the last lecture was to show that because the upper triangular matrix T was similar to A, it had the same property as A. Since we are now working in reverse, our goal is to show that since A is similar to D, A must have the same property as D. But D is diagonal, and so has many, many nice properties. We need to try to pick the right one.

Thinking about the proofs in the last lecture, we observe that D(D-star) equals the diagonal matrix (lambda1 to lambda_n) times the diagonal matrix (lambda1-conjugate to lambda_n-conjugate), which is equal to the diagonal matrix with diagonal entries (the absolute value of lambda1)-squared up to (the absolute value of lambda_n)-squared. But this is just equal to (D-star)D. This is the desired property.
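Written out in symbols, the observation looks like this. The shorthand diag(...) for a diagonal matrix with the listed diagonal entries is an assumed notation, not one used in the lecture.

```latex
\[
DD^* = \operatorname{diag}(\lambda_1,\dots,\lambda_n)\,
       \operatorname{diag}(\overline{\lambda_1},\dots,\overline{\lambda_n})
     = \operatorname{diag}\bigl(|\lambda_1|^2,\dots,|\lambda_n|^2\bigr)
     = D^*D .
\]
```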

So, since (U-star)AU = D, we have that A = UD(U-star). Using this, we can show, with a little bit of effort, that A(A-star) = (A-star)A. We make the following definition. Definition: An n-by-n matrix A such that A(A-star) = (A-star)A is called normal.
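One way to fill in the "little bit of effort" is sketched below. It uses only A = UD(U-star), the fact that (U-star)U = I because U is unitary, and D(D-star) = (D-star)D.

```latex
\begin{align*}
AA^* &= (UDU^*)(UDU^*)^* = UD\,(U^*U)\,D^*U^* = U\,DD^*\,U^* \\
     &= U\,D^*D\,U^* = UD^*\,(U^*U)\,DU^* = (UDU^*)^*(UDU^*) = A^*A .
\end{align*}
```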

## The Spectral Theorem for Normal Matrices

Theorem 11.5.9, The Spectral Theorem for Normal Matrices: A matrix A in M(n-by-n)(C) is unitarily diagonalizable if and only if it is normal. Proof: We already proved that every unitarily diagonalizable matrix is normal. To prove the other direction, we, of course, use the same strategy that we were using for the proofs in the last lecture. By Schur’s Theorem, there exists a unitary matrix U such that (U-star)AU = T is upper triangular. We now prove that T has the same property as A; in particular, that T is also normal. Observe that T(T-star) is equal to (U-star)AU times (U-star)(A-star)U, which equals (U-star)A(A-star)U, since U is unitary and so U(U-star) = I. This equals (U-star)(A-star)AU since A is normal, which is (U-star)(A-star)U times (U-star)AU, which equals (T-star)T. Hence, T is also normal.

We now need to prove that every upper-triangular normal matrix is diagonal. Write out T as the matrix with first row t11, t12, up to t1n; second row 0, t22, up to t2n; all the way down to the last row 0, up to 0, tnn. We next compare the diagonal entries of T(T-star) and (T-star)T. Comparing 1,1 entries gives (the absolute value of t11)-squared plus (the absolute value of t12)-squared plus up to (the absolute value of t1n)-squared is equal to (the absolute value of t11)-squared. Since these are all non-negative quantities, this is only possible if t12 equals up to t1n equals 0. Therefore, every entry in the first row of T other than t11 is zero. Next, we compare the 2,2 entries of T(T-star) and (T-star)T. Since t12 = 0, we get (the absolute value of t22)-squared plus up to (the absolute value of t2n)-squared equals (the absolute value of t22)-squared, and hence, once again, we find that t23 = up to t2n = 0. Continuing in this way, we get that T is diagonal as required.
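In symbols, the first step of this comparison and the resulting form of T can be sketched as follows. The matrix layout is reconstructed from the spoken description, so treat it as an illustration.

```latex
\[
(TT^*)_{11} = |t_{11}|^2 + |t_{12}|^2 + \cdots + |t_{1n}|^2,
\qquad
(T^*T)_{11} = |t_{11}|^2
\quad\Longrightarrow\quad
t_{12} = \cdots = t_{1n} = 0,
\]
\[
T = \begin{bmatrix}
t_{11} & 0 & \cdots & 0 \\
0 & t_{22} & \cdots & t_{2n} \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & t_{nn}
\end{bmatrix}.
\]
```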

Example: Determine which of the following matrices are normal.

(a) The matrix A = [1, -i; i, i]. Solution: We have that A(A-star) = [2, -1 – i; -1 + i, 2], and then calculating (A-star)A, we get that its first row is [2, 1 – i]. Well, at this point, we can already see that the 1,2 entries differ, so A(A-star) is not equal to (A-star)A, and so A is not normal.

(b) B = [1, i; -i, 2]. Solution: We observe that B is Hermitian. Thus, as we proved last lecture, B is unitarily diagonalizable, and so by the Spectral Theorem for Normal Matrices, it is normal.

(c) The matrix C = [1 + i, i; -i, -1 + i]. Solution: We have C(C-star) = [3, 0; 0, 3], which is equal to (C-star)C, and so C is normal.
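The three answers can be double-checked numerically. The following is a minimal sketch in plain Python, using its built-in complex numbers. The helper names `conj_transpose`, `matmul`, and `is_normal` are made up for this illustration and are not part of the lecture.

```python
def conj_transpose(m):
    """Return M-star, the conjugate transpose of a matrix given as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def is_normal(m, tol=1e-12):
    """Check M(M-star) = (M-star)M entry by entry, up to a small tolerance."""
    ms = conj_transpose(m)
    left, right = matmul(m, ms), matmul(ms, m)
    n = len(m)
    return all(abs(left[i][j] - right[i][j]) <= tol for i in range(n) for j in range(n))


# The three matrices from the example (1j is Python's notation for i).
A = [[1, -1j], [1j, 1j]]
B = [[1, 1j], [-1j, 2]]
C = [[1 + 1j, 1j], [-1j, -1 + 1j]]

for name, m in (("A", A), ("B", B), ("C", C)):
    print(name, "is normal:", is_normal(m))
```

This should report that A is not normal and that B and C are, in agreement with the solutions above.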

## Properties of Normal Matrices

We get the following useful properties of normal matrices. Theorem 11.5.10: If A is a normal matrix, then

1. (The length of Az) = (the length of (A-star)z) for all z in C^n.
2. A – (lambda)I is normal for every complex scalar lambda.
3. If Az = (lambda)z, then (A-star)z = (lambda-conjugate)z. That is, if lambda is an eigenvalue of A with corresponding eigenvector z, then z is also an eigenvector of A-star with corresponding eigenvalue lambda-conjugate.
4. If z1 and z2 are eigenvectors of A corresponding to distinct eigenvalues lambda1 and lambda2 of A, then z1 and z2 are orthogonal.

Proof: For (1), we have (the length of (A-star)z)-squared = the inner product of (A-star)z with itself, which is equal to (((A-star)z)-transpose)(the conjugate of (A-star)z). This is equal to (z-transpose)((A-star)-transpose)(the conjugate of A-star)(the conjugate of z). This is (z-transpose)(the conjugate of A)(the conjugate of A-star)(the conjugate of z), which gives (z-transpose)(the conjugate of (A(A-star)))(the conjugate of z). We can rearrange this, since A is normal, as (z-transpose)(the conjugate of ((A-star)A))(the conjugate of z), which is (z-transpose)((A-star)-conjugated)(the conjugate of A)(the conjugate of z), which is (z-transpose)(A-transpose)(the conjugate of (Az)), which is ((Az)-transposed)(the conjugate of (Az)), which is the inner product of Az with itself, which is (the length of Az)-squared. Hmm, that went by too fast. Take a minute to look over the proof, and carefully think about and justify each step.
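A quick numerical sanity check of property (1) can help here. It is a sketch, not a substitute for the proof. It uses the normal matrix C and the non-normal matrix A from the earlier example. The test vector z = (1, 1) and the helper names are arbitrary choices made for this illustration.

```python
import math


def conj_transpose(m):
    """Return M-star, the conjugate transpose of a matrix given as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]


def matvec(m, z):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(m[i][k] * z[k] for k in range(len(z))) for i in range(len(m))]


def length(z):
    """Standard length of a complex vector: square root of the sum of |z_k|^2."""
    return math.sqrt(sum(abs(x) ** 2 for x in z))


C = [[1 + 1j, 1j], [-1j, -1 + 1j]]  # normal, from the example
A = [[1, -1j], [1j, 1j]]            # not normal, from the example
z = [1, 1]                          # assumed test vector, chosen for illustration

for name, m in (("C", C), ("A", A)):
    print(name, length(matvec(m, z)), length(matvec(conj_transpose(m), z)))
```

For C the two lengths agree, both having square 6. For A they do not, having squares 6 and 2, so the equality in property (1) really does depend on normality.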

For (2), we need to prove that A – (lambda)I is normal, and so we need to prove that (A – (lambda)I)(the conjugate transpose of (A – (lambda)I)) is equal to (the conjugate transpose of (A – (lambda)I))(A – (lambda)I). So, here we go. (A – (lambda)I)((A – (lambda)I)-star) = (A – (lambda)I)(A-star – (lambda-conjugate)I). Multiplying this out with the distributive property, this is equal to A(A-star) – (lambda)(A-star) – (lambda-conjugate)A + ((the absolute value of lambda)-squared)I. Since A is normal, A(A-star) = (A-star)A, and so we can rearrange this to give (A-star)A – (lambda-conjugate)A – (lambda)(A-star) + ((the absolute value of lambda)-squared)I, which we can refactor as (A-star – (lambda-conjugate)I)(A – (lambda)I), which is equal to ((A – (lambda)I)-star)(A – (lambda)I) as required.

The real point of property 2 is to help us prove property 3. So for (3), suppose that Az = (lambda)z for some non-zero vector z in C^n, and let B = A – (lambda)I. Then B is normal by property (2), and Bz = (A – (lambda)I)z, which is equal to Az – (lambda)z, which is equal to the 0 vector. So, by property 1, we get 0 = (the length of Bz) = (the length of (B-star)z). But notice that B-star is just A-star – (lambda-conjugate)I. And so we have that (the length of (A-star – (lambda-conjugate)I)z) is equal to 0, which is (the length of ((A-star)z – (lambda-conjugate)z)). But the only vector with length 0 is the 0 vector. Consequently, (A-star)z – (lambda-conjugate)z = the 0 vector, and thus, (A-star)z = (lambda-conjugate)z as required.
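Property (3) can be seen concretely on a small example. The matrix N = [0, -1; 1, 0] below does not come from the lecture. It is an assumed illustration, chosen because it is normal (N(N-star) = I = (N-star)N) and has the simple eigenvalue i with eigenvector (1, -i). The helper names are also hypothetical.

```python
def conj_transpose(m):
    """Return M-star, the conjugate transpose of a matrix given as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]


def matvec(m, z):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(m[i][k] * z[k] for k in range(len(z))) for i in range(len(m))]


def close(u, v, tol=1e-12):
    """Entry-by-entry comparison of two vectors, up to a small tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))


N = [[0, -1], [1, 0]]  # assumed example matrix (normal, not from the lecture)
lam = 1j               # eigenvalue i
z = [1, -1j]           # eigenvector for i

# Nz = (lambda)z: z is an eigenvector of N with eigenvalue lambda.
print(close(matvec(N, z), [lam * x for x in z]))

# (N-star)z = (lambda-conjugate)z: the same z works for N-star.
print(close(matvec(conj_transpose(N), z), [lam.conjugate() * x for x in z]))
```

Both checks should print `True`.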

Take a minute to look over the proofs of properties 2 and 3, and make sure you understand them. I will leave the proof of property 4 as a recommended exercise.

Note that property 4 shows us the procedure for unitarily diagonalizing a normal matrix is exactly the same as the procedure for orthogonally diagonalizing a real symmetric matrix.
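As an illustration of that procedure, here is a sketch on an assumed example that is not from the lecture. It uses the normal matrix N = [0, -1; 1, 0], with eigenvalues i and -i and eigenvectors (1, -i) and (1, i). Because the eigenvalues are distinct, the eigenvectors should be orthogonal, so normalizing them and using them as the columns of U should give a unitary matrix with (U-star)NU diagonal. The inner product is taken as the sum of z_k times the conjugate of w_k, matching the convention in the proof of property (1).

```python
import math


def conj_transpose(m):
    """Return M-star, the conjugate transpose of a matrix given as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


N = [[0, -1], [1, 0]]        # assumed example matrix (normal, not from the lecture)
z1, z2 = [1, -1j], [1, 1j]   # eigenvectors for the eigenvalues i and -i

# Property 4: eigenvectors for distinct eigenvalues are orthogonal.
inner = sum(a * b.conjugate() for a, b in zip(z1, z2))
print("inner product:", inner)

# Normalize each eigenvector (both have length sqrt(2)) and use them as columns of U.
s = 1 / math.sqrt(2)
U = [[s * z1[0], s * z2[0]],
     [s * z1[1], s * z2[1]]]

# (U-star) N U should be diagonal, with the eigenvalues i and -i on the diagonal.
D = matmul(matmul(conj_transpose(U), N), U)
for row in D:
    print(row)
```

Up to floating-point rounding, the off-diagonal entries of D come out as 0 and the diagonal entries as i and -i.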

In the next lecture, we will look at one more very neat theorem. This ends this lecture.