Lesson: Hermitian Matrices

Transcript

We end the course by looking at the complex equivalent of symmetric matrices. As usual, we add complex conjugation to the definition from the reals to get the complex equivalent.

An n-by-n matrix A with complex entries is called Hermitian if A-star = A, or equivalently, if the conjugate of A equals A-transpose.

For example, if we let A be the matrix seen here, then we know that A-star = A, so A is Hermitian. But if we look at the matrix B seen here, then B is symmetric, but B is not Hermitian since B-star would be this matrix, which does not equal B.

One thing worth noting about Hermitian matrices is that they must have real numbers on the main diagonal, as each of these entries must equal its own conjugate. So we can immediately see that B is not Hermitian by noting the non-real entry 2i on the main diagonal.
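As a minimal sketch of the definition, the check A-star = A can be carried out entry by entry. The matrices below are hypothetical stand-ins (the lesson's own A and B are not reproduced in this transcript), and the function names are made up for illustration. The second matrix is symmetric with 2i on the main diagonal, like the B described above.

```python
def conjugate_transpose(m):
    """Return A-star: entry (r, c) is the conjugate of entry (c, r) of m."""
    n = len(m)
    return [[complex(m[c][r]).conjugate() for c in range(n)] for r in range(n)]


def is_hermitian(m, tol=1e-12):
    """True when every entry of A-star matches the same entry of A."""
    star = conjugate_transpose(m)
    n = len(m)
    return all(abs(star[r][c] - m[r][c]) <= tol
               for r in range(n) for c in range(n))


# Hypothetical examples, not the matrices shown in the lecture.
A = [[2, 1 + 1j], [1 - 1j, 3]]   # real diagonal, off-diagonal entries are conjugates
B = [[2j, 1 + 1j], [1 + 1j, 3]]  # symmetric, but 2i sits on the main diagonal

print(is_hermitian(A))  # True
print(is_hermitian(B))  # False
```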

As we did with the symmetric matrices in the reals, we will find not only that every Hermitian matrix is diagonalizable, but it can be diagonalized by a unitary matrix, just as symmetric matrices could be diagonalized by orthogonal matrices. It will simply take us a few steps to prove it.

First, we want to notice the following. An n-by-n matrix A is Hermitian if and only if for all z and w in C^n, we have that the product of <z with (Aw)> equals the product of <(Az) with w>.

To prove this, first let’s assume that A is Hermitian, and show that the product of <z with (Aw)> equals the product of <(Az) with w> for all vectors z and w. Well, this follows easily from our first property of the conjugate transpose, which says that for any matrix A, the product of <z with ((A-star)w)> equals the product of <(Az) with w> for all vectors z and w. And since A is Hermitian, we know that A = A-star, so we have our result.

Now, to show the other side of our if-and-only-if statement, let’s assume that the product of <z with (Aw)> equals the product of <(Az) with w> for all vectors z and w, and we’ll show that A is Hermitian. We do this by plugging in z = ej and w = ek (again, our standard basis vectors) to this equality. Why those vectors? Because these standard basis vectors will cause each inner product to single out one entry of A. So we’ll see that the product of <(Aej) with ek> will equal (Aej) dotted with ek-conjugate, but this is simply (Aej) dotted with ek since ek has only real components. Well, this will equal the vector aj dotted with ek, where aj is the jth column of A, but ek has zeroes in every entry except for the kth entry, which is a 1, so this simply becomes (aj)k, that is to say, the kth component of aj, which you can also think of as akj, the kjth entry of the matrix A (row k, column j).

Well, now let’s look at the product of <ej with (Aek)>. Well, this will equal ej dotted with (Aek)-conjugate, which we can write as ej dotted with (A-conjugate)(ek-conjugate). Well, again, since ek has only real entries, we can get rid of its conjugation symbol. Next, this becomes ej dotted with ak-conjugate, which is the conjugate of the kth column of A. Well, this will become (ak-conjugate)-sub-j, which is the jth component of ak-conjugate since our ej vector has zeroes everywhere except the jth component, and a 1 in that place. But we can think of this as ajk-conjugate, the jkth entry of A-conjugate.

Now, since we started with the idea that the product of <(Aej) with ek> will equal the product of <ej with (Aek)>, this means we have that akj = ajk-conjugate, but ajk-conjugate is the kjth entry of A-star. So we see that the kjth entry of A is the same as the kjth entry of A-star for all j, k from 1 to n, and this means we’ve shown that A = A-star, which means that A is Hermitian.
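The theorem can also be checked numerically on small matrices. The sketch below assumes the lesson's convention that the inner product of z with w is z dotted with w-conjugate. The matrices H and S and all function names are hypothetical, chosen only for illustration. It tests the identity on every pair of standard basis vectors, which is exactly the set of pairs the proof uses.

```python
def inner(z, w):
    """<z, w> = z dotted with w-conjugate."""
    return sum(zi * complex(wi).conjugate() for zi, wi in zip(z, w))


def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in m]


def standard_basis(n, j):
    """Standard basis vector with a 1 in position j (0-based) and 0 elsewhere."""
    return [1 if i == j else 0 for i in range(n)]


def identity_holds_on_basis(m, tol=1e-12):
    """Check <e_j, A e_k> == <A e_j, e_k> for all standard basis pairs."""
    n = len(m)
    for j in range(n):
        for k in range(n):
            ej, ek = standard_basis(n, j), standard_basis(n, k)
            left = inner(ej, matvec(m, ek))
            right = inner(matvec(m, ej), ek)
            if abs(left - right) > tol:
                return False
    return True


H = [[2, 1 + 1j], [1 - 1j, 3]]   # hypothetical Hermitian matrix
S = [[2j, 1 + 1j], [1 + 1j, 3]]  # hypothetical symmetric, non-Hermitian matrix

print(identity_holds_on_basis(H))  # True
print(identity_holds_on_basis(S))  # False
```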

Now we’ll use this theorem to prove the following key facts, which you should recognize from our work with symmetric matrices.

Suppose that A is an n-by-n Hermitian matrix. Then

  1. All eigenvalues of A are real. And,
  2. Eigenvectors corresponding to distinct eigenvalues are orthogonal to each other.

To prove part 1, let lambda be an eigenvalue of A with corresponding eigenvector z, so that Az = (lambda)z. Well, then by the theorem we’ve just proven, the product of <(Az) with z> should equal the product of <z with (Az)>. But we know that the product of <(Az) with z> equals the product of <(lambda)z with z>, which equals (lambda)<z, z>. Meanwhile, the product of <z with (Az)> will equal the product of <z with (lambda)z>, which equals ((lambda-conjugate) times the product of <z, z>). So we see that (lambda times the product of <z with z>) equals (lambda-conjugate times the product of <z with z>). Well, since z is an eigenvector, we know that z does not equal the 0 vector, so the product of <z with z> does not equal 0. So we can divide both sides by the product of <z with z> to get that lambda = lambda-conjugate, which means that lambda must be real.
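Written in symbols, the argument for part 1 is a single chain of equalities. This is only a restatement of the spoken proof, using the lesson's convention that the inner product is conjugate-linear in its second slot.

```latex
\[
\lambda \langle z, z \rangle
  = \langle \lambda z, z \rangle
  = \langle Az, z \rangle
  = \langle z, Az \rangle
  = \langle z, \lambda z \rangle
  = \overline{\lambda}\, \langle z, z \rangle ,
\qquad
\langle z, z \rangle \neq 0
\;\Longrightarrow\;
\lambda = \overline{\lambda} .
\]
```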

To prove part 2, let lambda1 and lambda2 be eigenvalues of A such that lambda1 does not equal lambda2, and let z1 and z2 be eigenvectors corresponding to lambda1 and lambda2. Now since A is Hermitian, we’ll use our theorem that we just proved again to see that the product of <(Az1) with z2> equals the product of <z1 with (Az2)>. Well, this means that the product of <(lambda1)z1 with z2> equals the product of <z1 with (lambda2)z2>, and from this, we get that (lambda1 times the product of <z1 with z2>) equals (lambda2-conjugate times the product of <z1 with z2>). In part 1, we showed that the eigenvalues of A are real, so (the conjugate of lambda2) equals lambda2, which means that (lambda1 times the product of <z1 with z2>) equals (lambda2 times the product of <z1 with z2>). For this equality to hold, we either have to have that the product of <z1 with z2> equals 0 or that lambda1 = lambda2. But since we know that lambda1 does not equal lambda2, we get that the product of <z1 with z2> equals 0, which means that our eigenvectors are orthogonal.
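The same kind of summary works for part 2. Again this is just the spoken argument in symbols, with the final step rearranged so that the two cases are visible as factors of a product.

```latex
\[
\lambda_1 \langle z_1, z_2 \rangle
  = \langle A z_1, z_2 \rangle
  = \langle z_1, A z_2 \rangle
  = \overline{\lambda_2}\, \langle z_1, z_2 \rangle
  = \lambda_2 \langle z_1, z_2 \rangle
\;\Longrightarrow\;
(\lambda_1 - \lambda_2)\, \langle z_1, z_2 \rangle = 0
\;\Longrightarrow\;
\langle z_1, z_2 \rangle = 0 .
\]
```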

And now we can prove the main result. This is known as the Spectral Theorem for Hermitian Matrices. Suppose that A is an n-by-n Hermitian matrix. Then there exists a unitary matrix U and a diagonal matrix D such that (U-star)AU = D.

The proof of this works exactly the same as its counterpart in the reals, which is the Principal Axis Theorem, so I won’t bother repeating it here. As with symmetric matrices, the main idea is that we can find an orthonormal basis for the eigenspace of each eigenvalue, and use these eigenvectors (instead of arbitrary eigenvectors) to form the matrix U we use to diagonalize A. Since eigenvectors corresponding to distinct eigenvalues are orthogonal to each other, the columns of this U are orthonormal, and so U will, in fact, be unitary.

So let’s just work out an example. Let’s let A be the matrix [6, (root 3) – i; (root 3) + i, 3]. Then A = A-star, so A is Hermitian. Let’s find the eigenvalues of A. We’ll need to compute the determinant of (A – (lambda)I), which is the determinant of this matrix. Well, this will equal (6 – lambda)(3 – lambda) – ((root 3) + i)((root 3) – i). Working through these calculations, we’ll see that this equals 14 – 9(lambda) + lambda-squared, which equals (7 – lambda)(2 – lambda). (Remember, of course, that our eigenvalues will be real.)
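For reference, the determinant calculation described above can be written out step by step. The only fact used beyond expanding the brackets is that (root 3 + i)(root 3 - i) = 3 + 1 = 4.

```latex
\[
\det(A - \lambda I)
  = (6 - \lambda)(3 - \lambda) - (\sqrt{3} + i)(\sqrt{3} - i)
  = (18 - 9\lambda + \lambda^2) - 4
  = \lambda^2 - 9\lambda + 14
  = (7 - \lambda)(2 - \lambda) .
\]
```

So the eigenvalues are lambda = 2 and lambda = 7, both real, as part 1 of the theorem requires.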

Now let’s find the eigenspaces for these eigenvalues. The eigenspace for lambda = 2 is the nullspace of [6 – lambda, (root 3) – i; (root 3) + i, 3 – lambda], which becomes [4, (root 3) – i; (root 3) + i, 1] when we plug in lambda = 2. Well, it wouldn’t take long to realize this is row equivalent to the matrix [4, (root 3) – i; 0, 0]. So the eigenvectors for lambda = 2 satisfy the equation 4z1 + ((root 3) – i)z2 = 0. So we have that z1 = (-1/4)((root 3) – i)z2. If we replace the variable z2 with the parameter alpha, we see that the eigenspace for lambda = 2 is the set of all vectors [(-1/4)((root 3) – i)(alpha); alpha] for alpha in C, and we can write this as the span of the single unit vector [((root 3) – i)/(root 20); -4/(root 20)], and so the set containing only this vector forms an orthonormal basis for the eigenspace for lambda = 2. (Remember, we do need to make sure the basis is orthonormal, which here means the vector has length 1, not just that it spans the eigenspace.)
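The normalization step is quick to verify by hand. One way to reach the stated unit vector, assuming we pick alpha = -4 to clear the fraction, is the following.

```latex
\[
\alpha = -4:\quad
\begin{bmatrix} -\tfrac{1}{4}(\sqrt{3} - i)\,\alpha \\ \alpha \end{bmatrix}
= \begin{bmatrix} \sqrt{3} - i \\ -4 \end{bmatrix},
\qquad
\left\| \begin{bmatrix} \sqrt{3} - i \\ -4 \end{bmatrix} \right\|^2
= |\sqrt{3} - i|^2 + |-4|^2
= (3 + 1) + 16
= 20 .
\]
```

Dividing by root 20 therefore gives a vector of length 1.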

Repeating this process for lambda = 7, we’ll again be looking at the nullspace of (A – (lambda)I), which is now the matrix [-1, (root 3) – i; (root 3) + i, -4]. One can quickly find that it is row equivalent to the matrix [-1, (root 3) – i; 0, 0]. So the eigenvectors for lambda = 7 satisfy the equation –z1 + ((root 3) – i)z2 = 0. So we have that z1 = ((root 3) – i)z2. If we replace the variable z2 with the parameter alpha, we see that the eigenspace for lambda = 7 is the set of all vectors [((root 3) – i)(alpha); alpha] for alpha in C. The vector [(root 3) – i; 1] has length root 5, so we can write the eigenspace as the span of the unit vector [((root 3) – i)/(root 5); 1/(root 5)], and the set containing only this vector is an orthonormal basis for the eigenspace for lambda = 7.

So now we’ll take our orthonormal basis vectors and use them as the columns of U, so that U equals the matrix [((root 3) – i)/(root 20), ((root 3) – i)/(root 5); -4/(root 20), 1/(root 5)], and this is a unitary matrix such that (U-star)AU equals the diagonal matrix [lambda1, 0; 0, lambda2], which, in our case, is [2, 0; 0, 7].
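The result of the example can be double-checked numerically. The sketch below uses only Python's built-in complex numbers. The helper names are made up for illustration, and the comparisons use a small tolerance because the square roots are floating-point approximations.

```python
import math

SQRT3 = math.sqrt(3)

# The matrix A and the unitary matrix U from the example above.
A = [[6, SQRT3 - 1j], [SQRT3 + 1j, 3]]
U = [
    [(SQRT3 - 1j) / math.sqrt(20), (SQRT3 - 1j) / math.sqrt(5)],
    [-4 / math.sqrt(20), 1 / math.sqrt(5)],
]


def conjugate_transpose(m):
    """Return M-star: swap rows with columns and conjugate each entry."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def close(m, target, tol=1e-9):
    """Entrywise comparison within a tolerance."""
    return all(abs(m[i][j] - target[i][j]) <= tol
               for i in range(len(m)) for j in range(len(m[0])))


U_star = conjugate_transpose(U)
gram = matmul(U_star, U)           # should be the identity if U is unitary
D = matmul(matmul(U_star, A), U)   # should be diag(2, 7)

print(close(gram, [[1, 0], [0, 1]]))  # True
print(close(D, [[2, 0], [0, 7]]))     # True
```

The first check confirms that the columns of U are orthonormal, and the second confirms that (U-star)AU is the diagonal matrix with the eigenvalues 2 and 7 in the same order as the columns of U.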

© University of Waterloo and others, Powered by Maplesoft