Lesson: Unitary Matrices

Transcript

Now that we have defined orthogonality, and even used the Gram-Schmidt procedure, the time has come to define the complex analogue of an orthogonal matrix.

An n-by-n matrix with complex entries is said to be unitary if its columns form an orthonormal basis for Cn.

The term “unitary” is used instead of “orthogonal” to emphasize that, thanks to the new definition of inner product, we do not end up with the same properties as we did with orthogonal real matrices. Specifically, we do not have that A is unitary if and only if A-inverse = A-transpose. Let’s look at why.

Let A be an n-by-n matrix with complex entries, and let a1 through an be the columns of A. Even in the complex numbers, we have that the jkth entry of a matrix product BA is the dot product of the jth row of B with the kth column of A. If we set B = A-transpose, then we have that the jth row of A-transpose is the same as the jth column of A, and so the jkth entry of (A-transpose)A is (aj dot ak).
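In symbols, the computation just described is:

```latex
\[
(BA)_{jk} \;=\; (\text{row } j \text{ of } B)\cdot(\text{column } k \text{ of } A)
\;=\; \sum_{i=1}^{n} b_{ji}\,a_{ik},
\qquad\text{so}\qquad
(A^{T}A)_{jk} \;=\; \mathbf{a}_j \cdot \mathbf{a}_k .
\]
```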

Now, back when A had real entries, this dot product was the same as the standard inner product. But now, aj dot ak need not equal the inner product <aj with ak>. So even if A is a unitary matrix, the fact that <aj with aj> equals 1 does not necessarily mean that the dot product of aj with aj equals 1, which means that the diagonal entries of (A-transpose)A may not be 1. Likewise, just because <aj with ak> equals 0 when j does not equal k, we do not necessarily have that the dot product of aj with ak equals 0, which means that (A-transpose)A may have nonzero entries off the main diagonal; it may not even be a diagonal matrix.
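To make this concrete, here is a quick NumPy sketch with a simple unitary matrix of my own choosing (the matrix diag(i, 1) is not one of the lesson's examples):

```python
import numpy as np

# A simple unitary matrix (not from the lesson): its two columns are
# orthonormal in C^2 under the complex inner product.
A = np.array([[1j, 0],
              [0,  1]])

print(A.T @ A)          # [[-1, 0], [0, 1]] -- not the identity, even though A is unitary
print(A.T @ A.conj())   # [[ 1, 0], [0, 1]] -- A-transpose IS the inverse of A-conjugate
```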

All is not lost, however, because the inner product <aj with ak> is not that different from the dot product of aj with ak: we are simply looking at aj dot (ak-conjugate) instead. And since we always take the conjugate of the right-hand side, this dot product is the jkth entry of the matrix (A-transpose)(A-conjugate). This general fact is true for any matrix A with complex entries, but when A is unitary, we again have that aj dotted with (aj-conjugate) equals 1, so the entries on the main diagonal of (A-transpose)(A-conjugate) are 1, and aj dotted with (ak-conjugate) equals 0 when j does not equal k, so the entries off the main diagonal of (A-transpose)(A-conjugate) are all 0. And thus, using a similar argument to the one we used for orthogonal matrices in the real case, we see that A is a unitary matrix if and only if A-transpose is the inverse of A-conjugate.

A different, but related, fact is that A is a unitary matrix if and only if the transpose of A-conjugate is the inverse of A. It turns out that the matrix (A-conjugate)-transpose comes up a lot when working in the complex numbers. This is not a surprise, since it blends the familiar transpose operation from the reals with the conjugation that is everywhere in the complex numbers, so we go ahead and give it its own symbol.

Let A be an n-by-n matrix with complex entries. We define the conjugate transpose, A-star, of A by A-star = (A-conjugate)-transpose.

So for example, if A were the matrix seen here, then we could first take its conjugate to get this matrix, and then take the transpose to get the matrix A-star, which equals this. But it's worth noting that it does not matter whether we do the conjugate first and then the transpose (as the definition states), or the transpose first and then the conjugate. That is to say, we could also say that A-star equals the conjugate of A-transpose. So, using our same matrix A as before, we could first take the transpose of A and then take the conjugate, and we still end up with the same A-star.

And in fact, once you become comfortable with this process, you usually just do both actions at the same time. So I could start with the matrix B, seen here, and to find B-star, I'll simply look down the first column of B and take conjugates to create my first row of B-star, which is [-3i, 2 – 5i, 0]. Then I'll go down my second column of B and take conjugates to form my second row of B-star, giving [1 + i, 6 – 7i, 3 – 6i].
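The entries of B itself only appear on the slide, but they can be recovered by conjugating the two rows of B-star just read off; under that assumption, here is a NumPy sketch of the computation:

```python
import numpy as np

# B inferred from the rows of B-star quoted above (the slide itself is not
# reproduced in this transcript), so treat these entries as an assumption.
B = np.array([[3j,     1 - 1j],
              [2 + 5j, 6 + 7j],
              [0,      3 + 6j]])

# Conjugate-then-transpose and transpose-then-conjugate agree.
B_star = B.conj().T
assert np.array_equal(B_star, B.T.conj())

print(B_star)
# row 1: [-3j, 2-5j, 0]        row 2: [1+1j, 6-7j, 3-6j]
```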

With this definition in hand, we can state the following theorem about unitary matrices. If U is an n-by-n matrix with complex entries, then the following are equivalent.

  1. The columns of U form an orthonormal basis for Cn.
  2. The rows of U form an orthonormal basis for Cn.
  3. U-inverse = U-star.
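Numerically, the three conditions can be checked side by side; here is a small sketch (the helper function and the sample matrix are mine, not from the lesson):

```python
import numpy as np

def unitary_conditions(U, tol=1e-12):
    """Return (columns orthonormal, rows orthonormal, U^{-1} == U^*) for a square U."""
    I = np.eye(U.shape[0])
    U_star = U.conj().T
    return (np.allclose(U_star @ U, I, atol=tol),             # condition 1
            np.allclose(U @ U_star, I, atol=tol),             # condition 2
            np.allclose(np.linalg.inv(U), U_star, atol=tol))  # condition 3

U = np.array([[1, 1],
              [1j, -1j]]) / np.sqrt(2)
print(unitary_conditions(U))   # (True, True, True) -- all three hold together
```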

Let’s prove “(1) if and only if (3)” first, and to that end, let’s look at the product of U-star with U. The jkth entry of (U-star)U is the dot product of the jth row of U-star with the kth column of U. Now, since the jth row of U-star is the conjugate of the jth column of U, we have that the jkth entry of (U-star)U is uj-conjugate dotted with uk. Well, this is the same as the conjugate of (uj dotted with uk-conjugate), which is the conjugate of the inner product <uj with uk>. If the columns of U form an orthonormal basis, we know that the inner product of <uj with uk> equals either 0 (if j is not equal to k) or 1 (if j = k), both of which are real numbers. So this means that they equal their conjugate, so we have shown that the jkth entry of (U-star)U is 1 if j = k and 0 if j is not equal to k. And this means that (U-star)U equals the identity matrix, so U-inverse is U-star. So we have shown that U-inverse = U-star if U is unitary.
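The chain of equalities in this direction of the proof, written out in the lesson's notation (where the inner product of z with w is z dot w-conjugate):

```latex
\[
(U^{*}U)_{jk}
  \;=\; \overline{\mathbf{u}_j}\cdot\mathbf{u}_k
  \;=\; \overline{\,\mathbf{u}_j\cdot\overline{\mathbf{u}_k}\,}
  \;=\; \overline{\langle \mathbf{u}_j,\mathbf{u}_k\rangle}
  \;=\;
  \begin{cases}
    \overline{1} = 1 & \text{if } j = k,\\
    \overline{0} = 0 & \text{if } j \neq k,
  \end{cases}
  \qquad\text{so } U^{*}U = I.
\]
```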

To show the reverse direction of our if-and-only-if statement, let's assume that U-inverse = U-star and show that the columns of U must form an orthonormal basis. Well, we still know that the jkth entry of (U-star)U is the conjugate of the inner product <uj with uk>, but since we know that (U-star)U = I, we see that the conjugate of <uj with uk> equals 0 for j not equal to k, while the conjugate of <uj with uj> equals 1. As mentioned before, since these are real numbers, they equal their conjugates, and so, in fact, we've shown that <uj with uk> equals 0 for j not equal to k while <uj with uj> equals 1, and this means that the columns of U form an orthonormal basis for Cn.

We will actually use this result to show that property (2) holds if and only if property (1) does; that is to say, the rows of U form an orthonormal basis for Cn if and only if the columns of U do. We begin by noting that the rows of U form an orthonormal basis if and only if the columns of U-transpose form an orthonormal basis. And we now know, from the equivalence of (1) and (3), that the columns of U-transpose form an orthonormal basis if and only if (U-transpose)-star = (U-transpose)-inverse.

So what is the jkth entry of (U-transpose)((U-transpose)-star)? Well, it is the dot product of the jth row of U-transpose with the kth column of (U-transpose)-star. The jth row of U-transpose is the jth column of U, while the kth column of (U-transpose)-star is the conjugate of the kth row of U-transpose, which is to say the conjugate of the kth column of U. So the jkth entry of (U-transpose)((U-transpose)-star) is, in fact, uj dotted with uk-conjugate, which is the inner product <uj with uk>.
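Summarizing that computation in one line:

```latex
\[
\bigl(U^{T}(U^{T})^{*}\bigr)_{jk}
  \;=\; (\text{row } j \text{ of } U^{T}) \cdot (\text{column } k \text{ of } (U^{T})^{*})
  \;=\; \mathbf{u}_j \cdot \overline{\mathbf{u}_k}
  \;=\; \langle \mathbf{u}_j, \mathbf{u}_k \rangle .
\]
```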

Well, this means that (U-transpose)((U-transpose)-star) = I if and only if the inner product <uj with uk> equals 1 when j = k and equals 0 when j is not equal to k, which means that (U-transpose)-star = (U-transpose)-inverse if and only if the columns of U form an orthonormal basis. So we have shown that U-transpose is unitary if and only if U is unitary, and this proves that the rows of U form an orthonormal basis for Cn if and only if the columns of U do.

Let’s look at an example. Consider the matrix A = [i/(root 2), -1/(root 2); 1/(root 2), -i/(root 2)]. To determine if A is unitary, we want to look at the product (A-star)A. A-star is [-i/(root 2), 1/(root 2); -1/(root 2), i/(root 2)], and we’ll multiply by A. Doing our matrix product, we do, in fact, see that it equals the matrix [1, 0; 0, 1]. And since (A-star)A = I, Theorem 9.5.3 tells us that A is a unitary matrix.
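The same check carried out in NumPy (a sketch; np.allclose just absorbs floating-point round-off):

```python
import numpy as np

A = np.array([[1j, -1],
              [1, -1j]]) / np.sqrt(2)

A_star = A.conj().T
print(A_star @ A)                          # the 2-by-2 identity, up to round-off
print(np.allclose(A_star @ A, np.eye(2)))  # True, so A is unitary
```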

What about the matrix B = [i/(root 3), -1/(root 3); (1 + i)/(root 3), (1 + i)/(root 3)]? To determine if B is unitary, we again look at the product (B-star)B. So B-star will be [-i/(root 3), (1 – i)/(root 3); -1/(root 3), (1 – i)/(root 3)]. Then we'll multiply that by B, and if we perform our matrix multiplication, we end up with the matrix [1, (2 + i)/3; (2 – i)/3, 1]. Since (B-star)B does not equal the identity matrix, our theorem tells us that B is not unitary. But we actually know even more. Since the jkth entry of (U-star)U is the conjugate of the inner product <uj with uk> for any complex matrix U, we see from our particular (B-star)B that the conjugate of <b1 with b1> equals 1 and the conjugate of <b2 with b2> equals 1, so both columns of B are unit vectors. But we also see that the conjugate of <b1 with b2> does not equal 0, and likewise that the conjugate of <b2 with b1> does not equal 0. So the reason the columns of B fail to form an orthonormal basis is that they are not orthogonal to each other.
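And the corresponding check for B, which shows the off-diagonal entries that spoil orthogonality:

```python
import numpy as np

B = np.array([[1j,     -1],
              [1 + 1j,  1 + 1j]]) / np.sqrt(3)

print(B.conj().T @ B)
# [[1, (2+1j)/3], [(2-1j)/3, 1]]: unit-length columns (the 1s on the diagonal)
# that are not orthogonal to each other (the nonzero off-diagonal entries)
```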

For one last example, let's look at our matrix C, seen here. Again, to see if C is unitary, we'll look at (C-star)C. So C-star will be the matrix [1 – i, 0, -1 – i; 1, 1 + i, i; 1, -1 – i, i], and we'll multiply that by the matrix C. Doing all of our matrix multiplication, we end up with the matrix [4, 0, 0; 0, 4, 0; 0, 0, 4]. Again, C is not unitary, but again, we can use the product (C-star)C to learn why the columns of C do not form an orthonormal basis. This time, the conjugate of the inner product <cj with ck> does equal 0 when j does not equal k, so we know that the columns of C are orthogonal. But the conjugate of <cj with cj> equals 4, which does not equal 1, so the columns of C are not unit vectors. Of course, this is easily fixed by dividing each column by its length, which we have already calculated to be the square root of 4, which equals 2. And so we find that our new matrix D, which is simply C with all of its entries divided by 2, is a unitary matrix.
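The matrix C appears only on the slide, but since C = (C-star)-star, it can be recovered from the C-star quoted above; here is a NumPy sketch under that assumption, including the normalization to D:

```python
import numpy as np

# C-star as quoted in the lesson; C itself is recovered as (C-star)-star.
C_star = np.array([[1 - 1j, 0,      -1 - 1j],
                   [1,      1 + 1j,  1j],
                   [1,     -1 - 1j,  1j]])
C = C_star.conj().T

print(C_star @ C)                               # 4 times the identity: orthogonal columns of length 2
D = C / 2                                       # divide every entry (hence every column) by 2
print(np.allclose(D.conj().T @ D, np.eye(3)))   # True, so D is unitary
```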

As previously mentioned, the matrix A-star appears often in the study of Cn, so it is worthwhile to note the following properties of the conjugate transpose, stated in this theorem. Let A and B be n-by-n matrices with complex entries, and let alpha be a complex number. Then we get the following properties:

  1. The inner product <Az with w> equals the inner product <z with (A-star)w> for all vectors z and w in Cn.
  2. (A-star)-star = A
  3. (A + B)-star = A-star + B-star
  4. ((alpha)A)-star = (alpha-conjugate)(A-star)
  5. (AB)-star = (B-star)(A-star)
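A quick numerical sanity check of all five properties on random complex matrices (a sketch; the random data is only for illustration, and inner(x, y) encodes the lesson's inner product x dot y-conjugate):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
alpha = 2 - 3j

def star(M):
    return M.conj().T            # the conjugate transpose

def inner(x, y):
    return np.dot(x, y.conj())   # <x, y> = x . conj(y)

print(np.isclose(inner(A @ z, w), inner(z, star(A) @ w)))      # property 1
print(np.allclose(star(star(A)), A))                           # property 2
print(np.allclose(star(A + B), star(A) + star(B)))             # property 3
print(np.allclose(star(alpha * A), np.conj(alpha) * star(A)))  # property 4
print(np.allclose(star(A @ B), star(B) @ star(A)))             # property 5
```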

Now, the proofs of these properties are all simple “expand the terms and see that it works” types of proofs. So I’ll prove properties 1 and 2, and I’ll leave the others for you to do as a practice problem.

Well, actually, property 1 is a bit harder to prove than the others. You can painstakingly expand all the dot products involved to determine the terms you are summing, but we get a neater proof if we think about it differently, namely by thinking of the dot product as a form of matrix multiplication: for any vectors z and w in Cn, we have that (z dot w) is the same as the matrix product (z-transpose)w. With that in mind, we see the following. The inner product <Az with w> equals (Az) dot w-conjugate, which equals ((Az)-transpose)(w-conjugate). We can distribute the transpose to get (z-transpose)(A-transpose)(w-conjugate). Now, A-transpose is the conjugate of A-star, so (A-transpose)(w-conjugate) equals the conjugate of ((A-star)w). Thus our expression becomes (z-transpose) times the conjugate of ((A-star)w), which equals z dotted with the conjugate of ((A-star)w), which is exactly the inner product <z with (A-star)w>.
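The same chain of equalities, compressed into symbols:

```latex
\[
\langle A\mathbf{z}, \mathbf{w}\rangle
  = (A\mathbf{z})\cdot\overline{\mathbf{w}}
  = (A\mathbf{z})^{T}\,\overline{\mathbf{w}}
  = \mathbf{z}^{T}A^{T}\,\overline{\mathbf{w}}
  = \mathbf{z}^{T}\,\overline{A^{*}}\,\overline{\mathbf{w}}
  = \mathbf{z}^{T}\,\overline{A^{*}\mathbf{w}}
  = \mathbf{z}\cdot\overline{A^{*}\mathbf{w}}
  = \langle \mathbf{z}, A^{*}\mathbf{w}\rangle .
\]
```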

Now note that, in this proof, I made use of the fact that for any n-by-n matrices A and B with complex entries, the product of A-conjugate with B-conjugate equals the conjugate of (AB). This follows from the fact that matrix multiplication simply involves multiplying and adding complex numbers, and we have already seen that for any complex numbers z and w, (the conjugate of z) + (the conjugate of w) equals the conjugate of (z + w), and (the conjugate of z)(the conjugate of w) equals the conjugate of (zw). For the same reason, we also know that (the conjugate of z) dotted with (the conjugate of w) equals the conjugate of (z dot w) for any vectors z and w in Cn. Again, all of these simply follow from the fact that complex conjugation distributes over complex addition and over complex multiplication, so any time you are doing some combination of complex addition and multiplication, you can distribute your conjugation as well.

So let’s look now at the proof of Theorem 9.5.2, part 2. Let’s let B = A-star, and C = B-star, which is (A-star)-star. Well, then cjk is the conjugate of bkj, which is the conjugate of (the conjugate of ajk), which, of course, is ajk. Since this is true for any j and k from 1 to n, we’ve shown that the matrix C equals the matrix A, and that is to say that (A-star)-star = A.
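Entry by entry, with B = A-star and C = B-star as above:

```latex
\[
c_{jk} \;=\; \overline{b_{kj}} \;=\; \overline{\overline{a_{jk}}} \;=\; a_{jk}
\qquad\text{for all } 1 \le j, k \le n,
\qquad\text{so } (A^{*})^{*} = C = A .
\]
```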
