Lesson: Hermitian Inner Products and Unitary Matrices

Transcript — Introduction

We would like to have an inner product defined for complex vector spaces because the concepts of length, orthogonality, and projections are powerful tools for solving certain problems. Moreover, it is a necessary step towards our goal of mimicking orthogonal diagonalization in the complex case.

But how are we going to define an inner product on a complex vector space? We will essentially repeat the process we used to define an inner product on a real vector space. Recall that we first defined the standard inner product on Rn (the dot product) and then used the important properties of the dot product to define a general real inner product.

So, our first goal should be to define a standard inner product on Cn. But how are we going to do that? Our first thought would be to extend the dot product to Cn. Definition: Let w = [w1 to wn] and z = [z1 to zn] be vectors in Cn. We define (z dot product w) to be equal to z1w1 + up to znwn.

Does this define an inner product on Cn? Well, let z = x + iy be a vector in Cn, where x and y are vectors in Rn. Then we have (z dot product z) = z1-squared + up to zn-squared, which equals (x1-squared + up to xn-squared – y1-squared – up to yn-squared) + (2i)(x1y1 + up to xnyn). To match the real case, we will want to define the length of a vector z to be the square root of the inner product of z with itself. In this case, we observe that the dot product of z with itself need not even be a real number, and so the condition that (the dot product of z with itself) is greater than or equal to 0 does not even make sense. Thus, we cannot use the dot product as a rule for defining an inner product on Cn.
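To see the failure concretely, here is a minimal Python sketch. The function name plain_dot and the sample vectors are illustrative assumptions, not part of the lecture. It evaluates the unconjugated dot product of a vector with itself and shows that the result can be non-real, and can even be 0 for a nonzero vector.

```python
# Plain dot product on C^n with no conjugation (the first attempted definition).
def plain_dot(z, w):
    return sum(zj * wj for zj, wj in zip(z, w))

# Hypothetical sample vector z = x + iy with x = [1, 2] and y = [1, -1].
z = [1 + 1j, 2 - 1j]
zz = plain_dot(z, z)
print(zz)  # has a nonzero imaginary part, so "zz >= 0" is meaningless

# A second hypothetical vector: it is nonzero, yet its plain dot product
# with itself is 0, so positive definiteness fails as well.
u = [1, 1j]
uu = plain_dot(u, u)
print(uu)
```

For the first vector, the formula in the text gives (1 + 4 – 1 – 1) + (2i)(1 – 2) = 3 – 2i, which is what the sketch computes.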

How do we fix this? We need to ensure that the inner product of z with itself is a non-negative real number. Hmm. When, in multiplication of complex numbers, can we guarantee that we have a non-negative real number? When multiplying a number by its complex conjugate. Recall that if z = a + bi is a complex number, then z times its complex conjugate is equal to a-squared + b-squared.

Hence, we make the following definition. Definition: The standard Hermitian inner product for Cn is defined by (the inner product of z and w) is equal to (z dot product with the conjugate of w), which is equal to z1(w1-conjugate) + up to zn(wn-conjugate) for all vectors w and z in Cn. Note: This is the mathematics definition of the standard Hermitian inner product on Cn. In engineering, they use the definition (the inner product of z and w) is equal to ((the conjugate of z) dot product w). In this course, you are always expected to use the math definition. Note, though, that most computer programs like Maple and MATLAB use the engineering definition, and so these will not necessarily give you the correct answers for this course.

Let’s demonstrate the definition of the standard Hermitian inner product on Cn with an example. Example: Let z = [1 + i; 2 – i] and w = [-2 + i; 3 + 2i] be vectors in C2. Determine the inner product of z and w, and the inner product of w and z. Solution: We have (the inner product of z and w) equals (z dot product w-conjugate), which is the dot product of [1 + i; 2 – i] and [-2 – i; 3 – 2i], which, by the normal dot product formula, is (1 + i)(-2 – i) + (2 – i)(3 – 2i), which is 3 – 10i. Now we find that (the inner product of w and z) is (w dot product z-conjugate), which is [-2 + i; 3 + 2i] dot product with [1 – i; 2 + i], which is 3 + 10i.
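The computation in this example can be checked with a short Python sketch. The function names hermitian_inner and engineering_inner are assumptions made for illustration. The first follows the math definition used in this course (conjugate on the second vector), and the second is the engineering convention, included only to show how the two differ.

```python
def hermitian_inner(z, w):
    # Math convention used in this course: conjugate the SECOND vector.
    return sum(zj * complex(wj).conjugate() for zj, wj in zip(z, w))

def engineering_inner(z, w):
    # Engineering convention, shown only for comparison: conjugate the FIRST vector.
    return sum(complex(zj).conjugate() * wj for zj, wj in zip(z, w))

z = [1 + 1j, 2 - 1j]
w = [-2 + 1j, 3 + 2j]

zw = hermitian_inner(z, w)   # inner product of z and w
wz = hermitian_inner(w, z)   # inner product of w and z
print(zw, wz)

# The engineering convention applied to (z, w) agrees with the math
# convention applied to (w, z), i.e. it returns the conjugate value.
print(engineering_inner(z, w))
```

If you check your work with software, it is worth confirming which argument the tool conjugates before comparing answers.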

This example shows us that, unlike a real inner product, the standard Hermitian inner product on Cn is not symmetric. Let’s look at the properties that the standard Hermitian inner product on Cn does have.

Theorem 11.4.1: If the vectors v, z, and w are in Cn, and alpha is a complex scalar, then

  1. The inner product of z with itself is real, the inner product of z with itself is greater than or equal to 0, and the inner product of z with itself equals 0 if and only if z is the 0 vector.
  2. (The inner product of z and w) is equal to the conjugate of (the inner product of w and z).
  3. (The inner product of (v + z) and w) equals (the inner product of v and w) + (the inner product of z and w). And,
  4. (The inner product of (alpha)z and w) is equal to (alpha)(the inner product of z and w).

The proof of Theorem 11.4.1 is left as an easy exercise.

We now want to define Hermitian inner products on general complex vector spaces. As we did in the real case, we will use the properties of the standard Hermitian inner product on Cn to do this. Definition: Let V be a vector space over C. A Hermitian inner product on V is a function from V-cross-V to C such that for all vectors v, w, and z in V and alpha in C, we have

  1. The inner product of z with itself is real, the inner product of z with itself is greater than or equal to 0, and the inner product of z with itself equals 0 if and only if z is the 0 vector.
  2. (The inner product of z and w) is equal to the conjugate of (the inner product of w and z).
  3. (The inner product of (v + z) and w) is equal to (the inner product of v and w) + (the inner product of z and w). And,
  4. (The inner product of (alpha)z and w) is equal to (alpha)(the inner product of z and w).

A complex vector space with a Hermitian inner product is called a Hermitian inner product space.

Notes:

  1. The second property, (the inner product of z and w) = the conjugate of (the inner product of w and z), is called the Hermitian property. We will see throughout the rest of the course that “Hermitian” is the complex equivalent of “symmetric”.
  2. Observe that the fourth property only tells us how to factor out a complex scalar from the first entry of the inner product. Let’s figure out how to factor out a scalar from the second entry. Using the Hermitian property, we have (the inner product of z and (alpha)w) is equal to the conjugate of (the inner product of (alpha)w and z), which is equal to the conjugate of ((alpha)(the inner product of w and z)), which, by properties of conjugates, is (alpha-conjugate)(the conjugate of (the inner product of w and z)), which is (alpha-conjugate)(the inner product of z and w). So, a Hermitian inner product is also not bilinear.
  3. Since a Hermitian inner product is not symmetric or bilinear, you may wonder if it is a true generalization of the real inner product. However, observe that if t and (the inner product of z and w) are both real, then we have (the inner product of z and w) = the conjugate of (the inner product of w and z), but if that’s a real number, then this is just equal to (the inner product of w and z). And (the inner product of z and tw) equals (the conjugate of t)(the inner product of z and w), which is just equal to (t)(the inner product of z and w) since t is real. So it is symmetric and bilinear in this case. In particular, we could actually use this as the definition for a real inner product. So it is a true generalization.
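The behaviour described in these notes can be checked numerically with a small Python sketch for the standard Hermitian inner product on C2. The helper names inner and scale, the scalar alpha = 2 – 3i, and the real scalar t = 5 are illustrative assumptions.

```python
def inner(z, w):
    # Standard Hermitian inner product on C^n (conjugate on the second vector).
    return sum(zj * complex(wj).conjugate() for zj, wj in zip(z, w))

def scale(alpha, v):
    return [alpha * vj for vj in v]

z = [1 + 1j, 2 - 1j]
w = [-2 + 1j, 3 + 2j]
alpha = 2 - 3j  # hypothetical complex scalar

first_slot = inner(scale(alpha, z), w)    # alpha factors out unchanged
second_slot = inner(z, scale(alpha, w))   # alpha factors out conjugated

print(first_slot, alpha * inner(z, w))
print(second_slot, alpha.conjugate() * inner(z, w))

# With a real scalar t, conjugation does nothing, so the second slot
# behaves linearly just as in the real case.
t = 5
real_case = inner(z, scale(t, w))
print(real_case, t * inner(z, w))
```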

Let’s look at a couple of examples. Example: Theorem 11.4.1 shows us that the standard Hermitian inner product on Cn is, in fact, a Hermitian inner product.

Example: Consider the complex vector space M(m-by-n)(C). How should we define the standard Hermitian inner product on this vector space? We saw that for the real case, we define the inner product of A and B to be equal to the trace of ((B-transpose)A), and that this inner product was equivalent to the dot product on Rmn. Of course, we want to define the standard Hermitian inner product on M(m-by-n)(C) in a similar way.

Because of the way the Hermitian inner product is defined on Cn, we see that we also need to take a complex conjugate of the second component. Thus, we define the standard Hermitian inner product on M(m-by-n)(C) by (the inner product of matrices Z and W) is equal to the trace of (((W-conjugate)-transposed)(Z)). Note: As we did in the real case, whenever we use Cn or M(m-by-n)(C), we mean with the standard Hermitian inner product unless specified otherwise.

Example: Let Z = [2 + i, 1; i, 1 – i], W = [3, 2 – 3i; -2i, 1 + 2i] be matrices in M(2-by-2)(C). Find the inner product of Z and W. Solution: As in the real case, we would not actually calculate the trace of (((W-conjugate)-transpose)(Z)). We instead use the fact that it mimics the standard Hermitian inner product on C4, remembering to take the conjugate of the second vector. So we have (the inner product of Z and W) is equal to (2 + i)(3) + (1)(2 + 3i) + (i)(2i) + (1 – i)(1 – 2i). Calculating this out, we get 5 + 3i.
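As a check that the trace formula and the entry-by-entry shortcut agree for this example, here is a minimal Python sketch using nested lists as matrices. The helper names conj_transpose, matmul, and trace are assumptions introduced for illustration.

```python
def conj_transpose(A):
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

Z = [[2 + 1j, 1], [1j, 1 - 1j]]
W = [[3, 2 - 3j], [-2j, 1 + 2j]]

# Definition: trace of ((W-conjugate)-transpose)(Z).
via_trace = trace(matmul(conj_transpose(W), Z))

# Shortcut: treat Z and W as vectors in C^4 and conjugate the entries of W.
entrywise = sum(z * complex(w).conjugate()
                for z_row, w_row in zip(Z, W)
                for z, w in zip(z_row, w_row))

print(via_trace, entrywise)
```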

Length and Orthogonality

We can now define length and orthogonality to match what we did in the real case. Definition: Let V be a Hermitian inner product space. For any vector z in V, we define the length of z by (the length of z) is equal to the square root of (the inner product of z with itself). For any vectors z and w in V, we say that z and w are orthogonal if (the inner product of z and w) is equal to 0. A set B = {z1 to zl} in V is said to be orthogonal if (the inner product of zj and zk) = 0 whenever j is not equal to k. And B is said to be orthonormal if it is orthogonal and every vector in the set is a unit vector, that is, has length 1.

Of course, we get the same results as in the real case. Theorem 11.4.2: Let V be a Hermitian inner product space. For any z and w in V, and alpha in C, we have the length of ((alpha)z) is equal to (the absolute value of alpha)(the length of z), and the length of (z + w) is less than or equal to (the length of z) + (the length of w).

Theorem 11.4.3: If {z1 to zk} is an orthonormal set in a Hermitian inner product space, then {z1 to zk} is linearly independent, and (the length of (z1 + up to zk))-squared is equal to (the length of z1)-squared + up to (the length of zk)-squared.
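These length properties can be spot-checked in C2 with a short Python sketch. It is only a numerical illustration on hypothetical sample vectors, not a proof, and the helper names inner and length are assumptions.

```python
import math

def inner(z, w):
    # Standard Hermitian inner product on C^n (conjugate on the second vector).
    return sum(zj * complex(wj).conjugate() for zj, wj in zip(z, w))

def length(z):
    # <z, z> is real and non-negative, so .real only drops a zero imaginary part.
    return math.sqrt(inner(z, z).real)

z = [1 + 1j, 2 - 1j]
w = [-2 + 1j, 3 + 2j]
alpha = 3 - 4j  # hypothetical scalar with absolute value 5

# Length of (alpha)z versus (absolute value of alpha)(length of z).
scaled = length([alpha * zj for zj in z])
print(scaled, abs(alpha) * length(z))

# Triangle inequality.
tri_lhs = length([zj + wj for zj, wj in zip(z, w)])
print(tri_lhs, length(z) + length(w))

# An orthonormal pair in C^2 and the Pythagorean-style identity.
s = 1 / math.sqrt(2)
e1 = [s, s * 1j]
e2 = [s, -s * 1j]
pyth_lhs = length([a + b for a, b in zip(e1, e2)]) ** 2
pyth_rhs = length(e1) ** 2 + length(e2) ** 2
print(inner(e1, e2), pyth_lhs, pyth_rhs)
```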

Note that all of our concepts and theory for orthogonal complements, coordinates with respect to an orthonormal basis, and projections are the same, except we now must be careful because the inner product is no longer symmetric. That is, we must make sure that we get the vectors in the inner products in the correct order.

Let’s look at one example showing the good old Gram-Schmidt procedure is also still the same. Note that since we intend to mimic orthogonal diagonalization, it will be necessary for us to use the Gram-Schmidt procedure to find orthonormal bases of eigenspaces. Example: Use the Gram-Schmidt procedure to transform the basis {z1, z2, z3}, where z1 = [i; 0; 1], z2 = [0; 1; i], z3 = [-1; 1; i], into an orthonormal basis for C3. Solution: Let w1 = z1 = [i; 0; 1]. Since (the inner product of z2 and w1) = i and (the length of w1)-squared = 2, we find that z2 minus ((the inner product of z2 and w1) divided by (the length of w1)-squared)(w1) is equal to z2 minus (i/2)(w1), which is [1/2; 1; i/2]. So, for simplicity, we pick w2 = [1; 2; i]. Next, (the inner product of z3 and w1) = 2i, (the inner product of z3 and w2) = 2, and (the length of w2)-squared = 6. So we find w3 equals z3 minus ((the inner product of z3 and w1) divided by (the length of w1)-squared)(w1) minus ((the inner product of z3 and w2) divided by (the length of w2)-squared)(w2), which is z3 minus (i)(w1) minus (1/3)(w2), which is equal to [-1/3; 1/3; -i/3]. Normalizing the vectors, we get the orthonormal basis {[i/(root 2); 0; 1/(root 2)], [1/(root 6); 2/(root 6); i/(root 6)], [-1/(root 3); 1/(root 3); -i/(root 3)]}.
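The Gram-Schmidt steps in this example can be sketched in Python as follows. This is a minimal classical Gram-Schmidt, assuming the input vectors are linearly independent; the function names inner and gram_schmidt are illustrative. Note the order of the arguments in the projection coefficient: the vector being reduced goes in the first slot.

```python
import math

def inner(z, w):
    # Standard Hermitian inner product on C^n (conjugate on the second vector).
    return sum(zj * complex(wj).conjugate() for zj, wj in zip(z, w))

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same space (inputs assumed independent)."""
    ortho = []
    for z in vectors:
        w = list(z)
        for u in ortho:
            # Projection coefficient <z, u> / ||u||^2, with z in the FIRST slot.
            coeff = inner(z, u) / inner(u, u).real
            w = [wj - coeff * uj for wj, uj in zip(w, u)]
        ortho.append(w)
    # Normalize each orthogonal vector to unit length.
    return [[wj / math.sqrt(inner(w, w).real) for wj in w] for w in ortho]

z1 = [1j, 0, 1]
z2 = [0, 1, 1j]
z3 = [-1, 1, 1j]

basis = gram_schmidt([z1, z2, z3])
for b in basis:
    print(b)
```

Swapping the arguments in the coefficient would conjugate it, which in general breaks orthogonality, so the order matters here in a way it does not in the real case.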

Unitary Matrices

Now we can continue our goal of trying to mimic orthogonal diagonalization by defining the complex equivalent of an orthogonal matrix. Definition: A matrix U in M(n-by-n)(C) is said to be unitary if its columns form an orthonormal basis for Cn.

Notice that if P in M(n-by-n)(R) is orthogonal, then P is also unitary, so unitary is the direct extension of orthogonal.

Of course, we get many of the same properties. Theorem 11.4.4: If U is in M(n-by-n)(C), then the following are equivalent.

  1. The columns of U form an orthonormal basis for Cn.
  2. U-inverse equals the conjugate-transpose of U.
  3. The rows of U form an orthonormal basis for Cn.

The proof is essentially the same as the corresponding theorem for orthogonal matrices, and so is omitted.
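To illustrate the equivalence on one matrix, here is a minimal Python sketch. The 2-by-2 matrix U is a hypothetical example chosen for illustration, and the helper names are assumptions. The entries of ((U-conjugate)-transpose)(U) are inner products of the columns of U, and the entries of (U)((U-conjugate)-transpose) are inner products of the rows, so both products should be the identity matrix.

```python
import math

def conj_transpose(A):
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_identity(M, tol=1e-12):
    n = len(M)
    return all(abs(M[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

# Hypothetical example: columns [1; i]/(root 2) and [1; -i]/(root 2).
s = 1 / math.sqrt(2)
U = [[s, s], [s * 1j, -s * 1j]]
U_star = conj_transpose(U)

columns_orthonormal = is_identity(matmul(U_star, U))  # conditions 1 and 2
rows_orthonormal = is_identity(matmul(U, U_star))     # condition 3
print(columns_orthonormal, rows_orthonormal)
```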

Theorem 11.4.5: If U1 and U2 are n-by-n unitary matrices, then

  1. (The length of (U1z)) equals (the length of z) for all vectors z in Cn.
  2. The absolute value of the determinant of U1 is equal to 1. And,
  3. U1U2 is also unitary.

I will prove the first property and leave the other two as exercises. Proof: For any vector z in Cn, we have (the length of (U1z))-squared equals (the inner product of (U1z) with itself), which, by definition of the standard Hermitian inner product on Cn, is equal to ((U1z) dot product (the conjugate of (U1z))), which is, as usual, ((U1z)-transpose)(the conjugate of (U1z)). This equals (z-transpose)(U1-transpose)(U1-conjugate)(z-conjugate). Taking a double conjugate of U1-transpose, this is equal to (z-transpose)(((U1-transpose)-conjugate)-conjugate)(U1-conjugate)(z-conjugate). We can rearrange this as (z-transpose)((((U1-conjugate)-transpose)(U1))-all-conjugated)(z-conjugate). By Theorem 11.4.4, ((U1-conjugate)-transpose) is the inverse of U1, and so ((U1-conjugate)-transpose)(U1) is the identity matrix. Thus, the expression equals (z-transpose)(the conjugate of the identity matrix)(the conjugate of z). Since the identity matrix has real entries, this simplifies to (z-transpose)(the conjugate of z), which is (the inner product of z with itself), that is, (the length of z)-squared.

Notice the difference between the proof of this property and the proof of the corresponding property for orthogonal matrices. Because of the complex conjugate in the definition of the standard inner product, we needed to include a few extra steps. Take a minute to read over the proof, and make sure that you understand each step. Also, it is important to remember that we are now using complex numbers. So the fact that the absolute value of (the determinant of U) equals 1 means that the determinant of U can be any complex number with absolute value 1. So, for example, it can take values of 1, -1, as well as i, -i, (1 + i)/(root 2), and infinitely many other choices.
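The three properties of Theorem 11.4.5 can be spot-checked numerically with a short Python sketch. The matrices U1 and U2 and the vector z are hypothetical examples, the helper names are assumptions, and det2 only handles the 2-by-2 case. In this example the determinant of U1 works out to -i, a non-real number of absolute value 1.

```python
import math

def inner(z, w):
    return sum(zj * complex(wj).conjugate() for zj, wj in zip(z, w))

def length(z):
    return math.sqrt(inner(z, z).real)

def matvec(A, z):
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(A):
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def det2(A):
    # Determinant of a 2-by-2 matrix only.
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

s = 1 / math.sqrt(2)
U1 = [[s, s], [s * 1j, -s * 1j]]   # hypothetical unitary matrix
U2 = [[0, 1j], [1j, 0]]            # another hypothetical unitary matrix
z = [1 + 1j, 2 - 1j]

# Property 1: multiplying by U1 preserves length.
print(length(matvec(U1, z)), length(z))

# Property 2: the determinant has absolute value 1 but need not be 1 or -1.
d = det2(U1)
print(d, abs(d))

# Property 3: the product U1U2 is unitary, so (U1U2)-star times U1U2 is the identity.
P = matmul(U1, U2)
PP = matmul(conj_transpose(P), P)
print(PP)
```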

We have now needed the conjugate transpose of a matrix a couple of times. Thus, we introduce some notation for it. Definition: Let A be a matrix in M(m-by-n)(C). We call (A-conjugate)-transpose the conjugate transpose of A, and denote it by A-star.

Example: If A is the matrix [1, -2; -i, 1 + 3i; 4 – 2i, 5i], then A-star, the conjugate transpose of A, is [1, i, 4 + 2i; -2, 1 – 3i, -5i].
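A conjugate transpose is easy to compute by hand, but a small Python sketch makes the two steps explicit. The function name conj_transpose is an assumption, and matrices are represented as nested lists of rows.

```python
def conj_transpose(A):
    # Entry (j, i) of the result is the conjugate of entry (i, j) of A.
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

A = [[1, -2],
     [-1j, 1 + 3j],
     [4 - 2j, 5j]]

A_star = conj_transpose(A)
for row in A_star:
    print(row)

# Applying the operation twice returns the original matrix.
back = conj_transpose(A_star)
```

The 3-by-2 matrix becomes a 2-by-3 matrix, and each entry is conjugated as it is moved.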

Notice that if A has all real entries, then A-star = A-transpose. Thus, A-star is the extension of the transpose to the complex case, so it should not be surprising that the conjugate transpose has many of the same properties.

Theorem 11.4.6: If A and B are complex matrices, and alpha is a complex scalar, then

  1. (The inner product of (Az) and w) is equal to (the inner product of z and ((A-star)w)) for all z and w in Cn.
  2. The conjugate transpose of (A-conjugate)-transpose just equals A.
  3. ((A + B)-conjugate)-transpose = ((A-conjugate)-transpose) + ((B-conjugate)-transpose).
  4. (((alpha)A)-conjugate)-transpose is equal to (the conjugate of alpha)((A-conjugate)-transpose). And,
  5. ((AB)-conjugate)-transpose = ((B-conjugate)-transpose)((A-conjugate)-transpose).

Properties 2 through 5 follow immediately from properties of the complex conjugate and the transpose.

Even though the proof of (1) is very similar to the real case, I will again prove it so that you can see the difference of adding in the complex conjugate. Proof: We have (the inner product of (Az) and w) is equal to ((Az)-transpose)(the conjugate of w), which is equal to (z-transpose)(A-transpose)(the conjugate of w). Once again, using a double conjugate, this is equal to (z-transpose)(((A-transpose)-conjugate)-conjugate)(w-conjugate), which is equal to (z-transpose)(the conjugate of ((A-star)w)), which, by definition of the inner product, is equal to (the inner product of z and ((A-star)w)).
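Properties 1 and 5 of Theorem 11.4.6 can be spot-checked numerically with a short Python sketch. The matrices A and B and the vectors z and w are hypothetical examples, and the helper names are assumptions. This is an illustration on one case, not a substitute for the proof.

```python
def inner(z, w):
    # Standard Hermitian inner product on C^n (conjugate on the second vector).
    return sum(zj * complex(wj).conjugate() for zj, wj in zip(z, w))

def matvec(A, z):
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(A):
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

A = [[1 + 1j, 2], [-1j, 3 - 1j]]   # hypothetical matrix
B = [[2, 1j], [1 - 1j, 0]]         # hypothetical matrix
z = [1 + 1j, 2 - 1j]
w = [-2 + 1j, 3 + 2j]

# Property 1: moving A to the second slot turns it into A-star.
lhs = inner(matvec(A, z), w)
rhs = inner(z, matvec(conj_transpose(A), w))
print(lhs, rhs)

# Property 5: the conjugate transpose reverses the order of a product.
AB_star = conj_transpose(matmul(A, B))
B_star_A_star = matmul(conj_transpose(B), conj_transpose(A))
print(AB_star)
print(B_star_A_star)
```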

We are now ready to start looking at unitary diagonalization. We will do this in the next lecture. This concludes this lecture.

© University of Waterloo and others, Powered by Maplesoft