Lesson: Change of Coordinates with Orthonormal Bases

Transcript

Because it is so easy to find the coordinates with respect to an orthonormal basis B of vectors {v1 through vn}, we will easily be able to find the change of coordinates matrix from the standard basis S to B. In fact, it will be easier than we could have hoped. After all, the change of coordinates matrix Q from S to B satisfies the following. Q is the matrix whose columns are the B-coordinates of e1 through the B-coordinates of en. But what are the B-coordinates of ej? Well, if we set the B-coordinates of ej equal to [b1j through bnj], then we know that bij, which is the ith coordinate of ej, must be ej dot vi. So note that I’ve chosen this numbering system so that bij is qij, the ijth entry of Q. But ej dot vi is simply the jth component of vi since all other components of ej equal 0. So the vector e1 will select all the first components of our vectors in B, the vector e2 will select all the second components of our vectors in B, and so on.

So we see that the matrix Q looks like this. Thinking of it columnwise, we see that the first column consists of the first entry of v1, then the first entry of v2, through to the first entry of vn. Our second column consists of the second entry of v1, then the second entry of v2, through to the second entry of vn, and so on, all the way to our nth column, which consists of the nth entry of v1, then the nth entry of v2, through to the nth entry of vn.
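Written out, the matrix just described can be displayed as follows. The notation (v_i)_j for the jth component of vi is an assumption made for this display, not notation from the lesson.

```latex
Q =
\begin{bmatrix}
(\vec{v}_1)_1 & (\vec{v}_1)_2 & \cdots & (\vec{v}_1)_n \\
(\vec{v}_2)_1 & (\vec{v}_2)_2 & \cdots & (\vec{v}_2)_n \\
\vdots & \vdots & \ddots & \vdots \\
(\vec{v}_n)_1 & (\vec{v}_n)_2 & \cdots & (\vec{v}_n)_n
\end{bmatrix},
\qquad
q_{ij} = \vec{e}_j \cdot \vec{v}_i = (\vec{v}_i)_j .
```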

But now that we’ve created Q, let’s turn our attention to its rows. We’ll see that the first row of Q simply lists the components of v1, right, the first entry is the first entry of v1, then the second entry of v1, through to the nth entry of v1. Our second row is the first entry of v2, then the second entry of v2, then through the nth entry of v2. So using our notation, we see that the ith row of Q is the transpose of [bi1 through bin], where the jth entry of the ith row, bij, is the jth component of vi. So that is, the ith row of Q is vi-transpose.

Now, this means that if we were to set P equal to the matrix whose columns are v1 through vn, which, amongst other things, is the B to S change of coordinates matrix, then we’ll see that Q = P-transpose. But since Q changes coordinates from S to B and P changes them from B to S, we also know that Q = P-inverse, and so we have that P-inverse = P-transpose. With this result in mind, we make the following definition.
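As a small numeric illustration of this conclusion, the sketch below builds Q entry by entry as ej dot vi and compares it with P-transpose. The basis used is a hypothetical orthonormal basis of R^2 chosen for illustration, not one from the lesson, and the helper and variable names are assumptions of this sketch.

```python
import math

def dot(x, y):
    """Dot product of two vectors stored as lists."""
    return sum(a * b for a, b in zip(x, y))

# Hypothetical orthonormal basis B = {v1, v2} of R^2 (illustrative only).
s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]
n = len(basis)

# Standard basis vectors e1, ..., en.
standard = [[1.0 if k == j else 0.0 for k in range(n)] for j in range(n)]

# q_ij = e_j . v_i, the ith B-coordinate of e_j.
Q = [[dot(standard[j], basis[i]) for j in range(n)] for i in range(n)]

# P has v1, ..., vn as its columns, so P[k][i] is the kth component of v_i.
P = [[basis[i][k] for i in range(n)] for k in range(n)]
P_transpose = [[P[k][i] for k in range(n)] for i in range(n)]

# Compare entrywise with a small tolerance, since the entries are floats.
matches = all(
    abs(Q[i][j] - P_transpose[i][j]) < 1e-12
    for i in range(n)
    for j in range(n)
)
print(matches)
```

Each row of Q here is one of the basis vectors, which is the same observation the transcript makes about the rows of Q.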

An n-by-n matrix P such that (P-transpose)P equals the identity matrix is called an orthogonal matrix. It follows that P-inverse = P-transpose, and that P(P-transpose) equals the identity matrix equals (P-transpose)P.
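The definition translates directly into a check. The following is a minimal sketch, assuming matrices are stored as lists of rows and that a small tolerance is acceptable for floating-point rounding; the function names and the two test matrices (a rotation and a shear) are illustrative choices, not part of the lesson.

```python
import math

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    B_cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols] for row in A]

def is_identity(M, tol=1e-12):
    """True when the square matrix M equals the identity up to tol."""
    n = len(M)
    return all(
        abs(M[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(n)
        for j in range(n)
    )

def is_orthogonal(P, tol=1e-12):
    """True when (P-transpose)P is the identity, up to rounding error."""
    return is_identity(matmul(transpose(P), P), tol)

theta = 0.3
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]
shear = [[1.0, 1.0],
         [0.0, 1.0]]

print(is_orthogonal(rotation))                              # rotation matrices are orthogonal
print(is_identity(matmul(rotation, transpose(rotation))))   # P(P-transpose) is also the identity
print(is_orthogonal(shear))                                 # the shear is not orthogonal
```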

Theorem 7.1.3: The following are equivalent for an n-by-n matrix P.

  1. P is orthogonal.
  2. The columns of P form an orthonormal set.
  3. The rows of P form an orthonormal set.

Let’s look at the proof. Now usually, if I were proving a the-following-are-equivalent statement such as this one, I would prove it in a circular fashion, such as showing that (1) implies (2), which implies (3), which implies (1). But in this particular case, I believe it is easier to prove (1) if and only if (2) and (1) if and only if (3) separately. But first, we want to recall our definition of matrix multiplication.

Let B be an m-by-n matrix with rows b1-transpose through bm-transpose, and let A be an n-by-p matrix with columns a1 through ap. Then we define the matrix product BA to be the matrix whose ijth entry is bi dot aj. So that is, when we multiply two matrices, we take the dot product of the ith row of the matrix on the left with the jth column of the matrix on the right.
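This row-times-column description of the product can be sketched in a few lines. The matrices B and A below are made-up examples with m = 2, n = 3, and p = 2, and the function names are assumptions of this sketch.

```python
def dot(x, y):
    """Dot product of two vectors stored as lists."""
    return sum(a * b for a, b in zip(x, y))

def product(B, A):
    """Return BA, where B is m-by-n and A is n-by-p (lists of rows)."""
    if len(B[0]) != len(A):
        raise ValueError("B must have as many columns as A has rows")
    cols_of_A = [list(col) for col in zip(*A)]
    # The ijth entry is (ith row of B) . (jth column of A).
    return [[dot(b_i, a_j) for a_j in cols_of_A] for b_i in B]

B = [[1, 2, 3],
     [4, 5, 6]]
A = [[1, 0],
     [0, 1],
     [1, 1]]

BA = product(B, A)
print(BA)  # [[4, 5], [10, 11]]
```

For instance, the entry in row 1 and column 2 is (1, 2, 3) dot (0, 1, 1) = 5.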

With that in mind, let’s proceed with our proof. To show that P is orthogonal if and only if the columns of P form an orthonormal set, let’s let v1 through vn be the columns of P. Then substituting P-transpose for B and P for A in the definition of matrix multiplication, we see that the ijth entry of (P-transpose)P is the dot product of the ith row of P-transpose with the jth column of P. But by the definition of the transpose, we have that the ith row of P-transpose is the same as the ith column of P. And so we see that the ijth entry of (P-transpose)P is vi dot vj. So, P is an orthogonal matrix if and only if (P-transpose)P = I. But (P-transpose)P is the identity matrix if and only if its ii entries equal 1 and its ij entries equal 0 for i not equal to j. Well, this happens if and only if vi dot vi = 1, and vi dot vj = 0 when i is not equal to j. By definition, this happens if and only if the set {v1 through vn} is orthonormal.
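In symbols, this half of the proof can be summarized as follows. This is only a restatement of the argument above, with the columns of P written as v1 through vn.

```latex
(P^T P)_{ij} = \vec{v}_i \cdot \vec{v}_j,
\qquad\text{so}\qquad
P^T P = I
\iff
\vec{v}_i \cdot \vec{v}_j =
\begin{cases}
1 & \text{if } i = j, \\
0 & \text{if } i \neq j,
\end{cases}
\iff
\{\vec{v}_1, \ldots, \vec{v}_n\} \text{ is orthonormal.}
```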

Now let’s show that P is orthogonal if and only if the rows of P form an orthonormal set. Well, this proof proceeds as above, except that we will want to look at the product P(P-transpose). Let’s let w1-transpose through wn-transpose be the rows of P. Well, then the ijth entry of P(P-transpose) is the dot product of the ith row of P with the jth column of P-transpose. But by the definition of transpose, we know that the jth column of P-transpose is the same as the jth row of P. As such, the ijth entry of P(P-transpose) is wi dot wj. So, P is an orthogonal matrix if and only if (P-transpose)P equals the identity matrix, which happens if and only if P(P-transpose) equals the identity matrix. But P(P-transpose) is the identity matrix if and only if its ii entries equal 1 and its ij entries equal 0 for i not equal to j. This happens if and only if wi dot wi = 1 and wi dot wj = 0 when i is not equal to j. And by definition, this happens if and only if the set {w1 through wn} is orthonormal.
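To see both halves of the theorem at once, the sketch below computes the dot products of the columns and of the rows directly, without forming a matrix product. The 3-by-3 matrix is a hypothetical example chosen so that exact fractions can be used, and `gram` is a name invented for this sketch.

```python
from fractions import Fraction

def dot(x, y):
    """Dot product of two vectors stored as lists."""
    return sum(a * b for a, b in zip(x, y))

def gram(vectors):
    """Matrix whose ijth entry is (vector i) . (vector j)."""
    return [[dot(u, w) for w in vectors] for u in vectors]

# Hypothetical matrix P (illustrative only), with exact rational entries.
third = Fraction(1, 3)
P = [[2 * third, -2 * third,  1 * third],
     [1 * third,  2 * third,  2 * third],
     [2 * third,  1 * third, -2 * third]]

rows = P
columns = [list(col) for col in zip(*P)]
identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

print(gram(columns) == identity)  # entries of (P-transpose)P
print(gram(rows) == identity)     # entries of P(P-transpose)
```

Both comparisons come out true for this matrix, in line with the theorem: its columns and its rows each form an orthonormal set.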

Now, based on this theorem, you would think that we would call these matrices orthonormal matrices, but unfortunately, that is not standard practice. So, please remember that an orthogonal matrix must have orthonormal columns and rows, not merely orthogonal ones. Moreover, although this theorem is stated in terms of our definition of orthonormal, our definition of an orthogonal matrix makes it quite easy to check whether a set is orthonormal.

Let’s look at some uses of the theorem. To see that the set B shown here is an orthonormal set, we look at the matrix (P-transpose)P, where P is the matrix whose columns are the vectors in B. So we’ll look at the matrix P-transpose, whose rows are the vectors in B, times the matrix P, whose columns are the vectors in B, and if we perform the calculation, we do, in fact, see that we get the identity matrix. Since (P-transpose)P equals the identity matrix, we see by definition that P is an orthogonal matrix, and from our theorem, this means that the columns of P are an orthonormal set.

We can also use this to see that a set is not orthonormal. For example, let’s look at a new set B, shown here. Again, we can look at the matrix (P-transpose)P, where P-transpose is the matrix whose rows are the vectors in B, times the matrix P, whose columns are the vectors in B. And when we do our matrix product, we get this, which is not the identity matrix. So, this means the columns of P do not form an orthonormal set.

Of more interest, though, is the fact that if we recall that the ijth entry of (P-transpose)P is vi dot vj, we can use the product (P-transpose)P to tell us why B is not an orthonormal set by looking at where our matrix product fails to be the identity matrix. The first thing we’ll notice is that p13, the entry in row 1 and column 3 of the product, is not equal to 0. Well, this means that v1 dot v3 is not equal to 0; that is, the first and third vectors in B are not orthogonal. We also see that the entry p31 is not equal to 0, but this simply tells us that v3 dot v1 is not equal to 0, which again means that the first and third vectors in B are not orthogonal, so we knew this already.

However, there is another reason that B is not an orthonormal set, which we discover when we look at the entry p22 of the product. Since p22 = v2 dot v2 = (the norm of v2)-squared, and since p22 is not equal to 1, we see that the norm of v2 is not equal to 1, and so the second vector in B is not a unit vector.
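This way of reading off the failures can be sketched as a small diagnostic. The set used below is not the one from the lesson; it is a hypothetical set constructed to fail in the same two ways (v1 and v3 not orthogonal, v2 not of norm 1), and `diagnose` is a name invented here. Exact fractions are used so that comparisons with 0 and 1 are safe; with floating-point entries a tolerance would be needed instead.

```python
from fractions import Fraction

def dot(x, y):
    """Dot product of two vectors stored as lists."""
    return sum(a * b for a, b in zip(x, y))

def diagnose(vectors):
    """List the reasons, if any, that the vectors fail to be orthonormal."""
    problems = []
    n = len(vectors)
    for i in range(n):
        # Diagonal entry of (P-transpose)P: v_i . v_i, the squared norm of v_i.
        norm_squared = dot(vectors[i], vectors[i])
        if norm_squared != 1:
            problems.append(
                f"v{i + 1} is not a unit vector: norm squared is {norm_squared}"
            )
        # Off-diagonal entries are symmetric, so only j > i needs checking.
        for j in range(i + 1, n):
            d = dot(vectors[i], vectors[j])
            if d != 0:
                problems.append(
                    f"v{i + 1} and v{j + 1} are not orthogonal: dot product is {d}"
                )
    return problems

# Hypothetical set (illustrative only, not the set B from the lesson).
example = [
    [Fraction(1), Fraction(0), Fraction(0)],
    [Fraction(0), Fraction(2), Fraction(0)],
    [Fraction(3, 5), Fraction(0), Fraction(4, 5)],
]

for line in diagnose(example):
    print(line)
```

For this set the diagnostic reports that v1 and v3 are not orthogonal and that v2 is not a unit vector, while an orthonormal set produces an empty list.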

© University of Waterloo and others, Powered by Maplesoft