# Lesson: Orthonormal Bases


## Transcript

Let’s stop and think about the features we really want in a basis. By its definition, a basis is a linearly independent spanning set, but the reason we wanted those features is that we wanted to be able to uniquely write every vector as a linear combination of the basis vectors. That is, we wanted to be able to assign coordinates based on a basis.

Once we have coordinates, we can use all of the results from Rn, including finding a matrix to represent any linear transformation. So while any basis will lead to coordinates, it would be nice if those coordinates were easy to find. That’s one of the great things about our standard bases.

But sometimes the standard basis doesn’t fit with our situation. Various geometrical transformations are better defined with respect to other bases, for example. So let’s see if we can figure out a more general class of basis whose coordinates are easy to find, but that aren’t necessarily the standard basis—or just some rearrangement of it.

First, I want to point out that from this point forward, we will again focus our attention purely on Rn since we now know that we can extend these results to any finite-dimensional vector space. And once we are focused on Rn, we can turn our attention to the dot product.

Let x be the vector whose entries are x1 through xn, and let y be the vector whose entries are y1 through yn, both vectors in Rn. Then the dot product of x and y, written x dot y, equals the sum x1y1 + x2y2 + through to xnyn. Again, we could easily extend this definition to matrices and polynomials, but we only need to define it in Rn; through the magic of coordinates, we extend this concept and its results to all finite-dimensional vector spaces, even ones we haven’t studied yet.
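The dot product formula above can be sketched in a few lines of Python, using plain lists to stand in for vectors in Rn (the function name `dot` is just an illustrative choice, not anything from the course):

```python
def dot(x, y):
    """Return the dot product x1*y1 + x2*y2 + ... + xn*yn of two vectors in R^n."""
    assert len(x) == len(y), "vectors must live in the same R^n"
    return sum(xi * yi for xi, yi in zip(x, y))

print(dot([2, -3], [6, 4]))    # 2*6 + (-3)*4 = 0
print(dot([1, 2, 3], [4, 5, 6]))  # 4 + 10 + 18 = 32
```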

We also want to recall that two vectors x and y are orthogonal if x dot y = 0. We now wish to extend this notion of orthogonality to sets of vectors. A set of vectors {v1 through vk} in Rn is orthogonal if vi dot vj = 0 whenever i is not equal to j.

For example, the set {[2; -3], [6; 4]} is orthogonal because [2; -3] dot [6; 4] = 12 – 12, which is 0. The set {[7; 4], [-1; 2]} is not orthogonal because [7; 4] dot [-1; 2] = -7 + 8, which is not 0.

This larger set containing the vectors [3; 1; 4], [-1; -1; 1], [5; -7; –2] is orthogonal. We have to check three dot products to see this. First, we see that [3; 1; 4] dot [-1; -1; 1] = -3 – 1 + 4, which is 0. Next, we check that [3; 1; 4] dot [5; -7; -2], which equals 15 – 7 – 8, is equal to 0. And last, we check that [-1; -1; 1] dot [5; -7; -2] = -5 + 7 – 2, which equals 0 as well.
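Checking every pair by hand gets tedious as the set grows, so here is a small sketch that automates the pairwise check from the definition (again, the helper names are my own, not notation from the course):

```python
from itertools import combinations

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def is_orthogonal(vectors):
    """True if every distinct pair in the set has dot product 0."""
    return all(dot(u, v) == 0 for u, v in combinations(vectors, 2))

# The three-vector set from the example: all three pairwise dot products are 0.
S = [[3, 1, 4], [-1, -1, 1], [5, -7, -2]]
print(is_orthogonal(S))   # True

# One non-orthogonal pair is enough to disqualify a set.
print(is_orthogonal([[8, -3, -12, -2], [6, -5, -9, 4]]))   # False (dot product is 163)
```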

This even larger set is not orthogonal. We can show this by simply showing that one pair of vectors is not orthogonal. For example, the vector [8; -3; -12; -2] dot [6; -5; -9; 4] = 48 + 15 + 108 – 8, which is 163, not 0.

As one last example, we of course note that the standard basis {e1 through en} for Rn is orthogonal since ei dot ej = 0 if i does not equal j. This example shows that orthogonality is one of the properties of our standard basis, which we are trying to emulate.

Well, why are orthogonal sets so great? Well, here’s the first thing. If {v1 through vk} is an orthogonal set of non-zero vectors in Rn, then it is linearly independent.

Note first that we need the vectors to be non-zero since any set containing the 0 vector is linearly dependent. And so let’s assume that {v1 through vk} is an orthogonal set of non-zero vectors in Rn. To see that {v1 through vk} is linearly independent, let’s assume that we have scalars c1 through ck such that the sum of c1v1 through ckvk equals 0. Well, then for any i from 1 to k, we can take the dot product of vi with both sides of this equation. This gives us that the dot product of (the sum of c1v1 through ckvk) and vi equals 0 dot vi. Well, of course, 0 dot vi is simply 0. Meanwhile, using the distributive properties of the dot product on the left, we see that it equals ((c1v1) dot vi) + through to ((ckvk) dot vi). We can also use associative properties of the dot product to pull out our scalars c, so that our dot product becomes c1(v1 dot vi) + through to ci(vi dot vi) + through to ck(vk dot vi).

Now, since the set {v1 through vk} is orthogonal, we know that vj dot vi will equal 0 whenever j does not equal i. So, for example, our v1 dot vi = 0, and our vk dot vi = 0, and in fact, the only dot product that is not equal to 0 is vi dot vi, which becomes (the norm of vi)-squared. So this means that our dot product on the left becomes simply ci((the norm of vi)-squared), and that this must equal 0. But remember that we chose our set to have only non-zero vectors, so vi is not equal to 0, which means that the norm of vi cannot equal 0. This means that we can divide by the norm and get that ci = 0. And since we could have done this for any i from 1 to k, we have shown that all the ci = 0, and this means that our set {v1 through vk} is linearly independent.
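The same dot-product trick from the proof also shows why orthogonal bases make coordinates easy to find: if w is a linear combination c1v1 + through to ckvk, then dotting w with vi kills every term except ci(vi dot vi), so ci = (w dot vi)/(vi dot vi). Here is a small sketch of that computation, using the orthogonal set from the earlier example and exact fractions to avoid rounding (the coefficients 2, -1, 3 are just an illustrative choice):

```python
from fractions import Fraction

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# Orthogonal set from the earlier example.
v1, v2, v3 = [3, 1, 4], [-1, -1, 1], [5, -7, -2]

# Build w = 2*v1 - 1*v2 + 3*v3, then recover the coefficients from w alone.
w = [2*a - 1*b + 3*c for a, b, c in zip(v1, v2, v3)]

# Dotting w with vi leaves only ci*(vi . vi), exactly as in the proof,
# so ci = (w . vi) / (vi . vi).
coeffs = [Fraction(dot(w, v), dot(v, v)) for v in (v1, v2, v3)]
print(coeffs)   # [Fraction(2, 1), Fraction(-1, 1), Fraction(3, 1)]
```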

In addition to being orthogonal, the standard basis has one other nice property. The length, or norm, of each of the vectors is 1. If we add that requirement to a general orthogonal set, we say that the set is orthonormal.

A set {v1 through vk} of vectors in Rn is orthonormal if it is orthogonal and each vector vi is a unit vector—that is, each vector is normalized.

Note that since the 0 vector does not have length 1, it can never be in an orthonormal set. So, using Theorem 7.1.1, we see that all orthonormal sets are linearly independent.
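Putting the two requirements together, a quick numerical check for orthonormality looks like this (floating-point square roots force us to compare against a small tolerance rather than testing for exact zeros; the tolerance value is an arbitrary choice):

```python
import math
from itertools import combinations

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def is_orthonormal(vectors, tol=1e-12):
    """Orthogonal (pairwise dot products ~0) and every vector has norm ~1."""
    pairs_ok = all(abs(dot(u, v)) < tol for u, v in combinations(vectors, 2))
    norms_ok = all(math.isclose(math.sqrt(dot(v, v)), 1.0) for v in vectors)
    return pairs_ok and norms_ok

# The set from the upcoming example: {[2/√13; -3/√13], [6/√52; 4/√52]}.
r13, r52 = math.sqrt(13), math.sqrt(52)
S = [[2 / r13, -3 / r13], [6 / r52, 4 / r52]]
print(is_orthonormal(S))              # True
print(is_orthonormal([[2, -3], [6, 4]]))  # False: orthogonal, but the norms are not 1
```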

Let’s look at some examples. The set {[2/(root 13); -3/(root 13)], [6/(root 52); 4/(root 52)]} is an orthonormal set since the dot product of our vectors ends up being 12/(root 676) – 12/(root 676), so that is 0. We see that the norm of our first vector equals the square root of (4/13 + 9/13), which is 1, and the norm of our second vector ends up being the square root of (36/52 + 16/52), which also equals 1.

The larger set seen here is also orthonormal. Again, this time we have to check the dot products of three pairs of vectors. The dot product of our first vector and our second vector equals -3/(root 78) – 1/(root 78) + 4/(root 78), which equals 0. The dot product of our first vector and our third vector equals 15/(root 2028) – 7/(root 2028) – 8/(root 2028), which equals 0. And the dot product of our second vector and our third vector equals -5/(root 234) + 7/(root 234) – 2/(root 234), which equals 0. That shows orthogonality.

Now we look at the normal part, and we see that the norm of our first vector is the square root of (9/26 + 1/26 + 16/26), which does, in fact, equal 1. The norm of our second vector equals the square root of (1/3 + 1/3 + 1/3), which equals 1. And the norm of our third vector is the square root of (25/78 + 49/78 + 4/78), which equals 1.

Now, note that these two examples were both obtained from the orthogonal sets in our first example by simply finding the unit vectors that correspond to the original vectors. This is, in fact, the usual way that orthonormal sets are created. First, worry about creating an orthogonal set, and then worry about making it orthonormal.
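That two-step recipe, first make the set orthogonal, then normalize each vector, can be sketched as follows, using the orthogonal set from the first example:

```python
import math

def normalize(v):
    """Return the unit vector v / ||v||."""
    n = math.sqrt(sum(vi * vi for vi in v))
    return [vi / n for vi in v]

# Normalizing each vector of the orthogonal set from the first example
# produces the orthonormal set from the later example:
# [2; -3] becomes [2/√13; -3/√13], and [6; 4] becomes [6/√52; 4/√52].
orthogonal = [[2, -3], [6, 4]]
orthonormal = [normalize(v) for v in orthogonal]
```

Note that scaling a vector never changes which vectors it is orthogonal to, which is why normalizing an orthogonal set always yields an orthonormal one.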