Lesson: Orthogonal Complement


Transcript

We have now seen that an orthonormal basis is a nice way to describe a subspace, but knowing that we want an orthonormal basis doesn’t make one fall into our lap. In theory, the process for finding an orthonormal basis is easy. Start with one vector, add a vector orthogonal to your first vector, then add a vector that’s orthogonal to the first two vectors, and so on. The problem comes when you start to deal with very large spaces, and you are trying to find a vector that is, for example, orthogonal to the thirty-seven vectors that came before it, especially if you were to consider that those vectors must have at least thirty-eight entries.

On the face of it, we already have all the techniques we need since the vector equation x dot v = 0 is simply a homogeneous linear equation whose variables are the entries of x. So, if we needed to make x orthogonal to thirty-seven vectors, we would be looking at solving a system of thirty-seven linear equations. Well, that would be a really big matrix to row reduce. So our first step on this journey will be to add some more definitions to our vocabulary, for while we have already defined what it means for one vector to be orthogonal to another, we are now looking at having our vector be orthogonal to a set of vectors.

Definition: Let S be a subspace of Rn. We shall say that a vector x is orthogonal to S if x dot s = 0 for all s in S.

For example, let S be the span of the vectors {[1; 0; 0; 0] and [0; 0; 0; 1]}. Then the vector [0; 1; 0; 0] is orthogonal to S. To see this, consider an arbitrary element of S, which we can write as a[1; 0; 0; 0] + b[0; 0; 0; 1], which equals [a; 0; 0; b]. Then we can easily see that [0; 1; 0; 0] dotted with [a; 0; 0; b] = 0, as desired. Similarly, we can see that [0; 3; 0; 0] is orthogonal to S since [0; 3; 0; 0] dotted with [a; 0; 0; b] is simply equal to 3[0; 1; 0; 0] dotted with [a; 0; 0; b], which is 3(0), which is 0. It is also easy to see that [0; 0; 1; 0] is orthogonal to S since we also have that [0; 0; 1; 0] dot [a; 0; 0; b] = 0. On the other hand, the vector [1; 1; 1; 1] is not orthogonal to S since [1; 0; 0; 0] is an element of S, but [1; 1; 1; 1] dot [1; 0; 0; 0] = 1, not 0.

As usual, the best way to show that x is not orthogonal to a subspace S is to find a specific element s of S such that x dot s does not equal 0.
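If you want to experiment with examples like this on a computer, here is a minimal sketch in Python with NumPy (the helper name is_orthogonal_to_span is ours, not something from the lesson). It checks a candidate vector against the two spanning vectors of S from the example above; as the next paragraph explains, checking a spanning set is enough.

```python
import numpy as np

# Spanning vectors of S from the example above.
spanning_vectors = [np.array([1, 0, 0, 0]), np.array([0, 0, 0, 1])]

def is_orthogonal_to_span(x, spanning):
    """True if x is orthogonal to every spanning vector (and hence to their whole span)."""
    return all(np.dot(x, v) == 0 for v in spanning)

print(is_orthogonal_to_span(np.array([0, 1, 0, 0]), spanning_vectors))  # True
print(is_orthogonal_to_span(np.array([0, 3, 0, 0]), spanning_vectors))  # True
print(is_orthogonal_to_span(np.array([0, 0, 1, 0]), spanning_vectors))  # True
print(is_orthogonal_to_span(np.array([1, 1, 1, 1]), spanning_vectors))  # False: [1;1;1;1] dot [1;0;0;0] = 1
```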

One thing to note is that in order to show that x is orthogonal to a subspace S, we only need to show that x is orthogonal to the basis vectors for S. In fact, we could just show that x is orthogonal to any spanning set for S; we don’t need to worry about the independence of the vectors for this property. If a set of vectors {a1 through ak} is a spanning set for S, then every element s of S can be written as s1a1 + through to skak for some scalars s1 through sk. Now, if we know that x is orthogonal to every vector ai, then we’ll see that x dot s, which equals x dot (s1a1 + through to skak), well, we can distribute the dot product to get that this is equal to (x dot s1a1) + through to (x dot skak), and then we can pull out the scalars to see that this equals s1(x dot a1) + through to sk(x dot ak). And since we’ve chosen x to be orthogonal to a1 through ak, we see that this is s1(0) + through to sk(0), which is 0.
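Written out symbolically, the chain of equalities in this argument is:

\[
x \cdot s \;=\; x \cdot (s_1 a_1 + \cdots + s_k a_k) \;=\; s_1 (x \cdot a_1) + \cdots + s_k (x \cdot a_k) \;=\; s_1(0) + \cdots + s_k(0) \;=\; 0.
\]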

We also see that if x is orthogonal to S, then tx is orthogonal to S for any scalar t since, of course, (tx) dot s equals t(x dot s), which is t(0), which equals 0. Also, if both x and y are orthogonal to S, then so is x + y, since (x + y) dot s can be broken into (x dot s) + (y dot s), which is 0 + 0, which is 0. Finally, the 0 vector is orthogonal to S, since 0 dot v = 0 for any vector v, and thus certainly for the ones in S, so the set of all vectors orthogonal to S is never empty. Putting these facts together, we have shown that the set of all vectors orthogonal to S is itself a subspace of Rn, since it is a non-empty subset of Rn that is closed under addition and scalar multiplication.

We give this subspace the following name. We call the set of all vectors orthogonal to S the orthogonal complement of S, and denote it S-perp. That is, S-perp equals the set of all x in Rn such that x dot s = 0 for all s in S.
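In symbols:

\[
S^{\perp} \;=\; \{\, x \in \mathbb{R}^n : x \cdot s = 0 \text{ for all } s \in S \,\}.
\]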

Let’s find a basis for S-perp where S is the span of the vectors {[1; 2; 0; 1] and [3; 6; 1; 4]}. Again, a vector is orthogonal to S if it is orthogonal to the vectors in a spanning set for S, so we are looking for vectors x (that is, [x1; x2; x3; x4]) such that [x1; x2; x3; x4] dotted with [1; 2; 0; 1] equals 0, and [x1; x2; x3; x4] dotted with [3; 6; 1; 4] equals 0. This is the same as looking for solutions to the following system of equations: x1 + 2x2 + x4 = 0, and 3x1 + 6x2 + x3 + 4x4 = 0. To solve this homogeneous system, we row reduce its coefficient matrix. From the reduced row echelon form, we see that our system is equivalent to x1 + 2x2 + x4 = 0 and x3 + x4 = 0. Now, if we replace the free variable x2 with the parameter s and the free variable x4 with the parameter t, we get that the general solution to this system is x = s[-2; 1; 0; 0] + t[-1; 0; -1; 1]. The general solution describes exactly the vectors x that are orthogonal to S, so the set {[-2; 1; 0; 0], [-1; 0; -1; 1]} is a spanning set for S-perp. Moreover, neither of these vectors is a scalar multiple of the other, so they are linearly independent, and our set is, in fact, a basis for S-perp.
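If you would like to check this computation with software, here is a minimal sketch using SymPy (our choice of tool, not part of the lesson); note that the rows of the matrix are the spanning vectors of S:

```python
from sympy import Matrix

# Rows are the spanning vectors of S; the nullspace of this matrix is S-perp.
A = Matrix([[1, 2, 0, 1],
            [3, 6, 1, 4]])

basis_for_S_perp = A.nullspace()
for v in basis_for_S_perp:
    print(v.T)  # Matrix([[-2, 1, 0, 0]]) and Matrix([[-1, 0, -1, 1]]), matching the hand computation
```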

Now, in this example, we were easily able to show that the spanning set was linearly independent because it only had two vectors in it. But what if our spanning set were bigger? Would we need to row reduce a matrix to prove that our spanning set is linearly independent? No, because S-perp can always be thought of as the nullspace of a matrix, and in Linear Algebra I, we showed that the technique for finding a spanning set for the nullspace in fact results in a basis for the nullspace. And which matrix are we finding the nullspace of? It is the matrix whose rows (not columns) are the vectors in the spanning set for our subspace S. I recommend setting up the system of equations from the defining dot products, as we did above, to avoid confusion on this point. We will also use this fact when we prove the following theorem.

Theorem: Let S be a k-dimensional subspace of Rn. Then the following are true.

  1. S intersect S-perp = {the 0 vector}.
  2. The dimension of S-perp equals n – k.
  3. If {v1 through vk} is an orthonormal basis for S, and if {v(k+1) through vn} is an orthonormal basis for S-perp, then the combined set of {v1 through to vk, and v(k+1) through to vn} is an orthonormal basis for Rn.

First, let’s look at the fact that S intersect S-perp is only {the 0 vector}. To see this, assume that x is an element of (S intersect S-perp). Well, this means that x is an element of S-perp, so x must be orthogonal to every element of S. But we also have that x is an element of S, so this means that x must be orthogonal to itself. That is to say that x dot x = 0. But x dot x is the sum of the squares of the entries of x, which is 0 only when every entry of x is 0, so x must equal the 0 vector.

Next, we want to look at the fact that the dimension of S-perp will equal n – k. To see this, let A be the matrix whose rows are the basis vectors of S. Then A is a k-by-n matrix, and S is the rowspace of A. This means that the rank of A is the same as the dimension of S, so the rank of A equals k. But we also have that S-perp is the nullspace of A, and thus, the dimension of S-perp is the nullity of A. By the Rank Theorem, we know that rank(A) + nullity(A) = n, so this means that the dimension of S-perp, which equals the nullity of A, must equal n minus the rank of A, which is n – k.
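Continuing with the subspace S from the earlier example (so n = 4 and k = 2), this count can be confirmed with a short SymPy sketch (again, our tooling, not the lesson's):

```python
from sympy import Matrix

# Rows form a basis for S, so rank(A) = dim(S) = k and the nullspace of A is S-perp.
A = Matrix([[1, 2, 0, 1],
            [3, 6, 1, 4]])

k = A.rank()                      # dimension of S
dim_S_perp = len(A.nullspace())   # dimension of S-perp
n = A.cols

print(k, dim_S_perp, n)           # 2 2 4, so dim(S-perp) = n - k
```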

Finally, to see that our combined set of {v1 all the way through vn} is an orthonormal basis for Rn, remember that we, in fact, only need to show that the set is an orthonormal set, as it will then automatically be a basis: an orthonormal set is linearly independent, and any n linearly independent vectors in Rn form a basis. That means we need to show that vi dot vj = 0 whenever i does not equal j. We break this into four different scenarios. Case (a): i and j are both between 1 and k. Then the vectors vi and vj are both in the set {v1 through vk}, which is an orthonormal basis for S, and thus is an orthonormal set, which means that vi dot vj must equal 0. Case (b): i is from 1 to k, and j is from k+1 to n. Well, then vi is in S and vj is in S-perp, and so by the definition of S-perp, we know that vi dot vj = 0. Case (c): similarly, if j is between 1 and k, and i is between k+1 and n, then vj is in S and vi is in S-perp, so by the definition of S-perp, we know that vi dot vj = 0. And finally, with case (d), we’re looking at the case where both i and j are between k+1 and n. This means that both vi and vj are in our second set {v(k+1) through vn}. But again, this is an orthonormal basis for S-perp, which means it is an orthonormal set, and that means that vi dot vj must equal 0.
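As a numerical illustration of part 3 (a sketch using NumPy and SciPy, with our own choice of functions; not part of the lesson), we can build an orthonormal basis for S with a QR factorization, build an orthonormal basis for S-perp with scipy.linalg.null_space, stack them side by side, and confirm that the combined set is orthonormal:

```python
import numpy as np
from scipy.linalg import null_space

# Columns of M span S; the rows of A = M.T span S, so null_space(A) is S-perp.
M = np.array([[1.0, 3.0],
              [2.0, 6.0],
              [0.0, 1.0],
              [1.0, 4.0]])
A = M.T

Q_S, _ = np.linalg.qr(M)        # columns: an orthonormal basis for S
Q_perp = null_space(A)          # columns: an orthonormal basis for S-perp

V = np.hstack([Q_S, Q_perp])    # 4 x 4 matrix whose columns are the combined set
print(np.allclose(V.T @ V, np.eye(4)))  # True: the combined set is orthonormal, hence a basis for R^4
```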
