Lesson: Invariant Subspace

While it is nice that we can diagonalize more matrices over C than we could over R, when we do so, we end up with a diagonal matrix with complex entries instead of real entries. There’s clearly no magic solution to this problem, since if we could have diagonalized the matrix over R, we would have done so. So, instead of expanding the number system we use in our matrices, what if we loosen the restriction that we need to end up with a diagonal matrix? Perhaps it is time we search for the next best thing. Well, if that’s our conclusion, why did we bother with the complex numbers in the first place? Because we will still be making use of the complex eigenvalues and eigenvectors of our real matrix, simply in a different way.

The first thing we need to do is start thinking of our complex vectors as having a real and imaginary part, much as we think of a complex number as having a real and imaginary part. So, for any vector z in Cn, there are vectors x and y in Rn such that z = x + iy. To see this, write z as the vector [z1 through zn], which equals the vector [(x1 + y1i) through (xn + yni)], where each xk and yk is a real number. We can write this as the vector [x1 through xn] + (i times the vector [y1 through yn]), which is x + iy.
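To make this splitting concrete, here is a minimal sketch in Python. The vector z below is an illustrative choice of ours, not one taken from the lesson.

```python
# Split a complex vector z into real vectors x and y with z = x + iy.
# The entries of z are an illustrative assumption, not from the lesson.
z = [2 + 0j, -1 - 1j]

x = [c.real for c in z]  # real part of each entry
y = [c.imag for c in z]  # imaginary part of each entry

# Recombine entry by entry to confirm that z = x + iy.
recombined = [complex(xk, yk) for xk, yk in zip(x, y)]

print(x)                 # [2.0, -1.0]
print(y)                 # [0.0, -1.0]
print(recombined == z)   # True
```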

Next, we want to notice what happens if z is an eigenvector for a matrix A with real entries, with corresponding eigenvalue lambda = a + bi, so that Az = (lambda)z. On the one hand, Az equals A(x + iy), which equals Ax + i(Ay). On the other hand, Az equals (lambda)z, which equals (a + bi)(x + iy), which equals ax + (ai)y + (bi)x + (b(i-squared))y, which equals (ax – by) + i(bx + ay). So we see that Ax + i(Ay) = (ax – by) + i(bx + ay).

Now, since a and b are real numbers, and x and y are real vectors, the vectors (ax – by) and (bx + ay) are in Rn, not Cn. And since A has real entries, Ax and Ay are in Rn as well. This means that we have really written Az in terms of its real and imaginary parts in two ways. Since the real and imaginary parts of a vector are unique, these two ways must be the same, so our real part Ax must equal ax – by, and our imaginary part Ay must equal bx + ay.
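We can check these two identities on a small example. The sketch below assumes an illustrative 2-by-2 matrix A with eigenvalue 2 + i and eigenvector z = [2, –1 – i]. These values are our own choice, and `matvec` is a hypothetical helper name.

```python
def matvec(A, v):
    """Multiply the matrix A (a list of rows) by the vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

# Illustrative assumptions: a real matrix with eigenvalue lambda = 2 + i.
A = [[1, -2], [1, 3]]
lam = 2 + 1j
z = [2 + 0j, -1 - 1j]

# First confirm that z really is an eigenvector: Az = (lambda)z.
Az = matvec(A, z)
lam_z = [lam * c for c in z]
print(Az == lam_z)  # True

# Split lambda = a + bi and z = x + iy into real pieces.
a, b = 2, 1
x, y = [2, -1], [0, -1]

Ax = matvec(A, x)
Ay = matvec(A, y)
real_part = [a * xk - b * yk for xk, yk in zip(x, y)]  # ax - by
imag_part = [b * xk + a * yk for xk, yk in zip(x, y)]  # bx + ay

print(Ax, real_part)  # [4, -1] [4, -1]
print(Ay, imag_part)  # [2, -3] [2, -3]
```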

Now, I know you were hoping we would get a result like Ax = ax, but I did warn you that we were going to have to pull back from an ideal situation. If we look past this initial disappointment, we can still find something useful from this result, and that is the fact that both Ax and Ay are linear combinations of x and y. That is to say that we have that Ax and Ay are in the span of the set {x, y}. Moreover, since multiplication by A is linear, given any element w in the span of {x, y}, we know that Aw must also still be in the span of {x, y}.
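To see this invariance numerically, here is a small sketch. It assumes the same kind of illustrative example (a matrix A with eigenvalue 2 + i, so a = 2 and b = 1, and real vectors x and y chosen by us). Expanding A(sx + ty) = s(ax – by) + t(bx + ay) shows that Aw should equal (sa + tb)x + (ta – sb)y, and the code checks exactly that for one choice of s and t. The helper names are hypothetical.

```python
def matvec(A, v):
    """Multiply the matrix A (a list of rows) by the vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

def combo(s, x, t, y):
    """Return the linear combination s*x + t*y."""
    return [s * xk + t * yk for xk, yk in zip(x, y)]

# Illustrative assumptions: A has eigenvalue 2 + i with eigenvector x + iy.
A = [[1, -2], [1, 3]]
a, b = 2, 1
x, y = [2, -1], [0, -1]

# Pick any w = sx + ty in the span of {x, y}.
s, t = 3, -2
w = combo(s, x, t, y)

# Aw computed directly, and as the predicted combination of x and y.
Aw = matvec(A, w)
predicted = combo(s * a + t * b, x, t * a - s * b, y)

print(w)          # [6, -1]
print(Aw)         # [8, 3]
print(predicted)  # [8, 3]
```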

So while the ideal situation would have been to find a real vector x such that Ax is in the span of x, instead we will be able to use our complex eigenvalues and eigenvectors to find a pair of real vectors such that for any w in the span of {x, y}, we will have that Aw is still in the span of {x, y}. We will state this result as a theorem, but first, we want to introduce a new term.

If T from V to V is a linear operator, and U is a subspace of V such that T(u) is an element of U for all u in U, then U is called an invariant subspace of T.

So our theorem states: suppose that lambda = a + bi, where b is not equal to 0, is an eigenvalue of an n-by-n real matrix A, with corresponding eigenvector z = x + iy. Then the span of {x, y} is a 2-dimensional subspace of Rn that is invariant under A and contains no real eigenvectors of A.

Now, we’ve already done the work to show that the span of {x, y} is invariant under A, but our theorem states two additional facts about the span of {x, y}: one, that the span of {x, y} is 2-dimensional; and two, that the span of {x, y} contains no real eigenvectors of A.

To see (1), we need to show that x and y are linearly independent, so that the set {x, y} is actually a basis for the span of {x, y}. We will prove this by contradiction, so assume by way of contradiction that the set {x, y} is linearly dependent. Then one of the vectors is a scalar multiple of the other. We will assume that x = sy for some scalar s from the real numbers, since the case y = tx works the same way with the roles of x and y swapped. From our earlier work, this would mean that Ax = ax – by, which equals asy – by, which equals (as – b)y. But we also get that Ax = A(sy), which equals s(Ay), which equals s(bx + ay), which equals s(bsy + ay), which equals (b(s-squared) + as)y.

Now, combining these two facts, we get that (as – b)y = (b(s-squared) + as)y. Well, these vectors are equal only if either as – b = b(s-squared) + as or y is the 0 vector. Let’s examine the first possibility. If as – b = b(s-squared) + as, then we must have that –b = b(s-squared). Well, since b not equal to 0 is one of the assumptions of our theorem, we can divide both sides by b, and get that s-squared = -1. But s is a real number, so this is not possible. So we must have that y is the 0 vector. But since x = sy, this would also mean that x is the 0 vector, which means that our eigenvector was 0 + i(0), which is the 0 vector. But remember that the 0 vector is not allowed to be an eigenvector, so again we have a contradiction. And this means that our original assumption that the set {x, y} is linearly dependent must be wrong, so we have that {x, y} is linearly independent, as desired.
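For a concrete pair of vectors, linear independence can also be checked directly. The sketch below does not follow the proof by contradiction. Instead it uses the standard fact that two vectors in Rn are linearly dependent exactly when every 2-by-2 minor x_i y_j – x_j y_i is 0. The function name and the sample vectors are our own illustrative assumptions.

```python
def is_linearly_independent(x, y):
    """Return True when some 2-by-2 minor x[i]*y[j] - x[j]*y[i] is nonzero."""
    n = len(x)
    return any(
        x[i] * y[j] - x[j] * y[i] != 0
        for i in range(n)
        for j in range(i + 1, n)
    )

# Illustrative assumption: the real and imaginary parts of the
# eigenvector z = [2, -1 - i] of a real matrix with eigenvalue 2 + i.
x = [2, -1]
y = [0, -1]
print(is_linearly_independent(x, y))  # True

# A dependent pair for comparison: here x = -2y.
print(is_linearly_independent([2, -4], [-1, 2]))  # False
```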

And to see property 2, we will again employ a proof by contradiction. So, assume by way of contradiction that there is a vector w in the span of {x, y} that is an eigenvector for A, with corresponding eigenvalue lambda. (Here lambda names the eigenvalue of w, not the complex eigenvalue a + bi from the theorem.) The first thing we want to note is that lambda must be a real number, since w is a nonzero real vector and (lambda)w = Aw is in Rn. Next, since we’ve already shown that {x, y} is a basis for the span of {x, y}, we know that there are scalars s and t in the real numbers such that w = sx + ty. So we have that Aw, which equals (lambda)w, equals (lambda)(sx + ty), which equals ((lambda)s)x + ((lambda)t)y. And, to look at things differently, Aw must equal A(sx + ty), which equals s(Ax) + t(Ay). Well, this equals s(ax – by) + t(bx + ay), which equals (sa + tb)x + (ta – sb)y. And so we see that ((lambda)s)x + ((lambda)t)y = (sa + tb)x + (ta – sb)y.

Recalling that the set {x, y} is a basis for the span of {x, y}, we know that the elements of Span{x, y} can be written uniquely as a linear combination of x and y. So this means that (lambda)s must equal sa + tb, and (lambda)t must equal ta – sb. Multiplying the first equation by t and the second by s, which avoids dividing by an s or t that might be 0, we find that (lambda)st = tsa + (t-squared)b, and that (lambda)st = tsa – (s-squared)b. Setting these equal gives us that tsa + (t-squared)b = tsa – (s-squared)b, so this means that (t-squared)b = –(s-squared)b. And again, since b does not equal 0, we can divide by b to get that t-squared = –(s-squared). But this statement can only be true of real numbers s and t if s and t are both 0. But as before, this would mean that w is the 0 vector, which can never be an eigenvector, and so we have reached our contradiction. This means that our original assumption was incorrect, so it must be that the span of {x, y} does not contain any real eigenvectors of A.
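The elimination of lambda in this argument can be checked numerically. Eliminating lambda from (lambda)s = sa + tb and (lambda)t = ta – sb leaves the quantity t(sa + tb) – s(ta – sb), which must be 0 and which simplifies to b(s-squared + t-squared). The sketch below assumes illustrative values a = 2 and b = 1 and scans a small grid of integer pairs (s, t). It is a spot check of the algebra, not a replacement for the proof.

```python
# Illustrative assumption: the complex eigenvalue is a + bi = 2 + i.
a, b = 2, 1

nonzero_solutions = []
for s in range(-3, 4):
    for t in range(-3, 4):
        # Multiply (lambda)s = sa + tb by t and (lambda)t = ta - sb by s,
        # then subtract; lambda drops out and this gap must be 0.
        gap = t * (s * a + t * b) - s * (t * a - s * b)

        # The gap always simplifies to b(s^2 + t^2).
        assert gap == b * (s * s + t * t)

        # Record any pair other than (0, 0) that makes the gap vanish.
        if gap == 0 and (s, t) != (0, 0):
            nonzero_solutions.append((s, t))

print(nonzero_solutions)  # []
```

Since the list comes back empty, the only pair on this grid compatible with a real eigenvector is s = t = 0, which matches the conclusion of the proof.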

© University of Waterloo and others, Powered by Maplesoft