## Transcript

So far, we have taken the essential properties of Rn and used them to form general vector spaces, and now we’ve taken the essential properties of the standard basis and created orthonormal bases. But the definition of an orthonormal basis was dependent on the properties of the dot product. But what was so special about the dot product? Let’s take the essential properties of the dot product and use them to define the more general concept of an inner product.

Let V be a vector space over R. An inner product on V is a function < , > from (V cross V) to R that satisfies three properties. First, the product <v, v> must be greater than or equal to 0 for all v in our vector space, and the product <v, v> must equal 0 if and only if v is the 0 vector. This is known as the positive definite property. Second, we need the product <v, w> to equal the product <w, v> for all v and w in our vector space. This is known as the symmetric property. Finally, we want the product <v, (sw + tz)> to equal (s times the product <v, w>) + (t times the product <v, z>) for all s and t in R, and all v, w, and z in our vector space. This is known as the bilinear property.

Now, note that if we combine properties 2 and 3, then we’ll have that the product of <(sv + tw) and z> will equal (s times the product <v, z>) + (t times the product <w, z>). In fact, in its full generality, we will have that the product of <(s1x + s2v) with (t1w + t2z)> will equal, well, first we can pull out our s1 and our s2 to get (s1 times the product <x with (t1w + t2z)>) + (s2 times the product of <v with (t1w +t2z)>), and now we can expand out our t sides, and we’ll see that this equals (s1t1 times the product <x, w>) + (s1t2 times the product <x, z>) + (s2t1 times the product <v, w>) + (s2t2 times the product <v, z>). So basically, we get all these extra linearity properties.
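Written out symbolically, the full expansion just described looks like this. The notation is only a transcription of the spoken steps, using the same letters for the vectors and scalars:

```latex
\begin{align*}
\langle s_1 x + s_2 v,\; t_1 w + t_2 z \rangle
  &= s_1 \langle x,\; t_1 w + t_2 z \rangle + s_2 \langle v,\; t_1 w + t_2 z \rangle \\
  &= s_1 t_1 \langle x, w \rangle + s_1 t_2 \langle x, z \rangle
   + s_2 t_1 \langle v, w \rangle + s_2 t_2 \langle v, z \rangle
\end{align*}
```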

Now another definition. A vector space V with an inner product is called an inner product space.

As we do so frequently, let’s start our study of inner product spaces by looking at Rn. For example, let’s show that the function defined by <x, y> = x1y1 + 4x2y2 + 9x3y3 is an inner product on R3. To show that our function is an inner product, we need to show that it satisfies the three properties of an inner product.

First, let’s look at the product of <x with x>. This will equal x1x1 + 4x2x2 + 9x3x3. That is to say, it equals x1-squared + 4(x2-squared) + 9(x3-squared). But since each of the x1-squared, x2-squared, and x3-squared are greater than or equal to 0, we see that the inner product <x, x> is, in fact, greater than or equal to 0. What if the product of <x and x> equals 0? Well, then that means we must have that x1-squared + 4(x2-squared) + 9(x3-squared) = 0. But since the terms in this sum are all greater than or equal to 0, the only way that their sum can equal 0 is if, in fact, each of the terms equals 0. So we must know that x1-squared = 0, that 4(x2-squared) = 0, and that 9(x3-squared) = 0. But of course, this must mean that x1 = 0, x2 = 0, and x3 = 0, which means that x is the 0 vector. And so we see that our function is positive definite.

Now let’s look to see if it’s symmetric. That means we need to look at any x and y in R3, and we’ll look at their product <x, y>, which will be x1y1 + 4x2y2 + 9x3y3. But of course, this equals y1x1 + 4y2x2 + 9y3x3, which is the product <y, x>. So, it is easy to see that our function is symmetric.

Lastly, let’s check the bilinear property. So we’ll assume that x, y, and z are vectors in R3, and that s and t are scalars. And let’s look at the product of <x with (sy + tz)>. Well, this will be (x1 times the quantity (sy1 + tz1)) + (4x2 times the quantity (sy2 + tz2)) + (9x3 times the quantity (sy3 + tz3)). Expanding this all out, we’ll see that this equals sx1y1 + tx1z1 + 4sx2y2 + 4tx2z2 + 9sx3y3 + 9tx3z3. Now let’s factor out our s terms and our t terms, and we’ll see that we have (s times the quantity (x1y1 + 4x2y2 + 9x3y3)), which is, of course (s times the product <x, y>), and then plus (t times the quantity (x1z1 + 4x2z2 + 9x3z3)), which is, of course, (t times the product <x, z>). And we have shown that our product is bilinear. And as such, we know that it is, in fact, an inner product.
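The three checks for this example can also be spot-checked numerically. The following is a minimal sketch in plain Python, not a proof: it only tests randomly chosen integer vectors, and the names `weighted_ip`, `combine`, and `random_vector` are made up for this illustration.

```python
import random

def weighted_ip(x, y):
    """Candidate inner product on R^3: x1*y1 + 4*x2*y2 + 9*x3*y3."""
    return x[0] * y[0] + 4 * x[1] * y[1] + 9 * x[2] * y[2]

def combine(s, y, t, z):
    """Return the vector s*y + t*z."""
    return [s * yi + t * zi for yi, zi in zip(y, z)]

def random_vector(rng):
    """A vector in R^3 with small integer entries, so arithmetic is exact."""
    return [rng.randint(-5, 5) for _ in range(3)]

rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(200):
    x, y, z = random_vector(rng), random_vector(rng), random_vector(rng)
    s, t = rng.randint(-5, 5), rng.randint(-5, 5)
    # Positive definite: <x, x> >= 0, with equality only for the zero vector.
    assert weighted_ip(x, x) >= 0
    assert (weighted_ip(x, x) == 0) == (x == [0, 0, 0])
    # Symmetric: <x, y> = <y, x>.
    assert weighted_ip(x, y) == weighted_ip(y, x)
    # Bilinear: <x, sy + tz> = s<x, y> + t<x, z>.
    assert weighted_ip(x, combine(s, y, t, z)) == (
        s * weighted_ip(x, y) + t * weighted_ip(x, z)
    )
```

A passing run only says that no counterexample turned up among the sampled vectors; the argument in the transcript is what actually establishes the three properties.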

Now let’s look at an example of a function that does not define an inner product on R3. We can look at the function defined by <x, y> = 2x1y1 + 3x2y2. The easiest way to see that this is not an inner product is to note that it is not positive definite. For example, the product of the vector [0; 0; 3] with itself will be 2(0)(0) + 3(0)(0), which equals 0 even though our vector [0; 0; 3] does not equal the 0 vector. The function I defined does define an inner product on R2, so another purpose of this example is to show the importance of paying attention to which vector space you are working in.
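As a small illustration of why the vector space matters, here is a sketch that evaluates the same formula on a vector from R3 and on a vector from R2. The function name `candidate` is hypothetical.

```python
def candidate(x, y):
    """2*x1*y1 + 3*x2*y2; any entries past the second are simply ignored."""
    return 2 * x[0] * y[0] + 3 * x[1] * y[1]

v = [0, 0, 3]                      # a nonzero vector in R^3
print(candidate(v, v))             # 0, so positive definiteness fails on R^3

u = [0, 3]                         # a nonzero vector in R^2
print(candidate(u, u))             # 27
```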

To do something a bit less silly, let’s look at the function defined by <x, y> = x1y1 + x1y2 + x1y3. Now, this one is not an inner product on R3 because it is not symmetric. For example, the product of <[1; 2; 3] with [2; 3; 4]> would be 1(2) + 1(3) + 1(4), which is 9, but the product of <[2; 3; 4] with [1; 2; 3]> would be 2(1) + 2(2) + 2(3), which is 12, so those products are not the same. It may interest you to know that, of course, the function is also not positive definite, but it is bilinear.
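The two products above, and one possible witness for the failure of positive definiteness, can be reproduced with a short sketch. The vector `w` is my own choice of example, not one from the lecture.

```python
def candidate(x, y):
    """x1*y1 + x1*y2 + x1*y3 on R^3."""
    return x[0] * y[0] + x[0] * y[1] + x[0] * y[2]

a, b = [1, 2, 3], [2, 3, 4]
print(candidate(a, b))  # 9
print(candidate(b, a))  # 12, so the function is not symmetric

# Assumed example: a vector whose product with itself is negative.
w = [1, -2, 0]
print(candidate(w, w))  # -1, so the function is not positive definite
```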

The dot product was only defined in Rn, but an inner product can be defined in any vector space. So what could be an inner product in a matrix space? Since our result is supposed to be a number, not a vector, matrix multiplication is not really a contender. Hmm, but what about the determinant?

The function which defines the product of <A and B> to be the determinant of AB is not an inner product on the space M(2, 2). The easiest thing to see is that it is not positive definite, since if we take the determinant of the product of the matrix [1, 0; 0, 0] with itself, then we’re still looking at the determinant of [1, 0; 0, 0], which equals 0, so this means that our product of [1, 0; 0, 0] with [1, 0; 0, 0] equals 0 even though the matrix [1, 0; 0, 0] is not the 0 matrix. It is also worth noting that the determinant is not bilinear, since in general, the determinant of sA would equal (s-to-the-n) times (the determinant of A), not s times (the determinant of A). So while the determinant is useful for many things, it is not useful for defining an inner product on matrix spaces.
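Both failures of the determinant can be seen on small matrices. This is a sketch for 2-by-2 matrices only, stored as nested lists; the helper names and the matrix `A` are illustrative assumptions.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det_product(a, b):
    """The candidate <A, B> = det(AB)."""
    return det2(matmul2(a, b))

def scale(s, m):
    """Scalar multiple s*M."""
    return [[s * entry for entry in row] for row in m]

E = [[1, 0], [0, 0]]
print(det_product(E, E))                   # 0, although E is not the zero matrix

A = [[1, 2], [3, 4]]
identity = [[1, 0], [0, 1]]
print(det2(A))                             # -2
print(det2(scale(3, A)))                   # -18, which is 3**2 * det(A)
print(det_product(identity, scale(3, A)))  # -18
print(3 * det_product(identity, A))        # -6, so the bilinear property fails
```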

There is another operation associated with matrices, though, that results in a number, and that is the trace. Recall that the trace of a matrix is the sum of the diagonal entries. The function defined by <A, B> = the trace of ((B-transpose)A) is an inner product on M(m, n). To see this, let’s first look at the definition in M(2, 3). So we could look at a generic matrix A and a generic matrix B. To compute this inner product, we would look at the trace of ((B-transpose)A). Now, since we’re looking at the trace, we really only care what values we get on the diagonal, so I can forget about the rest of them, but we’ll note that in our first diagonal entry, we’ll get b11a11 + b21a21, then in our second diagonal entry, we’ll get b12a12 + b22a22, and in our last diagonal entry, we’ll get b13a13 + b23a23. And once we take the trace, we end up summing these values to get b11a11 + b21a21 + b12a12 + b22a22 + b13a13 + b23a23.

Another way of thinking about this, though, is that we could say that the entries on the diagonal are the vector [b11; b21] dotted with [a11; a21], the vector [b12; b22] dotted with the vector [a12; a22], and the vector [b13; b23] dotted with the vector [a13; a23]. So that is, the ith diagonal entry is the dot product of the ith column of B with the ith column of A. As you may now be guessing, this will be true in our general case.

Consider that our definition for matrix multiplication was that the ijth entry of the product BA is the dot product of the ith row of B with the jth column of A. But if we turn our B into the B-transpose, then the ith row of B-transpose is the same as the ith column of B, so the ijth entry of (B-transpose)A is the dot product of the ith column of B and the jth column of A. Now, since we are looking for the trace of (B-transpose)A, we need to take the sum of the diagonal entries (that is, the ii entries), so that will be the dot product of the ith column of B with the ith column of A. So if A is a matrix whose columns are a1 through an, and B is a matrix whose columns are b1 through bn, then we define the inner product <A, B> to be the trace of ((B-transpose)A), which will be the sum from i = 1 to n of (bi dotted with ai).
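To make the column formula concrete, here is a sketch that computes the trace of (B-transpose)A both ways for a pair of matrices in M(2, 3): once through the matrix product, and once as a sum of column dot products. The matrices and helper names are assumptions chosen for illustration.

```python
def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of a (p x q) and b (q x r)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def trace_ip(a, b):
    """<A, B> = trace(B^T A), via the full matrix product."""
    return trace(matmul(transpose(b), a))

def column_ip(a, b):
    """Sum over i of (ith column of B) dotted with (ith column of A)."""
    return sum(dot(b_col, a_col)
               for a_col, b_col in zip(transpose(a), transpose(b)))

A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8, 9], [1, 0, -1]]
print(trace_ip(A, B), column_ip(A, B))  # 48 48
```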

Now, with this formula in hand, let’s show that our product really is an inner product. So first, let’s look at the product of <A with A>. This will be the sum from i = 1 to n of (ai dotted with ai). Now, since (ai dotted with ai) must be greater than or equal to 0 for all i, as the dot product itself is positive definite, we know that the sum of all these values must be greater than or equal to 0. Moreover, if our sum, in fact, equals 0, then we know that each of the greater-than-or-equal-to-0 values (ai dotted with ai) must equal 0, and again using the fact that our dot product is positive definite, this means that ai must equal 0 for all i. And since the columns of A are all the 0 vector, we get that A is the 0 matrix. So, we’ve shown that our product is positive definite.

Next, let’s look to see if it’s symmetric. Well, we already know that the product of <A and B> equals the sum from i = 1 to n of (bi dotted with ai), so now let’s look at our product <B, A>. Well, by the same arguments we used for the product <A, B>, we know that the ijth entry of (A-transpose)B is the dot product of the ith column of A with the jth column of B. And so this means that the trace of ((A-transpose)B) will be the sum from i = 1 to n of (ai dotted with bi). Now, since the dot product is symmetrical, we know that (ai dotted with bi) = (bi dotted with ai), and thus, we get the following, that the product of <B and A>, which is the trace of ((A-transpose)B), will be the sum from i = 1 to n of (ai dotted with bi), which is the same as the sum from i = 1 to n of (bi dotted with ai), which we know to be the product of <A and B>. So now we’ve shown that it’s symmetric.

Let’s use our formula to see if it is bilinear. So we’ll note that the product of <A with (sB + tC)> will simply be the sum from i = 1 to n of ((sbi + tci) dotted with ai). Well now, within our sum, we can use all of our linearity properties of the dot product to know that this is going to be the sum from i = 1 to n of ((sbi dotted with ai) + (tci dotted with ai)), which we can write as (s times the sum from i = 1 to n of (bi dotted with ai)) + (t times the sum from i = 1 to n of (ci dotted with ai)), which, of course, is just (s times the product <A, B>) + (t times the product <A, C>). And so our product is bilinear.

Before we leave this example, let’s take a further look at the formula <A, B> = trace of ((B-transpose)A), which equals the sum from i = 1 to n of (bi dotted with ai). Now, since bi and ai are the ith columns of B and A, the dot product bi dot ai is the sum b1ia1i + through to bmiami. But since we cycle through all possible i values when we take the inner product, what we end up with is that our inner product <A, B> will be the sum from i = 1 to n and j = 1 to m of bjiaji. Now, don’t let the double sum scare you; the important fact is what’s inside: we are taking the product of the corresponding entries of A and B, and adding them together. So that’s going to be b11 with a11, then b12 with a12, then b13 with a13, but they’re always paired up. It’s the corresponding entries of A and B multiplied together, and we add them all up. We’ll eventually cycle through every possible entry of A. Well, this formula becomes quite reminiscent of our formula for the dot product in Rn, and in fact, under the standard isomorphism from M(m, n) to Rmn, this is the same as the dot product.

So whenever you need to calculate the trace of ((B-transpose)A), it’s actually easier to do it as this double sum instead of having to go through the matrix multiplication. So for example, if we wanted to look at the product of the matrices <[2, 1; 3, -5] and [1, 0; -3, -2]>, first we’ll multiply our 1,1 entries (that’s 2(1)), then we can multiply our 1,2 entries (that’s 1(0)). Next, we can multiply our 2,1 entries (that’s 3(-3)), and we can multiply our 2,2 entries (that’s (-5)(-2)). Add those all together, 2 + 0 - 9 + 10, and we see that the answer is 3. So, see, no matrix multiplication required.
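The shortcut can be checked against the definition on this same pair of matrices. The sketch below computes the product once through (B-transpose)A and once entry by entry; the function names are illustrative.

```python
def trace_ip(a, b):
    """<A, B> = trace(B^T A), computed through the full matrix product."""
    b_t = [list(col) for col in zip(*b)]
    product = [[sum(b_t[i][k] * a[k][j] for k in range(len(a)))
                for j in range(len(a[0]))] for i in range(len(b_t))]
    return sum(product[i][i] for i in range(len(product)))

def entrywise_ip(a, b):
    """Sum of products of corresponding entries of A and B."""
    return sum(a_ij * b_ij
               for a_row, b_row in zip(a, b)
               for a_ij, b_ij in zip(a_row, b_row))

A = [[2, 1], [3, -5]]
B = [[1, 0], [-3, -2]]
print(trace_ip(A, B))      # 3
print(entrywise_ip(A, B))  # 3
```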

Now let’s look at some examples of the inner product in polynomial spaces. Again, we need to find an operation on polynomials that results in a number, not a polynomial. Most of the things we know how to do with polynomials from outside linear algebra still result in polynomials—whether we multiply them, factor them, take derivatives of them. But there is one thing that we do with polynomials that results in a number—plug in a number.

So, for example, let’s define an inner product on P2 by saying that the product <p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2). First, to see that this is an inner product, we need to check to see if it’s positive definite. Well, the product of <p with p> would be 2((p(0))-squared) + 3((p(1))-squared) + 4((p(2))-squared). This is clearly greater than or equal to 0 since all the terms in the sum are positive or 0. Moreover, the only way to have that 2((p(0))-squared) + 3((p(1))-squared) + 4((p(2))-squared) = 0 is if p(0) = 0, p(1) = 0, and p(2) = 0. Well, how do we use this to see that p must, in fact, be the 0 polynomial? If p(x) is the polynomial p0 + p1x + p2(x-squared), then if p(0) = 0, it means that p0 + p1(0) + p2(0-squared) = 0. That is to say that p0 = 0. And now we use the fact that p(1) = 0 to see that p0 + p1(1) + p2(1-squared) = 0. Well, that means that p1 + p2 = 0, or that p1 = -p2. Note that I already used the fact that p0 = 0 to determine this. Lastly, using the fact that p(2) = 0, we know that p0 + p1(2) + p2(2-squared) = 0. Again, going ahead and using the fact that we already know that p0 = 0, and the fact that p1 = -p2, this becomes 2(-p2) + 4(p2) = 0. Well, this means that 2p2 = 0, so p2 = 0, which means that p1 = 0 as well. And so we’ve shown that if the product <p, p> = 0, then our coefficients p0, p1, and p2 are all 0, which means that p is the 0 polynomial. We have now seen that our product is positive definite.
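Here is a minimal sketch of this evaluation-based product, with a polynomial in P2 stored as its coefficient list [p0, p1, p2]. That representation, and the names `evaluate` and `poly_ip`, are assumptions made for the example. The loop is a brute-force spot check over small integer coefficients, not a replacement for the argument above.

```python
from itertools import product

def evaluate(coeffs, x):
    """Value of p0 + p1*x + p2*x**2 + ... at x, for coeffs = [p0, p1, p2, ...]."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def poly_ip(p, q):
    """<p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2)."""
    return (2 * evaluate(p, 0) * evaluate(q, 0)
            + 3 * evaluate(p, 1) * evaluate(q, 1)
            + 4 * evaluate(p, 2) * evaluate(q, 2))

# Spot check on P2: <p, p> >= 0, and it is 0 only for the zero polynomial.
for p in product(range(-3, 4), repeat=3):
    value = poly_ip(p, p)
    assert value >= 0
    assert (value == 0) == (p == (0, 0, 0))

print(poly_ip([1, 0, 0], [1, 0, 0]))  # 9   (p(x) = 1)
print(poly_ip([0, 1, 0], [0, 1, 0]))  # 19  (p(x) = x)
```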

Now let’s look at symmetric. The product of <p and q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2), but of course, our values p(0), q(0) are symmetric—they’re just real numbers. So this equals 2q(0)p(0) + 3q(1)p(1) + 4q(2)p(2), which is the product <q, p>, so our product is symmetric.

Lastly, for any p, q, and r in P2, and s and t in R, we expand the product <p, (sq + tr)>. Now, I’m not going to read all of this out, but as in step 2, though it’s a bit longer this time, you’re just manipulating real numbers, so of course, you can factor them out and change their order, and if you do so, you will see that you eventually get (s times the product <p, q>) + (t times the product <p, r>). And so, our function is bilinear.
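The computation that is not read out is not in the transcript, but it can be reconstructed directly from the definition of the product. One way to write it:

```latex
\begin{align*}
\langle p,\; sq + tr \rangle
  &= 2p(0)\bigl(sq(0) + tr(0)\bigr) + 3p(1)\bigl(sq(1) + tr(1)\bigr)
   + 4p(2)\bigl(sq(2) + tr(2)\bigr) \\
  &= s\bigl(2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2)\bigr)
   + t\bigl(2p(0)r(0) + 3p(1)r(1) + 4p(2)r(2)\bigr) \\
  &= s\langle p, q \rangle + t\langle p, r \rangle
\end{align*}
```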

Now, the function I’ve just defined is also an inner product on P1, but it would not be an inner product on P3, or on Pn for any n greater than 2. The reason that it isn’t an inner product in the larger polynomial spaces is that it fails to be positive definite. Consider, for example, that if we had the function p(x) = 2x – 3(x-squared) + x-cubed, then p(0) would be 0, and p(1) would be 0, and p(2) would be 0, and this would mean that our inner product <p, p> would be 0 even though p is not the 0 polynomial.
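The counterexample can be confirmed by direct evaluation. In this sketch the cubic is stored as the coefficient list [p0, p1, p2, p3], which is an assumed representation, and the helper names are illustrative.

```python
def evaluate(coeffs, x):
    """Value of p0 + p1*x + p2*x**2 + ... at x, for coeffs = [p0, p1, p2, ...]."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def evaluation_ip(p, q):
    """<p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2)."""
    return sum(weight * evaluate(p, x) * evaluate(q, x)
               for weight, x in ((2, 0), (3, 1), (4, 2)))

p = [0, 2, -3, 1]                            # p(x) = 2x - 3x^2 + x^3
print([evaluate(p, x) for x in (0, 1, 2)])   # [0, 0, 0]
print(evaluation_ip(p, p))                   # 0, although p is not the zero polynomial
```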

How did I find a polynomial p(x) to demonstrate that it was not positive definite? Well, I knew that I needed p(0) to equal 0, p(1) to equal 0, and p(2) to equal 0. In P3, that forms the following system of equations. If p(0) = 0, that means that p0 = 0. If p(1) = 0, then that means that p0 + p1 + p2 + p3 = 0. And if p(2) = 0, then that means I need that p0 + 2p1 + 4p2 + 8p3 = 0. Now, going ahead and plugging in p0 = 0, we focus our attention on finding a solution to the system p1 + p2 + p3 = 0 and 2p1 + 4p2 + 8p3 = 0. I solved the system by row reducing the coefficient matrix (it’s a homogeneous system, after all), and I saw that it has non-trivial solutions, and that the general solution would be that [p1; p2; p3] = t[2; -3; 1]. So I just set t = 1 to get the solution that p1 should be 2, p2 should be -3, and p3 should be 1, which was the polynomial I used, 2x – 3(x-squared) + x-cubed.
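The row reduction can be sketched with exact fractions. The `rref` helper below is a small hand-written Gauss-Jordan routine for illustration, not a library function, and it is only intended for small matrices like this one.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form, using exact Fraction arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [v / lead for v in m[pivot_row]]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != pivot_row:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

# Coefficient matrix of p1 + p2 + p3 = 0 and 2p1 + 4p2 + 8p3 = 0.
reduced = rref([[1, 1, 1], [2, 4, 8]])
print([[int(v) for v in row] for row in reduced])  # [[1, 0, -2], [0, 1, 3]]

# Reading off the rows: p1 = 2*p3 and p2 = -3*p3, so with p3 = t
# the general solution is t*[2, -3, 1].
for t in range(-3, 4):
    p1, p2, p3 = 2 * t, -3 * t, t
    assert p1 + p2 + p3 == 0
    assert 2 * p1 + 4 * p2 + 8 * p3 == 0
```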

So again, keep in mind that with polynomial spaces, you have to be paying attention to which polynomial space you’re working in, so that you know how many general coefficients you want your polynomial to have.