## Transcript

Now, all of the things that we previously defined with respect to the dot product, we can now define in any inner product space.

Let V be an inner product space. Then for any vector v in V, we define the norm (or length) of v to be the square root of the inner product of <v with v>. For any vectors v and w in V, the distance between v and w is the norm of the difference (v – w).

Now, I prefer to use the term “norm” instead of “length”, as it helps to emphasize that we may be dealing with something other than the usual length measurement. Unfortunately, there is no other term used to describe the general notion of the distance between two vectors. The best parallel for this situation that I can think of is that you are still measuring the distance between the two vectors, but your units of measurement have changed. If you are measuring the distance between two cities, there are many different quantities that might interest you other than simply the kilometres between them. Maybe you want to know how many gas stations there are between them, or how many stops on the train there are between them. So just keep in mind that when we are now talking about the distance between two vectors, we may not be talking about distance in the traditional sense, but we do still have some sort of distance in mind.

A vector v in an inner product space V is called a unit vector if the norm of v equals 1.

So, for example, we can normalize the polynomial p(x) = 1 – 3x + x-squared in P2 under the inner product <p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2). First, we want to note that p(0) will equal 1 – 3(0) + 0-squared, which is 1, that p(1) = 1 – 3(1) + 1-squared, which is -1, and p(2) = 1 – 3(2) + 2-squared, which is -1. And so this means that (the norm of p)-squared would equal the inner product of <p with p>, which is going to be 2(1-squared) + 3((-1)-squared) + 4((-1)-squared), which is 9. And thus, the norm of p is the square root of 9, which is 3. So, q(x) = (1/3)p, which is 1/3 – x + (1/3)(x-squared), is a unit vector that is a scalar multiple of p.
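The normalization above can be checked numerically. This is only a minimal sketch: storing a polynomial in P2 as a coefficient tuple `(a0, a1, a2)` and the helper names `evaluate` and `inner` are assumptions made for illustration, not notation from the lecture.

```python
import math

def evaluate(poly, x):
    # poly = (a0, a1, a2) stands for a0 + a1*x + a2*x^2
    return sum(c * x**k for k, c in enumerate(poly))

def inner(p, q):
    # <p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2)
    return sum(w * evaluate(p, x) * evaluate(q, x)
               for x, w in ((0, 2), (1, 3), (2, 4)))

p = (1, -3, 1)                       # p(x) = 1 - 3x + x^2
norm = math.sqrt(inner(p, p))        # sqrt(9) = 3.0
unit = tuple(c / norm for c in p)    # coefficients of (1/3)p

print(inner(p, p), norm)
print(math.isclose(inner(unit, unit), 1.0))
```

Dividing every coefficient by the norm is the same as multiplying p by the scalar 1/3, so `unit` holds the coefficients of 1/3 – x + (1/3)(x-squared).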

Next, let’s look at the generalized definition of “orthogonal”. Let V be an inner product space. Then two vectors v and w from V are said to be orthogonal if the inner product of <v and w> equals 0. Similarly, the set of vectors {v1 through vk} in V is said to be orthogonal if the inner product of <vi and vj> equals 0 for all i not equal to j. This set is said to be orthonormal if we also have that the inner product of <vi with vi> equals 1 for all i.

For example, consider the inner product space M(2, 2) with our standard inner product, where <A, B> = the trace of ((B-transpose)A). Then the matrices A = [1, -5; -3, 1] and B = [1, 2; -2, 3] are orthogonal, since computing the inner product of <[1, -5; -3, 1] with [1, 2; -2, 3]> entrywise gives 1(1) + (-5)(2) + (-3)(-2) + 1(3). Well, this is 1 – 10 + 6 + 3, which equals 0.

Note also that our same matrix A and a new matrix C = [7, 0; 2, -1] are orthogonal since, if we take their inner product, we get 1(7) + (-5)(0) + (-3)(2) + 1(-1), which is 7 + 0 – 6 – 1, which equals 0. We can also look at the inner product of <B and C>. We’ll see that, this time, we get 1(7) + 2(0) + (-2)(2) + 3(-1), which is 7 + 0 – 4 – 3, which is 0. So, in fact, the set {A, B, C} is orthogonal.

However, it is not orthonormal. If we look at the inner product of <A with A>, we see that it is 36. If we look at the inner product of <B with B>, we see that it is 18, and the inner product of <C with C> is 54. So none of them are unit vectors. We could, however, normalize those matrices by dividing each one by its norm, which is 6, 3(root 2), and 3(root 6), respectively, and so we could get that the set {(1/6)A, (1/(3(root 2)))B, (1/(3(root 6)))C} is an orthonormal set.
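As a rough check of the matrix example, the sketch below uses the fact that the trace of ((B-transpose)A) equals the sum of the products of corresponding entries. Representing each matrix as a nested list and the helper names `inner` and `normalize` are assumptions made for illustration.

```python
import math

def inner(A, B):
    # tr(B^T A) computed entrywise: sum of A[i][j] * B[i][j]
    return sum(a * b for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

def normalize(M):
    # Divide every entry by the norm sqrt(<M, M>)
    n = math.sqrt(inner(M, M))
    return [[x / n for x in row] for row in M]

A = [[1, -5], [-3, 1]]
B = [[1, 2], [-2, 3]]
C = [[7, 0], [2, -1]]

pairwise = [inner(A, B), inner(A, C), inner(B, C)]    # all 0: orthogonal set
squared_norms = [inner(M, M) for M in (A, B, C)]      # 36, 18, 54
unit_set = [normalize(M) for M in (A, B, C)]          # each has norm 1

print(pairwise, squared_norms)
```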

Let’s look at another example. Let’s consider the inner product space P2 with the inner product defined by <p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2). Well, then the polynomials p(x) = 3 + x – 2(x-squared) and q(x) = 2x – 3(x-squared) are not orthogonal. To see this, we’ll first plug 0, 1, and 2 into these two polynomials. Well, we get that p(0) = 3, p(1) would be 3 + 1 – 2, which is 2, and p(2) will be 3 + 2 – 2(4), which is -3. q(0) will equal 0, q(1) will equal 2 – 3, which is -1, and q(2) will equal 4 – 3(4), which is -8. With those values in mind, we can now compute that the inner product of <p and q> will be 2(3)(0) + 3(2)(-1) + 4(-3)(-8), which is 0 – 6 + 96, which is 90. And since this is not 0, the polynomials are not orthogonal.

Now, what if we wanted to find a vector that is orthogonal to our polynomial p(x) = 3 + x – 2(x-squared)? Even more so, we might want to find a polynomial r that is orthogonal to p, and such that the span of {p, r} equals the span of {p, q}. Well, that’s what our Gram-Schmidt process was able to do in Rn, and we can simply extend that process to general inner product spaces. While we’re at it, let’s go ahead and define the proj(S) and the perp(S), too.

If V is an inner product space, and if the set B of vectors {v1 through vk} is an orthogonal basis for a subspace S, then for any vector x in V, we have proj(S)(x) will equal ((the inner product of <x and v1>)/(the inner product of <v1 and v1>))v1 + through to ((the product of <x and vk>)/(the product of <vk and vk>))vk, and that perp(S)(x) will equal x – proj(S)(x).

So, for example, let P2 be an inner product space, and again, we’re still going to have our inner product of <p and q> equal to 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2), and now let S be the span of the set {3 + x – 2(x-squared)}. Well, then the proj(S)(2x – 3(x-squared)) will be ((the inner product of <(2x – 3(x-squared)) with (3 + x – 2(x-squared))>)/(the inner product of <(3 + x – 2(x-squared)) with (3 + x – 2(x-squared))>)) times our polynomial (3 + x – 2(x-squared)), noting that, of course, p(x) = 3 + x – 2(x-squared) and q(x) = 2x – 3(x-squared) are as in our previous example. Then we already know that the inner product of <(2x – 3(x-squared)) with (3 + x – 2(x-squared))> equals 90. And using the values for p that we calculated in the previous example, namely that p(0) = 3, p(1) = 2, and p(2) = -3, we can compute that the inner product of <(3 + x – 2(x-squared)) with (3 + x – 2(x-squared))> equals 2(3-squared) + 3(2-squared) + 4((-3)-squared), which is 66. And so we have that the proj(S)(2x – 3(x-squared)) = (90/66)(3 + x – 2(x-squared)), which we could write as 45/11 + (15/11)x – (30/11)(x-squared). And we can now also find that the perp(S)(2x – 3(x-squared)) will equal (2x – 3(x-squared)) minus our proj(S)(2x – 3(x-squared)), which we can write as -45/11 + (7/11)x – (3/11)(x-squared).
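The projection and perpendicular part computed above can be reproduced with exact rational arithmetic. This is a sketch under stated assumptions: polynomials are stored as coefficient tuples `(a0, a1, a2)`, and the names `evaluate`, `inner`, `proj`, and `perp` are chosen here for illustration.

```python
from fractions import Fraction

def evaluate(poly, x):
    # poly = (a0, a1, a2) stands for a0 + a1*x + a2*x^2
    return sum(c * x**k for k, c in enumerate(poly))

def inner(p, q):
    # <p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2)
    return sum(w * evaluate(p, x) * evaluate(q, x)
               for x, w in ((0, 2), (1, 3), (2, 4)))

p = (3, 1, -2)    # p(x) = 3 + x - 2x^2, which spans S
q = (0, 2, -3)    # q(x) = 2x - 3x^2

coeff = Fraction(inner(q, p), inner(p, p))     # 90/66 reduces to 15/11
proj = tuple(coeff * c for c in p)             # proj_S(q)
perp = tuple(b - a for a, b in zip(proj, q))   # perp_S(q) = q - proj_S(q)

print(proj)
print(perp)
print(inner(p, perp))   # 0, so perp_S(q) is orthogonal to p
```

Using `Fraction` rather than floating point keeps the coefficients as exact elevenths, so they can be compared directly with the values worked out by hand.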

Now, if we set the polynomial r(x) equal to our perp(S)(2x – 3(x-squared)), which we just calculated, then we can note that r(0) = -45/11, r(1) = -41/11, and r(2) = -43/11, and so we see that the inner product of <p and r> is 2(3)(-45/11) + 3(2)(-41/11) + 4(-3)(-43/11). If we factor out the 1/11, we see that what is left is -270 – 246 + 516, which equals 0. And so we have that the inner product of <p with the perp(S)(q)> equals 0, as desired. And moreover, since r is of the form q + sp for some s in R (here, s = -15/11), we know that the span of {p, q} will equal the span of {p, r}, and so {p, r} is an orthogonal basis for the span of {p, q}.

So, this example doubles as an easy example of the generalized Gram-Schmidt procedure. In general, all of our original arguments for the Gram-Schmidt procedure translate into the world of inner product spaces. That is, if V is an inner product space, and if the set A of vectors {w1 through wn} is a basis for a subspace S of V, then we define the set {v1 through vn} by the following procedure. We set v1 = w1, and then v2 = w2 – ((the inner product of <w2, v1>)/(the inner product of <v1, v1>))v1, and so on, until we get that vn will be wn – ((the inner product of <wn, v1>)/(the inner product of <v1, v1>))v1 – all the way through to – ((the inner product of <wn, v(n-1)>)/(the inner product of <v(n-1), v(n-1)>))v(n-1). The resulting set {v1 through vn} will be an orthogonal basis for S. And also, as before, we can loosen our conditions, and simply require that A is a spanning set for S, as long as we throw out any 0 vectors that are created by our algorithm.

So basically, the Gram-Schmidt procedure will work exactly as before, except now we’re taking inner products instead of dot products.
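One way to sketch the procedure is a function that receives the inner product as an argument, so the same code works for any inner product space whose vectors can be stored as coordinate tuples. The function name `gram_schmidt`, the tuple representation, and the extra vector p + q used to show a zero vector being discarded are all assumptions made for illustration.

```python
from fractions import Fraction

def evaluate(poly, x):
    # poly = (a0, a1, a2) stands for a0 + a1*x + a2*x^2
    return sum(c * x**k for k, c in enumerate(poly))

def inner(p, q):
    # <p, q> = 2p(0)q(0) + 3p(1)q(1) + 4p(2)q(2)
    return sum(w * evaluate(p, x) * evaluate(q, x)
               for x, w in ((0, 2), (1, 3), (2, 4)))

def gram_schmidt(vectors, inner):
    # v_k = w_k - sum over earlier v_i of (<w_k, v_i> / <v_i, v_i>) v_i
    basis = []
    for w in vectors:
        w = tuple(Fraction(c) for c in w)
        v = w
        for u in basis:
            coeff = inner(w, u) / inner(u, u)
            v = tuple(vc - coeff * uc for vc, uc in zip(v, u))
        if any(v):            # throw out any zero vector the algorithm creates
            basis.append(v)
    return basis

p = (3, 1, -2)
q = (0, 2, -3)

basis = gram_schmidt([p, q], inner)
print(basis[1])                      # coefficients of r = perp_S(q)
print(inner(basis[0], basis[1]))     # 0

# p + q adds nothing new to the span, so it is reduced to 0 and dropped.
reduced = gram_schmidt([p, q, (3, 3, -5)], inner)
print(len(reduced))                  # 2
```

Exact fractions matter for the spanning-set case: with floating point, the dependent vector would usually leave a tiny nonzero remainder, and the test for a zero vector would need a tolerance instead.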

Now, there are two more results about orthogonal and orthonormal sets that we want to translate into this general world of inner product spaces. The first is that any orthogonal set that does not contain the 0 vector is linearly independent. And thus, all orthonormal sets are linearly independent. The second is that, if our set B of vectors {v1 through vn} is an orthogonal basis for an inner product space V, and if v is any element of our vector space V, then the coordinates b of v with respect to our set B will satisfy the equation that bi equals (the inner product of <v with vi>)/(the inner product of <vi with vi>). And if B is an orthonormal basis, then we have that the inner product of <vi and vi> is 1, so we simply get that bi equals the inner product of <v with vi>.
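The coordinate formula can be illustrated with the orthogonal set {A, B, C} from the earlier matrix example, which is an orthogonal basis for its own span. The matrix `v = 2A – B + 3C` is a hypothetical test vector chosen here so that the expected coordinates are known in advance, and the helper name `inner` is likewise an assumption.

```python
from fractions import Fraction

def inner(A, B):
    # tr(B^T A) computed entrywise: sum of A[i][j] * B[i][j]
    return sum(a * b for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

A = [[1, -5], [-3, 1]]
B = [[1, 2], [-2, 3]]
C = [[7, 0], [2, -1]]

# Hypothetical vector in span{A, B, C}: v = 2A - B + 3C
v = [[2 * a - b + 3 * c for a, b, c in zip(ra, rb, rc)]
     for ra, rb, rc in zip(A, B, C)]

# b_i = <v, v_i> / <v_i, v_i> for each vector v_i of the orthogonal basis
coords = [Fraction(inner(v, M), inner(M, M)) for M in (A, B, C)]
print(coords)   # recovers the coefficients 2, -1, 3
```

No linear system has to be solved: each coordinate comes from one inner product with the matching basis vector, divided by that vector's squared norm.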

Let’s demonstrate this with an example. Let the set B, as shown here, be an orthonormal basis for M(2, 2) with the standard inner product (trace ((B-transpose)A)). Let’s find the coordinates of the matrix [1, 2; 0, 1] with respect to B. We simply need to compute the inner product of [1, 2; 0, 1] with each of our basis vectors. So, for b1, we’ll get the inner product of <[1, 2; 0, 1] with [1/2, 1/2; 1/2, 1/2]>. Again, we can compute our inner product by simply taking the products of corresponding entries and summing them, so we get 1(1/2) + 2(1/2) + 0(1/2) + 1(1/2), which will equal 2. To get our second coordinate, we look at the inner product of <[1, 2; 0, 1] with our second basis vector [1/2, -1/2; 1/2, -1/2]>, and if we perform the calculation, we get 1(1/2) + 2(-1/2) + 0(1/2) + 1(-1/2), which equals -1. To get our third coordinate, we look at the inner product of <[1, 2; 0, 1] with our third basis vector>, and we’ll do the calculations and see that it is -1/(root 2). And lastly, to get our fourth coordinate, we will look at the inner product of <[1, 2; 0, 1] with our fourth basis vector>. We’ll perform our calculations and see that it equals 1/(root 2). And so we’ve found that the coordinates of [1, 2; 0, 1] with respect to the orthonormal basis B are the vector [2; -1; -1/(root 2); 1/(root 2)].