Lesson: Vector Spaces


Transcript

All of this leads us to notice that many of the things we frequently encounter share a common underlying structure. So, instead of studying these things individually, we will study them in general, based only on this common structure. In a nod to Rn, which started it all, they are known as vector spaces.

Definition: A vector space over the real numbers is a set V, together with an operation of addition (usually denoted as x + y for any x and y in V) and an operation of scalar multiplication (usually denoted sx for any x in V and s in R) such that for any x, y, and z in V, and s and t in R, we have all of the following properties:

1. V1) x + y is in V, so V is closed under addition.
2. V2) (x + y) + z = x + (y + z), so addition is associative.
3. V3) There is an element 0 in V called the 0 vector such that x + 0 = x = 0 + x. This is an additive identity.
4. V4) For each x in V, there exists an element inverse-x such that x + (inverse-x) = 0. This is our additive inverse property.
5. V5) x + y = y + x, so addition is commutative.
6. V6) sx is in V, so V is closed under scalar multiplication.
7. V7) s(tx) = (st)x, so scalar multiplication is associative.
8. V8) (s + t)x = sx + tx, so scalar multiplication distributes over scalar addition.
9. V9) s(x + y) = sx + sy, so scalar multiplication distributes over vector addition.
10. V10) 1x = x, so 1 is our scalar multiplicative identity.
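
To see what the properties ask for in a familiar setting, here is a minimal Python sketch that spot-checks them for R2 with the standard operations. The names `add`, `smul`, and `spot_check_r2` are made up for this illustration, and the sample vectors and scalars are arbitrary. Exact rational arithmetic is used so that equality tests are reliable. Closure (V1 and V6) is not tested, since the results are always pairs of numbers. A check on a handful of samples is evidence, not a proof.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    """Standard componentwise addition in R^2."""
    return (x[0] + y[0], x[1] + y[1])

def smul(s, x):
    """Standard scalar multiplication in R^2."""
    return (s * x[0], s * x[1])

def spot_check_r2(vectors, scalars):
    """Return True if V2-V5 and V7-V10 hold on the given samples."""
    zero = (Fraction(0), Fraction(0))
    for x, y, z in product(vectors, repeat=3):
        if add(add(x, y), z) != add(x, add(y, z)):                 # V2
            return False
    for x, y in product(vectors, repeat=2):
        if add(x, y) != add(y, x):                                 # V5
            return False
    for x in vectors:
        if not (add(x, zero) == x == add(zero, x)):                # V3
            return False
        if add(x, smul(-1, x)) != zero:                            # V4, using (-1)x as the inverse
            return False
        if smul(1, x) != x:                                        # V10
            return False
    for s, t in product(scalars, repeat=2):
        for x, y in product(vectors, repeat=2):
            if smul(s, smul(t, x)) != smul(s * t, x):              # V7
                return False
            if smul(s + t, x) != add(smul(s, x), smul(t, x)):      # V8
                return False
            if smul(s, add(x, y)) != add(smul(s, x), smul(s, y)):  # V9
                return False
    return True

# Arbitrary sample data, chosen only for illustration.
vectors = [(Fraction(1), Fraction(-2)), (Fraction(3, 2), Fraction(0)), (Fraction(-5), Fraction(7, 3))]
scalars = [Fraction(-2), Fraction(1, 3), Fraction(4)]
all_hold = spot_check_r2(vectors, scalars)
```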

Note that, in general, an element of a vector space V is known as a vector. Since elements of Rn are also known as vectors, this can be confusing, so to help, we use the notation bold-x to mean a vector from a general vector space and reserve the symbol x with an arrow over it to mean an element of Rn.

We have already seen that Rn is a vector space, as are the set of m-by-n matrices and the set of polynomials of degree up to n. As notation, we will write M(m, n) for the vector space of m-by-n matrices, and we write Pn for the vector space of polynomials of degree up to n. Some other vector spaces are F, the set of all functions from R to R, and F(a, b), the set of all functions from the interval (a, b) to R.

There are many more common vector spaces, which we will explore throughout the course, but first I’d like to take a look at a non-standard vector space. Things get interesting when non-standard definitions of addition and scalar multiplication are used. In these cases, the usual notation for addition and scalar multiplication is replaced with the symbols circle-plus and circle-dot, and sometimes even circle-plus-V and circle-dot-V are used if we need to keep track of which vector space we are referring to.

So let’s look at this example. Let V be the set of all points (a, b) such that a and b are real numbers and b is greater than 0. We will define addition in V by saying that (a, b) circle-plus (c, d) = (ad + bc, bd), and we define scalar multiplication in V by t circle-dot (a, b) = (ta(b-to-the-(t-1)), b-to-the-t). Now let’s show that V is a vector space, paying close attention to how the axioms look with our unusual definitions. To that end, we will let (a, b), (c, d), and (e, f) all be in V, and we will let s and t be in R.
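
The two operations just defined can be written down directly. The following is a minimal Python sketch; the function names `oplus`, `odot`, and `in_V` are our own labels for circle-plus, circle-dot, and membership in V, not notation from the lesson. Points are stored as pairs of exact rationals.

```python
from fractions import Fraction

def oplus(u, v):
    """(a, b) circle-plus (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = u, v
    return (a * d + b * c, b * d)

def odot(t, u):
    """t circle-dot (a, b) = (t * a * b^(t-1), b^t)."""
    a, b = u
    return (t * a * b ** (t - 1), b ** t)

def in_V(u):
    """A pair (a, b) belongs to V exactly when b > 0."""
    return u[1] > 0

u = (Fraction(3), Fraction(2))
v = (Fraction(-1), Fraction(5))

total = oplus(u, v)    # (3*5 + 2*(-1), 2*5) = (13, 10)
doubled = odot(2, u)   # (2*3*2^1, 2^2) = (12, 4)
```

Note that doubling here is not componentwise: 2 circle-dot (3, 2) is (12, 4), not (6, 4).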

Property V1: We know that (a, b) circle-plus (c, d) = (ad + bc, bd). Clearly, ad + bc is in R and bd is in R, and since both b and d are greater than 0, we have that bd is greater than 0. So this means that (ad + bc, bd) is in V, and so our sum (a, b) circle-plus (c, d) is in V, so V is closed under addition.

Property V2: We want to look at ((a, b) circle-plus (c, d)) circle-plus (e, f). This equals (ad + bc, bd) circle-plus (e, f), which equals ((ad + bc)f + (bd)e, (bd)f). If we multiply all this out, we get (adf + bcf + bde, bdf), which we can rewrite as (a(df) + b(cf + de), b(df)). This is the same as (a, b) circle-plus (cf + de, df), which equals (a, b) circle-plus ((c, d) circle-plus (e, f)). As a side note, since we’ve already proved V1, we don’t need to worry about whether any of our intermediate steps are in V, nor shall we in the future.
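
A concrete instance can make the associativity computation easier to follow. This is a small Python sketch with arbitrarily chosen points, where `oplus` is our own name for circle-plus. It evaluates both groupings for one triple by hand and then compares them over every triple drawn from a short sample list. It illustrates the algebra above and does not replace it.

```python
from fractions import Fraction
from itertools import product

def oplus(u, v):
    """(a, b) circle-plus (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = u, v
    return (a * d + b * c, b * d)

u = (Fraction(3), Fraction(2))
v = (Fraction(-1), Fraction(5))
w = (Fraction(4), Fraction(3))

# Left grouping: (3, 2) + (-1, 5) = (13, 10), then (13, 10) + (4, 3) = (79, 30)
left = oplus(oplus(u, v), w)
# Right grouping: (-1, 5) + (4, 3) = (17, 15), then (3, 2) + (17, 15) = (79, 30)
right = oplus(u, oplus(v, w))

# Compare both groupings on every ordered triple from a small sample.
samples = [u, v, w, (Fraction(7, 4), Fraction(1, 3))]
associative_on_samples = all(
    oplus(oplus(p, q), r) == oplus(p, oplus(q, r))
    for p, q, r in product(samples, repeat=3)
)
```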

V3: To prove this property, we need to find an element 0 of V such that (a, b) circle-plus 0 = (a, b). Let’s assume that 0 = (x, y). Then we want (a, b) circle-plus (x, y), which equals (ay + bx, by), to equal (a, b). Obviously, this means that we want b = by, which means that y must equal 1, and then we want ay + bx to equal a. Well, plugging in y = 1, this becomes a + bx = a, so we must have that bx = 0. Now, b does not equal 0 since b is greater than 0, so the only way to have that bx = 0 is to have x = 0. Now, all of this work leads us to the guess that our 0 should be (0, 1). Now we need to prove it. First, we will note that (0, 1) is in our vector space V since 0 and 1 are both real numbers and 1 is greater than 0. Next, we note that (a, b) circle-plus (0, 1) = (a(1) + b(0), b(1)), which does, in fact, equal (a, b), and that (0, 1) circle-plus (a, b) = (0(b) + 1(a), 1(b)), which again equals (a, b). And so we see that V3 holds, with (0, 1) as our 0 vector.
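
As a quick check of the claim that (0, 1) acts as the 0 vector, here is a minimal Python sketch. The names `oplus` and `is_identity` are our own, and the sample points are arbitrary. It also confirms that the pair (0, 0), which is not even an element of V, fails to act as an identity on these samples.

```python
from fractions import Fraction

def oplus(u, v):
    """(a, b) circle-plus (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = u, v
    return (a * d + b * c, b * d)

def is_identity(e, points):
    """True if e added on either side leaves every sample point unchanged."""
    return all(oplus(p, e) == p == oplus(e, p) for p in points)

ZERO = (Fraction(0), Fraction(1))   # the 0 vector found in V3
points = [(Fraction(3), Fraction(2)), (Fraction(-1), Fraction(5)), (Fraction(7, 4), Fraction(1, 3))]

zero_works = is_identity(ZERO, points)
# (0, 0) has second coordinate 0, so it is not in V, and it is not an identity either:
# (a, b) circle-plus (0, 0) = (0, 0).
origin_works = is_identity((Fraction(0), Fraction(0)), points)
```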

V4: We found in V3 that our 0 vector is (0, 1). Now, given an (a, b), we need to find inverse-(a, b) in our vector space. So let’s look for (w, z) such that (a, b) circle-plus (w, z) = (0, 1). Well, we know that (a, b) circle-plus (w, z) = (az + bw, bz), so we need that bz = 1 and az + bw = 0. From bz = 1, we get that z = 1/b. Note that we can divide by b since b is greater than 0. Plugging this into az + bw = 0, we get that a(1/b) + bw = 0, so w = –(a/(b-squared)). So now we guess that our inverse-(a, b) should equal (–(a/(b-squared)), 1/b). First, we note that since b is greater than 0, 1/b is also greater than 0, so (–(a/(b-squared)), 1/b) is in fact an element of V. And then next, we see that (a, b) circle-plus (–(a/(b-squared)), 1/b) = (a(1/b) + b(–(a/(b-squared))), b(1/b)), which equals (a/b – a/b, 1), which equals (0, 1) as desired.
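
The inverse formula derived above can be checked the same way. In this minimal Python sketch, `oplus` and `inverse` are our own names and the sample points are arbitrary. Each inverse is computed from the formula, then we confirm that it lies in V and that adding it gives (0, 1).

```python
from fractions import Fraction

def oplus(u, v):
    """(a, b) circle-plus (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = u, v
    return (a * d + b * c, b * d)

def inverse(u):
    """Additive inverse of (a, b) in V: (-a / b^2, 1 / b)."""
    a, b = u
    return (-a / b ** 2, 1 / b)

ZERO = (Fraction(0), Fraction(1))
points = [(Fraction(3), Fraction(2)), (Fraction(-1), Fraction(5)), (Fraction(7, 4), Fraction(1, 3))]

# For (3, 2): the inverse is (-3/4, 1/2), and (3, 2) circle-plus (-3/4, 1/2) = (3/2 - 3/2, 1).
example = inverse(points[0])
inverses_in_V = all(inverse(p)[1] > 0 for p in points)
inverses_work = all(oplus(p, inverse(p)) == ZERO for p in points)
```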

V5: (a, b) circle-plus (c, d) = (ad + bc, bd), and rearranging, we can see that this is also equal to (cb + da, db), which is (c, d) circle-plus (a, b). So our circle-plus is commutative.

V6: s circle-dot (a, b) = (sa(b-to-the-(s-1)), b-to-the-s). Now since b is greater than 0, b-to-the-s is also greater than 0 for any s, and of course, sa(b-to-the-(s-1)) and b-to-the-s are both real numbers, so (sa(b-to-the-(s-1)), b-to-the-s) is in our vector space V, which means that V is closed under scalar multiplication.

V7: s circle-dot (t circle-dot (a, b)) = s circle-dot (ta(b-to-the-(t-1)), b-to-the-t). Well, this equals (sta(b-to-the-(t-1))((b-to-the-t)-to-the-(s-1)), (b-to-the-t)-to-the-s). This is equal to (sta(b-to-the-(t-1))(b-to-the-(ts-t)), b-to-the-(ts)), which equals (sta(b-to-the-(t-1+ts-t)), b-to-the-(ts)), which equals (sta(b-to-the-(ts-1)), b-to-the-(ts)). This equals (sta(b-to-the-(st-1)), b-to-the-(st)), which is (st) circle-dot (a, b).
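
The exponent bookkeeping in V7 is easy to get wrong, so a numerical check is reassuring. This is a minimal Python sketch, where `odot` and `close` are our own names and the samples are arbitrary. Because the scalars here are general real numbers, it uses floating-point arithmetic and compares results with a tolerance rather than exactly.

```python
import math

def odot(t, u):
    """t circle-dot (a, b) = (t * a * b^(t-1), b^t), for b > 0."""
    a, b = u
    return (t * a * b ** (t - 1), b ** t)

def close(u, v):
    """Compare two pairs coordinate by coordinate, up to rounding error."""
    return all(math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-12) for p, q in zip(u, v))

# Worked instance with s = 3, t = 2, (a, b) = (3, 2):
# 2 circle-dot (3, 2) = (12, 4), then 3 circle-dot (12, 4) = (3*12*4^2, 4^3) = (576, 64),
# and 6 circle-dot (3, 2) = (6*3*2^5, 2^6) = (576, 64).
nested = odot(3, odot(2, (3.0, 2.0)))
combined = odot(3 * 2, (3.0, 2.0))

samples = [(3.0, 2.0), (-1.0, 5.0), (0.25, 0.5)]
scalars = [-1.5, 0.5, 2.0, math.pi]
v7_holds = all(
    close(odot(s, odot(t, p)), odot(s * t, p))
    for s in scalars for t in scalars for p in samples
)
```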

V8: (s + t) circle-dot (a, b) = ((s + t)a(b-to-the-(s+t-1)), b-to-the-(s+t)), which equals (sa(b-to-the-(s-1))(b-to-the-t) + ta(b-to-the-(t-1))(b-to-the-s), (b-to-the-s)(b-to-the-t)). This equals (sa(b-to-the-(s-1)), b-to-the-s) circle-plus (ta(b-to-the-(t-1)), b-to-the-t), which equals (s circle-dot (a, b)) circle-plus (t circle-dot (a, b)).

V9: s circle-dot ((a, b) circle-plus (c, d)), well that’s equal to s circle-dot (ad + bc, bd). This is equal to (s(ad + bc)((bd)-to-the-(s-1)), (bd)-to-the-s). This is equal to (sad((bd)-to-the-(s-1)) + sbc((bd)-to-the-(s-1)), (b-to-the-s)(d-to-the-s)). This equals (sa(b-to-the-(s-1))(d-to-the-s) + sc(d-to-the-(s-1))(b-to-the-s), (b-to-the-s)(d-to-the-s)), which equals (sa(b-to-the-(s-1)), b-to-the-s) circle-plus (sc(d-to-the-(s-1)), d-to-the-s), which equals (s circle-dot (a, b)) circle-plus (s circle-dot (c, d)).
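
Both distributive properties, V8 and V9, can be spot-checked numerically in the same spirit. In this minimal Python sketch the names `oplus`, `odot`, and `close` are our own, the samples are arbitrary, and floating-point results are compared with a tolerance.

```python
import math

def oplus(u, v):
    """(a, b) circle-plus (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = u, v
    return (a * d + b * c, b * d)

def odot(t, u):
    """t circle-dot (a, b) = (t * a * b^(t-1), b^t), for b > 0."""
    a, b = u
    return (t * a * b ** (t - 1), b ** t)

def close(u, v):
    """Compare two pairs coordinate by coordinate, up to rounding error."""
    return all(math.isclose(p, q, rel_tol=1e-9, abs_tol=1e-12) for p, q in zip(u, v))

u, v = (3.0, 2.0), (-1.0, 5.0)

# V8 with s = 1, t = 2: 3 circle-dot (3, 2) = (36, 8),
# and (3, 2) circle-plus (12, 4) = (3*4 + 2*12, 8) = (36, 8).
v8_left = odot(1 + 2, u)
v8_right = oplus(odot(1, u), odot(2, u))

# V9 with s = 2: 2 circle-dot (13, 10) = (260, 100),
# and (12, 4) circle-plus (-10, 25) = (12*25 + 4*(-10), 100) = (260, 100).
v9_left = odot(2, oplus(u, v))
v9_right = oplus(odot(2, u), odot(2, v))

samples = [u, v, (0.25, 0.5)]
scalars = [-1.5, 0.5, 2.0, math.pi]
v8_holds = all(
    close(odot(s + t, p), oplus(odot(s, p), odot(t, p)))
    for s in scalars for t in scalars for p in samples
)
v9_holds = all(
    close(odot(s, oplus(p, q)), oplus(odot(s, p), odot(s, q)))
    for s in scalars for p in samples for q in samples
)
```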

And finally, as V10, we see that 1 circle-dot (a, b) = (1a(b-to-the-(1-1)), b-to-the-1), which equals (a(b-to-the-0), b), which is (a(1), b), which is (a, b).

Examples such as this one force us to really pay attention to what the vector properties are saying. Thankfully, such situations rarely occur, as we will mostly focus on the standard vector spaces. With the standard vector spaces, our properties are obviously true, so we will not focus our attention so much on proving that these properties hold, but instead, we will use the properties, known to be true for all vector spaces, to prove other facts that will also be true for all vector spaces.

The first example of this is the following theorem. Let V be a vector space. Then (1) 0x = the 0 vector for all x in V, (2) (-1)x = inverse-x for all x in V, and (3) t times the 0 vector equals the 0 vector for all scalars t in R.
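
Since the theorem holds in every vector space, it must hold in the non-standard space V from the earlier example, where the 0 vector is (0, 1). The following minimal Python sketch checks the three claims there, in the order stated. The names `oplus` and `odot` are our own, the sample point is arbitrary, and only integer scalars are used so that the arithmetic stays exact.

```python
from fractions import Fraction

def oplus(u, v):
    """(a, b) circle-plus (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = u, v
    return (a * d + b * c, b * d)

def odot(t, u):
    """t circle-dot (a, b) = (t * a * b^(t-1), b^t), for b > 0."""
    a, b = u
    return (t * a * b ** (t - 1), b ** t)

ZERO = (Fraction(0), Fraction(1))   # the 0 vector of V
x = (Fraction(3), Fraction(2))

# First claim: 0 circle-dot (3, 2) = (0 * 3 * 2^(-1), 2^0) = (0, 1).
zero_times_x = odot(0, x)

# Second claim: (-1) circle-dot (3, 2) = (-3 * 2^(-2), 2^(-1)) = (-3/4, 1/2),
# which is the additive inverse found in V4.
minus_one_times_x = odot(-1, x)
sum_with_x = oplus(x, minus_one_times_x)

# Third claim: t circle-dot (0, 1) = (t * 0 * 1^(t-1), 1^t) = (0, 1).
scalar_times_zero = [odot(t, ZERO) for t in range(-3, 4)]
```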

I will prove statement 2 now, and I will leave the proof of statement 3 as a practice problem. So, proof of statement 2. Before I actually dive into the proof, I want to talk about what it says, because up until now, we’ve been taking it as a notational convention that (-1)x = inverse-x. But in this section, we introduced the notation inverse-x to mean the additive inverse of x, and not necessarily the scalar product (-1)x. Of course, the point of this theorem is to show that these two values are, in fact, equal, thus justifying our earlier decision to set them equal.

Now, to start the proof, I’ll actually want to prove another fact first: that the additive inverse is unique. That is to say, if x + y = 0 and x + z = 0, then y = z. To see this, let x, y, and z be as stated, and we’ll notice that z = 0 + z by property V3, which equals (x + y) + z by our choice of y. Well, this equals (y + x) + z by property V5 (our commutative property). Well, this equals y + (x + z) by property V2 (our associative property). Well, this equals y + 0 by our choice of z, which equals y by property V3.
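
Written out symbolically, the chain of equalities in the uniqueness argument looks as follows. This is a typeset restatement of the steps just given, assuming the `amsmath` package for the `align*` environment. Bold letters denote vectors of a general vector space, and the hypotheses are x + y = 0 and x + z = 0.

```latex
\begin{align*}
\mathbf{z} &= \mathbf{0} + \mathbf{z} && \text{by V3} \\
           &= (\mathbf{x} + \mathbf{y}) + \mathbf{z} && \text{since } \mathbf{x} + \mathbf{y} = \mathbf{0} \\
           &= (\mathbf{y} + \mathbf{x}) + \mathbf{z} && \text{by V5} \\
           &= \mathbf{y} + (\mathbf{x} + \mathbf{z}) && \text{by V2} \\
           &= \mathbf{y} + \mathbf{0} && \text{since } \mathbf{x} + \mathbf{z} = \mathbf{0} \\
           &= \mathbf{y} && \text{by V3}
\end{align*}
```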

So, thanks to the uniqueness of the additive inverse, we now know that in order to show that some y equals the additive inverse of x (i.e., to show that y = inverse-x), we need only show that y satisfies the condition in V4 (i.e., that x + y = 0). In our particular case, we suspect that inverse-x is (-1)x, and so we will look at x + (-1)x. We have x + (-1)x = 1x + (-1)x using property V10, and this is equal to (1 + (-1))x by one of our distributive properties (V8). This equals 0x, since 1 + (-1) = 0 is just an operation in the real numbers. And from the first part of our theorem, we know that this must equal the 0 vector. And so, since x + (-1)x = the 0 vector, we know that (-1)x is inverse-x.
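
The final computation can also be laid out as a single chain, with the justification for each step alongside. As before, this is a typeset restatement of the argument above, assuming the `amsmath` and `amssymb` packages.

```latex
\begin{align*}
\mathbf{x} + (-1)\mathbf{x} &= 1\mathbf{x} + (-1)\mathbf{x} && \text{by V10} \\
                            &= (1 + (-1))\mathbf{x} && \text{by V8} \\
                            &= 0\mathbf{x} && \text{arithmetic in } \mathbb{R} \\
                            &= \mathbf{0} && \text{by the first part of the theorem}
\end{align*}
```

Together with the uniqueness of the additive inverse, this shows that (-1)x is the additive inverse of x.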