Transcript — Introduction
So far in linear algebra, we have been looking entirely at linear forms a1x1 + up to anxn. However, a very important topic in many areas of mathematics is that of a quadratic form, that is, an expression which is a linear combination of all possible terms xixj for 1 less than or equal to i less than or equal to j less than or equal to n. For example, for two variables x1 and x2, a quadratic form has the form a1(x1-squared) + a2x1x2 + a3(x2-squared), and for three variables x1, x2, x3, a quadratic form has the form a1(x1-squared) + a2x1x2 + a3x1x3 + a4(x2-squared) + a5x2x3 + a6(x3-squared).
Although quadratic forms are not linear, we do have a connection between quadratic forms and linear algebra through matrix multiplication. For example, we have (x-transpose)[a, b; c, d]x = [(ax1 + cx2), (bx1 + dx2)][x1; x2], which is (ax1 + cx2)x1 + (bx1 + dx2)x2, which, after simplifying, gives us the quadratic form a(x1-squared) + (b + c)x1x2 + d(x2-squared). We can use this to make a more useful definition for a quadratic form on Rn.
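If you would like to check that expansion numerically, here is a minimal Python sketch. The helper name quadratic_form and the sample values of a, b, c, d, x1, x2 are illustrative choices, not part of the lecture.

```python
def quadratic_form(A, x):
    """Return x^T A x for a square matrix A (a list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

# Arbitrary sample entries, chosen only for illustration.
a, b, c, d = 2, 5, -1, 3
x1, x2 = 4, -7

# Left side: the matrix product x^T [a, b; c, d] x.
lhs = quadratic_form([[a, b], [c, d]], [x1, x2])

# Right side: the simplified form a*x1^2 + (b + c)*x1*x2 + d*x2^2.
rhs = a * x1**2 + (b + c) * x1 * x2 + d * x2**2

print(lhs, rhs)  # 67 67
```

Notice that only the sum b + c shows up on the right side, so many different matrices give the same quadratic form.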
Definition: We define a quadratic form on Rn with corresponding matrix A by Q(x) = (x-transpose)Ax for all x in Rn. Notice that the “x-transpose times a matrix” part looks familiar. In particular, there is a connection between the inner product (i.e., the dot product) and quadratic forms. Observe that Q(x) = (x-transpose)Ax, which, by our wonderful formula, is equal to x dot product with Ax, which is Ax dot x since the dot product is symmetric, which is ((Ax)-transpose)x, which gives us (x-transpose)(A-transpose)x. Hence, A and A-transpose give the same quadratic form. But it is very important to notice that this does not imply that A = A-transpose since we cannot apply Theorem 3.1.4 in this situation. Of course, this does suggest that the most natural choice of matrix is a symmetric one. In fact, it can be shown that every quadratic form can be written as Q(x) = (x-transpose)Ax where A is symmetric: the average (1/2)(A + A-transpose) is symmetric and gives the same quadratic form as A. Moreover, each symmetric matrix A uniquely determines a quadratic form. Thus, as we will see, we often deal with a quadratic form and its corresponding symmetric matrix in the same way.
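To see concretely that A, A-transpose, and a symmetric matrix can all give the same quadratic form, here is a small Python sketch. It assumes the symmetric matrix is built as (1/2)(A + A-transpose); the helper names and the sample matrix and vector are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def quadratic_form(A, x):
    """Return x^T A x for a square matrix A (a list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def transpose(A):
    """Return the transpose of A as a list of rows."""
    return [list(row) for row in zip(*A)]

def symmetrize(A):
    """Return (A + A^T)/2, which is symmetric and has the same quadratic form as A."""
    n = len(A)
    return [[Fraction(A[i][j] + A[j][i], 2) for j in range(n)] for i in range(n)]

A = [[2, 5], [1, -1]]   # a non-symmetric sample matrix
S = symmetrize(A)       # equals [2, 3; 3, -1]
x = [3, -2]             # an arbitrary test vector

# A, A-transpose and S should all give the same value of Q(x).
values = [quadratic_form(M, x) for M in (A, transpose(A), S)]
print(all(v == -22 for v in values))  # True
```

Here A is not equal to A-transpose, yet both give the same value at x, which matches the warning in the discussion above.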
Example: What is the quadratic form corresponding to A = [2, 3; 3, -1]? Solution: We have Q(x1, x2) = (x-transpose)Ax, which is [x1, x2][2, 3; 3, -1][x1; x2], which gives us [x1, x2][2x1 + 3x2; 3x1 – x2], which is 2(x1-squared) + 6x1x2 – (x2-squared).
Example: What is the symmetric matrix corresponding to the quadratic form Q(x1, x2) = 4(x1-squared) – 2x1x2 + 7(x2-squared)? Solution: We carefully analyze how the calculations worked above. We notice that we can think of the corresponding symmetric matrix as a grid, with the rows and columns corresponding to x1 and x2. Then, the entries belonging to each term xixj must add up to the coefficient of that term. So the 1,1 entry in the symmetric matrix is the x1,x1 grid, and so it must be the coefficient of x1-squared, which is 4. The 1,2 entry of the symmetric matrix is the x1,x2 grid, but the 2,1 entry is also an x2,x1 grid, and so they have to split the coefficient of x1x2 between them. Since the matrix is symmetric, they each get 1/2 of -2, and hence, the 1,2 and 2,1 entries are both -1. The 2,2 entry of the symmetric matrix is the x2,x2 grid, and so it is the coefficient of x2-squared, 7. Hence, we see that the corresponding symmetric matrix for the quadratic form is A = [4, -1; -1, 7].
Example: Find the symmetric matrix corresponding to Q(x1, x2, x3) = 2(x1-squared) + 4x1x2 + 2x1x3 – 3(x2-squared) – 6x2x3 + 5(x3-squared). Solution: Using the grid method, we see that we must have A = [2, 2, 1; 2, -3, -3; 1, -3, 5]. Take a minute to look over this example to make sure that you understand where all the entries come from. As we will soon see, it is very important to be able to quickly convert back and forth between a quadratic form and its corresponding symmetric matrix.
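The grid method from the last two examples can be sketched in Python as follows. The function name symmetric_matrix and the dictionary layout for the coefficients are assumptions made for this illustration.

```python
from fractions import Fraction

def symmetric_matrix(n, coeffs):
    """Build the symmetric matrix of a quadratic form in n variables.

    coeffs maps a pair (i, j) with 1 <= i <= j <= n to the coefficient
    of the term x_i x_j (hypothetical input format for this sketch).
    """
    A = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            # A squared term goes straight onto the diagonal.
            A[i - 1][i - 1] = Fraction(c)
        else:
            # A cross term is split evenly between the i,j and j,i entries.
            A[i - 1][j - 1] = A[j - 1][i - 1] = Fraction(c, 2)
    return A

# Q(x1, x2, x3) = 2x1^2 + 4x1x2 + 2x1x3 - 3x2^2 - 6x2x3 + 5x3^2
Q = {(1, 1): 2, (1, 2): 4, (1, 3): 2, (2, 2): -3, (2, 3): -6, (3, 3): 5}
A = symmetric_matrix(3, Q)

print(A == [[2, 2, 1], [2, -3, -3], [1, -3, 5]])  # True
```

Fractions are used so that an odd cross-term coefficient, such as 3x1x2, is split exactly into 3/2 and 3/2.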
If A is a diagonal matrix, then the quadratic form looks much nicer, and is much, much easier to use. Definition: If the symmetric matrix corresponding to the quadratic form Q(x) = (x-transpose)Ax is diagonal, then we say that Q(x) is in diagonal form.
We know by the Principal Axis Theorem that if we are given any real symmetric matrix, then we can orthogonally diagonalize it. We now prove that orthogonally diagonalizing a symmetric matrix corresponds to performing a change of variables on the quadratic form Q(x) = (x-transpose)Ax that brings Q(x) into diagonal form.
Theorem 10.3.1: If Q(x) = (x-transpose)Ax is a quadratic form in n variables with corresponding symmetric matrix A, and P is an orthogonal matrix such that (P-transpose)AP is equal to the diagonal matrix (lambda1 to lambda_n), where lambda1 to lambda_n are the eigenvalues of A, then performing the change of variables y = (P-transpose)x gives Q(x) = (lambda1)(y1-squared) + up to (lambda_n)(yn-squared).
Proof: Since P is orthogonal, we have that y = (P-transpose)x implies that x = Py, and so we have Q(x) = (x-transpose)Ax, which is ((Py)-transpose)A(Py), which is (y-transpose)((P-transpose)AP)y—that is, (y-transpose)Dy, which is the quadratic form in the variables y corresponding to the diagonal matrix D, which is (lambda1)(y1-squared) + up to (lambda_n)(yn-squared) as required.
Example: Find new variables y1, y2, y3, and y4 such that Q(x1, x2, x3, x4) = 3(x1-squared) + 2x1x2 – 10x1x3 + 10x1x4 + 3(x2-squared) + 10x2x3 – 10x2x4 + 3(x3-squared) + 2x3x4 + 3(x4-squared) has diagonal form. Solution: Let x = [x1; x2; x3; x4], and let A be the corresponding symmetric matrix [3, 1, -5, 5; 1, 3, 5, -5; -5, 5, 3, 1; 5, -5, 1, 3]. We first want to find an orthogonal matrix P that orthogonally diagonalizes A. We find, with a little effort, that taking P = [1/2, 1/2, 1/2, 1/2; -1/2, -1/2, 1/2, 1/2; -1/2, 1/2, 1/2, -1/2; 1/2, -1/2, 1/2, -1/2] gives (P-transpose)AP equal to the diagonal matrix (12, -8, 4, 4).
Thus, according to the theorem, we take y = [y1; y2; y3; y4] to be equal to (P-transpose)x, which is [(1/2)x1 – (1/2)x2 – (1/2)x3 + (1/2)x4; (1/2)x1 – (1/2)x2 + (1/2)x3 – (1/2)x4; (1/2)x1 + (1/2)x2 + (1/2)x3 + (1/2)x4; (1/2)x1 + (1/2)x2 – (1/2)x3 – (1/2)x4], which gives Q(x1, x2, x3, x4) = 12(y1-squared) – 8(y2-squared) + 4(y3-squared) + 4(y4-squared). Notice that this diagonal form is much nicer than the original equation for Q(x1, x2, x3, x4).
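Rather than expanding everything by hand, one way to gain confidence in this answer is a quick machine check. The sketch below uses exact fractions to confirm that (P-transpose)AP is the diagonal matrix (12, -8, 4, 4), and that the original and diagonal forms agree at one test vector. The helper names and the test vector are arbitrary choices for illustration, and a single test point is a sanity check rather than a proof.

```python
from fractions import Fraction

half = Fraction(1, 2)

# A and P exactly as in the example above.
A = [[3, 1, -5, 5], [1, 3, 5, -5], [-5, 5, 3, 1], [5, -5, 1, 3]]
P = [[half * s for s in row] for row in
     [[1, 1, 1, 1], [-1, -1, 1, 1], [-1, 1, 1, -1], [1, -1, 1, -1]]]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of X as a list of rows."""
    return [list(row) for row in zip(*X)]

Pt = transpose(P)
D = matmul(matmul(Pt, A), P)   # should be diagonal with 12, -8, 4, 4
PtP = matmul(Pt, P)            # should be the identity, since P is orthogonal

# Compare the two expressions for Q at an arbitrary test vector.
x = [1, 0, 2, -1]
y = [sum(Pt[i][k] * x[k] for k in range(4)) for i in range(4)]  # y = P^T x
q_original = sum(x[i] * A[i][j] * x[j] for i in range(4) for j in range(4))
q_diagonal = 12 * y[0]**2 - 8 * y[1]**2 + 4 * y[2]**2 + 4 * y[3]**2

print(q_original == q_diagonal)  # True
```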
As usual, there were a lot of calculations that went into this, and so it would be good to check our answer. How could you check your answer? Simple. We can use our formulas for y1, y2, y3, and y4 given by y = (P-transpose)x, and then just substitute each of those in for y1, y2, y3, and y4 in our diagonal form, and simplify, and we should get the original expression for Q(x1, x2, x3, x4). Okay, in this case, I don’t actually recommend checking by hand. However, thinking about this should help you understand the theory.
A few notes about all of this:
- The orthonormal eigenvectors we used to make up P are called the principal axes of A, which is where the Principal Axis Theorem gets its name. We will soon see why they are called this, along with the geometric interpretation in R2 and R3.
- By changing the order of the eigenvectors in P, we also change the order of the eigenvalues in D, and hence the coefficients of the corresponding yi. For example, if we took P = [1/2, 1/2, 1/2, 1/2; -1/2, 1/2, 1/2, -1/2; 1/2, 1/2, -1/2, -1/2; -1/2, 1/2, -1/2, 1/2], then we would get Q(x1, x2, x3, x4) = -8(y1-squared) + 4(y2-squared) + 4(y3-squared) + 12(y4-squared). Notice that since we can pick any vector y in R4, this does, in fact, give us exactly the same set of values as our choice above. Alternatively, you could think of just doing another change of variables z1 = y2, z2 = y3, z3 = y4, and z4 = y1. We will see the geometric interpretation of doing this as well.
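The effect of reordering the eigenvectors can be sketched in Python as follows. The list order and the helper names are assumptions made for this illustration. The columns of the original P are rearranged to give the second P above, and the eigenvalues on the diagonal move in the same way.

```python
from fractions import Fraction

half = Fraction(1, 2)

A = [[3, 1, -5, 5], [1, 3, 5, -5], [-5, 5, 3, 1], [5, -5, 1, 3]]
P = [[half * s for s in row] for row in
     [[1, 1, 1, 1], [-1, -1, 1, 1], [-1, 1, 1, -1], [1, -1, 1, -1]]]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of X as a list of rows."""
    return [list(row) for row in zip(*X)]

# New column order (0-indexed): old columns 2, 3, 4, then old column 1.
order = [1, 2, 3, 0]
P2 = [[row[k] for k in order] for row in P]

D2 = matmul(matmul(transpose(P2), A), P2)
diagonal = [D2[i][i] for i in range(4)]
print(diagonal == [-8, 4, 4, 12])  # True

# The new variables are the old ones relabelled: z1 = y2, z2 = y3, z3 = y4, z4 = y1.
x = [1, 0, 2, -1]  # arbitrary test vector
y = [sum(P[k][i] * x[k] for k in range(4)) for i in range(4)]    # y = P^T x
z = [sum(P2[k][i] * x[k] for k in range(4)) for i in range(4)]   # z = P2^T x
print(z == [y[1], y[2], y[3], y[0]])  # True
```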