Lesson: Maximizing and Minimizing Quadratic Forms

Transcript — Introduction

In some applications of quadratic forms, we want to find the maximum and/or minimum of the quadratic form subject to a constraint. Most of the time, it is possible to use a change of variables so that the constraint becomes that the length of the vector x equals 1. That is, given a quadratic form Q(x) = (x-transpose)Ax on Rn, we want to find the maximum and minimum values of Q(x) subject to the length of x equals 1. For ease, observe that we can rewrite the constraint as 1 = x1-squared + up to xn-squared.

Examples

How are we going to find the maximum and minimum? Let’s see if we can figure out how this works by first looking at a simple example.

Example: Find the maximum and minimum of Q(x1, x2) = 2(x1-squared) + 3(x2-squared) subject to the constraint that the length of x is equal to 1. Solution: Let's begin with the simplest approach: evaluating Q at points (x1, x2) that satisfy the constraint. Trying some points that lie on the unit circle, we find that Q(1, 0) = 2, Q(1/(root 2), 1/(root 2)) = 5/2, Q(1/2, (root 3)/2) = 11/4, and Q(0, 1) = 3.

Although we have only tried a few points, we may guess that 3 is the maximum value. How do we prove that it is the maximum value? Since we already know that Q(0, 1) = 3, we just need to show that all other values of Q(x) such that x satisfies the constraint must be less than or equal to 3. That is, we need to show that 3 is also an upper bound for Q. We have that Q(x1, x2) = 2(x1-squared) + 3(x2-squared). Adding the nonnegative quantity x1-squared cannot decrease the value, so Q(x1, x2) is less than or equal to 3(x1-squared) + 3(x2-squared), which is 3(x1-squared + x2-squared). But this equals 3, since (the length of x)-squared equals 1 implies x1-squared + x2-squared = 1. Hence, 3 is an upper bound for Q(x1, x2), and Q(0, 1) = 3, so 3 is the maximum value of Q.

Similarly, we have that Q(x) = 2(x1-squared) + 3(x2-squared), which is always greater than or equal to 2(x1-squared) + 2(x2-squared), which is equal to 2. And so 2 is a lower bound for Q, and Q(1, 0) = 2, so the minimum value of Q is 2.
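The bounds derived above can be checked numerically. The following sketch (assuming NumPy is available; this code is not part of the lecture) evaluates Q at many points on the unit circle and confirms the values stay between 2 and 3:

```python
import numpy as np

# Q(x1, x2) = 2*x1^2 + 3*x2^2 from the example above.
def Q(x1, x2):
    return 2 * x1**2 + 3 * x2**2

# Parametrize the constraint ||x|| = 1 by x = (cos t, sin t).
t = np.linspace(0.0, 2.0 * np.pi, 1000)
values = Q(np.cos(t), np.sin(t))

# The sampled values approach the minimum 2 and the maximum 3.
print(values.min(), values.max())
```

The parametrization by cosine and sine guarantees every sampled point satisfies the constraint exactly, so no value can fall outside the proven bounds.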

Now, let’s see if we can use what we did here to figure out how to solve a harder problem. Example: Find the maximum and minimum of the quadratic form Q(x1, x2, x3) = 4(x1-squared) + 4x1x2 + 4x1x3 + 4(x2-squared) + 4x2x3 + 4(x3-squared) subject to the constraint x1-squared + x2-squared + x3-squared = 1. The symmetric matrix corresponding to Q is A = [4, 2, 2; 2, 4, 2; 2, 2, 4]. We can find that the eigenvalues of A are lambda1 = 2, lambda2 = 2, and lambda3 = 8. Thus, by Theorem 10.3.1, there exists an orthogonal matrix P = [v1 v2 v3] such that the change of variables y = (P-transpose)x brings Q into diagonal form Q(x) = 2(y1-squared) + 2(y2-squared) + 8(y3-squared).
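The eigenvalue computation can be reproduced with a short NumPy sketch (the choice of np.linalg.eigh here is ours, not the lecture's; it is suited to symmetric matrices and returns orthonormal eigenvectors):

```python
import numpy as np

# The symmetric matrix corresponding to Q from the example above.
A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])

# eigh returns eigenvalues in ascending order together with an
# orthonormal set of eigenvectors (the columns of P).
eigenvalues, P = np.linalg.eigh(A)
print(eigenvalues)  # approximately [2., 2., 8.]

# P is orthogonal, and P^T A P is the diagonal matrix of eigenvalues,
# matching the diagonal form 2(y1^2) + 2(y2^2) + 8(y3^2).
D = P.T @ A @ P
```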

Let’s first show that 8 is an upper bound for Q. We have Q(x) is less than or equal to 8(y1-squared) + 8(y2-squared) + 8(y3-squared). Factoring out the 8, this is equal to 8(y1-squared + y2-squared + y3-squared). Argh! In the first example, we used the constraint to get that x1-squared + x2-squared = 1, but now this is in terms of y, while our constraint is still in terms of x. Are we stuck? Of course not, because math is awesome. Using the change of variables and the fact that P is orthogonal, we have that the length of y is equal to the length of ((P-transpose)x), which is just equal to the length of x, which is 1. Thus, we do have y1-squared + y2-squared + y3-squared = 1, and so Q(x) is less than or equal to 8 for all x satisfying the constraint. Hence, 8 is an upper bound for Q(x). However, we do still need to determine if (and where) this value is achieved. Clearly, taking y = [0; 0; 1] gives Q the value 8, and so 8 is indeed the maximum.

Similarly, we see that Q(x) is greater than or equal to 2(y1-squared) + 2(y2-squared) + 2(y3-squared), which equals 2. Thus, a lower bound of Q(x) is 2, and this value is achieved when y = [1; 0; 0], so 2 is the minimum.

Although we have finished the problem, the solution is not totally satisfactory. We want to know the value of x that gives the maximum value, and the value of x that gives the minimum value. We get the diagonal form of Q(x) by applying the change of variables y = (P-transpose)x, which we can rearrange to get x = Py. Thus, since y = [0; 0; 1] gives the maximum value of Q(x), we have that x = P[0; 0; 1], which is the third column of P, which is v3, gives the maximum value. But wait—since P is an orthogonal matrix that diagonalizes A, v3 is a unit eigenvector of A corresponding to the eigenvalue 8. Repeating this procedure, we see that the minimum, 2, is obtained when x is a unit eigenvector of A corresponding to the eigenvalue 2.
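To illustrate this conclusion numerically, here is a NumPy sketch (again our own check, not part of the lecture) that evaluates Q directly at unit eigenvectors of A; the extreme values of Q are exactly the corresponding eigenvalues:

```python
import numpy as np

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 4.0, 2.0],
              [2.0, 2.0, 4.0]])

# Eigenvalues in ascending order: [2, 2, 8]; columns of P are
# unit eigenvectors, so the last column corresponds to 8.
eigenvalues, P = np.linalg.eigh(A)
v_min, v_max = P[:, 0], P[:, 2]

def Q(x):
    return x @ A @ x  # x-transpose A x

print(Q(v_max))  # approximately 8, the maximum
print(Q(v_min))  # approximately 2, the minimum
```

Both v_min and v_max have length 1, so they satisfy the constraint, and Q evaluated at each one recovers the corresponding eigenvalue.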

Theorem 10.5.1

Thus, the maximum and minimum occur at eigenvectors. We can use what we have derived here to prove the following theorem. Theorem 10.5.1: If Q(x) = (x-transpose)Ax is a quadratic form with symmetric matrix A, then the maximum of Q(x) subject to the length of x equals 1 is the largest eigenvalue of A, and the minimum of Q(x) subject to the length of x equals 1 is the smallest eigenvalue of A. Moreover, the maximum and minimum occur at unit eigenvectors of A corresponding to the respective eigenvalues.

Proof: Since A is symmetric, by the Principal Axis Theorem we can find an orthogonal matrix P = [v1 to vn] such that (P-transpose)AP is the diagonal matrix with diagonal entries lambda1 to lambda_n, where the eigenvalues lambda1 to lambda_n are arranged from greatest to least. Using Theorem 10.3.1, we get that the change of variables y = (P-transpose)x brings Q(x) into diagonal form Q(x) = (lambda1)(y1-squared) + up to (lambda_n)(yn-squared). Then, the length of y is equal to the length of ((P-transpose)x), which equals the length of x since P is orthogonal, which is 1.

We now see that Q(x) is less than or equal to (lambda1)(y1-squared) + up to (lambda1)(yn-squared), which is equal to lambda1, and Q(x) is greater than or equal to (lambda_n)(y1-squared) + up to (lambda_n)(yn-squared), which is equal to lambda_n. Moreover, we have that Q(x) is equal to lambda1 at y is equal to the first standard basis vector, which gives x = Pe1, which is v1, a unit eigenvector corresponding to lambda1. Similarly, Q(x) = lambda_n at y = en, which gives x = vn, a unit eigenvector corresponding to lambda_n. Therefore, the maximum of Q(x) is lambda1 and occurs at x = v1, and the minimum of Q(x) is lambda_n and occurs at x = vn.
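Theorem 10.5.1 can be spot-checked numerically. The sketch below (using a hypothetical 5-by-5 symmetric matrix of our own choosing, assuming NumPy) samples random unit vectors and confirms that every value of Q lies between the smallest and largest eigenvalues of A:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random symmetric matrix (a hypothetical example, not from the lecture).
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2

lam = np.linalg.eigvalsh(A)  # eigenvalues in ascending order
lam_min, lam_max = lam[0], lam[-1]

# Sample random unit vectors and evaluate Q(x) = x^T A x for each.
X = rng.standard_normal((10000, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)
values = np.einsum('ij,jk,ik->i', X, A, X)

# By Theorem 10.5.1, every value lies in [lam_min, lam_max].
print(lam_min <= values.min(), values.max() <= lam_max)  # True True
```

Random sampling only approaches the extremes; the theorem says the extremes are attained exactly at unit eigenvectors, which sampling is unlikely to hit.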

Notice that the vectors at which the maximum and minimum of Q(x) occur are not unique. For example, if the maximum occurs at a unit eigenvector v, then it also occurs at -v; and when an eigenvalue is repeated, as with the eigenvalue 2 in the example above, every unit eigenvector in that eigenspace gives the same value.

This ends this lecture.

© University of Waterloo and others, Powered by Maplesoft