Saturday, May 12, 2007

Bigger Means Better. Topic: Algebra/Inequalities/Sets. Level: Olympiad.

Definition: A set $S$ is said to be convex if, for any $X, Y \in S$ and $\lambda \in [0,1]$, we have $\lambda X+(1-\lambda)Y \in S$.

--------------------

Definition: Let $f: D \rightarrow \mathbb{R}$ be a real-valued function defined on a convex set $D$. We say that $f$ is convex if, for all $X, Y \in D$ and $\lambda \in [0, 1]$, we have

$\lambda f(X) + (1-\lambda) f(Y) \ge f(\lambda X+(1-\lambda)Y)$.

--------------------

Theorem: (Jensen's Inequality) Let $f$ be a real-valued, convex function defined on a convex set $D$. Given $x_1, x_2, \ldots, x_n \in D$ and nonnegative reals $w_1, w_2, \ldots, w_n$ such that $w_1+w_2+\cdots+w_n = 1$, we have

$w_1f(x_1)+w_2f(x_2)+\cdots+w_nf(x_n) \ge f(w_1x_1+w_2x_2+\cdots+w_nx_n)$

or, more concisely,

$\displaystyle \sum_{i=1}^n w_if(x_i) \ge f\left(\sum_{i=1}^n w_ix_i\right)$.

Proof: We proceed by induction on $n$.

Base Case - $n = 1$, we have $f(x_1) \ge f(x_1)$ which is trivially true. $n = 2$, we have $\lambda_1 f(x_1)+\lambda_2 f(x_2) \ge f(\lambda_1 x_1+\lambda_2 x_2)$ with $\lambda_1+\lambda_2 = 1$ which is true by the definition of convexity.

Induction Step - Suppose $\displaystyle \sum_{i=1}^k w_if(x_i) \ge f\left(\sum_{i=1}^k w_ix_i\right)$. We wish to show

$\displaystyle \sum_{i=1}^{k+1} u_if(x_i) \ge f\left(\sum_{i=1}^{k+1} u_ix_i\right)$

for nonnegative reals $u_1, u_2, \ldots, u_{k+1}$ that sum to $1$. Well,

$\displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right)$

by the definition of convexity. But since $\displaystyle \frac{1}{1-u_{k+1}} \sum_{i=1}^k u_i = \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} = 1$, by our inductive hypothesis (the $k$ term case) we must have

$\displaystyle f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right) \le \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i)$.

Combining this into the above inequality, we obtain

$\displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})\sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i) = \sum_{i=1}^{k+1} u_if(x_i)$

as desired, completing the induction. QED.

--------------------

Definition: The convex hull of a set $S$ is defined to be the smallest convex set containing $S$. It is the set of all points that can be written as $\displaystyle \sum_{i=1}^n a_ix_i$, where $n$ is a natural number, $a_1, a_2, \ldots, a_n$ are nonnegative reals that sum to $1$, and $x_1, x_2, \ldots, x_n \in S$.

--------------------

Definition: A vertex of a set $D$ is a point $v \in D$ such that for all $x \in D$ not equal to $v$ and $\lambda > 1$ we have $\lambda v+(1-\lambda)x \not\in D$.

--------------------

Theorem: Let $V_D = \{v_1, v_2, \ldots, v_k\}$ be the set of vertices of a compact convex set $D$. The convex hull of $V_D$ is $D$.

Example: The set of vertices of a convex polygon is, in fact, its vertices as traditionally defined in geometry. Any point inside the polygon can be written as a convex combination of its vertices.

--------------------

Theorem: If $f$ is a real-valued, convex function defined on a compact convex set $D$, then $\displaystyle \max_{x \in D} f(x) = \max_{x \in V_D} f(x)$, where $V_D$ is the set of vertices of $D$.

Proof: We will show that $\displaystyle f(x) \le \max_{x \in V_D} f(x)$ for all $x \in D$.

Let $\displaystyle x = \sum_{i=1}^k \lambda_i v_i$ where $\lambda_1, \lambda_2, \ldots, \lambda_k$ are nonnegative reals that sum to $1$ and $V_D = \{v_1, v_2, \ldots, v_k\}$. This is possible by the preceding theorem. Then by Jensen's Inequality we get

$\displaystyle \sum_{i=1}^k \lambda_i f(v_i) \ge f\left(\sum_{i=1}^k \lambda_i v_i\right) = f(x)$.

And since $\displaystyle \max_{x \in V_D} f(x) \ge \sum_{i=1}^k \lambda_i f(v_i)$, we have $\displaystyle \max_{x \in V_D} f(x) \ge f(x)$ for all $x \in D$. Furthermore, since $V_D$ is a subset of $D$, we know that this maximum is attained, thus

$\displaystyle \max_{x \in D} f(x) = \max_{x \in V_D} f(x)$

as desired. QED.

--------------------

Comment: This is a very useful result, as it allows us to limit our search for the maximum of a function to its set of vertices, which is usually a considerably smaller set, though it may still be infinite (think sphere). In any case, this works well for constrained optimization problems in which the domain is limited to a polygon, the easiest application of this theorem.

--------------------

Practice Problem: Let $x$ and $y$ be real numbers satisfying $|2x-y| \le 3$ and $|y-3x| \le 1$. Find the maximum value of $f(x, y) = (x-y)^2$.