**Definition**: A set $ S $ is said to be

*convex* if, for any $ X, Y \in S $ and $ \lambda \in [0,1] $, we have $ \lambda X+(1-\lambda)Y \in S $.

--------------------

**Definition**: Let $ f: D \rightarrow \mathbb{R} $ be a real-valued function defined on a convex set $ D $. We say that $ f $ is

*convex* if, for all $ X, Y \in D $ and $ \lambda \in [0, 1] $, we have

$ \lambda f(X) + (1-\lambda) f(Y) \ge f(\lambda X+(1-\lambda)Y) $.

--------------------

**Theorem**: (Jensen's Inequality) Let $ f $ be a real-valued, convex function defined on a convex set $ D $. Given $ x_1, x_2, \ldots, x_n \in D $ and nonnegative reals $ w_1, w_2, \ldots, w_n $ such that $ w_1+w_2+\cdots+w_n = 1 $, we have

$ w_1f(x_1)+w_2f(x_2)+\cdots+w_nf(x_n) \ge f(w_1x_1+w_2x_2+\cdots+w_nx_n) $

or, more concisely,

$ \displaystyle \sum_{i=1}^n w_if(x_i) \ge f\left(\sum_{i=1}^n w_ix_i\right) $.

**Proof**: We proceed by induction on $ n $.

Base Case - $ n = 1 $, we have $ f(x_1) \ge f(x_1) $ which is trivially true. $ n = 2 $, we have $ \lambda_1 f(x_1)+\lambda_2 f(x_2) \ge f(\lambda_1 x_1+\lambda_2 x_2) $ with $ \lambda_1+\lambda_2 = 1$ which is true by the definition of convexity.

Induction Step - Suppose $ \displaystyle \sum_{i=1}^k w_if(x_i) \ge f\left(\sum_{i=1}^k w_ix_i\right) $. We wish to show

$ \displaystyle \sum_{i=1}^{k+1} u_if(x_i) \ge f\left(\sum_{i=1}^{k+1} u_ix_i\right) $

for nonnegative reals $ u_1, u_2, \ldots, u_{k+1} $ that sum to $ 1 $. Well,

$ \displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right) $

by the definition of convexity. But since $ \displaystyle \frac{1}{1-u_{k+1}} \sum_{i=1}^k u_i = \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} = 1 $, by our inductive hypothesis (the $ k $ term case) we must have

$ \displaystyle f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right) \le \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i) $.

Combining this into the above inequality, we obtain

$ \displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})\sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i) = \sum_{i=1}^{k+1} u_if(x_i) $

as desired, completing the induction. QED.

--------------------

**Definition**: The

*convex hull* of a set $ S $ is defined to be the smallest convex set containing $ S $. It is the set of all points that can be written as $ \displaystyle \sum_{i=1}^n a_ix_i $, where $ n $ is a natural number, $ a_1, a_2, \ldots, a_n $ are nonnegative reals that sum to $ 1 $, and $ x_1, x_2, \ldots, x_n \in S $.

--------------------

**Definition**: A

*vertex* of a set $ D $ is a point $ v \in D $ such that for all $ x \in D $ not equal to $ v $ and $ \lambda > 1 $ we have $ \lambda v+(1-\lambda)x \not\in D $.

--------------------

**Theorem**: Let $ V_D = \{v_1, v_2, \ldots, v_k\} $ be the set of vertices of a compact convex set $ D $. The convex hull of $ V_D $ is $ D $.

**Example**: The set of vertices of a convex polygon is, in fact, its vertices as traditionally defined in geometry. Any point inside the polygon can be written as a convex combination of its vertices.

--------------------

**Theorem**: If $ f $ is a real-valued, convex function defined on a compact convex set $ D $, then $ \displaystyle \max_{x \in D} f(x) = \max_{x \in V_D} f(x) $, where $ V_D $ is the set of vertices of $ D $.

**Proof**: We will show that $ \displaystyle f(x) \le \max_{x \in V_D} f(x) $ for all $ x \in D $.

Let $ \displaystyle x = \sum_{i=1}^k \lambda_i v_i $ where $ \lambda_1, \lambda_2, \ldots, \lambda_k $ are nonnegative reals that sum to $ 1 $ and $ V_D = \{v_1, v_2, \ldots, v_k\} $. This is possible by the preceding theorem. Then by Jensen's Inequality we get

$ \displaystyle \sum_{i=1}^k \lambda_i f(v_i) \ge f\left(\sum_{i=1}^k \lambda_i v_i\right) = f(x) $.

And since $ \displaystyle \max_{x \in V_D} f(x) \ge \sum_{i=1}^k \lambda_i f(v_i) $, we have $ \displaystyle \max_{x \in V_D} f(x) \ge f(x) $ for all $ x \in D $. Furthermore, since $ V_D $ is a subset of $ D $, we know that this maximum is attained, thus

$ \displaystyle \max_{x \in D} f(x) = \max_{x \in V_D} f(x) $

as desired. QED.

--------------------

Comment: This is a very useful result, as it allows us to limit our search for the maximum of a function to its set of vertices, which is usually a considerably smaller set, though it may still be infinite (think sphere). In any case, this works well for constrained optimization problems in which the domain is limited to a polygon, the easiest application of this theorem.

--------------------

Practice Problem: Let $ x $ and $ y $ be real numbers satisfying $ |2x-y| \le 3 $ and $ |y-3x| \le 1 $. Find the maximum value of $ f(x, y) = (x-y)^2 $.