Definition: A set $ S $ is said to be convex if, for any $ X, Y \in S $ and $ \lambda \in [0,1] $, we have $ \lambda X+(1-\lambda)Y \in S $.
--------------------
Definition: Let $ f: D \rightarrow \mathbb{R} $ be a real-valued function defined on a convex set $ D $. We say that $ f $ is convex if, for all $ X, Y \in D $ and $ \lambda \in [0, 1] $, we have
$ \lambda f(X) + (1-\lambda) f(Y) \ge f(\lambda X+(1-\lambda)Y) $.
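As a quick sanity check of this definition, here is a minimal Python sketch (the test function $ f(x) = x^2 $, the interval, and the tolerance are my own illustrative choices, not part of the text) that spot-checks the defining inequality at random points:

```python
import random

def looks_convex(f, lo, hi, trials=10_000):
    """Spot-check lam*f(x) + (1-lam)*f(y) >= f(lam*x + (1-lam)*y)
    at randomly sampled x, y in [lo, hi] and lam in [0, 1]."""
    for _ in range(trials):
        x, y = random.uniform(lo, hi), random.uniform(lo, hi)
        lam = random.uniform(0.0, 1.0)
        if lam * f(x) + (1 - lam) * f(y) < f(lam * x + (1 - lam) * y) - 1e-12:
            return False  # found a sampled violation of the inequality
    return True

print(looks_convex(lambda t: t * t, -10, 10))   # True: t^2 is convex
print(looks_convex(lambda t: -t * t, -10, 10))  # False: -t^2 is concave, so a violation turns up quickly
```

Of course, a True result here is only evidence, not a proof; a False result does exhibit a genuine violation at specific points.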
--------------------
Theorem: (Jensen's Inequality) Let $ f $ be a real-valued, convex function defined on a convex set $ D $. Given $ x_1, x_2, \ldots, x_n \in D $ and nonnegative reals $ w_1, w_2, \ldots, w_n $ such that $ w_1+w_2+\cdots+w_n = 1 $, we have
$ w_1f(x_1)+w_2f(x_2)+\cdots+w_nf(x_n) \ge f(w_1x_1+w_2x_2+\cdots+w_nx_n) $
or, more concisely,
$ \displaystyle \sum_{i=1}^n w_if(x_i) \ge f\left(\sum_{i=1}^n w_ix_i\right) $.
Proof: We proceed by induction on $ n $.
Base Case - For $ n = 1 $, the inequality reads $ f(x_1) \ge f(x_1) $, which is trivially true. For $ n = 2 $, it reads $ \lambda_1 f(x_1)+\lambda_2 f(x_2) \ge f(\lambda_1 x_1+\lambda_2 x_2) $ with $ \lambda_1+\lambda_2 = 1 $, which holds by the definition of convexity.
Induction Step - Suppose $ \displaystyle \sum_{i=1}^k w_if(x_i) \ge f\left(\sum_{i=1}^k w_ix_i\right) $. We wish to show
$ \displaystyle \sum_{i=1}^{k+1} u_if(x_i) \ge f\left(\sum_{i=1}^{k+1} u_ix_i\right) $
for nonnegative reals $ u_1, u_2, \ldots, u_{k+1} $ that sum to $ 1 $. If $ u_{k+1} = 1 $, then $ u_1 = u_2 = \cdots = u_k = 0 $ and the claim is immediate, so assume $ u_{k+1} < 1 $. Well,
$ \displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right) $
by the definition of convexity. But since $ \displaystyle \frac{1}{1-u_{k+1}} \sum_{i=1}^k u_i = \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} = 1 $, by our inductive hypothesis (the $ k $ term case) we must have
$ \displaystyle f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right) \le \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i) $.
Combining this into the above inequality, we obtain
$ \displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})\sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i) = \sum_{i=1}^{k+1} u_if(x_i) $
as desired, completing the induction. QED.
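To see the inequality in action, here is a small Python check (a sketch only; the convex functions, sample points, and weights below are illustrative choices, not from the text):

```python
import math

def jensen_gap(f, xs, ws):
    """Return sum_i w_i f(x_i) - f(sum_i w_i x_i), which Jensen's
    Inequality says is nonnegative when f is convex and the w_i are
    nonnegative and sum to 1."""
    assert all(w >= 0 for w in ws) and abs(sum(ws) - 1.0) < 1e-9
    lhs = sum(w * f(x) for x, w in zip(xs, ws))
    rhs = f(sum(w * x for x, w in zip(xs, ws)))
    return lhs - rhs

xs = [0.5, 1.0, 2.0, 4.0]
ws = [0.1, 0.2, 0.3, 0.4]
print(jensen_gap(math.exp, xs, ws) >= 0)         # True: exp is convex
print(jensen_gap(lambda t: t * t, xs, ws) >= 0)  # True: t^2 is convex
```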
--------------------
Definition: The convex hull of a set $ S $ is defined to be the smallest convex set containing $ S $. It is the set of all points that can be written as $ \displaystyle \sum_{i=1}^n a_ix_i $, where $ n $ is a natural number, $ a_1, a_2, \ldots, a_n $ are nonnegative reals that sum to $ 1 $, and $ x_1, x_2, \ldots, x_n \in S $.
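One can sample points of the convex hull directly from this characterization; here is a minimal Python sketch (the point set $ S $ below is a made-up example):

```python
import random

def random_convex_combination(points):
    """Return one point of the convex hull of `points`, formed as
    sum_i a_i x_i with nonnegative weights a_i summing to 1."""
    raw = [random.random() for _ in points]
    total = sum(raw)
    weights = [r / total for r in raw]  # nonnegative, sums to 1
    dim = len(points[0])
    return tuple(sum(a * p[j] for a, p in zip(weights, points)) for j in range(dim))

S = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # convex hull is the triangle with these corners
print(random_convex_combination(S))       # a point inside (or on) that triangle
```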
--------------------
Definition: A vertex of a set $ D $ is a point $ v \in D $ such that for all $ x \in D $ not equal to $ v $ and $ \lambda > 1 $ we have $ \lambda v+(1-\lambda)x \not\in D $.
--------------------
Theorem: Let $ V_D = \{v_1, v_2, \ldots, v_k\} $ be the set of vertices of a compact convex set $ D $. The convex hull of $ V_D $ is $ D $.
Example: For a convex polygon, the vertices in the sense of this definition are exactly its vertices as traditionally defined in geometry. Any point inside the polygon can be written as a convex combination of its vertices.
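If one wants such a vertex set numerically, SciPy's `scipy.spatial.ConvexHull` can be used (this sketch assumes NumPy and SciPy are available; the point cloud is a made-up example). Its `vertices` attribute holds the indices of the input points that are vertices of the hull:

```python
import numpy as np
from scipy.spatial import ConvexHull

# A cloud of 2-D points; the interior points are not vertices of the hull.
points = np.array([[0, 0], [2, 0], [2, 2], [0, 2], [1, 1], [0.5, 1.5]])
hull = ConvexHull(points)
print(points[hull.vertices])  # the polygon's vertices (counterclockwise in 2-D)
```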
--------------------
Theorem: If $ f $ is a real-valued, convex function defined on a compact convex set $ D $, then $ \displaystyle \max_{x \in D} f(x) = \max_{v \in V_D} f(v) $, where $ V_D $ is the set of vertices of $ D $.
Proof: We will show that $ \displaystyle f(x) \le \max_{v \in V_D} f(v) $ for all $ x \in D $.
Let $ \displaystyle x = \sum_{i=1}^k \lambda_i v_i $ where $ \lambda_1, \lambda_2, \ldots, \lambda_k $ are nonnegative reals that sum to $ 1 $ and $ V_D = \{v_1, v_2, \ldots, v_k\} $. This is possible by the preceding theorem. Then by Jensen's Inequality we get
$ \displaystyle \sum_{i=1}^k \lambda_i f(v_i) \ge f\left(\sum_{i=1}^k \lambda_i v_i\right) = f(x) $.
And since $ \displaystyle \max_{v \in V_D} f(v) \ge \sum_{i=1}^k \lambda_i f(v_i) $, we have $ \displaystyle \max_{v \in V_D} f(v) \ge f(x) $ for all $ x \in D $. Furthermore, since $ V_D $ is a subset of $ D $, we have $ \displaystyle \max_{x \in D} f(x) \ge \max_{v \in V_D} f(v) $, thus
$ \displaystyle \max_{x \in D} f(x) = \max_{v \in V_D} f(v) $
as desired. QED.
--------------------
Comment: This is a very useful result, as it allows us to limit our search for the maximum of a function to its set of vertices, which is usually a considerably smaller set, though it may still be infinite (think of a closed ball, whose vertices make up its entire bounding sphere). In any case, this works especially well for constrained optimization problems in which the domain is a polygon, the easiest application of this theorem; a sketch of that approach follows below.
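As a concrete sketch of that approach (the polygon and the objective below are illustrative choices, not from the post): to maximize a convex $ f $ over a polygon, it suffices to evaluate $ f $ at the polygon's vertices.

```python
def max_over_vertices(f, vertices):
    """Maximum of f over the listed vertices; by the theorem above this
    equals the maximum of a convex f over the whole polygon."""
    return max(f(*v) for v in vertices)

# Example: maximize the convex function f(x, y) = (x - 2)^2 + y^2
# over the unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(max_over_vertices(lambda x, y: (x - 2) ** 2 + y ** 2, square))  # 5, attained at (0, 1)
```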
--------------------
Practice Problem: Let $ x $ and $ y $ be real numbers satisfying $ |2x-y| \le 3 $ and $ |y-3x| \le 1 $. Find the maximum value of $ f(x, y) = (x-y)^2 $.
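If you want to sanity-check a hand solution, here is how the theorem applies (a sketch; run it only after trying the problem yourself): the feasible region is the parallelogram cut out by the four lines $ 2x - y = \pm 3 $ and $ y - 3x = \pm 1 $, its vertices are the pairwise intersections of one line from each family, and $ f(x, y) = (x - y)^2 $ is convex, so the maximum occurs at one of those vertices.

```python
from itertools import product

import numpy as np

# Boundary lines of |2x - y| <= 3 and |y - 3x| <= 1, written as a*x + b*y = c.
family1 = [(2, -1, 3), (2, -1, -3)]   # 2x - y = +3 and 2x - y = -3
family2 = [(-3, 1, 1), (-3, 1, -1)]   # y - 3x = +1 and y - 3x = -1

def f(x, y):
    return (x - y) ** 2

values = []
for (a1, b1, c1), (a2, b2, c2) in product(family1, family2):
    # Each vertex of the parallelogram is the intersection of one line
    # from each family (all four such intersections are feasible).
    x, y = np.linalg.solve([[a1, b1], [a2, b2]], [c1, c2])
    values.append(f(x, y))

print(max(values))  # the maximum of (x - y)^2 over the feasible region
```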