Saturday, July 7, 2007
Problem: Given two positive reals $ \alpha $ and $ \beta $, show that there is a continuous function $ f $ that satisfies $ f(x) = f(x+\alpha)+f(x+\beta) $ for all real $ x $.
Solution: There are several special cases that are interesting to look at before we make a guess as to what type of function $ f $ will be. First we consider the case $ \alpha = \beta = 1 $. This immediately gives
$ f(x) = 2f(x+1) $.
It should not be too difficult to guess that $ f(x) = 2^{-x} $ is a solution to this, as is any constant multiple of it. Now, relaxing the positivity requirement for a moment just to build intuition, try $ \alpha = -1 $ and $ \beta = -2 $, resulting in
$ f(x) = f(x-1)+f(x-2) $.
Looks a lot like Fibonacci, right? In fact, one possible function is just $ f(x) = \phi^x $, where $ \phi = \frac{1+\sqrt{5}}{2} $ is the golden ratio, since $ \phi^2 = \phi+1 $. That's pretty convenient.
Notice how both of these illuminating examples are exponential functions, which leads us to guess that our function will be exponential as well. So, following this track, we set
$ f(x) = a^x $
so we simply need to solve
$ a^x = a^{x+\alpha}+a^{x+\beta} $
$ a^x = a^x\left(a^{\alpha}+a^{\beta}\right) $.
For arbitrary real exponents the equation $ a^{\alpha}+a^{\beta} = 1 $ need not have a solution (take $ \alpha = 1 $ and $ \beta = -1 $, for which $ a+\frac{1}{a} \ge 2 $ for all $ a > 0 $). But that example falls outside our hypothesis: when $ \alpha $ and $ \beta $ are both positive, $ g(a) = a^{\alpha}+a^{\beta} $ is continuous on $ (0, 1] $, tends to $ 0 $ as $ a \rightarrow 0^+ $, and equals $ 2 $ at $ a = 1 $, so by the Intermediate Value Theorem there is some $ a \in (0, 1) $ with $ a^{\alpha}+a^{\beta} = 1 $. For that value of $ a $, the continuous function $ f(x) = a^x $ satisfies the given equation, as desired.
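As a quick numerical aside (my own sketch, not part of the argument above), one can locate the base $ a \in (0, 1) $ by bisection, since $ g(a) = a^{\alpha}+a^{\beta} $ increases from $ 0 $ to $ 2 $ on $ (0, 1] $. In Python, for instance (the function name and sample values are made up for illustration):

# Sketch: find a in (0, 1) with a**alpha + a**beta == 1 by bisection.
# Works for positive alpha and beta, where g(a) = a**alpha + a**beta is increasing.
def solve_base(alpha, beta, tol=1e-12):
    lo, hi = 1e-12, 1.0              # g(lo) is near 0 and g(1) = 2, so a root lies between
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid**alpha + mid**beta > 1.0:
            hi = mid                 # g(mid) > 1: the root is below mid
        else:
            lo = mid                 # g(mid) <= 1: the root is above mid
    return 0.5 * (lo + hi)

print(solve_base(1.0, 1.0))          # alpha = beta = 1 gives a = 0.5, i.e. f(x) = 2**(-x)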
Friday, June 29, 2007
Addition At Its Finest. Topic: Calculus/S&S.
Problem: Evaluate $ \displaystyle \sum_{n=1}^{\infty} \frac{x^n}{n(n+1)} $ where $ x $ is a real number with $ |x| < 1 $.
Solution: Looking at that all too common denominator, we do a partial fraction decomposition in hopes of a telescoping series. The summation becomes
$ \displaystyle \sum_{n=1}^{\infty} \left(\frac{x^n}{n}-\frac{x^n}{n+1}\right) $.
Common Taylor series knowledge tells us that
$ \displaystyle \ln{(1-x)} = -\left(x+\frac{x^2}{2}+\frac{x^3}{3}+\cdots\right) = -\sum_{n=1}^{\infty} \frac{x^n}{n} $,
which conveniently fits the first part of the summation. As for the second part, we get
$ \displaystyle \sum_{n=1}^{\infty} \frac{x^n}{n+1} = \frac{1}{x} \sum_{n=1}^{\infty} \frac{x^{n+1}}{n+1} = \frac{-\ln{(1-x)}-x}{x} $
from the same Taylor series. Combining the results, our answer is then
$ \displaystyle \sum_{n=1}^{\infty} \frac{x^n}{n(n+1)} = 1-\ln{(1-x)}+\frac{\ln{(1-x)}}{x} $
for $ x \neq 0 $ (when $ x = 0 $ every term vanishes, so the sum is simply $ 0 $).
QED.
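As a quick sanity check (my own addition, not part of the solution), the closed form can be compared against partial sums numerically; the value of $ x $ below is arbitrary.

# Compare partial sums of sum_{n>=1} x**n / (n*(n+1)) with 1 - ln(1-x) + ln(1-x)/x.
import math

x = 0.7
partial = sum(x**n / (n * (n + 1)) for n in range(1, 200))
closed = 1 - math.log(1 - x) + math.log(1 - x) / x
print(partial, closed)               # both approximately 0.4840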
--------------------
Comment: Even though the trick at the beginning didn't actually get much to telescope, the idea certainly made it easier to recognize the Taylor series. Algebraic manipulations are nifty to carry around and can be applied in problems wherever you go.
--------------------
Practice Problem: Show that $ \displaystyle \int_0^{\frac{\pi}{2}} \ln{(\tan{x})}\, dx = 0 $.
Monday, June 18, 2007
The Smaller The Better. Topic: Calculus.
Problem: Given a complicated function $ f: \mathbb{R}^n \rightarrow \mathbb{R} $, find an approximate local minimum.
Solution: The adjective complicated is there only so that we may assume there is no easy way to solve $ \nabla f = 0 $ directly. We seek an algorithm that will lead us to a local minimum (and hopefully a global minimum as well).
We start at an arbitrary point $ X_0 = (x_1, x_2, \ldots, x_n) $. Consider the following process (for $ k = 0, 1, 2, \ldots $), known as gradient descent:
1. Calculate (approximately) $ \nabla f(X_k) $.
2. Set $ X_{k+1} = X_k - \gamma_k \nabla f(X_k) $, where $ \gamma_k > 0 $ is a step size that can be determined by a line search.
It is well known that the gradient points in the direction of maximum increase, so the direction opposite the gradient is the direction of maximum decrease. Hence this algorithm is based on the idea that we always move in the direction that locally decreases $ f $ the fastest. Sounds pretty good, right? Unfortunately, gradient descent can converge very slowly, especially on ill-conditioned problems. There are faster alternatives, though they are naturally more complex, such as the nonlinear conjugate gradient method or Newton's method; the latter requires computing the inverse of the Hessian matrix, which is a pain.
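Here is a minimal sketch of the iteration above (my own illustration): it uses a fixed step size $ \gamma $ instead of a line search, and the quadratic test function is made up for the example.

# Minimal gradient descent sketch with a fixed step size gamma (no line search).
import numpy as np

def gradient_descent(grad, x0, gamma=0.1, steps=1000, tol=1e-8):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)                      # (approximate) gradient at the current point
        if np.linalg.norm(g) < tol:      # stop once the gradient is essentially zero
            break
        x = x - gamma * g                # step in the direction of steepest descent
    return x

# Example: f(x, y) = (x - 3)**2 + 2*(y + 1)**2, whose minimum is at (3, -1).
grad_f = lambda p: np.array([2 * (p[0] - 3), 4 * (p[1] + 1)])
print(gradient_descent(grad_f, [0.0, 0.0]))   # approximately [ 3. -1.]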
Wednesday, June 6, 2007
Colorful! Topic: Calculus.
Theorem: (Green's Theorem) Let $ R $ be a simply connected plane region whose boundary is a simple, closed, piecewise smooth curve $ C $ oriented counterclockwise. If $ f(x, y) $ and $ g(x, y) $ are continuous and have continuous first partial derivatives on some open set containing $ R $, then
$ \displaystyle \oint_C f(x, y) dx + g(x, y) dy = \int_R \int \left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right) dA $.
--------------------
Problem: Evaluate $ \displaystyle \oint_C x^2y dx + (y+xy^2) dy $, where $ C $ is the boundary of the region enclosed by $ y = x^2 $ and $ x = y^2 $.
Solution: First, verify that this region satisfies all of the requirements for Green's Theorem - indeed, it does. So we may apply the theorem with $ f(x, y) = x^2y $ and $ g(x, y) = y+xy^2 $. From these, we have $ \frac{\partial g}{\partial x} = y^2 $ and $ \frac{\partial f}{\partial y} = x^2 $. Then we obtain
$ \displaystyle \oint_C x^2y dx + (y+xy^2) dy = \int_R \int (y^2-x^2) dA $.
But clearly this integral over the region $ R $ can be represented as $ \displaystyle \int_0^1 \int_{x^2}^{\sqrt{x}} (y^2-x^2) dy dx $, so it remains a matter of calculation to get the answer. First, we evaluate the inner integral to get
$ \displaystyle \int_0^1 \int_{x^2}^{\sqrt{x}} (y^2-x^2) dy dx = \int_0^1 \left[\frac{y^3}{3}-x^2y\right]_{x^2}^{\sqrt{x}} dx = \int_0^1 \left(\frac{x^{3/2}}{3}-x^{5/2}-\frac{x^6}{3}+x^4\right)dx $.
Then finally we have
$ \displaystyle \int_0^1 \left(\frac{x^{3/2}}{3}-x^{5/2}-\frac{x^6}{3}+x^4\right)dx = \left[\frac{2x^{5/2}}{15}-\frac{2x^{7/2}}{7}-\frac{x^7}{21}+\frac{x^5}{5}\right]_0^1 = \frac{2}{15}-\frac{2}{7}-\frac{1}{21}+\frac{1}{5} = 0 $.
QED.
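As an aside (my own check, not part of the solution), the double integral can be verified symbolically, assuming a computer algebra system such as sympy is available:

# Symbolic check that the double integral from Green's Theorem evaluates to zero.
from sympy import symbols, integrate, sqrt

x, y = symbols('x y')
print(integrate(y**2 - x**2, (y, x**2, sqrt(x)), (x, 0, 1)))   # prints 0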
--------------------
Comment: To me, Green's Theorem is a very interesting result. It's not at all obvious that a line integral along the boundary of a region is equivalent to an integral of some partial derivatives in the region itself. A simplified proof of the result can be obtained by proving that
$ \displaystyle \oint_C f(x, y) dx = -\int_R \int \frac{\partial f}{\partial y} dA $ and $ \displaystyle \oint_C g(x, y) dy = \int_R \int \frac{\partial g}{\partial x} dA $.
--------------------
Practice Problem: Let $ R $ be a plane region with area $ A $ whose boundary is a piecewise smooth simple closed curve $ C $. Show that the centroid $ (\overline{x}, \overline{y}) $ of $ R $ is given by
$ \displaystyle \overline{x} = \frac{1}{2A} \oint_C x^2 dy $ and $ \displaystyle \overline{y} = -\frac{1}{2A} \oint_C y^2 dx $.
Monday, June 4, 2007
More Integrals... *whine*. Topic: Calculus.
Definition: (Jacobian) If $ T $ is the transformation from the $ uv $-plane to the $ xy $-plane defined by the equations $ x = x(u, v) $ and $ y = y(u, v) $, then the Jacobian of $ T $ is denoted by $ J(u, v) $ or by $ \partial(x, y)/\partial(u, v) $ and is defined by
$ J(u, v) = \frac{\partial(x, y)}{\partial(u, v)} = \frac{\partial x}{\partial u} \cdot \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \cdot \frac{\partial x}{\partial v} $,
i.e. the determinant of the matrix of the partial derivatives (also known as the Jacobian matrix). Naturally, this can be generalized to more variables.
--------------------
Theorem: If the transformation $ x = x(u, v) $, $ y = y(u, v) $ maps the region $ S $ in the $ uv $-plane into the region $ R $ in the $ xy $-plane, and if the Jacobian $ \partial(x, y)/\partial(u, v) $ is nonzero and does not change sign on $ S $, then (with appropriate restrictions on the transformation and the regions) it follows that
$ \displaystyle \int_R \int f(x, y) dA_{xy} = \int_S \int f(x(u, v), y(u, v)) \left|\frac{\partial(x, y)}{\partial(u, v)} \right| dA_{uv} $.
--------------------
Problem: Evaluate $ \displaystyle \int_R \int e^{(y-x)/(y+x)} dA $, where $ R $ is the region in the first quadrant enclosed by the trapezoid with vertices $ (0, 1); (1, 0); (0, 4); (4, 0) $.
Solution: The bounding lines can be written as $ x = 0 $, $ y = 0 $, $ y = -x+1 $, and $ y = -x+4 $. Now consider the transformation $ u = y+x $ and $ v = y-x $. In the $ uv $-plane, the bounding lines of the new region $ S $ can now be written as $ u = 1 $, $ u = 4 $, $ v = u $, and $ v = -u $.
We can write $ x $ and $ y $ as functions of $ u $ and $ v $: simply $ x = \frac{u-v}{2} $ and $ y = \frac{u+v}{2} $. So the Jacobian $ \displaystyle \frac{\partial(x, y)}{\partial(u, v)} = \frac{\partial x}{\partial u} \cdot \frac{\partial y}{\partial v} - \frac{\partial y}{\partial u} \cdot \frac{\partial x}{\partial v} = \frac{1}{2} \cdot \frac{1}{2} - \frac{1}{2} \cdot \left(-\frac{1}{2} \right) = \frac{1}{2} $.
Then our original integral becomes $ \displaystyle \int_R \int e^{(y-x)/(y+x)} dA = \frac{1}{2} \int_S \int e^{v/u} dA $. And this is equivalent to
$ \displaystyle \frac{1}{2} \int_S \int e^{v/u} dA = \frac{1}{2} \int_1^4 \int_{-u}^u e^{v/u} dv du = \frac{1}{2} \int_1^4 \big[ u e^{v/u} \big]_{v=-u}^u du = \frac{1}{2} \int_1^4 u\left(e-\frac{1}{e}\right) du = \frac{15}{4}\left(e-\frac{1}{e}\right) $.
QED.
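As an aside (my own numerical check, not part of the solution), one can integrate the original integrand directly over the trapezoid and compare it with the closed form $ \frac{15}{4}\left(e-\frac{1}{e}\right) \approx 8.814 $; the snippet assumes scipy is available.

# Numerical check of the change of variables: integrate e^{(y-x)/(y+x)} over the trapezoid.
import numpy as np
from scipy.integrate import dblquad

value, error = dblquad(
    lambda y, x: np.exp((y - x) / (y + x)),   # integrand, inner variable first
    0, 4,                                     # x ranges over [0, 4]
    lambda x: max(0.0, 1 - x),                # lower boundary: y = 0 or y = 1 - x
    lambda x: 4 - x)                          # upper boundary: y = 4 - x
print(value, 15 / 4 * (np.e - 1 / np.e))      # both approximately 8.814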
--------------------
Comment: Note that the above theorem is central to multivariable calculus: it is the analogue of $ u $-substitution in one variable, which we all know is the ultimate integration technique. It works in the same way, giving you much more flexibility in both the function you are integrating and the region you are integrating over.
--------------------
Practice Problem: Evaluate $ \displaystyle \int_R \int (x^2-y^2) dA $, where $ R $ is the rectangular region enclosed by the lines $ y = -x $, $ y = 1-x $, $ y = x $, $ y = x+2 $.
Monday, May 28, 2007
One By One, We're Making It Fun. Topic: Calculus/S&S.
Theorem: (Stolz-Cesaro) Let $ \{a_n\} $ and $ \{b_n\} $ be sequences of real numbers such that $ \{b_n\} $ is positive, strictly increasing, and unbounded. If the limit
$ \displaystyle \lim_{n \rightarrow \infty} \frac{a_{n+1}-a_n}{b_{n+1}-b_n} = L $
exists, then the following limit also exists and we have
$ \displaystyle \lim_{n \rightarrow \infty} \frac{a_n}{b_n} = L $.
--------------------
Theorem: (Summation by Parts) If $ \{f_k\} $ and $ \{g_k\} $ are two sequences, then
$ \displaystyle \sum_{k=m}^n f_k(g_{k+1}-g_k) = [f_{n+1}g_{n+1}-f_mg_m]-\sum_{k=m}^n g_{k+1}(f_{k+1}-f_k) $.
--------------------
Problem: Let $ \{a_n\} $ be a sequence of real numbers such that $ \displaystyle \sum_{k=0}^{\infty} a_k $ converges. Show that $ \displaystyle \lim_{n \rightarrow \infty} \frac{1}{n+1} \sum_{k=0}^n k \cdot a_k = 0 $.
Solution: Define the sequence $ \{b_n\} $ by $ \displaystyle b_n = \sum_{k=0}^{n-1} a_k $ (so $ b_0 = 0 $ and $ b_{k+1}-b_k = a_k $) and let $ \displaystyle \lim_{n \rightarrow \infty} b_n = \sum_{k=0}^{\infty} a_k = L $. Then, by summation by parts with $ \{f_n\} = \{n\} $ and $ \{g_n\} = \{b_n\} $, we have
$ \displaystyle \sum_{k=0}^n k \cdot a_k = \sum_{k=0}^n k \cdot (b_{k+1}-b_k) = \left[(n+1)b_{n+1}-0 \cdot b_0\right]-\sum_{k=0}^n b_{k+1} = (n+1)b_{n+1}-\sum_{k=0}^n b_{k+1} $.
The summation we wish to take the limit of is then
$ \displaystyle \frac{1}{n+1} \sum_{k=0}^n k \cdot a_k = b_{n+1}-\frac{1}{n+1} \sum_{k=0}^n b_{k+1} $.
But since, by Stolz-Cesaro,
$ \displaystyle \lim_{n \rightarrow \infty} \frac{1}{n+1} \sum_{k=0}^n b_{k+1} = \lim_{n \rightarrow \infty} b_{n+1} = L $,
we obtain
$ \displaystyle \lim_{n \rightarrow \infty} \frac{1}{n+1} \sum_{k=0}^n k \cdot a_k = \lim_{n \rightarrow \infty} \left(b_{n+1}-\frac{1}{n+1} \sum_{k=0}^n b_{k+1}\right) = L-L = 0 $.
QED.
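As a quick numerical illustration (my own, not part of the solution): take the convergent series with $ a_k = 2^{-k} $, so $ \displaystyle \sum_{k=0}^n k \cdot a_k $ stays bounded and the averages shrink like $ \frac{1}{n+1} $.

# Averages (1/(n+1)) * sum_{k=0}^{n} k * a_k for a_k = 2**(-k); they tend to 0.
for n in (10, 100, 1000, 10000):
    avg = sum(k * 0.5**k for k in range(n + 1)) / (n + 1)
    print(n, avg)                    # shrinks roughly like 2 / (n + 1)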
--------------------
Comment: Summation by parts is a very useful technique for rearranging sums so that they are easier to evaluate. If you hadn't noticed, it is the discrete analogue of integration by parts and works in very much the same way. Stolz-Cesaro is powerful as well and is often described as a discrete analogue of L'Hôpital's rule. Applying well-known calculus ideas to discrete settings can often turn into neat results.
--------------------
Practice Problem: If $ \{a_n\} $ is a decreasing sequence such that $ \displaystyle \lim_{n \rightarrow \infty} a_n = 0 $, show that
$ \displaystyle \sum_{k=1}^{\infty} a_k \cdot \sin{(kx)} $
converges for all $ x $.
Saturday, May 12, 2007
Bigger Means Better. Topic: Algebra/Inequalities/Sets. Level: Olympiad.
Definition: A set $ S $ is said to be convex if, for any $ X, Y \in S $ and $ \lambda \in [0,1] $, we have $ \lambda X+(1-\lambda)Y \in S $.
--------------------
Definition: Let $ f: D \rightarrow \mathbb{R} $ be a real-valued function defined on a convex set $ D $. We say that $ f $ is convex if, for all $ X, Y \in D $ and $ \lambda \in [0, 1] $, we have
$ \lambda f(X) + (1-\lambda) f(Y) \ge f(\lambda X+(1-\lambda)Y) $.
--------------------
Theorem: (Jensen's Inequality) Let $ f $ be a real-valued, convex function defined on a convex set $ D $. Given $ x_1, x_2, \ldots, x_n \in D $ and nonnegative reals $ w_1, w_2, \ldots, w_n $ such that $ w_1+w_2+\cdots+w_n = 1 $, we have
$ w_1f(x_1)+w_2f(x_2)+\cdots+w_nf(x_n) \ge f(w_1x_1+w_2x_2+\cdots+w_nx_n) $
or, more concisely,
$ \displaystyle \sum_{i=1}^n w_if(x_i) \ge f\left(\sum_{i=1}^n w_ix_i\right) $.
Proof: We proceed by induction on $ n $.
Base Case - $ n = 1 $, we have $ f(x_1) \ge f(x_1) $ which is trivially true. $ n = 2 $, we have $ \lambda_1 f(x_1)+\lambda_2 f(x_2) \ge f(\lambda_1 x_1+\lambda_2 x_2) $ with $ \lambda_1+\lambda_2 = 1$ which is true by the definition of convexity.
Induction Step - Suppose $ \displaystyle \sum_{i=1}^k w_if(x_i) \ge f\left(\sum_{i=1}^k w_ix_i\right) $. We wish to show
$ \displaystyle \sum_{i=1}^{k+1} u_if(x_i) \ge f\left(\sum_{i=1}^{k+1} u_ix_i\right) $
for nonnegative reals $ u_1, u_2, \ldots, u_{k+1} $ that sum to $ 1 $. If $ u_{k+1} = 1 $, the claim is immediate (all the other $ u_i $ are zero), so assume $ u_{k+1} < 1 $. Well,
$ \displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right) $
by the definition of convexity. But since $ \displaystyle \frac{1}{1-u_{k+1}} \sum_{i=1}^k u_i = \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} = 1 $, by our inductive hypothesis (the $ k $ term case) we must have
$ \displaystyle f\left(\frac{1}{1-u_{k+1}} \sum_{i=1}^k u_ix_i\right) \le \sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i) $.
Combining this into the above inequality, we obtain
$ \displaystyle f\left(\sum_{i=1}^{k+1} u_ix_i\right) \le u_{k+1}f(x_{k+1})+(1-u_{k+1})\sum_{i=1}^k \frac{u_i}{1-u_{k+1}} f(x_i) = \sum_{i=1}^{k+1} u_if(x_i) $
as desired, completing the induction. QED.
--------------------
Definition: The convex hull of a set $ S $ is defined to be the smallest convex set containing $ S $. It is the set of all points that can be written as $ \displaystyle \sum_{i=1}^n a_ix_i $, where $ n $ is a natural number, $ a_1, a_2, \ldots, a_n $ are nonnegative reals that sum to $ 1 $, and $ x_1, x_2, \ldots, x_n \in S $.
--------------------
Definition: A vertex of a set $ D $ is a point $ v \in D $ such that for all $ x \in D $ not equal to $ v $ and $ \lambda > 1 $ we have $ \lambda v+(1-\lambda)x \not\in D $.
--------------------
Theorem: Let $ V_D = \{v_1, v_2, \ldots, v_k\} $ be the set of vertices of a compact convex set $ D $. The convex hull of $ V_D $ is $ D $.
Example: The set of vertices of a convex polygon is, in fact, its vertices as traditionally defined in geometry. Any point inside the polygon can be written as a convex combination of its vertices.
--------------------
Theorem: If $ f $ is a real-valued, convex function defined on a compact convex set $ D $, then $ \displaystyle \max_{x \in D} f(x) = \max_{x \in V_D} f(x) $, where $ V_D $ is the set of vertices of $ D $.
Proof: We will show that $ \displaystyle f(x) \le \max_{x \in V_D} f(x) $ for all $ x \in D $.
Let $ \displaystyle x = \sum_{i=1}^k \lambda_i v_i $ where $ \lambda_1, \lambda_2, \ldots, \lambda_k $ are nonnegative reals that sum to $ 1 $ and $ V_D = \{v_1, v_2, \ldots, v_k\} $. This is possible by the preceding theorem. Then by Jensen's Inequality we get
$ \displaystyle \sum_{i=1}^k \lambda_i f(v_i) \ge f\left(\sum_{i=1}^k \lambda_i v_i\right) = f(x) $.
And since $ \displaystyle \max_{x \in V_D} f(x) \ge \sum_{i=1}^k \lambda_i f(v_i) $, we have $ \displaystyle \max_{x \in V_D} f(x) \ge f(x) $ for all $ x \in D $. Furthermore, since $ V_D $ is a subset of $ D $, we know that this maximum is attained, thus
$ \displaystyle \max_{x \in D} f(x) = \max_{x \in V_D} f(x) $
as desired. QED.
--------------------
Comment: This is a very useful result, as it allows us to limit our search for the maximum of a function to its set of vertices, which is usually a considerably smaller set, though it may still be infinite (think sphere). In any case, this works well for constrained optimization problems in which the domain is limited to a polygon, the easiest application of this theorem.
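To make the comment concrete (a made-up example of my own, not the practice problem below): with finitely many vertices, the search reduces to evaluating $ f $ at each vertex and taking the largest value.

# Maximize a convex f over a polygon by checking only its vertices (made-up example).
vertices = [(0.0, 0.0), (3.0, 0.0), (3.0, 2.0), (0.0, 2.0)]   # a rectangle
f = lambda x, y: (x - y)**2 + x                               # convex on the plane
best = max(vertices, key=lambda p: f(*p))
print(best, f(*best))                # (3.0, 0.0) attains the maximum value 12.0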
--------------------
Practice Problem: Let $ x $ and $ y $ be real numbers satisfying $ |2x-y| \le 3 $ and $ |y-3x| \le 1 $. Find the maximum value of $ f(x, y) = (x-y)^2 $.