Problem: Given a complicated function $ f: \mathbb{R}^n \rightarrow \mathbb{R} $, find an approximate local minimum.
Solution: The adjective complicated is only placed so that we assume there is no easy way to solve $ \bigtriangledown f = 0 $ to immediately give the solution. We seek an algorithm that will lead us to a local minimum (hopefully a global minimum as well).
We start at an arbitrary point $ X_0 = (x_1, x_2, \ldots, x_n) $. Consider the following process (for $ k = 0, 1, 2, \ldots $), known as gradient descent:
1. Calculate (approximately) $ \bigtriangledown f(X_k) $.
2. Set $ X_{k+1} = X_k - \gamma_k \bigtriangledown f(X_k) $, where $ \gamma_k $ is a constant that can be determined by a linesearch.
It is well-known that the direction of the gradient is the direction of maximum increase and the direction opposite the gradient is the direction of maximum decrease. Hence this algorithm is based on the idea that we always move in the direction that will decrease $ f $ the most. Sounds pretty good, right? Well, unfortunately gradient descent converges very slowly so it is only really useful for smaller optimization problems. Fortunately, there exist other algorithms but obviously they are more complex, such as the nonlinear conjugate gradient method or Newton's method, the latter of which involves the computation of the inverse of the Hessian matrix, which is a pain.
No comments:
Post a Comment