In non-convex optimization for machine learning, people are usually content with a local minimum. This may be because either (1) the "quality" of a local minimum is not much worse than that of the global minimum (it would be great if this were shown more often), or (2) nothing better can be done; your hands are tied.
Obviously, there are notions of random restarts (or random mutation in genetic algorithms for optimization) which intuitively allow you to "escape local minima".
I was wondering whether there has been any theoretical study of the "robustness" of local minima. What I mean is this: suppose I jitter the solution (corresponding to a local minimum) with some random noise (that is, if we are solving $\min_x f(x)$ and have reached a local minimum $\tilde{x}$, we add some jitter/noise to $\tilde{x}$), or jitter the "problem" with some random noise (that is, if the problem is parameterized by data $A$, as in $\min_x Ax$, we add some noise to $A$). How much jitter can this local minimum handle? Specifically, if we restart the greedy minimization procedure from the jittered solution, or from the current solution on the jittered problem, how much jitter (under some formal notion of jitter/noise) still lets the optimization procedure end up in (or around) the same local minimum?
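To make the first kind of jitter concrete, here is a rough sketch of the experiment I have in mind, in Python. Everything in it is a placeholder: a toy double-well function stands in for $f$, scipy's L-BFGS stands in for the "greedy" minimization procedure, the noise is Gaussian, and the tolerance for "same minimum" is arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

# Toy multi-modal objective (a 1-D double well); any f with several
# local minima would do here.
def f(x):
    x = np.asarray(x)
    return float(np.sum(x**4 - 3 * x**2 + x))

def local_min(x0):
    """Stand-in for the greedy/local minimization procedure."""
    return minimize(f, x0, method="L-BFGS-B").x

def empirical_robustness(x_star, sigmas, trials=200, tol=1e-2, seed=0):
    """For each noise scale sigma, estimate the probability that
    re-minimizing from x_star + N(0, sigma^2) lands back at x_star."""
    rng = np.random.default_rng(seed)
    probs = []
    for sigma in sigmas:
        hits = 0
        for _ in range(trials):
            x_jit = x_star + sigma * rng.standard_normal(x_star.shape)
            if np.linalg.norm(local_min(x_jit) - x_star) < tol:
                hits += 1
        probs.append(hits / trials)
    return probs

x_star = local_min(np.array([1.0]))   # one particular local minimum
sigmas = [0.1, 0.5, 1.0, 2.0]
print(list(zip(sigmas, empirical_robustness(x_star, sigmas))))
```

The largest noise scale at which the return probability stays near 1 would be one crude, purely empirical notion of how "robust" that particular local minimum is.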
Moreover, how useful would this notion of robustness be in systematically searching for the global minimum? I have no idea. If a local minimum is very "robust", then it would seem likely that it is the only local minimum in the region around it. This would let us ignore any starting point in or around that region and, in effect, remove the region from the parameter search space.
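Here is an equally rough sketch of how that pruning might look in a random-restart search. The fixed exclusion radius is just a stand-in for whatever formal robustness measure one would actually compute per minimum (for instance, from an estimate like the one above).

```python
import numpy as np
from scipy.optimize import minimize

# Same toy f and local_min as in the sketch above, repeated so this
# runs standalone.
def f(x):
    x = np.asarray(x)
    return float(np.sum(x**4 - 3 * x**2 + x))

def local_min(x0):
    return minimize(f, x0, method="L-BFGS-B").x

def restart_search(n_restarts=50, radius=0.75, bounds=(-3.0, 3.0), seed=0):
    """Random-restart minimization that skips any starting point lying
    within `radius` of an already-found minimizer (the pruned region)."""
    rng = np.random.default_rng(seed)
    found = []                                   # minimizers found so far
    for _ in range(n_restarts):
        x0 = rng.uniform(*bounds, size=1)
        if any(np.linalg.norm(x0 - m) < radius for m in found):
            continue                             # region already "claimed"
        x_new = local_min(x0)
        if all(np.linalg.norm(x_new - m) > 1e-2 for m in found):
            found.append(x_new)                  # a genuinely new minimizer
    return found

print(restart_search())   # distinct local minimizers of f
```

Whether such pruning actually saves anything of course depends on how the exclusion radius relates to the true basin of attraction, which is exactly the part I do not know how to formalize.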
Most of this is very hand-wavy (call it my rambling) and would need to be formalized to mean anything. I will update this post if and when I think about it a little more.