Information-Geometric Optimization (IGO) is a generic framework for probabilistic-model-based, derivative-free search algorithms applicable to a wide range of optimization problems. Given a probabilistic model, i.e., a family of probability distributions, IGO defines a related maximization problem on the space of model parameters and takes a gradient step to maximize it. When computing the gradient on the parameter space, we take into account the intrinsic metric on the parameter space of probability distributions, namely the Fisher metric; the resulting gradient is the so-called natural gradient. IGO implementations repeatedly estimate the natural gradient from samples drawn from the current distribution and update the parameters along it, so that the probability distribution converges towards the Dirac delta distribution concentrated at the optimum of the original problem. It has been shown that IGO recovers several known algorithms, such as the covariance matrix adaptation evolution strategy (CMA-ES), a state-of-the-art evolutionary algorithm for continuous optimization, and population-based incremental learning (PBIL) for binary optimization. In this talk, we introduce the framework of IGO, followed by theoretical aspects of IGO, including a linear convergence proof for an IGO implementation with the family of isotropic Gaussian distributions, obtained via the theory of stochastic approximation.
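The sample-then-update loop described above can be sketched for the isotropic Gaussian family N(m, sigma^2 I) mentioned in the abstract. This is a minimal illustrative implementation, not the algorithm analyzed in the talk: the learning rate, population size, and rank-based weight scheme below are arbitrary placeholder choices, and the natural-gradient expressions for (m, ln sigma) are written out for this family only.

```python
import math
import random

def igo_isotropic_gaussian(f, m, sigma, iters=400, lam=20, eta=0.3, seed=0):
    """Illustrative IGO sketch: minimize f with the family N(m, sigma^2 I).

    Rank-based utility weights make the update invariant to monotone
    transformations of f. The natural-gradient step for the mean is a
    weighted recombination of the samples; the step for ln(sigma) uses
    the weighted deviation of squared distances from their expectation.
    """
    rng = random.Random(seed)
    d = len(m)
    for _ in range(iters):
        # Sample lam points from the current distribution.
        xs = [[m[j] + sigma * rng.gauss(0.0, 1.0) for j in range(d)]
              for _ in range(lam)]
        # Rank-based weights: the better half gets positive, decreasing
        # weights summing to 1; the worse half gets weight zero.
        order = sorted(range(lam), key=lambda i: f(xs[i]))
        mu = lam // 2
        w = [0.0] * lam
        for rank, i in enumerate(order[:mu]):
            w[i] = (mu - rank) / (mu * (mu + 1) / 2)
        # Natural gradient for the mean: weighted recombination step.
        dm = [sum(w[i] * (xs[i][j] - m[j]) for i in range(lam))
              for j in range(d)]
        # Natural gradient for ln(sigma): compare weighted squared
        # distances with their expectation d * sigma^2 under sampling.
        g = sum(w[i] * (sum((xs[i][j] - m[j]) ** 2 for j in range(d))
                        / (d * sigma ** 2) - 1.0)
                for i in range(lam))
        m = [m[j] + eta * dm[j] for j in range(d)]
        sigma *= math.exp(0.5 * eta * g)
    return m, sigma
```

On a convex quadratic such as the sphere function, the distribution concentrates on the optimum: the mean approaches it while sigma decreases geometrically, matching the linear (i.e., geometric) convergence behavior that the talk establishes rigorously for this family.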