We propose a new method in the spirit of stochastic gradient methods for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. Further, we show that in many cases the convergence rate of the new method (in terms of evaluating individual gradients) will be faster than the lower bound for methods that evaluate the full sum. Numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms.
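To make the idea of "a memory of previous gradient values" concrete, the following is a minimal sketch of such a gradient-memory update in Python. It is an illustration based only on the description above, not the paper's exact algorithm or step-size choices; the function name `sag_style_solver`, the test problem, and all parameter values are assumptions for the example.

```python
import numpy as np

def sag_style_solver(grad_funcs, x0, step_size=0.01, n_iters=1000, seed=0):
    """Minimize (1/n) * sum_i f_i(x) using a stored table of per-function gradients.

    grad_funcs : list of callables; grad_funcs[i](x) returns the gradient of f_i at x.
    Sketch of a gradient-memory update, assumed from the abstract's description.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n, d = len(grad_funcs), np.size(x0)
    grad_table = np.zeros((n, d))   # memory: most recent gradient seen for each f_i
    grad_sum = np.zeros(d)          # running sum of the stored gradients

    for _ in range(n_iters):
        i = rng.integers(n)                    # sample one function uniformly at random
        new_grad = grad_funcs[i](x)            # evaluate only that function's gradient
        grad_sum += new_grad - grad_table[i]   # swap the old stored gradient out of the sum
        grad_table[i] = new_grad
        x -= step_size * grad_sum / n          # step along the average of stored gradients
    return x

# Hypothetical usage: least squares, f_i(x) = 0.5 * (a_i @ x - b_i)**2
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
    grads = [lambda x, a=A[i], bi=b[i]: a * (a @ x - bi) for i in range(50)]
    x_hat = sag_style_solver(grads, np.zeros(5), step_size=0.05, n_iters=5000)
    print(x_hat)
```

The key point the sketch illustrates is that each iteration evaluates only one individual gradient, yet the step uses an average over all stored gradients, which is what allows a faster (linear) rate than plain stochastic gradient descent on strongly convex sums.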