We consider the problem of optimizing machine learning objectives with a decomposable regularization penalty and a non-smooth loss function. For several important learning problems, state-of-the-art optimization approaches such as proximal gradient algorithms are difficult to apply and do not scale to large datasets. We propose a new conditional gradient-type algorithm, with theoretical guarantees, for such problems, and present promising experimental results on real-world datasets.
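For orientation, the sketch below shows the classical conditional gradient (Frank-Wolfe) template that algorithms of this type build on: each iteration solves a cheap linear subproblem over the feasible set instead of a projection or proximal step. This is a minimal illustration under assumed simplifications, not the paper's method: it uses an l1-ball constraint as a stand-in for a decomposable penalty and a smooth least-squares loss, whereas the abstract targets non-smooth losses (which would require smoothing or a modified oracle); all names are illustrative.

import numpy as np

def lmo_l1_ball(grad, radius):
    """Linear minimization oracle over the l1 ball:
    argmin_{||s||_1 <= radius} <grad, s> is a signed, scaled basis vector."""
    i = np.argmax(np.abs(grad))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def conditional_gradient(grad_f, x0, radius, n_iters=200):
    """Classical Frank-Wolfe iteration with the standard 2/(t+2) step size."""
    x = x0.copy()
    for t in range(n_iters):
        g = grad_f(x)
        s = lmo_l1_ball(g, radius)   # cheap linear subproblem, no projection
        gamma = 2.0 / (t + 2.0)      # diminishing step size
        x = (1 - gamma) * x + gamma * s
    return x

# Usage: sparse least-squares recovery (smooth loss used only for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
b = rng.standard_normal(50)
grad_f = lambda x: A.T @ (A @ x - b)
x_hat = conditional_gradient(grad_f, np.zeros(100), radius=5.0)
print("nonzeros:", np.count_nonzero(np.round(x_hat, 6)))

Because each update is a convex combination of at most one new vertex of the feasible set, the iterates stay sparse, which is one reason conditional gradient methods scale to problems where proximal steps are expensive.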