Sparse greedy optimization techniques based on the Frank-Wolfe algorithm (also known as conditional gradient) have seen a surge of interest recently, driven by many new promising applications in machine learning, signal processing, and other fields. For constrained convex optimization problems, an iteration of the Frank-Wolfe algorithm only requires the solution of a linear subproblem over the same domain, and does not need any projection steps. Here we provide stronger and more general primal-dual convergence results for the Frank-Wolfe algorithm over arbitrary domains, enabled by a simple framework of duality gap certificates for constrained optimization. Our analysis also holds if the linear subproblems are only solved approximately (as well as if the gradients are inexact), and is proven to be worst-case optimal in the sparsity of the obtained solutions. On the application side, this allows us to unify a large variety of existing sparse greedy methods, in particular for optimization over convex hulls of an atomic set, even if those sets can only be approximated, including sparse (or structured sparse) vectors or matrices, low-rank matrices, permutation matrices, or max-norm bounded matrices. We also discuss a new general framework for convex optimization over matrix factorizations, where every Frank-Wolfe iteration will consist of a low-rank update, and consider some applications of this approach.
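To make the projection-free iteration concrete, here is a minimal sketch (not part of the paper) of the standard Frank-Wolfe method over the probability simplex, using the classic step size 2/(t+2). The objective, step count, and variable names are illustrative assumptions; the key point is that the linear subproblem over the simplex is solved by picking the vertex with the smallest gradient coordinate, so no projection is ever needed.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Frank-Wolfe over the probability simplex.

    Each iteration solves the linear subproblem
        s = argmin_{s in simplex} <grad f(x), s>,
    whose minimizer is the vertex e_i with i = argmin_i [grad f(x)]_i,
    then takes the convex combination x <- (1 - gamma) x + gamma * s.
    Iterates stay feasible by convexity, so no projection step is needed.
    """
    x = x0.copy()
    for t in range(num_iters):
        g = grad(x)
        i = int(np.argmin(g))        # linear subproblem: best simplex vertex
        gamma = 2.0 / (t + 2.0)      # standard diminishing step size
        x *= (1.0 - gamma)           # move toward the chosen vertex
        x[i] += gamma
    return x

# Toy problem: f(x) = 0.5 * ||x - c||^2 with c in the interior of the simplex,
# so the unconstrained minimizer c is also the constrained optimum.
c = np.array([0.4, 0.3, 0.2, 0.07, 0.03])
grad = lambda x: x - c

x0 = np.full(5, 1.0 / 5)             # start at the simplex barycenter
x = frank_wolfe_simplex(grad, x0)
```

After 200 iterations the standard O(1/t) primal guarantee already forces `x` close to `c`, while every iterate remains exactly on the simplex.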