Machine-Learning Interatomic Potentials (MLIPs) trained on large collections of first-principles calculations are rapidly becoming essential tools for atomic-scale simulation in computational materials science and chemistry. Despite this, with a few notable exceptions, well-organized public datasets in common formats suitable for MLIP development are scarce. This scarcity prevents the research community from carrying out widespread benchmarking, which is essential for gaining insight into model performance and transferability, and it limits the development of more general, or even universal, MLIPs. To address this issue, we introduced the ColabFit Exchange, the first database designed specifically for MLIP development that provides open access to hundreds of systematically organized datasets from multiple domains. In this talk, I will explore opportunities presented by the ColabFit Exchange, including the development of universal models for materials discovery, as well as outstanding challenges in training MLIPs on extensive and diverse data sources and in integrating them into molecular dynamics simulations.