Abstract Agricultural research increasingly relies on data-driven approaches to crop yield prediction, including machine learning techniques, that complement more established crop growth models. However, these approaches require large training datasets. Here, we present the Crop Yields, Climate, Soils, and Satellites (CYCleSS) dataset, a large-scale crop yield dataset derived from precision yield data for 934 fields across England on which a variety of crops are grown.