• How geospatial foundation models (GFMs) perform in agriculture remains unclear.
• We propose a workflow to benchmark Google’s AlphaEarth GFM for agriculture.
• AlphaEarth is tested for yield prediction, tillage mapping, and cover crop mapping.
• AlphaEarth rivals local models but lacks transferability, interpretability, and stability.
• The benchmarking workflow and datasets can readily support future GFM evaluation.

Geospatial foundation models (GFMs), pre-trained on massive Earth observations (EO), have emerged as a promising approach to overcoming the limitations of existing featurization methods. Although most studies on GFMs have released source code and pre-trained weights, deploying them still demands extensive configuration, environment setup, preparation of EO data for inference, and model fine-tuning. More recently, Google DeepMind introduced AlphaEarth Foundation (AEF), a GFM pre-trained on multi-source EO data across continuous time. AEF is used to produce an annual, global embedding dataset that is ready for analysis and modeling. The internal experiments show that AEF embeddings outperformed operational models in 15 EO tasks without re-training. However, those experiments mostly concern land cover and land use classification. Applying AEF and other GFMs to agricultural monitoring requires an in-depth evaluation on critical agricultural downstream tasks. There is also a lack of comprehensive comparison between AEF-based models and traditional remote sensing (RS)-based models under different scenarios, which could offer valuable guidance for researchers and practitioners. This study addresses some of these gaps by evaluating AEF embeddings on three agricultural downstream tasks in the U.S.: crop yield prediction, tillage mapping, and cover crop mapping.
Datasets are compiled from both public and private sources to comprehensively evaluate AEF embeddings across tasks at different scales and locations, and RS-based models are trained as comparison models. AEF-based models generally exhibit strong performance on all tasks and are competitive with purpose-built RS-based models in yield prediction and county-level tillage mapping when trained on local data. However, we also find several limitations in current AEF embeddings, such as limited spatial transferability compared to RS-based models, low interpretability, and limited time sensitivity. These limitations suggest exercising caution when applying AEF embeddings in agriculture, where time sensitivity, generalizability, and interpretability are important. To our knowledge, this is the first study that systematically implements and evaluates embeddings from GFMs in agricultural downstream tasks across space, time, and spatial resolutions. The evaluation results and analyses can inform the design of future AEF versions and other GFMs and support their applications in agriculture and Earth science domains. Moreover, the proposed benchmarking workflow and datasets can be readily applied to evaluate future GFMs and facilitate their use in agricultural downstream applications.
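The abstract contrasts models trained on local data with their spatial transferability but does not state the split protocol. One common way to probe transferability is a leave-one-region-out split, sketched below as an illustration only. The function name `leave_one_region_out` and the state labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def leave_one_region_out(regions):
    """Yield (held_out_region, train_idx, test_idx) for each distinct region.

    `regions` holds one region label per sample (hypothetical input format).
    Each region is held out once as the test set; all others form the training set.
    """
    by_region = defaultdict(list)
    for idx, region in enumerate(regions):
        by_region[region].append(idx)

    for held_out in sorted(by_region):
        test_idx = by_region[held_out]
        train_idx = sorted(
            i for region, idxs in by_region.items() if region != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical example: six samples labeled by U.S. state.
regions = ["IL", "IL", "IA", "IN", "IA", "IN"]
splits = list(leave_one_region_out(regions))
```

Under this kind of split, a model is always scored on a region it never saw during training, so a gap between this score and a locally trained model's score is one possible indicator of limited transferability.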
Published in: International Journal of Applied Earth Observation and Geoinformation
Volume 149, pp. 105258-105258