Forecasting the Yield of Major Grain Crops across Canada with the Integrated Canadian Crop Yield Forecaster

Zhang, Yinsuo

Yinsuo Zhang, Aston Chipanshi, Nathaniel Newlands, Andrew Davidson and Louis Kouadio

The Science and Technology Branch (STB) of Agriculture and Agri-Food Canada (AAFC)

The Integrated Canadian Crop Yield Forecaster (ICCYF) is a modelling tool for crop yield forecasting and risk analysis that integrates geospatial earth observation (EO) data using statistics and a Geographic Information System (GIS). By integrating climate, remote sensing and other earth observation information (e.g., historical yields, soil and cropland maps), it provides producers, traders, commodity brokers and other decision-makers with regional crop yield outlooks during and shortly after the growing season. The major features of the ICCYF model include: (1) integration of a physically based soil moisture model with a statistical yield forecasting model to generate climate-based predictors; (2) the flexibility to forecast yields of different crops; (3) integration of various EO-based information, including climate, remote sensing Normalized Difference Vegetation Indices (NDVI), soil water, etc.; (4) automatic ranking and selection of predictors at run time using a Robust Least Angle Regression Scheme (RLARS) and a Leave-One-Out Cross-Validation (LOOCV) scheme; and (5) a Bayesian method for sequential forecasting (estimation of the prior and posterior distributions of model predictors through a Markov Chain Monte Carlo (MCMC) scheme, and random-forests learning to estimate variables that are still unobserved at the time of forecast). The geospatial processing is done in ArcGIS 10.1, and the statistical modelling is coded in the open-source software R.

The basic spatial modelling units of this study are the Census Agricultural Regions (CARs) that were delineated for the data collection and dissemination activities of the 2011 Census of Agriculture. Results are also aggregated to provincial and national levels to evaluate the model performance at larger scales. The station-based climate data are provided by Environment Canada and other partner institutions. In total, 330 climate stations were selected to represent the climate of the 82 CARs across the Canadian agricultural landscape. In the ICCYF, daily series of air temperature and precipitation are fed into a Versatile Soil Moisture Budget (VSMB) model to generate the agro-climate indices, including Growing Degree Days (GDD), precipitation (P), available root zone soil moisture as a percentage of Plant Available Water Holding Capacity (PAWHC) and a plant stress index (SI). The pixel-based NDVI data were derived from original images of the Advanced Very High Resolution Radiometer (AVHRR, ~1 km resolution) operated by the National Oceanic and Atmospheric Administration (NOAA) and of the Moderate Resolution Imaging Spectroradiometer (MODIS, 250 m resolution) on board the Terra and Aqua satellites. The weekly AVHRR NDVI data used in this study cover 1987 to 2012, and the weekly MODIS NDVI data cover 2000 to 2012. The two datasets were compared at the CAR scale for their ability to forecast crop yield. A cropland mask is applied to all climate indices and NDVI data to obtain representative values for each CAR.
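As an illustration of one of the agro-climate indices named above, the following minimal Python sketch accumulates Growing Degree Days from daily maximum and minimum air temperatures. The abstract does not state how the VSMB computes GDD, so the simple daily-mean formula, the 5 °C base temperature, the function name `growing_degree_days` and the sample temperatures are all assumptions made for illustration only.

```python
from typing import Sequence


def growing_degree_days(
    tmax: Sequence[float],
    tmin: Sequence[float],
    base_temp: float = 5.0,  # assumed base temperature in deg C, not from the abstract
) -> float:
    """Accumulate GDD as the sum of max(0, daily mean temperature - base_temp)."""
    if len(tmax) != len(tmin):
        raise ValueError("tmax and tmin must have the same number of days")
    total = 0.0
    for hi, lo in zip(tmax, tmin):
        daily_mean = (hi + lo) / 2.0
        # Days with a mean at or below the base temperature contribute nothing.
        total += max(0.0, daily_mean - base_temp)
    return total


# Hypothetical four-day temperature series (deg C), for illustration only.
sample_tmax = [12.0, 18.0, 25.0, 8.0]
sample_tmin = [2.0, 6.0, 11.0, -2.0]
sample_gdd = growing_degree_days(sample_tmax, sample_tmin)
```

In a forecasting workflow, a value like this would be accumulated per station over the growing season and then averaged over the cropland area of each CAR.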

A Leave-One-Out Cross-Validation (LOOCV) scheme is employed to evaluate the model's strength in forecasting the yield of spring wheat, barley and canola at different times of the growing season and at different spatial scales. Model performance is evaluated with statistics computed between the forecasted and surveyed yields, such as the coefficient of determination (R2), Root Mean Square Error (RMSE) and Model Efficiency Index (MEI).
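The evaluation procedure can be sketched as follows. This is a simplified stand-in, not the ICCYF implementation (which is written in R and uses RLARS-selected predictors): it uses a single predictor with ordinary least squares, holds out one year at a time, and scores the held-out forecasts. The MEI is assumed here to be the Nash-Sutcliffe form, 1 - SSE/SST, which the abstract does not confirm; all function names and the sample numbers are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_predictions(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Predict each year's yield from a model fitted on all other years."""
    xs, ys = list(x), list(y)
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(intercept + slope * xs[i])
    return predictions


def skill_scores(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """R2 (squared Pearson correlation), RMSE, and MEI (assumed Nash-Sutcliffe form)."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "r2": cov ** 2 / (sst * var_p),
        "rmse": sqrt(sse / n),
        "mei": 1.0 - sse / sst,
    }


# Hypothetical predictor (e.g. a seasonal index) and surveyed yields, illustration only.
index_values = [0.42, 0.55, 0.48, 0.61, 0.39, 0.58]
surveyed_yield = [2.1, 2.9, 2.4, 3.2, 1.9, 3.0]
scores = skill_scores(surveyed_yield, loocv_predictions(index_values, surveyed_yield))
```

Because each forecast comes from a model that never saw the held-out year, the scores approximate out-of-sample skill rather than goodness of fit.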

The major findings of this study are: (1) at the CAR level, the average variance explained by the combined agro-climate and NDVI predictors was 82%, 80% and 78% for spring wheat, barley and canola, respectively, compared with 78%, 72% and 71% explained by climate indices alone and 65%, 60% and 48% explained by NDVI alone; (2) relatively reliable yield predictions were obtained around mid-August, when observed data up to the end of July were available; (3) yield estimation errors were smaller at national and provincial scales than at the CAR scale, due mainly to (a) better model performance at CARs with larger crop coverage and (b) cancellation of errors during aggregation; (4) model performance showed strong spatial variability, largely because of data quality and density problems for both crop and climate variables. Future work is planned to analyse the sources of forecasting errors and to improve the data and the model at those CARs with low forecasting power.
