When Machine Learning Objectives Compete for Improved Subseasonal Bias Correction, Who Wins?

Molina, Maria

In supervised machine learning (ML), a model is optimized to generate predictions using a predefined objective function and labeled data. For example, a model may be optimized to minimize a measure of error between its predictions and the corresponding observations. However, in real-world applications, multiple objectives may be needed for an ML model to be useful or to add value to existing physics-based approaches. These objectives may at times compete, so that improving performance on one objective requires giving up performance on the other(s). In such multi-objective problems, no single optimal solution exists; the set of solutions for which no objective can be improved without degrading another is referred to as the Pareto frontier.
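To make the idea of a Pareto frontier concrete, the following is a minimal sketch, not the method used in this work. It assumes every objective is expressed as a score to be minimized; the function names `dominates` and `pareto_front` and the numeric scores are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if candidate `a` dominates candidate `b`.

    Assumes lower is better for every objective: `a` must be no worse
    than `b` everywhere and strictly better on at least one objective.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical scores for four candidate models, one tuple per model:
# (error over land, global error, blurriness). Values are made up.
scores = [
    (0.30, 0.50, 0.40),  # model 0: best over land
    (0.35, 0.45, 0.42),  # model 1: trades land skill for global skill
    (0.32, 0.52, 0.45),  # model 2: worse than model 0 on every objective
    (0.50, 0.40, 0.30),  # model 3: best globally and sharpest
]

front = pareto_front(scores)
print(front)
```

In this toy case, model 2 is excluded because model 0 is better on all three scores, while models 0, 1, and 3 each win on at least one objective and so remain on the frontier.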

Earth system science is a field rich with multi-objective problems. One example is the trade-off between ML model interpretability and ML model complexity, where more complex models are potentially more skillful but less interpretable. In this work, we focus on the task of offline bias correction for subseasonal forecasts of temperature and precipitation created using the Community Earth System Model version 2 (CESM2) configured for initialized prediction. We performed bias correction using various objectives that were at times in competition with each other. These objectives included improving the skill of temperature or precipitation over land, improving the skill of temperature or precipitation globally, and improving the sharpness of temperature or precipitation fields, which are at times overly smoothed in coarse-resolution Earth system model simulations and ML-based model output. An extensive hyperparameter grid search was conducted to identify image-to-image ML models that performed skillfully across these various metrics using the Earth Computing Hyperparameter Optimization (ECHO) software, which is a distributed hyperparameter optimization package built with Optuna (a commonly used software for ML model optimization). Various models were identified along a Pareto frontier, where improved performance in one metric could be achieved, but only at the cost of reduced skill in other metrics.

The identified Pareto frontier for our ML-based bias correction of subseasonal forecasts was then leveraged to generate a large ensemble with Macro- and Micro-ensemble components: different ML model architectures comprised the Macro-ensemble, and random weight initialization and batch sampling were used to create a Micro-ensemble for each ML model architecture. Image-to-image ML models identified to perform skillfully for our objectives included U-Net++ and MANET (a Multi-Attention-Network). We will share the methodological setup of our approach and bias correction results. Bias correction skill and ensemble spread as a function of initial state will also be discussed.
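The ensemble structure described above can be sketched as follows. This is a toy illustration under stated assumptions, not the setup used in this work: `train_and_predict` is a hypothetical stand-in that returns made-up numbers instead of training a network, the architecture labels are plain strings, and a single seed stands in for both random weight initialization and batch sampling.

```python
import random
import statistics
from typing import Dict, List

ARCHITECTURES = ["unetpp", "manet"]  # labels only; no real models are built
N_SEEDS = 3                          # hypothetical Micro-ensemble size
N_POINTS = 4                         # hypothetical number of grid points


def train_and_predict(architecture: str, seed: int) -> List[float]:
    """Hypothetical stand-in for training one model and predicting a field.

    The seed represents random weight initialization and batch sampling;
    the architecture-dependent offset mimics structural differences.
    """
    rng = random.Random(f"{architecture}-{seed}")
    offset = 0.1 if architecture == "unetpp" else -0.1
    return [offset + rng.gauss(0.0, 0.05) for _ in range(N_POINTS)]


def pointwise_mean(members: List[List[float]]) -> List[float]:
    """Mean across ensemble members at each grid point."""
    return [statistics.fmean(vals) for vals in zip(*members)]


def pointwise_spread(members: List[List[float]]) -> List[float]:
    """Population standard deviation across members at each grid point."""
    return [statistics.pstdev(vals) for vals in zip(*members)]


# Macro-ensemble: one entry per architecture.
# Micro-ensemble: several seeds within each architecture.
ensemble: Dict[str, List[List[float]]] = {
    arch: [train_and_predict(arch, seed) for seed in range(N_SEEDS)]
    for arch in ARCHITECTURES
}

# Spread attributable to seeds, computed separately for each architecture.
micro_spread = {arch: pointwise_spread(m) for arch, m in ensemble.items()}

# Spread across architectures, computed from each architecture's mean field.
micro_means = [pointwise_mean(m) for m in ensemble.values()]
macro_spread = pointwise_spread(micro_means)

# Pooled ensemble mean over every member of every architecture.
all_members = [member for m in ensemble.values() for member in m]
grand_mean = pointwise_mean(all_members)
```

Separating the two spreads in this way makes it possible to ask how much of the ensemble uncertainty comes from the choice of architecture and how much from training randomness.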

9A.1 When Machine Learning Objectives Compete for Improved Subseasonal Bias Correction, Who Wins?