Sunday, 28 January 2024
Hall E (The Baltimore Convention Center)
Handout (2.6 MB)
Accurately modeling moisture flux (MF) is crucial as it affects various atmospheric
processes, including cloud and precipitation formation. However, current climate models
struggle with representing MF due to computational challenges arising from the required finer
grid resolution. As an alternative, machine learning is employed for efficiency, yet complex
models can lack interpretability. This study explores the possibility of using equation discovery,
a type of interpretable machine learning, to model complex processes like MF. Using piecewise
regression (PR) and symbolic regression (SR), two equations for MF are discovered. Preliminary
findings using the lowest three atmospheric levels near the Caribbean Islands indicate accurate
predictions for the highest level, but accuracy decreases closer to the surface. The PR-based equation
is simple and easy to interpret, but it struggles to produce consistent breakpoints, while the SR-derived
equation captures the shape of the data effectively but is more complex and less interpretable. Future work
aims to improve both accuracy and simplicity by adjusting hyperparameters, reducing noise, and adding
features.
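
As a rough illustration of what piecewise regression involves, the sketch below fits a two-segment linear model with a single breakpoint by scanning a grid of candidate breakpoints and keeping the one with the lowest squared error. It is not the study's method or data: the function name `fit_piecewise_linear`, the hinge-term formulation, the grid search, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def fit_piecewise_linear(x, y, candidates):
    """Fit y ~ a + b*x + c*max(x - bp, 0) for each candidate breakpoint bp.

    Returns the breakpoint with the lowest sum of squared errors and its
    coefficients (a, b, c). Hypothetical helper, for illustration only.
    """
    best = None
    for bp in candidates:
        # Design matrix: intercept, base slope, and a hinge term that
        # changes the slope for x values beyond the breakpoint.
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - bp, 0.0)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((X @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, bp, coef)
    return best[1], best[2]

# Synthetic data (not from the study): slope 0.5 before x = 4, slope 2.5 after.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 1.0 + 0.5 * x + 2.0 * np.maximum(x - 4.0, 0.0) + rng.normal(0.0, 0.1, x.size)

breakpoint_est, coef = fit_piecewise_linear(x, y, np.linspace(1.0, 9.0, 81))
print("estimated breakpoint:", breakpoint_est)
print("intercept, slope, slope change:", coef)
```

The fitted equation stays readable (an intercept, a slope, and a slope change at one breakpoint), which is the interpretability advantage noted above. On noisier data the selected breakpoint can shift from one fit to the next, which is one way the consistency issue mentioned above can show up.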

