A crucial challenge in the coming decade will be the integration of direct physical simulations on the one hand, and data-driven approaches on the other. Such a hybrid approach holds many opportunities for weather forecasting, as well as countless other fields.
From model to outcomes
Operational weather models are usually run at a resolution of between 1 km and 10 km. At the finest of these resolutions, everything within the same square kilometre is represented by a single grid cell. This is fine enough to capture a wide range of phenomena, but it cannot capture details more localised than a grid cell.
It may be possible to perform this kind of localisation using models trained on historical data, providing a mapping between the large-scale predictions of the simulation and the small-scale effects. This is an area of active research which could make forecasts more useful for day-to-day activities.
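As a minimal sketch of such a mapping, the Python below fits a straight line relating a coarse grid-cell temperature to the temperature observed at one site inside that cell. The numbers are synthetic and purely illustrative, and the names `fit_linear_mapping` and `downscale` are hypothetical. A real system would draw on many more predictors and a richer model.

```python
def fit_linear_mapping(coarse, local):
    """Least-squares fit of local ~ slope * coarse + intercept."""
    n = len(coarse)
    mean_c = sum(coarse) / n
    mean_l = sum(local) / n
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(coarse, local))
    var = sum((c - mean_c) ** 2 for c in coarse)
    slope = cov / var
    intercept = mean_l - slope * mean_c
    return slope, intercept


def downscale(coarse_value, slope, intercept):
    """Apply the learned mapping to a new grid-cell value."""
    return slope * coarse_value + intercept


# Synthetic history (illustrative only), in degrees Celsius:
# the simulated grid-cell temperature and the matching site observation.
grid_cell_temps = [2.0, 5.0, 8.0, 11.0, 14.0]
site_temps = [0.4, 3.9, 7.0, 10.5, 13.7]

slope, intercept = fit_linear_mapping(grid_cell_temps, site_temps)

# Localise a new large-scale prediction of 10 degrees C.
local_estimate = downscale(10.0, slope, intercept)
print(f"site estimate: {local_estimate:.1f} C")
```

The simulation still supplies the large-scale prediction; the fitted line only translates it to the smaller scale.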
As well as predicting weather at finer scales, similar techniques could help to link weather forecasts with their broader impacts. Many things are affected by the weather, either directly or indirectly; these include traffic, hay fever, flight delays, and hospital admissions. While some of these effects may not be easy to simulate, data-driven models could help to provide advance warning of significant impacts.
Once a machine learning model has been trained, it is often much faster to run than a full simulation. This is the motivation for a technique called model emulation. The idea is to build a fast statistical model which closely approximates a far more expensive simulation. Emulators are already being applied to problems such as climate sensitivity. An area of current interest is using the same tools to speed up some components of the weather model.
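The idea can be sketched in a few lines of Python. Here `expensive_simulation` is a toy stand-in for a costly model, and the emulator is plain piecewise-linear interpolation between a small number of simulation runs. Both are assumptions chosen for brevity; practical emulators typically use more capable statistical models.

```python
import bisect
import math


def expensive_simulation(x):
    """Toy stand-in for a costly simulation (hypothetical)."""
    return math.sin(x) + 0.5 * x


def build_emulator(simulator, lo, hi, n_runs):
    """Run the simulator at n_runs evenly spaced inputs and return
    a fast function that interpolates between those runs."""
    xs = [lo + (hi - lo) * i / (n_runs - 1) for i in range(n_runs)]
    ys = [simulator(x) for x in xs]

    def emulate(x):
        if not lo <= x <= hi:
            raise ValueError("input outside the range of the training runs")
        # Find the pair of training runs that bracket x.
        i = min(max(bisect.bisect_right(xs, x), 1), n_runs - 1)
        x0, x1 = xs[i - 1], xs[i]
        w = (x - x0) / (x1 - x0)
        return (1 - w) * ys[i - 1] + w * ys[i]

    return emulate


# 25 expensive runs up front; every later query is cheap.
emulator = build_emulator(expensive_simulation, 0.0, 6.0, 25)

# Compare the emulator with the simulator on a finer set of inputs.
check_points = [k / 10 for k in range(61)]
max_error = max(abs(emulator(x) - expensive_simulation(x)) for x in check_points)
print(f"largest emulation error: {max_error:.4f}")
```

The range check reflects a general limitation: a statistical approximation can only be trusted near the runs it was fitted to.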
Some aspects of weather prediction require a full physical simulation; this is what lets you predict previously unseen events with confidence. Elsewhere, a full simulation is not possible, or not even justified, and a statistical approximation may be the best you can do. This second case is where emulation can be useful in operational forecasting.
Beyond emulators, there is broader potential for hybrid models with both learned and simulated components. Such models would combine data-driven and physically-driven approaches. For example, it may be possible to adapt statistical components of the model to the local terrain, based on previous observations.
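One way to picture such a hybrid is a physical estimate plus a correction learned from past observations at a given site. In the sketch below the physical part is a simple height adjustment using a lapse rate of roughly 6.5 °C per kilometre, and the learned part is the average error that adjustment left behind. The function names, elevations and temperatures are all hypothetical, chosen only to illustrate the structure.

```python
LAPSE_RATE_C_PER_M = 0.0065  # approximate standard-atmosphere value, for illustration


def physical_estimate(grid_temp_c, grid_elevation_m, site_elevation_m):
    """Physically-driven part: adjust for the height difference
    between the grid cell's mean elevation and the site."""
    return grid_temp_c - LAPSE_RATE_C_PER_M * (site_elevation_m - grid_elevation_m)


def learn_site_correction(past_grid_temps, past_observations,
                          grid_elevation_m, site_elevation_m):
    """Data-driven part: mean residual left by the physical estimate."""
    residuals = [
        obs - physical_estimate(temp, grid_elevation_m, site_elevation_m)
        for temp, obs in zip(past_grid_temps, past_observations)
    ]
    return sum(residuals) / len(residuals)


def hybrid_forecast(grid_temp_c, correction, grid_elevation_m, site_elevation_m):
    """Combine the simulated and learned components."""
    return physical_estimate(grid_temp_c, grid_elevation_m, site_elevation_m) + correction


# Synthetic example (illustrative only): grid cell at 200 m, site at 600 m.
GRID_M, SITE_M = 200.0, 600.0
past_grid_temps = [10.0, 12.0, 8.0]
past_observations = [6.9, 8.8, 5.0]

correction = learn_site_correction(past_grid_temps, past_observations, GRID_M, SITE_M)
forecast = hybrid_forecast(11.0, correction, GRID_M, SITE_M)
print(f"learned correction: {correction:.2f} C, hybrid forecast: {forecast:.2f} C")
```

The physics carries the part that is well understood, while the learned term absorbs whatever is specific to the local terrain.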
An area where machine learning has made dramatic progress is feature detection. You can see examples of this in apps which not only detect your face, but add glasses and a moustache in real time.
There is currently a lot of interest in applying similar methods to hazard detection, especially to storm tracking. Trained experts are able to recognise storms and trace their paths from weather imagery; in principle there is no reason an algorithm could not learn to do the same.
Another application could address the challenges posed by data volume and complexity when dealing with data from physical simulations. The fields output by such models are highly multidimensional; making sense of them is a complex task, requiring many “screens” of information. An algorithm which could summarise the salient features and bring them to the forecaster’s attention would help streamline this task.
Exploring combinations of machine learning and numerical simulation is an area of great interest and promise for the Met Office. Not only does it offer an advance in scientific capability, but the challenges arising from the attempt could drive new research in the field of machine learning.