The amount of water vapor in the atmosphere plays a fundamental role in determining the potential for precipitation in a given region. Observations of atmospheric water vapor at high temporal and spatial resolution should therefore allow for better precipitation predictions.
The continued increase in the number of ground-based GPS stations, coupled with recent advances in GPS-Meteorology (GPS-met) software, is now making it possible to obtain water vapor observations with spatial and temporal resolution sufficient for operational forecasting. This is particularly true in the U.S. Southwest, where there are extensive deployments of mesonets and other earth monitoring networks. The domain of interest in this paper is the Dallas-Fort Worth (DFW) Metroplex region, where the Texas Department of Transportation (TxDOT) operates some 44 high-quality, dual-frequency GPS continuously operating reference stations (CORS). While these stations are not operated specifically for GPS-met, the raw pseudo-range and carrier phase data they provide are of sufficient quality that we are able to process them (using ASOS surface meteorological data, the GAMIT GPS processing software, and standard geospatial interpolation algorithms) to generate, with a 30-minute update rate, the Integrated Precipitable Water (IPW) field as it evolves over the DFW region.
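The interpolation step, which turns per-station IPW values into a gridded field, can be illustrated with a minimal sketch. The abstract does not say which interpolation algorithm is used; inverse distance weighting (IDW) is assumed here only because it is a common and simple choice. The function name `idw`, the station coordinates, and the IPW values are all hypothetical, and distances are computed in degrees as a flat-earth approximation.

```python
import math

def idw(stations, lon, lat, power=2.0):
    """Inverse-distance-weighted IPW estimate (mm) at (lon, lat).

    stations: iterable of (lon, lat, ipw_mm) tuples.
    Distances are in degrees (flat approximation, adequate only for a sketch).
    """
    num = 0.0
    den = 0.0
    for s_lon, s_lat, ipw in stations:
        d = math.hypot(lon - s_lon, lat - s_lat)
        if d == 0.0:
            return ipw  # exactly on a station: use its value directly
        w = 1.0 / d ** power
        num += w * ipw
        den += w
    return num / den

# Hypothetical stations around DFW: (lon, lat, IPW in mm). Not real observations.
stations = [
    (-97.4, 32.6, 38.0),
    (-96.6, 32.6, 42.0),
    (-97.4, 33.2, 35.0),
    (-96.6, 33.2, 40.0),
]

# Evaluate the field on a coarse 0.2-degree grid covering the stations.
lons = [-97.4 + 0.2 * i for i in range(5)]
lats = [32.6 + 0.2 * j for j in range(4)]
grid = [[idw(stations, lon, lat) for lon in lons] for lat in lats]

for row in grid:
    print(" ".join(f"{v:6.2f}" for v in row))
```

Repeating this for each 30-minute batch of station values would give a time sequence of gridded IPW fields. An operational product would more likely use great-circle distances and a method such as kriging.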
A superposition of reflectivity data from the Fort Worth (FWS) NEXRAD radar over the IPW field shows a clear correlation between weather radar reflectivity and IPW. In general, an advecting IPW field (a measure of the total amount of atmospheric water vapor that could potentially fall as precipitation) is shortly followed by an advecting reflectivity field (a proxy for precipitation rate). Motivated by these observable patterns, our goal is to develop a machine-learning algorithm that extracts the temporal and spatial correlations between IPW fields and radar reflectivity fields in order to generate 1- to 2-hour nowcasts of the reflectivity (precipitation) field.
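The lead-lag relationship described above can be made concrete with a lagged correlation at a single grid cell. The sketch below uses synthetic series, not real IPW or NEXRAD data: the reflectivity series is constructed to follow the IPW series by three 30-minute steps, and the helper names `pearson` and `lagged_correlation` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_correlation(ipw, refl, max_lag):
    """Correlate ipw[t] with refl[t + lag] for lag = 0..max_lag (in time steps)."""
    return {
        lag: pearson(ipw[: len(ipw) - lag], refl[lag:])
        for lag in range(max_lag + 1)
    }

# Synthetic stand-in series at a 30-minute update rate (assumption, not data).
random.seed(0)
true_lag = 3  # reflectivity trails IPW by 3 steps = 90 minutes
ipw = [30 + 10 * math.sin(2 * math.pi * t / 48) + random.gauss(0, 1)
       for t in range(300)]
refl = [random.gauss(0, 1) for _ in range(true_lag)] + [
    2 * (v - 30) + random.gauss(0, 1) for v in ipw[:-true_lag]
]

corr = lagged_correlation(ipw, refl, max_lag=6)
best_lag = max(corr, key=corr.get)
print(f"strongest correlation at lag {best_lag} steps ({best_lag * 30} minutes)")
```

On real fields the lag would vary with location and synoptic conditions, which is part of the motivation for learning the relationship rather than fixing a single lag.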
We propose to use a stacked ensemble of machine-learning classifiers in a two-stage process. In the first stage, classifiers such as random forests and logistic regression will be trained to map the IPW field at the current time step to the reflectivity field at the same time step. In the second stage, a classifier will be trained to map the sequence of first-stage outputs to the reflectivity field one hour ahead. The probabilistic precipitation prediction will be evaluated using the area under the ROC (Receiver Operating Characteristic) curve. Hyperparameter and classifier selection will be based on k-fold cross-validation using the same area-under-the-curve metric.
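A minimal sketch of this two-stage scheme, using scikit-learn, might look as follows. It is an illustration under stated assumptions rather than the proposed implementation: the data are synthetic series for a single grid cell, the rain label stands in for reflectivity above some threshold, and the feature choices, the history length of three steps, and the candidate values of `C` are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_predict, cross_val_score

rng = np.random.default_rng(0)
n_steps, lead, history = 600, 2, 3  # 30-minute steps; lead of 2 steps = 1 hour

# Synthetic stand-in data (assumption): IPW in mm and a binary rain flag.
t = np.arange(n_steps)
ipw = 30 + 10 * np.sin(2 * np.pi * t / 48) + rng.normal(0, 2, n_steps)
rain = (ipw + rng.normal(0, 3, n_steps) > 35).astype(int)

# Stage 1: map current IPW features to the current rain flag.
X1 = np.column_stack([ipw, np.gradient(ipw)])
cv = KFold(n_splits=5)  # contiguous folds, no shuffling, since this is a time series
stage1_models = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "logreg": LogisticRegression(max_iter=1000),
}
# Out-of-fold probabilities, so stage 2 never sees predictions made on training rows.
p1 = np.column_stack([
    cross_val_predict(model, X1, rain, cv=cv, method="predict_proba")[:, 1]
    for model in stage1_models.values()
])

# Stage 2: stack the last `history` stage-1 outputs and predict rain `lead` steps ahead.
rows = range(history - 1, n_steps - lead)
X2 = np.array([p1[i - history + 1 : i + 1].ravel() for i in rows])
y2 = rain[history - 1 + lead :]

# Select the stage-2 regularization strength by mean ROC AUC over the k folds.
scores = {
    C: cross_val_score(
        LogisticRegression(C=C, max_iter=1000), X2, y2, cv=cv, scoring="roc_auc"
    ).mean()
    for C in (0.1, 1.0, 10.0)
}
best_C = max(scores, key=scores.get)
print(f"best C = {best_C}, mean AUC = {scores[best_C]:.3f}")
```

Plain `KFold` matches the k-fold procedure named above. For strongly autocorrelated series, a forward-chaining split such as scikit-learn's `TimeSeriesSplit` would be a more conservative alternative.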