In this paper, the authors examine the problem of combining NWP model, radar, satellite, and derived fields to forecast thunderstorm initiation in a 1-2 hour timeframe. For this purpose, a machine learning method called random forests (ensembles of weakly correlated decision trees) is used to rank predictor importance and to provide a benchmark for potential algorithm performance. Applied to data collected over the summer of 2007, this technique suggests that the best set of initiation predictors varies by day, hour, and location. Random forests and clustering techniques are then used to help identify meaningful "regimes" representing types of convection, geographical locations, or synoptic conditions. Forecasts tuned to each regime are created, and a prototype Takagi-Sugeno style algorithm is designed to combine the individual regime forecasts based on fuzzy memberships in each regime. Output from this prototype "combiner" is compared to a general random forest prediction and to existing forecast products via statistical evaluations and case studies. Although this work is still preliminary, the authors conclude that the approach shows promise and that applying a similar methodology to other elements of CoSPA development may be worthwhile.
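The idea of combining regime forecasts by fuzzy membership can be illustrated with a minimal sketch. It assumes a zero-order Takagi-Sugeno form, in which the combined forecast is the membership-weighted average of the per-regime forecasts. The function name, the three regimes, and all numeric values are hypothetical and chosen only for illustration; the abstract does not specify the prototype's actual rule structure or membership functions.

```python
from typing import Sequence


def combine_regime_forecasts(
    memberships: Sequence[float], forecasts: Sequence[float]
) -> float:
    """Zero-order Takagi-Sugeno style combination (illustrative assumption).

    Returns the membership-weighted average of the regime forecasts:
        sum(m_i * f_i) / sum(m_i)
    """
    if len(memberships) != len(forecasts):
        raise ValueError("need one membership value per regime forecast")
    if any(m < 0 for m in memberships):
        raise ValueError("fuzzy memberships must be non-negative")
    total = sum(memberships)
    if total <= 0:
        raise ValueError("at least one membership must be positive")
    weighted = sum(m * f for m, f in zip(memberships, forecasts))
    return weighted / total


# Hypothetical example: one grid point with fuzzy membership in three regimes
# and an initiation likelihood produced by each regime-tuned forecast.
memberships = [0.5, 0.25, 0.25]
forecasts = [0.8, 0.4, 0.2]
combined = combine_regime_forecasts(memberships, forecasts)
print(f"combined initiation likelihood: {combined:.2f}")
```

Because the weights are normalized by their sum, the memberships need not add up to one, and the combined value always lies between the smallest and largest regime forecast.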