As a first step toward improving the ability to predict the location and timing of convective initiation (CI) in the 1-4 hour period from the present, a database of “CI” and “NonCI” events is being formed. This database will initially comprise >50 separate fields for each CI/NonCI event, for thousands of cases. Locations of CI will be determined from WSR-88D reflectivity observations. The database development will quantify the important land surface cover and topography scales that contribute to producing updrafts (at 1-1.5 km above ground level) that eventually lead to convective storms in the coming 1-4 hours. The database relates the CI process to local atmospheric conditions, fusing in situ data with products from satellites and numerical models. It will include fields such as the mixing height, the strength of capping inversions, and the boundary layer winds from the NOAA High-Resolution Rapid Refresh (HRRR) model. Other datasets will allow exploration of the relationships between CI and the real-time observed variables that describe background conditions in the pre-thunderstorm environment. The work relies heavily on NASA and NOAA satellite remote sensing data, including sea and lake/river surface temperatures, cloud products (optical depth, effective radius), land use, elevation, topography, derived fields such as NDVI and leaf area index, and 0-1 hour convective initiation nowcasts, as well as output from established algorithms that retrieve sensible heating, evapotranspiration, and soil moisture.
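To make the idea of a per-event record concrete, the following is a minimal Python sketch of how one CI/NonCI event might be represented before being loaded into a database. The class name `CIEvent`, the helper `to_row`, the field names, the units, and the example values are all hypothetical and chosen only for illustration; the actual database schema and its >50 fields are not specified here.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class CIEvent:
    """One row of a hypothetical CI/NonCI event table (a few of the >50 fields)."""
    event_time: datetime      # time of the radar-based CI/NonCI determination (UTC)
    lat: float                # latitude, degrees north
    lon: float                # longitude, degrees east
    is_ci: bool               # True for a CI event, False for a NonCI event
    mixing_height_m: float    # model-derived mixing height (assumed units: m)
    cap_strength_k: float     # capping inversion strength (assumed units: K)
    bl_wind_speed_ms: float   # boundary layer wind speed (assumed units: m/s)
    ndvi: float               # satellite-derived vegetation index
    elevation_m: float        # terrain elevation (assumed units: m)


def to_row(event):
    """Flatten an event into a plain dict, e.g. for a parameterized INSERT."""
    row = asdict(event)
    # Store the timestamp as an ISO 8601 string so the row is plain data.
    row["event_time"] = event.event_time.isoformat()
    return row


# Illustrative values only; not taken from any real case.
example = CIEvent(
    event_time=datetime(2014, 6, 1, 18, 30, tzinfo=timezone.utc),
    lat=34.7,
    lon=-86.6,
    is_ci=True,
    mixing_height_m=1500.0,
    cap_strength_k=1.2,
    bl_wind_speed_ms=4.5,
    ndvi=0.62,
    elevation_m=190.0,
)
row = to_row(example)
```

Keeping the CI/NonCI label in the same record as the environmental fields is one simple way to make each row directly usable as a labeled training example.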
The training database will then be operated on by statistical machine learning methods, after association rules are formed and high-resolution (~200 m) simulations are performed. These initial exploratory experiments will help identify the more important variables required for the nowcasting classification studies and improve our understanding of the roles these variables play in the initiation of convection. Because datasets with a variety of storage formats, spatial formats (point/raster/vector), and map projections have to be intercompared for statistical relationships, we chose a PostgreSQL geo-database as the tool for homogenizing our datasets to a comparable spatial resolution and coordinate reference. A geo-database also offers the advantage of handling several storage formats, viz. NetCDF, GRIB, and ASCII. Geo-databases also contain several built-in spatial functions (reprojection, interconversion of raster to vector and vice versa, spatial subsetting, spatial buffering, geospatial indexes to speed up spatial queries, etc.). Specifically, we are in the process of developing a data portal using the PostgreSQL database.
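As an illustration of what forming association rules over CI/NonCI events can involve, the sketch below computes the standard support, confidence, and lift of a single rule on a toy set of categorized events. The function name `rule_metrics`, the item labels (`weak_cap`, `high_ndvi`, and so on), and the six toy events are hypothetical; this is not the project's actual rule-mining procedure, only a minimal example of the three textbook measures.

```python
def rule_metrics(events, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    events     : list of sets of categorical items, one set per event
    antecedent : set of items on the left-hand side of the rule
    consequent : set of items on the right-hand side of the rule
    """
    n = len(events)
    if n == 0:
        raise ValueError("events must not be empty")
    n_a = sum(1 for e in events if antecedent <= e)
    n_c = sum(1 for e in events if consequent <= e)
    n_both = sum(1 for e in events if (antecedent | consequent) <= e)
    support = n_both / n                          # P(A and C)
    confidence = n_both / n_a if n_a else 0.0     # P(C | A)
    lift = confidence / (n_c / n) if n_c else 0.0  # P(C | A) / P(C)
    return support, confidence, lift


# Toy, hand-made events; each set holds discretized predictors plus the label.
toy_events = [
    {"weak_cap", "high_ndvi", "CI"},
    {"weak_cap", "low_ndvi", "CI"},
    {"strong_cap", "high_ndvi", "NonCI"},
    {"weak_cap", "high_ndvi", "NonCI"},
    {"strong_cap", "low_ndvi", "NonCI"},
    {"weak_cap", "high_ndvi", "CI"},
]

support, confidence, lift = rule_metrics(toy_events, {"weak_cap"}, {"CI"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 for a rule such as `weak_cap -> CI` would suggest that the antecedent is associated with CI more often than chance, which is the kind of signal that could help rank candidate predictor variables.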
The outcome of this project will include a gridded product at ~5 km resolution, updated every 30 min, that provides significantly improved prediction accuracy for CI within the 1-4 hour timeframe. Plans are to demonstrate the 1-4 hour probabilistic, gridded CI forecasts to National Weather Service forecasters through an existing collaboration with NASA's Short-term Prediction Research and Transition (SPoRT) Center. Our progress on gridded 1-4 hour CI forecast products as of the AMS Conference will be described and discussed.