New models were developed and have run daily since April 2006. Training data consisted of one day per month from March to September 2005 for a domain covering the northern United States and southern Canada. One model with 15-km resolution covering the entire domain was built for each of eight three-hour diurnal periods. There are two predictands derived from observations by the North American Lightning Detection Network: (1) “time-area coverage” of lightning (similar to probability), and (2) number of flashes per three hours. Several predictors from the deep convection parameterization in the GEM regional model are included, plus important environment predictors. Calculations are made on a movable 9×9 grid centered on each grid point at four times in each three-hour diurnal period (t, t+1, t+2, and t+3 hours). This gives 324 data points (9 × 9 × 4) from which to calculate derived predictors, which are statistics of basic predictors: for example, the minimum Showalter index over the 324 points, or the fraction of points with upward deep convection velocity greater than 20 m/s. Data reduction keeps the number of predictors in each model to 50 or fewer. Tree-structured regression, a modern data-mining technique, is used to derive the models. Cross-validation shows the trees fit 80-90% of expected predictand variance. Model trees have 300-700 nodes, allowing for quasi-continuous prediction of both predictands across the whole domain.
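The derived-predictor calculation described above can be illustrated with a minimal sketch. It is not the operational code: the function name `window_stats`, the array layout (four hourly fields stacked as `(4, ny, nx)`), and the clipping of the window at the domain edges are all assumptions made for illustration. Only the 9×9 window, the four times, and the two example statistics come from the text.

```python
import numpy as np

def window_stats(showalter, updraft, i, j, half=4, threshold=20.0):
    """Derived predictors for grid point (i, j).

    showalter, updraft: arrays of shape (4, ny, nx) holding the basic
    predictors at hours t, t+1, t+2, t+3 (hypothetical layout).
    half=4 gives a 9x9 window; threshold is the updraft limit in m/s.
    """
    ny, nx = showalter.shape[1:]
    # Assumption: the window is clipped where it would leave the domain.
    i0, i1 = max(i - half, 0), min(i + half + 1, ny)
    j0, j1 = max(j - half, 0), min(j + half + 1, nx)
    si = showalter[:, i0:i1, j0:j1]
    w = updraft[:, i0:i1, j0:j1]
    return {
        "n_points": int(si.size),                  # 324 for a full window
        "min_showalter": float(si.min()),          # minimum over the window
        "frac_updraft_gt_thr": float((w > threshold).mean()),
    }
```

Away from the domain edges the window holds 9 × 9 × 4 = 324 values, matching the count given above. Statistics of this kind would then serve as candidate predictors for the regression trees.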
Prediction is year-round, and forecast coverage is extended to regions not included in the training data, such as northern Canada and the southern United States. Use by Canadian forecasters has become widespread for thunderstorm prediction in public forecasts and convective area depiction in aviation forecasts. There has been considerable interest from forestry groups in using the forecasts for 1- to 2-day fire likelihood predictions.