We model tornado reporting bias east of the Rocky Mountains during 1975–2014 with various combinations of the following variables: 1) population density; 2) terrain ruggedness; 3) road density; and distance to the nearest 4) 100K city, 5) 5K city, 6) WFO, 7) interstate, 8) WFO or 100K city, and 9) interstate or 5K city. In cross-validation tests, the combination of variables 1, 2, 4, 6, and 9 accounts for the most variance in reporting bias. Estimates of large-scale [O(1000 km)] reporting bias are not unduly sensitive to the number of regression variables, indicating that useful information can be gained from limited geopolitical data. However, cross-validation tests and geographic maps of modeled bias suggest that more complex regressions substantially improve bias estimates at smaller scales. The resulting improvements to tornado hazard models would be valuable to forecasters, severe storm and climate scientists, and insurance/reinsurance companies.
The regressions suggest that only 46% of tornadoes that actually occurred in the analysis domain were reported, with the reporting rate decreasing by half as distance to the nearest 100K+ city increases to 50 km. Reporting biases are especially pronounced earlier in the record and for shorter-track tornadoes, but remain nontrivial even for more recent and longer-track tornadoes. Underestimation of tornado frequency increases with damage rating; for example, the actual frequency of EF/F 3-5 tornadoes appears to be nearly three times that in the record. This underscores the problem of under-rating tornadoes in rural areas.
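
As a rough illustration of the arithmetic behind these figures, a reported fraction can be converted into a multiplicative correction factor (estimated actual count divided by reported count). The sketch below is not the regression model itself; the function name `correction_factor` is hypothetical, and the only input taken from the text is the 46% domain-wide reporting fraction.

```python
def correction_factor(reported_fraction: float) -> float:
    """Return the ratio of estimated actual count to reported count.

    Hypothetical helper for illustration only: if a fraction f of all
    tornadoes is reported, the actual count is the reported count / f.
    """
    if not 0.0 < reported_fraction <= 1.0:
        raise ValueError("reported_fraction must be in (0, 1]")
    return 1.0 / reported_fraction


# Domain-wide reporting fraction stated in the text: 46%.
overall = correction_factor(0.46)
print(f"Actual/reported ratio at 46% reporting: {overall:.2f}")
```

Under this simple reading, a 46% reporting fraction implies that actual tornado frequency is a little more than twice the reported frequency. The near-threefold factor quoted for EF/F 3-5 tornadoes is larger, reflecting under-rating in addition to non-reporting.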