Parameter importance is tested using ERA5 pseudo-proximity soundings matched to hail observations from a new combined dataset constructed from multiple sources, including the Storm Prediction Center (SPC) Storm Data, the SPC Storm Mode Dataset, the Community Collaborative Rain, Hail and Snow Network (CoCoRaHS), and Meteorological Phenomena Identification Near the Ground (mPING) records. Using a KD-tree approach to enforce spatial independence, these sources in combination yield 80,000 reliable and independent cases over the past 25 years. This choice of observational independence has important implications for the fitted environmental parameters. Unlike prior efforts, our dataset includes a variety of sizes, ranging from 6.4 mm to >100.2 mm in maximum diameter. These observations are paired with convective parameters, including both existing formulations and new concepts derived from recent modeling studies. Results show that the choice of null dataset used to predict hail matters, and that prior approaches have likely failed because of strong overlap in the distributions of existing parameters. Through this analysis, we find that predictability exists for whether hail will be severe (>2.5 cm) or not, but that with existing parameters it becomes more challenging to discriminate larger size categories if the underlying mesoscale and synoptic regime is not considered. These results suggest that a multi-model and profile-aware approach is necessary to obtain reliable environment-based predictions of hail size.
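
To illustrate what a KD-tree approach to spatial independence can look like, the following is a minimal sketch and not the procedure used in this study. It assumes reports are given as projected (x, y) coordinates in kilometres, that the largest report in a neighbourhood is the one retained, and that any two retained reports must be more than a chosen radius apart. The function names (`build_kdtree`, `query_radius`, `thin_reports`), the priority rule, and the 10 km radius are all hypothetical choices made for illustration; the time dimension is ignored.

```python
from math import hypot


def build_kdtree(points, indices=None, depth=0):
    """Build a 2-D KD-tree as nested tuples: (point_index, axis, left, right)."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % 2  # alternate between x (0) and y (1)
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2
    return (
        indices[mid],
        axis,
        build_kdtree(points, indices[:mid], depth + 1),
        build_kdtree(points, indices[mid + 1:], depth + 1),
    )


def query_radius(node, points, center, radius, found):
    """Append to `found` the indices of all points within `radius` of `center`."""
    if node is None:
        return found
    idx, axis, left, right = node
    px, py = points[idx]
    if hypot(px - center[0], py - center[1]) <= radius:
        found.append(idx)
    diff = center[axis] - points[idx][axis]
    near, far = (left, right) if diff < 0 else (right, left)
    query_radius(near, points, center, radius, found)
    # The far side can only hold matches if the splitting line is within reach.
    if abs(diff) <= radius:
        query_radius(far, points, center, radius, found)
    return found


def thin_reports(points, sizes, radius):
    """Greedy thinning: visit reports from largest to smallest (assumed rule)
    and keep one only if no already-kept report lies within `radius`."""
    tree = build_kdtree(points)
    order = sorted(range(len(points)), key=lambda i: -sizes[i])
    kept = set()
    for i in order:
        neighbours = query_radius(tree, points, points[i], radius, [])
        if not any(j in kept for j in neighbours):
            kept.add(i)
    return sorted(kept)


# Hypothetical reports: (x_km, y_km) positions and maximum diameters in mm.
points = [(0.0, 0.0), (1.0, 0.0), (50.0, 0.0), (50.5, 0.5), (200.0, 200.0)]
sizes = [25.0, 45.0, 30.0, 30.0, 19.0]
independent = thin_reports(points, sizes, radius=10.0)
```

In this toy example the two clusters of nearby reports each collapse to a single retained case, while the isolated report is kept. Both the retention rule and the radius change which cases survive, which is one way the choice of independence criterion can affect the parameters fitted downstream.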

