Once prediction uncertainty is accepted, how should the performance of air quality models be evaluated? The question is interesting because most air quality models predict only the 1st moment (the mean) of the possible outcomes for the stated conditions for each hour of the simulation. ASTM D6589 proposes forming groups of hours having similar conditions and then testing the model's ability to accurately represent each group's average result (1st moment). Implicit in this approach is the assumption that each group's sample values (and therefore the group means as well) are representative of 1st-order air quality model results.
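The grouping idea can be illustrated with a minimal sketch. It is not the ASTM D6589 procedure itself, only an illustration of comparing group averages. The record layout, the `stability` grouping key, and the concentration values are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from statistics import mean


def group_mean_comparison(records, key):
    """Group hourly records by `key`; return {group: (obs_mean, pred_mean)}."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)
    return {
        name: (
            mean([r["obs"] for r in members]),
            mean([r["pred"] for r in members]),
        )
        for name, members in groups.items()
    }


# Hypothetical hourly records; stability class stands in for "similar conditions".
hours = [
    {"stability": "unstable", "obs": 12.0, "pred": 10.0},
    {"stability": "unstable", "obs": 8.0, "pred": 10.0},
    {"stability": "stable", "obs": 30.0, "pred": 25.0},
    {"stability": "stable", "obs": 20.0, "pred": 27.0},
]

result = group_mean_comparison(hours, key=lambda r: r["stability"])
for name, (obs_mean, pred_mean) in result.items():
    print(f"{name}: observed mean {obs_mean}, predicted mean {pred_mean}")
```

Individual hours may differ considerably from the prediction while the group averages agree closely, which is why the comparison is made on group means rather than hour by hour.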
What if models were challenged to predict both the 1st and 2nd moments of the observations? The basic tenet is to challenge a model directly to characterize what its physics is presumably capable of simulating. This presentation reviews how this philosophy of model evaluation has been applied to models of short-range dispersion of inert tracers, and the pitfalls of testing the ability of 1st-order dispersion models to predict extreme maximum concentration values. We also explore how this philosophy can be applied to the assessment of regional-scale air quality models that predict sulfate and nitrate concentration values.
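As a rough illustration of what testing both moments might look like, the sketch below compares a predicted mean and variance against the sample mean and sample variance of the observations in each group. The group names, the numbers, and the two summary measures (`mean_bias`, `variance_ratio`) are assumptions made for this example and are not taken from the work reviewed here.

```python
from statistics import mean, variance

# Hypothetical observed concentrations, already grouped by similar conditions.
obs_by_group = {
    "unstable": [12.0, 8.0, 10.0],
    "stable": [30.0, 20.0, 25.0],
}

# Hypothetical model output: predicted (mean, variance) for each group.
pred_moments = {
    "unstable": (10.0, 2.0),
    "stable": (26.0, 20.0),
}

comparison = {}
for name, obs in obs_by_group.items():
    obs_mean = mean(obs)          # 1st moment of the observations
    obs_var = variance(obs)       # 2nd central moment (sample variance, n - 1)
    pred_mean, pred_var = pred_moments[name]
    comparison[name] = {
        "mean_bias": pred_mean - obs_mean,
        "variance_ratio": pred_var / obs_var,
    }

for name, stats in comparison.items():
    print(name, stats)
```

A variance ratio well below one would suggest that the model understates the spread of the observations, even where its group mean is unbiased.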