How not to fool yourself with statistics

Boehm, Albert R.

The use of statistics is a vital part of the atmospheric sciences. The word "statistics" is found in nearly half the titles or abstracts of American Meteorological Society publications. Some of the most advanced statistical methods have been developed by atmospheric scientists. Nevertheless, papers in atmospheric science contain numerous examples of the misuse of statistics, statistical blunders, and the complete absence of statistical analysis when it is needed.

Here are the most common problems and some ways to overcome them.

The single biggest problem is the absence of a statistical design. Before collecting data, ask what question or questions need to be answered. Run approximate or simulated data through the analysis to see whether the analysis is likely to answer the question.
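One way to run simulated data through an analysis before collecting anything is a small Monte Carlo power check. The sketch below is illustrative only: the function name `simulated_power`, the Gaussian data model, the effect size, the noise level, and the simple two-sample z-type test are all assumptions chosen for the example, not part of the original abstract.

```python
import random
import statistics
from math import sqrt

def simulated_power(effect, sigma, n, trials=500, z_crit=1.96, seed=0):
    """Fraction of simulated experiments in which a two-sample z-type test
    detects a true mean difference `effect`, given `n` samples per group
    drawn from Gaussians with standard deviation `sigma` (assumed model)."""
    rng = random.Random(seed)
    detections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(effect, sigma) for _ in range(n)]
        # Standard error of the difference between the two sample means
        se = sqrt(statistics.variance(a) / n + statistics.variance(b) / n)
        z = (statistics.fmean(b) - statistics.fmean(a)) / se
        if abs(z) > z_crit:
            detections += 1
    return detections / trials

# Same hypothetical effect and noise, two candidate sample sizes
power_small = simulated_power(effect=0.5, sigma=2.0, n=10)
power_large = simulated_power(effect=0.5, sigma=2.0, n=400)
print(f"power with n=10:  {power_small:.2f}")
print(f"power with n=400: {power_large:.2f}")
```

If the simulated detection rate is low for the sample size that can realistically be collected, the planned analysis is unlikely to answer the question, and the design should change before any data are gathered.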

Natural variability is underestimated. Think in terms of a distribution of values rather than a constant.

A relationship is presumed to be based on first principles. Keep in mind that most parameterizations in numerical weather prediction are statistical.

Often the wrong statistical test is used. Take into account serial and spatial correlation.
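To illustrate why serial correlation matters, the sketch below estimates an effective sample size for a time series using a common first-order autoregressive approximation, n_eff = n (1 - r1) / (1 + r1), where r1 is the lag-1 autocorrelation. This approximation applies to the uncertainty of a sample mean and is only one of several possible adjustments; the synthetic AR(1) series and the function names are assumptions made for the example.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def effective_sample_size(x):
    """AR(1) approximation: n_eff = n * (1 - r1) / (1 + r1).
    Negative r1 is clipped to zero so n_eff never exceeds n."""
    r1 = max(lag1_autocorr(x), 0.0)
    return len(x) * (1.0 - r1) / (1.0 + r1)

# Synthetic, strongly persistent series (assumed AR(1) with phi = 0.8)
rng = random.Random(1)
phi = 0.8
series = [0.0]
for _ in range(1999):
    series.append(phi * series[-1] + rng.gauss(0.0, 1.0))

n_eff = effective_sample_size(series)
print(f"nominal n = {len(series)}, effective n = {n_eff:.0f}")
```

A test that treats all 2000 values as independent would greatly overstate the significance of any result drawn from such a series.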

An incorrect statistical hypothesis is assumed. For example, asking "do these data come from a normal distribution?" poses the wrong question. Enough good data will always show that they do not! Ask instead what the probable error is if a normal distribution is used.
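As a rough illustration of asking "how large is the error?" rather than "is it normal?", the sketch below fits a normal distribution to a skewed sample and compares the fitted exceedance probability of a threshold with the empirical frequency. The gamma-distributed sample, the threshold of 6.0, and the variable names are hypothetical choices for the example.

```python
import random
from statistics import NormalDist, fmean, stdev

rng = random.Random(2)
# Hypothetical skewed sample (gamma with shape 2, scale 1)
sample = [rng.gammavariate(2.0, 1.0) for _ in range(20000)]

# Normal distribution with the same mean and standard deviation
fitted = NormalDist(fmean(sample), stdev(sample))

threshold = 6.0
p_normal = 1.0 - fitted.cdf(threshold)                            # normal assumption
p_empirical = sum(v > threshold for v in sample) / len(sample)    # observed frequency

print(f"P(X > {threshold}) assuming normal: {p_normal:.4f}")
print(f"P(X > {threshold}) empirical:       {p_empirical:.4f}")
print(f"ratio empirical / normal:          {p_empirical / p_normal:.1f}")
```

Whether an error of this size is acceptable depends on the use: it may be negligible for estimating a mean and serious for estimating the frequency of extremes.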

Descriptive statistics are confused with confirmatory statistics. Use Bayesian statistics to answer the right question and handle small samples properly.
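A minimal sketch of a Bayesian treatment of a small sample follows, assuming a binomial outcome (for example, 3 correct forecasts out of 4) and a uniform Beta(1, 1) prior. The counts, the prior, and the function names are illustrative assumptions; the credible interval is approximated by sampling from the posterior rather than computed exactly.

```python
import random

def beta_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for a binomial success rate
    under a Beta(prior_a, prior_b) prior."""
    return prior_a + successes, prior_b + failures

def credible_interval(a, b, level=0.9, draws=20000, seed=3):
    """Approximate central credible interval from posterior draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int((1.0 - level) / 2.0 * draws)]
    upper = samples[int((1.0 + level) / 2.0 * draws) - 1]
    return lower, upper

a, b = beta_posterior(successes=3, failures=1)
post_mean = a / (a + b)
lo, hi = credible_interval(a, b)
print(f"posterior mean success rate: {post_mean:.2f}")
print(f"approximate 90% credible interval: ({lo:.2f}, {hi:.2f})")
```

The wide interval makes the limits of a four-case sample explicit, which a bare "75% success rate" does not.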

Too much faith is put in "real" data. All measurements contain instrument error, a host of assumptions about scale, and archiving idiosyncrasies.

Spatial statistics are plotted on an improper map projection. Use equal area projections when area is of concern.
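A closely related issue, offered here as an illustration rather than as the author's own example, arises when statistics are averaged on a regular latitude-longitude grid: cells near the poles cover far less area than cells near the equator, just as they are exaggerated on a non-equal-area map. The sketch below weights each latitude band by the cosine of its central latitude; the 5-degree grid and the synthetic field are assumptions.

```python
from math import cos, radians

def area_weighted_mean(values_by_lat):
    """Mean of zonal values weighted by cos(latitude), which is
    proportional to cell area on a regular latitude-longitude grid."""
    weights = {lat: cos(radians(lat)) for lat in values_by_lat}
    total = sum(weights.values())
    return sum(values_by_lat[lat] * weights[lat] for lat in values_by_lat) / total

# 5-degree latitude bands with centres from -87.5 to 87.5
lats = [-87.5 + 5.0 * i for i in range(36)]
# Synthetic field: 1 poleward of 60 degrees, 0 elsewhere
field = {lat: 1.0 if abs(lat) > 60.0 else 0.0 for lat in lats}

naive = sum(field.values()) / len(field)
weighted = area_weighted_mean(field)
print(f"unweighted fraction:    {naive:.3f}")
print(f"area-weighted fraction: {weighted:.3f}")
```

The unweighted figure suggests the feature covers a third of the globe, while the area-weighted figure is well under half of that, the same exaggeration a reader would see on a map that is not equal-area.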

Precise imagery gives a false impression of accuracy, for example in certain satellite retrievals and in long-range forecasts. When derived pictures contain uncertainty, make sure the picture gives a measure of that uncertainty.

The above problems result from insufficient statistical training and a duality in attitude about statistical tools. There is, for example, a segment of scientists in psychology who regard statistics as trivial: "Statistical significance testing retards the growth of scientific knowledge...". In climatology, Dennis Shea pointed out that significance tests are often viewed as some type of ritual performed at the end of a study.
