8.3 Ingredients of Statistical Hypothesis Testing – and their Significance (Invited Presentation)

Wednesday, 13 January 2016: 9:30 AM
Room 226/227 (New Orleans Ernest N. Morial Convention Center)
Hans von Storch, Helmholtz-Zentrum Geesthacht Centre for Materials and Coastal Research, Geesthacht, Germany

Handout (665.2 kB)

Before applying a formula for asserting the “significance” of an observation, some preparations are needed, chiefly the formulation of a null-hypothesis, which (in most cases) is hoped to be rejected. This null-hypothesis includes the assumption that the variable considered is random and governed by a certain probability distribution. To find out whether an “observation” of interest contradicts the null-hypothesis, we determine whether this observation lies in the tails of the probability distribution used in the null-hypothesis. If it does, we call the tested observation “significant”, or more precisely, “significantly inconsistent with the null-hypothesis”. For doing so, we must in principle be able to identify all possible outcomes of the random variable. This is a non-trivial assumption; if a group of outcomes is for whatever reason not accessible, we reject the null-hypothesis that the observation is “a member of this group” too often. When we sample all admissible outcomes, we may estimate the probability distribution, and the quality of the estimation process is taken into account when conducting the hypothesis test. But if the sampling process is biased in some way, the uncertainty of the estimation may be underestimated. This is the case when data are (even weakly) serially correlated, as is almost always true in climatic applications. The technical part of the testing, namely the calculation of the measure of consistency (the “test statistic”), is in most cases simple, once the probability distribution is known or can be generated through Monte-Carlo simulation. A very common problem is that of the “Mexican Hat”, namely that the null-hypothesis is formulated after the variable to be tested is already known to be a rare outcome; in addition, the issue of multiple tests is not always taken care of sufficiently.
Another problem is that the word “significance”, which is used for indicating that the null-hypothesis is unlikely to apply to the tested observation, is understood in its colloquial meaning, namely that the inconsistency is relevant, even if there is no such link. In the presentation, the general principle of hypothesis testing is worked out, the assumptions are made explicit, and examples of disregarding these assumptions are discussed.
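The remark on serial correlation can be illustrated with a small Monte-Carlo experiment. The sketch below is not taken from the presentation; it assumes, purely for illustration, that serial correlation is modelled by a first-order autoregressive process with lag-1 coefficient `alpha`, and that the null-hypothesis “the mean is zero” is tested with an ordinary one-sample t-test that treats the samples as independent. All function names and parameter values are hypothetical. Since the null-hypothesis is true in every simulated series, the rejection rate should stay near the nominal 5% level; how far it drifts from that level as `alpha` grows indicates how strongly the uncertainty of the estimate is underestimated.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 99 degrees of freedom
# (matches the default sample size n = 100 used below).
T_CRIT_95 = 1.984


def ar1_series(n, alpha, rng):
    """Return n values of a zero-mean AR(1) process x[t] = alpha * x[t-1] + noise.

    AR(1) is an assumed stand-in for "serially correlated data".
    """
    # Draw the starting value from the stationary distribution of the process.
    x = rng.gauss(0.0, 1.0 / math.sqrt(1.0 - alpha ** 2))
    series = []
    for _ in range(n):
        x = alpha * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def rejects_zero_mean(series):
    """Naive one-sample t-test of 'mean = 0' that assumes independent samples."""
    n = len(series)
    t = statistics.mean(series) / (statistics.stdev(series) / math.sqrt(n))
    return abs(t) > T_CRIT_95


def rejection_rate(alpha, n=100, trials=2000, seed=1):
    """Fraction of simulated series for which the (true) null-hypothesis is rejected."""
    rng = random.Random(seed)
    hits = sum(rejects_zero_mean(ar1_series(n, alpha, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for alpha in (0.0, 0.3, 0.6):
        print(f"alpha = {alpha:.1f}: rejection rate = {rejection_rate(alpha):.3f}")
```

With `alpha = 0.0` the samples are independent and the test behaves as designed. For positive `alpha` the same test rejects a true null-hypothesis more often than the nominal level suggests, because the effective number of independent samples is smaller than `n`.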