J3.12 Semiautomated quality control of historical sub-daily surface synoptic meteorological data: Application of attributes control methodology

Thursday, 23 June 2005: 11:00 AM
South Ballroom (Hilton DeSoto)
Daniel Y. Graybeal, Cornell Univ., Ithaca, NY

For NOAA's Climate Database Modernization Program, quality assurance (QA) techniques are being developed and applied to historical sub-daily surface meteorological data digitized from original paper observation forms, prior to archival at the National Climatic Data Center. As with most QA techniques, the focus of methods developed for these data has been largely on outlier identification. More recently, however, systematic errors have been increasingly noted in these datasets, so the focus has shifted toward their identification and rectification. The data in question span 1892-1948, range from hourly to twice-daily temporal resolution, include full synoptic reports, and come from 60-240 stations around the United States, including Alaska and Hawaii.

These systematic errors have typically involved substitution of dew point depression for dew point, substitution of station pressure for sea-level pressure, or column shifts. In temporal extent they range from one day or less to twelve months or longer. This temporal scale falls between the single or few observations sought by traditional, outlier-focused analysis and the multi-year sequences identified by what is usually called inhomogeneity analysis. Application of a variation on what is known in the statistical quality control literature as attributes control methodology is described. Flags generated from record-by-record processing are counted by station-month, and these counts are then analyzed for clustering and for runs of high counts. The process is semiautomated and guides a technician in identifying precise start and end times of systematic errors, as well as an appropriate correction instruction for each pattern so identified.
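The abstract does not give implementation details, so the sketch below is only a rough illustration of the station-month counting and run analysis described above. It applies a c-chart-style control limit (a standard attributes-control technique for count data) to monthly flag counts per station and reports runs of consecutive out-of-control months as candidate systematic-error spans for a technician to review. The column names, the Poisson-based control limit, and the run-length threshold are all assumptions, not the procedure actually used.

```python
import numpy as np
import pandas as pd

def attributes_control_scan(flags, run_length=3):
    """Scan station-month QA flag counts for systematic-error candidates.

    flags: DataFrame with columns ['station', 'year', 'month', 'n_flags'],
           where n_flags is the number of record-level QA flags raised
           in that station-month (hypothetical layout).
    run_length: minimum number of consecutive out-of-control months
                to report as a run (assumed threshold).
    """
    candidates = []
    for station, grp in flags.groupby('station'):
        grp = grp.sort_values(['year', 'month']).reset_index(drop=True)

        # c-chart control limit: mean count plus three standard deviations,
        # using the Poisson approximation (sigma = sqrt(c-bar)).
        c_bar = grp['n_flags'].mean()
        ucl = c_bar + 3.0 * np.sqrt(c_bar)
        high = grp['n_flags'] > ucl

        # Collect runs of consecutive months exceeding the control limit.
        run_start = None
        for i, is_high in enumerate(high):
            if is_high and run_start is None:
                run_start = i
            elif not is_high and run_start is not None:
                if i - run_start >= run_length:
                    candidates.append((station,
                                       tuple(grp.loc[run_start, ['year', 'month']]),
                                       tuple(grp.loc[i - 1, ['year', 'month']])))
                run_start = None
        if run_start is not None and len(grp) - run_start >= run_length:
            candidates.append((station,
                               tuple(grp.loc[run_start, ['year', 'month']]),
                               tuple(grp.loc[len(grp) - 1, ['year', 'month']])))

    # Each entry is (station, (start year, start month), (end year, end month)),
    # intended as a starting point for manual identification of exact start and
    # end times and of an appropriate correction instruction.
    return candidates
```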
