A Comprehensive Single-Station Quality Control Process for Historical Weather Data

Kruk, Michael C.

NCDC's Climate Database Modernization Program Forts Database Build Project has scanned and indexed hundreds of thousands of pages of 19th-century meteorological records and journals. These daily observations of temperature, precipitation, humidity, cloud cover, wind speed, and other elements were made both as part of the United States military record and by volunteers under the supervision of the Smithsonian Institution, the Signal Service, and the Department of Agriculture, among others. Digitization of the records for the 163 highest-priority stations is well underway, using an integrated process that involves keying of metadata and daily data, and quality control.

To ensure the quality of this dataset, rigorous quality control procedures have been implemented. Because the historical records are spatially disparate, these procedures are based on a single-station data verification strategy. The method employs a series of tests spanning eight quality assurance categories (gross translation and error checks; metadata element agreement; element duplication and consistency; data completeness; consistency of the keyed daily and monthly data; internal temperature and precipitation consistency of the monthly means/totals; internal extremes and climatological consistency of daily values; and consistency of daily values calculated from other data) to objectively flag suspicious monthly and daily data values for further review. Systematic issues, such as poor wet bulb observations in winter, are automatically corrected or flagged in the final data. Other suspicious values are manually verified against the scanned daily data forms by experienced climatologists. A value under review may be verified (either outright or as a genuine error made by the observer), corrected for a keying error, set to missing, or deleted from the database. The verification process is handled through a set of web-based tools that apply each test in a designated order and record the result of each individual verification with flags that identify the reason for any change made. The final data can therefore be reconstructed, if necessary, using only the output from the data entry process and the set of corrections for the station, applied in order. Although designed for quality control purposes, these procedures and their results for each station also give an empirical indication of data quality for users interested in climatological analysis.
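As a rough illustration of the flag-then-correct workflow described above, the following Python sketch runs one internal-consistency test (daily maximum temperature below daily minimum) and then rebuilds the final data from the keyed values plus an ordered list of corrections. It is not the project's actual tooling: the element names (TMAX, TMIN), the action names, the reason flag "K", and the sample values are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (ISO date, element name); layout is an assumption


@dataclass(frozen=True)
class Correction:
    seq: int                           # order in which the correction was recorded
    date: str                          # date of the observation
    element: str                       # hypothetical element name, e.g. "TMAX"
    action: str                        # "verify", "correct", "set_missing", "delete"
    new_value: Optional[float] = None  # only used when action == "correct"
    reason_flag: str = ""              # hypothetical code giving the reason


def flag_tmax_below_tmin(data: Dict[Key, Optional[float]]) -> List[str]:
    """Return the dates on which the daily maximum is below the daily minimum."""
    flagged = []
    for date in sorted({d for d, _ in data}):
        tmax = data.get((date, "TMAX"))
        tmin = data.get((date, "TMIN"))
        if tmax is not None and tmin is not None and tmax < tmin:
            flagged.append(date)
    return flagged


def reconstruct(keyed: Dict[Key, Optional[float]], corrections: List[Correction]):
    """Apply corrections in recorded order to a copy of the keyed data."""
    final = dict(keyed)
    flags: Dict[Key, str] = {}
    for c in sorted(corrections, key=lambda c: c.seq):
        key = (c.date, c.element)
        if c.action == "correct":
            final[key] = c.new_value
        elif c.action == "set_missing":
            final[key] = None           # None stands in for a missing value
        elif c.action == "delete":
            final.pop(key, None)
        elif c.action != "verify":      # "verify" leaves the value unchanged
            raise ValueError(f"unknown action: {c.action}")
        flags[key] = c.reason_flag
    return final, flags


# Made-up keyed data: the first day looks like a digit transposition (13 vs 31).
keyed = {
    ("1872-01-05", "TMAX"): 13.0,
    ("1872-01-05", "TMIN"): 31.0,
    ("1872-01-06", "TMAX"): 40.0,
    ("1872-01-06", "TMIN"): 28.0,
}
suspicious = flag_tmax_below_tmin(keyed)

# A reviewer checks the scanned form and records a keying-error correction.
corrections = [Correction(1, "1872-01-05", "TMAX", "correct", 43.0, "K")]
final, flags = reconstruct(keyed, corrections)
```

Because the keyed data are never modified in place and every change is stored as a sequenced record, the same final values can be regenerated at any time by replaying the corrections, which is the property the abstract relies on.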

Joint Poster Session JP2.23 A Comprehensive Single-Station Quality Control Process for Historical Weather Data