Handling valid date ranges was a critical concern and a shortcoming of previous systems. Each component of a station - its location, equipment, observers, reporting methods, even its identity - can change independently of the others, so each component record has its own period of validity, expressed as a beginning and an ending date. These date ranges may overlap the valid dates of other data items, which makes queries and reporting complex. The paper discusses in depth the approach taken to managing and querying these date pairs.
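As a rough illustration of the kind of query these date pairs imply, the following minimal sketch models each component record with its own validity period and reconstructs a station's state on a given day. It is not the system's actual schema or query logic: the names ValidRecord and station_as_of, the use of None for an open-ended period, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ValidRecord:
    """One component value with its own period of validity (hypothetical model)."""
    component: str
    value: str
    begin: date
    end: Optional[date] = None  # assumption: None means "still in effect"

    def valid_on(self, day: date) -> bool:
        # Both endpoints are treated as inclusive.
        return self.begin <= day and (self.end is None or day <= self.end)

    def overlaps(self, other: "ValidRecord") -> bool:
        # Two closed intervals overlap when each begins before the other ends.
        self_end = self.end or date.max
        other_end = other.end or date.max
        return self.begin <= other_end and other.begin <= self_end


def station_as_of(records, day):
    """Return the component values in effect on the given day."""
    return {r.component: r.value for r in records if r.valid_on(day)}


# Invented sample data: components change independently of one another.
records = [
    ValidRecord("location", "site A", date(1950, 1, 1), date(1979, 12, 31)),
    ValidRecord("location", "site B", date(1980, 1, 1)),
    ValidRecord("equipment", "gauge X", date(1965, 6, 1), date(1990, 5, 31)),
]

snapshot = station_as_of(records, date(1985, 7, 4))
```

Here the equipment record spans the change of location, so a report for a single day has to select one row per component rather than assume that all components share the same dates.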
Station information comes from many sources, and knowing the source of a given piece of information is important. Some sources are formal, like a National Weather Service Form B-44; others are less so, like ad hoc research or e-mail confirmation.
Even the best-intended corrections sometimes overwrite valid data. While mistakes must be corrected, discarding original data values can be perilous. There is no audit trail unless the previous value and its information source are retained when an erroneous data value is corrected. The technique used to maintain a change log is discussed.
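One simple way to picture such a change log is sketched below: every correction appends the previous value and its source to a log before the new value takes effect. This is an illustrative assumption, not the technique the paper describes; AuditedRecord, ChangeEntry, the field name elevation_ft, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ChangeEntry:
    """What was replaced, what replaced it, and where each value came from."""
    field_name: str
    old_value: object
    old_source: str
    new_value: object
    new_source: str
    changed_at: datetime


@dataclass
class AuditedRecord:
    """Current values plus a log of every overwritten value (hypothetical)."""
    values: dict = field(default_factory=dict)  # field name -> (value, source)
    log: list = field(default_factory=list)

    def set(self, field_name, value, source):
        old = self.values.get(field_name)
        if old is not None:
            # Retain the previous value and its source before overwriting.
            self.log.append(ChangeEntry(
                field_name, old[0], old[1], value, source,
                datetime.now(timezone.utc),
            ))
        self.values[field_name] = (value, source)


# Invented example: a formal source is later corrected by an informal one.
rec = AuditedRecord()
rec.set("elevation_ft", 1250, "Form B-44")
rec.set("elevation_ft", 1520, "e-mail confirmation")
```

After the second call, the current value is 1520, while the log still holds 1250 together with its original source, so the correction can be reviewed or reversed later.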
While a well-normalized relational structure provides good data integrity, it can present performance challenges due to the large number of tables involved in queries. With a nod to the realities of using a system in a production environment, we look at some of the techniques used to improve query performance.