5.10
Cluster analysis of meteorological states to understand the weekend-weekday ozone response in the San Francisco, CA Bay Area

Wednesday, 1 February 2006: 4:30 PM
A407 (Georgia World Congress Center)
Scott Beaver, Bay Area Air Quality Management District, San Francisco, CA; and A. Palazoglu and S. Tanrikulu

A novel clustering scheme is developed for environmental time series data. The method is used to determine recurring patterns in hourly, ground-level wind observations for the San Francisco, CA Bay Area for the summer seasons of 1996-2003. The ground-level patterns serve as surrogates for identifying the prevailing synoptic meteorological state. These meteorological regimes have varying potential for ozone buildup in the densely populated Bay Area. Two distinct wind patterns conducive to elevated ozone levels are detected. These two clusters show marked differences in the concentrations of ozone and its precursors between weekdays and weekends. Subregions of the study domain differ in their ozone sensitivity to precursor concentrations, as shown by their different responses to the change in emissions characteristics from weekdays to the weekend.

Environmental time series data sets contain information at many time scales, some of which are periodic. Information at the synoptic time scale must be statistically separated from the other scales, especially the prevalent diurnal and annual time scales, to determine relevant patterns associated with the day-to-day variability in weather and air quality. The proposed clustering algorithm has two stages. In the first stage, a moving window is used to break the time series spanning the observation period into subsets. Each of these subsets or windows can then be clustered in the next stage of the algorithm.
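The first stage described above can be illustrated with a minimal Python sketch. The function name `moving_windows` and the parameters `length` and `step` are illustrative assumptions, not names from the presented work, and the toy series stands in for real hourly observations.

```python
from typing import List, Sequence


def moving_windows(series: Sequence[float], length: int, step: int) -> List[List[float]]:
    """Split a series into windows of `length` samples, advancing `step` samples each time.

    Windows overlap whenever `step` is smaller than `length`.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(series) - length
    return [list(series[start:start + length])
            for start in range(0, last_start + 1, step)]


# Toy example: 10 samples, windows of 4 samples spaced 2 samples apart.
hourly = list(range(10))
windows = moving_windows(hourly, length=4, step=2)
```

Each element of `windows` is one subset of the series that would be passed to the clustering stage.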

The moving window accounts for specific properties of aerometric data and allows time series clustering methods to be extended to environmental data. Because environmental observations are periodic, periodic biases can emerge in the cluster solution if care is not taken to prevent them. The window length must be set to an integer multiple of the diurnal cycle period of 24 hours. This ensures that each window of data contains the same number of observations from each hour of the day and cannot be associated with any particular phase of the diurnal cycle. The window length also determines the time scale at which patterns are detected, and must be consistent with the goal of identifying synoptic patterns. Here, a window length of 2 days is chosen to meet both design criteria. After the windows are generated, each is autoscaled to remove any seasonal mean shifts between them.
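The two window design rules can be sketched in Python as follows. This is only an illustration under stated assumptions: autoscaling is taken here to mean scaling each variable within a window to zero mean and unit variance using the population standard deviation, and the names `check_window_length` and `autoscale` are hypothetical.

```python
from statistics import mean, pstdev
from typing import List

HOURS_PER_DAY = 24


def check_window_length(length_hours: int) -> None:
    """Reject window lengths that are not a whole number of diurnal cycles."""
    if length_hours <= 0 or length_hours % HOURS_PER_DAY != 0:
        raise ValueError("window length must be a positive multiple of 24 hours")


def autoscale(window: List[List[float]]) -> List[List[float]]:
    """Scale each column (variable) of one window to zero mean and unit variance.

    `window` has one row per hour and one column per measured variable.
    Assumption: population standard deviation; constant columns are left centered.
    """
    columns = list(zip(*window))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # avoid division by zero
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in window]


# Toy 2-day window (48 hourly rows, 2 synthetic variables).
check_window_length(48)
raw = [[float(h % 24), 10.0 + 0.5 * h] for h in range(48)]
scaled = autoscale(raw)
```

Because every window is scaled on its own, a window from early summer and one from late summer end up on the same footing regardless of their seasonal mean levels.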

The spacing between the windows determines the temporal resolution of the analysis. More closely spaced windows better resolve the cluster labels in time but also increase the computational cost of the analysis. Setting the window spacing to a small fraction of the window length gives a large fraction of overlapping samples between windows adjacent in time. Windows containing many overlapping samples should be expected to classify similarly, and departure from this expected behavior indicates that no stable atmospheric pattern is detected. Also, any detected patterns should be at the synoptic scale, as the variability at the other time scales was largely removed by the windowing process. Patterns lasting for too short a duration cannot be associated with the synoptic scale and are considered erroneous. Thus, by observing the number of consecutive windows assigned to the same cluster, the persistence of the identified states can be determined. Periods of time not corresponding to a persistent synoptic atmospheric state can be excluded from the solution.
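The persistence check can be sketched as a run-length filter over the sequence of cluster labels. The function name `filter_persistent` and the threshold `min_run` are illustrative assumptions; the abstract does not state what minimum run length was used.

```python
from itertools import groupby
from typing import List, Optional, Sequence


def filter_persistent(labels: Sequence[int], min_run: int) -> List[Optional[int]]:
    """Keep a label only where it persists for at least `min_run` consecutive windows.

    Windows in shorter runs are marked None, meaning no persistent state was detected.
    """
    result: List[Optional[int]] = []
    for label, run in groupby(labels):
        run_length = len(list(run))
        kept = label if run_length >= min_run else None
        result.extend([kept] * run_length)
    return result


# Toy label sequence for ten consecutive, overlapping windows.
window_labels = [0, 0, 0, 0, 1, 0, 0, 2, 2, 2]
persistent = filter_persistent(window_labels, min_run=3)
# persistent == [0, 0, 0, 0, None, None, None, 2, 2, 2]
```

The isolated label 1 and the two-window run of label 0 fall below the threshold, so those windows are excluded rather than assigned to a state.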

The nonhierarchical clustering algorithm used to partition the windows of time series data among a fixed number of clusters is analogous to the k-means algorithm. The prototype model for each cluster is based on principal component analysis (PCA), and the distance from each window to each prototype is calculated as the sum of squared errors for fitting the given window to the prototype model. A large number of runs of this algorithm, performed over an appropriate range for the number of clusters, is aggregated to form a single hierarchical solution. A method is given to estimate the range for the number of clusters required to generate a converging, reproducible solution.
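A single run of a k-means-like scheme with PCA prototypes might look like the sketch below. It is not the authors' implementation: it assumes a one-component PCA model per cluster (the number of components is not stated in the abstract), finds the loading by power iteration, uses a round-robin initial assignment, assumes the windows are already autoscaled so no further centering is needed, and omits the aggregation of many runs into a hierarchical solution. All function names are hypothetical.

```python
import math
import random
from typing import List

Window = List[List[float]]  # rows = hours, columns = variables (assumed autoscaled)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def first_loading(rows: List[List[float]], rng: random.Random, iters: int = 200) -> List[float]:
    """Leading principal direction of `rows`, found by power iteration on X^T X."""
    p = len(rows[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]  # random start vector
    for _ in range(iters):
        scores = [_dot(r, v) for r in rows]
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(p)]
        norm = math.sqrt(_dot(w, w))
        if norm == 0.0:  # all-zero data: fall back to the first axis
            return [1.0] + [0.0] * (p - 1)
        v = [x / norm for x in w]
    return v


def sse(window: Window, loading: List[float]) -> float:
    """Sum of squared residuals after projecting each row onto a unit-length loading."""
    return sum(_dot(r, r) - _dot(r, loading) ** 2 for r in window)


def pca_kmeans(windows: List[Window], k: int, n_iter: int = 20, seed: int = 0) -> List[int]:
    """Alternate between fitting one PCA prototype per cluster and reassigning windows."""
    rng = random.Random(seed)
    labels = [i % k for i in range(len(windows))]  # round-robin start (assumption)
    for _ in range(n_iter):
        loadings = []
        for c in range(k):
            rows = [r for w, lab in zip(windows, labels) if lab == c for r in w]
            if not rows:  # empty cluster: refit its prototype to a random window
                rows = rng.choice(windows)
            loadings.append(first_loading(rows, rng))
        new_labels = [min(range(k), key=lambda c: sse(w, loadings[c])) for w in windows]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def line_window(slope: float, scale: float) -> Window:
    """Synthetic two-variable window whose rows lie on a line through the origin."""
    return [[scale * t, scale * slope * t] for t in (-1.0, -0.5, 0.5, 1.0)]


# Three windows where the variables move together, three where they move oppositely.
toy_windows = ([line_window(1.0, s) for s in (1.0, 1.2, 0.8)]
               + [line_window(-1.0, s) for s in (1.0, 1.2, 0.8)])
toy_labels = pca_kmeans(toy_windows, k=2)
```

In the toy data, windows that share a correlation structure should end up with the same label even though their amplitudes differ, which is the property that makes a PCA residual a useful distance for this purpose.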

Four clusters are identified, corresponding to four dominant, persistent atmospheric states for the Bay Area. Two of the meteorological regimes are conducive to elevated ozone levels in the Bay Area. One of these clusters captures the effects of an atypical summer low pressure system that redirects the typical wind patterns and allows ozone to build up. The other cluster captures the effect of a high pressure center over the western United States that reduces the strength of the typical shoreward marine flow, creating hot and stagnant conditions.

Because meteorology plays such an important role in ozone formation dynamics, the effects of emissions characteristics on ozone levels cannot be determined from historical data observed under different atmospheric conditions. Once a group of days is labeled as having similar meteorological conditions, however, a reasonable estimate of these effects can be made. Emissions in the Bay Area are assumed to be relatively constant on weekdays (Monday through Friday). Weekends (Saturday and Sunday) are assumed to have lower emission rates, especially for NOx, because of reduced diesel traffic. Observed concentrations during the morning NOx spike are significantly lower on weekends throughout the Bay Area, supporting this assumption. VOC emissions are unknown but are assumed to be reduced on weekends, though by less than NOx emissions. The ozone response to the varying emission levels can then be determined simply by comparing daily maximum ozone levels between weekend days and weekdays assigned to the same meteorological state. Sites within the most urban areas of San Jose and San Francisco/Oakland reach their highest ozone levels on weekends, while the other, suburban locations experience their highest ozone levels on weekdays. This opposite response to a change in emissions characteristics for different portions of the study region implies different sensitivities of ozone to its precursors at different locations. It is likely that the elevated weekday NOx levels at the urban sites form a NOx-sensitive regime in which NOx chokes off the photochemistry, resulting in lower ozone levels on weekdays than on weekends.
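The comparison of weekend and weekday ozone within one meteorological state can be sketched as a simple grouped mean. The record layout, the function name `weekend_weekday_means`, and the ozone values below are hypothetical placeholders for illustration, not data or results from the study.

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, Iterable, List, Optional, Tuple

Record = Tuple[date, int, float]  # (day, cluster label, daily maximum ozone)


def weekend_weekday_means(records: Iterable[Record]) -> Dict[int, Dict[str, Optional[float]]]:
    """Mean daily maximum ozone per cluster, split into weekdays and weekends."""
    groups: Dict[int, Dict[str, List[float]]] = defaultdict(
        lambda: {"weekday": [], "weekend": []})
    for day, cluster, ozone in records:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        kind = "weekend" if day.weekday() >= 5 else "weekday"
        groups[cluster][kind].append(ozone)
    return {cluster: {kind: (mean(vals) if vals else None) for kind, vals in split.items()}
            for cluster, split in groups.items()}


# Placeholder values only: four days assigned to the same hypothetical cluster 2.
toy_records = [
    (date(2000, 7, 1), 2, 60.0),  # Saturday
    (date(2000, 7, 2), 2, 70.0),  # Sunday
    (date(2000, 7, 3), 2, 50.0),  # Monday
    (date(2000, 7, 4), 2, 40.0),  # Tuesday
]
summary = weekend_weekday_means(toy_records)
# summary == {2: {"weekday": 45.0, "weekend": 65.0}}
```

Running this per monitoring site would show whether a site's weekend mean exceeds its weekday mean under the same meteorological state, which is the contrast the paragraph above draws between urban and suburban locations.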