Traditional model evaluation compares modeled and observed values for a few air pollution episodes, and these episodes are then used to address policy and regulatory issues for air quality improvement. With greater computational resources now available, limiting simulations to a few episodes is no longer necessary. EPD has modeled air quality for one full year (October 2000 to September 2001) and carried out a comprehensive statistical evaluation of the model's performance.
Hong Kong (~22°N, 114°E) lies on the southern coast of the East Asian land mass and is thus subject to strong monsoonal flow. Yet its complex terrain and its position on the Pearl River Delta create local phenomena that also strongly affect local circulation and produce rather diverse winds, as is evident in the data from the 24 meteorological stations in the territory. HK's emissions account for only a small fraction of total emissions in the Pearl River Delta, which is reputed to be the manufacturing centre of the world. HK operates 14 air quality monitoring stations, mostly in densely populated areas, but there are currently no corresponding data from the immediate vicinity of HK to serve model evaluation purposes.
Of the three major processes that determine air quality (emissions, meteorology, and transport and chemistry), only meteorology and the end products of the transport and chemical processes are amenable to evaluation with data. The accuracy of the emissions must be treated as the unknown, and this error can only be inferred through reverse engineering. Our evaluation here is therefore limited to meteorology and air quality.
The small size of HK (no more than 60 km by 60 km in the EW and NS directions) means that at any one time HK can be considered to be under the influence of a single weather type. Our model evaluation therefore started with weather clustering based on 5 years of daily averaged meteorological data from HK. After comparing different approaches (hierarchical and K-means) and using principal component analysis, we settled on the K-means method and arrived at 8 optimal clusters (based on pressure, temperature, winds etc.), which reproduce to a reasonable degree the 6 to 7 weather types previously identified by visual inspection of weather maps (e.g. positions of the centres of high and low pressure systems over Asia, and typhoons). To give a first indication of performance, we tested whether the model reproduces the same clusters by clustering one year of modeled data and matching the result against the 8-cluster solution derived from 5 years of observations. Only 2 to 3 significant clusters have a one-to-one correspondence; for most, several clusters in one solution map to a single cluster in the other, and vice versa. When the modeled outputs are forced onto the 8-cluster solution from the measurements, four small clusters have more than 80% correct classification and the overall misclassification rate is about 38%.
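The clustering and forced-classification steps can be sketched as follows. This is a minimal illustration only, not the EPD implementation: the three features (pressure, temperature, wind speed), the toy values, and the function names are all assumptions made for the example, and plain Lloyd-style K-means stands in for whatever software was actually used.

```python
import math
import random

def standardize(rows):
    """Scale each column to zero mean and unit variance; also return the means and stds used."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd iteration with k randomly chosen starting points."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centroid.
            new.append([sum(c) / len(members) for c in zip(*members)] if members
                       else centroids[j])
        if new == centroids:
            break
        centroids = new
    return centroids, [nearest(p, centroids) for p in points]

def misclassification_rate(obs_labels, mod_labels):
    """Fraction of days whose modeled cluster differs from the observed cluster."""
    return sum(o != m for o, m in zip(obs_labels, mod_labels)) / len(obs_labels)

# Hypothetical daily means: [pressure hPa, temperature degC, wind speed m/s].
obs_days = [[1010, 28, 2.0], [1011, 29, 2.5], [1009, 27, 1.5],
            [1020, 15, 6.0], [1021, 14, 7.0], [1019, 16, 6.5]]
mod_days = [[1010, 27, 2.2], [1012, 28, 3.0], [1015, 21, 4.5],
            [1019, 16, 5.5], [1021, 15, 6.8], [1018, 17, 6.0]]

scaled_obs, means, stds = standardize(obs_days)
centroids, obs_labels = kmeans(scaled_obs, k=2)

# "Forcing" the model onto the observed solution: scale the modeled days with the
# observed means and stds, then assign each day to the nearest observed centroid.
scaled_mod = [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in mod_days]
mod_labels = [nearest(p, centroids) for p in scaled_mod]
rate = misclassification_rate(obs_labels, mod_labels)
```

Scaling the modeled days with the observed statistics, rather than their own, keeps both data sets in the same feature space, so that a systematic model bias appears as misclassification instead of being normalized away.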
When observed air quality data are included with the observed meteorological data, a 9-cluster solution results, and this solution has an 85% one-to-one match with the 8-cluster solution based solely on observed meteorology. Air quality and meteorological types are thus closely linked.
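One way to quantify a one-to-one match between two cluster solutions is to cross-tabulate the day-by-day labels and pair the clusters that share the most days. The sketch below uses a simple greedy pairing; this is an assumption made for illustration, since the abstract does not state how the 85% figure was computed, and the function name and toy labels are hypothetical.

```python
from collections import Counter

def one_to_one_match(labels_a, labels_b):
    """Greedily pair clusters of solution A with clusters of solution B.

    Returns the pairing and the fraction of days that fall in paired clusters.
    Greedy pairing is not guaranteed to be the optimal assignment.
    """
    counts = Counter(zip(labels_a, labels_b))  # contingency table as (a, b) -> days
    used_a, used_b, pairs, matched = set(), set(), {}, 0
    # Visit the largest overlaps first; break ties by label for reproducibility.
    for (a, b), n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if a not in used_a and b not in used_b:
            pairs[a] = b
            used_a.add(a)
            used_b.add(b)
            matched += n
    return pairs, matched / len(labels_a)

# Hypothetical labels for 8 days: a 3-cluster solution against a 4-cluster solution.
met_only = [0, 0, 0, 1, 1, 2, 2, 2]
met_plus_aq = [5, 5, 6, 6, 6, 7, 7, 8]
pairs, fraction = one_to_one_match(met_only, met_plus_aq)
```

When the two solutions have different numbers of clusters, as with 8 and 9 here, at least one cluster is left unpaired and its days count as unmatched.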
Standard statistical measures (errors, correlation, Willmott index etc.) are then used to analyse model performance by cluster, station, month, hour of the day and calendar day for ozone, NO, NO2, SO2 and RSP. The emitted species show general under-prediction, with particularly low variability for RSP. This suggests that the assumed boundary conditions, rather than actual emissions in the domain, dominate the observed RSP and, to a lesser extent, other species. Performance analysis by cluster also reveals finer differences: for example, the cluster (warm winter) with mainly E and NE winds and the cluster (late summer and cyclone influence) with N winds both show under-prediction of RSP, strongly suggesting that RSP sources to the N and E of HK have been misplaced, on top of overall under-representation.
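The kind of stratified statistics described above can be sketched as follows. The measures shown (mean bias, root-mean-square error, Pearson correlation, and Willmott's index of agreement in its original 1981 form) are standard, but the exact set used in the evaluation is not listed in the abstract, so the selection, the function names and the sample values here are assumptions.

```python
import math
from collections import defaultdict

def performance_stats(obs, mod):
    """Mean bias, RMSE, Pearson r and Willmott index of agreement d for paired values."""
    n = len(obs)
    o_mean = sum(obs) / n
    m_mean = sum(mod) / n
    sq_err = sum((m - o) ** 2 for o, m in zip(obs, mod))
    cov = sum((o - o_mean) * (m - m_mean) for o, m in zip(obs, mod))
    var_o = sum((o - o_mean) ** 2 for o in obs)
    var_m = sum((m - m_mean) ** 2 for m in mod)
    r = cov / math.sqrt(var_o * var_m) if var_o > 0 and var_m > 0 else float("nan")
    # Willmott (1981): d = 1 - sum((P - O)^2) / sum((|P - Obar| + |O - Obar|)^2)
    denom = sum((abs(m - o_mean) + abs(o - o_mean)) ** 2 for o, m in zip(obs, mod))
    d = 1.0 - sq_err / denom if denom > 0 else float("nan")
    return {"bias": m_mean - o_mean, "rmse": math.sqrt(sq_err / n), "r": r, "willmott_d": d}

def stats_by_group(records):
    """records: iterable of (group_key, observed, modeled); returns stats per group."""
    groups = defaultdict(lambda: ([], []))
    for key, o, m in records:
        groups[key][0].append(o)
        groups[key][1].append(m)
    return {key: performance_stats(o, m) for key, (o, m) in groups.items()}

# Hypothetical RSP values (ug/m3) tagged with a cluster label; the group key could
# equally be a station, month, hour of the day or calendar day.
records = [("warm_winter", 80.0, 50.0), ("warm_winter", 95.0, 55.0),
           ("warm_winter", 70.0, 48.0), ("late_summer", 60.0, 45.0),
           ("late_summer", 75.0, 47.0), ("late_summer", 50.0, 44.0)]
by_cluster = stats_by_group(records)
```

In this made-up sample the modeled values are both lower and much less variable than the observations, which is the pattern the text reports for RSP: the correlation can remain high while the bias is strongly negative.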
Evaluation of a photochemical modeling system involves performance testing over a range of interacting spatial and temporal scales. In HK, only local data are available for such testing. We meet the challenge by pegging local conditions to large-scale weather through local meteorological clustering. Air quality is then included in the clustering. This stratification of long-term data through clustering enables us to target performance analysis (e.g. wind directions, emissions in the near and far field and from specific directions etc.) and to seek systematic improvements in the system. The analysis techniques are packaged for repeated analysis. The big unknown in air quality modeling in the region, the quantity and spatial pattern of the emissions, is being constrained through backward engineering.