The latter are especially critical in large metropolitan areas, where transport emissions are significant and increase population exposure, with consequent health problems.
Ozone is a highly reactive gas whose concentration levels depend strongly on both the micro-meteorological conditions of the site and seasonal effects. Predicting ozone levels is complex, as several studies have described.
For ozone models, one of the most difficult problems is simulating the chemical reactions that occur in the atmosphere, which are linked to long-range transport, incoming solar radiation, and atmospheric turbulence conditions.
Among the advanced statistical methods for forecasting air pollution data in complex systems, an important tool is the neural network (NN), which can work as a universal approximator of non-linear functions and can consequently be used to assess the dynamics of such systems.
In our work, NN methods have been developed to forecast daily maximum ozone levels, the 8 h average, and the hourly levels, using daily and hourly meteorological and concentration data as input parameters. The aim of our work is to provide a methodological procedure for forecasting ozone 24 hours in advance.
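The forecast targets named above can be derived from an hourly series. The following is a minimal sketch, assuming the 8 h average is a running mean over complete 8-hour windows within one day (the abstract does not state the exact definition); the function names and the sample values are hypothetical and invented for illustration.

```python
from statistics import mean

def daily_maximum(hourly):
    """Largest hourly value recorded in one day."""
    return max(hourly)

def running_8h_means(hourly, window=8):
    """Means over every complete window of consecutive hours (assumed definition)."""
    return [mean(hourly[i:i + window]) for i in range(len(hourly) - window + 1)]

# Hypothetical hourly ozone values for a single day (µg/m3), not measured data.
day = [20, 18, 15, 14, 13, 15, 22, 30, 41, 55, 68, 76,
       82, 85, 83, 78, 70, 60, 48, 40, 33, 28, 25, 22]

daily_max = daily_maximum(day)
max_8h_mean = max(running_8h_means(day))
```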
A data optimization process is necessary to select the patterns and variables that best explain the data variability relevant to NN model performance.
The NN parameters were obtained by a training procedure based on an efficient unconstrained minimization algorithm.
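Training by unconstrained minimization means treating the network weights as free variables and minimizing an error function over the training patterns. The abstract does not name the algorithm, so the sketch below substitutes plain gradient descent with a backtracking step and finite-difference gradients. The network size, the mean squared error, and the toy data are all assumptions made for illustration, not the authors' setup.

```python
import math
import random

def predict(w, x, n_hidden):
    """One-hidden-layer network: tanh hidden units, linear output."""
    n_in = len(x)
    out = w[-1]  # output bias
    for h in range(n_hidden):
        base = h * (n_in + 1)
        a = w[base + n_in]  # hidden-unit bias
        for i in range(n_in):
            a += w[base + i] * x[i]
        out += w[n_hidden * (n_in + 1) + h] * math.tanh(a)
    return out

def loss(w, patterns, n_hidden):
    """Mean squared error over (input, target) patterns."""
    return sum((predict(w, x, n_hidden) - y) ** 2 for x, y in patterns) / len(patterns)

def numerical_gradient(f, w, eps=1e-6):
    """Central finite-difference gradient of f at w."""
    grad = []
    for k in range(len(w)):
        up = w[:k] + [w[k] + eps] + w[k + 1:]
        down = w[:k] + [w[k] - eps] + w[k + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def train(patterns, n_hidden=3, iterations=200, seed=0):
    """Gradient descent with step halving; stands in for the unspecified algorithm."""
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden * (n_in + 2) + 1)]
    f = lambda v: loss(v, patterns, n_hidden)
    history = [f(w)]
    for _ in range(iterations):
        g = numerical_gradient(f, w)
        step = 1.0
        while step > 1e-8:
            trial = [wi - step * gi for wi, gi in zip(w, g)]
            value = f(trial)
            if value < history[-1]:  # accept only steps that lower the error
                w = trial
                history.append(value)
                break
            step /= 2
        else:
            break  # no improving step found: stop early
    return w, history

# Toy patterns (y = x^2 on [-1, 1]), invented purely to exercise the code.
patterns = [([i / 10], (i / 10) ** 2) for i in range(-10, 11)]
weights, history = train(patterns)
```

Because a step is accepted only when it lowers the error, the recorded error history decreases monotonically; a quasi-Newton or conjugate-gradient method would typically converge faster on the same objective.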
The target of the training procedure is not just to reproduce the ozone trend but to try to simulate the process of ozone diffusion and the chemical reactions in the atmosphere. Meteorological conditions play an important role during the summer, when significant photochemical pollutant peaks can occur and give rise to health effects in children and the elderly.
For the simulations, the optimization of the input patterns is a critical point. Extreme events (ozone levels higher than 80 µg/m3) are of interest for human health but carry less statistical weight, since they constitute about 1% of the collected data.
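The share of extreme events in a data set can be checked with a few lines. This is a small sketch using the 80 µg/m3 threshold from the text; the function name is hypothetical, and the strict "higher than" comparison follows the wording above.

```python
THRESHOLD = 80.0  # µg/m3, threshold quoted in the text

def extreme_fraction(values, threshold=THRESHOLD):
    """Fraction of observations strictly above the threshold."""
    extremes = [v for v in values if v > threshold]
    return len(extremes) / len(values)
```

A fraction near 0.01 would correspond to the roughly 1% reported for the collected data.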
The data used in our simulations come from two monitoring stations of the ARPA network in the urban center of Rome (Magna Grecia and Corso Francia) and cover the whole of 2005.
The variables used for the simulations are the observed pollutants (CO, NO, NO2, O3) and the related meteorological variables: temperature (T), relative humidity (RH), global solar radiation (RS), and atmospheric pressure (Press).
Data from Magna Grecia are used to train the NNs, whereas data from Corso Francia are used to test their performance.
The first NN, used to forecast meteorology, takes 8 inputs, namely T, RH, RS and Press at different time lags: T(L48-24), RH(L48-24), Press(L48-24), RS(L48-24). The outputs of this NN are the same meteorological variables at time lag zero: T(L0), RH(L0), Press(L0), RS(L0).
The second NN is used to forecast ozone levels, taking as input the outputs of the previous NN as well as the observed pollutants at different time lags: CO(L48-24), NO(L48-24), NO2(L48-24), O3(L48-24).
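The two-stage arrangement described above can be sketched as a simple data flow. The code assumes that "L48-24" denotes two lags per variable, 48 h and 24 h, which matches the 8 inputs of the first NN but is an interpretation rather than something the abstract states. The two model arguments are hypothetical callables standing in for the trained networks, and all function names are illustrative.

```python
MET = ("T", "RH", "RS", "Press")   # meteorological variables
POLL = ("CO", "NO", "NO2", "O3")   # observed pollutants
LAGS = (48, 24)                    # assumed reading of "L48-24", in hours

def lagged_inputs(series, names, t, lags=LAGS):
    """Values of each named hourly series at t - lag, in a fixed order."""
    return [series[name][t - lag] for name in names for lag in lags]

def cascade_forecast(series, t, met_model, ozone_model):
    """First model predicts meteorology at lag zero; second predicts ozone."""
    met_now = met_model(lagged_inputs(series, MET, t))   # T, RH, RS, Press at L0
    pollutant_lags = lagged_inputs(series, POLL, t)
    return ozone_model(list(met_now) + pollutant_lags)

# Synthetic hourly series and persistence stand-ins, for illustration only.
series = {name: [100 * k + h for h in range(72)]
          for k, name in enumerate(MET + POLL)}
met_persistence = lambda v: v[1::2]    # reuse each variable's 24 h lag value
ozone_persistence = lambda v: v[-1]    # reuse O3 at the 24 h lag
forecast = cascade_forecast(series, 60, met_persistence, ozone_persistence)
```

Under this reading the second model receives 12 inputs: four predicted meteorological values and eight lagged pollutant values.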
The NNs' performance proves to be very good for hourly, daily maximum, and 8 h average predictions. Using as input data the 48h-24h past measurements of primary pollutants and the meteorological variables predicted at time lag zero, the results show a correlation coefficient for ozone ranging from 0.78 to 0.82 for the Corso Francia monitoring station.
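The correlation coefficient used as a skill score can be computed as follows. This sketch assumes the usual Pearson coefficient between observed and predicted values, which the abstract does not state explicitly; the function name is hypothetical.

```python
import math

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)
```

The coefficient measures how well the predictions track the observed variations; it does not reveal a constant bias or a scaling error, so it is often reported alongside an error measure.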
The quality of the ozone forecast depends strongly on the selection of patterns during the training phase, and the results are related to the statistical distribution of the input/output data set. Our research shows that a preliminary study of the input data is always needed in order to remove data of little significance from the training sets, to choose a suitable normalization rule, and to compute the usual statistical indexes of correlation among the variables. The ability of the NN to capture the environmental information in the data is highly dependent on this preliminary study of the patterns. The capacity of the net to generalize and forecast ozone peaks depends on the essential information in the data set, and this information is not necessarily distributed evenly across all patterns.
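One common choice of normalization rule is min-max scaling, shown here as a sketch. The abstract does not say which rule was adopted, so this is an assumption, and the function names are hypothetical. The scaling constants are taken from the training data only and then reused unchanged on the test data.

```python
def fit_min_max(training_values):
    """Return the (minimum, maximum) of the training data."""
    return min(training_values), max(training_values)

def apply_min_max(values, lo, hi):
    """Scale values so that the training range maps onto [0, 1]."""
    span = hi - lo
    if span == 0:
        raise ValueError("training data are constant; cannot scale")
    return [(v - lo) / span for v in values]

# Illustrative numbers only: constants come from the training station ...
lo, hi = fit_min_max([10, 20, 30])
train_scaled = apply_min_max([10, 20, 30], lo, hi)
# ... and are applied as-is to the test station, which may exceed [0, 1].
test_scaled = apply_min_max([40], lo, hi)
```

Test values outside the training range fall outside [0, 1], which is one way that unevenly distributed peaks show up before any model is trained.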