4.5
A Machine Learning Tool to Forecast PM10 Level
G. Raimondo, Polytechnic Univ., Turin, Italy; and A. Montuori, W. Moniaci, E. Pasero, and E. Almkvist
The research activity described in this paper concerns the feasibility of applying data mining algorithms to forecast air pollution. The study analyzed the principal causes of air pollution and identified the best subset of features (meteorological data and air pollutant concentrations) for medium-term forecasting of PM10 concentration. The system described in this paper consists of two computing blocks.

The first block implements a backward feature selection algorithm based on the notion of relative entropy. This allows the selection of the most useful features for predicting each of the targets related to the air pollutant concentrations. The target was chosen to be the mean value over 24 hours, measured every 4 hours (corresponding to 4 intervals a day). For each of the available parameters (air pollutants, air temperature, relative humidity, atmospheric pressure, solar radiation, rain, wind speed and direction), the complete set of candidate features consists of the maximum, minimum and daily average values of the previous three days; the measurement hour and the day of the week were added to these. The initial set thus includes 130 features. A dedicated subset of the data was excluded from this analysis and used as the test set. In this way we are able to select the most relevant features for predicting the future trend of PM10.

The best subset of 16 features turned out to be the following:
- Average concentration of PM10 in the previous day.
- Maximum hourly value of the ozone concentration one, two and three days earlier.
- Maximum hourly value of the air temperature one, two and three days earlier.
- Maximum hourly value of the solar radiation one, two and three days earlier.
- Minimum hourly value of SO2 one and two days earlier.
- Average value of the relative humidity in the previous day.
- Maximum and minimum hourly value of the relative humidity in the previous day.
- Average value of the air temperature three days earlier.
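The construction of the candidate features (maximum, minimum and daily average of each of the previous three days) can be illustrated with a minimal sketch. The function names `daily_stats` and `lagged_features`, the feature naming scheme and the assumption of 24 hourly samples per day are illustrative choices, not taken from the paper; the measurement hour and day-of-week features and the relative-entropy selection step itself are not shown.

```python
from statistics import mean


def daily_stats(hourly, hours_per_day=24):
    """Split an hourly series into whole days; return (max, min, mean) per day."""
    days = [hourly[i:i + hours_per_day]
            for i in range(0, len(hourly) - hours_per_day + 1, hours_per_day)]
    return [(max(d), min(d), mean(d)) for d in days]


def lagged_features(hourly, day_index, n_lags=3):
    """Candidate features for forecasting day `day_index` (0-based):
    max, min and mean of each of the previous `n_lags` days.
    The naming scheme (max_lag1, ...) is hypothetical."""
    stats = daily_stats(hourly)
    if day_index < n_lags or day_index > len(stats):
        raise ValueError("not enough complete days before day_index")
    feats = {}
    for lag in range(1, n_lags + 1):
        day_max, day_min, day_avg = stats[day_index - lag]
        feats[f"max_lag{lag}"] = day_max
        feats[f"min_lag{lag}"] = day_min
        feats[f"avg_lag{lag}"] = day_avg
    return feats


# Synthetic example: four days of hourly values 0..95 for one parameter.
example = lagged_features(list(range(96)), day_index=3)
```

Applying the same function to every measured parameter and concatenating the results would give the pool of candidates from which the backward selection starts.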
The results can be explained by considering that PM10 is partly primary, that is, directly emitted into the atmosphere, and partly secondary, that is, produced by chemical and physical transformations that involve various substances such as SOx, NOx, VOCs and NH3 and that determine its generation and/or removal.

The second block is the forecasting engine of the system and relies on two machine learning techniques: Artificial Neural Networks (ANN) and Support Vector Machines (SVM). We used a set of feed-forward neural networks with the same topology. Each network had three layers, with 1 neuron in the output layer and a number of neurons in the hidden layer varying between 3 and 20. The hyperbolic tangent function was used as the transfer function. We used the back-propagation rule to adjust the weights of each network and the Levenberg-Marquardt algorithm to proceed smoothly between the extremes of the inverse-Hessian method and the steepest descent method. The Matlab Neural Network Toolbox was used to implement the set of neural networks.

We also used an SVM with an ε-insensitive loss function. The Gaussian function was used as the kernel function of the SVM. The principal parameters of the SVM were the regularization constant C, which determines the trade-off between the training error and model flatness, the width σ of the Gaussian kernel, and the width ε of the tube around the solution. The SVM performance was optimized by choosing proper values for these parameters. An active set method was used as the optimization algorithm for training the SVM. The SVM was implemented using the “SVM and Kernel Methods Matlab Toolbox”.

The findings show that the ANN performance improves as the number of input features increases. More precisely, performance improves significantly from 2 to 8 input features and tends to flatten when the input vector has more than 8 elements. The best subset of 8 features is the following:
- Average concentration of PM10 in the previous day.
- Maximum hourly value of the ozone concentration one, two and three days earlier.
- Maximum hourly value of the air temperature in the previous day.
- Maximum hourly value of the solar radiation one, two and three days earlier.
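The network topology described above (three layers, hyperbolic tangent transfer function, one output neuron, 3 to 20 hidden neurons) can be illustrated with a small forward-pass sketch. This is not the Matlab implementation used in the study: the function names, the random weight initialisation and the choice of a linear output neuron are assumptions made for illustration, and Levenberg-Marquardt training is not shown.

```python
import math
import random


def init_network(n_inputs, n_hidden, seed=0):
    """Create untrained weights for an input -> hidden -> single-output network.
    The uniform initialisation range is an arbitrary illustrative choice."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def forward(net, x):
    """Forward pass: tanh hidden layer, then a linear output neuron (assumed)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w_hidden"], net["b_hidden"])]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


# One untrained network per hidden-layer size from 3 to 20, with 8 inputs.
networks = {h: init_network(n_inputs=8, n_hidden=h) for h in range(3, 21)}
```

Keeping one network per hidden-layer size mirrors the idea of comparing networks that share a topology and differ only in the number of hidden neurons.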
The following table shows the results of the ANN with 8 input features.
Correct Forecasting (below the threshold): 5073
Incorrect Forecasting (below the threshold): 42
Correct Forecasting (above the threshold): 48
Incorrect Forecasting (above the threshold): 13
We tried different assignments for the SVM parameters ε, σ and C in order to find the configuration with the highest performance.
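The roles of ε and σ can be made concrete with a short sketch of the ε-insensitive loss and the Gaussian kernel. The kernel is written here as exp(-||x - z||^2 / (2σ^2)), which is one common parameterisation; the toolbox used in the study may scale σ differently. The candidate values in the grid are hypothetical and serve only to show how a parameter search could be enumerated.

```python
import itertools
import math


def gaussian_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma^2)) (assumed scaling)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the tube of half-width eps, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


# Hypothetical candidate values; the values tried in the study are not given.
eps_values = [0.01, 0.1]
sigma_values = [0.5, 1.0, 2.0]
c_values = [1.0, 10.0, 100.0]
grid = list(itertools.product(eps_values, sigma_values, c_values))
```

Each triple in `grid` would correspond to one trained SVM, and the configuration with the best validation performance would be retained.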
Correct Forecasting (below the threshold): 5107
Incorrect Forecasting (below the threshold): 8
Correct Forecasting (above the threshold): 48
Incorrect Forecasting (above the threshold): 13
The previous table shows the best results of the SVM with 8 features. In the future, since meteorological conditions are very important in the generation process of some pollutants, different neural networks will be trained for each geopotential condition.
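The counts in the two result tables can be summarised as fractions of correct forecasts. The sketch below assumes that the fourth row of each table refers to the above-threshold class and that each count is one test sample; the function name `summarise` and the dictionary keys are illustrative.

```python
def summarise(correct_below, incorrect_below, correct_above, incorrect_above):
    """Turn the four counts of a result table into fractions of correct forecasts."""
    total = correct_below + incorrect_below + correct_above + incorrect_above
    return {
        "total": total,
        "overall_correct": (correct_below + correct_above) / total,
        "below_correct": correct_below / (correct_below + incorrect_below),
        "above_correct": correct_above / (correct_above + incorrect_above),
    }


# Counts copied from the ANN and SVM tables (8 input features).
ann = summarise(5073, 42, 48, 13)
svm = summarise(5107, 8, 48, 13)

for name, result in (("ANN", ann), ("SVM", svm)):
    print(name, {k: round(v, 4) for k, v in result.items()})
```

Under these assumptions both models are evaluated on the same 5176 samples, with roughly 98.9% of forecasts correct for the ANN and 99.6% for the SVM; the two differ only in the below-threshold rows, while the above-threshold counts are identical.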
Session 4, Applications of Artificial Intelligence 
 Tuesday, 16 January 2007, 8:30 AM-9:45 AM, 210B
	