The results can be explained by considering that PM10 is partly primary, that is, emitted directly into the atmosphere, and partly secondary, that is, produced by chemical and physical transformations that involve substances such as SOx, NOx, VOCs and NH3 and that determine its generation and/or removal.
The second block is the forecasting engine of the system and relies on two machine learning techniques: Artificial Neural Networks (ANN) and Support Vector Machines (SVM).
We used a set of feed-forward neural networks with the same topology. Each network had three layers, with 1 neuron in the output layer and between 3 and 20 neurons in the hidden layer. The hyperbolic tangent was used as the transfer function. We used the back-propagation rule to adjust the weights of each network and the Levenberg-Marquardt algorithm, which moves smoothly between the extremes of the inverse-Hessian method and the steepest descent method. The set of neural networks was implemented with the Matlab Neural Network Toolbox.
We also used an SVM with an ε-insensitive loss function and a Gaussian kernel. The principal parameters of the SVM were the regularization constant C, which determines the trade-off between the training error and model flatness, the width σ of the Gaussian kernel, and the width ε of the tube around the solution. The SVM performance was optimized by choosing proper values for these parameters. An active set method was used as the optimization algorithm for training the SVM. The SVM was implemented using the “SVM and Kernel Methods Matlab Toolbox”.
The findings show that the ANN performance increases as the number of input features increases. More precisely, the performance increases markedly from 2 to 8 input features and tends to flatten when the size of the input vector is greater than 8. The best subset of 8 features is the following:
- Average concentration of PM10 on the previous day.
- Maximum hourly value of the ozone concentration one, two and three days earlier.
- Maximum hourly value of the air temperature on the previous day.
- Maximum hourly value of the solar radiation one, two and three days earlier.
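The 8-feature input vector listed above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function name `build_features`, the argument names and the assumption that each series is a list indexed by day are all hypothetical.

```python
from typing import List


def build_features(
    day: int,
    pm10_mean: List[float],  # daily average PM10 concentration (assumed layout)
    o3_max: List[float],     # daily maximum hourly ozone concentration
    temp_max: List[float],   # daily maximum hourly air temperature
    rad_max: List[float],    # daily maximum hourly solar radiation
) -> List[float]:
    """Return the 8 inputs used to forecast PM10 for index `day`.

    Every series is assumed to hold one value per day, so `series[day - k]`
    is the value observed k days before the forecast day.
    """
    if day < 3:
        raise ValueError("at least three previous days are required")
    return [
        pm10_mean[day - 1],                                 # PM10, previous day
        o3_max[day - 1], o3_max[day - 2], o3_max[day - 3],  # ozone, lags 1 to 3
        temp_max[day - 1],                                  # temperature, previous day
        rad_max[day - 1], rad_max[day - 2], rad_max[day - 3],  # radiation, lags 1 to 3
    ]
```

Calling `build_features(3, ...)` on series with at least four daily values returns a list of length 8, ordered as in the list above.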
The following table shows the results of the ANN with 8 input features.
Correct Forecasting (below the threshold): 5073
Incorrect Forecasting (below the threshold): 42
Correct Forecasting (above the threshold): 48
Incorrect Forecasting (above the threshold): 13
We tried different assignments of the SVM parameters ε, σ and C in order to find the configuration with the highest performance.
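The role of the three parameters can be illustrated with a short sketch. It is written under stated assumptions: the Gaussian kernel is taken in the common form exp(-||x - y||² / (2σ²)), which may differ from the convention of the toolbox used here, and the candidate values in the grid are purely hypothetical, since the values actually tried are not reported.

```python
import math
from itertools import product
from typing import Dict, List, Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian kernel with width sigma (assumed form: exp(-||x-y||^2 / (2 sigma^2)))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def eps_insensitive_loss(y_true: float, y_pred: float, eps: float) -> float:
    """Errors inside the tube of half-width eps cost nothing; outside, cost grows linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)


def parameter_grid(
    eps_values: Sequence[float],
    sigma_values: Sequence[float],
    c_values: Sequence[float],
) -> List[Dict[str, float]]:
    """Enumerate every (epsilon, sigma, C) combination to be trained and compared."""
    return [
        {"epsilon": e, "sigma": s, "C": c}
        for e, s, c in product(eps_values, sigma_values, c_values)
    ]


# Hypothetical candidate values, for illustration only.
grid = parameter_grid([0.1, 1.0], [0.5, 2.0], [1.0, 100.0])
```

Each entry of `grid` would then be used to train one SVM, and the configuration with the best validation performance would be retained.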
Correct Forecasting (below the threshold): 5107
Incorrect Forecasting (below the threshold): 8
Correct Forecasting (above the threshold): 48
Incorrect Forecasting (above the threshold): 13
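The two tables can be compared with a few lines of arithmetic. The sketch below rests on two assumptions that the text does not state: the fourth row of each table refers to days above the threshold, and the rows are grouped by the observed class (which is consistent with both models having the same totals, 5115 below and 61 above). The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastCounts:
    correct_below: int
    incorrect_below: int
    correct_above: int
    incorrect_above: int  # assumed to be the fourth row of each table

    @property
    def total(self) -> int:
        return (self.correct_below + self.incorrect_below
                + self.correct_above + self.incorrect_above)

    def accuracy(self) -> float:
        """Share of all days that were forecast correctly."""
        return (self.correct_below + self.correct_above) / self.total

    def exceedance_hit_rate(self) -> float:
        """Share of above-threshold days that were forecast correctly."""
        return self.correct_above / (self.correct_above + self.incorrect_above)


ANN = ForecastCounts(5073, 42, 48, 13)
SVM = ForecastCounts(5107, 8, 48, 13)

for name, counts in (("ANN", ANN), ("SVM", SVM)):
    print(f"{name}: accuracy={counts.accuracy():.4f}, "
          f"exceedance hit rate={counts.exceedance_hit_rate():.4f}")
```

Under these assumptions the two models differ only on below-threshold days (42 versus 8 incorrect forecasts), while the counts for above-threshold days are identical.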
The previous table shows the best results of the SVM with 8 features. In the future, since meteorological conditions are very important in the generation process of some pollutants, a different neural network will be trained for each geopotential condition.