A gradient boosting approach for the short term prediction of solar energy production (AMS 2013-2014 Solar Energy Prediction Contest)
By combining weather forecasts from numerical models with actual power production data, predictive analytics techniques make it possible to build accurate models for the short-term prediction of solar energy production. In this contest, the objective was to find the best correlations between the energy production of the 98 Oklahoma Mesonet sites and the weather forecasts from the NOAA/ESRL Global Ensemble Forecast System (GEFS).
The approach we selected was based on gradient boosting of regression trees. It provided the best accuracy and outperformed the other analytics techniques considered. We used the implementation of this technique that is directly available in R (gbm package), with the mean absolute error as the error criterion. No feature selection was performed; to reduce the risk of overfitting, several boosted tree models were ensembled. Better accuracy was obtained by building one model on a single dataset covering all 98 Mesonet sites rather than building individual models per site. As is often the case in predictive analytics, data preparation was one of the most important steps in this project, and some transformations were applied beforehand to the training and testing datasets. This approach took second place in the Solar Energy Prediction Contest.
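To make the combination of absolute-error boosting and ensembling concrete, the following is a minimal sketch in plain Python rather than the R gbm code used for the contest. It boosts one-split regression trees (stumps) under the absolute-error loss, then averages several boosted models trained with different random subsamples. Everything in it is an assumption made for illustration: the function names, the hyperparameters, and the synthetic two-feature dataset are hypothetical and are not the contest data or the settings of the actual submission.

```python
import random
from statistics import median

def fit_stump(X, residuals):
    """One-split tree for the absolute-error loss (illustrative only)."""
    # Negative gradient of |r| is sign(r); the split is chosen by least squares on it.
    signs = [(r > 0) - (r < 0) for r in residuals]
    n, best = len(X), None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [i for i in range(n) if X[i][j] <= t]
            right = [i for i in range(n) if X[i][j] > t]
            sse = 0.0
            for idx in (left, right):
                m = sum(signs[i] for i in idx) / len(idx)
                sse += sum((signs[i] - m) ** 2 for i in idx)
            if best is None or sse < best[0]:
                best = (sse, j, t, left, right)
    if best is None:  # no valid split: fall back to a constant
        m = median(residuals)
        return 0, float("inf"), m, m
    _, j, t, left, right = best
    # Leaf values are medians of the raw residuals, which minimise absolute error.
    return (j, t, median(residuals[i] for i in left),
            median(residuals[i] for i in right))

def stump_value(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value

def boost(X, y, n_trees=50, lr=0.1, subsample=0.7, seed=0):
    """Stochastic gradient boosting of stumps, starting from the median of y."""
    rng = random.Random(seed)
    init = median(y)
    pred = [init] * len(y)
    stumps = []
    k = max(2, int(subsample * len(y)))
    for _ in range(n_trees):
        idx = rng.sample(range(len(y)), k)  # random subsample for this tree
        stump = fit_stump([X[i] for i in idx], [y[i] - pred[i] for i in idx])
        stumps.append(stump)
        pred = [p + lr * stump_value(stump, row) for p, row in zip(pred, X)]
    return init, lr, stumps

def predict(model, row):
    init, lr, stumps = model
    return init + lr * sum(stump_value(s, row) for s in stumps)

def ensemble_predict(models, row):
    """Average the predictions of several boosted models."""
    return sum(predict(m, row) for m in models) / len(models)

def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic stand-in data (NOT the contest data): two features, noisy linear target.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(60)]
y = [10 * a + 2 * b + data_rng.gauss(0, 0.5) for a, b in X]

models = [boost(X, y, seed=s) for s in (1, 2, 3)]  # ensemble of three boosted models
baseline_mae = mae(y, [median(y)] * len(y))
ensemble_mae = mae(y, [ensemble_predict(models, row) for row in X])
```

Comparing `ensemble_mae` with `baseline_mae` shows how far the averaged boosted models improve on a constant median prediction for this toy data. In the gbm package, an absolute-error fit would presumably be requested with `distribution = "laplace"`, but the exact settings of the submission are not stated here, so that mapping is an assumption as well.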