In this study, we analyze the performance of nine operational global ensemble prediction systems (EPSs) participating in the THORPEX Interactive Grand Global Ensemble (TIGGE), including that of the European Centre for Medium-Range Weather Forecasts (ECMWF). We concentrate on the monsoon seasons of 2007–2014 and on three regions: West Sahel, East Sahel, and Guinea Coast. Both station data and the Tropical Rainfall Measuring Mission (TRMM) 3B42 gridded data set are used as observational references. We consider temporal aggregations from 1 to 5 days and spatial aggregations from 0.25° × 0.25° to 5° × 2°. Past observations are used to construct a probabilistic climatology, which serves as the reference forecast against which skill is measured. In addition to the raw ensemble forecasts, we apply state-of-the-art statistical postprocessing methods in the form of Bayesian Model Averaging (BMA) and Ensemble Model Output Statistics (EMOS).
Raw ensemble forecasts are uncalibrated, unreliable, and underperform relative to climatology, regardless of region, accumulation time, monsoon season, and ensemble. Differences between raw ensemble and climatological forecasts are large, and partly stem from poor prediction of low precipitation amounts. BMA and EMOS postprocessed forecasts are calibrated, reliable, and strongly improve on the raw ensembles, but – somewhat disappointingly – typically do not outperform climatology. Most EPSs exhibit slight improvements over the period 2007–2014, but overall have little added value compared to climatology. We suspect that one reason for the sobering lack of ensemble forecast skill is the inability of the convective parametrizations in the TIGGE models to realistically represent processes of convective organization in a region dominated by mesoscale convective systems (MCSs).
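To illustrate what measuring skill against a climatological reference can look like in practice, the following is a minimal sketch. It assumes the continuous ranked probability score (CRPS) as the verification measure, which the text above does not specify, and it treats a handful of past observations as the members of a climatological ensemble. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def crps_ensemble(members, obs):
    """Empirical CRPS of an ensemble forecast for one scalar observation.

    Uses the standard estimator
        mean|x_i - y| - 0.5 * mean|x_i - x_j|,
    where x_i are the ensemble members and y is the observation.
    Lower values indicate a better forecast.
    """
    x = np.asarray(members, dtype=float)
    term_obs = np.mean(np.abs(x - obs))
    term_spread = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return term_obs - term_spread


def skill_score(score_forecast, score_reference):
    """Skill relative to a reference: positive means better than reference."""
    return 1.0 - score_forecast / score_reference


# Hypothetical daily rainfall values in mm (illustrative only, not real data).
past_obs = np.array([0.0, 0.0, 1.2, 5.4, 0.0, 12.1, 3.3, 0.0])  # climatology
raw_ens = np.array([4.0, 6.5, 5.1, 7.2, 3.8])                   # raw ensemble
observed = 0.0                                                   # dry day

crps_raw = crps_ensemble(raw_ens, observed)
crps_clim = crps_ensemble(past_obs, observed)
print("CRPS raw ensemble:", crps_raw)
print("CRPS climatology: ", crps_clim)
print("Skill vs. climatology:", skill_score(crps_raw, crps_clim))
```

Under this convention, a negative skill score corresponds to a forecast that underperforms the climatological reference, and a score near zero corresponds to little added value. In an actual verification the scores would be averaged over many forecast cases before the skill score is formed.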