For global probabilistic forecasts of 850-hPa and 2-m temperature, for example, results indicated that a multi-model ensemble containing nine ensemble prediction systems from the TIGGE archive did not improve on the performance of the best single model, the ECMWF EPS. However, a reduced multi-model system consisting of only the four best ensemble systems, provided by Canada, the US, the UK and ECMWF, showed improved performance. This reduced multi-model ensemble has been used as a new benchmark for the single-model systems contributing to it, and reforecast-calibrated ECMWF EPS forecasts have been compared against it. Results have shown that forecasts from the reforecast-calibrated ECMWF EPS are of comparable or superior quality to these multi-model predictions. The improved performance of the calibrated forecasts was achieved by using the ECMWF reforecast dataset to correct for systematic errors and spread deficiencies. During this talk, the performance of TIGGE single-model and multi-model systems will be compared, and the calibration methodology applied to the ECMWF EPS will be illustrated. Further results on refining the multi-model performance by applying model-dependent weights to the individual components of the multi-model will also be discussed.
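To make the idea of correcting "systematic errors and spread deficiencies" concrete, the following is a minimal sketch of one simple way such a correction could be done: a mean shift plus a spread rescaling, both estimated from a training set of past forecasts and verifying observations. It is an illustration under stated assumptions and not necessarily the calibration methodology presented in the talk. The function name `calibrate`, its arguments, and the toy numbers are all hypothetical.

```python
from math import sqrt
from statistics import mean


def calibrate(ensemble, train_means, train_obs, train_spreads):
    """Shift and rescale one ensemble forecast using training statistics.

    Hypothetical helper for illustration only (a simple shift-and-rescale).
    ensemble      : member values of the forecast to be calibrated
    train_means   : ensemble means of past (e.g. reforecast) cases
    train_obs     : verifying observations for those cases
    train_spreads : ensemble standard deviations for those cases
    """
    # Systematic error: average difference between past ensemble means
    # and the observations that verified them.
    bias = mean([f - o for f, o in zip(train_means, train_obs)])

    # Spread deficiency: compare the RMS error of the bias-corrected
    # ensemble mean with the RMS ensemble spread over the training cases.
    errors = [(f - bias) - o for f, o in zip(train_means, train_obs)]
    rmse = sqrt(mean([e * e for e in errors]))
    rms_spread = sqrt(mean([s * s for s in train_spreads]))
    inflation = rmse / rms_spread

    # Remove the bias from the ensemble centre and rescale the member
    # deviations so that the spread matches the typical error.
    centre = mean(ensemble)
    return [(centre - bias) + inflation * (m - centre) for m in ensemble]


# Toy numbers (invented): forecasts are 1 degree too warm on average
# and the ensemble spread is half the size of the typical error.
train_means = [3.0, 1.0, 5.0, 2.0]
train_obs = [1.0, 1.0, 3.0, 2.0]
train_spreads = [0.5, 0.5, 0.5, 0.5]

calibrated = calibrate([9.0, 10.0, 11.0], train_means, train_obs, train_spreads)
print(calibrated)
```

In this toy case the ensemble centre moves from 10 to 9 and the member deviations double. A practical calibration would normally estimate these statistics separately for each location, lead time and season.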