The benefits of applying the correlation function can be understood in part by examining the characteristics of simple 2 × 2 covariance matrices generated from random sample vectors with known variances and covariance. These show that noise in covariance estimates tends to overwhelm the signal when the ensemble size is small and/or the true covariance between the sample elements is small. Since the true covariance of forecast errors is generally related to the distance between grid points, covariance estimates from a small ensemble have a higher ratio of noise to signal with increasing distance between grid points. This property is also demonstrated using a two-layer hemispheric primitive equation model by comparing covariance estimates generated by small and large ensembles. Covariances from the large ensemble are assumed to be accurate and are used as a reference for measuring errors in covariances estimated from a small ensemble.
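As a minimal illustration of the 2 × 2 argument, the following sketch draws many small samples from a bivariate normal distribution with unit variances and a prescribed covariance, then reports the spread of the sample covariance relative to the true value. The function name `noise_to_signal`, the Gaussian assumption, the unit variances, and the number of trials are choices made here for illustration and are not taken from the experiments described.

```python
import numpy as np


def noise_to_signal(n_members, true_cov, n_trials=2000, seed=0):
    """Ratio of sampling noise to true covariance for a 2 x 2 problem.

    Assumes (for illustration only) a bivariate normal with unit variances.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, true_cov], [true_cov, 1.0]])
    # Samples have shape (n_trials, n_members, 2): one small ensemble per trial.
    x = rng.multivariate_normal(np.zeros(2), cov, size=(n_trials, n_members))
    dev = x - x.mean(axis=1, keepdims=True)
    # Unbiased sample covariance between the two elements, one value per trial.
    est = (dev[..., 0] * dev[..., 1]).sum(axis=1) / (n_members - 1)
    # Noise: standard deviation of the estimates; signal: the true covariance.
    return est.std() / abs(true_cov)


if __name__ == "__main__":
    for n in (10, 100):
        for c in (0.1, 0.9):
            print(f"members={n:4d} true_cov={c:.1f} "
                  f"noise/signal={noise_to_signal(n, c):.2f}")
```

Under these assumptions the ratio grows as either the ensemble size or the true covariance shrinks, which is the behavior described above.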
The benefits of including distance-dependent reduction of covariance estimates are demonstrated with an ensemble Kalman filter data assimilation scheme. The optimal correlation length scale of the filter function depends on ensemble size; larger correlation lengths are preferable for larger ensembles.
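One common way to apply a distance-dependent reduction is an element-wise (Schur) product between the ensemble covariance estimate and a compactly supported correlation function. The sketch below assumes the fifth-order piecewise rational function of Gaspari and Cohn (1999) and a one-dimensional, non-periodic grid; the text above does not specify the function, and the names `gaspari_cohn`, `localize`, and the length-scale parameter `c` are illustrative.

```python
import numpy as np


def gaspari_cohn(dist, c):
    """Compactly supported correlation function; zero for dist >= 2 * c.

    Assumed here for illustration (Gaspari and Cohn 1999, fifth-order form).
    """
    r = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / c
    taper = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri = r[inner]
    taper[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                    - (5.0 / 3.0) * ri**2 + 1.0)
    ro = r[outer]
    taper[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                    + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return taper


def localize(ensemble, positions, c):
    """Schur product of the sample covariance with the correlation function.

    ensemble: array of shape (n_members, n_grid); positions: shape (n_grid,).
    """
    sample_cov = np.cov(ensemble, rowvar=False)  # (n_grid, n_grid)
    dist = np.abs(positions[:, None] - positions[None, :])
    return gaspari_cohn(dist, c) * sample_cov
```

With this form, variances on the diagonal are left unchanged, covariances are damped smoothly with separation, and estimates at separations of 2c or more are set to zero. A larger `c` retains more of the long-range covariance estimate.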
The effects of inflating background error covariance estimates are examined as a way of stabilizing the filter. More inflation was necessary for smaller ensembles than for larger ensembles.
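A simple form of covariance inflation can be sketched as follows, assuming a multiplicative scheme in which each member's deviation from the ensemble mean is scaled by a constant factor slightly greater than one. The function name `inflate` and the parameter `factor` are illustrative, and the specific inflation method is an assumption rather than something stated above.

```python
import numpy as np


def inflate(ensemble, factor):
    """Scale deviations about the ensemble mean by `factor`.

    ensemble: array of shape (n_members, n_grid). The ensemble mean is
    unchanged and the sample covariance is multiplied by factor ** 2.
    """
    mean = ensemble.mean(axis=0, keepdims=True)
    return mean + factor * (ensemble - mean)
```

Because the mean is preserved, this changes only the spread of the background ensemble, which increases the weight given to observations in the subsequent analysis.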