These three issues are addressed by observation thinning. Thinning reduces a set of observations to a subset, usually of much lower cardinality. Besides decreasing the data size, the main goal of thinning is to extract the most independent information distributed among the observations and to reduce the effective error correlation. Simple thinning methods are applied operationally by most numerical weather prediction centers today. For example, dense observation sets are often thinned to subsets of observations homogeneously distributed over the spatial domain. However, in these simple approaches the observation values are usually not taken into consideration. Therefore, such strategies are called non-adaptive. Their disadvantage is that information about important small- or medium-scale phenomena present in the original observation set, e.g., atmospheric fronts, may be lost during thinning. To overcome this problem, several adaptive thinning schemes have been proposed recently (Refs. 1 and 2). These are heuristic methods that tend to retain more observations in regions where a high variance (high gradients) of observation values is detected than in regions with lower variance. Although these methods use similar heuristics, their thinning strategies are quite different. One of the two algorithms proposed in Ref. 1 is based on a cluster analysis scheme, whereas the other takes advantage of a more sophisticated and expensive estimation filter. The approach presented in Ref. 2 is inspired by mesh simplification methods used in computer graphics and implements a simple quad-tree decomposition. The outputs of the methods, i.e., the resulting thinned data sets, satisfy the optimality criteria implied by the respective algorithms. However, these criteria provide no means to judge the thinning quality with respect to the resulting forecast error. An independent evaluation criterion is necessary.
Clearly, the most relevant criterion is the improvement in weather forecast quality obtained by integrating a given thinning method into the data assimilation pipeline of an experimental system. Such an investigation was performed for the two thinning methods in Ref. 1. However, this evaluation strategy is extremely time-consuming and impractical for extensive empirical studies. Thus, an alternative evaluation scheme is necessary. The development of such a scheme is the subject of this study.
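The difference between non-adaptive and adaptive thinning described above can be illustrated with a minimal one-dimensional sketch. It is not the clustering or filtering algorithm of Ref. 1, nor the quad-tree scheme of Ref. 2. The function names `thin_uniform` and `thin_by_gradient`, the gradient-based sampling density, and the synthetic front are all assumptions made for illustration only.

```python
import numpy as np

def thin_uniform(x, n_keep):
    """Non-adaptive thinning: keep n_keep indices evenly spaced over the domain."""
    return np.unique(np.linspace(0, len(x) - 1, n_keep).round().astype(int))

def thin_by_gradient(x, y, n_keep):
    """Toy adaptive thinning (assumed heuristic): retain points at equal
    increments of cumulative |dy/dx|, so steep regions keep more points."""
    grad = np.abs(np.gradient(y, x))
    # A small floor keeps flat regions from being dropped entirely.
    density = grad + 1e-3 * grad.max() + 1e-12
    cdf = np.cumsum(density)
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])
    targets = np.linspace(0.0, 1.0, n_keep)
    idx = np.searchsorted(cdf, targets)
    return np.unique(np.clip(idx, 0, len(x) - 1))

# Synthetic "front": a sharp transition in the middle of the domain.
x = np.linspace(0.0, 1.0, 1001)
y = np.tanh((x - 0.5) / 0.02)

uniform_idx = thin_uniform(x, 21)
adaptive_idx = thin_by_gradient(x, y, 21)

near_front = np.abs(x - 0.5) < 0.1
print("uniform points near front: ", int(near_front[uniform_idx].sum()))
print("adaptive points near front:", int(near_front[adaptive_idx].sum()))
```

In this toy setting the adaptive subset concentrates its points around the front, whereas the uniform subset resolves it with only a handful of observations.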

We present an experimental framework for the evaluation of thinning algorithms with respect to two kinds of algorithm-independent quality metrics. The first kind comprises several metrics of the analysis error: *l*_{1}, *l*_{2}, and a weighted *l*_{2} metric. The analysis is computed using the 3D-Var assimilation scheme. We use synthetic truth, background, and observation data and are thus able to formalize the background and observation error statistics required for the minimization of the cost function. In the real world, these statistics are usually poorly known. The true and background signals are assumed to be scalar isotropic stationary random fields defined by their autocorrelation functions. The observations are generated by spatial integration of the true signal with the weighting function of a remote-sensing instrument with unit instrumental error. A uniform weighting function is used. A large observation set is simulated and thinned by one of the thinning methods before being used in the assimilation. Several hundred realizations of the truth, background, and observation signals are simulated to estimate the analysis error, which can then be used as a measure of the quality of the applied thinning scheme.
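The Monte Carlo procedure described above can be sketched in one dimension as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the Gaussian autocorrelation, the length scales, the footprint width, the construction of the background as truth plus a correlated error, and all function names are hypothetical. For a linear observation operator the minimizer of the 3D-Var cost function has a closed form, which the sketch uses in place of an iterative minimization.

```python
import numpy as np

def gaussian_cov(n, length_scale, variance=1.0):
    """Covariance of a stationary field with Gaussian autocorrelation (assumed form)."""
    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def sqrt_factor(cov):
    """Matrix L with L @ L.T == cov, robust to tiny negative eigenvalues."""
    w, v = np.linalg.eigh(cov)
    return v * np.sqrt(np.clip(w, 0.0, None))

def footprint_operator(n, half_width):
    """One observation per grid point: uniform average over its footprint."""
    H = np.zeros((n, n))
    for k in range(n):
        lo, hi = max(0, k - half_width), min(n, k + half_width + 1)
        H[k, lo:hi] = 1.0 / (hi - lo)
    return H

def mean_analysis_error(keep, n=100, n_real=200, seed=0):
    """Average l1 and l2 analysis error when only observations `keep` are assimilated."""
    rng = np.random.default_rng(seed)
    C_t = gaussian_cov(n, 10.0)      # truth covariance (assumption)
    B = gaussian_cov(n, 5.0)         # background error covariance (assumption)
    L_t, L_b = sqrt_factor(C_t), sqrt_factor(B)
    H_full = footprint_operator(n, half_width=2)
    H = H_full[keep]
    R = np.eye(len(keep))            # unit, uncorrelated instrumental error
    # Gain K = B H^T (H B H^T + R)^-1; x_b + K (y - H x_b) minimizes
    # J(x) = 1/2 (x-x_b)^T B^-1 (x-x_b) + 1/2 (y-Hx)^T R^-1 (y-Hx).
    K = np.linalg.solve(H @ B @ H.T + R, H @ B).T
    l1 = l2 = 0.0
    for _ in range(n_real):
        x_t = L_t @ rng.standard_normal(n)
        x_b = x_t + L_b @ rng.standard_normal(n)
        y = (H_full @ x_t + rng.standard_normal(n))[keep]
        err = x_b + K @ (y - H @ x_b) - x_t
        l1 += np.mean(np.abs(err)) / n_real
        l2 += np.sqrt(np.mean(err ** 2)) / n_real
    return l1, l2

all_obs = np.arange(100)
every_10th = all_obs[::10]
print("all observations:  l1, l2 =", mean_analysis_error(all_obs))
print("every 10th kept:   l1, l2 =", mean_analysis_error(every_10th))
```

Because this sketch assumes exactly known statistics and uncorrelated observation errors, using all observations is optimal in expectation and thinning can only increase the analysis error here. The sketch is meant to show the structure of the evaluation loop, into which any thinning method can be plugged by supplying its retained indices as `keep`.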

However, due to the non-linear dynamics of numerical weather forecast models, a minimal analysis error does not necessarily guarantee a minimal forecast error. Therefore, the second quality metric we suggest is related directly to the forecast error. Since studying forecast quality with a real weather prediction model is prohibitively expensive for extended studies, we propose to resort to a simple non-linear model given by the shallow water equations. The shallow water model exhibits major properties of horizontal atmospheric dynamics on a sphere and is therefore commonly used for the preliminary evaluation of numerical methods intended for application in a real atmospheric model (Ref. 3). The experimental procedure is similar to the one explained above: we first simulate true, background, and observation signals, apply a thinning algorithm prior to assimilation, and execute two runs of the shallow water model: one from the known true initial conditions, and the second from their approximation, i.e., the analysis produced by the assimilation. Different forecast error metrics (*l*_{1}, *l*_{2}, and a weighted *l*_{2}) are then used as evaluation criteria for the thinning algorithms.
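The twin-run procedure described above can be sketched with a much simpler model than the spherical one referred to in the text. The example below assumes a one-dimensional periodic shallow water model discretized with a Lax-Friedrichs scheme, and it stands in a small sinusoidal perturbation for the analysis error. The grid, the initial state, the perturbation, and the names `forecast` and `error_metrics` are all hypothetical choices made for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def forecast(h0, u0, dx, t_end, cfl=0.4):
    """Integrate the 1D non-linear shallow water equations on a periodic
    domain with a Lax-Friedrichs scheme (assumed discretization).
    The time step is fixed from the initial wave speed, which is adequate
    for this sketch because the scheme is strongly diffusive."""
    h, hu = h0.astype(float), h0 * u0
    speed = np.max(np.abs(u0) + np.sqrt(G * h0))
    n_steps = int(np.ceil(t_end * speed / (cfl * dx)))
    dt = t_end / n_steps
    for _ in range(n_steps):
        f_h = hu                                  # mass flux
        f_hu = hu ** 2 / h + 0.5 * G * h ** 2     # momentum flux
        h_new = (0.5 * (np.roll(h, -1) + np.roll(h, 1))
                 - dt / (2 * dx) * (np.roll(f_h, -1) - np.roll(f_h, 1)))
        hu = (0.5 * (np.roll(hu, -1) + np.roll(hu, 1))
              - dt / (2 * dx) * (np.roll(f_hu, -1) - np.roll(f_hu, 1)))
        h = h_new
    return h, hu / h

def error_metrics(a, b):
    """Return the l1 (mean absolute) and l2 (root mean square) difference."""
    d = a - b
    return np.mean(np.abs(d)), np.sqrt(np.mean(d ** 2))

# Hypothetical setup: a height bump at rest on a 200 km periodic domain.
n, dx = 200, 1000.0
x = np.arange(n) * dx
length = n * dx
h_true = 10.0 + np.exp(-((x - 0.5 * length) / (10 * dx)) ** 2)
u_true = np.zeros(n)
# Stand-in for an analysis: truth plus a small smooth error (assumption).
h_analysis = h_true + 0.05 * np.sin(2 * np.pi * x / length)

h_f_true, u_f_true = forecast(h_true, u_true, dx, t_end=600.0)
h_f_an, u_f_an = forecast(h_analysis, u_true, dx, t_end=600.0)
l1, l2 = error_metrics(h_f_an, h_f_true)
print("forecast height error: l1 =", l1, " l2 =", l2)
```

In an actual experiment, `h_analysis` would be the output of the assimilation step for a given thinned observation set, so the same two metrics could be compared across thinning algorithms.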

1. Ochotta, T., Gebhardt, C., Saupe, D., Wergen, W., Adaptive thinning of atmospheric observations in data assimilation with vector quantisation and filtering methods, Quarterly Journal of the Royal Meteorological Society, to appear, 2006.

2. Ramachandran R., Li X., Movva S., Graves S., Greco S., Emmitt D., Terry J., and Atlas R. Intelligent Data Thinning Algorithm for Earth System Numerical Model Research and Application. Proc. of the 21st Intern. Conf. on Interactive Information Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology. 85th AMS Annual Meeting, 2005.

3. Williamson, D. L., J. B. Drake, J. J. Hack, R. Jakob, and P. N. Swarztrauber, 1992: A standard test set for numerical approximations to the shallow water equations in spherical geometry. J. Comput. Phys., 102, 211-224.
