Wednesday, 12 January 2005: 4:45 PM
Intelligent Data Thinning Algorithm for Earth System Numerical Model Research and Application
As space-based observing systems generate ever-increasing volumes of data, the need grows to discriminate between useful data points and those that are simply redundant. Current global data assimilation systems are unable to handle the enormous volume of observational data because of the prohibitively large computational cost. Data Assimilation System (DAS) processing times can grow as fast as the square of the number of observations, depending on the assimilation method used. To circumvent this problem, most operational centers must resort to very crude thinning methods to reduce data volume. These methods eliminate large amounts of data, some of it useful, while retaining a considerable amount of useless information.
This paper will present a data thinning algorithm currently being designed in an ongoing collaborative project to address this problem. This recursive simplification algorithm is based on the concepts of data decimation, or simplification, used in computer graphics. The algorithm uses a quad-tree decomposition to recursively divide the data and calculate an objective information measure for the partitioned data. If the objective measure is greater than the user-specified threshold, the algorithm continues dividing the partitioned data. If the objective measure is less than the threshold, the algorithm terminates that recursive path and the center data point of the quadrant is used as the representative thinned value. The other termination condition occurs when the recursion reaches the lowest level, at which the data cannot be partitioned further. Two objective measures have been used in the initial experiments with the algorithm. The first uses a texture measure of homogeneity to determine the information content within a quadrant. A homogeneity measure of 1 implies that the partitioned data are homogeneous and have no variability, whereas a homogeneity measure of 0 indicates inhomogeneous data with maximum variability. The second objective measure is the ratio of the standard deviation to the mean (the coefficient of variation). The greater the variability in the data points of a quadrant, the higher the value of this measure. The details of the algorithm, the objective measures, and the initial results from the data thinning algorithm will be described in this paper.
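The recursion described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names `variability` and `thin`, the list-of-lists grid layout, the handling of a zero mean, the choice of the "center" point in even-sized quadrants, and the treatment of a measure exactly equal to the threshold are all assumptions made for illustration. Only the second objective measure (standard deviation divided by mean) is shown.

```python
from statistics import mean, pstdev


def variability(values):
    """Ratio of standard deviation to mean for one quadrant.

    Assumption: constant data scores 0, and data with a zero mean but
    nonzero spread scores infinity so that it is always subdivided.
    """
    spread = pstdev(values)
    if spread == 0:
        return 0.0
    centre = mean(values)
    if centre == 0:
        return float("inf")
    return spread / abs(centre)


def thin(grid, threshold, r0=0, c0=0, rows=None, cols=None):
    """Return the retained points as a list of (row, col, value) tuples."""
    if rows is None:
        rows, cols = len(grid), len(grid[0])

    values = [grid[r][c]
              for r in range(r0, r0 + rows)
              for c in range(c0, c0 + cols)]

    # Stop when the quadrant is a single point (lowest level) or when the
    # measure does not exceed the threshold (equality is treated as "stop",
    # which is an assumption). Keep the point nearest the quadrant center.
    if (rows == 1 and cols == 1) or variability(values) <= threshold:
        rc, cc = r0 + rows // 2, c0 + cols // 2
        return [(rc, cc, grid[rc][cc])]

    # Otherwise split into up to four sub-quadrants and recurse.
    top, left = rows // 2, cols // 2
    row_spans = [(r0, top), (r0 + top, rows - top)] if rows > 1 else [(r0, rows)]
    col_spans = [(c0, left), (c0 + left, cols - left)] if cols > 1 else [(c0, cols)]

    kept = []
    for rs, rn in row_spans:
        for cs, cn in col_spans:
            kept.extend(thin(grid, threshold, rs, cs, rn, cn))
    return kept


# Hypothetical 4x4 field: three uniform quadrants and one variable quadrant.
field = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 1, 9],
    [10, 10, 5, 20],
]
thinned = thin(field, threshold=0.1)
```

In this made-up example each uniform quadrant collapses to one representative point, while the variable quadrant is subdivided down to individual points, so 7 of the 16 values are retained. A lower threshold keeps more points; a higher one thins more aggressively.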
Supplementary URL: http://maya.itsc.uah.edu