Hilbert Curves Applied to the Efficient Approximate Estimation of Spatial Density of Data that Exhibits Strong Clustering

Purser, R. James

An application of the space-filling Hilbert curve to the practical resolution of a difficulty encountered in the assimilation of some highly inhomogeneously distributed data is described. Certain types of data, such as aircraft reports in a global domain or mesonet surface data in a high-resolution mesoscale domain, exhibit strong sporadic clustering at spatial scales where representativeness error becomes the dominant source of effective observational error. In practice, it then becomes necessary to artificially reduce the effective weight that these data have in an assimilation scheme. A traditional resolution of this difficulty has been to ‘thin’ the data in regions where unduly close pairs exist, which means that potentially good data are entirely discarded. An alternative approach is to down-weight the data perceived to be in regions of excessive data density: the precision weight of each datum is multiplied by a factor inversely proportional to the local density wherever the density is high enough to make this factor less than one, and is left unchanged otherwise. The difficulty with this approach is that estimating data density by a conventional smoothing method on a sufficiently fine grid can be very expensive, and inefficient in the sense that most of the region covered by the grid may be void of any data.
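The down-weighting rule described above can be illustrated with a minimal Python sketch. The function name `downweight` and the parameter `ref_density` (the density above which weights start to be reduced) are assumptions introduced here for illustration; the abstract does not specify how this threshold is chosen.

```python
import numpy as np

def downweight(weights, density, ref_density):
    """Scale precision weights by min(1, ref_density / density).

    `ref_density` is a hypothetical tuning parameter: data in regions
    denser than this are down-weighted, all other data are left alone.
    `density` is assumed to be strictly positive at every datum.
    """
    weights = np.asarray(weights, dtype=float)
    density = np.asarray(density, dtype=float)
    # Factor is inversely proportional to density, but never exceeds one.
    factor = np.minimum(1.0, ref_density / density)
    return weights * factor

# Three data with unit weight in sparse, reference, and dense regions.
adjusted = downweight([1.0, 1.0, 1.0], [0.5, 1.0, 4.0], ref_density=1.0)
# adjusted is [1.0, 1.0, 0.25]: only the datum in the dense region is reduced.
```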
The approach that is the subject of this presentation is instead to project the data onto a space-filling ‘Hilbert curve’, which associates with each datum a real parameter specifying its location along the curve, and to smooth the incidence function of the sorted data along the curve only at the observations themselves. The smoothing can be very simple, such as a constant-width moving average, and requires only the sequential locations of the sorted data, with no need for any regular auxiliary grid. Provided the Hilbert curve is constructed ‘isometrically’, with a constant ratio between the length of a curve segment and the corresponding area or volume of filled space, the resulting estimate of data density along the curve translates directly into an implied estimate of data density per unit area or volume. A single curve fills the region in an effectively random way, which introduces a substantial sampling error into the density estimate, but this unwanted error component is greatly reduced by averaging the estimates from several Hilbert curves defined with quasi-random orientations of their associated geometrical frameworks.
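As a rough illustration of the single-curve step in two dimensions, the following Python sketch maps points in the unit square to a Hilbert-curve parameter and applies a constant-width moving window along the curve. It is not the author's implementation: the names `hilbert_index` and `curve_density`, the unit-square domain, and the parameters `order` and `half_width` are all assumptions made here, and the averaging over several quasi-randomly oriented curves is not shown.

```python
import numpy as np

def hilbert_index(order, ix, iy):
    """Distance along a 2D Hilbert curve of the given order for the
    integer cell (ix, iy), using the standard bitwise construction."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if ix & s else 0
        ry = 1 if iy & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                ix, iy = n - 1 - ix, n - 1 - iy
            ix, iy = iy, ix
        s >>= 1
    return d

def curve_density(x, y, order=10, half_width=1e-3):
    """Estimate data density (points per unit area) at each datum.

    x, y are assumed to lie in the unit square [0, 1). Because the curve
    parameter t in [0, 1) is proportional to the area covered, a count per
    unit curve length is also a count per unit area.
    """
    n = 1 << order
    ix = np.minimum((np.asarray(x) * n).astype(int), n - 1)
    iy = np.minimum((np.asarray(y) * n).astype(int), n - 1)
    t = np.array([(hilbert_index(order, int(a), int(b)) + 0.5) / (n * n)
                  for a, b in zip(ix, iy)])
    ts = np.sort(t)
    # Count the data whose curve parameter falls inside each datum's window.
    lo = np.searchsorted(ts, t - half_width, side="left")
    hi = np.searchsorted(ts, t + half_width, side="right")
    # Clip the window length at the ends of the curve.
    length = np.minimum(t + half_width, 1.0) - np.maximum(t - half_width, 0.0)
    return (hi - lo) / length

# Four coincident points plus one isolated point.
xs = np.array([0.5, 0.5, 0.5, 0.5, 0.1])
ys = np.array([0.5, 0.5, 0.5, 0.5, 0.9])
dens = curve_density(xs, ys)
```

In this toy case the clustered points receive a density estimate four times that of the isolated point. No grid over the domain is ever allocated; the cost is dominated by the sort along the curve.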
We shall report on experiments that apply this approach to three-dimensionally distributed aircraft measurements on a global domain, and on preliminary results from applying the two-dimensional version of the approach to mesonet surface data in a mesoscale domain within the Real-Time Mesoscale Analysis (RTMA) scheme.
