Hydrometeor classification by means of statistical clustering: a semi-supervised approach

Besic, Nikola

We propose a new approach to hydrometeor classification from polarimetric radar data. It evolves from the unsupervised clustering method for hydrometeor classification proposed by Grazioli et al. in 2015. Briefly, the obtained clusters, which correspond to different classes of hydrometeors, are here identified and, to a degree, modified by involving a comprehensive scattering simulator. This supplemental process is the principal extension of the original algorithm. The constraints introduced through the scattering simulations of hydrometeors make this version semi-supervised. At this stage, in order to minimize the presence of mixtures, we do not consider resolution cells at very long range.

The strategy can be summarized as follows:

Step I: 'clusters and simulated models'. The clusters are derived by Agglomerative Hierarchical Clustering (AHC) with implicitly introduced spatial information. Polarimetric radar signatures of the hydrometeor classes are simulated with both single- and double-layer T-matrix scattering models. The simulations cover a wide range of input parameters, above all the Particle Size Distribution (PSD) parameters, which are derived from both our own measurements and measurements reported in the literature. With this step, we aim to obtain hypotheses that are as comprehensive as possible.
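The merging principle behind AHC can be illustrated with a minimal sketch. It assumes Euclidean distance and average linkage, neither of which is stated above, and it leaves out the spatial information entirely; the function names are hypothetical and the naive search is only practical for small samples.

```python
import math


def euclid(p, q):
    """Euclidean distance between two observation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between two clusters (lists of point indices)."""
    total = sum(euclid(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Assumption: average linkage on Euclidean distance, no spatial term.
    Returns a list of clusters, each a list of indices into points.
    """
    if n_clusters < 1 or not points:
        raise ValueError("need at least one point and n_clusters >= 1")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return clusters
```

Each observation would be the vector of polarimetric variables for one resolution cell; the stopping criterion (here a fixed number of clusters) is likewise an assumption of the sketch.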

Step II: 'cluster identification'. The five-dimensional distributions of the clusters are compared with their simulated counterparts for the hydrometeor classes. The comparison is built upon the Kolmogorov-Smirnov statistical test. Clusters that pass the test (null hypothesis accepted) are labeled immediately: they fit one of the simulated classes well enough, within a certain tolerance. Clusters for which the null hypothesis is rejected are sent to 'cleaning'.
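How such a test-based labeling could look is sketched below. The abstract does not say how the Kolmogorov-Smirnov test is extended to several dimensions, so the sketch assumes a two-sample test applied to each variable separately, with the largest per-variable statistic compared against the standard asymptotic critical value. The function names and the significance level are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical(n, m, alpha=0.05):
    """Asymptotic critical value of the two-sample KS statistic."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))


def identify_cluster(cluster, simulated, alpha=0.05):
    """Find the closest simulated class and test the cluster against it.

    cluster: list of observation tuples; simulated: dict label -> list of tuples.
    Assumption: variables are tested one at a time and the worst one decides.
    Returns (closest_label, distance, accepted).
    """
    n_vars = len(cluster[0])
    best_label, best_d, accepted = None, float("inf"), False
    for label, sample in simulated.items():
        d = max(
            ks_statistic([o[k] for o in cluster], [s[k] for s in sample])
            for k in range(n_vars)
        )
        if d < best_d:
            best_label, best_d = label, d
            accepted = d <= ks_critical(len(cluster), len(sample), alpha)
    return best_label, best_d, accepted
```

A cluster returned with `accepted` set to `False` would go to the cleaning step, together with the label of its closest simulated class.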

Step III: 'cluster cleaning'. The cleaning is a step backward with respect to Step I: the clusters, previously obtained by merging, are now iteratively divided in two. This operation is a form of reverse clustering which aims to bring one of the newly derived clusters closer to the closest simulated class, as determined by the Kolmogorov-Smirnov test. To achieve this, we compare an unconstrained and a constrained approach. The former relies on the k-medoids algorithm, while the latter exploits the statistics of the closest simulated class in splitting the cluster. The percentage of labeled observations is very satisfactory in both cases.
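The unconstrained split can be illustrated with a two-medoid partition. This is a minimal sketch of a plain alternating k-medoids scheme with k = 2, assuming Euclidean distance and a farthest-pair initialization; both are choices made for the example, not details given above, and the function names are hypothetical. The constrained variant is not sketched.

```python
import math


def euclid(p, q):
    """Euclidean distance between two observation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def split_in_two(points, max_iter=100):
    """Divide one cluster into two groups around two medoids.

    Assumption: initial medoids are the two most distant points.
    Returns (groups, medoids) as lists of indices into points.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to split")
    dist = [[euclid(p, q) for q in points] for p in points]
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    m0, m1 = max(pairs, key=lambda ij: dist[ij[0]][ij[1]])
    if dist[m0][m1] == 0.0:
        raise ValueError("all points coincide; nothing to split")
    medoids = [m0, m1]
    for _ in range(max(1, max_iter)):
        # Assignment: each point joins its nearest medoid.
        groups = [[], []]
        for i in range(n):
            nearest = 0 if dist[i][medoids[0]] <= dist[i][medoids[1]] else 1
            groups[nearest].append(i)
        # Update: the new medoid minimizes the total distance within its group.
        new = [min(g, key=lambda c: sum(dist[c][i] for i in g)) for g in groups]
        if new == medoids:
            break
        medoids = new
    return groups, medoids
```

After a split, each of the two groups would be tested again against the closest simulated class, and the division repeated on whatever still fails the test.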

In conclusion, by relocating the scattering simulations to the end of the chain, we try to limit the dependence of the decision on the always controversial choice of hypotheses about the electromagnetic properties of hydrometeors. An additional peculiarity of this version is that clutter is also treated as a class, which allows us to deal with its residuals after the AHC.

The method has been applied to X-band polarimetric datasets collected in Switzerland and in the region of Ardèche (France), as well as to C-band polarimetric datasets collected by the Albis radar of MeteoSwiss in Switzerland. The achieved benefits are illustrated through comparison with a rather classical supervised approach, considering ground truth data and the spatial consistency of the labeled bins.
