437554 Using Unbalanced Optimal Transport as an Image-to-Image Comparison Metric

Monday, 29 January 2024: 12:00 AM
345/346 (The Baltimore Convention Center)
Lander Ver Hoef, Colorado State University, Fort Collins, CO; and I. Ebert-Uphoff

When comparing two images, the most common methods in environmental science are pixel-wise metrics such as mean squared error (MSE). These methods are fast to compute and work well in many applications, but because they consider only the differences at each individual pixel, they do not respond to the spatial patterns that humans find informative. This limitation is part of the motivation behind the increasingly common use of the fractions skill score (FSS), which incorporates spatial information by locally aggregating pixel-wise metrics. In this work, we introduce a metric from applied mathematics that inherently incorporates spatial information.
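The limitation of pixel-wise metrics can be seen in a minimal sketch. The synthetic 16 x 16 images and the helper names `mse` and `blob` are illustrative assumptions, not data or code from this work. A bright feature displaced by 2 pixels and one displaced by 10 pixels receive exactly the same MSE, because neither overlaps the reference feature.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two equally shaped images."""
    return float(np.mean((a - b) ** 2))


def blob(col, size=16):
    """Hypothetical test image: one bright 2x2 feature whose left edge is at `col`."""
    img = np.zeros((size, size))
    img[7:9, col:col + 2] = 1.0
    return img


reference = blob(2)
near = blob(4)    # same feature, displaced by 2 pixels
far = blob(12)    # same feature, displaced by 10 pixels

mse_near = mse(reference, near)
mse_far = mse(reference, far)

print(mse_near, mse_far)
```

Both values equal 8/256, since in each case 4 reference pixels and 4 displaced pixels disagree by 1 and all other pixels agree. The size of the displacement leaves no trace in the score.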

The metric we will discuss is the Sinkhorn divergence [1], a generalization of the Wasserstein metric (also known as the earth mover's distance) within the class of problems known as optimal transport. Wasserstein metrics have been used many times in earth science, but they are typically applied to summary statistics of an image (such as pixel intensity histograms) to analyze how these overall statistics relate. Recent work in applied mathematics has produced generalizations and relaxations of the Wasserstein metric that are more easily computable, are differentiable, and do not require the objects being compared to have the same total sum. These generalized metrics allow us to compute useful distances directly between images.
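For readers who want the underlying formula, the following is a sketch of the definition as we understand it from [1], specialized to Kullback-Leibler marginal penalties. The symbols are assumptions introduced here for illustration: $\alpha$ and $\beta$ are the two images viewed as nonnegative measures, $C$ is the ground cost between pixel locations, $\pi_1$ and $\pi_2$ are the marginals of the plan $\pi$, $\varepsilon$ is the entropic regularization strength, $\rho$ controls how strongly mass creation and destruction are penalized, and $m(\cdot)$ denotes total mass.

```latex
\mathrm{OT}_{\varepsilon,\rho}(\alpha,\beta)
  = \min_{\pi \ge 0}\;
    \langle C, \pi \rangle
    + \varepsilon\,\mathrm{KL}(\pi \,\|\, \alpha \otimes \beta)
    + \rho\,\mathrm{KL}(\pi_1 \,\|\, \alpha)
    + \rho\,\mathrm{KL}(\pi_2 \,\|\, \beta)

S_{\varepsilon,\rho}(\alpha,\beta)
  = \mathrm{OT}_{\varepsilon,\rho}(\alpha,\beta)
    - \tfrac{1}{2}\,\mathrm{OT}_{\varepsilon,\rho}(\alpha,\alpha)
    - \tfrac{1}{2}\,\mathrm{OT}_{\varepsilon,\rho}(\beta,\beta)
    + \tfrac{\varepsilon}{2}\,\bigl(m(\alpha) - m(\beta)\bigr)^{2}
```

The two self-comparison terms remove the bias introduced by the entropic regularization, and the soft marginal penalties are what lift the requirement that the two images have the same total sum.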

The Sinkhorn divergence has several potential advantages as a metric in earth science. It incorporates spatial information directly and flexibly, since many different physical distance measures can serve as the underlying cost; it is based on a physically reasonable set of assumptions; and it produces not only a single number representing the distance between two images but also the optimal transport plan, which describes the lowest-cost way of transforming one image into the other. This transport plan is a key element that can be used to identify where the two images differ most.
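The idea of a transport plan can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the method of this work: it solves the balanced, entropy-regularized problem with plain Sinkhorn iterations (so both images are normalized to the same total mass), it omits the debiasing terms of the Sinkhorn divergence, and the images, the squared Euclidean pixel cost, and the name `sinkhorn_plan` are all hypothetical.

```python
import numpy as np


def sinkhorn_plan(a, b, cost, eps=1.0, n_iter=500):
    """Balanced entropic optimal transport via Sinkhorn iterations.

    a, b : nonnegative 1-D weights with equal total mass.
    cost : (len(a), len(b)) matrix of ground costs.
    Returns the transport plan P, with P[i, j] the mass moved from i to j.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)              # rescale to match the row marginal
        v = b / (K.T @ u)            # rescale to match the column marginal
    return u[:, None] * K * v[None, :]


n = 8
source = np.zeros((n, n))
target = np.zeros((n, n))
source[3:5, 1:3] = 1.0               # 2x2 feature
target[3:5, 4:6] = 1.0               # same feature, 3 columns to the right

# Flatten the images into normalized weight vectors.
a = source.ravel() / source.sum()
b = target.ravel() / target.sum()

# (row, col) coordinates of every pixel, in the same order as ravel().
coords = np.indices((n, n)).reshape(2, -1).T.astype(float)
cost = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)

P = sinkhorn_plan(a, b, cost)

# Mass-weighted average displacement (row, col) implied by the plan.
displacement = coords[None, :, :] - coords[:, None, :]
mean_shift = (P[:, :, None] * displacement).sum(axis=(0, 1))

print("transport cost:", (P * cost).sum())
print("mean shift (rows, cols):", mean_shift)
```

Unlike a pixel-wise score, the plan records where mass moves: here the mean shift is 0 rows and 3 columns, matching the displacement of the feature. An unbalanced formulation would additionally allow mass to be created or removed at a cost instead of requiring equal totals.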

In this work, we will provide a brief and intuitive introduction to optimal transport and the Sinkhorn divergence, along with several examples of how it can be used to analyze environmental science imagery, both for satellite imagery time series and for comparing model output with ground truth.

References:

[1] Séjourné, T., Feydy, J., Vialard, F., Trouvé, A., & Peyré, G. (2019). Sinkhorn divergences for unbalanced optimal transport. arXiv. https://doi.org/10.48550/arxiv.1910.12958
