J49.1 Addressing FAIR Data Principles Sustainably

Wednesday, 15 January 2020: 3:00 PM
157C (Boston Convention and Exhibition Center)
David W. Gallaher, Univ. of Colorado, Boulder, CO; and R. McAllister, C. Pankratz, J. Craft, G. Grant, and K. Schaefer

Massive changes on the horizon will challenge the FAIR principles. They affect how, when, and at what cost data is made available. With the advent of New Space, commercial interests are now collecting high-value data from space without government funding. In the next few years, the vast majority of Earth-observing satellites will be owned and controlled privately. These factors, along with recent advances in machine learning, artificial intelligence, and data mining, have laid bare the need for change in how data is accessed and made available. The challenge of the commercial cloud and its associated costs has driven many organizations to realize that the concept of free data and processing is not sustainable. Any sustainable solution for providing FAIR data must recognize that data has value, and that this value must be distributed to those involved in producing it. In other words, perhaps we should consider the acronym “FAIRS,” which adds the term “Sustainable,” since none of the other FAIR objectives can be met if the data and the platform for the data cannot be maintained in the long run.

As a critical component of a solution, we present the International Center for Earth Data (ICED), a cloud-based data marketplace that produces and disseminates low-latency Analysis-Ready Data (ARD) from heterogeneous Earth data sources. It is a market-sustainable platform: it addresses the problem of value distribution by letting data producers retain ownership of their data while selling it through ICED. Recognizing that Earth data has a time value, ICED gives researchers free access to Earth datasets once the commercial value of the data has “aged,” usually after a month or so. We believe that this is the proper way to fund data development and the systems that distribute the data while also satisfying the needs of scientists in the research community.

