Abstract: Quantitative Analysis of the Benefit of Ontologies and Rich Metadata for Earth Science Data Discovery (85th AMS Annual Meeting)

Monday, 10 January 2005: 11:00 AM

Quantitative Analysis of the Benefit of Ontologies and Rich Metadata for Earth Science Data Discovery

Carroll A. Hood, Raytheon, Aurora, CO; and L. D. Forsyth and L. Olsen

Earth science data and observations hold value beyond their originally intended research or operational application. In fact, the underlying premise for the Global Earth Observation System of Systems (GEOSS) is that these observations can be intelligently exploited to provide tangible economic and social benefits worldwide. A key enabler of this vision is semantic interoperability across disconnected or loosely connected data and knowledge domains.

Semantic interoperability is quite new, and few, if any, attempts have been made to quantify its benefits. This paper provides a quantitative view of the benefits of using ontologies with Earth science data to extract rich metadata and to empower diverse groups of users to make context-sensitive queries.

At the directory level, we will define an index, the Knowledge Benefit ratio (KB ratio), that will be used to quantitatively assess the benefit of using ontologies to assist in Earth science data discovery. A Figure-of-Merit (FOM) analysis will be used to compare the results of queries against the Global Change Master Directory (GCMD), both with and without the use of ontologies. We will compare the relative benefit of using a high-level ontology, such as the Semantic Web for Earth and Environmental Terminology (SWEET), a combination of domain-specific ontologies, and cross-mapped ontologies. The results may provide guidance for future development and exploitation of Earth science related ontologies.
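As a rough illustration of the kind of with-and-without comparison described above, the following sketch runs one query against a toy directory, first by plain keyword match and then with ontology-based term expansion, and scores each result set. Everything in it is an assumption made for illustration: the catalog entries, the ontology fragment, the relevance judgments, the use of F1 as a stand-in figure of merit, and the `benefit_ratio` helper. It is not the KB ratio or the FOM analysis defined in the paper, and it does not use the actual GCMD or SWEET content.

```python
# Toy directory: dataset id -> keywords. All entries are invented for illustration.
CATALOG = {
    "ds1": {"sea surface temperature", "ocean"},
    "ds2": {"sst", "satellite"},
    "ds3": {"precipitation", "rain gauge"},
    "ds4": {"ocean temperature", "buoy"},
}

# Hypothetical ontology fragment: a term mapped to equivalent or narrower terms.
ONTOLOGY = {
    "sea surface temperature": {"sst", "ocean temperature"},
}

# Assumed ground truth: datasets a user asking for sea surface temperature wants.
RELEVANT = {"ds1", "ds2", "ds4"}


def search(term, use_ontology=False):
    """Return ids of datasets whose keywords match the (optionally expanded) term."""
    terms = {term}
    if use_ontology:
        terms |= ONTOLOGY.get(term, set())
    return {ds for ds, keywords in CATALOG.items() if keywords & terms}


def precision(hits, relevant):
    """Fraction of returned datasets that are relevant."""
    return len(hits & relevant) / len(hits) if hits else 0.0


def recall(hits, relevant):
    """Fraction of relevant datasets that were returned."""
    return len(hits & relevant) / len(relevant) if relevant else 0.0


def f1(hits, relevant):
    """Harmonic mean of precision and recall, used here as a stand-in figure of merit."""
    p, r = precision(hits, relevant), recall(hits, relevant)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def benefit_ratio(term, relevant):
    """Hypothetical ratio: figure of merit with the ontology over figure of merit without."""
    baseline = f1(search(term), relevant)
    expanded = f1(search(term, use_ontology=True), relevant)
    return expanded / baseline if baseline else float("inf")


if __name__ == "__main__":
    query = "sea surface temperature"
    print("plain hits:   ", sorted(search(query)))
    print("expanded hits:", sorted(search(query, use_ontology=True)))
    print("benefit ratio:", benefit_ratio(query, RELEVANT))
```

In this toy setup the plain query finds only the dataset tagged with the exact phrase, while the expanded query also finds datasets tagged with related terms, so the ratio exceeds one. A real assessment would need many queries, independent relevance judgments, and some accounting for false positives introduced by overly broad expansion.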

For data discovery at the inventory level, we will demonstrate the utility of rich metadata for improving search efficiency. The same FOM analysis will be applied to provide a quantitative measure of the relative benefit of this approach. In an environment with high data rates and high data volumes, generating rich metadata could place an unrealistic burden on product processing. Thus, it is imperative that the approach to rich metadata be dynamically configurable (applied only to the right data at the right time) and responsive to evolving priorities. We will discuss some of the implications of these constraints for the development and exploitation of rich metadata methodologies.
