PEA: Phenomena Extraction Algorithm

Wednesday, 1 February 2006: 9:30 AM
A412 (Georgia World Congress Center)
Rahul Ramachandran, Univ. of Alabama, Huntsville, AL; and X. Li, S. Graves, R. D. Clark, and D. Fitzgerald


A phenomenon is defined as any state or process known through the senses rather than by intuition or reasoning, and thus is an observable event, especially something special or unusual. In the context of scientific research and analysis, one can define a Geophysical Phenomenon as an observable “region” that appears when the right science data fields are used. The values in this region may be higher or lower than the average background value, and may show higher variation or a stronger gradient than the remaining data points. Typically the spatial extent of such a geophysical phenomenon is much smaller than that of the full dataset. It can also have a temporal extent, meaning that the size and magnitude of the phenomenon, and the variation within the region, can change over time.

The feature detection and extraction step is key in any data mining process. Mining applications in the geosciences generally require algorithms for detecting and extracting geophysical phenomena in addition to other features in the data. These algorithms are typically developed using domain knowledge and thus tend to be application- and data-specific.

In this paper, we present the Phenomena Extraction Algorithm (PEA). PEA is a general-purpose algorithm for detecting and extracting geophysical phenomena in science datasets that does not depend on any specific science domain heuristics or target datasets. PEA is a recursive simplification algorithm based on the concepts of data decimation or simplification used in computer graphics. The algorithm uses a tree-based decomposition technique to recursively divide the data into multiple sections, and calculates an objective information measure for each data partition. If the objective measure for a given partition is greater than a user-specified threshold, the algorithm continues dividing that partition further. If the objective measure is less than the user threshold, the algorithm terminates that recursive path. The other termination condition occurs when the recursion reaches the lowest level, where the data cannot be partitioned any further. In this termination case, all the lowest-level data points are retained. The objective measure used in PEA is a combination of two statistical tests (F-test and t-test) to account for both the location and the spread of the data values. All the data points remaining after the decomposition are aggregated into regions using image processing techniques, with these regions representing the geophysical phenomena detected.
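
The recursion described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the tree is taken to be a quadtree over a 2-D grid, each partition is compared against the statistics of the whole grid, the two tests are combined by taking the larger of a Welch-style |t| statistic and a variance-ratio F statistic, the root is always split, and the final aggregation uses a plain 4-connected flood fill. The names summarize, objective, extract_points, and group_regions are hypothetical.

```python
import math
from statistics import fmean, pvariance


def summarize(values):
    """Return (mean, population variance, count) for a list of numbers."""
    return fmean(values), pvariance(values), len(values)


def objective(part, background, eps=1e-12):
    """Assumed measure: the larger of a Welch-style |t| and a variance ratio F."""
    m1, v1, n1 = part
    m0, v0, n0 = background
    t = abs(m1 - m0) / math.sqrt(v1 / n1 + v0 / n0 + eps)  # location
    f = (v1 + eps) / (v0 + eps)                            # spread
    return max(t, f)


def extract_points(grid, threshold):
    """Recursively quarter the grid; return the cells that survive."""
    background = summarize([v for row in grid for v in row])
    kept = []

    def recurse(r0, r1, c0, c1, is_root=False):
        if r1 - r0 == 1 and c1 - c0 == 1:
            kept.append((r0, c0))  # lowest level: retain the data point
            return
        if not is_root:  # assumption: the root is always split
            block = [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            if objective(summarize(block), background) <= threshold:
                return  # measure not above threshold: end this path
        rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
        row_spans = [(r0, rm), (rm, r1)] if r1 - r0 > 1 else [(r0, r1)]
        col_spans = [(c0, cm), (cm, c1)] if c1 - c0 > 1 else [(c0, c1)]
        for ra, rb in row_spans:
            for ca, cb in col_spans:
                recurse(ra, rb, ca, cb)

    recurse(0, len(grid), 0, len(grid[0]), is_root=True)
    return sorted(kept)


def group_regions(points):
    """Group retained cells into 4-connected regions (simple flood fill)."""
    remaining = set(points)
    regions = []
    while remaining:
        stack = [remaining.pop()]
        region = []
        while stack:
            r, c = stack.pop()
            region.append((r, c))
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    stack.append(nb)
        regions.append(sorted(region))
    return sorted(regions)


# Synthetic 8x8 field: a low-amplitude checkerboard background
# with one 2x2 high-valued anomaly at rows 4-5, columns 4-5.
grid = [[0.1 * ((r + c) % 2) for c in range(8)] for r in range(8)]
for r in (4, 5):
    for c in (4, 5):
        grid[r][c] = 5.0

points = extract_points(grid, threshold=2.5)
regions = group_regions(points)
print(points)
print(regions)
```

On this synthetic field, only the quadrant containing the anomaly exceeds the threshold, so the recursion descends there and stops everywhere else; the four anomaly cells are retained and grouped into a single region. The threshold of 2.5 was chosen by hand for this toy data and carries no meaning for real datasets.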