10.5
PEA: Phenomena Extraction Algorithm
Feature detection and extraction is a key step in any data mining process. Mining applications in the geosciences generally require algorithms for detecting and extracting geophysical phenomena in addition to other features in the data. These algorithms are typically developed using domain knowledge and thus tend to be application- and data-specific.
In this paper, we present the Phenomena Extraction Algorithm (PEA), a general-purpose algorithm for detecting and extracting geophysical phenomena in science datasets that does not depend on domain-specific heuristics or on any particular target dataset. PEA is a recursive simplification algorithm based on the data decimation (simplification) concepts used in computer graphics. The algorithm uses a tree-based decomposition technique to recursively divide the data into multiple sections and calculates an objective information measure for each data partition. If the objective measure for a given partition is greater than a user-specified threshold, the algorithm continues to divide that partition. If the measure is less than the threshold, the algorithm terminates that recursive path. Recursion also terminates when it reaches the lowest level, where the data cannot be partitioned any further; in this case, all of the lowest-level data points are retained. The objective measure used in PEA combines two statistical tests (the F-test and the t-test) to account for both the spread and the location of the data values. All data points remaining after the decomposition are aggregated into regions using image processing techniques, and these regions represent the detected geophysical phenomena.
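The recursive decomposition described above can be illustrated with a minimal one-dimensional sketch in Python. It is not the authors' implementation, and several details are assumptions made only for illustration: the binary split into halves, the function names `heterogeneity`, `decompose`, and `group_runs`, the `min_size` parameter, and in particular the way the two tests are combined (the larger of the absolute Welch t statistic and the variance ratio minus one, using raw statistics rather than p-values). The abstract does not specify any of these. Grouping consecutive retained indices into runs stands in for the image processing step that aggregates points into regions.

```python
import math
from statistics import mean, variance


def heterogeneity(left, right):
    """Hypothetical objective measure (an assumption, not the paper's formula):
    the larger of |Welch t| (location) and F ratio - 1 (spread)."""
    ml, mr = mean(left), mean(right)
    vl, vr = variance(left), variance(right)
    se = math.sqrt(vl / len(left) + vr / len(right))
    if se > 0:
        t = abs(ml - mr) / se
    else:
        # Both halves are constant: identical means give 0, otherwise infinity.
        t = 0.0 if ml == mr else math.inf
    lo_v, hi_v = min(vl, vr), max(vl, vr)
    if lo_v > 0:
        f = hi_v / lo_v
    else:
        f = 1.0 if hi_v == 0 else math.inf
    return max(t, f - 1.0)


def decompose(data, threshold, lo=0, hi=None, min_size=4):
    """Return the indices retained by the recursive decomposition of data[lo:hi]."""
    if hi is None:
        hi = len(data)
    if hi - lo < min_size:
        # Lowest level: cannot partition further, so keep every point.
        return list(range(lo, hi))
    mid = (lo + hi) // 2
    if heterogeneity(data[lo:mid], data[mid:hi]) <= threshold:
        # Measure does not exceed the threshold: terminate this path.
        return []
    return (decompose(data, threshold, lo, mid, min_size)
            + decompose(data, threshold, mid, hi, min_size))


def group_runs(indices):
    """Aggregate sorted retained indices into (start, end) regions."""
    regions = []
    for i in indices:
        if regions and i == regions[-1][1] + 1:
            regions[-1][1] = i
        else:
            regions.append([i, i])
    return [tuple(r) for r in regions]


# Usage: a flat signal with a single spike at index 5.
signal = [0.0] * 16
signal[5] = 10.0
kept = decompose(signal, threshold=2.0)
print(group_runs(kept))  # prints [(4, 7)]
```

In this sketch a flat signal is discarded at the first level, while the spike survives down to the lowest level and is reported as one region. The simple half-versus-half comparison can miss features that are symmetric about a split point, so a realistic measure would need more care than this illustration provides.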