Thus, a project was recently funded by the NASA Earth Science Technology Office (ESTO) to create an Automated Event Service (AES), based initially on NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) data, that will provide 1) an intuitive web interface for basic event definitions, 2) an Event Specification Language (ESL), modeled after popular scripting languages (e.g., Python), for more sophisticated event definitions, 3) a social component through which scientists with similar interests can collaborate on event definitions, 4) a database to catalogue potential event definitions and query results, and 5) a linkage to find corresponding data in NASA's vast store of remote sensing observations. The project team intends to make the service interactive; that is, we aim to return event query results in real time. Meeting this goal necessitates the application of data-intensive techniques.
In search of an efficient technique for AES, we have evaluated the following data-intensive techniques: SciDB, MapReduce (MR) combined with the Hadoop Distributed File System (HDFS), and a custom technique. The evaluation was performed on a “junior” AES with a reduced set of capabilities and on a subcollection of the intended datasets. In this presentation, we report the respective strengths and weaknesses of these techniques and the lessons learned from each.