EnviroNet: ImageNet for Environment

Mukkavilli, Surya Karthik

Geoscience must address several critical problems facing society as our planet exceeds its boundaries: climate change impacts, ocean acidification, air pollution, atmospheric aerosol loading, ecological change resulting in biodiversity loss and land system change, and consumption of natural resources such as water [1]. As geoscience enters the era of big data, machine learning and computer vision, which have been widely successful in commercial artificial intelligence (AI) domains, offer immense potential to contribute to problems in geoscience. Problems in geoscience, however, pose several unique challenges that are seldom found in traditional AI applications, requiring novel algorithm formulations and methods for capturing features of data (e.g. fluid segmentation and feature characterisation) [2].

'ImageNet' refers to one of the largest labelled data repositories, which has been successfully used to track the progress of AI algorithms through the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [3, 4]. The database and challenge have led to successful commercial applications in the technology industry. However, their design focused solely on the recognition of objects in human-captured imagery, which is not well suited to the geosciences; these require a new standard dataset and new challenge formulations. For example, geoscience objects include waves, flows, and structures in all phases of matter. Hence, from a purely object recognition perspective, the structure and patterns of geoscience objects, which can exist at multiple scales in continuous spatio-temporal fields, are much more complex than those found in the discrete spaces that AI algorithms typically work with (e.g. items in a basket, or tagging dog and cat images). With geoscience changing from a data-poor to a data-rich field, algorithms such as deep convolutional neural networks, which dramatically reduced ILSVRC error rates in the 2010s, are increasingly being applied to geoscience problems. It is therefore conceivable that a similar ImageNet designed for the geosciences, together with a global challenge, may help bridge gaps with AI and advance both fields.

This work thus presents an ongoing proposal and project update on the creation of an ImageNet analog for the environmental sciences. Our efforts will contribute a novel labelled dataset, called EnviroNet, for the wider community across both the geoscience and AI fields, along with the design of an EnviroNet-International planetary computing challenge (E-IPCC). At present, feedback on EnviroNet is being sought from various universities, industry, research labs and startups, with support from the American Meteorological Society Committee on Artificial Intelligence Applications to Environmental Science and collaborators. The EnviroNet project will seek to provide a new benchmark and labelled dataset, currently lacking in geoscience, to track the progress of AI algorithms and to bridge gaps between the two fields. Similar to the ImageNet categories, but aimed instead at geoscience challenges, the project's tasks will be to: 1) localise model or satellite data against ground observations; 2) classify events, including extremes; 3) track events in spatio-temporal fields; and, in a category unique to the geosciences, 4) forecast future events. It is expected that the EnviroNet labelled dataset and challenge will be made available at groundobs.org.

[1] Rockström, J., Steffen, W., Noone, K., Persson, Å., Chapin III, F. S., Lambin, E. F., ... & Nykvist, B. (2009). A safe operating space for humanity. Nature, 461(7263), 472.

[2] Karpatne, A., Ebert-Uphoff, I., Ravela, S., Babaie, H. A., & Kumar, V. (2018). Machine Learning for the Geosciences: Challenges and Opportunities. IEEE Transactions on Knowledge and Data Engineering.

[3] Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009) (pp. 248-255). IEEE.

[4] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Berg, A. C. (2015). ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 115(3), 211-252.
