Thursday, 1 February 2024: 2:15 PM
317 (The Baltimore Convention Center)
Machine Learning (ML) approaches have been shown to outperform statistical approaches in most cases. However, results in the literature are generally difficult to reproduce when the code or datasets are not available or the runtime environment is not well documented. To combat this, the ML community is moving towards an open-source approach in which authors are encouraged to make code and data publicly available to ensure reproducibility.
However, there is no single catalog of all publicly available datasets and models; instead, one-off repositories are scattered across various sites. Each repository provides its own pipeline for transforming data, training a model, performing prediction, and evaluating the results. In essence, researchers are reinventing the wheel by implementing their own ML pipelines, even though most of the pipeline can be generalized and streamlined for a variety of problem areas.
NextGen Federal Systems’ cloud-based StratusML platform addresses these issues by providing a one-stop shop to develop and operationally deploy new ML models at scale, using an efficient, state-of-the-art, cloud-agnostic pipeline approach to train and evaluate models on public and user-provided data sources. StratusML provides data pre-processing features such as normalization and outlier detection, data labeling capabilities, and data experimentation and visualization tools. Users can quickly kick off model training, prediction, and evaluation tasks and monitor the progress of each throughout its runtime, all in a code-less environment. StratusML-trained models are easily extracted for inference in an operational environment.
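As an illustration of the pre-processing features mentioned above, the following is a minimal sketch of z-score normalization and z-score-based outlier flagging. The abstract does not state which normalization or outlier-detection methods StratusML implements, so the technique, the function names (`zscore_normalize`, `flag_outliers`), and the default threshold are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def zscore_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation.

    Hypothetical helper; not taken from StratusML.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flag_outliers(values, threshold=3.0):
    """Return the indices of values whose absolute z-score exceeds threshold.

    The default of 3.0 is a common convention, assumed here for illustration.
    """
    scores = zscore_normalize(values)
    return [i for i, score in enumerate(scores) if abs(score) > threshold]
```

For very small samples the largest attainable z-score is bounded by the sample size, so a lower threshold (or a different detection rule) may be needed there.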
StratusML currently houses over 25 state-of-the-art models across various task types, including time-series forecasting, object detection and tracking, image classification, image and text generation, and text analysis. Specifically, tenants are using the platform for forecasting and identifying impacts to aviation operations. Tenants are using StratusML to forecast contrail formation and persistence, aviation turbulence, and icing; detect contrail formation; classify cloud types; forecast space weather; and generate global synthetic weather radar products.
For the contrail formation modeling effort, the team recognized that the biggest hurdle in developing state-of-the-art models was the lack of contrail observation data. Using StratusML, the contrail team created, evaluated, and registered a standardized contrails database that can be shared across StratusML users. Teams were able to prototype new models using this data and easily compare model performance using the standardized test set. Users developed several ML models, including random forests and deep neural networks, but saw the best performance with a Fully Convolutional Network (FCN), which achieved an F1 score of 86% for predicting contrail formation. The standardized (and growing) contrails dataset and the associated trained and tested models are housed in StratusML, so users can develop new models or use one of the existing contrails models for their own needs.
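For readers unfamiliar with the metric, the F1 score quoted above is the harmonic mean of precision and recall. The sketch below computes it for binary labels; the function name and the toy labels in the usage note are made up for illustration and are not the contrail team's data or evaluation code.

```python
def f1_score(y_true, y_pred):
    """Compute the binary F1 score, treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

With the toy labels `y_true = [1, 1, 1, 0, 0, 1]` and `y_pred = [1, 1, 0, 0, 1, 1]`, there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.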
Tenants of the platform are also looking at forecasting turbulence and icing to help mitigate their effects on aviation operations. The turbulence team provided commercially purchased IATA Eddy Dissipation Rate (EDR) data to the platform as a training and testing dataset. The same team is providing a year’s worth of Tropospheric Airborne Meteorological Data Reporting (TAMDAR) sensor system data to StratusML to develop an aviation icing product. The team seeks to adapt the StratusML-registered FCN model for forecasting icing, as it is the highest-performing model for both the contrail formation and turbulence forecasting applications. This generalizability of the StratusML models allows for re-use across problem sets, reducing the cost and time associated with ML prototyping.
The backbone of the StratusML code-less environment is a standardized API to which all ML artifacts (models, transforms, and evaluators) within the platform adhere. All ML artifacts within StratusML are wrapped against the API and containerized through Docker for distribution. These Docker containers hold all the code and necessary environment variables of the artifact, leaving it to StratusML, not the end user, to build the environment before the artifact is used.
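To make the idea of a shared artifact API concrete, the sketch below shows one way the three artifact types could expose a common interface so that a platform can chain them without artifact-specific code. The StratusML API itself is not described in this abstract; every class name, method name, and toy implementation here is hypothetical.

```python
from abc import ABC, abstractmethod


class Transform(ABC):
    """Hypothetical interface for a data transform artifact."""

    @abstractmethod
    def apply(self, data): ...


class Model(ABC):
    """Hypothetical interface for a model artifact."""

    @abstractmethod
    def train(self, features, targets): ...

    @abstractmethod
    def predict(self, features): ...


class Evaluator(ABC):
    """Hypothetical interface for an evaluator artifact."""

    @abstractmethod
    def evaluate(self, targets, predictions): ...


def run_pipeline(transform, model, evaluator, train_x, train_y, test_x, test_y):
    """Chain any conforming artifacts: transform, train, predict, evaluate."""
    model.train(transform.apply(train_x), train_y)
    predictions = model.predict(transform.apply(test_x))
    return evaluator.evaluate(test_y, predictions)


# Toy artifacts, included only to show that the interface is usable.
class ScaleTransform(Transform):
    def __init__(self, divisor):
        self.divisor = divisor

    def apply(self, data):
        return [x / self.divisor for x in data]


class MeanModel(Model):
    """Ignores the features and always predicts the mean training target."""

    def train(self, features, targets):
        self.mean_target = sum(targets) / len(targets)

    def predict(self, features):
        return [self.mean_target for _ in features]


class MeanAbsoluteError(Evaluator):
    def evaluate(self, targets, predictions):
        errors = [abs(t - p) for t, p in zip(targets, predictions)]
        return sum(errors) / len(errors)
```

Because `run_pipeline` depends only on the abstract interfaces, any artifact that implements them can be swapped in, which is the property that would allow a code-less front end to drive training and evaluation.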
StratusML takes advantage of this containerization technique for deploying the developed models to operations. For example, the Global Synthetic Weather Radar team is working on hardening synthetic weather radar models and deploying them to operations. The team can experiment with and train new iterations of the synthetic weather radar model within the platform. The containerized model is easily extracted and deployed to their operational environment, relying on the Docker container to set up the environment for the model to run.
NextGen has completed the first Research to Operations (R2O) transition of StratusML, deploying the platform to the Air Force Weather Cloud environment. The transition was an important step in expanding StratusML, moving it from a research project to an operational ecosystem. Subsequently, the ML models developed using the platform, including contrail formation and aviation turbulence, are being transitioned to an operational testbed. The platform is customizable and portable to other cloud environments through our tailorable deployment scripts. StratusML provides a library of containerized models and data-processing tools that can be readily deployed to any cloud environment through Kubernetes. The platform and registered artifacts adhere to DoD cybersecurity standards (NIST RMF, ATO), and continuous cybersecurity scanning of the platform is performed through Continuous Integration/Continuous Delivery pipelines.
StratusML helps bridge the gap between DoD-mission SMEs and ML SMEs by providing a centralized, code-less platform to rapidly prototype state-of-the-art models using streamlined processes. This reduces duplication of work and provides life-cycle management of ML artifacts to ensure reproducibility. StratusML allows for rapid experimentation and prototyping of ML techniques, accelerating the advancement of AI and ML across a variety of domains in the DoD, most notably for aviation operational applications.

