16B.5 A Machine Learning System in GOES-R Ground System for Monitoring Satellite Health & Safety

Thursday, 1 February 2024: 5:30 PM
316 (The Baltimore Convention Center)
Zhenping Li, ASRC Federal, Beltsville, MD

AI and machine learning (ML) technologies are critical components of ground system architectures for resilient space missions. This presentation describes an ML system currently deployed in the GOES-R ground system for monitoring satellite health and safety. The system scales to around 1800 telemetry datasets for each satellite in the GOES constellation and is extensible to meet the requirements of different missions. ML operation is defined by a database that provides hierarchical data definitions with ML attributes. Each dataset is associated with an ML algorithm implemented as a plug-and-play software component with a standard API. The collection of ML algorithms in the system makes it possible to select an algorithm for a specific dataset based on the complexity of its data pattern, which ensures both efficiency and accuracy in data training operations and enables rapid deployment to new missions. Under the incremental operation concept, ML data training is performed in periodic sessions, and the training outputs of previous sessions are used as inputs to the current session. This has been critical to data training efficiency in operational environments and allows training outputs to adapt to seasonal data pattern changes. Anomaly detection in an ML solution means finding unexpected data pattern changes in telemetry datasets. However, normal satellite operations, such as periodic satellite maneuvers, also lead to data pattern changes in telemetry datasets. The challenge for ML solutions in satellite health and safety monitoring is to separate data pattern changes due to normal operation events from those caused by anomalies. In addition, interactions among different subsystems in a satellite lead to correlations among datasets, so data pattern changes generally occur across multiple datasets in different subsystems during normal operation events or anomalies.
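The plug-and-play algorithm components and the incremental operation concept described above can be illustrated with a minimal sketch. All names here (MLAlgorithm, TrainingState, RunningStats, ALGORITHMS, run_session) are hypothetical and are not taken from the GOES-R system; the running mean and variance update stands in for whatever algorithms the system actually provides. The sketch only shows how a common interface lets a per-dataset definition select an algorithm, and how the output of one training session becomes the input of the next.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Type


@dataclass
class TrainingState:
    """Output of one training session, reused as input to the next (hypothetical)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    @property
    def std(self) -> float:
        return math.sqrt(self.m2 / self.count) if self.count else 0.0


class MLAlgorithm(ABC):
    """Hypothetical standard API that every plug-in algorithm implements."""

    @abstractmethod
    def train(self, samples: Sequence[float],
              prior: Optional[TrainingState]) -> TrainingState:
        ...

    @abstractmethod
    def is_outlier(self, value: float, state: TrainingState) -> bool:
        ...


class RunningStats(MLAlgorithm):
    """Simple stand-in algorithm: incremental mean/variance (Welford update)."""

    def __init__(self, n_sigma: float = 3.0) -> None:
        self.n_sigma = n_sigma

    def train(self, samples, prior):
        # Start from the previous session's output instead of retraining from scratch.
        state = TrainingState() if prior is None else TrainingState(
            prior.count, prior.mean, prior.m2)
        for x in samples:
            state.count += 1
            delta = x - state.mean
            state.mean += delta / state.count
            state.m2 += delta * (x - state.mean)
        return state

    def is_outlier(self, value, state):
        return abs(value - state.mean) > self.n_sigma * state.std


# Registry of available plug-ins, keyed by the name a dataset definition refers to.
ALGORITHMS: Dict[str, Type[MLAlgorithm]] = {"running_stats": RunningStats}


def run_session(dataset_defs: Dict[str, str],
                session_data: Dict[str, Sequence[float]],
                prior_states: Dict[str, TrainingState]) -> Dict[str, TrainingState]:
    """Run one periodic training session over every defined dataset."""
    new_states = {}
    for dataset, algo_name in dataset_defs.items():
        algo = ALGORITHMS[algo_name]()
        new_states[dataset] = algo.train(session_data.get(dataset, []),
                                         prior_states.get(dataset))
    return new_states


# Two consecutive sessions for one hypothetical dataset.
defs = {"battery_temp": "running_stats"}
s1 = run_session(defs, {"battery_temp": [1.0, 2.0, 3.0]}, {})
s2 = run_session(defs, {"battery_temp": [4.0, 5.0]}, s1)
```

In this sketch the second session processes only the two new samples, yet its state matches what a single pass over all five samples would produce. A plain running statistic never forgets old data, so adapting to seasonal changes would need something more, such as a forgetting factor; that part is not shown here.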
The ML system provides an ML event representation based on data pattern changes in telemetry datasets across different subsystems, which captures the signatures of normal operation events and the characteristics of anomalies through patterns of correlations among datasets. Anomaly detection is achieved by applying a clustering algorithm to the ML event representations, which removes the false positives that have been a significant challenge for ML solutions in satellite health and safety monitoring.
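The idea of clustering event representations can be sketched as follows. This is an illustration under stated assumptions, not the algorithm used in the GOES-R system: each event is assumed to be a binary vector marking which datasets showed a data pattern change, a simple leader-style threshold clustering with Jaccard distance is swapped in for the unspecified clustering algorithm, and events that fall in a recurring cluster are assumed to be normal operation events while isolated events are flagged as anomaly candidates. The function names and the dataset ordering are hypothetical.

```python
from typing import Dict, List, Sequence


def jaccard_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Distance between two binary 'which datasets changed' vectors."""
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        return 0.0
    inter = sum(1 for x, y in zip(a, b) if x and y)
    return 1.0 - inter / union


def leader_cluster(events: Sequence[Sequence[int]], threshold: float) -> List[int]:
    """Assign each event to the first cluster leader within `threshold`,
    or start a new cluster with the event as its leader."""
    leaders: List[List[int]] = []
    labels: List[int] = []
    for ev in events:
        for idx, leader in enumerate(leaders):
            if jaccard_distance(ev, leader) <= threshold:
                labels.append(idx)
                break
        else:
            leaders.append(list(ev))
            labels.append(len(leaders) - 1)
    return labels


def flag_anomalies(events: Sequence[Sequence[int]],
                   threshold: float = 0.5,
                   min_cluster_size: int = 2) -> List[bool]:
    """Assumption: recurring signatures are normal events; rare ones are anomalies."""
    labels = leader_cluster(events, threshold)
    sizes: Dict[int, int] = {}
    for label in labels:
        sizes[label] = sizes.get(label, 0) + 1
    return [sizes[label] < min_cluster_size for label in labels]


# Hypothetical dataset order: thruster_temp, wheel_speed, bus_voltage, star_tracker
events = [
    [1, 1, 0, 0],  # maneuver-like signature
    [1, 1, 0, 0],  # same signature again
    [1, 1, 1, 0],  # similar signature with one extra dataset
    [0, 0, 0, 1],  # isolated change in an unrelated dataset
]
flags = flag_anomalies(events)
```

Here the three maneuver-like events share one cluster and are not flagged, while the isolated fourth event is. A detector that thresholds each dataset separately would have raised an alert for every one of these changes, which is the false-positive problem the event-level view is meant to reduce.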