92nd American Meteorological Society Annual Meeting (January 22-26, 2012)

Tuesday, 24 January 2012
Ensuring Data Integrity During Subsystem Failure Recovery within the NOAA Jason Ground System
Kirill Lokshin, Ingenicomm, Inc., Centreville, VA; and A. Puri, F. Tao, A. Agarwal, and S. Tehranian

The National Oceanic and Atmospheric Administration (NOAA) Jason Ground System (NJGS) is a consolidated next-generation ground system that will support the simultaneous operation of the OSTM/Jason-2 and Jason-3 ocean surface topography missions. The NJGS will consist of several independent subsystems for spacecraft command and control, telemetry processing, and data archiving and distribution.

To provide high availability and multi-level resilience against equipment failures, the NJGS will employ a subsystem-level redundancy scheme in which two or more independent instances of each subsystem provide fully redundant functionality. This scheme requires several safeguards to ensure that a subsystem failure and the resulting system recovery operations do not result in any loss of operational data.

This paper discusses the key elements of the subsystem-level redundancy scheme and the mechanism through which the NJGS recovers from a subsystem failure. It focuses on the failure scenarios that can arise during the recovery process and on the technical and procedural safeguards necessary to ensure data integrity across subsystem instances.

