Information-Based Analysis of Data Assimilation
To analyze DA assumptions, uncertainty is quantified as the Shannon-type entropy of a discretized probability distribution. The maximum amount of information that can be extracted from observations about model states is the mutual information between states and observations, which is equal to the reduction in entropy in our estimate of the state due to Bayesian filtering. The difference between this potential and the actual reduction in entropy due to Kalman (or other type of) filtering measures the inefficiency of the filter assumptions. Residual uncertainty in DA posterior state estimates can be attributed to three sources: (i) non-injectivity of the observation operator, (ii) noise in the observations, and (iii) filter approximations. The contribution of each of these sources is measurable in an OSSE setting.
The amount of information extracted from observations by data assimilation (or system identification, including parameter estimation) can also be measured by Shannon's theory. Since practical filters are approximations of Bayes' law, it is important to know whether the information that is extracted form observations by a filter is reliable. We define information as either good or bad, and propose to measure these two types of information using partial Kullback-Leibler divergences. Defined this way, good and bad information sum to total information. This segregation of information into good and bad components requires a validation target distribution; in a DA OSSE setting, this can be the true Bayesian posterior, but in a real-world setting the validation target might be determined by a set of in situ observations.