The biogeochemical activity of the oceans plays an important part in the carbon cycle. Climate change and an increase in the amount of available atmospheric carbon affect oceanic primary production, which in turn changes oceanic biogeochemical activity. By modifying albedo and carbon fixation rates, this feeds back on the atmospheric carbon concentration and on the climate. It is therefore important to determine oceanic primary production more accurately. Chlorophyll-a is one indicator of primary production.
Many algorithms have been developed to infer the surface chlorophyll-a concentration from satellite imagery (Gordon et al. 1997; Brajard et al. 2008; SeaWiFS chlorophyll algorithms). It has also been shown that the vertical phytoplankton distribution is correlated with surface data (Uitz et al. 2006; Demarcq et al. 2008). However, standard models of oceanic production have been shown to diverge from in situ measurements (Kane et al. 2010). This is largely due either to incomplete knowledge of the complex relations between the parameters that influence the oceanic primary production system, or to the difficulty of modelling these relations.
This paper presents a statistical inversion approach for determining vertical chlorophyll-a distributions from satellite imagery. The approach combines two distinct statistical models: Hidden Markov Models and Self-Organising Maps. Hidden Markov Models allow us to learn the probabilistic links between two related time series of discrete states: the observable series and the hidden one. In our study, however, we have concurrent, continuous, multidimensional time series corresponding to the satellite observations and to the hidden vertical distributions. We therefore need to transform these into two discrete-state time series, which we do with Self-Organising Maps.
Self-Organising Maps discretise multidimensional data into an appropriate number of typical situations, or classes, each represented by a vector in the multidimensional data space and by a position on a topological map. Elements that are close on the topological map have vectors that are close in the data space. This topology also permits a modification of the training algorithms of the Hidden Markov Models that circumvents problematic null-probability links and allows ergodicity within the Hidden Markov Model. Once we have obtained the discrete-state time series and completed the learning phase of the Hidden Markov Models, we can use them to infer the most likely time series of hidden discrete states corresponding to a given time series of observed discrete states (Fig. 1).
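The discretisation step can be illustrated with a minimal sketch. It assumes a one-dimensional map (a chain of units), a Gaussian neighbourhood, and linearly decaying learning rate and radius; these choices, the toy two-dimensional data, and the names `train_som`, `best_matching_unit`, and `discretize` are illustrative assumptions, not the configuration used in this study.

```python
import math
import random

def best_matching_unit(x, prototypes):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in prototypes]
    return dists.index(min(dists))

def train_som(data, n_units=3, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM (assumed topology) and return its prototype vectors."""
    rng = random.Random(seed)
    # Initialise each unit on a randomly chosen data sample.
    prototypes = [list(rng.choice(data)) for _ in range(n_units)]
    sigma0 = n_units / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):    # one shuffled pass over the data
            bmu = best_matching_unit(x, prototypes)
            for u, p in enumerate(prototypes):
                # Units close to the winner on the map are pulled more strongly.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(p)):
                    p[d] += lr * h * (x[d] - p[d])
    return prototypes

def discretize(series, prototypes):
    """Replace every continuous sample by the index of its closest unit."""
    return [best_matching_unit(x, prototypes) for x in series]

# Toy continuous time series with two clearly separated regimes.
series = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.1), (5.2, 4.9), (0.2, 0.1)]
prototypes = train_som(series)
states = discretize(series, prototypes)
print(states)
```

Each entry of `states` is a map index, so the continuous series becomes a discrete-state series that a Hidden Markov Model can consume. In practice the input variables would normally be normalised first so that no single variable dominates the distance.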
In our problem, this allows us to reconstruct, at a given point of the ocean, a time series of states of the vertical chlorophyll-a distribution from a time series of sea-surface observation states. The discrete sea-surface observation states are obtained by classifying concurrent time series of sea-surface elevation, solar radiation flux, wind speed, surface chlorophyll-a concentration, and sea-surface temperature. The discrete hidden states correspond to chlorophyll-a concentrations at seventeen depth levels from 5 to 220 metres below the sea surface.
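The reconstruction step can be sketched as a count-based Hidden Markov Model followed by Viterbi decoding. The sketch assumes that paired hidden and observed state sequences are available for training, that states lie on a one-dimensional map, that the initial distribution is uniform, and that each count is spread over map neighbours with a Gaussian kernel so that no probability is exactly zero. This smoothing is only one plausible reading of the topology-based modification mentioned above, not necessarily the training algorithm used here; all names and the toy sequences are hypothetical.

```python
import math

def neighbourhood(n_states, sigma=0.5):
    """Row-normalised Gaussian kernel over positions on an assumed 1-D map."""
    rows = []
    for i in range(n_states):
        w = [math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2)) for j in range(n_states)]
        total = sum(w)
        rows.append([v / total for v in w])
    return rows

def normalise(rows):
    """Turn count rows into probabilities; rows without counts become uniform."""
    out = []
    for row in rows:
        total = sum(row)
        out.append([v / total for v in row] if total > 0 else [1.0 / len(row)] * len(row))
    return out

def estimate(hidden, observed, n_hidden, n_obs, sigma=0.5):
    """Estimate transition and emission matrices from paired state sequences."""
    kh, ko = neighbourhood(n_hidden, sigma), neighbourhood(n_obs, sigma)
    trans = [[0.0] * n_hidden for _ in range(n_hidden)]
    emit = [[0.0] * n_obs for _ in range(n_hidden)]
    for a, b in zip(hidden, hidden[1:]):
        for j in range(n_hidden):
            trans[a][j] += kh[b][j]   # spread the count over neighbours of b
    for h, o in zip(hidden, observed):
        for k in range(n_obs):
            emit[h][k] += ko[o][k]    # spread the count over neighbours of o
    return normalise(trans), normalise(emit)

def viterbi(obs, start, trans, emit):
    """Most likely hidden state sequence for obs, computed in log space."""
    n = len(start)
    score = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + math.log(emit[s][o]))
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):        # follow the back-pointers to the start
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy training data: three hidden states and three observation states.
hidden = [0, 0, 1, 1, 2, 2, 1, 0, 0, 1, 2, 2]
observed = [0, 0, 1, 1, 2, 2, 1, 0, 0, 1, 2, 2]
trans, emit = estimate(hidden, observed, n_hidden=3, n_obs=3)
start = [1.0 / 3] * 3                 # assumed uniform initial distribution
path = viterbi([0, 1, 2, 2, 1], start, trans, emit)
print(path)  # [0, 1, 2, 2, 1]
```

The transition from state 0 to state 2 never occurs in the toy training sequence, yet `trans[0][2]` is strictly positive after smoothing, so decoding never has to take the logarithm of zero.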
The same approach was also applied to infer the sea-surface state time series when observations were missing because of cloud cover.
Figure 1. An illustration of the model components.