State-aware workflow for tuning a climate model
Here we focus on the workflow software that underpinned the research, after first summarising the work and its methodology, both of which are described in full in Tett et al. (in press).
We tuned four model parameters that previous research had shown to strongly affect climate sensitivity:
1. Entcoef: controls the rate at which bulk air is entrained into the convective plumes in the model's convection scheme.
2. Vf1: controls the speed of falling ice.
3. Ct: defines the rate at which cloud droplets turn into rain.
4. Rhcrit: defines the critical value of relative humidity, at each model level, for clouds to form.
The workflow was cyclical: a set of runs, each with different parameter values, was performed, and the outcome of each set defined the next set of runs. Processing stopped when the goodness of fit was no longer improving, or when the simulated radiation was within a defined tolerance of the observed radiation target.
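The two stopping conditions can be sketched as a small predicate. This is a minimal illustration, not the code used in the study: it assumes the goodness of fit is expressed as a misfit for which lower is better, and the names `should_stop`, `tolerance` and `min_improvement` are hypothetical.

```python
from typing import Sequence


def should_stop(misfits: Sequence[float], tolerance: float,
                min_improvement: float = 0.0) -> bool:
    """Decide whether to end an exploration.

    misfits: misfit of the best run after each cycle, oldest first
             (assumed convention: lower is better).
    """
    if not misfits:
        return False
    latest = misfits[-1]
    # Condition 1: close enough to the observed target.
    if latest <= tolerance:
        return True
    # Condition 2: the latest cycle failed to improve on the previous one.
    if len(misfits) >= 2 and misfits[-2] - latest <= min_improvement:
        return True
    return False
```

Keeping the decision in one function means the same test can be applied after every set of runs, whichever kind of set has just finished.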
The inherent noisiness of climate data precludes some optimisation methods. We used an approximate-derivative method in which each cycle began with 5 runs: a “base” run with given parameter values, and four runs, each with one parameter perturbed by a fixed amount. From these 5 runs, regularised approximate derivatives were computed, and a Gauss-Newton line-search optimisation method used them to compute an improvement to the current best parameter values. The line search used the 5 runs to define a vector starting from the parameter values of the current base run. Each parameter has an expert-defined range of allowed values; if the vector crossed these bounds, it was shortened to give the feasible vector. We chose three points on that vector: at 100%, 90% and 30% of its length. One model run was performed for each of these three parameter sets; the results were compared with the observations, and the best of the three runs became the base for the next set of 5 runs. When no improvement was found, the exploration of the parameter space stopped.
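The derivative estimate and the bounded line search can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the model output is reduced to a single number per run, the derivatives are plain one-sided differences with no regularisation, and the Gauss-Newton solve that turns derivatives into a search vector is omitted, so `step` is taken as given. All function and variable names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Params = Dict[str, float]
Bounds = Dict[str, Tuple[float, float]]


def approximate_derivatives(base_output: float,
                            perturbed_outputs: Dict[str, float],
                            step_sizes: Params) -> Params:
    """One-sided differences from a base run and one perturbed run per parameter."""
    return {name: (perturbed_outputs[name] - base_output) / step_sizes[name]
            for name in step_sizes}


def feasible_fraction(base: Params, step: Params, bounds: Bounds) -> float:
    """Largest fraction in [0, 1] of `step` that keeps every parameter in range."""
    fraction = 1.0
    for name, delta in step.items():
        low, high = bounds[name]
        if delta > 0:
            fraction = min(fraction, (high - base[name]) / delta)
        elif delta < 0:
            fraction = min(fraction, (low - base[name]) / delta)
    # Guard against a base point that already lies outside its range.
    return max(fraction, 0.0)


def line_search_points(base: Params, step: Params, bounds: Bounds,
                       fractions: Sequence[float] = (1.0, 0.9, 0.3)) -> List[Params]:
    """Parameter sets at the given fractions of the shortened (feasible) vector."""
    scale = feasible_fraction(base, step, bounds)
    return [{name: base[name] + f * scale * step[name] for name in base}
            for f in fractions]
```

For example, with a base of `{"a": 1.0, "b": 0.5}`, a step of `{"a": 2.0, "b": -0.1}` and bounds of `(0.0, 2.0)` for `a` and `(0.0, 1.0)` for `b`, the step is halved so that `a` stops at its upper bound, and the three points are placed at 100%, 90% and 30% of that shortened vector.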
A typical exploration from a defined initial point took about 36 model runs, each simulating 6.25 model years to generate data that could be compared with the observed outgoing radiation. The model we used, HadAM3, is still in widespread use, although it is no longer state of the art. Each run took about 90 minutes, using 12 MPI processes on a cluster with an InfiniBand interconnect. The researcher chooses the initial values, the step size for each parameter, and the allowed range of values for each parameter.
The software components are:
1. A text file that defines the parameter perturbations to be applied for each run.
2. A Python script that reads the file, applies these perturbations, and submits each model run to a batch job system.
3. The scripts, executables and parameter files for the unperturbed climate model.
4. An additional script that executes after each model run has finished. It checks that the run completed successfully and then invokes the following:
a) A Fortran program that extracts the globally averaged outgoing radiation, for comparison with the satellite observations.
b) A check of the state of the current set of runs. If other runs are still in progress, no further action is taken. If all runs are complete, the script goes on to execute c) and d).
c) Programs that derive the approximate derivatives (after the 5 runs) or compare the goodness of fit (after the three line-search runs), check for convergence or stagnation, and derive the next set of parameter perturbations.
d) The script that initiates further runs.
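The state check in step b) can be sketched as follows. This is a minimal illustration, not the scripts used in the study: it assumes each set of runs has its own state directory, that a finished run is recorded by a marker file, and that an exclusively created lock file ensures only one finishing run starts the next stage. The file names and function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Iterable


def mark_run_complete(state_dir: Path, run_name: str) -> None:
    """Record that one run of the current set has finished successfully."""
    state_dir.mkdir(parents=True, exist_ok=True)
    (state_dir / f"{run_name}.done").touch()


def all_runs_complete(state_dir: Path, expected_runs: Iterable[str]) -> bool:
    """True when every expected run of the set has left its marker file."""
    return all((state_dir / f"{name}.done").exists() for name in expected_runs)


def claim_next_stage(state_dir: Path) -> bool:
    """Return True for the first caller only.

    O_CREAT | O_EXCL makes creation fail if the file already exists, so two
    runs that finish at almost the same time cannot both claim the stage.
    """
    try:
        fd = os.open(state_dir / "next_stage.lock",
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def on_run_finished(state_dir: Path, run_name: str,
                    expected_runs: Iterable[str]) -> bool:
    """Post-run hook: True means this caller should launch steps c) and d)."""
    mark_run_complete(state_dir, run_name)
    if not all_runs_complete(state_dir, list(expected_runs)):
        return False  # other runs are still in progress: take no action
    return claim_next_stage(state_dir)
```

Each job calls `on_run_finished` as its last action; whichever job completes the set receives `True` and goes on to derive and submit the next set, so the decision is made by the jobs themselves from the recorded state.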
Because the scripts can determine the state of the current set of runs, no separate process is needed to orchestrate the workflow. The processing makes effective use of the batch job queue and the available capacity of our University cluster. Careful configuration of the basic model run, from which all other runs are derived, permits concurrent execution of model runs within a set, and also of model runs in multiple explorations from different starting points. Once that configuration is complete, the initial parameter values are defined and the initial script has invoked the first runs, the exploration proceeds with no intervention.
We intend to continue this work by testing the methodology with more parameters and different climate models.
Tett, S. F. B., M. J. Mineter, C. Cartis, D. J. Rowlands, P. Liu: in press: Can Top of Atmosphere Radiation Measurements Constrain Climate Predictions? Part 1: Tuning. Journal of Climate.
Tett, S. F. B., D. J. Rowlands, M. J. Mineter, C. Cartis: in press: Can Top of Atmosphere Radiation Measurements Constrain Climate Predictions? Part 2: Climate Sensitivity. Journal of Climate.