Development of these new CERES algorithms can be explained in two steps. The first step validates a machine learning (ML) algorithm called Random Forests (RF), which is used to classify CERES broadband footprint measurements into clear and cloudy scenes. The second step converts CERES TOA clear-sky and all-sky directional radiances to TOA fluxes using an ML algorithm called an artificial neural network (ANN). Random Forests use decision tree classifiers as the base learner, each of which can be represented as a flowchart-like tree structure (Breiman 2001). In the first part of the study, RF is used to classify CERES footprints into clear and cloudy scenes from CERES radiances (TOA LW and SW) and ancillary variables, without using any imager information. For this purpose, we used the CERES Single Scanner Footprint (SSF) data set for all 12 months of the years 2003-2013. Input variables are split into two groups: CERES variables and ancillary variables. The CERES variables used in the analysis are solar zenith angle (SZA), viewing zenith angle (VZA), relative azimuth angle (RAZ), CERES TOA LW and SW broadband radiances, and surface type information. The ancillary variables are LW surface emissivity, broadband surface albedo, surface skin temperature, surface wind speed, and atmospheric precipitable water. Our primary goal was to train the RF algorithm and test how well it classifies CERES radiances into clear and cloudy classes. For this purpose, training and test datasets were developed from the multi-year SSF data. A typical monthly CERES SSF dataset contains millions of CERES footprints spread over the globe, which makes it very difficult to process all at once. To create a more compact training dataset, the SSF data are stratified by the variables of interest (SZA, VZA, RAZ, TOA radiances), and the corresponding mean values are used in the analysis. Each sample in the training and test datasets is then labeled as clear or cloudy. 
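The stratification step described above can be illustrated with a minimal sketch. It assumes each footprint is a plain dictionary and uses hypothetical bin widths (the actual bins used in the study are not stated here); the field names `sza`, `vza`, `raz`, `sw_rad` and `lw_rad` are illustrative and are not SSF parameter names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical bin widths (degrees for angles, W m-2 sr-1 for radiances).
# These are assumptions for illustration, not the bins used in the study.
BIN_WIDTH = {"sza": 10.0, "vza": 10.0, "raz": 20.0, "sw_rad": 20.0, "lw_rad": 10.0}

def bin_key(fp):
    """Return the tuple of bin indices for one footprint record."""
    return tuple(int(fp[name] // width) for name, width in BIN_WIDTH.items())

def stratify(footprints):
    """Group footprints by bin and replace each group with its mean values."""
    groups = defaultdict(list)
    for fp in footprints:
        groups[bin_key(fp)].append(fp)
    compact = []
    for key, members in sorted(groups.items()):
        row = {"bin": key, "count": len(members)}
        for field in BIN_WIDTH:
            row[field] = mean(m[field] for m in members)
        compact.append(row)
    return compact

# Three made-up footprints; the first two fall into the same bin.
footprints = [
    {"sza": 31.0, "vza": 12.0, "raz": 95.0, "sw_rad": 40.0, "lw_rad": 90.0},
    {"sza": 35.0, "vza": 18.0, "raz": 85.0, "sw_rad": 44.0, "lw_rad": 92.0},
    {"sza": 33.0, "vza": 15.0, "raz": 90.0, "sw_rad": 120.0, "lw_rad": 60.0},
]
training_rows = stratify(footprints)
for row in training_rows:
    print(row)
```

Each output row stands in for all footprints in its bin, which is how a dataset of millions of footprints could be reduced to a size that is practical for training.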
Using the Random Forest algorithm and the training dataset, a forest of decision trees is built and saved. The saved forest is then used to classify the test dataset into clear and cloudy classes. The RF classification of CERES scenes into clear and cloudy classes shows very good results, with an average classification error below 5% for most surface types. This study shows that the Random Forest method can successfully classify CERES scenes as clear or cloudy without using any imager information.
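The build, save, and classify workflow can be sketched in a few lines of pure Python. This is a toy illustration of the Random Forest idea (bootstrap samples, a random feature subset at each node, majority vote), not the implementation used in the study. The synthetic footprints, their four features, and all parameter values (`n_trees`, `max_depth`, the Gaussian means) are assumptions made up for the example.

```python
import pickle
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Find the (score, feature, threshold) with the lowest weighted Gini impurity."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, rng, depth=0, max_depth=4):
    """Grow one tree; each node considers a random subset of the features."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    n_feat = len(rows[0])
    features = rng.sample(range(n_feat), max(1, int(n_feat ** 0.5)))
    split = best_split(rows, labels, features)
    if split is None:
        return majority
    _, f, t = split

    def grow(keep):
        idx = [i for i, r in enumerate(rows) if keep(r[f])]
        return build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                          rng, depth + 1, max_depth)

    return (f, t, grow(lambda v: v <= t), grow(lambda v: v > t))

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Classify one footprint by majority vote over all trees."""
    votes = Counter()
    for node in forest:
        while isinstance(node, tuple):      # internal node: (feature, threshold, left, right)
            f, t, left, right = node
            node = left if row[f] <= t else right
        votes[node] += 1
    return votes.most_common(1)[0][0]

def make_scene(rng, cloudy):
    """Synthetic footprint: (SW radiance, LW radiance, albedo, skin temperature)."""
    sw = rng.gauss(110.0, 20.0) if cloudy else rng.gauss(40.0, 10.0)
    lw = rng.gauss(65.0, 10.0) if cloudy else rng.gauss(90.0, 8.0)
    return (sw, lw, rng.uniform(0.05, 0.3), rng.gauss(290.0, 10.0))

rng = random.Random(42)
labels = ["cloudy" if i % 2 else "clear" for i in range(400)]
rows = [make_scene(rng, y == "cloudy") for y in labels]
train_rows, train_labels = rows[:300], labels[:300]
test_rows, test_labels = rows[300:], labels[300:]

# Build the forest, save it, and reload it before classifying the test set.
saved = pickle.dumps(fit_forest(train_rows, train_labels))
forest = pickle.loads(saved)
errors = sum(predict_forest(forest, r) != y for r, y in zip(test_rows, test_labels))
error_rate = errors / len(test_rows)
print(f"test classification error: {error_rate:.1%}")
```

The error printed here reflects only the made-up synthetic data and says nothing about the classification error obtained with real CERES footprints.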
Once the TOA radiances are classified into ‘clear-sky’ and ‘cloudy’ scenes, the next step in the analysis is the TOA radiance-to-flux conversion. For this purpose, a feed-forward, error back-propagation artificial neural network (ANN) is used to produce CERES angular distribution models (ADMs) for converting CERES TOA radiances to fluxes. Clear-sky and all-sky ANN-based ADMs are developed for TOA SW and LW flux retrieval. The results of the ANN-based analysis are tested by comparing TOA fluxes estimated using the ANN with those from CERES SSF products. Compared with the all-sky-only ANN approach used by Loukachine and Loeb (2003), the new combined clear-sky and all-sky ANN approach allows much better determination of the clear-sky flux, the all-sky flux, and the cloud radiative effect, which is critical for understanding the radiative effect of clouds in our climate system. Based on an analysis of ~1.5 million clear-sky test data points (over ocean), the modified ANN clear-sky approach for SW TOA flux produced lower bias values than the Loukachine and Loeb method for ~60-70% of test cases. For TOA LW, the modified ANN clear-sky method produced lower bias values for ~85-90% of test cases.
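The radiance-to-flux step can be illustrated with a small feed-forward network trained by error back-propagation. The sketch below is a toy example under stated assumptions, not the network used in the study: the architecture (one hidden tanh layer of 6 units), the learning rate, and the input scaling are arbitrary choices, and the training targets are synthetic fluxes generated from the usual ADM relation F = pi * I / R with a hypothetical anisotropic factor R that is not a CERES ADM.

```python
import math
import random

def forward(x, w1, b1, w2, b2):
    """One hidden tanh layer followed by a single linear output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2

def train(samples, n_hidden=6, epochs=200, lr=0.05, seed=0):
    """Fit the network by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    samples = list(samples)                 # copy so the caller's list is not shuffled
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            hidden, out = forward(x, w1, b1, w2, b2)
            err = out - y                   # gradient of 0.5 * err**2 w.r.t. the output
            for j, h in enumerate(hidden):
                # Back-propagate the output error through hidden unit j.
                delta = err * w2[j] * (1.0 - h * h)
                w2[j] -= lr * err * h
                b1[j] -= lr * delta
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * delta * xi
            b2 -= lr * err
    return w1, b1, w2, b2

def mse(samples, params):
    """Mean squared error of the network on a list of (input, target) pairs."""
    return sum((forward(x, *params)[1] - y) ** 2 for x, y in samples) / len(samples)

def synthetic_sample(rng):
    """Made-up LW-like case: inputs (cos SZA, cos VZA, scaled radiance), scaled flux."""
    mu0 = rng.uniform(0.3, 1.0)             # cos(SZA); has no effect on this toy target
    mu = rng.uniform(0.3, 1.0)              # cos(VZA)
    radiance = rng.uniform(60.0, 100.0)     # W m-2 sr-1
    anisotropy = 0.8 + 0.4 * mu             # hypothetical R, NOT a CERES ADM
    flux = math.pi * radiance / anisotropy  # F = pi * I / R, in W m-2
    return (mu0, mu, radiance / 100.0), flux / 300.0

rng = random.Random(1)
data = [synthetic_sample(rng) for _ in range(300)]
train_set, test_set = data[:200], data[200:]

untrained = train(train_set, epochs=0)      # same seed, so same initial weights
trained = train(train_set)
initial_mse = mse(test_set, untrained)
final_mse = mse(test_set, trained)
print(f"test MSE before training: {initial_mse:.4f}, after: {final_mse:.4f}")

# Convert one radiance to a flux estimate (undo the 300 W m-2 target scaling).
x, y = test_set[0]
print(f"ANN flux: {300.0 * forward(x, *trained)[1]:.1f} W m-2, target: {300.0 * y:.1f} W m-2")
```

In the combined approach described above, one such network would be trained on clear-sky samples and another on all-sky samples; the cloud radiative effect could then be estimated from the difference between the two retrieved fluxes.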