Evaporation duct height (EDH) is a key parameter for characterizing evaporation ducts. It is the altitude at which the modified refractivity (M) reaches its minimum in the surface layer. For evaporation ducts, the duct height and the thickness of the layer that traps electromagnetic (EM) energy are the same. Modified refractivity is a function of atmospheric pressure, temperature, and humidity; hence, profiles of pressure (P), temperature (T), and vapor pressure (e) are required to estimate the EDH. Because in situ profile observations are inherently difficult to make in the surface layer, the profile functions of bulk flux algorithms based on Monin-Obukhov Similarity Theory (MOST) are used to estimate the P, T, and e profiles in the marine atmospheric surface layer (MASL). During the CASPER field experiment, we employed the widely used Navy Atmospheric Vertical Surface Layer Model (NAVSLaM) to determine evaporation duct heights from the forecast fields generated by the Coupled Ocean-Atmosphere Mesoscale Prediction System (COAMPS) model.
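As an illustration of the definition above, the following minimal Python sketch computes M from P, T, and e and then takes the EDH as the height of the minimum of an M profile. It is not NAVSLaM and does not implement MOST. The refractivity constants are the commonly used two-term approximation (as in ITU-R P.453), which is an assumption here and is not stated in the text. The log-linear profile, its parameters (m0, delta_m, z0_m), and all function names are hypothetical and serve only to produce a profile with a known minimum.

```python
import math


def refractivity(p_hpa, t_k, e_hpa):
    """Radio refractivity N (N-units) from pressure (hPa), temperature (K),
    and vapor pressure (hPa), using the common two-term approximation."""
    return 77.6 / t_k * (p_hpa + 4810.0 * e_hpa / t_k)


def modified_refractivity(p_hpa, t_k, e_hpa, z_m):
    """Modified refractivity M = N + 0.157 z, with z in meters.
    The 0.157 term accounts for Earth curvature (1e6 / Earth radius in m)."""
    return refractivity(p_hpa, t_k, e_hpa) + 0.157 * z_m


def duct_height(heights_m, m_profile):
    """Return the height at which the M profile is smallest."""
    i_min = min(range(len(m_profile)), key=m_profile.__getitem__)
    return heights_m[i_min]


def synthetic_m_profile(z_m, m0=330.0, delta_m=10.0, z0_m=1.5e-4):
    """Hypothetical log-linear M profile (neutral-stability textbook shape).
    Its minimum lies at z = delta_m - z0_m, i.e. very close to delta_m."""
    return m0 + 0.125 * z_m - 0.125 * delta_m * math.log((z_m + z0_m) / z0_m)


# Heights from 0 m to 40 m in 0.1 m steps.
heights = [0.1 * i for i in range(401)]
profile = [synthetic_m_profile(z) for z in heights]
edh = duct_height(heights, profile)
print(f"Estimated EDH: {edh:.1f} m")
```

In practice the M profile would be built by evaluating `modified_refractivity` on the P, T, and e profiles produced by a surface-layer model; the synthetic profile stands in for that step so the minimum search can be checked against a known answer.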
This study examined the effectiveness of several machine learning methods for predicting evaporation duct height. The predictions produced by NAVSLaM were treated as ground truth, and we attempted to approximate this function with various machine learning techniques. We also investigated how restricting the number of input features affects the quality of the approximation. Ultimately, we sought the model with the fewest inputs that best approximated NAVSLaM. Our analysis also provided insight into which input features are most important for predicting EDH. To date, we have constructed models with similar predictive power using Random Forests, Deep Learning, and Gradient Boosting Machines. An initial grid search was performed over each model's hyperparameter space. The best Random Forest model performed slightly better than the other modeling techniques. A mean absolute error (MAE) of about 0.1 m was achieved on a common held-out test set. In follow-on analyses, the grid search will be expanded and additional modeling techniques, such as Support Vector Machines (SVM) and Generalized Linear Models (GLM), will be evaluated. The models examined thus far apply to a specific geographic region and a narrow time period. Follow-on analyses will cover broader geographic areas and longer time periods. We will also evaluate performance differences between global models and regionally or temporally specific models.
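The evaluation workflow described above (fit a surrogate to a reference model's output, restrict the input features, and compare MAE on a common held-out test set) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's pipeline: the target function toy_edh is a hypothetical stand-in for NAVSLaM output, the feature names and value ranges are invented, and a simple k-nearest-neighbor regressor written in plain Python replaces the Random Forest, Deep Learning, and Gradient Boosting models named in the text.

```python
import math
import random


def toy_edh(rel_hum, wind, air_sea_dt):
    """Hypothetical smooth stand-in for a physics model's output; NOT NAVSLaM."""
    return 5.0 + 0.08 * (100.0 - rel_hum) + 0.3 * wind - 0.5 * air_sea_dt


def make_dataset(n, seed=0):
    """Draw n random input rows and label each with the stand-in model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = {
            "rel_hum": rng.uniform(60.0, 100.0),
            "wind": rng.uniform(0.0, 15.0),
            "air_sea_dt": rng.uniform(-3.0, 3.0),
            "noise": rng.uniform(0.0, 1.0),  # carries no information about the target
        }
        rows.append((x, toy_edh(x["rel_hum"], x["wind"], x["air_sea_dt"])))
    return rows


def knn_predict(train, query, features, scales, k=5):
    """Average the targets of the k training rows nearest to the query,
    using only the selected features, each scaled by its training range."""
    def dist(a, b):
        return math.sqrt(sum(((a[f] - b[f]) / scales[f]) ** 2 for f in features))

    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate(train, test, features):
    """MAE on the test set for a surrogate restricted to the given features."""
    scales = {
        f: max(x[f] for x, _ in train) - min(x[f] for x, _ in train)
        for f in features
    }
    preds = [knn_predict(train, x, features, scales) for x, _ in test]
    return mean_absolute_error([y for _, y in test], preds)


data = make_dataset(500)
train, test = data[:400], data[400:]  # the same held-out test set for every subset

feature_subsets = [
    ("rel_hum", "wind", "air_sea_dt"),
    ("rel_hum", "wind"),
    ("noise",),
]
mae_by_subset = {subset: evaluate(train, test, subset) for subset in feature_subsets}
for subset, mae in mae_by_subset.items():
    print(subset, round(mae, 3))
```

Scoring every feature subset against the same held-out rows keeps the MAE values directly comparable, which is what allows a search for the smallest input set that still approximates the reference model well.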