691 A Machine Learning Model to Estimate Oak Pollen Concentration in Korea

Tuesday, 9 January 2018
Exhibit Hall 3 (ACC) (Austin, Texas)
Yun Am Seo, National Institute of Meteorological Sciences/Korea Meteorological Administration, Seogwipo-si, Korea, Republic of (South); and T. H. Kim, C. Cho, B. J. Kim, and K. R. Kim

Pollen is a known trigger of allergic diseases such as allergic rhinitis and allergic dermatitis. Observed pollen amounts are higher than in the past, and the number of patients with pollen allergy is also increasing. In addition, the affected age range has expanded to include young children. The risk of pollen allergy is therefore expected to increase. The Korea Meteorological Administration (KMA) has implemented a pollen allergy warning system in response to this risk. Because the daily pollen concentration is influenced by weather conditions, KMA has used a regression model to describe the relationship between observed pollen concentrations and meteorological factors. However, this model underestimates pollen concentrations and fails to predict high concentrations. It also overpredicts the length of the pollen season by about 60 days compared with observations. The reason is that the regression model does not capture the nonlinear relationship between meteorological factors and pollen concentrations, and the distribution of pollen concentrations is skewed toward low values. The purpose of this study is therefore to predict the oak pollen concentration using a deep neural network (DNN) model, a machine learning method, in order to improve on the KMA model. The bagging method was applied to the DNN model to prevent underestimation and overfitting. To evaluate the developed model, regression and support vector regression (SVR) models were built under the same conditions as the DNN model for comparison. On the test data, the mean absolute percentage error of the predicted pollen concentration for the DNN model was about half that of the other models. In addition, the DNN model performed best in predicting high concentrations. The predicted pollen season was longer than the observed season by 53, 42, and 15 days in the regression, SVR, and DNN models, respectively. Overall, the DNN model performed better than the other models in predicting both the pollen concentration and the pollen season.
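
The abstract does not give implementation details, so the following is only a minimal Python sketch of the two generic techniques it names: bagging (averaging models fitted on bootstrap resamples) and the mean absolute percentage error. The function names, the number of resamples, and the simple least-squares line used as the base learner are all assumptions made for illustration. The line is a stand-in for the DNN and is not the authors' model.

```python
import random
from statistics import mean


def mape(observed, predicted):
    """Mean absolute percentage error in percent.

    Days with an observed value of zero are skipped, because the
    percentage error is undefined there (an assumption of this sketch).
    """
    pairs = [(o, p) for o, p in zip(observed, predicted) if o != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every observation is zero")
    return 100.0 * mean(abs((o - p) / o) for o, p in pairs)


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; hypothetical stand-in base learner."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x equal): fall back to a constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bagged_fit(xs, ys, n_models=25, seed=0):
    """Fit one base learner per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Draw n indices with replacement to form one bootstrap resample.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagged_predict(models, x):
    """Bagged prediction: the average of the individual model predictions."""
    return mean(a + b * x for a, b in models)
```

With a real DNN as the base learner, `fit_line` would be replaced by a training routine, while the resampling and averaging steps would stay the same in outline.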

Acknowledgments: This research was supported by the "Research and Development for KMA Weather, Climate, and Earth system Services" of the National Institute of Meteorological Sciences (NIMS) of the Korea Meteorological Administration (KMA).
