Thursday, 1 February 2024
Hall E (The Baltimore Convention Center)
Machine learning (ML) trains a model on features in a data frame and uses the trained model to predict an outcome. One approach to introducing ML to undergraduate students is through mini projects on openly available data sets. These mini projects also help students become familiar with computer programming, big data and data science, data analysis and visualization, and prediction. In the present study, we demonstrate a mini project on tornados in the USA to describe the methodology of using machine learning to predict the Enhanced Fujita (EF) scale rating as an early warning tool. Historical tornado data for the past 50 years were collected from the NOAA website. Statistical data analysis and machine learning modeling were done using Python in Jupyter Notebook. To understand the general pattern of occurrence, frequency distributions of tornados as a function of EF scale, month, year, and beginning day of the year were plotted. The spatial distribution of tornados and their tracks was studied using the geopandas package. For the ML model, 48 features were generated, including tornado length, tornado width, damages, deaths, beginning date, and beginning time. A Random Forest model was used to predict the EF scale. Even though the accuracy of the model is low (69%), the study illustrates how to interpret the confusion matrix and model metrics in ML modeling. The work is a part of the DHS STEM Enhancement Program (DHS-SRTMSI-2022-FacultyProposa101).
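The modeling step described in the abstract (fit a Random Forest on tabular features, then read the confusion matrix and accuracy) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's notebook: the data are synthetic, the four column names (`length_mi`, `width_yd`, `begin_month`, `begin_hour`) are hypothetical stand-ins for the 48 features, and the hyperparameters are arbitrary assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-in for the tornado records (NOT the NOAA data).
# EF ratings 0..5; length and width loosely grow with the rating.
ef = rng.integers(0, 6, size=n)
frame = pd.DataFrame({
    "length_mi": rng.gamma(shape=2.0, scale=1.0 + ef),        # hypothetical feature
    "width_yd": rng.gamma(shape=2.0, scale=50.0 * (1 + ef)),  # hypothetical feature
    "begin_month": rng.integers(1, 13, size=n),               # hypothetical feature
    "begin_hour": rng.integers(0, 24, size=n),                # hypothetical feature
    "ef_scale": ef,
})

X = frame.drop(columns="ef_scale")
y = frame["ef_scale"]

# Hold out 25% of the rows, keeping the EF class proportions in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters here are assumptions, not the study's settings.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true EF ratings, columns are predicted EF ratings.
labels = list(range(6))
cm = confusion_matrix(y_test, y_pred, labels=labels)
accuracy = accuracy_score(y_test, y_pred)

# Per-class recall: correct predictions on the diagonal over each row total.
recall_per_class = cm.diagonal() / cm.sum(axis=1)

print(pd.DataFrame(cm, index=labels, columns=labels))
print(f"accuracy: {accuracy:.2f}")
print("recall per EF class:", np.round(recall_per_class, 2))
```

Overall accuracy equals the sum of the diagonal of the confusion matrix divided by the total count, so a single number such as 69% can hide large differences between classes. The off-diagonal cells show which EF ratings are mistaken for which, and the per-class recall makes that visible.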

