Monday, 28 August 2023
Boundary Waters (Hyatt Regency Minneapolis)
The raindrop size distribution (DSD) describes the number and size distribution of raindrops in a volume of air. It is key to modeling the propagation of microwave signals through the atmosphere (crucial for telecommunication and radar remote sensing), to improving microphysical schemes in numerical weather prediction models, and to understanding rain-related land surface processes (rainfall interception, soil erosion). Despite its importance, the spatial and temporal variability of the DSD remains poorly understood. This has motivated scientists around the globe to deploy DSD recording instruments, known as disdrometers, to collect DSD observations in various climatic regions. However, only a small fraction of these data is easily accessible. Data are stored in disparate formats with poor documentation, making them difficult to share, analyze, compare and reuse. Additionally, very little software for DSD processing is currently publicly available.
This study presents the DISDRODB project, which aims to create a global archive of DSD measurements and to establish a global standard for sharing DSD observations. To this end, we made an initial effort to index, collect and homogenize many publicly available DSD data sets from institutions across the globe, including NASA, NCAR, ARM, NCEP, NERC, INPE, EPFL, DELFT (...), and we currently maintain data and metadata from over 400 stations.
The DISDRODB metadata repository on GitHub (available at https://github.com/ltelab/disdrodb-data) makes it possible to track changes and to collaboratively refine station metadata following open-source best practices.
The DISDRODB archive has a decentralized structure: data contributors upload the raw station files to a data repository of their choice (e.g. Zenodo) and add the station metadata, including the URL where the raw data are hosted, to the DISDRODB metadata repository. Finally, the contributors specify the content of the raw data (i.e. the logged variables) in a reader function, based on a template, that they add to the disdrodb Python package.
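The contributor workflow described above can be illustrated with a minimal sketch. It is not the disdrodb API: the metadata field names, the column mapping, the example URL and the `read_raw_text` function are all hypothetical, chosen only to show how a station record can point to externally hosted raw data while a small reader declares which logged variables the raw files contain.

```python
import csv
import io

# Hypothetical station metadata record. The field names and the URL are
# illustrative assumptions, not the actual DISDRODB metadata schema.
station_metadata = {
    "station_name": "EXAMPLE_STATION",
    "sensor_name": "EXAMPLE_SENSOR",
    "raw_data_url": "https://example.org/raw/EXAMPLE_STATION.zip",
}

# Hypothetical mapping from the logger's column names to standardized names.
COLUMN_MAP = {
    "Time": "time",
    "RainInt": "rainfall_rate",
    "NumParticles": "number_particles",
}


def read_raw_text(raw_text, column_map=COLUMN_MAP, delimiter=";"):
    """Parse delimited raw text and rename the logged variables."""
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=delimiter)
    records = []
    for row in reader:
        # Keep only the columns declared in the mapping, under their new names.
        records.append(
            {new: row[old] for old, new in column_map.items() if old in row}
        )
    return records


# Tiny in-memory example standing in for a downloaded raw file.
raw_text = "Time;RainInt;NumParticles;Unused\n2023-08-28T00:00:00;1.2;35;x\n"
records = read_raw_text(raw_text)
```

In this sketch the values are kept as strings and undeclared columns are dropped; a real reader would also need to handle type conversion, missing values and sensor-specific quirks.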
The open-source disdrodb package (available at https://github.com/ltelab/disdrodb) provides tools for interacting with the DISDRODB metadata repository and for user-side retrieval, standardization and homogenization of disdrometer data. In the future, we plan to expand the disdrodb package to include algorithms and utilities for preprocessing, analyzing and visualizing disdrometer measurements.
The DISDRODB project allows authors and contributors to retain authorship of their raw data, while users can generate standardized products with custom settings. By adopting this framework, DISDRODB aims to promote the mobilization of data archives currently scattered across various institutions and to foster international collaborations.
The DISDRODB group is actively seeking new data/code contributors and aims to create a thriving DSD community around the project, with the ultimate goal of accelerating and advancing reproducible precipitation research.

