In this study, we ask: to improve streamflow simulation in the NWM, where does standard model calibration suffice, and where do we need process-level changes? We focus initially on processes that affect soil water movement. We use a multi-year retrospective CONUS-scale WRF-Hydro model run to identify regions of poor hydrologic performance. Based on error analysis and regional characteristics, we then identify the regions where model errors are likely due to shortcomings in surface and subsurface drainage dynamics (rather than forcings, snowpack dynamics, or other factors). We isolate representative basins in each of these regions and evaluate the impacts of standard soil parameter calibration, non-standard calibration against remote sensing products, and additional physical model complexity. Specifically, we test three levels of model improvement: (1) improved soil moisture drainage behavior through standard soil parameter calibration against streamflow, (2) incorporation of spatially variable surface detention storage calibrated against remotely sensed inundation products, and (3) more accurate deep groundwater and soil water exchanges through the use of a 2-D coupled, process-based groundwater model. We quantify how these model improvements affect streamflow prediction in the various problem regions, and we make recommendations for extrapolating these watershed-based findings to the full CONUS implementation of the NWM and to other large-scale hydrologic modeling applications.