We tested the ability of Earth system models (ESMs) to capture the ecological dynamics observed in paleoecological and historical data spanning the last millennium. Focusing on an area from the Upper Midwest to New England, we first analyzed differences in the magnitude and spatial pattern of plant functional type (PFT) distributions and ecotones, vegetation biomass, and leaf area index (LAI) between historical datasets and the large-scale ESMs from the CMIP5 and MsTMIP inter-comparison projects. The distribution of ecosystem characteristics in modeled climate space reveals widely disparate relationships between modeled climate and vegetation, which lead to large differences in long-term biosphere-atmosphere fluxes for this region. We hypothesized that much of the difference between data and models was due to the modeled response of vegetation dynamics to infrequent but extreme climate events such as drought. To test this hypothesis, we conducted a 1000-year model inter-comparison using six state-of-the-art biosphere models at sites that bridged regional temperature and precipitation gradients. Model simulations revealed that both the interaction between climate and vegetation and the representation of ecosystem dynamics within models were important controls on biosphere-atmosphere exchange.