Through the Eyes of the Experts: A Taxonomy of Climate Model Uncertainty with Dr. Flavio Lehner
Satellite images of massive fires in Québec, Canada (Lat: 53.33, Lng: -76.11) - 28 June 2023, by Pierre Markuse via Wikimedia Commons (CC BY 2.0)
By Dr. Toby Ault, Associate Professor in Department of Earth and Atmospheric Sciences at Cornell University and PRI Research Associate
February 19, 2026; Originally published December 4, 2025 on LinkedIn
Through the Eyes of the Experts
In a time when almost anyone can sound like an expert in almost anything, I thought it would be interesting to interview a few of my colleagues to find out what they believe differentiates the novice from the expert in each of their fields.
What follows is the first in a series of articles I am calling "Through the Eyes of the Experts."
Part I: Dr. Flavio Lehner
I began with Dr. Flavio Lehner, who joined Cornell University’s Department of Earth and Atmospheric Sciences in 2020 after spending more than half a decade at the National Center for Atmospheric Research (NCAR) in Boulder, Colorado, studying climate change, climate variability, and climate risk.
What I learned was this: when it comes to regional climate change, the novice sees uncertainties as an obstacle to taking action, while the expert values the information hidden within those same uncertainties and knows how to make them useful.
A Bit About the Expert
I’ve been a fan of Dr. Lehner’s work since first reading his 2016 paper, Future risk of record-breaking summer temperatures and its mitigation, which cleanly articulates how under our current warming trajectory, “nearly every summer will be warmer than the warmest during the historical period.”
Perhaps unsurprisingly, given the significance of his research and the clarity of his writing, Flavio has become well-known in the climate science community for his thoughtful work on climate change, natural variability, and uncertainty in the Community Earth System Model (CESM) Large Ensemble (see also the section below, Paper Airplanes And Supercomputers).
One of my earliest impressions of Flavio comes from a long conversation we had on the corner of Pearl Street and Broadway in Boulder about a decade ago, after a workshop on Colorado River flow in a changing climate. Even then, Flavio was asking a question that was equal parts scientific pragmatism and moral philosophy: if we cannot credibly reduce the uncertainties of our state-of-the-art climate model projections, what is it that we should communicate to stakeholders and water resource managers?
The conversation left me feeling unsettled yet cautiously optimistic that the tangled mess of historical data and climate projections could, in the hands of someone like Flavio, be organized and sorted into something genuinely useful and valuable for decision makers.
In a sense, our most recent conversations in preparation for this series are really just a continuation of that long late night in Boulder almost a decade ago.
A (Slightly) Fictionalized Example
To see Flavio’s expertise in action, I posed the following scenario to him without any advance warning or time to prepare:
Suppose you run a small company that specializes in helping industries understand their exposure to climate and extreme-weather risks in western North America. You’ve just landed your first major client, and if this project goes well, it will launch your business and keep you solvent for at least the next three to five years. But if you’re wrong, you know your career is probably over.
Your client, let’s say, is the paper company Dunder Mifflin (from the hit TV show The Office).
Given the rising prices of raw materials, Dunder Mifflin is considering making major forestry investments in British Columbia so that they can streamline and consolidate their supply chain. However, given the recent headlines about megafires, the effects of wildfire smoke on human health, the proximity of some towns in British Columbia to the holdings they are considering, and the ambiguous role of climate change, they’ve hired you to help them quantify their risks on five-, ten-, fifteen-, and twenty-five-year time horizons.
How would you approach this problem?
(Note: I also intentionally picked a region, British Columbia, and a phenomenon, wildfire, that are adjacent to Flavio's expertise but not within his core research area, because I wanted to hear his reasoning in action and in real time.)
A Taxonomy of Uncertainty
To assess the climate risk for this fictional client, Flavio stressed the importance of thinking about different sources of uncertainty as distinct from one another: they originate from different processes, they operate on different spatial and temporal scales, and they manifest in different variables. These considerations led us into a discussion of the "Taxonomy of Uncertainty."
Forced Trends
The forced response of any climate variable of interest can be best isolated by averaging ensemble members together. "Uncertainties in that forced trend," Flavio explained, "each arise from distinct sources. And they are different depending on your time horizon of interest." For example, the forced response of both temperature and precipitation is small on year-to-year timescales, but larger on multi-decadal time horizons. The same is true for the uncertainties associated with those forced trends.
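A minimal sketch of why averaging helps, using synthetic data rather than real model output. The trend, noise level, and ensemble size below are illustrative assumptions, and the function names are hypothetical:

```python
import math
import random
import statistics

def simulate_member(n_years, trend_per_year, noise_sd, rng):
    """One synthetic member: a shared linear forced trend plus member-specific noise."""
    return [trend_per_year * t + rng.gauss(0.0, noise_sd) for t in range(n_years)]

def ensemble_mean(members):
    """Average across members at each time step to estimate the forced response."""
    return [statistics.fmean(values) for values in zip(*members)]

def rmse(series, reference):
    """Root-mean-square difference between two equal-length series."""
    squared = [(s - r) ** 2 for s, r in zip(series, reference)]
    return math.sqrt(statistics.fmean(squared))

# Illustrative values only (assumed, not taken from any climate model).
N_YEARS, TREND, NOISE_SD, N_MEMBERS = 50, 0.03, 0.5, 40

rng = random.Random(0)
members = [simulate_member(N_YEARS, TREND, NOISE_SD, rng) for _ in range(N_MEMBERS)]
forced_estimate = ensemble_mean(members)

# In this toy setup the true forced response is known, so errors can be compared.
true_forced = [TREND * t for t in range(N_YEARS)]
err_single = rmse(members[0], true_forced)
err_mean = rmse(forced_estimate, true_forced)
print(f"single member error: {err_single:.3f}, ensemble mean error: {err_mean:.3f}")
```

Because the noise differs from member to member while the forced trend is shared, the noise tends to cancel in the average. With real model output the true forced response is not known, so the ensemble mean is only an estimate of it for that particular model.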
Natural Variability
Natural variability arises from the chaotic dynamical fluctuations of regional and global weather and from other phenomena like El Niño and La Niña, the Pacific Decadal Oscillation, and the modes of variability that shape the structure of the high-latitude atmosphere, especially in winter. The amplitude of natural variability, and its relative importance compared to trends, varies by region. But in general, natural variability on a year-to-year timescale is almost always large. In the paper-airplane analogy developed later in this article, natural variability is the spread of trajectories that the planes would take even if one attempted to throw them the same way each time; this is why uncertainty from natural variability is mostly considered irreducible. At the same time, no climate risk assessment is useful without a robust quantification of this natural variability.
Multi-Model Spread
While a single-model ensemble might be able to precisely determine forced trends and natural variability in that model, there is no perfect model. Therefore, an additional source of uncertainty, whether you are looking at trends forced by greenhouse gases or the natural variability inherent to the climate system, arises from differences in modeling frameworks. "You might get a different answer," Flavio said, "if you're looking at CESM versus CMIP6 or the MMLEA [Multi-Model Large Ensemble Archive]."
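One common way to put numbers on these two sources is to compare the spread within each model's ensemble to the spread between the models' ensemble means. The sketch below does this on synthetic trends. The model names, forced trends, and noise levels are hypothetical, and the simple variance split is an assumed simplification, not the method used in any specific study:

```python
import random
import statistics

def simulate_trends(model_forced_trends, noise_sd, n_members, rng):
    """Synthetic trends: each model's forced trend plus member-specific internal noise."""
    return {
        name: [forced + rng.gauss(0.0, noise_sd) for _ in range(n_members)]
        for name, forced in model_forced_trends.items()
    }

def decompose(trends_by_model):
    """Split trend variance into internal-variability and model-spread fractions."""
    within = [statistics.variance(t) for t in trends_by_model.values()]
    model_means = [statistics.fmean(t) for t in trends_by_model.values()]
    internal = statistics.fmean(within)               # average within-model variance
    model_spread = statistics.variance(model_means)   # variance between model means
    total = internal + model_spread
    return {"internal": internal / total, "model": model_spread / total}

# Hypothetical forced trends (degrees C per decade) for four made-up models.
MODEL_FORCED = {"model_A": 0.15, "model_B": 0.25, "model_C": 0.35, "model_D": 0.45}

rng = random.Random(42)
# Assumption: trends fitted over short windows are much noisier than long-window trends.
short_horizon = decompose(simulate_trends(MODEL_FORCED, 0.50, 30, rng))
long_horizon = decompose(simulate_trends(MODEL_FORCED, 0.05, 30, rng))

print("short horizon:", short_horizon)
print("long horizon: ", long_horizon)
```

In this toy setup, internal variability accounts for most of the spread on the short horizon, while differences between models account for most of it on the long horizon. Note that the variance of the model means still contains a small contribution from internal noise, which a more careful analysis would correct for.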
Downscaling
Finally, to examine climate risks in detail on the scales that most stakeholders find actionable, spatial downscaling is often required. Think of this as being analogous to taking an old, grainy digital photo from the early 2000s and improving its resolution. You need to do this if, for example, you are trying to model flood risk at the scales of farms or neighborhoods. For your risk estimates to be tailored to your specific needs, you would probably adjust the large-scale (say, 50 x 50 km), coarse-resolution climate model output to account for small-scale (e.g., 1 km) topographic variations (among other small-scale features). While necessary in many applications, downscaling also introduces yet another layer of uncertainty, depending on which specific methods are used and the details of how those methods are implemented.
Idealized relative uncertainties from different sources for an unspecified region on interannual (left) and multi-decadal (right) timescales. Both the total uncertainty and the relative importance of different components grow with timescale.
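To make the idea concrete, here is a deliberately simple sketch of one possible topographic adjustment: shifting a single coarse-cell temperature up or down according to fine-scale elevation and a fixed lapse rate. This is only one of many downscaling approaches, chosen here for illustration. The elevations and temperature are made-up numbers, and the standard-atmosphere lapse rate of 6.5 °C per km is an assumption that real applications would need to revisit:

```python
import statistics

STANDARD_LAPSE_RATE_C_PER_M = 0.0065  # assumed; observed lapse rates vary by place and season

def downscale_temperature(coarse_temp_c, coarse_elev_m, fine_elev_m,
                          lapse_rate=STANDARD_LAPSE_RATE_C_PER_M):
    """Adjust one coarse-cell temperature to each fine-scale elevation."""
    return [
        [coarse_temp_c - lapse_rate * (z - coarse_elev_m) for z in row]
        for row in fine_elev_m
    ]

# Hypothetical fine-scale elevations (meters) inside one coarse grid cell.
fine_elev = [
    [400.0, 650.0, 900.0],
    [550.0, 800.0, 1200.0],
    [700.0, 1000.0, 1500.0],
]
COARSE_TEMP_C = 15.0  # made-up coarse-cell temperature
coarse_elev = statistics.fmean(z for row in fine_elev for z in row)

downscaled = downscale_temperature(COARSE_TEMP_C, coarse_elev, fine_elev)

# A different (also hypothetical) lapse rate gives a different fine-scale answer.
alternative = downscale_temperature(COARSE_TEMP_C, coarse_elev, fine_elev, lapse_rate=0.0045)
method_spread = abs(downscaled[2][2] - alternative[2][2])
print(f"highest cell: {downscaled[2][2]:.2f} C vs {alternative[2][2]:.2f} C")
```

The gap between the two results at the highest cell is a small-scale example of the extra uncertainty that comes from the choice of method and its parameters.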
With that taxonomy in mind, Flavio turned to what he imagined would be the two primary variables of interest: temperature and precipitation. He added that if we were going to do this as a research application, we would need to talk to the stakeholders and survey the literature more carefully to determine which specific conditions are linked to wildfire or megafire risk, such as vapor pressure deficit or prolonged soil-moisture anomalies.
He also wanted me to be clear that we are not actually looking at the data. We are speculating on what would likely be relevant considerations for your startup firm and its client, Dunder Mifflin.
"First of all, on these timescales, natural variability itself is not going away," Flavio said. "It will continue to modulate these signals on annual and interannual timescales, often overshadowing the forced trend." For regions like western North America, he said, the larger wildcard may be how teleconnections such as El Niño and La Niña evolve, because even subtle shifts in those patterns can reshape the seasonal temperature and precipitation signals that matter most for fire risk.
Higher temperatures and longer, warmer growing seasons or dry seasons would also play a major role. He noted that the lengthening of the warm or dry season can matter as much as the mean warming itself, because it changes the window during which fuels can dry out enough to burn.
Linear trend calculated from the Multi-Model Large Ensemble Archive (MMLEA) using NCAR's Climate Variability Diagnostics Package (https://www.cesm.ucar.edu/community-projects/mmlea/v2)
Home for the Holidays
Flavio also pointed out that even after decades of model improvements, uncertainty in regional projections has not narrowed much. “We’ve gone from CMIP3 to CMIP6,” he said, “and the spread is about the same.” The difference is that water managers and planners now know how to live with that. They no longer wait for the models to converge, but instead design plans that are resilient across that uncertainty. “The lesson isn’t that models failed,” he said, “but that some uncertainties are irreducible,” at least on the timescales on which stakeholders need to act.
Private firms, in contrast, face different trade-offs and usually must make consequential decisions on shorter time horizons. They can’t afford to hedge indefinitely. They don't always have a grasp on the amplitude of natural variability across interannual and decadal timescales. Accordingly, it can be tempting to convince yourself that the forced trends and climate variations are smaller than they are in reality. Most years, you can get away with this, but if you play this game for a decade or so, eventually, “the house wins.” (In this case, "the house" is the climate system, including both natural variability and forced trends.)
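The arithmetic behind "the house wins" can be sketched in a few lines. The 10% annual probability is a hypothetical number, and the calculation assumes that years are independent and that the probability stays constant, which ignores both the persistence of natural variability and the forced trend:

```python
def prob_at_least_one(annual_prob, n_years):
    """Chance of one or more damaging years, assuming independent years
    with a constant annual probability (a strong simplification)."""
    return 1.0 - (1.0 - annual_prob) ** n_years

ANNUAL_PROB = 0.10  # hypothetical chance of a damaging year

# The same planning horizons as in the fictional client scenario.
for horizon in (5, 10, 15, 25):
    chance = prob_at_least_one(ANNUAL_PROB, horizon)
    print(f"{horizon:>2} years: {chance:.0%} chance of at least one damaging year")
```

Under these assumptions, an event that is unlikely in any single year becomes more likely than not within a decade (about 65% over ten years), which is why a risk that seems ignorable year to year is hard to ignore over a planning horizon.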
In reality, we all make decisions in the face of uncertainties every day. Imagine it like this: you are expecting relatives for the holidays and trying to plan when to put food in the oven. Say you live in Ithaca, and your relatives are driving in from Ohio and Boston. They both give you an estimated arrival time, but winter weather, construction, and traffic delays could all slow them down. You would not hold them to their original estimate. You would expect updates along the way and adjust your timing as new information comes in. You would estimate an “earliest” and a “latest” plausible arrival time and make your own plans accordingly (e.g., when to prep the vegetables, when to roll out the pie crust, when to set the table, and how to keep everyone entertained if they end up delayed).
The Takeaway
What we call “uncertainty” is not a nuisance to simply be minimized or ignored, but rather it is information that can be included to help manage complex systems and natural resources in the face of a changing climate. The expert knows this and uses it to help stakeholders develop actionable plans that incorporate both natural variability and climate change on the timescales that matter most to them.
Paper Airplanes And Supercomputers
Our original conversation generated much more insight than I’ve shared above, so I’ve added additional content for those who might be interested in learning more about the CESM Large Ensemble and how this experiment (and related ones) helps shape our understanding of climate models. You can think of this section like the “Director’s Cut” from the series of conversations Flavio and I had about climate model uncertainty. It provides more detail on the background, theory, and application of "large ensembles" of climate model simulations.
A paper airplane ensemble (Toby R. Ault, CC-BY-SA 4.0)
Supercomputers and Paper Airplanes
To understand why the CESM Large Ensemble experiments were some of the most important simulations done in the last decade or two of climate science, you have to understand what a climate model does — and why multiple climate-model simulations from different institutions have been used to understand climate change past, present, and future as part of the IPCC. But most importantly, we need to understand and identify the value of a large ensemble of climate-model simulations.
At its core, a climate model is a system that uses the laws of physics to simulate the flow of energy and mass through the entire climate system. To do this, climate models use a similar set of physical equations to numerical weather-forecasting models, except that they do it for the entire globe, throughout the atmosphere, and to the depths of the oceans. They also include land-surface components to account for the fluxes of moisture, energy, carbon, and other nutrients between the land and the atmosphere, as well as the rivers, lakes, and the ocean. And they include representations of the cryosphere, the stratosphere, and the coupled fluxes between all of these systems.
They are enormously complex pieces of code that have been developed, vetted, validated, and tested for their physical realism — and especially for their definitive conservation of mass and energy. They are not based on machine learning or AI, although recent efforts are underway to include new tools to do just that. Historically, there were only a few dozen places around the world that could develop and run climate models because of their complexity, the human expertise required to build, validate, and test them, and also because of the computational needs required to run simulations for at least a century in duration.
When Flavio joined NCAR, the entire climate-modeling community there had just begun using a new and extremely powerful supercomputer that made it possible to run not just one, but many climate-model simulations all at once.
Enter: the CESM Large Ensemble.
The members of the CESM Large Ensemble are all simulations that begin from slightly different initial conditions but are identically forced by 20th- and 21st-century boundary conditions. Think of forcing like this: suppose you and a group of friends hold a paper-airplane throwing contest on a windy day, and you want to understand how much of the variation in their flight paths is due to the wind and how much would occur naturally. The wind is like the forcing — its magnitude and direction will shape the path of most or all of the airplanes. One approach is to have each friend throw one or two planes and then average the results to estimate the wind’s effect, but differences in how each friend builds or throws their plane can still muddy the picture. A large ensemble is like asking one skilled paper-airplane maker to build many identical planes and throw them from the same spot with the same force. The resulting spread of flight paths isolates both the influence of the wind and the natural range of outcomes for that single, consistent airplane maker.
Side view of the paths taken by multiple paper airplanes made and thrown by the same paper airplane maker. (Toby R. Ault, CC-BY-SA 4.0)
In this thought experiment, the multi-model ensemble is analogous to the original contest, with many different paper-airplane makers of different heights, strengths, styles, and skills. The wind is the forcing, and the flight paths of the individual airplanes are the natural variations combined with the effects of the forcing. The experiment with just the one skilled paper-airplane maker making an exact copy of the same airplane each time and flying it in the wind is like the CESM Large Ensemble.
Flight paths of paper airplanes in "unforced" (calm) and "forced" (windy) conditions. In both cases, the paths of the individual airplanes spread out, though in the forced case, their paths are all affected by the wind. (Toby R. Ault, CC-BY-SA 4.0)
Sing, Choirs of Models!
The CESM Large Ensemble was a breakthrough because it gave researchers an entirely new tool for investigating climate change possibilities. Before this experiment, differences between models made it hard to separate structural differences from internal variability. The CESM Large Ensemble changed that. By running dozens of simulations with slightly different initial conditions but identical historical and future forcings, it allowed scientists to quantify, with much greater precision, two things at once: the model’s sensitivity (for example, how much warming it produces per doubling of CO₂) and the strength, frequency, and persistence of its natural variations in every variable and every region of the world. The practical breakthrough from the CESM-LE was that it allowed us to determine the degree to which impactful climate events and trends were forced or just bad luck. It turned out that such probabilistic assessments were exactly what industries focused on climate risk needed. The success of CESM-LE motivated other modeling centers to follow suit, and today we have a growing collection of multi-model large ensembles that extend these insights across many independent frameworks.
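A rough sketch of what a "forced or just bad luck" calculation might look like, again with synthetic numbers. The ensemble trends and the "observed" trend are made up, and the function name and returned fields are hypothetical:

```python
import random
import statistics

def attribute_trend(observed, member_trends):
    """Place an observed trend in the context of a large ensemble of simulated trends."""
    forced = statistics.fmean(member_trends)    # ensemble mean: estimate of the forced part
    spread = statistics.stdev(member_trends)    # member-to-member spread: internal variability
    n_as_large = sum(t >= observed for t in member_trends)
    return {
        "forced": forced,
        "residual": observed - forced,          # part not explained by the forced estimate
        "signal_to_noise": forced / spread,
        "fraction_at_least_as_large": n_as_large / len(member_trends),
    }

# Synthetic 40-member ensemble of trends (made-up units and values).
rng = random.Random(1)
member_trends = [rng.gauss(0.3, 0.2) for _ in range(40)]
OBSERVED_TREND = 0.6  # hypothetical observation

result = attribute_trend(OBSERVED_TREND, member_trends)
for key, value in result.items():
    print(f"{key}: {value:.2f}")
```

The fraction of members with a trend at least as large as the observed one gives a probabilistic statement of how unusual the observation is within that model's world. It says nothing about whether the model itself is right, which is where multi-model large ensembles come in.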
The CESM Large Ensemble and subsequent related efforts did more than improve our knowledge of climate sensitivity in those models and the importance of natural variability generated by the chaotic dynamical systems they simulate. They allowed researchers like Flavio (and specifically researchers who were in fact Flavio) to quantify and characterize uncertainties stemming from different sources. Flavio applied exactly that kind of logic when I posed to him the fictional example of a consulting company with their first major client, Dunder Mifflin, who was trying to understand wildfire risks in British Columbia on decadal to multi-decadal time horizons.
See Other Posts in This Series
Through the Eyes of the Experts: Precipitation in a Changing Climate with Dr. Angie Pendergrass
Through the Eyes of the Experts: Earth's Energy Imbalance with Dr. Daniele Visioni
Read More
Below is a curated set of references that Flavio and I determined would present a comprehensive overview of (1) the conceptual frameworks discussed here, (2) technical and methodological details, and (3) practical real-world applications.
1. Conceptual Framework
Deser, C., Lehner, F., Rodgers, K. B., Ault, T., Delworth, T., DiNezio, P., Fiore, A., et al. (2020). Insights from Earth system model initial-condition large ensembles and future prospects. Nature Climate Change, 10(4), 277–286. https://doi.org/10.1038/s41558-020-0731-2
Lehner, F., & Deser, C. (2023). Origin, importance, and predictive limits of internal climate variability. Environmental Research: Climate, 2(2), 023001. https://doi.org/10.1088/2752-5295/accf30
Lehner, F., & Deser, C. (2025). The importance of internal variability for the uncertainty in climate change projections and decision-making. In Uncertainty in Climate Change Research: An Integrated Approach (pp. 177–183). Springer. https://doi.org/10.1007/978-3-031-85542-9_17
Phillips, A. S., Deser, C., & Fasullo, J. (2014). Evaluating Modes of Variability in Climate Models. Eos, 95(49), 453–455. https://doi.org/10.1002/2014EO490002
2. Technical Details and Methods
Maher, N., Phillips, A. S., Deser, C., Wills, R. C. J., Lehner, F., Fasullo, J. M., Caron, J. M., et al. (2025). The updated Multi-Model Large Ensemble Archive and the Climate Variability Diagnostics Package: new tools for the study of climate variability and change. Geoscientific Model Development, 18(18), 6341–6365. https://doi.org/10.5194/gmd-18-6341-2025
Lehner, F. (2024). Climate model large ensembles as test beds for applied compound event research. iScience. https://doi.org/10.1016/j.isci.2024.110567
Bevacqua, E., Suarez-Gutierrez, L., Jézéquel, A. et al. (2023). Advancing research on compound weather and climate events via large ensemble model simulations. Nature Communications, 14, 2145. https://doi.org/10.1038/s41467-023-37847-5
Zhuang, H., DeGaetano, A. T., & Lehner, F. (2024). Internal climate variability obscures future freezing rain changes despite global warming trend. Geophysical Research Letters, 51(23), e2024GL111741. https://doi.org/10.1029/2024GL111741
3. Practical Applications
Hartke, S. H., Newman, A. J., Gutmann, E. D., McCrary, R. R., Lybarger, N. D., & Lehner, F. (2025). Lack of clear standards and usable comparisons of downscaled climate projections pose a roadblock for U.S. climate discovery and adaptation. Environmental Research Letters, 20(5), 054067. https://doi.org/10.1088/1748-9326/adc74e
Kuo, Y. N., Lehner, F., Simpson, I. R. et al. (2025). Recent southwestern US drought exacerbated by anthropogenic aerosols and tropical ocean warming. Nature Geoscience, 18, 578–585. https://doi.org/10.1038/s41561-025-01728-x
Lehner, F., Hawkins, E., Sutton, R., Pendergrass, A. G., & Moore, F. C. (2023). New potential to reduce uncertainty in regional climate projections by combining physical and socio-economic constraints. AGU Advances, 4(4), e2023AV000887. https://doi.org/10.1029/2023AV000887
