
30% of the time, it rains every time

A recent post on the Climate Prediction Center’s (CPC) winter forecast brought out several comments wondering about the quality of these seasonal forecasts. Folks asked: how good are these forecasts? (They have skill. I’ll show you over a couple of posts.) Do we even check? (Of course!) What about the Farmer’s Almanac? (No comment.)

Grading forecasts, or in nerd-speak, verification, is incredibly important. Not to get philosophical, but, like pondering the sound a tree makes in the woods if no one is around, a forecast is not a useful forecast if it is never validated or verified. After all, anyone can guess (educated or… not) at what will happen. I could give you my thoughts on what the stock market will look like in the future but I wouldn’t recommend putting money down based on my musings. I could even tell you what I think tomorrow’s winning lottery numbers will be, but, well, you get the point. It’s not hard to make a prediction; it’s hard to get it right.

How do you know who to trust? The simple answer is to check how well forecasts have performed in the past, making sure to look at all of the forecasts rather than focusing on one extremely good or bad prognostication. In the case of seasonal forecasts, how did they do in previous years? Are they routinely on or off the mark, or is there variability in their performance?

How did we do?

Figure 1. December-February (DJF) temperature forecasts (top rows) and observations (bottom rows) from 2004-2013. Forecasts include probabilities for all three categories (above-, below-, or near-normal), but for any given location, only the highest-probability category is mapped. Forecast skill varied from year to year, reflecting the changing nature of influences on seasonal forecasts and forecasters’ ability to recognize and use those influences in the outlooks.

Figure 1 shows the December through February (DJF) temperature forecasts from 2004-2013 paired with observations from those seasons. The forecast outcome with the highest probability is colored for each location: red for well above average, blue for well below average, and dark gray for near-normal. White shows “equal chances”: places where the odds for any of the three outcomes were equal.

It is easy to see with your own eyes that the temperature forecasts’ performance varied greatly. Some years (e.g., 2013-2014) were not so good. Other years (e.g., 2005-2006) were much better. Many other years had forecasts that were correct in some areas but missed the larger picture. Precipitation (not shown here) is usually harder to predict, since one big wet weather event during the season can skew seasonal totals. But even then, some years are very good, while others… not so much.

What can we conclude, then, from our quick glance at the forecasts? That some years are better than others, but from visual inspection alone, it is hard to know how well we did overall.

While “eyeballing” the difference between a forecast and the observations is easy (and something we all do), it can also be misleading. Your eyes can fail you. We tend to see only the really bad and the really good. Our brains do a poor job of averaging over the entire domain (the United States in this case) and over all of the years’ performance.  And if you zero in on just one location—say the town where you live—you would only know how good or bad the forecasts were for one small spot, not the entire country.

Without verification metrics (statistical analyses that boil down everything your eyes are seeing into numbers that quickly put forecast performance into context), any “eyeball” verification might lead you astray. Doing that kind of complete summary is much easier with a computer and one of the many available scoring metrics (I will touch on these in a later post).
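As a preview, one widely used score for three-category outlooks like these is the Heidke skill score, which compares the number of correct category forecasts to the number you would expect from random guessing. Here is a minimal Python sketch of that calculation; the function and the toy numbers below are my own illustration, not CPC code or data.

```python
import numpy as np

def heidke_skill_score(forecast_cat, observed_cat, n_categories=3):
    """Heidke skill score for categorical forecasts.

    forecast_cat and observed_cat are sequences of category labels
    (e.g., 0 = below, 1 = near, 2 = above normal), one entry per
    location or per forecast. Chance is assumed to give each of the
    n_categories an equal shot (1/3 for a three-category outlook).

    Returns 100 for a perfect forecast, 0 for one no better than
    random guessing, and negative values for worse than chance.
    """
    forecast_cat = np.asarray(forecast_cat)
    observed_cat = np.asarray(observed_cat)

    total = forecast_cat.size                    # T: forecasts made
    hits = np.sum(forecast_cat == observed_cat)  # H: correct forecasts
    expected = total / n_categories              # E: correct by chance

    return 100.0 * (hits - expected) / (total - expected)

# Toy example with made-up categories (not real CPC data):
# 6 of 10 locations correct -> a score of 40.
print(heidke_skill_score([2, 2, 0, 1, 2, 0, 0, 1, 2, 2],
                         [2, 2, 0, 1, 2, 1, 1, 0, 0, 2]))
```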

Another issue with verifying these maps by eye is that they are winner-take-all: a map can show only one probability category for each location. The complete forecast might be a 50 percent chance of above normal, a 33 percent chance of near normal, and a 17 percent chance of below normal, but on the map it has to be simplified down to whichever category has the highest probability (see the most recent DJF forecast here and here).
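To make that simplification concrete, here is a tiny Python sketch using the hypothetical 50/33/17 forecast from the paragraph above; the category names and numbers are just those example values, not an actual outlook.

```python
import numpy as np

# Hypothetical three-category outlook for one location:
# 50% above normal, 33% near normal, 17% below normal.
categories = ["above normal", "near normal", "below normal"]
probabilities = np.array([0.50, 0.33, 0.17])

# The map keeps only the category with the highest probability...
mapped = categories[int(np.argmax(probabilities))]
print(mapped)                     # -> above normal

# ...hiding the 33% and 17% chances for the other outcomes, which
# together still add up to a 50% chance that the mapped category
# does not happen.
print(1.0 - probabilities.max())  # -> 0.5
```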

Here’s an example: Figure 2 looks at a different representation of last year’s winter temperature forecast. It shows the probability for above-average temperatures (upper left), for below-average (upper right), and a combination plot (bottom) showing the highest probability for any category. All of a sudden what seemed like a straightforward forecast of above (reds) and below (blues) (last image in figure 1) shows a lot more nuance. Even with a forecast of 40-50% chance of above-average temperatures in Texas last winter (upper left), there was still a 20-25% chance for below-average temperatures (upper right).

This nuance is missed in the eyeball verification in figure 1, which only uses blunt categories. It is important to see seasonal forecasts through this lens: every point does not have just one forecast, but probabilities for all three categories (above-, below-, and near-normal).

 

So… we forecast a chance of both above- and below-average temperatures. Does this mean that we can never be wrong? Not exactly; it also means we can never be right, either. Wait… what? The truth is that seasonal forecasts cannot be verified using a single year alone. Instead of looking at only one forecast, you need to look at many years; that is how forecast probabilities work. Only after investigating their performance over decades can we determine how skillful these forecasts are.
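One way to see what “verified over many years” means in practice is a reliability check: pool every forecast that stated, say, a 50% chance of above-normal temperatures and ask whether above normal actually happened about 50% of the time. The Python sketch below uses synthetic data (the probabilities, sample size, and outcomes are all invented) just to show the bookkeeping, not any real verification result.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic archive of forecasts (invented numbers, not CPC data):
# the issued probability of above-normal temperatures, and whether
# above normal actually occurred (1) or not (0).
n_forecasts = 5000
issued_prob = rng.choice([0.33, 0.40, 0.50, 0.60], size=n_forecasts)
occurred = (rng.random(n_forecasts) < issued_prob).astype(int)

# Reliability check: group forecasts by their stated probability and
# compare that probability to how often the event really happened.
for p in np.unique(issued_prob):
    outcomes = occurred[issued_prob == p]
    print(f"forecast {p:.0%}: observed frequency {outcomes.mean():.0%} "
          f"({outcomes.size} forecasts)")

# Because the synthetic outcomes are drawn at exactly the issued
# probabilities, this toy archive comes out nearly perfectly reliable;
# real outlooks are judged by how close they come to that ideal.
```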

What’s the point then, you might ask.  What value do these forecasts have? Let’s use an analogy. If I told you that there was a 40% chance there would be a major traffic jam on the way home today and a 25% chance there would be clear sailing, you probably would not take a different way home (unless maybe it was your turn to meet your kids’ school bus).

However, you likely will at least plan ahead of time for a different route to take should a traffic jam occur; those plans could be put to use quickly if you get more up-to-date information while you are on the road (like a long line of red brake lights ahead of you). Seasonal forecasts are your first chance to plan for upcoming conditions, and those plans become even more useful as more accurate monthly or weekly forecasts come out closer to the season in question.

Later on, I will apply some verification metrics to the plots in figure 1 so we can see just how well our eyes did at evaluating the forecasts. Using your eyes is a good first step, but proper verification relies on far more detail than our eyes and brains can accurately evaluate all at once.
