Three statisticians go hunting for rabbit. They see a rabbit. The first statistician fires and misses, her bullet striking the ground below the beast. The second statistician fires and misses, their bullet striking a branch above the lagomorph. The third statistician, a lazy frequentist, says, “We got it!”
OK, that joke was not 1/5th as funny as any of XKCD’s excellent jabs at the frequentist-Bayesian debate, but hopefully this will warm you up for a somewhat technical discussion on how to decide if observations about the weather are at all explainable with reference to climate change.
We are having this discussion here and now for two reasons. One is that Hurricane Harvey was (is) a very serious weather event in Texas and Louisiana that may have been made worse by the effects of anthropogenic global warming, and there may be another really nasty hurricane coming (Irma). The other is that Michael Mann, Elisabeth Lloyd and Naomi Oreskes have just published a paper that examines so-called frequentist vs so-called Bayesian statistical approaches to the question of attributing weather observations to climate change.
Mann, Michael, Elisabeth Lloyd, Naomi Oreskes. 2017. Assessing climate change impacts on extreme weather events: the case for an alternative (Bayesian) approach. Climatic Change (2017) 144:131-142.
First, I’ll give you the abstract of the paper, then I’ll give you my version of how these approaches are different, and why I’m sure the authors are correct.
The conventional approach to detecting and attributing climate change impacts on extreme weather events is generally based on frequentist statistical inference wherein a null hypothesis of no influence is assumed, and the alternative hypothesis of an influence is accepted only when the null hypothesis can be rejected at a sufficiently high (e.g., 95% or “p = 0.05”) level of confidence. Using a simple conceptual model for the occurrence of extreme weather events, we show that if the objective is to minimize forecast error, an alternative approach wherein likelihoods of impact are continually updated as data become available is preferable. Using a simple proof-of-concept, we show that such an approach will, under rather general assumptions, yield more accurate forecasts. We also argue that such an approach will better serve society, in providing a more effective means to alert decision-makers to potential and unfolding harms and avoid opportunity costs. In short, a Bayesian approach is preferable, both empirically and ethically.
Frequentist statistics is what you learned in your statistics class, if you are not an actual statistician. I want to know if using Magic Plant Dust on my tomatoes produces more tomatoes. So, I divide my tomato patch in half, and put a certain amount of Magic Plant Dust on one half. I then keep records of how many tomatoes, and of what mass, the plants yield. I can calculate the number of tomatoes and the mass of the tomatoes for each plant, and use the average and variation I observe for each group to get two sets of numbers. My ‘null hypothesis’ is that adding the magic dust has no effect. Therefore, the resulting tomato yield from the treated plants should be statistically the same as from the untreated plants. I can pick any of a small number of statistical tools, all of which are doing about the same thing, to come up with a test statistic and a “p-value” that allows me to make some kind of standard statement like “the treated plants produced more tomatoes” and to claim that the result is statistically significant.
If the difference, though, is very small, I might not get a good statistical result. So, maybe I do the same thing for ten years in a row. Then, I have repeated the experiment ten times, so my statistics will be more powerful and I can be more certain of an inference. Over time, I get sufficient sample sizes. Eventually I conclude that Magic Plant Dust might have a small effect on the plants, but not every year, maybe because other factors are more important, like how much water they get or the effects of tomato moth caterpillars.
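For the curious, here is a minimal sketch in Python of one such tool, a permutation test. It is only an illustration: the tomato counts are made up, and the function name `permutation_p_value` is my own invention, not something from a statistics package. The idea is that if the dust does nothing, the "treated" and "untreated" labels are interchangeable, so we shuffle the labels many times and ask how often a shuffled difference is at least as large as the one we observed.

```python
import random

def permutation_p_value(treated, control, n_perm=10_000, seed=0):
    """One-sided p-value for 'treated mean > control mean' under the null of no effect."""
    rng = random.Random(seed)
    n_treated = len(treated)
    observed = sum(treated) / n_treated - sum(control) / len(control)
    pooled = list(treated) + list(control)
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis the group labels carry no information,
        # so any reshuffling of them is as plausible as the real assignment.
        rng.shuffle(pooled)
        shuffled_treated = pooled[:n_treated]
        shuffled_control = pooled[n_treated:]
        diff = (sum(shuffled_treated) / len(shuffled_treated)
                - sum(shuffled_control) / len(shuffled_control))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical tomato counts per plant (invented for illustration).
treated = [21, 24, 19, 26, 23, 25, 22, 27]
control = [20, 22, 18, 23, 21, 24, 19, 22]
print(f"one-sided p = {permutation_p_value(treated, control):.3f}")
```

A small p-value says the observed difference would be unusual if the dust did nothing. Note that nothing in the test knows or cares what the product label says.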
In an alternative Bayesian universe, prior to collecting any data on plant growth, I do something very non-statistical. I read the product label. The label says, “This product contains no active ingredients. Will not affect tomato plants. This product is only for use as a party favor and has no purpose.”
Now, I have what a Bayesian statistician would call a “prior.” I have information that could be used, if I am clever, to produce a statistical model of the likely outcome of the planned experiments. In this case, the likely outcome is that there won’t be a change.
Part of the Bayesian approach is to employ a statistical technique based on Bayes’ theorem to combine a priori assumptions or beliefs with new observations in order to reach a conclusion.
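Here is a minimal sketch of that kind of updating in Python, using the simplest textbook case (a Beta prior on a yes/no outcome). Everything specific in it is an assumption for illustration: the yes/no question is "did the treated half out-yield the untreated half this year?", the prior of Beta(20, 20) stands in for having read the label and believing fairly firmly in "no effect", and the seven-out-of-ten result is made up.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Bayes' theorem for a Beta prior and yes/no data: just add the counts."""
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Expected probability of a 'yes' under a Beta(a, b) belief."""
    return a / (a + b)

# Hypothetical prior: the label says 'no effect', so belief is centered on 0.5.
prior = (20.0, 20.0)

# Hypothetical data: treated half won in 7 of 10 years.
posterior = update_beta(*prior, successes=7, failures=3)

print(beta_mean(*prior))      # 0.5
print(beta_mean(*posterior))  # 0.54
```

Ten years of data nudge the belief from 0.5 to 0.54 rather than flipping it. A weaker prior would move further on the same data, which is the sense in which what I knew beforehand is part of the analysis.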
In my view, the Bayesian approach is very useful in situations where we have well understood and hopefully multiple links between one or more systems and the system we are interested in. We may not know all the details that relate observed variation in one system and observed variation in another, but we know that there is a link, that it should be observable, and perhaps we know the directionality or magnitude of the effect.
The relationship between climate change and floods serves as an example. Anthropogenic climate change has resulted in warmer sea surface temperatures and warmer air. It would be very hard to make an argument from the physics of the atmosphere that this does not mean that more water vapor will be carried by the air. If there is more water vapor in the air, there is likely to be more rain. Taken as a Bayesian prior, the heating of the Earth’s surface means more of the conditions that would result in floods, even if the details of when, how much, and where are vague at this level.
A less certain but increasingly appreciated effect of climate change is its influence on the way trade winds and the jet stream move around the planet. Without going into details, climate change over the last decade or two has probably made it more likely that large storm systems stall. Storms that may have moved quickly through an area are now observed to slow down. If a storm would normally drop one inch of rain on the landscape over which it passes, but now slows down while raining at the same rate, perhaps three inches of rain will be dropped (over a shorter distance). What would have been a good watering of all the lawns is now a localized flood.
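The arithmetic behind the stalled storm is simple enough to sketch. This is a toy calculation, not a meteorological model: the function name and the particular rain rate, storm width, and speeds are all assumptions I picked so the numbers come out to one inch and three inches.

```python
def rainfall_at_point(rain_rate_in_per_hr, storm_width_miles, forward_speed_mph):
    """Total rain (inches) at a fixed spot as a storm of a given width passes over."""
    hours_overhead = storm_width_miles / forward_speed_mph
    return rain_rate_in_per_hr * hours_overhead

# Hypothetical storm: 30 miles across, raining half an inch per hour.
print(rainfall_at_point(0.5, 30, 15))  # 1.0  (moving along at 15 mph)
print(rainfall_at_point(0.5, 30, 5))   # 3.0  (same storm, slowed to 5 mph)
```

Same storm, same rain rate, one third the speed, three times the water on any given lawn.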
That is also potentially a Bayesian prior. Of special importance is that these two Bayesian priors imply change in the same direction. Since in this thought experiment we are thinking about floods, we can see that these two prior assumptions together suggest that a post-climate change weather would include more rain falling from the sky in specific areas.
There are other climate change related factors that suggest increased activity of storms. The atmosphere should have more energy, thus more energetic storms. In some places there should be more of the kind of wind patterns that spin up certain kinds of storms. It is possible that the relationship between temperature of the air at different altitudes, up through the troposphere and into the lower stratosphere, has changed so that large storms are likely to get larger than they otherwise might.
There is very little about climate change that implies the reverse. Though there may be a few subsets of storm-related weather that would be reduced with global warming, most changes are expected to result in more storminess, more storms, more severe storms, or some combination of these.
So now we have the question, has climate change caused any kind of increase in storminess?
I’d like to stipulate that there was a kind of turning point in our climate around 1979, before which we had a couple of decades of storminess being at a certain level, and after which, we have a potentially different level. This is also a turning point in measured surface heat. In, say, 1970 plus or minus a decade, it was possible to argue that global warming is likely but given the observations and data at the time, it was hard to point to much change (though we now know, looking back with better data for the previous centuries, that it was actually observable). But, in 2008, plus or minus a decade, it was possible to point to widespread if anecdotal evidence of changes in storm frequency, patterns, effects, as well as other climate change effects, not the least of which was simply heat.
I recently watched the documentary, “An Inconvenient Sequel.” This is a fairly misunderstood film. It is not really part two of Al Gore’s original “An Inconvenient Truth.” The latter was really Al Gore’s argument about climate change, essentially presented by him. “An Inconvenient Sequel” was made by independent film makers with no direct input by Gore with respect to contents and production, though it is mostly about him, him talking, him making his point, etc. But I digress. Here is the salient fact associated with these two movies. An Inconvenient Truth came out in May 2006, so it is based mainly on information available in 2005 and before. In it, there are examples of major climate change effects, including Katrina, but it seems like the total range of effects is more or less explicated almost completely. When An Inconvenient Sequel came out a few weeks ago, a solid 10+ years had passed and the list of actual climate effects noted in the movie was a sampling, not anything close to a full explication, of the things that had happened over recent years. Dozens of major flooding, storming, drying, and deadly heat events had occurred of which only a few of each were mentioned, because there was just so much stuff.
My point is that there is a reasonable hypothesis based on anecdotal observation (at least) that many aspects of weather in the current decade, or the last 20 years, or since 1979 as I prefer, are different in frequency and/or severity than before, because of climate change.
A frequentist approach does not care why I think a certain hypothesis is workable. I could say “I hypothesize that flies can spontaneously vanish with a half-life of 29 minutes” and I could say “I hypothesize that if a fly lays eggs on a strawberry there will later be an average of 112 maggots.” The same statistical tests will be usable, the same philosophy of statistics will be applied.
A Bayesian approach doesn’t technically care what I think either, but what I think a priori is actually relevant to the analysis. I might for example know that the average fly lays 11 percent of her body mass in one laying of eggs, and that is enough egg mass to produce about 90-130 maggots (I am totally making this up) so that observational results that are really small (like five maggots) or really large (like 1 million maggots) are very unlikely a priori, and, results between 90 and 130 are a priori very likely.
So, technically, a Bayesian approach is different because it includes something that might be called common sense, but really, is an observationally derived statistical parameter that is taken very seriously by the statistic itself. But, philosophically, it is a little like the pitcher of beer test.
I’ve mentioned this before but I’ll refresh your memory. Consider an observation that makes total sense based on reasonable prior thinking, but for which the standard frequentist approach fails to reject the null hypothesis. The hypothesis is that there are more tornadoes from, say, 1970 to the present than there were between 1950 and 1970 (the null hypothesis being that there are not). This graph suggests this is true…
… but because the techniques of observation and measuring tornado frequency have changed over time, nobody believes the graph to be good data. But, it may not be bad data. In other words, the doubts about the graph tell us nothing about whether the hypothesis is true, and the graph is still suggestive.
So, I take a half dozen meteorologists who are over 55 years old (so they’ve seen things, done things) out for a beer. The server is about to take our order, and I interrupt. I ask all the meteorologists to answer the question … using this graph and whatever else you know, are there more tornadoes in the later time interval or not? Write your answer down on this piece of paper, I say, and don’t share your results. But, when we tally them up, if and only if you all have the same exact answer (all “yes” or all “no”) then this pitcher of beer is on me.
Those are quasi-Bayesian conditions (given that these potential beer drinkers have priors in their heads already, and that the graph is suggestive if not conclusive), but more importantly, there is free beer at stake.
They will all say “yes” and there will be free beer.
OK, back to the paper.
Following the basic contrast between frequentist and Bayesian approaches, the authors produce competing models, one based on the former, the other on the latter. “In the conventional, frequentist approach to detection and attribution, we adopt a null hypothesis of an equal probability of active and inactive years … We reject it in favor of the alternative hypothesis of a bias toward more active years … only when we are able to achieve rejection of H0 at a high… level of confidence.”
In the Bayesian version, a probability distribution that assumes a positive (one directional) effect on the weather is incorporated, as noted above, using Bayes’ theorem.
Both methods eventually show that there is a link between climate change and effect in this modeled scenario, but the frequentist approach is much more conservative and thus, until the process is loaded up with a lot of data, more likely to be wrong, while the Bayesian approach correctly identifies the relationship and does so more efficiently.
The authors argue that the Bayesian method is more likely to accurately detect the link between cause and effect, and this is almost certainly correct.
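To get a feel for why, here is a toy simulation in Python. To be clear, this is my own sketch and not the authors' model or code; every number in it is an assumption. Each year is "active" with a true probability of 0.65. The frequentist forecaster says 0.5 (the null) until a one-sided binomial test rejects the null at p < 0.05, and only then switches to the observed frequency. The Bayesian forecaster starts from a Beta(6, 4) prior, which leans toward more active years, and updates it every year. Both are scored by mean squared forecast error (the Brier score), where lower is better.

```python
import random
from math import comb

def binomial_p_value(k, n):
    """One-sided exact p-value: P(X >= k) when X ~ Binomial(n, 0.5)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

def simulate(p_true=0.65, years=50, trials=500, alpha=0.05,
             prior=(6.0, 4.0), seed=1):
    """Return (frequentist, Bayesian) mean squared forecast error."""
    rng = random.Random(seed)
    a, b = prior
    freq_err = 0.0
    bayes_err = 0.0
    for _ in range(trials):
        active = 0  # number of active years seen so far in this trial
        for n in range(years):
            # Both forecasts are issued before this year's outcome is known,
            # using only the n years already observed.
            if n > 0 and binomial_p_value(active, n) < alpha:
                freq_forecast = active / n   # null rejected: use observed rate
            else:
                freq_forecast = 0.5          # null not rejected: assume no bias
            bayes_forecast = (a + active) / (a + b + n)  # posterior mean
            outcome = 1 if rng.random() < p_true else 0
            freq_err += (freq_forecast - outcome) ** 2
            bayes_err += (bayes_forecast - outcome) ** 2
            active += outcome
    total = trials * years
    return freq_err / total, bayes_err / total

freq_score, bayes_score = simulate()
print(f"frequentist error: {freq_score:.4f}")
print(f"Bayesian error:    {bayes_score:.4f}")
```

The important caveat is that the prior here points in the right direction, which is exactly the situation the physics puts us in. A prior that leaned the wrong way would cost the Bayesian forecaster until the data corrected it.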
This is what this looks like: Frank Frequency, weather commenter on CNN, says, “We can’t attribute Hurricane Harvey, or really, any hurricane, to climate change until we have much more data and that may take 100 years because the average number of Atlantic hurricanes to make landfall is only about two per year.”
Barbara Bayes, weather commenter on MSNBC, says, “What we know about the physics of the atmosphere tells us to expect increased rainfall, and increased energy in storms, because of global warming, so when we see a hurricane like Harvey it is really impossible to separate out this prior knowledge when we are explaining the storm’s heavy rainfall and rapid strengthening. The fact that everywhere we can measure possible climate change effects on storms, the storms seem to be acting as expected under climate change, makes this link very likely.”
I hasten to add that this paper is not about hurricanes, or severe weather per se, but rather about which statistical philosophy is better for investigating claims linking climate change and weather. I asked the paper’s lead author, Michael Mann (author of The Madhouse Effect: How Climate Change Denial Is Threatening Our Planet, Destroying Our Politics, and Driving Us Crazy, The Hockey Stick and the Climate Wars: Dispatches from the Front Lines, and Dire Predictions, 2nd Edition: Understanding Climate Change), about Hurricane Harvey specifically. He told me, “As I’ve pointed out elsewhere, I’m not particularly fond of the standard detection & attribution approach for an event like Hurricane Harvey for a number of reasons. First of all, the question isn’t whether or not climate change made Harvey happen, but how it modified the impacts of Harvey. For one thing, climate change-related Sea Level Rise was an important factor here, increasing the storm surge by at least half a foot.” Mann recalls the approach taken by climate scientist Kevin Trenberth, who “talks about how warmer sea surface temperatures mean more moisture in the atmosphere (about 7% per degree C) and more rainfall. That’s basic physics and thermodynamics we can be quite certain of.”
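That "about 7% per degree C" figure is easy to play with. Here is a tiny sketch in Python; the function name is mine, and treating the 7% as compounding with each additional degree is my assumption (for a degree or two of warming it makes almost no difference compared with simply multiplying).

```python
def moisture_increase(warming_c, rate_per_degree=0.07):
    """Fractional increase in atmospheric moisture for a given warming in degrees C.

    Assumes the roughly 7% per degree quoted above compounds degree by degree.
    """
    return (1 + rate_per_degree) ** warming_c - 1

for warming in (0.5, 1.0, 2.0):
    print(f"{warming} C warmer: about {moisture_increase(warming):.1%} more moisture")
```

The warming values in the loop are just examples, not measurements of any particular patch of ocean.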
The authors go a step further, in that they argue that there is an ethical consideration at hand. In a sense, an observer or commenter can decide to become a frequentist, and even one with a penchant for very low p-values, with the purpose of writing off the effects of climate change. (They don’t say that but this is a clear implication, to me.) We see this all the time, and it is in fact a common theme in the nefarious politicization of the climate change crisis.
Or, an observer can choose to pay attention to the rather well developed priors, the science that provides several pathways linking climate change and severe weather or other effects, and then, using an appropriate statistical approach … the one you use when you know stuff … be more likely to make a reasonable and intelligent evaluation, and to get on to the business of finding out in more detail how, when, where, and how much each of these effects has taken hold or will take hold.
The authors state that one “… might therefore argue that scientists should err on the side of caution and take steps to ensure that we are not underestimating climate risk and/or underestimating the human component of observed changes. Yet, as several workers have shown … the opposite is the case in prevailing practice. Available evidence shows a tendency among climate scientists to underestimate key parameters of anthropogenic climate change, and thus, implicitly, to understate the risks related to that change.”
While I was in contact with Dr. Mann, I asked him another question. His group at Penn State makes an annual prediction of the Atlantic Hurricane Season, and of the several different such annual stabs at this problem, the PSU group tends to do pretty well. So, I asked him how this season seemed to be going, which partly requires reference to the Pacific weather pattern ENSO (El Nino etc). He told me:
We are ENSO neutral but have very warm conditions in the main development region of the Tropics (which is a major reason that Irma is currently intensifying so rapidly). Based on those attributes, we predicted before the start of the season (in May) that there would be between 11 and 20 storms with a best estimate of 15 named storms. We are currently near the half-way point of the Atlantic hurricane season, and with Irma have reached 9 named storms, with another potentially to form in the Gulf over the next several days. So I suspect when all is said and done, the total will be toward the upper end of our predicted range.
I should point out that Bayesian statistics are not new, just not as standard as one might expect, partly because, historically, this method has been hard to compute. So, frequency-based methods have a head start of decades, and statistical methodology tends to evolve slowly.