There’s a saying about modeling data, that “with 6 degrees of freedom you can model an elephant.”

It expresses the fact that when we fit curves to data, or smooth it in any way, if we include enough degrees of freedom in our model we can fit just about any pattern. But that doesn’t mean the fit represents what the physical signal is actually doing. Both your characterization of what the signal *has* done, and especially any forecast of what it *will* do, can be faulty. This is especially true if you don’t apply statistical significance tests to determine whether or not your model fit might be meaningful. It’s yet *more* suspect when the noise which is superimposed on the signal isn’t white noise. Hence one of the easiest, and alas most common, ways to fool yourself (and others) is to indulge in reckless *curve-fitting*.

Allow me to illustrate. Let’s do some curve-fitting to these data:

We’ll go with a well-known way to smooth these data, a *low-pass filter*. Actually this is a generic term which can refer to a host of smoothing methods, but they have in common that they strongly attenuate high-frequency signals while having little impact on low-frequency signals. It may surprise some to learn that computing a moving average is one version of a low-pass filter. Often one chooses a “cutoff frequency” below which signals are little affected, above which they are strongly attenuated. By doing so we eliminate most of the “fast” fluctuations (on the presumption that they’re mostly noise) and isolate the “slow” fluctuations (presumed to be mostly signal). Let’s start with one of the simplest versions, a “trigonometric filter”:

Clearly, that fit isn’t great. It’s not totally wrong, it is statistically significant, but mostly it shows that the signal is high late during the time span but low early. Certain features of the fit are decidedly wrong, particularly the behavior very near the end of the time series.
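As an aside on the earlier remark that a moving average is itself a low-pass filter, here is a minimal sketch in plain Python. The two sine waves, the 13-point window, and the helper names are illustrative assumptions, not the data or settings used for the figures in this post.

```python
import math

def moving_average(y, window):
    """Centered moving average with an odd window; the ends are dropped."""
    half = window // 2
    return [sum(y[i - half:i + half + 1]) / window
            for i in range(half, len(y) - half)]

def amplitude(y):
    """Half the peak-to-trough range of a series."""
    return (max(y) - min(y)) / 2

n = 240
slow = [math.sin(2 * math.pi * t / 120) for t in range(n)]  # period: 120 samples
fast = [math.sin(2 * math.pi * t / 6) for t in range(n)]    # period: 6 samples

window = 13  # hypothetical smoothing window
slow_gain = amplitude(moving_average(slow, window)) / amplitude(slow)
fast_gain = amplitude(moving_average(fast, window)) / amplitude(fast)

print(f"slow wave keeps {slow_gain:.0%} of its amplitude")
print(f"fast wave keeps {fast_gain:.0%} of its amplitude")
```

The slow oscillation passes through almost untouched while the fast one is nearly wiped out, which is all that "low-pass" means.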

We can improve things by tweaking the parameters (raising the cutoff frequency, allowing more fluctuation) and by allowing for a drift in the time series:

That looks much better, but still the fit isn’t so good near the end of the time span. Difficulty near the boundaries is, in fact, a recurrent problem with curve-fitting in general.

One of the difficulties with Fourier-based fits is that sinusoids (which are the basis functions, i.e., the curves we’re fitting to our data) have certain properties which may conflict with the real signal. In particular, sinusoids are *bounded*. In fact they’re bounded in such a way that they can only reach their peak value when the curve flattens out — so if there’s a peak near the beginning or end of the time span, the Fourier fit must either fail to model that peak, or it must flatten out at that point whether the signal flattens or not. All of which means that if we use a Fourier fit to *predict* the future progress of the data, either the fit won’t match the ending peak, or it will necessarily decline, turning back toward the mean value as we progress into the future. Whether the actual signal does or not. Whether there’s any real evidence of an impending decline or not.
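The point about forecasting can be made concrete with a small sketch. It assumes a noise-free, steadily rising series and a fit built from the first three harmonics of the time span; both choices are hypothetical, and this is not the filter used for the figures above. Because every basis function repeats after one time span, the "forecast" beyond the end is just a replay of the beginning.

```python
import math

def fourier_fit(y, n_harmonics):
    """Least-squares fit of a mean plus the first few harmonics of the span.

    For evenly spaced data and n_harmonics < len(y) / 2 the sinusoids are
    orthogonal, so each coefficient is a simple projection.
    """
    n = len(y)
    mean = sum(y) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a = 2 / n * sum(y[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        b = 2 / n * sum(y[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        coeffs.append((a, b))

    def model(t):
        return mean + sum(
            a * math.cos(2 * math.pi * k * t / n) + b * math.sin(2 * math.pi * k * t / n)
            for k, (a, b) in enumerate(coeffs, start=1)
        )

    return model

# Synthetic series that rises steadily and never flattens (illustrative only).
ramp = [0.02 * t for t in range(100)]
model = fourier_fit(ramp, n_harmonics=3)

print(f"last data value {ramp[-1]:.2f}, fitted value there {model(99):.2f}")
print(f"fit extended 10 steps past the end: {model(110):.2f}")
print(f"fitted value 10 steps after the start: {model(10):.2f}")
```

The fit undershoots the final value and then heads back down when extended, even though the input never stops rising.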

There are more sophisticated filters; let’s try a “Butterworth filter”:

That looks a bit better. It doesn’t curve back down as much at the end of the time span, but it does flatten out a little. The biggest problem is that the very last point dives back toward zero in an unrealistic way. We can also allow the Butterworth filter to include “drift” in the time series, to allow for the possibility that the time series isn’t stationary:

Note that the time series itself has been de-trended. Again, this fit isn’t bad, but it still seems to be consistently below the data at the end of the time span. Worse, it still has that unrealistic dip at the very end, which is made evident by looking at just the last 20 years (note that the flattening at the end is much reduced):

One might wonder, is there a flattening at the end or not? Of course it’s not possible to answer that question absolutely; these are real data and it’s impossible to separate the signal from the noise with certainty. We could try some other methods to investigate that question: let’s isolate the last 35 years (since 1975) and try fitting polynomials.

Of course a linear regression won’t show a flattening — straight lines don’t flatten out! This is another illustration of the fact that the kind of curves we use — the model functions we choose — can impact the result we get. With trig functions our model curves are bounded and always flatten at their extremes; with a linear fit they never change their slope. So let’s try higher-order polynomials. Here’s a quadratic fit:

That certainly doesn’t flatten out! It accelerates upward. It’s by no means certain that the actual signal has accelerated that rapidly; this is just a reflection of the fact that 2nd-degree polynomials show simple acceleration in only one direction, and the dominant acceleration in this time span is upward — but whether or not it’s a real physical signal is still an open question. Let’s allow a little more variation, with a 3rd-degree polynomial:

Now we see a very pronounced downturn at the end. We also see a pronounced dive at the beginning. It’s questionable whether either is real, especially at the beginning since the data *before* 1975 don’t support that conclusion. Of course, we don’t have data after 2008 (yet). We can see another facet of higher-order polynomial fits, that they tend to fluctuate wildly at the beginning and end of the time spans. This is even more evident with a 5th-degree fit:

Clearly the rapid declines at the beginning and end are the result of our model functions; they’re bending to match the beginning and ending points, regardless of whether it’s a meaningful trend or not.
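A rough sketch of the same experiment on made-up data: a straight-line trend plus random noise over 1975 to 2008, fitted with polynomials of degree 1, 2, 3 and 5. The trend rate, noise level, random seed, and helper names are all assumptions for illustration; the small least-squares solver is written out in plain Python only so the example runs without extra packages.

```python
import random

def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def polyfit(x, y, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    powers = [[xi ** k for k in range(degree + 1)] for xi in x]
    ata = [[sum(p[i] * p[j] for p in powers) for j in range(degree + 1)]
           for i in range(degree + 1)]
    aty = [sum(p[i] * yi for p, yi in zip(powers, y)) for i in range(degree + 1)]
    return solve(ata, aty)

def polyval(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

rng = random.Random(0)
years = list(range(1975, 2009))
# Synthetic: the underlying signal is a pure straight line, nothing else.
temps = [0.02 * (yr - 1975) + rng.gauss(0, 0.5) for yr in years]
scaled = [(yr - 1991.5) / 16.5 for yr in years]  # map years onto [-1, 1]

rss_by_degree = {}
for degree in (1, 2, 3, 5):
    coeffs = polyfit(scaled, temps, degree)
    rss = sum((polyval(coeffs, s) - t) ** 2 for s, t in zip(scaled, temps))
    rss_by_degree[degree] = rss
    beyond = polyval(coeffs, (2015 - 1991.5) / 16.5)  # extrapolate to 2015
    print(f"degree {degree}: residual sum of squares {rss:.3f}, value at 2015 {beyond:+.2f}")
```

The residual sum of squares can only go down as the degree goes up, so a better-looking fit is guaranteed whether or not the extra wiggles mean anything. The extrapolated values show how strongly the answer outside the data depends on the degree chosen.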

I haven’t yet applied my favorite filter, the lowess smooth (I’ll return to using the entire time span):

Now we see no evidence at all of a recent flattening. If we look at the residuals from this fit, we see no significant deviation in the most recent values:

This certainly makes the case for “no flattening” seem quite plausible, but as I said earlier, it’s impossible to know with noisy data, especially since we can’t see into the future. Also, if we choose a different “time scale” for the lowess smooth we *can* get flattening:

This particular fit is rather choppy, suggesting that some of the fluctuations (including the flattening at the end) may be just due to noise. Also, the smoothing time scale is less than 20 years, therefore less than the “industry standard” 30 years often applied to climate.
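For readers who want to see how the "time scale" enters, here is a stripped-down sketch of a lowess-style smooth: a weighted straight-line fit around each point, using tricube weights over the nearest fraction `frac` of the data. It omits the robustness iterations of the full method, and the synthetic series, seed, and `frac` values are assumptions rather than the settings behind the figures.

```python
import math
import random

def lowess(x, y, frac):
    """Local linear regression with tricube weights (no robustness passes)."""
    n = len(x)
    k = min(n, max(3, math.ceil(frac * n)))  # number of neighbours in each window
    fitted = []
    for x0 in x:
        h = sorted(abs(xi - x0) for xi in x)[k - 1]  # window half-width
        w = [(1 - (abs(xi - x0) / h) ** 3) ** 3 if abs(xi - x0) < h else 0.0
             for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

rng = random.Random(1)
years = list(range(1900, 2009))
# Synthetic series: flat until 1975, then a steady rise, plus noise.
series = [max(0.0, 0.02 * (yr - 1975)) + rng.gauss(0, 0.5) for yr in years]

for frac in (0.15, 0.5):  # short versus long smoothing time scale
    smooth = lowess(years, series, frac)
    change = smooth[-1] - smooth[-6]
    print(f"frac={frac}: change in the smooth over the last 5 years {change:+.3f}")
```

A short window lets the smooth chase individual years near the end of the record; a long window averages over them. Neither choice comes with an uncertainty estimate attached.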

Besides, all these ruminations about “Is it flattening or isn’t it?” are based on visual inspection of graphs from analyses with no statistics attached. We really should be applying some analysis for which we can estimate the uncertainty. One of the simplest methods is to apply linear regression to estimate the trend rate, and compute the confidence limits of our estimate. I’ve done so with these data, for time intervals starting every 5 years since 1975 and ending at the present day:

At last we have a clear statistical opinion on the question. No, there isn’t statistically valid evidence of a recent flattening of the temperature curve. In fact for the more recent time spans, the paucity of data makes the uncertainty huge — we just can’t say with confidence that the temperature trend has levelled off. We can’t say with confidence that it hasn’t, either.
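The calculation behind such a plot can be sketched as follows, on synthetic data with an assumed trend and noise level (not CET). The error bar printed here is simply two standard errors under a white-noise assumption; a proper 95% interval for the shorter spans would use a Student-t multiplier, which is larger.

```python
import math
import random

def trend_with_se(x, y):
    """Ordinary least-squares slope and its standard error (white noise assumed)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

rng = random.Random(0)
years = list(range(1975, 2009))
temps = [0.02 * (yr - 1975) + rng.gauss(0, 0.5) for yr in years]  # synthetic

for start in range(1975, 2001, 5):
    xs = years[start - 1975:]
    ys = temps[start - 1975:]
    slope, se = trend_with_se(xs, ys)
    print(f"{start}-2008: {slope:+.3f} +/- {2 * se:.3f} per year ({len(xs)} points)")
```

With fewer points and a shorter span, the denominator `sxx` shrinks rapidly, which is why the error bars balloon for the most recent start years.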

If we use monthly rather than annual averages (to get more data), the situation is more subtle. Now the fact that the noise isn’t white becomes an unavoidable complication. But that complication can be overcome, giving a result which is essentially the same: we still can’t say with confidence that the trend has flattened:
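The post does not spell out which correction was applied. One common approach, shown here purely as an assumption, treats the noise as a first-order autoregressive (AR(1)) process: estimate the lag-1 autocorrelation of the regression residuals, then widen the white-noise standard error by a factor that depends on it. The function names and the example value of `rho` are hypothetical.

```python
import math

def lag1_autocorrelation(residuals):
    """Sample lag-1 autocorrelation of a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    num = sum(dev[i] * dev[i + 1] for i in range(n - 1))
    den = sum(d * d for d in dev)
    return num / den

def se_inflation(rho):
    """Large-sample factor multiplying the white-noise standard error under AR(1)."""
    return math.sqrt((1 + rho) / (1 - rho))

def effective_sample_size(n, rho):
    """Roughly how many independent points n autocorrelated points are worth."""
    return n * (1 - rho) / (1 + rho)

rho = 0.5    # hypothetical autocorrelation of monthly residuals
months = 120
print(f"error bars widen by a factor of {se_inflation(rho):.2f}")
print(f"{months} monthly values count as about {effective_sample_size(months, rho):.0f}")
```

Under this approximation, positive autocorrelation means the extra monthly values carry less independent information than their number suggests, so going monthly does not shrink the error bars as much as one might hope.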

The point of all this is that you shouldn’t make claims about changes in trends unless you’ve applied some statistical analysis; drawing conclusions based on nothing more than applying a smoothing filter is risky business. In addition, the characteristics of the smoothing filter can affect the outcome profoundly; that’s one of the reasons that the lowess filter is so popular with statisticians, because it’s designed to be robust against such pitfalls. Even then there are always choices to be made: the characteristic time scale, or equivalently the cut-off frequency. Sometimes, a small change in parameter choices can cause a qualitative difference in the visual impression given by the smoothing.

Some of you may already have deduced that the data are annual averages of temperature anomaly for the *Central England Temperature* (CET) time series. There’s no doubt at all that it has risen over the last 30 years. Has it flattened out this decade? Nobody knows for sure. But before I go about claiming that it has, I’ll wait for some statistically sound evidence.

## 35 responses so far ↓

Kevin McKinney// May 11, 2009 at 1:34 pm |Illuminating, as always. So many ways for “cherries” to go bad. . . !

mcsutter// May 11, 2009 at 1:38 pm |Why have you not included any data points beyond the years 2000-2001?

MCS

[Response: Au contraire; I used all the data through 2008 (for annual averages) and through March 2009 (for monthly data). I didn't do regressions starting later than 2000, but as the one starting at 2000 well illustrates, the uncertainty in the slope is dramatically increasing; attempting to deduce an accurate slope from even less data is folly.]

Ray Ladbury// May 11, 2009 at 3:01 pm |Tamino,

The source of the quote at the beginning of this (excellent) piece is John von Neumann:

“With four parameters I can fit an elephant, and with five I can make him wiggle his trunk. ”

I believe it comes to us via Freeman Dyson.

george// May 11, 2009 at 3:22 pm |Great post. I love the title.

Keeping with the road theme, I’d like to suggest a sign for WUWT:

” Caution! Slow! Men at work”

[Response: Perhaps we should remove the 2nd exclamation point.]

george// May 11, 2009 at 4:01 pm |I actually considered putting in a comma. :)

Mental processes are one thing but, then again, a couple visits to WUWT separated by just a couple weeks shows that those guys crank their “analysis” out like it was some sort of hot dog eating contest (with about as much thought involved)

There is nothing at all “slow” about the way they work.

Returning to the road theme: If WUWT built a bridge over the Columbia River gorge, would you drive on it?

I’d detour around the entire United States before I would.

dhogaza// May 11, 2009 at 6:46 pm |Hmmm … maybe one of his ancestors was involved in the design of Galloping Gertie (The Tacoma Narrows Bridge)

Ian// May 11, 2009 at 8:09 pm |This post, and your replies to J Bob on RealClimate’s “Monckton’s Deliberate Manipulation” thread, also make good reminders of a general point: finding a significant fit/difference/etc. is not the same as representing reality (or even the full series that you’re modeling). Significance testing often beats eyeballing the data, but J Bob (for example) seems blind to the assumptions that his model choice entails – Type I/II errors are not his only possible mistakes.

The sinusoidal model example is a good one – it’s a choice that heavily constrains the model at the end of the series, so it’s wrong to model the series, find significance, and then pretend that the modeled end of the series shows you reality.

Dave A// May 11, 2009 at 10:08 pm |Tamino,

Why don’t you go argue some statistics at CA. What’s holding you back?

Tim// May 11, 2009 at 10:53 pm |Fantastic post as always, and slightly less tech.

I shall cross ref for my fellow snowboarders and skiers.

tm

Ray Ladbury// May 11, 2009 at 11:53 pm |Dave A.,

What possible benefit could Tamino or anyone else in touch with reality derive from a visit to CA?

Dan Satterfield// May 12, 2009 at 6:20 am |That’s twice in a row you have taught me something very interesting in statistics- Thanks much. (You make stats much more interesting than my instructors at Okla U. did in 1979.)

Dan

walter crain// May 12, 2009 at 4:07 pm |tamino,

thanks a bunch! that was beautiful. i LOVE the graph showing that dramatic “dip” since around 2007… man, that’s funny! (and tragic that people, “scientists” no less, talk that way)

Manu// May 12, 2009 at 9:40 pm |Off topic, but it’s really worth a look: http://www.drroyspencer.com/2009/05/global-warming-causing-carbon-dioxide-increases-a-simple-model/

The variation of CO2 with time (dCO2/dt) is “modeled” as a linear function of SST (coefficient ‘a’) and manmade contribution (coefficient ‘b’). The best fit values are a = 4.6 ppm/yr per deg. C and b=0.1.

Now look at the statement: “The best fit (shown) assumed only 10% of the atmospheric CO2 increase is due to human emissions (b=0.1), while the other 90% is simple due to changes in sea surface temperature.”

First it is obvious that b=0.1 is not 10% if ‘a’ is not 0.9 … but even if ‘a’ was 0.9, it still doesn’t make any sense. Change the unit of SST and ‘a’ changes value too; does that mean that the ratio of SST contribution to dCO2/dt changes too?? More straightforwardly, ‘a’ is in ppm/yr per deg C, ‘b’ is adimensional … comparing both does not make any sense. If a ratio of man influence is to be derived, it is from b*antropo/(dCO2/dt), not from b alone.

Curious// May 12, 2009 at 9:58 pm |Wonderful, tamino. Very enlightening, as usual .

*Regarding J. Bob’s whinges at RC, I fully support your tone towards the skeptical arrogance, tamino. Zero tolerance toward the spreading of lies, please.

Loads of thanks.

Dave A// May 12, 2009 at 10:08 pm |Ray,

Was that a rhetorical question?

dhogaza// May 12, 2009 at 11:05 pm |We know that the oceans are currently a CO2 sink (acidification has been *measured*), so how can his claim make any physical sense whatsoever?

He’s just making shit up and plugging numbers into it until he gets the answer he wants.

Deep Climate// May 12, 2009 at 11:44 pm |Great post, Tamino – I’ve been very interested in smoothing to derive “sensible” trends since you first posted on Lowess a while back.

If I compare the last two linear regression graphs (annual and monthly) from 2000, the annual data has a slope of 0.01 +/- 0.08, whereas the monthly is ~0 +/- 0.1. The spread looks to be 0.16 for 2000 annual and 0.21 or 0.22 for the monthly set.

Does this mean the monthly data set is so much noisier that the confidence interval (CI) has actually widened slightly? Would it be possible to see the monthly regression at the same scale as the annual (i.e. leaving out 2005)? Finally, are those 95% CIs or 90% (i.e. 5/95)?

Sorry for all the questions …

[Response: The confidence intervals are 95%. There are some differences that lead to the uncertainty widening. One is that for the monthly data I applied a correction for autocorrelation, while for the annual data I treated the noise as white noise. I should've corrected both for autocorrelation, which will widen the error range for the annual data. The 2nd difference is that for monthly data I used all the data including the first few months of 2009, while for annual data I only used complete annual averages (which end at the end of 2008). There's also the fact that I just pumped the data through my standard regression program. This estimates parameters (like the variance of the errors) directly from the data, which is fine *except* when the number of data gets very small. This is the case for small time spans of the annual averages (only 10 data points!) so in that case one should estimate the variance from a longer time series to get a more precise value, then apply that to the uncertainty calculation. More rigorous analysis will only widen the error range.]

Ray Ladbury// May 13, 2009 at 12:28 am |Dave A., Your assumption that statistics is something you “argue” is revealing. It shows that you aren’t interested in learning–merely entertainment. The rest of us do not feel that way.

Gavin's Pussycat// May 13, 2009 at 8:50 am |Manu, there are problems with Spencer’s approach, clever as it is. (I don’t understand the problem you have with the units; what Spencer does seems formally correct to me.)

The first and obvious problem is the use of multiple regression when two of the independent variables — human emissions and ocean temperature — contain very similar long-term trend patterns. Then, you cannot robustly separate the coefficients for both (Tamino had a recent post on this.)

Effectively what happens is, that Spencer makes an implicit assumption that the coefficient he finds between SST and CO2 for interannual variation — dominated by El Nino, see here — also applies between the long-term secular trends in both. That is a bit rash — though he would say that the assumption explaining 90% of the observed trend tends to show that he is on the right track.

I also note that the Mauna Loa data he uses apparently has its annual cycle already removed, like the SST anomalies. Six months delay makes me very, very suspicious if we are seeing “leakage” of the annual cycle in both, with them being in antiphase.

But the fatal problem I see has to do with simple bookkeeping. He proposes a mechanism in which the heating ocean gradually releases CO2, while the CO2 that we humans release — around twice as much — gets absorbed by some unspecified natural mechanism.

But what is this mechanism? Absorption into the land biosphere? But then, oxygen would be released by photosynthesis, which we do not see (see IPCC WG1 Fig 2.3, and text on same page). (Spencer mentions the C13 isotope argument and that he doesn’t believe in it. Hmmm.)

Absorption into the ocean? But then we have the spectacle of two opposing CO2 streams, a natural one coming out of the warming ocean, and an anthropogenically caused one, twice as large, going in, and never the two bumping into each other. How do the CO2 molecules know if they’re natural or anthropogenic? Hmmm…

And then he asks “What could be causing long-term warming of the oceans?”. And proceeds to present his pet theory. But hey, we *know* what causes it… You would have to unexplain that well-understood mechanism first ;-)

Barton Paul Levenson// May 13, 2009 at 12:25 pm |For the CO2-SST relation, doesn’t Spencer simply have the causality reversed? It’s like doing a regression that shows that fires in homes are caused by “There’s-a-fire” calls to 911.

Manu// May 13, 2009 at 1:57 pm |Gavin’s Pussycat: “Manu, there are problems with Spencer’s approach, clever as it is. (I don’t understand the problem you have with the units; what Spencer does seems formally correct to me.)”

Could you please explain why b = 0.1 means “only 10% of the atmospheric CO2 increase is due to human emissions”.

Manu// May 13, 2009 at 2:52 pm |dhogaza: “We know that the oceans are currently a CO2 sink (acidification has been *measured*), so how can his claim make any physical sense whatsoever?”

It does make physical sense. The sink is variable. Now if you allow the absorption of CO2 by the oceans to vary, you allow them to modulate the CO2 amount in the atmosphere. Now if you allow the absorptivity to be influenced by the temperature (which nobody would dispute, I guess), you get Spencer’s approach. The point is that his mathematics fail.

Horatio Algeranon// May 13, 2009 at 3:56 pm |RE: Spencer’s claims

Spencer is essentially arguing against the findings of CO2 isotope analysis.

Is he justified in doing so?

Horatio, who has looked into this stuff a bit here, thinks Jan Schlorer put it best in “Why does atmospheric CO2 rise? (Version 3.1, October 1996)”

In Schlorer’s words:

Gavin's Pussycat// May 13, 2009 at 6:19 pm |> Could you please explain why b = 0.1 means “only 10% of the

> atmospheric CO2 increase is due to human emissions”.

Manu, because the observable (Mauna Loa based ppms) and the regressor (human emissions ppm equivalent) are in the same unit and describe directly comparable quantities. That is why b is dimensionless.

Correction: actually b = 0.1 is 10% of what is emitted (which is what Spencer actually writes), which means that 20% of the atmospheric increase is due to human emissions… as the total long-term increase is only 50% or so of what is emitted.

In the consensus explanation, all of this 50% increase is due to human emissions, the other 50% of these emissions being absorbed into ocean and biosphere (the actual percentages may be 40/60, too lazy to look up in IPCC)

In the Spencer explanation, only 20% of this 50%, i.e., 10% of total emissions, is due to human emissions, 80% of it coming naturally from release by the warming oceans. The remaining 90% of human emissions goes “somewhere” ;-)

David B. Benson// May 13, 2009 at 9:38 pm |The “industry standard” use of 30 years for climate is related to upper ocean mixing time, methinks.

Manu// May 13, 2009 at 9:57 pm |Gavin’s Pussycat: “Correction: actually b = 0.1 is 10% of what is emitted (which is what Spencer actually writes)”

Excuse me if I misread, but I think this is actually NOT what Spencer writes. He clearly states “The best fit (shown) assumed only 10% of the atmospheric CO2 increase [NOT of what is emitted] is due to human emissions (b=0.1), while the other 90% is simple due to changes in sea surface temperature.”

Similarly, his first graph (before tweaking his relationship away from his best fit) identifies the model as “90% natural, 10% anthropo”.

http://www.drroyspencer.com/wp-content/uploads/simple-co2-model-fig06.jpg

What you’ve read in his statements is simply not what he writes. His point was to show that anthropogenic contributions to total change in CO2 with time were small, and even if his conclusion could remain the same (by luck), he’s wrong by a factor of two (or more) on his ratio (or percentage in the present case). This is because looking at ‘b’ alone does not make mathematical sense at all. Of course his approach has other fundamental flaws, but this bogus interpretation of his results alone is striking.

Douglas Watts// May 14, 2009 at 5:55 am |Excellent post, Tamino. Thank you. These take time to write and prepare, but they are worth it for dubs like me. I’m not a math person and have to read things like this over and over to get the concepts. I’ve learned a lot from your tutorials on curve fitting and the vagaries of statistical analysis and now can say that I at least am beginning to get it.

Bob Webster// May 14, 2009 at 1:10 pm |Well done. A terrific example of how easily curve fitting can be abused as a tool to illustrate just about anything using the same data set.

What puzzles me about this whole issue (stepping far back from data spanning insignificant 100-year time periods) is the lack of perspective this whole issue has with respect to earth’s long-term climate.

The real question that needs addressing is: “What is the significance of recent climate change (in terms of both temperature and rate of temperature change) when compared with past climate change before humans could possibly have any influence?”

If the easily obtainable answer to that question tells us that recent climate is unremarkably within historic climate variability over the past half billion years (which span several ice eras, including the current one), then what is all this climate change fuss about?

[edit]

[Response: Oh please. This is just a (thinly) veiled attempt to diminish the severity of the changes we'll see in the upcoming century by trying to compare the modern human dilemma to changes from half a billion years ago. I'm not buying it and I'm not allowing your further commentary, which includes pathetic claims about the lack of correlation between CO2 and climate -- that marks you not as a questioner, but an active denialist. The coup de grace is the breathtakingly stupid reference to CO2 saturation, complete with a reference to Jennifer Marohasy's blog. She's in the same league with Anthony Watts.]

Phil.// May 14, 2009 at 2:22 pm |> Manu // May 12, 2009 at 9:40 pm

> The variation of CO2 with time (dCO2/dt) is “modeled” as a linear function of SST (coefficient ‘a’) and manmade contribution (coefficient ‘b’). The best fit values are a = 4.6 ppm/yr per deg. C and b=0.1.

> Now look at the statement: “The best fit (shown) assumed only 10% of the atmospheric CO2 increase is due to human emissions (b=0.1), while the other 90% is simple due to changes in sea surface temperature.”

> First it is obvious that b=0.1 is not 10% if ‘a’ is not 0.9 … but even if ‘a’ was 0.9, it still doesn’t make any sense. Change the unit of SST and ‘a’ changes value too; does that mean that the ratio of SST contribution to dCO2/dt changes too?? More straightforwardly, ‘a’ is in ppm/yr per deg C, ‘b’ is adimensional … comparing both does not make any sense. If a ratio of man influence is to be derived, it is from b*antropo/(dCO2/dt), not from b alone.

The problem with Spencer’s ‘model’ is that he appears to have made a numerical error!

According to his graphs the present dCO2/dT ≃ 2ppm/yr

His best fit SST term is 4.6*SST; as the present SST anomaly is ~0.21 (from his graph), this gives dCO2/dT (from SST)≃ 0.97, or about half of the total. This should mean that about half should be due to Anthro.

However he says that the Anthro term is 0.1*Anthro, since Anthro ≃ 4.2 dCO2/dT (from Anthro)≃ 0.42.

Clearly this is about 20% not 10% but clearly the equation as written doesn’t give the result as given in the graph, to do so would need a value for b of ~0.25 (about 50%). It’s possible that his equation for the Anthro term uses different units than presented in his graph but even so it’s inescapable that that term must yield about half the dCO2/dT not 10%.

So Spencer needs to go back to his spreadsheet and correct his math (and correct the description in his blog).

Gator// May 14, 2009 at 7:43 pm |Spencer posits that the rate of CO2 going into the atmosphere depends on the SST. We observe the rate is going up. Presumably this means the SST is going up as well. I.e., global warming is real.

It would be interesting to integrate his “model,” which posits net CO2 leaving the ocean to go into the atmosphere, and compare that to the measurements of acidification of the oceans over the same period. It is clear that the oceans are a net sink from these measurements.

Dave A// May 14, 2009 at 10:16 pm |Ray,

I responded to you in two posts last night. Unfortunately, both were moderated out.

I don’t really understand this because the first was fairly innocuous and the second said some things about statistical methods used by well known climate scientists. If the latter was a problem why didn’t I get the usual Tamino broadside?

Paul Middents// May 15, 2009 at 1:56 am |Dave A

Maybe Tamino is trying to encourage you to take your nonsense somewhere else and discourage the rest of us from encouraging you by giving you any attention at all. Your noise level is seriously degrading the value of this blog.

Kevin McKinney// May 15, 2009 at 12:34 pm |Spencer’s analysis is unlikely to be worth anything, because it begins with a tacit assumption which is nonsense: that the CO2 we know we emit does *not* end up in the ocean or the atmosphere. So where does it go, fairyland?

Crashex// May 15, 2009 at 5:07 pm |Why do the different filters and fits applied here change the underlying data? The values for the “x” data curves are notably different for the different examples. I thought the various filters would only change the “bold” trend line plotted through the data.

Crashex// May 15, 2009 at 5:13 pm |Maybe a better way to ask that is, what do you mean by “Drift”?

[Response: The smoothing using the trigonometric and Butterworth filters was done using the package "mFilter" in R; whoever wrote that package included a "drift" function which I presume fits a straight line to model drift in the data. That drift is removed from the data itself, which causes the data to shift. I'd have programmed it differently...]