Open Mind

How Not to Analyze Data, part 4

April 7, 2008 · 26 Comments

I’ve already shown one of the reasons Anthony Watts’ and Basil Copeland’s analysis of the solar-cycle/temperature connection is not just wrong but an example of grotesque incompetence in data analysis. Let’s have a look at another reason.


They start with the HadCRUT3v global average temperature data, then apply a Hodrick-Prescott filter to smooth the data using a parameter value of 7. This effectively smooths the data on about a 6-year timescale. Then they take the “1st differences,” i.e., the difference between each year’s smoothed value and the previous year’s smoothed value, as an estimate of the warming rate for that year. One has to wonder, can they really use that method to determine the warming rate with sufficient accuracy for their analysis to be correct? What’s the probable error of such an estimate?
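For anyone who wants the two steps in symbols, here is a sketch. It assumes that the “parameter value of 7” is the usual Hodrick-Prescott smoothing parameter λ, which is my reading rather than something stated in their post. With y_t the annual temperature and τ_t the smoothed value, the filter and the 1st-difference rate estimate are:

```latex
\hat{\tau} = \arg\min_{\tau} \left[ \sum_{t=1}^{n} (y_t - \tau_t)^2
  + \lambda \sum_{t=2}^{n-1} (\tau_{t+1} - 2\tau_t + \tau_{t-1})^2 \right],
  \qquad \lambda = 7

\hat{r}_t = \hat{\tau}_t - \hat{\tau}_{t-1}
```

Both steps are linear in the data, so each estimated rate is just a weighted sum of the yearly values. Whatever noise is in the data passes straight through, reweighted, into the rate estimate.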

If we knew the true warming rate, then of course we’d know how much the estimates are in error. So let’s generate some artificial data. I’ll start with an underlying signal which follows a straight line perfectly, warming at a rate of 0.005 deg.C/yr from 1850 to 2008. Then I’ll add random noise with a standard deviation of 0.1 deg.C, about the noise value we see in actual global temperature data. Then I’ll apply a Hodrick-Prescott filter with the same parameter value they use. Here’s the result:

Keep in mind that we already know exactly what the underlying signal is. It’s a straight line, and the warming rate is the same over the entire time span: 0.005 deg.C/yr. Let’s take the 1st differences of the smoothed values as an estimate of the warming rate in deg.C/yr:

Well! That’s certainly jumping all over the place. The estimated rate varies between -0.036 and +0.044 deg.C/yr. But we know, without any doubt, that the true rate is +0.005 deg.C/yr. Exactly.
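The experiment is easy to repeat. What follows is a minimal sketch in plain Python, not the code behind the graphs in this post. `hp_filter` is a hypothetical helper that solves the standard Hodrick-Prescott equations directly, the random seed is arbitrary, and the noise is assumed to be Gaussian white noise, so the exact minimum and maximum will differ from the numbers quoted above.

```python
import random


def hp_filter(y, lam):
    """Return the Hodrick-Prescott trend of y for smoothing parameter lam.

    Solves (I + lam * K'K) trend = y, where K takes second differences.
    """
    n = len(y)
    # Build A = I + lam * K'K as a dense matrix (fine for a few hundred points).
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
    for t in range(n - 2):
        idx = (t, t + 1, t + 2)
        coef = (1.0, -2.0, 1.0)
        for a, ca in zip(idx, coef):
            for b, cb in zip(idx, coef):
                A[a][b] += lam * ca * cb
    rhs = list(y)
    # Gaussian elimination. A is symmetric positive definite with bandwidth 2,
    # so no pivoting is needed and only two rows below each pivot are touched.
    for i in range(n):
        for r in range(i + 1, min(i + 3, n)):
            f = A[r][i] / A[i][i]
            for c in range(i, min(i + 3, n)):
                A[r][c] -= f * A[i][c]
            rhs[r] -= f * rhs[i]
    # Back substitution on the remaining upper band.
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = rhs[i] - sum(A[i][c] * trend[c] for c in range(i + 1, min(i + 3, n)))
        trend[i] = s / A[i][i]
    return trend


TRUE_RATE = 0.005   # deg.C/yr, the known slope of the artificial signal
NOISE_SD = 0.1      # deg.C, assumed white Gaussian noise
LAMBDA = 7.0        # assumed to be the filter parameter used by Watts and Copeland

random.seed(0)      # arbitrary seed, only so the run is repeatable
years = list(range(1850, 2009))
signal = [TRUE_RATE * (yr - years[0]) for yr in years]
noisy = [s + random.gauss(0.0, NOISE_SD) for s in signal]

smoothed = hp_filter(noisy, LAMBDA)
# 1st differences of the smoothed series: the estimated warming rate per year.
rates = [smoothed[i] - smoothed[i - 1] for i in range(1, len(smoothed))]

print("true rate:           ", TRUE_RATE)
print("min estimated rate:  ", min(rates))
print("max estimated rate:  ", max(rates))
```

Change the seed and the individual wiggles move around, but the spread of the estimated rates stays much wider than the true rate.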

The Hodrick-Prescott filter is a good filter. The 1st differences are indeed an estimate of the warming rate. But like all estimates, there’s an error associated with those values. Watts and Copeland never even considered how big those errors might be; they just took the values they computed as correct and tried to correlate their peaks with the peaks of the solar cycle, bungling that job miserably. They’ve bungled this job too; failure even to consider the size of the likely error is yet another sign of incompetence. 1st-year statistics students get an “F” for that kind of blunder, especially when the range of the error, in this case 0.08 deg.C/yr, is fully 16 times as large as the true value.
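How big should we expect those errors to be? Because the filter is linear, each 1st difference is a weighted sum of the data, and for independent noise its standard error is the noise standard deviation times the root-sum-of-squares of the weights. The sketch below works that out for a year in the middle of the record. It is an illustration under stated assumptions: `hp_filter` is a hypothetical helper (a direct solver for the standard Hodrick-Prescott equations), the parameter 7 is taken to be λ, and the noise is assumed white with standard deviation 0.1 deg.C.

```python
def hp_filter(y, lam):
    """Return the Hodrick-Prescott trend of y for smoothing parameter lam."""
    n = len(y)
    # A = I + lam * K'K, where K takes second differences.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
    for t in range(n - 2):
        idx = (t, t + 1, t + 2)
        coef = (1.0, -2.0, 1.0)
        for a, ca in zip(idx, coef):
            for b, cb in zip(idx, coef):
                A[a][b] += lam * ca * cb
    rhs = list(y)
    # Banded Gaussian elimination (bandwidth 2, no pivoting needed).
    for i in range(n):
        for r in range(i + 1, min(i + 3, n)):
            f = A[r][i] / A[i][i]
            for c in range(i, min(i + 3, n)):
                A[r][c] -= f * A[i][c]
            rhs[r] -= f * rhs[i]
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = rhs[i] - sum(A[i][c] * trend[c] for c in range(i + 1, min(i + 3, n)))
        trend[i] = s / A[i][i]
    return trend


N_YEARS = 159       # 1850 through 2008
NOISE_SD = 0.1      # deg.C, assumed white noise
LAMBDA = 7.0        # assumed Hodrick-Prescott parameter
TRUE_RATE = 0.005   # deg.C/yr

# Filtering a unit impulse in year j shows how much year j contributes
# to every smoothed value (one column of the smoother matrix).
impulse_responses = []
for j in range(N_YEARS):
    impulse = [0.0] * N_YEARS
    impulse[j] = 1.0
    impulse_responses.append(hp_filter(impulse, LAMBDA))

# Weights that turn the raw data into the 1st difference at a mid-record year.
mid = N_YEARS // 2
weights = [resp[mid] - resp[mid - 1] for resp in impulse_responses]

# Standard error of the estimated rate for independent noise.
std_error = NOISE_SD * sum(w * w for w in weights) ** 0.5

print("standard error of estimated rate (deg.C/yr):", std_error)
print("true rate (deg.C/yr):                       ", TRUE_RATE)
```

The weights sum to zero and reproduce a straight line’s slope exactly, which is why the 1st difference is a legitimate estimate of the rate. The printed standard error is the number to set beside the true rate before reading anything into individual peaks.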

The noise level in this artificial data is comparable to the noise level in actual temperature data. The range of the 1st differences of the artificial data is comparable to the range computed by Watts and Copeland. And, yes, the errors in the estimates from the artificial data are comparable to the errors in the estimates of Watts and Copeland. Namely: a lot bigger than the true warming rate. Which blows their supposed “connection” right out of the water.

Mr. Watts, will you update your post (a highly visible update, not just a comment) to admit that it’s completely wrong? Or shall I go on?

Categories: Global Warming

26 responses so far ↓

  • TCO // April 7, 2008 at 3:04 am

    Watts is not mathematically adept or even thoughtful. And Basil likes to throw around fancy terms, but it covers him not really having deep knowledge and thoughtfulness either. I really don’t see why you’re spending all this time on him…instead of…well…what I want.

    [Response: Watts keeps saying “we need to do a better job with figure 5” but won't bring himself to admit that it's wrong. Copeland, on this very blog, stated that “I’ll look into the data plotted in Figure 5, and if there’s an error, we’ll correct it.” But it hasn't been corrected -- not even the data point that's plotted wrong.

    But don't worry, I'll be spending plenty of time on actually *useful* stuff. And it only took about 10 minutes to do this post -- but then, it only took 15 seconds to know what was wrong with his post.]

  • TCO // April 7, 2008 at 3:06 am

    (on topic post) the issues with smoothing have been discussed a lot on skeptic web sites, so it’s bizarre watching Watts and Basil sink deep into this type of analysis. I told them and meant it that if an AGWer were doing this sort of analysis, we would criticize it.

  • Eli Rabett // April 7, 2008 at 3:27 am

    For giggles, run a Fourier transform on that random data. Methinks there is a ~9-11 year cycle in there, basically imposed by filtering everything < 1/6 year⁻¹ out.

  • John Mashey // April 7, 2008 at 4:02 am

    This (and many others) are eerily reminiscent of parapsychology experiments in which masses of data were gathered and then tortured until results (sort-of) appeared.

    I.e., one person looks at a card, second person attempts to read their mind, many times.
    + Oh, results are no better than chance.
    + But, maybe there’s a lag, and the reader is one behind … or two behind … or three behind.
    + No? Well, maybe the reader is precognitive too, so they’re one ahead, two ahead….

    + No, well maybe they’re ahead sometimes and behind sometimes? Let’s try that.

    One might look at Princeton’s recently-terminated PEAR, which did large numbers of tests:
    http://www.princeton.edu/~pear , although that’s human manipulation of machines, remote perception, etc.

    You can also read about it in:
    http://skepdic.com/pear.html

    and Skeptical Inquirer has covered such things on occasion.

  • Bob North // April 7, 2008 at 4:55 am

    Isn’t the first difference a measure of the change in the rate of warming, not the warming rate itself?

    [Response: No, the 1st difference is an estimate of the change in the temperature over a unit time, i.e. the warming rate.]

  • Julian Flood // April 7, 2008 at 9:14 am

    Very informative. Would it be possible to show us a graph using .014deg/year and a smoothing such as that used by HADCRU? It might be instructive to see how closely it mimics the uncorrected* SSTs from 1910 to 2008.

    JF
    *I’m dubious about Folland and Parker’s adjustment.

  • Marko // April 7, 2008 at 2:10 pm

    Tamino - thanks for doing the auditing. Personally, I would also like to see you use your skills to audit Mann, et al. more often. That might put you in the dog house, but it would improve your street cred tremendously. Imagine a Tamino/McIntyre joint audit - that would shake the pillars!

    [Response: I've done 5 posts on PCA, 2 of them specifically on non-centered PCA, and it hasn't brought anyone any closer to agreement. Nobody can claim that work hasn't been subjected to intense scrutiny.]

  • Hansen's Bulldog // April 7, 2008 at 2:57 pm

    Perhaps I’ve had a small impact on Watts’ site, as evidenced by this comment on this site.

    Although Watts’ admission is absolutely minimal, I think it’s as much as I’m gonna get. In any case, I’m happy to move on to more interesting things.

  • climatewonk // April 7, 2008 at 3:08 pm

    “That might put you in the dog house, but it would improve your street cred tremendously. Imagine a Tamino/McIntyre joint audit - that would shake the pillars!”

    Tamino has enough “street cred” on his own merit and needs no help and would get no boost from a collaboration with McIntyre, except among denialist types. IMO, it would only give more credibility to McIntyre.

  • Hank Roberts // April 7, 2008 at 3:22 pm

    I’d suggest, when patience allows, editing the five PCA posts into one piece of writing and, if it is open to comments at all, not letting the average-to-marginal comments accumulate. A thread in which only people with something to contribute add posts would be very much welcome.

  • Lee // April 7, 2008 at 3:36 pm

    I love the last sentences of Watts’ update:

    “We are continuing to look at different methods of demonstrating a correlation. Please watch for future posts on the subject.”

    Translate the first sentence as, ‘proper analysis shows no correlation, but we’re going to keep looking until we find a way to show one, dammit!’

    On the second sentence: I wonder if that will be before or after the part three of the ‘change the baseline and the data changes’ series, where he promised over 5 weeks ago that he would address his errors?

    I would ask him directly, but I’m still blocked from his blog.

  • Hansen's Bulldog // April 7, 2008 at 4:05 pm

    Although I consider Watts hopelessly inept as an analyst, I’m not convinced that he’s necessarily deliberately deceptive. So I’ve removed that phrase “Lies, damned lies, …” from the title. Which is a bit of a pity, because it’s a clever paraphrase of Mark Twain’s comment about “Lies, damned lies, and statistics.”

    But for the record: I’m not convinced that he’s deliberately deceptive with his analyses. I expect he actually believes what he’s posted.

  • Brian D // April 7, 2008 at 5:43 pm

    HB: I’m not convinced that he’s deliberately deceptive with his analyses. I expect he actually believes what he’s posted.

    Case in point: his repeated posting of the RSS temperature graph, drawing emphasis to the giant La Niña drop in late ’07 to early ’08. If he honestly believes his analyses, then we can expect confirmation bias to rear its ugly head, meaning he may not notice the similar drops earlier in the record (e.g. the one-year drop between 1988 and 1989).

    I only have a few minutes spare time at the lab (read: this), but in between experiments I’ve been putting together a few graphs in the same style as his, truncating the record (i.e. emulating what information would have been available then) right after those drops, just to illustrate (in terms he evidently finds compelling, as the graphs are in his style) that a one-year drop tells you nothing about the overall trend. (I hate to have to resort to this, but if you can’t dazzle them with brilliance, I might baffle them with [family-friendly blog]). When I get them up, I’ll see if he lets them through in his comments.

  • Steve Bloom // April 7, 2008 at 7:02 pm

    Watts says that at one time he believed in a dominant CO2-climate connection and even engaged in promoting it, but at some point (ten to fifteen years ago IIRC) realized that the evidence points to the sun as the dominant factor. This conversion sounds a bit Lomborgesque in that there’s no evidence of the prior state or of the conversion.

    The underpinning of the sun-climate idea has changed greatly over that time as the proposal that TSI variations are straightforwardly the major direct centennial-scale climate driver has had to be abandoned in favor of mechanisms such as cosmic ray influences and enhanced climate sensitivity to small TSI changes. As the evidence piles up against those as well, rather than adjusting to it, Watts and his fellow travelers simply crank up their efforts to find the solar explanation that they seem to just know is out there. Similar to Arthur Clarke’s remark about any sufficiently advanced technology being indistinguishable from magic, at a certain point such a belief becomes indistinguishable from astrology.

  • dhogaza // April 7, 2008 at 7:58 pm

    “Watts says that at one time he believed in a dominant CO2-climate connection and even engaged in promoting it, but at some point (ten to fifteen years ago IIRC) realized that the evidence points to the sun as the dominant factor. This conversion sounds a bit Lomborgesque in that there’s no evidence of the prior state or of the conversion.”

    It’s a classic science-denialist tactic to try to get the reader to believe that the writer is fully educated in whatever scientific field they are denying.

    Thus the creationist “I used to believe in evolution until I looked at the evidence …”, “I used to believe that HIV causes AIDS until I looked at the evidence…”, “I used to believe in AGW until I looked at the evidence…”.

    Underlying this tactic is the conviction that the scientists in the field being denied are engaged in fraud, or really don’t believe in (evolution, AGW, etc) but couch their scientific output in the consensus framework because it’s the only way to get funding (if you don’t, why, you’re “Expelled!”, right?), that denialists don’t publish in the literature because the prevailing (“darwinist”, climate science, etc) hierarchy won’t publish papers exposing the truth, etc etc.

  • non27 // April 7, 2008 at 8:21 pm

    Watts forgot to mention the lowest snowcover on record for Eurasia in March and the second-lowest value for the NH. Strange.
    http://climate.rutgers.edu/snowcover/chart_anom.php?ui_set=1&ui_region=eurasia&ui_month=3

  • cthulhu // April 7, 2008 at 10:30 pm

    The comments to the posts over at Watts’ blog are retarded. Simply retarded. I would at least feign some decency but I have just dredged this post
    http://wattsupwiththat.wordpress.com/2008/04/06/co2-monthly-mean-at-mauna-loa-leveling-off/#comments

    and it is truly and utterly astounding how clueless so many of his readers are.

    And I notice he doesn’t step in to correct them. There is nothing educational about his blog, it’s anti-educational, it guides them towards incorrect conclusions and lets them wallow in them.

  • Cherry Pick // April 8, 2008 at 12:39 am

    re: non27

    But February was higher than the anomaly in both cases, as was January and December. Cherry picking all around.

  • Lee // April 8, 2008 at 1:15 am

    Anthony has gone back and systematically removed responses questioning his methods and conclusions, in some cases several weeks after first approving them and posting them. He has scrubbed his own record over there. Just one example: at one time he accused me of ‘hypocrisy,’ refused to post my response to that accusation, and then later removed the accusation without note or apology. He promises to address issues, and challenges or insults people who press him on errors, telling them to ‘wait for the followup post,’ and then never makes the followup post, sometimes going back later and removing the response pressing him on the issue.

    The man is fundamentally dishonest, even if he DOES believe what he is saying.

  • non27 // April 8, 2008 at 6:36 am

    Re “Cherry Pick”

    So what?
    I haven’t said that we must shut up about the January record snowcover; equally, there’s no reason to forget the March record.
    Furthermore, snowcover is subject to strong variability during winter and there’s no significant downward trend in January; sometimes a cold weather event favors extensive snowcover over Asia. Anyway, this can change quickly: March shows a clear long-term decline in snowcover, and continuous warm-air advection this year favored the lowest value on record even in a still rather cold world.

  • Wouter Lefebvre // April 8, 2008 at 9:54 am

    Tamino,

    Is it possible to use some of your figures in a (popular) publication? Of course, a link to your blog will be included as the source, together with the source of the respective data. It would save me some time, since I would then not need to make these graphs myself.

    Thanks in advance

    [Response: You may use the graphs, even without crediting this blog, but I think it would be necessary to credit the sources of the data.]

  • Raphael // April 8, 2008 at 5:59 pm

    Shouldn’t you now pick 14 peaks, label them 10-23 and show how nifty they look on a “figure 5” plot?

  • chriscolose // April 8, 2008 at 8:07 pm

    Thanks Tamino. It did turn out that HadCRUT was shifted up about 0.1 C (I used GISS reference period). It does turn out that GISS (green) and Had (blue) deviate a bit at the end though, so I could not replicate what you got (ignore the red line)

    http://img2.putfile.com/main/4/9816033471.jpg

  • chriscolose // April 8, 2008 at 8:10 pm

    bigger one
    http://img2.putfile.com/main/4/9816091560.jpg

  • EliRabett // April 9, 2008 at 3:39 am

    Lee, Eli is somewhat of the Groucho persuasion with respect to appearing in Anthony’s blog. If I looked it would be admitting I cared.

  • Horatio Algeranon // April 9, 2008 at 6:38 pm

    Lookin’ for Correlation in all the wrong places…
