There has been extensive coverage of our research in the National Post, a leading Canadian newspaper, over the last few days. The entire article from Natuurwetenschap & Techniek was re-printed in two installments, together with a National Post editorial, all of which are online:
Breaking the hockey stick
The lone Gaspe cedar
Let science debate begin

There have been many letters to the editors, most of which are not online. Some of the letters have been from scientists, including a professor of plant science who expressed serious concerns about the use of tree ring widths as a means of coming to confident conclusions about temperature history.

On a personal basis, this coverage has been very gratifying, since my friends and family are not academics and being covered in the National Post seems much more tangible to them than publication in Geophysical Research Letters.


I’ve set up this blog at the suggestion of John Andrews of England, a computer consultant interested in climate change. I had posted some thoughts at, but it was difficult to post up comments on that layout. So John located a more appropriate host, converted past musings to blog layout and set up the framework extremely efficiently. Thank you to John. This is my first post to this layout, so I’m learning how the format works. I’m not sure how regularly I will post; I’ll see. Ross McKitrick and I have been having a lot of hits on our respective websites and there has been a considerable amount of comment at, sci.environment, and UKWeatherworld, to name a few chatlines. So there is obviously some topical interest and some need for further explanations. I’ll try to pick up on some themes raised by posters during the next while. In particular, I’ll try to post some comments on these topics (other suggestions welcome):
- the “Hockey Team”
- do the MBH98 errors “matter”? (They do.)
- a retrospective scorecard on MM03
- PC selection rules in MBH98
- the supposed MBH99 adjustment of the bristlecone pines


New research published on MBH98

We (McIntyre and McKitrick) are profiled in the cover story of the Feb. 1, 2005 edition of Natuurwetenschap & Techniek (NWT), a prominent European science magazine (both Dutch and English versions at ). The cover story is based on two new peer-reviewed papers being published in the well-known science journals Geophysical Research Letters and Energy & Environment.

Our article “Hockey Sticks, Principal Components and Spurious Significance” has been accepted for publication in Geophysical Research Letters, copyright 2005 American Geophysical Union (doi: 2004GL012750). A pre-publication version is at Further reproduction or electronic distribution is not permitted.

Our article “The M&M Critique of the MBH98 Northern Hemisphere Climate index: Update and Implications” has been accepted for publication by Energy & Environment and is available at

Our research shows fundamental flaws in the “hockey stick graph” used by the Intergovernmental Panel on Climate Change (IPCC) to argue that the 1990s were the warmest decade of the millennium. The original hockey stick study was published by Michael Mann of the University of Virginia and his coauthors Raymond Bradley and Malcolm Hughes. The main error affects a step called principal component analysis (PCA). We showed that the PCA method as used by Mann et al. effectively mines a data set for hockey stick patterns. Even from meaningless random data (red noise), it nearly always produces a hockey stick.
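A minimal sketch of this effect follows. It assumes, for illustration only, that the problem comes from subtracting the mean of just the final segment of each series (rather than the full-period mean) before the PC calculation. The number of series, the AR1 coefficient, the segment length and the `hockey_stick_index` helper are all hypothetical choices made for the demonstration; they are not values taken from MBH98 or from our papers.

```python
import numpy as np

def ar1_matrix(n_years, n_series, rho, rng):
    """Simulate n_series independent AR(1) 'red noise' series as columns."""
    shocks = rng.standard_normal((n_years, n_series))
    x = np.zeros_like(shocks)
    x[0] = shocks[0]
    for t in range(1, n_years):
        x[t] = rho * x[t - 1] + shocks[t]
    return x

def pc1(data, center_rows):
    """PC1 time series after subtracting column means taken over center_rows."""
    centered = data - data[center_rows].mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]

def hockey_stick_index(series, n_recent):
    """|mean of last n_recent values - overall mean|, in units of overall std.

    The absolute value is used because the sign of a PC is arbitrary.
    """
    return abs(series[-n_recent:].mean() - series.mean()) / series.std()

def compare(n_sims=50, n_years=300, n_series=30, rho=0.9, n_recent=50, seed=0):
    """Average hockey stick index of PC1: full centering vs. final-segment centering."""
    rng = np.random.default_rng(seed)
    full, short = [], []
    for _ in range(n_sims):
        x = ar1_matrix(n_years, n_series, rho, rng)
        # Conventional PCA: subtract the mean over the whole period
        full.append(hockey_stick_index(pc1(x, slice(None)), n_recent))
        # Assumed "short" centering: subtract the mean of the last n_recent years only
        short.append(hockey_stick_index(pc1(x, slice(-n_recent, None)), n_recent))
    return float(np.mean(full)), float(np.mean(short))

full_hsi, short_hsi = compare()
print(f"mean index, full centering:  {full_hsi:.2f}")
print(f"mean index, short centering: {short_hsi:.2f}")
```

The input here is pure noise, so any difference between the two averages comes from the centering step alone.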

This “backgrounder” provides a road map and summary of the 3 articles. While these papers have been under review, Mann et al. have opened up their own weblog and criticized some of our earlier work. We include some comments here on this commentary and some FAQ.



Spot the real Hockey Stick…

Here is a postscript version and pdf version of the graphic showing simulated PC1s with a hockeystick shape and the MBH98 Northern Hemisphere reconstruction. Can you find the “real” MBH98 hockeystick?


AGU Meeting

The AGU conference is unbelievably big. I’m told there were over 10,000 people there. The printed program is 512 pages long and for individual papers only lists authors and titles. I was surprised how little attention was paid to climate in the last 1000 years.

Climate in the past 3 million or 30 million years was a big topic, with many presentations from ODP drilling. There were several presentations on the climate effect of the closure of the Panama Seaway (due to plate tectonics). Impressionistically, the plate tectonics people tend to date this about 4 million years ago and the ODP climate people about 2.8-3 million years ago. The closure ended Pacific waters moving into the North Atlantic and bringing heat to the Arctic and initiated a long-term cooling through the Pliocene and Pleistocene. The warming in the Holocene (10000-12000 years), including modern warming, is still well short of the Pliocene warm period. The Pliocene is not dinosaur vintage, but is getting to a recognizably modern configuration. But once the Panama Seaway closed, ocean currents had to re-arrange themselves and ultimately the modern Conveyor Belt was established, in which warm waters now come around the Cape of Good Hope. The re-arrangement is a big job, when you think of it. One of Nof’s students presented a poster arguing that one of the big changes in the Holocene has been the opening of the Bering Strait, which, even though it is very shallow, de-bottlenecks Arctic water.

Bob Carter showed me some very interesting material from his ODP drilling on the east New Zealand offshore. This is the main entry point for cold Antarctic waters into the Pacific - there are surprisingly few entry points. Bob has found that there are fluctuations at all scales - in fact, his data would indicate that changes of 0.8 deg C in a century are pretty much the norm for millions of years and lesser changes (the premise of the hockey stick) would seem to be the exception.

My poster session was on the Friday afternoon and I was reasonably busy. I talked to about 40-50 people. I had visited the poster session for Nonlinear Methods and quite a few of these mathematically oriented people reciprocated. They got the point of the principal components argument almost immediately and you could almost see them laughing at the punchline. I had printed off the computer program from Mann’s FTP site and highlighted the 4 lines that contain the programming error.

People who were not mathematically inclined were intrigued by a graphic showing 8 hockeysticks - 7 simulated and 1 MBH (the same sort of graphic as the one put up here a while ago, which showed just 1 simulation). Quantity seems to matter in the demonstration. No one could tell the difference without being told. I’ll insert this graphic here in a day or two.

Again, I was surprised how little was on recent climate. There was very little representation from tree ring people. Caspar Ammann had a presentation describing his emulation of MBH98; he is planning to web up R code, which should be interesting. He outlined many issues pertaining to problems in emulating MBH98 - most of which will be familiar to any followers of our work - but conspicuously made no reference to our efforts. He said that he could emulate MBH98 results, but made no reference to principal components calculations or such esoterica as the MBH98 editing of the Gaspe series. (I’m still trying to get the new data from Jacoby et al., who say that the old data is more “temperature sensitive” and should be used in preference to the new data, which they refuse to archive. LOL)


Rutherford et al [Journal of Climate 2004]

The Dec. 4 post “False Claims” refers to an article by Mann and his associates [Rutherford et al. 2004], supposedly discrediting our work. There is nothing in this article about the two main points in our Nature submissions:

1) the modification of the PC algorithm such that it produces hockey sticks in the PC1 from red noise;
2) the ad hoc and unique extrapolation of the Gaspé series.

It regurgitates the comments made in the Nov. 2003 Internet response in which Mann et al. attributed the difference in PC results to a stepwise procedure used in MBH98, but never previously reported. However, the calculations in our 2004 submissions implemented the stepwise procedures and isolated the difference to a completely different cause: the weird centering method introduced at line 168 of their Fortran program. Rutherford et al [2004] do not discuss this matter, although Mann et al. are obviously aware of the issue, as it is referred to in the first paragraph of the web posting.

It is interesting that this matter is being discussed in a Journal of Climate article. Jones and Mann [Rev Geophysics 2004] refers to a submission by Mann et al. to Climatic Change, which was said to have refuted our first submission. Does it strike anyone as odd that this submission mentioned in print over 6 months ago has not seen the light of day? Is it possible that something happened to the article at Climatic Change?


If 2 PCs are used in the AD1400 North American network along with conventional (centered) PC calculations, we argued in our Nature submissions that MM-type results are obtained. This is now effectively acknowledged by MBH. To try to salvage MBH98, they now argue that they should be entitled to increase the number of PCs in the AD1400 North American network from 2 to 5 and that our not doing so is “incorrect”. They point out that, using centered PC methods, the PC4 (instead of the PC1) has a hockey stick shape (from the bristlecone pines) and, as long as they can use the PC4, the PC4 now drives world climate history. Doesn’t this just seem silly? Now we’re not dealing with a “dominant” pattern of world climate, but a PC4. ROTFLOL.

Secondly, I defy anyone to show me how the actual retention of PC series in MBH98 can be derived from the Preisendorfer criteria now said to be used in MBH98 for tree ring networks (although MBH98 itself only talked about spatial distributions for tree ring PC retention). Below are two plots made on the same basis as the plot shown at Mann’s blog for the AD1400 North American network - only here for the AD1600 Vaganov and AD1700 Stahle/SWM networks. In the first case, MBH98 retained 2 PCs and in the second case, MBH98 retained 9 PC series. I do not believe that there is any rational policy here. I sure can’t see how the actual retention can be linked to Preisendorfer. It would be helpful to see some source code here. Maybe there’s something weird and inconceivable like their centering method.

Thirdly, what does this do to their claims of robustness? A robust reconstruction obviously should not stand or fall on whether 2 or 5 PCs are used in the AD1400 North American network - but this is exactly what Mann et al. are saying. Remember all the grandiose claims about MBH98 being robust to the presence or absence of dendroclimatic indicators altogether (see both MBH98 and Mann et al.[2000]). Now it seems that MBH98 is not even robust to the presence or absence of a PC4. Also remember that Mann et al. have known about the lack of robustness to the bristlecones for a long time - look at the PC1 in the BACKTO_1400-CENSORED directory. It’s almost exactly the same as ours. Maybe someone can explain to me how you can claim robustness after doing the CENSORED calculations.


In the figures above, lines are from Preisendorfer-type simulations using AR1 coefficients. Red is using centered calculations; black is MBH98 method, showing both the archived value and our emulation. (I did these calculations a few months ago; I haven’t reconciled why the emulation differs from the archived value in the Vaganov AD1600 network, but the discrepancy is not large and is non-existent in the Stahle/SWM network.) Again, riddle me this: why does the AD1600 Vaganov network have 2 PCs and the AD1700 Stahle/SWM network have 9 PCs?
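For readers unfamiliar with this kind of test, here is a rough sketch of a Preisendorfer-type selection rule: compare each eigenvalue of the network with the eigenvalues obtained from simulated AR1 noise of the same dimensions, and retain leading PCs only while they exceed the simulated benchmark. This is a generic illustration using conventional (centered) calculations, not the procedure actually used in MBH98, which is exactly what cannot be determined without source code. The function names, the 95th percentile cutoff, the number of simulations and the synthetic test network are all assumptions.

```python
import numpy as np

def standardize(x):
    """Center each column on its full-period mean and scale to unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def eigenvalue_fractions(x):
    """Share of total variance carried by each PC, in descending order."""
    s = np.linalg.svd(standardize(x), compute_uv=False)
    return s ** 2 / np.sum(s ** 2)

def lag1_coefficients(x):
    """Lag-1 autocorrelation of each column, used as its AR1 coefficient."""
    z = standardize(x)
    return np.sum(z[1:] * z[:-1], axis=0) / np.sum(z ** 2, axis=0)

def simulate_ar1(n_years, rho, rng):
    """Independent AR(1) series, one column per entry of rho."""
    shocks = rng.standard_normal((n_years, len(rho)))
    x = np.zeros_like(shocks)
    x[0] = shocks[0]
    for t in range(1, n_years):
        x[t] = rho * x[t - 1] + shocks[t]
    return x

def rule_n_retained(x, n_sims=200, percentile=95, seed=0):
    """Count leading PCs whose variance share exceeds the noise benchmark."""
    rng = np.random.default_rng(seed)
    rho = lag1_coefficients(x)
    sims = np.array([
        eigenvalue_fractions(simulate_ar1(x.shape[0], rho, rng))
        for _ in range(n_sims)
    ])
    threshold = np.percentile(sims, percentile, axis=0)
    retained = 0
    for observed, benchmark in zip(eigenvalue_fractions(x), threshold):
        if observed <= benchmark:
            break  # stop at the first PC that fails the test
        retained += 1
    return retained

# Synthetic network (not real tree ring data): one strong common signal plus noise
demo_rng = np.random.default_rng(42)
common = demo_rng.standard_normal(200)
network = 3.0 * common[:, None] + demo_rng.standard_normal((200, 10))
print("PCs retained:", rule_n_retained(network))
```

The point of a rule like this is that the number of retained PCs should follow mechanically from the data, so that anyone running the same rule on the same network gets the same count.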

Mann and some of his colleagues have set up a blog at the above address. A couple of Mann’s first postings have been arguments against our papers. I’ll post up a couple of quick comments [below]. I’m going to be in San Francisco at the AGU conference all next week, where I’m presenting a paper [see abstract] and will post some more when I get back.


Hasn’t Mann’s data “always” been available?

There’s a difference between the underlying proxy data, source code and supporting calculations, and the situation is different for each category.

Proxy data:
There is an archive of proxy data located at Mann’s FTP site at the University of Virginia and this is not a current problem area. This data has not “always” been available. The FTP site was started on or about July 30, 2002, about 4 years after publication of MBH98, so it was not available prior to then. The proxy data is presently located at the url and bears a date stamp from 2002.

The FTP site has private areas
For example, the directory presently located at was formerly located at and was not indexed. If you knew the exact url, you could retrieve it, but not otherwise. When the url was changed to its present location, the date stamp for the directory did not change, so it looks like this directory has been public since December 2003, but it hasn’t been.

(BTW my access to Mann’s FTP site from my computer has been blocked, although I can still get to it from computers at the University of Toronto. This seems a little petty.)

Prior to publication of our first article, there had been no references to this url, even at Mann’s FTP site. I believe that it is quite possible that the directory was re-located - similar to the relocation of the MANNETAL98 directory - and has only been accessible since Nov 2003. Be that as it may, the proxy data is currently available.

Source code:
There is source code at Mann’s FTP site for the calculation of tree ring principal components only. There is no source code for the calculation of reconstructed principal components, for the calculation of NH average temperature, for the Preisendorfer-type simulations, etc. etc. We have requested source code and been refused; we have sought intervention from Nature and the U.S. National Science Foundation without success. For the calculations where code is available and a full reconciliation is possible, there were obviously major discrepancies between the procedures as described in Nature and the procedures as actually used. Who would ever have thought that they used an uncentered algorithm on de-centered data? There were also material discrepancies between the series listed as used and the series actually used. Who would have expected this? Mann et al. have provided a vast new Supplementary Information attached to the July 2004 Corrigendum, which Nature has stated to us finally provides a complete and accurate description of their procedures. Since Nature has not obtained the source code and reconciled it, how can they possibly know that the SI is an accurate description? (By the way, the Supplementary Information was not edited by Nature and almost certainly not peer reviewed.)

Supporting calculations:
The supporting calculations that I most want to see are the calculations for the AD1400 step which is in controversy. The only information available on this step is an RE statistic (of 0.51). Nature has refused to provide supporting calculations for the RE statistics. It is significant that the R2 and other verification statistics have not been provided. My calculations indicate that the R2 and other verification statistics are embarrassingly low, which is probably why no one wants to disclose them or to provide the supporting calculations from which they can be conclusively calculated. The attitude of Nature is that an interested party can calculate their own values. This is hardly an adequate response, given both the widespread reliance on this study and the prior track record of inaccurate disclosure of both data and methods. I’d like to see the exact calculation and the refusals make me all the more interested.
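To show why it matters which verification statistic is reported, here is a small sketch of the two statistics under their usual textbook definitions: RE as one minus the ratio of squared prediction errors to squared deviations from the calibration-period mean, and R2 as the squared correlation between observed and predicted values. The numbers are made up for illustration and have nothing to do with the actual AD1400 step; the function names are hypothetical.

```python
import math

def reduction_of_error(observed, predicted, calibration_mean):
    """RE = 1 - SSE / sum of squared deviations from the calibration-period mean."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_ref = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / ss_ref

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

# Made-up verification-period data (not from any real reconstruction).
# Both series sit about 0.3 below the calibration mean, but their
# year-to-year wiggles are unrelated to each other.
CALIBRATION_MEAN = 0.0
observed = [-0.3 + 0.1 * math.sin(k) for k in range(50)]
predicted = [-0.3 + 0.05 * math.cos(k) for k in range(50)]

re_stat = reduction_of_error(observed, predicted, CALIBRATION_MEAN)
r2_stat = r_squared(observed, predicted)
print(f"RE = {re_stat:.2f}, R2 = {r2_stat:.4f}")
```

In this toy case the RE is high simply because the prediction gets the shift in the mean right, while the R2 is close to zero because it tracks none of the year-to-year variation. A reported RE on its own therefore says little about what the R2 would be.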

So while quite a bit of new information has been provided, there’s a lot of material which has not. It should be easy to simply archive the programs. You’d think that it would be easier to archive the source code than to fight about it.


Other Multiproxy Studies

We are sometimes asked about other multiproxy studies which are held to somehow support Mann. A couple of comments. First, if Mann’s calculations are wrong, the fact that other studies get similar results is neither here nor there. Equally, a critique of MBH98 doesn’t refute these other studies, nor have we claimed this. Second, I’m not convinced that these studies are anywhere near as mutually supporting as claimed. When I get to it, I’m going to try to quantify exactly what is supposedly being shown by the spaghetti diagrams and see if they rise statistically above spaghetti diagrams from our simulated hockey sticks (see the Oct. 25 comment for an example). Third, the record for other multiproxy studies is, in all but one case, worse than MBH98. Here is a brief summary:

    Crowley and Lowery (2000)
    After nearly a year and over 25 emails, Crowley said in mid-October that he has misplaced the original data and could only find transformed and smoothed versions. This makes proper data checking impossible, but I’m planning to do what I can with what he sent. Do I need to comment on my attitude to the original data being “misplaced”?

    Briffa et al. (2001)
    There is no listing of sites in the article or SI (despite JGR policies requiring citations be limited to publicly archived data). Briffa has refused to respond to any requests for data. None of these guys have the least interest in someone going through their data and seem to be hoping that the demands wither away. I don’t see how any policy reliance can be made on this paper with no available data.

    Esper et al. (2002)
    This paper is usually thought to show much more variation than the hockey stick. Esper has listed the sites used, but most of them are not archived. Esper has not responded to any requests for data.

    Jones and Mann (2003); Mann and Jones (2004)
    Phil Jones sent me data for these studies in July 2004, but did not have the weights used in the calculations, which Mann had. Jones thought that the weights did not matter, but I have found differently. I’ve tried a few times to get the weights, but so far have been unsuccessful. My surmise is that the weighting in these papers is based on correlations to local temperature, as opposed to MBH98-MBH99 where the weightings are based on correlations to the temperature PC1 (but this is just speculation right now.) The papers do not describe the methods in sufficient detail to permit replication.

    Jacoby and d’Arrigo (northern treeline)
    I’ve got something quite interesting in progress here. If you look at the original 1989 paper, you will see that Jacoby “cherry-picked” the 10 “most temperature-sensitive” sites from 36 studied. I’ve done simulations to emulate cherry-picking from persistent red noise and consistently get hockey stick shaped series, with the Jacoby northern treeline reconstruction being indistinguishable from simulated hockey sticks. The other 26 sites have not been archived. I’ve written to Climatic Change to get them to intervene in getting the data. Jacoby has refused to provide the data. He says that his research is “mission-oriented” and, as an ex-marine, he is only interested in a “few good” series.

    Jacoby has also carried out updated studies on the Gaspé series, so essential to MBH98. I’ve seen a chronology using the new data, which looks completely different from the old data (which is a hockey stick). I’ve asked for the new data, but Jacoby-d’Arrigo have refused it saying that the old data is “better” for showing temperature increases. Need I comment? I’ve repeatedly asked for the exact location of the Gaspé site for nearly 9 months now (I was going to privately fund a re-sampling program, but Jacoby, Cook and others have refused to disclose the location.) Need I comment?

    Jones et al (1998)
    Phil Jones stands alone among paleoclimate authors, as a diligent correspondent. I have data and methods from Jones et al 1998. I have a couple of concerns here, which I’m working on. I remain concerned about the basis of series selection - there is an obvious risk of “cherrypicking” data and I’m very unclear what steps, if any, were taken to avoid this. The results for the middle ages don’t look robust to me. I have particular concerns with Briffa’s Polar Urals series, which takes the 11th century results down (Briffa arguing that 1032 was the coldest year of the millennium). It looks to me like the 11th century data for this series does not meet quality control criteria and Briffa was over-reaching. Without this series, Jones et al. 1998 is high in the 11th century.
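The cherry-picking simulation mentioned in the Jacoby and d’Arrigo entry above can be sketched roughly as follows. The 36 candidate sites and 10 retained sites come from the text; everything else is an assumption made for illustration, in particular the screening criterion (correlation with a rising straight line over the final 100 years, as a stand-in for “temperature sensitivity”), the AR1 coefficient and the series length. This is not the code used for my own simulations.

```python
import numpy as np

def simulate_red_noise(n_years, n_series, rho, rng):
    """Independent AR(1) series as columns, standardized over the full period."""
    shocks = rng.standard_normal((n_years, n_series))
    x = np.zeros_like(shocks)
    x[0] = shocks[0]
    for t in range(1, n_years):
        x[t] = rho * x[t - 1] + shocks[t]
    return (x - x.mean(axis=0)) / x.std(axis=0)

def recent_slope(series, n_recent):
    """Least-squares slope of the series over its last n_recent years."""
    years = np.arange(n_recent)
    return np.polyfit(years, series[-n_recent:], 1)[0]

def screened_composite(x, n_keep, n_recent):
    """Average of the n_keep series that best correlate with a recent rising trend."""
    trend = np.arange(n_recent)
    corr = np.array([
        np.corrcoef(trend, x[-n_recent:, j])[0, 1] for j in range(x.shape[1])
    ])
    keep = np.argsort(corr)[-n_keep:]  # indices of the highest correlations
    return x[:, keep].mean(axis=1)

def compare_composites(n_sims=20, n_years=400, n_series=36, n_keep=10,
                       n_recent=100, rho=0.9, seed=1):
    """Mean recent slope: screened composite vs. composite of all series."""
    rng = np.random.default_rng(seed)
    picked, everything = [], []
    for _ in range(n_sims):
        x = simulate_red_noise(n_years, n_series, rho, rng)
        picked.append(recent_slope(screened_composite(x, n_keep, n_recent), n_recent))
        everything.append(recent_slope(x.mean(axis=1), n_recent))
    return float(np.mean(picked)), float(np.mean(everything))

picked_slope, all_slope = compare_composites()
print(f"recent slope, 10 screened series: {picked_slope:.4f}")
print(f"recent slope, all 36 series:      {all_slope:.4f}")
```

Since every input series is trendless noise, an upturn at the end of the screened composite can only come from the selection step, which is why the unarchived 26 sites matter.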

These studies are less “independent” than they appear. Many proxies recur in nearly all studies (e.g. Tornetrask, Polar Urals, Tasmania). If you look at all the authors, there is much overlap. Mann is in 4 of the studies; in addition to Jones et al 1998 and the two articles with Mann, Jones is a co-author in Briffa et al. 2001 and supplied much of the data to Crowley and Lowery. Bradley and Jones have been frequent co-authors.



Maybe I’ll start blogging some odds and ends that I’m working on. I’m going to post up some more observations on some of the blog criticisms.

One of the most common arguments against our criticism of MBH98 is that it is supported by other multi-proxy studies. This “support” is usually shown by “spaghetti diagrams”, usually showing the results on an almost unintelligible scale. Here’s an oddity from spaghetti diagrams from Briffa et al. (2001) and Jones and Mann (2004). In the Briffa et al. spaghetti diagram, the Crowley-Lowery reconstruction is the “coldest” in the 17th and 18th centuries and does not intersect the MBH99 reconstruction. In the Jones and Mann spaghetti diagram, the Crowley-Lowery reconstruction intersects the MBH99 reconstruction in the late 17th century.

Figure 1. Spaghetti Diagram, Briffa et al. (2001)

From this site. The Crowley-Lowery (2000) reconstruction is in orange-brown and is the lowest strand in the 17th and 18th centuries, lower than MBH98/99 (purple) or Briffa et al (2000) (green).

Figure 2. Spaghetti Diagram, Jones and Mann (2004)

Here Crowley-Lowery (2000) is in black. In this diagram, it intersects MBH99 in the late 17th century and is consistently above Briffa et al. (2001).

In the Jones and Mann spaghetti diagram, the reconstructions were “scaled by linear regression against the smoothed instrumental NH series over the common interval 1856-1980, with the exception of Briffa et al (2001), which has been scaled over the shorter 1856-1940 interval owing to a late 20th century decline in temperature response in some of the underlying data”. This supposed “decline in temperature response” is an interesting story in itself. The Briffa spaghetti diagram states that all values are re-expressed relative to a 1961-1990 mean (see y-axis label). I’m not expressing any views on this right now, merely noting it.

