Stephen McIntyre and Ross McKitrick
Corrections to the Mann et al (1998) Proxy Data Base and Northern Hemisphere Average Temperature Series


This is the web site for the above paper, published in Energy and Environment 14(6) 751-772.

The companion web site Climate2003 contains extensive additional information on the data and computations. It includes a step-by-step guide to the proxy data audit, an annotated discussion of the data sources, and a step-by-step guide to the computations used to try to replicate the Mann results and to compute the corrected results.

Stephen McIntyre
Toronto Ontario Canada
M4K 2W1
smcintyre25@yahoo.ca
Steve's Bio

Ross McKitrick
Department of Economics
University of Guelph
Guelph Ontario Canada
rmckitri@uoguelph.ca
Ross' bio available here



Abstract:
The data set of proxies of past climate used in Mann, Bradley and Hughes (1998, “MBH98” hereafter) for the estimation of temperatures from 1400 to 1980 contains collation errors, unjustifiable truncation or extrapolation of source data, obsolete data, geographical location errors, incorrect calculation of principal components and other quality control defects. We detail these errors and defects. We then apply MBH98 methodology to the construction of a Northern Hemisphere average temperature index for the 1400-1980 period, using corrected and updated source data. The major finding is that the values in the early 15th century exceed any values in the 20th century. The particular “hockey stick” shape derived in the MBH98 proxy construction – a temperature index that decreases slightly between the early 15th century and early 20th century and then increases dramatically up to 1980 – is primarily an artefact of poor data handling, obsolete data and incorrect calculation of principal components.

THE PAPER IS AVAILABLE ON-LINE AT ENERGY AND ENVIRONMENT.



UPDATE: November 13. In the note posted below, we comment on the deletion of pcproxy.mat and pcproxy.txt from the ftp site in question. We mistakenly thought pcproxy.txt was still at the site because it appeared on-screen under an exact address call. That was a browser cache copy. The file itself has been deleted from the ftp site. Also, the time quoted (Nov 11 1:46 PM) is local Toronto time.



UPDATE: November 11. Our response to the replies thus far from Professor Mann and his colleagues will be presented in three parts. Our overarching goal is to ascertain exactly what data and what computational steps were used by MBH98, so as to focus as quickly as possible on the real sources of differences between our results. But along the way there are a few new issues that must also be dealt with.
Part 1, available here in PDF format responds to the claim that the data we audited was prepared in April 2003 in response to McIntyre's request to Mann, and that we ought to have gone to Professor Mann's ftp site instead. We show that the data file we were sent was in existence long before April 2003 and had we gone to the ftp site we would have found it contains the same data anyway. We also discuss some other pertinent file identity issues. This document, by establishing the practical equivalence between Professor Mann's ftp site and the data file we were sent, returns our focus to the basic question of data quality and sets the stage for the subsequent parts in which we will extend our existing critique.
Part 2 will present a detailed examination of the contents of Professor Mann's FTP site, in light of the claim that it is the official repository for the MBH98 data. This document has been sent to some colleagues for their comments and will be made available shortly thereafter.
Part 3, now under way, will seek to resolve the outstanding differences between our computational methods and those of MBH. Completion of this part will be contingent on our receiving the specific computer programs MBH used, and we are seeking this disclosure.



UPDATE: November 6 2003. Mr. McIntyre has a cold. Mr. McKitrick is going to an economics workshop in Manitoba for a couple of days, to discuss the question "Does the possibility of climate change imply that we should wash our laundry in cold water?" His presentation, if you are interested, is here. So there won't be any updates until next week.



UPDATE: November 4 2003. Professors Mann, Bradley and Hughes have revised their reply to our paper, see here.
They have also corrected some errors in their goodness-of-fit calculations.

UPDATE: November 3 2003. Professors Mann, Bradley and Hughes have made a more detailed reply to our paper, available here in PDF.
We will have a response prepared shortly.

UPDATE: October 29 2003. Professor Mann has made a preliminary reply to our paper. His reply and our response are available here in MSWord and here in PDF.


Background:

The well-known study
  • Mann, M.E., Bradley, R.S. & Hughes, M.K. (1998) Global-Scale Temperature Patterns and Climate Forcing Over the Past Six Centuries, Nature, 392, 779-787, 1998.

    is one of the most influential scientific papers of the past 10 years. It introduced the “multiproxy” method to the study of past climates, and produced what was purported to be a 600-year history of the average temperature of the Northern Hemisphere. It is the basis for the claim by Environment Canada (and many other governmental agencies) that the Earth is “warmer” now than it has been for 600 years. A companion paper published a year later in Geophysical Research Letters extended the 600-year series back to 1000 and spliced a surface temperature record to 1998, producing the famous hockey stick graph of the NH climate.



    This graph figures prominently in the Third Assessment Report of the Intergovernmental Panel on Climate Change and has been reproduced many times. It was the basis for the claim in pamphlets mailed by the Government of Canada to Canadians in 2002 that said “The 20th Century was the warmest globally for the past 1,000 years.” The pamphlets were sent to generate support for ratifying and implementing the Kyoto Protocol in Canada.

    In 2003, Stephen McIntyre, a Toronto businessman who specialized in mathematics at university, became interested in the process by which IPCC Reports were being put together and used to drive major policy decisions. Long experience in the mining industry, including close observation of the delinquent accounting that led to the Bre-X scandal, gave him a good nose for promotions based on unaudited claims. It also taught him that when big investments are at stake, due diligence requires relentless testing and independent verification of the data by all parties at every stage. Attention must also be paid to potential conflicts of interest: for instance, the author of a project feasibility study should not also be a major shareholder in the project. These are rigorous requirements in the private sector, yet in the case of the IPCC, chapter authors routinely promote their own research. This makes it even more important that there be external auditing of the reports’ foundation.

    The Mann hockey stick curve was given central prominence in the 2001 IPCC Report. The IPCC claims it has a rigorous review process. If this is true, the Mann, Bradley and Hughes paper should have no problem passing a detailed audit. Since governments around the world (including here in Canada) are making some very expensive policy decisions based on uncritical acceptance of the IPCC Report, an independent review seemed in order, and indeed should be a mere formality.

     

     

    The Project

    Mr. McIntyre obtained the underlying data set from Professor Michael Mann of the University of Virginia. Based on some apparent difficulties experienced by Mann's associates in supplying the data set, he surmised that it was possible that no one had ever previously requested the data set and that it would be a worthwhile endeavour to try to replicate the famous graph.

     

    In the summer of 2003 he contacted Ross McKitrick, an Associate Professor of Economics at the University of Guelph and coauthor of Taken By Storm: the Troubled Science, Policy and Politics of Global Warming, to discuss his findings to that point. They joined forces to write up the results and publish them. Their paper has been published in the British journal Energy and Environment.

     

    Their conclusion, after detailed study of the Mann et al. data base, is that the “hockey stick” graph is an artefact of poor data handling, selective use of sources, reliance on obsolete versions of source data and erroneous statistical calculations. Correcting the copying errors and updating the source data yields the following revision to the original graph:

    The top diagram is from the Mann et al. study (with the error bars removed). The vertical axis measures “anomalies” or departures from a notional hemispheric “average temperature” in tenths of a degree C. The bottom diagram is based on the corrected data. Applying the Mann et al. “multiproxy” procedure to their own data, when updated and correctly collated, contradicts the claim that the late 20th century climate is unusually warm or variable.




    The above shows the same comparison using 20-year moving averages.
    You can get the spreadsheet that produced these pictures here.
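
    For readers unfamiliar with the smoothing, a 20-year moving average replaces each annual value with the mean of a 20-year window. The sketch below is illustrative only: the function name, the toy numbers and the choice of a trailing (rather than centred) window are assumptions, and the spreadsheet may arrange the calculation differently.

```python
def moving_average(values, window=20):
    """Trailing moving average: one output per full window of `window` values.

    Assumption: a trailing window. A centred window would give the same
    numbers, just attached to different years.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that has just left the window.
            running -= values[i - window]
        if i >= window - 1:
            smoothed.append(running / window)
    return smoothed


# Toy annual anomalies (made-up numbers, not from the paper).
toy_series = [1.0, 2.0, 3.0, 4.0]
print(moving_average(toy_series, window=2))  # [1.5, 2.5, 3.5]
```

    A series of n annual values yields n - window + 1 smoothed values, so the smoothed curve is shorter than the annual one.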


    Here is the same figure in b&w.

     


     

    Supplementary Material

     

    The data, guides to all sources, statistical programs etc. are all presented in step-by-step detail at Climate2003:

     

    http://www.climate2003.com/index.html

    The proxy data set we received from Mann (pcproxy.txt) is available in several forms. The text version is exactly as received from Mann (and was sent back to him for verification). The .xls version (BIG file - 3.8MB) has been put into an Excel spreadsheet, with the columns colour-coded to identify those with problematic data. The subsets available in each time interval are on separate worksheets. The final revised versions of the proxy data are in .txt and Excel formats.

     

    pcproxy.txt



    MBHproxydata.xls

    revised.txt

    revised.xls

    The 16 temperature principal components taken from Mann’s web site.

    temp16pc.txt

     

    The final weighting matrix on proxies. The NH temperature series is derived from a matrix equation that yields a set of weights to average up the 112 proxies into a statistic that is called the “Northern Hemisphere temperature anomaly.” This spreadsheet gives the weights.

     

    weights.xls
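
    To make the idea of "weights to average up the proxies" concrete, the sketch below computes one index value per year as the weighted sum of that year's proxy values. It is a toy illustration: the function name, the two-proxy example and the use of a single fixed weight vector are assumptions, and it says nothing about how weights.xls is actually laid out or how the weights were derived.

```python
def weighted_index(proxy_rows, weights):
    """Return one index value per year: sum over proxies of weight * value.

    proxy_rows: list of rows, one row per year, one value per proxy.
    weights:    one weight per proxy (hypothetical values in the example).
    """
    index = []
    for row in proxy_rows:
        if len(row) != len(weights):
            raise ValueError("each row needs exactly one value per weight")
        index.append(sum(w * x for w, x in zip(weights, row)))
    return index


# Two made-up proxies over two years, equally weighted.
toy_proxies = [[1.0, 2.0], [3.0, 4.0]]
toy_weights = [0.5, 0.5]
print(weighted_index(toy_proxies, toy_weights))  # [1.5, 3.5]
```

    With 112 proxies, each row would simply hold 112 values and the weight vector 112 weights.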



    Questions and Answers, in case you were wondering….

     

     

    Who paid for this research?

    • No one. We neither sought nor received financial support for this project.

     

    Was your article peer-reviewed?

    • Yes. Our article was read by numerous colleagues in Canada, the US, Australia and Europe, including experts in mathematics and statistics, geology, paleoclimatology, climatology and physics. It was refereed for Energy and Environment by reviewers selected by the editor.

     

    If there were all the errors in Mann et al (1998) that you allege, how could it have passed peer review for a prestigious journal like Nature?

    • You would have to ask Nature about the steps taken by their peer reviewers to verify the results in Mann et al. (1998). However, a peer review is not an audit. It is extremely unlikely that the peer reviewers for Nature even requested the original MBH dataset, much less that they carried out the quality control tests that we carried out.

     

    How can a third party decide whether you are right or Mann et al. are right?

    • We have created an audit trail so that third parties can verify these findings for themselves. This includes what we think is the first Internet posting of the original proxy data used in Mann et al (1998). Some of the points are very easy to verify. To verify the collation errors resulting in duplication of 1980 entries in the data, one need only inspect a few numbers. We’ve created excerpts from the data and directions to the exact locations in the original data base. Anyone can check this. Similarly, we’ve created excerpts and pointers in the data base so that anyone can verify the extrapolations and “fills” merely by inspection. To verify that the MBH data base contains obsolete data, we’ve made graphs to show the differences between the MBH versions and the updated version in every case found (so far); we’ve also included data files showing both versions together and URLs for the updated data. Anyone can check this for themselves. We’ve included computer scripts in R, which will collect the data from the URL site and make the graphs. Verifying the principal components calculations is more work, but we’ve also made the tools available to do this. We’ve provided collated data files for the underlying tree ring series as well as descriptions of how to collect the data. We’ve provided computer scripts showing our principal component calculations and the explained variance using MBH principal components. We’ve also provided a collated version of all the data and scripts for how we replicated the MBH reconstruction. We believe that audit trails are extremely important for this type of analysis and that the Internet provides an ideal mechanism for ensuring public accessibility to such audit trails.
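
    As an illustration of how simple the duplicate-entry check is, the sketch below groups series whose values in a single year agree to 7 decimal places. It is written in Python rather than R purely for illustration, and it is not one of the posted scripts. The function name and the numbers are made up; the series numbers merely echo the ranges discussed in the questions further down, and reading the values out of pcproxy.txt is left aside because the file layout is not described here.

```python
def groups_of_identical_values(row, decimals=7):
    """Group series numbers whose values agree after rounding to `decimals` places.

    row: dict mapping series number -> value for one year (e.g. 1980).
    Returns a list of groups, each a sorted list of series numbers.
    """
    groups = {}
    for series_number, value in row.items():
        groups.setdefault(round(value, decimals), []).append(series_number)
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Made-up values for one year: two pairs of series share a value.
toy_row = {73: 0.1234567, 74: 0.1234567, 75: 0.5, 81: 0.25, 82: 0.25}
print(groups_of_identical_values(toy_row))  # [[73, 74], [81, 82]]
```

    Independent series would not normally agree to 7 decimal places in the same year, which is why a match of this kind points to a collation problem rather than a coincidence.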

     

    Why didn’t IPCC pick up these errors?

    • You’d have to ask them. IPCC have not described what measures of due diligence they carried out. One would surmise that they did not carry out the type of data quality control tests that we did. We understand that Mann was a lead chapter author and, in his IPCC capacity, may not have carried out any due diligence on his own work.

     

    Why has no one else picked up these errors?

    • Our guess is that no one else ever examined the data in detail. MBH never placed the compilation at the World Data Center for Paleoclimatology or at their own FTP sites, as one might have expected. [Nov 4/03: This is not correct. An FTP site was identified in the responses to our paper. It is ftp://holocene.evsc.virginia.edu/pub/MBH98/.] [Nov 11/03 The previous correction was premature. See this update for some further comments on FTP disclosure. Professor Mann has asserted that the data we analyzed was not the data behind MBH98. But it turns out to be identical to what was on the ftp site. So either we did audit the right data or the MBH98 data still haven't been FTP-posted.] When Prof. Mann arranged for the data to be provided in April 2003, it was not immediately available and it’s possible that no one ever requested it before.

     

    What led you to request the data from Prof. Mann?

    • McIntyre has a background in the mineral exploration business. He wanted to see the underlying proxies (before any statistical manipulations by Mann et al.) for exactly the same reason that mining engineers want to look at drill cores in calculating ore reserves. He had seen other proxy data which did not suggest that proxies were behaving differently in the late 20th century. He had also seen comments by Briffa that tree ring proxies had declined in the second half of the 20th century and, since MBH data was heavily based on tree rings, wondered how this was reflected in the MBH data. Since he was unable to locate the data in a public archive, he requested it from Prof. Mann.

     

    Did you show this paper to Prof. Mann or ask Prof. Mann for comments prior to publication?

    • In late September 2003, we asked Prof. Mann for additional information on his reconstruction methodology. Prof. Mann advised us that he was unable to provide such additional information and would be unable to respond to further inquiries, owing to the numerous demands on his time.

     

    Are you qualified to verify this data?

    • Ultimately, to borrow a phrase, “the proof is in the pudding”. If we’ve identified material errors and defects in this data base, this would prove that we were qualified to do so. As a more detailed answer, both of us have strong backgrounds in handling data and in assessing data quality. McIntyre’s intuition that the data should be examined like drill core shows that the practical experience and scepticism that one acquires in the mineral exploration industry was not misplaced here. Moreover, the paper is about statistical and “accounting” issues, both of which are well within our ranges of experience and competence. While McIntyre’s background is more on the practical side and McKitrick’s more on the academic side, both have strong mathematical skills and statistical training.

     

    Do you have any ties to the energy sector or anti-Kyoto think tanks?

    • McKitrick is a Senior Fellow of the Fraser Institute, a Canadian policy think tank that has taken a stand against Kyoto. McIntyre has worked many years in the mineral exploration industry. McIntyre is a shareholder of a micro-capital energy exploration company, CGX Energy, has acted in the past as a consultant to CGX and sub-leases office space from CGX. CGX is not a producing company and, as a company, has no views on Kyoto and has provided no financial support to this study.

     

    Your graph seems to show that the 15th Century was warmer than today’s climate: is this what you’re claiming?

    • No. We’re saying that Mann et al., based on their methodology and corrected data, cannot claim that the 20th century is warmer than the 15th century – the nuance is a little different. To make a positive claim that the 15th century was warmer than the late 20th century would require an endorsement of both the methodology and the common interpretation of the results which we are neither qualified nor inclined to offer.

     

    What led you to publish in E&E rather than Nature?

    • After receiving the MBH98 data from Scott Rutherford and Michael Mann, McIntyre posted a series of observations about curiosa in the data on the internet discussion group climateskeptics. Sonja Boehmer-Christiansen invited McIntyre to consider writing up his work for submission and McIntyre agreed. Subsequent to this, McKitrick joined with McIntyre in the analysis and preparation of an article. McKitrick suggested that an article be submitted to Nature and a 1500-word version (to fit the word limit in Nature) was drafted. But after showing it to some scientific colleagues who were not familiar with the issue, we were advised that it was too short a format to convey the scope of the argument. So we chose to write a longer paper first in order to get the full body of material out. It has been suggested to us that we write a letter to Nature summarizing what is spelled out in the longer paper and we are considering this.

     

    What if someone comes along and finds errors in your work?

    • We've made it as easy as possible for them to do so. We’ve displayed all our data and all our methods. We welcome the scrutiny.

     

    How closely did you replicate the original MBH98 results using the data they supplied you?

    • As we state in the paper, we achieved substantial success in replication, but some differences remained between their results and ours. We were unable to obtain advice from them on either data questions or methodology questions, so we carried on with the rest of our analysis. Figure 6 in our paper shows the comparison of their temperature PC1 and ours. The comparison of the final NH index versions looks very similar to that between the PC1 versions:

      In their reply (see above) MB&H can also obtain a large variation in the 15th century, similar to ours, by making some changes that approximate some of the changes between our versions.

     



    QUESTIONS FOR PROFESSORS MANN, BRADLEY AND HUGHES THAT ARISE FROM THIS ANALYSIS.

    These questions summarize the results of our audit of the data set. Answers to these questions are required to settle the contradiction between the original and corrected results.

     

    1.       Does the database contain truncations of series #10, #11 and #100 (and of the version of series #65 used by MBH98)?

     

    2.       Are the 1980 values of series #73 through #80 identical to 7 decimal places? Similarly for the 1980 values of series #81-83?  And for the 1980 values of series #84 and #90-92? What is the reason for this?

     

    3.       Where are the calculations of principal components for series in the range #73-92 that would show that these have been collated into the correct year? Do you have any working papers that show these, and if so, would you make them FTP or otherwise publicly available?

     

    4.       Do the following series contain "fills": #3, #6, #45, #46, #50-#52, #54-#56, #58, #93-#99?

     

    5.       How did you deal with missing closing data in the following series: #11, #102, #103, #104, #106 and #112?

     

    6.       What is the source for your data for series #37 (precipitation in grid-box 42.5N, 72.5W)? Did you use the data from Jones-Bradley Paris, France and if so, in which series? More generally, please provide identifications of the exact Jones-Bradley locations for each of the series #21-42. Where are the original source data?

     

    7.       Did you use summer (JJA) data for series #10 and #11 rather than annual data? If so, why?

     

    8.       Does your dataset contain obsolete data for the following series: #1, #2, #3, #6, #7, #8, #9, #21, #23, #27, #28, #30, #35, #37, #43, #51, #52, #54, #55, #56, #58, #65, #105 and #112?

     

    9.       Do you use the following listed proxies: fran003, ital015, ital015x, spai026 and spai047?  If so, where?

     

    10.   Did you commence your calculation of principal components after the period in which all dataset members were available for the following series: #69-71, #91-92, #93-95, #96-99?

     

    11.   What is the basis for inclusion of some tree ring sites within a region in regional principal component calculations and others as individual dataset components?

     

    12.   Did you commence your calculation of principal components before the period in which all dataset members were available for the following series: #72-80, #84-90? If so, please describe your methodology for carrying out these calculations in the presence of missing data and your justification for doing so.

     

    13.   What is the explained variance under your principal component calculation for the period of availability of all members of your selected dataset? Would you please make your working papers that show this FTP or otherwise publicly available?
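
    For readers unfamiliar with the term in question 13, the "explained variance" of the first principal component is the share of the total variance of a group of series that is captured by their leading common pattern. The sketch below is a minimal, generic illustration and not a reconstruction of any MBH98 calculation: the function names are hypothetical, the series are centred on their full-period means, the covariance (not correlation) matrix is used, and the leading eigenvalue is found by simple power iteration. All of these are assumptions made for the example.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (one row per year)."""
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two years of data")
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    p = len(means)
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric non-negative-definite matrix.

    Power iteration from a fixed start vector; a start vector exactly
    orthogonal to the leading eigenvector would defeat it.
    """
    p = len(matrix)
    vec = [float(i + 1) for i in range(p)]
    eigenvalue = 0.0
    for _ in range(iterations):
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0.0:
            return 0.0
        vec = [x / norm for x in vec]
        product = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        eigenvalue = sum(v * x for v, x in zip(vec, product))  # Rayleigh quotient
        vec = product
    return eigenvalue


def pc1_explained_variance(rows):
    """Fraction of total variance captured by the first principal component."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    if total == 0.0:
        raise ValueError("all series are constant")
    return leading_eigenvalue(cov) / total


# Two made-up series that move in lockstep: PC1 captures essentially everything.
print(pc1_explained_variance([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]))
```

    Different centring or scaling conventions can change the result, which is one reason the working papers behind such a figure matter.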

     

     

     

     

     

