IPCC Fourth Assessment Report: Climate Change 2007
Climate Change 2007: Working Group I: The Physical Science Basis

1.3.2 Global Surface Temperature

Shortly after the invention of the thermometer in the early 1600s, efforts began to quantify and record the weather. The first meteorological network was formed in northern Italy in 1653 (Kington, 1988) and reports of temperature observations were published in the earliest scientific journals (e.g., Wallis and Beale, 1669). By the latter part of the 19th century, systematic observations of the weather were being made in almost all inhabited areas of the world. Formal international coordination of meteorological observations from ships commenced in 1853 (Quetelet, 1854).

Inspired by the paper Suggestions on a Uniform System of Meteorological Observations (Buys-Ballot, 1872), the International Meteorological Organization (IMO) was formed in 1873. Its successor, the World Meteorological Organization (WMO), still works to promote and exchange standardised meteorological observations. Yet even with uniform observations, there are still four major obstacles to turning instrumental observations into accurate global time series: (1) access to the data in usable form; (2) quality control to remove or edit erroneous data points; (3) homogeneity assessments and adjustments where necessary to ensure the fidelity of the data; and (4) area-averaging in the presence of substantial gaps.

Köppen (1873, 1880, 1881) was the first scientist to overcome most of these obstacles in his quest to study the effect of changes in sunspots (Section 2.7). Much of his data came from Dove (1852), but wherever possible he used data directly from the original source, because Dove often lacked information about the observing methods. Köppen considered examination of the annual mean temperature to be an adequate technique for quality control of far distant stations. Using data from more than 100 stations, Köppen averaged annual observations into several major latitude belts and then area-averaged these into a near-global time series shown in Figure 1.3.
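
The belt-then-area averaging described above can be sketched as follows. This is a minimal illustration and not Köppen's actual procedure: the belt boundaries, the function names `belt_weight` and `area_average`, and the treatment of empty belts are assumptions made for this example. The one firm ingredient is the geometric fact that the area of a latitude belt on a sphere is proportional to the difference of the sines of its bounding latitudes.

```python
import math


def belt_weight(lat_south, lat_north):
    """Fraction of the sphere's surface lying between two latitudes (degrees)."""
    return (math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))) / 2.0


def area_average(belt_means):
    """Area-weighted mean of per-belt values.

    belt_means maps (lat_south, lat_north) -> mean anomaly for that belt,
    or None when the belt has no data (hypothetical convention).
    Belts without data are skipped and the weights renormalised.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (lat_south, lat_north), value in belt_means.items():
        if value is None:
            continue
        weight = belt_weight(lat_south, lat_north)
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no belt contains data")
    return weighted_sum / weight_total
```

Weighting by area rather than by station count keeps a densely observed belt from dominating the result, which matters when most stations lie in a few regions.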

Figure 1.3. Published records of surface temperature change over large regions: Köppen (1881), tropics and temperate latitudes, using land air temperature; Callendar (1938), global, using land stations; Willett (1950), global, using land stations; Callendar (1961), 60°N to 60°S, using land stations; Mitchell (1963), global, using land stations; Budyko (1969), Northern Hemisphere, using land stations and ship reports; Jones et al. (1986a,b), global, using land stations; Hansen and Lebedeff (1987), global, using land stations; and Brohan et al. (2006), global, using land air temperature and sea surface temperature data. The Brohan et al. (2006) record is the longest of the currently updated global temperature time series (Section 3.2). All time series were smoothed using a 13-point filter. The Brohan et al. (2006) time series are anomalies from the 1961 to 1990 mean (°C). Each of the other time series was originally presented as anomalies from the mean temperature of a specific and differing base period. To make them comparable, each of the other time series has been adjusted so that the mean of its last 30 years is identical to the mean of the Brohan et al. (2006) anomaly time series over that same period.
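
The base-period adjustment described in the caption amounts to adding a constant offset to each historical series. The sketch below assumes annual values stored as `{year: anomaly}` dictionaries; the function name `align_to_reference` and the data layout are hypothetical, and the 13-point smoothing is left out because the caption does not state the filter weights.

```python
def _mean(values):
    return sum(values) / len(values)


def align_to_reference(series, reference, n_years=30):
    """Shift `series` so its mean over its last `n_years` matches `reference`.

    series, reference: dicts mapping year -> anomaly (deg C).
    Returns a new dict; the shape of the series is unchanged, only its
    zero level moves.
    """
    overlap_years = sorted(series)[-n_years:]
    missing = [year for year in overlap_years if year not in reference]
    if missing:
        raise ValueError(f"reference lacks years: {missing}")
    offset = (_mean([reference[year] for year in overlap_years])
              - _mean([series[year] for year in overlap_years]))
    return {year: value + offset for year, value in series.items()}
```

Because only a constant is added, trends and year-to-year variations in each historical series are preserved; only the vertical position on the common axis changes.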

Callendar (1938) produced the next global temperature time series expressly to investigate the influence of CO2 on temperature (Section 2.3). Callendar examined about 200 station records. Only a small portion of them was deemed defective, based on quality concerns determined by comparing differences with neighbouring stations or on homogeneity concerns based on station changes documented in the recorded metadata. After further removing two Arctic stations because he had no compensating stations from the Antarctic region, he created a global average using data from 147 stations.

Most of Callendar’s data came from World Weather Records (WWR; Clayton, 1927). Initiated by a resolution at the 1923 IMO Conference, WWR was a monumental international undertaking producing a 1,196-page volume of monthly temperature, precipitation and pressure data from hundreds of stations around the world, some with data starting in the early 1800s. In the early 1960s, J. Wolbach had these data digitised (National Climatic Data Center, 2002). The WWR project continues today under the auspices of the WMO with the digital publication of decadal updates to the climate records for thousands of stations worldwide (National Climatic Data Center, 2005).

Willett (1950) also used WWR as the main source of data for 129 stations that he used to create a global temperature time series going back to 1845. While the resolution that initiated WWR called for the publication of long and homogeneous records, Willett took this mandate one step further by carefully selecting a subset of stations with as continuous and homogeneous a record as possible from the most recent update of WWR, which included data through 1940. To avoid over-weighting certain areas such as Europe, only one record, the best available, was included from each 10° latitude and longitude square. Station monthly data were averaged into five-year periods and then converted to anomalies with respect to the five-year period 1935 to 1939. Each station’s anomaly was given equal weight to create the global time series.
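
The averaging steps in this paragraph can be sketched in a few lines. This is an illustration under stated assumptions, not Willett's actual computation: the function names and the `{(year, month): temperature}` layout are hypothetical, the selection of one station per 10° square is assumed to have been done already, and each five-year mean is taken as a simple mean of whatever monthly values are present, which ignores any handling of missing months.

```python
from collections import defaultdict


def five_year_means(monthly):
    """Average monthly temperatures into five-year periods.

    monthly: dict mapping (year, month) -> temperature.
    Periods start in years divisible by 5, so 1935 to 1939 is keyed by 1935.
    """
    buckets = defaultdict(list)
    for (year, _month), temperature in monthly.items():
        buckets[year - year % 5].append(temperature)
    return {start: sum(vals) / len(vals) for start, vals in buckets.items()}


def station_anomalies(monthly, base_start=1935):
    """Five-year means expressed relative to the base period mean."""
    means = five_year_means(monthly)
    if base_start not in means:
        raise ValueError("station has no data in the base period")
    base = means[base_start]
    return {start: value - base for start, value in means.items()}


def global_series(stations, base_start=1935):
    """Equal-weight mean of station anomalies for each five-year period.

    stations: iterable of monthly dicts, one per selected 10-degree square.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for monthly in stations:
        for start, anomaly in station_anomalies(monthly, base_start).items():
            sums[start] += anomaly
            counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}
```

Converting each station to anomalies before averaging removes differences in absolute temperature between stations, so a warm tropical station and a cold high-latitude station contribute equally to the change signal.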

Callendar in turn created a new near-global temperature time series in 1961 and cited Willett (1950) as a guide for some of his improvements. Callendar (1961) evaluated 600 stations with about three-quarters of them passing his quality checks. Unbeknownst to Callendar, a former student of Willett, Mitchell (1963), in work first presented in 1961, had created his own updated global temperature time series using slightly fewer than 200 stations and averaging the data into latitude bands. Landsberg and Mitchell (1961) compared Callendar’s results with Mitchell’s and stated that there was generally good agreement except in the data-sparse regions of the Southern Hemisphere.

Meanwhile, research in Russia was proceeding on a very different method to produce large-scale time series. Budyko (1969) used smoothed, hand-drawn maps of monthly temperature anomalies as a starting point. While restricted to analysis of the Northern Hemisphere (NH), this map-based approach allowed not only the inclusion of an increasing number of stations over time (e.g., 246 in 1881, 753 in 1913, 976 in 1940 and about 2,000 in 1960) but also the use of data over the oceans (Robock, 1982).

Increasing the number of stations utilised has been a continuing theme over the last several decades. Considerable effort has been spent digitising historical station data and addressing the continuing problem of acquiring up-to-date data, as there can be a long lag between making an observation and the data getting into global data sets. During the 1970s and 1980s, several teams produced global temperature time series. Advances especially worth noting during this period include the extended spatial interpolation and station averaging technique of Hansen and Lebedeff (1987) and the painstaking assessment by Jones et al. (1986a,b) of homogeneity, with adjustments to account for discontinuities in the record of each of the thousands of stations in a global data set. Since then, global and national data sets have been rigorously adjusted for homogeneity using a variety of statistical and metadata-based approaches (Peterson et al., 1998).

One recurring homogeneity concern is potential urban heat island contamination in global temperature time series. This concern has been addressed in two ways. The first is by adjusting the temperature of urban stations to account for assessed urban heat island effects (e.g., Karl et al., 1988; Hansen et al., 2001). The second is by performing analyses that, like Callendar (1938), indicate that the bias induced by urban heat islands in the global temperature time series is either minor or non-existent (Jones et al., 1990; Peterson et al., 1999).

As the importance of ocean data became increasingly recognised, a major effort was initiated to seek out, digitise and quality-control historical archives of ocean data. This work has since grown into the International Comprehensive Ocean-Atmosphere Data Set (ICOADS; Worley et al., 2005), which has coordinated the acquisition, digitisation and synthesis of data ranging from transmissions by Japanese merchant ships to the logbooks of South African whaling boats. The amount of sea surface temperature (SST) and related data acquired continues to grow.

As fundamental as the basic data work of ICOADS was, there have been two other major advances in SST data. The first was adjusting the early observations to make them comparable to current observations (Section 3.2). Prior to 1940, the majority of SST observations were made from ships by hauling a bucket on deck filled with surface water and placing a thermometer in it. This ancient method eventually gave way to thermometers placed in engine cooling water inlets, which are typically located several metres below the ocean surface. Folland and Parker (1995) developed an adjustment model that accounted for heat loss from the buckets and that varied with bucket size and type, exposure to solar radiation, ambient wind speed and ship speed. They verified their results using time series of night marine air temperature. The adjustment raised the early bucket observations by a few tenths of a degree Celsius.

Most of the ship observations are taken in narrow shipping lanes, so the second advance has been increasing global coverage in a variety of ways. Direct improvement of coverage has been achieved by the internationally coordinated placement of drifting and moored buoys. The buoys began to be numerous enough to make significant contributions to SST analyses in the mid-1980s (McPhaden et al., 1998) and have subsequently increased to more than 1,000 buoys transmitting data at any one time. Since 1982, satellite data, anchored to in situ observations, have contributed to near-global coverage (Reynolds and Smith, 1994). In addition, several different approaches have been used to interpolate and combine land and ocean observations into the current global temperature time series (Section 3.2). To place the current instrumental observations into a longer historical context requires the use of proxy data (Section 6.2).

Figure 1.3 depicts several historical ‘global’ temperature time series, together with the longest of the current global temperature time series, that of Brohan et al. (2006; Section 3.2). While the data and the analysis techniques have changed over time, all the time series show a high degree of consistency since 1900. The differences caused by using alternative data sources and interpolation techniques increase when the data are sparser. This phenomenon is illustrated especially well by the pre-1880 values of Willett’s (1950) time series. Willett noted that his data coverage remained fairly constant after 1885 but dropped off dramatically before that time to only 11 stations before 1850. The high degree of agreement between the time series resulting from these many different analyses increases the confidence that the changes they are indicating are real.

Despite the fact that many recent observations are automatic, the vast majority of data that go into global surface temperature calculations – over 400 million individual readings of thermometers at land stations and over 140 million individual in situ SST observations – have depended on the dedication of tens of thousands of individuals for well over a century. Climate science owes a great debt to the work of these individual weather observers as well as to international organisations such as the IMO, WMO and the Global Climate Observing System, which encourage the taking and sharing of high-quality meteorological observations. While modern researchers and their institutions put a great deal of time and effort into acquiring and adjusting the data to account for all known problems and biases, century-scale global temperature time series would not have been possible without the conscientious work of individuals and organisations worldwide dedicated to quantifying and documenting their local environment (Section 3.2).