In honor of Google’s latest diversity kerfuffle, I continue with my diversity initiative on WUWT with a guest post by Nick Stokes.~ctm
By Nick Stokes
There is an often expressed belief at WUWT that temperature data is manipulated or fabricated by the providers. This persists despite the fact that, for example, the 2015 GWPF investigation went nowhere, and the earlier BEST investigation ended up corroborating the main data sources. In this post, I would like to walk through the process whereby, in Australia, the raw station data is immediately posted online, then aggregated by month, submitted via CLIMAT forms to the WMO, and then transferred to the GHCN monthly unadjusted global dataset. This can then be used directly in computing a global anomaly average. The main providers insert a homogenization step, the merits of which I don’t propose to canvass here. The essential points are that you can compute the average without that step, and that the results are little different.
The accusations of data corruption got a workout with the recent kerfuffle over a low temperature reading on a very cold morning at Goulburn, NSW in July, so I’ll start with the Bureau of Meteorology online automatic weather station data. I recently counted a total of 712 such stations, for which data is posted online every half hour, within ten minutes of being measured. You can find the data by state – here is NSW. You can find other states from the bar at the top, under “latest observations”. Here is a map of the stations in NSW in this table:
For context, I have marked with green the stations of Goulburn and Thredbo Top, which had temperatures below -10°C flagged on that very cold morning in July. On that BoM table, you can see stations listed like this (switching now to Victoria):
I switched because I am now following a post from Moyhu here, and I want a GHCN station which I can follow through. But it is the same format for all stations. This data is from 4 December 2016, and I have highlighted in green the min/max data that will flow through (unchanged except for possible quality control flagging) to GHCN unadjusted. It shows for Melbourne Airport the most recent temperature (22.4) at 7pm, various other data, and then the min and max, along with the time recorded. The min is incomplete; it shows the latest 7pm temperature, but would no doubt be lower by 9am the next day, which is the cut-off. The max probably wouldn’t change. You can see the headings by linking to the page here.
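The 9am-to-9am bookkeeping can be sketched in a few lines of Python. This is only an illustration with synthetic half-hourly data; the convention here (credit each reading to the 9am-to-9am window in which it falls) is a simplification, and BoM's actual rules for attributing the min and the max to a calendar day differ in detail:

```python
from datetime import datetime, timedelta

def nine_am_day(ts):
    # Shift back 9 hours so each "climate day" runs 9am-to-9am.
    # Attribution of the window to a date is an assumption here;
    # BoM attributes min and max by its own, slightly different, rules.
    return (ts - timedelta(hours=9)).date()

def daily_extremes(readings):
    """readings: list of (datetime, temperature in C) half-hourly pairs.
    Returns {date: (min, max)} over each 9am-to-9am window."""
    days = {}
    for ts, t in readings:
        d = nine_am_day(ts)
        lo, hi = days.get(d, (t, t))
        days[d] = (min(lo, t), max(hi, t))
    return days

# Synthetic half-hourly data for one 9am-to-9am window (illustrative only)
start = datetime(2016, 12, 4, 9, 0)
temps = [15.7 + i * 0.3 for i in range(48)]
readings = [(start + timedelta(minutes=30 * i), temps[i]) for i in range(48)]
print(daily_extremes(readings))
```

Until the window closes at 9am the next day, any running min/max computed this way is provisional, which is exactly why the table above shows an incomplete min at 7pm.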
If you click on the station name, it brings up a full table of the half-hourly readings for the last three days, in this style:
Apologies for jumping forward to now (7 Aug), but I didn’t record this back in December. It also shows the headings relevant to the table above; the top line is the present (a few minutes ago), with earlier readings below. Now you can see that this has to be automated; no-one is hovering over this stream of data with an eraser. If you click on “Recent months”, it brings up the following table (an extract here, and we’re back in Dec 2016):
That was taken at the same time (just after 7pm, 4 Dec), and you’ll see that it shows the minimum attributed to Sunday 4th (before 9am), at 9.1, but not yet the max. If you look below that table you’ll see a list of the last 13 months linked, for which you can bring up the complete table. Here is what that Dec 2016 table now looks like:
The max of 31.7 is there; the min went down to 15.7. The other data hasn’t changed. Further down on that page, as it appears now, are the summary statistics for the month:
At the end of Dec 2016, that was transmitted to the WMO as a CLIMAT form, which you can see summarized at the Ogimet site.
You can see that the min and max are transmitted unchanged. The mean of the two has also been calculated and is marked in brown. If you want further authenticity, that site will show you the code that the met office transmitted.
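The arithmetic behind that monthly summary is simple enough to sketch: the monthly mean max and mean min are averages of the daily values, and the "mean" in the CLIMAT form is just the mean of those two. The daily values below are made up for illustration, not taken from the Melbourne record:

```python
def climat_summary(daily):
    """daily: list of (tmin, tmax) pairs, one per day of the month.
    Returns the monthly statistics as they appear in a CLIMAT summary:
    mean of daily minima, mean of daily maxima, and the mean of those two."""
    mean_min = sum(t for t, _ in daily) / len(daily)
    mean_max = sum(t for _, t in daily) / len(daily)
    return {
        "mean_min": round(mean_min, 1),
        "mean_max": round(mean_max, 1),
        "mean": round((mean_min + mean_max) / 2, 1),  # mean of the two
    }

# Illustrative three-day "month" (real months use all daily values)
print(climat_summary([(14.0, 25.0), (15.0, 24.0), (16.0, 27.0)]))
```

Nothing in this step is adjusted; it is pure averaging of the daily numbers already posted online.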
Finally, the CLIMAT form is transcribed into the GHCN unadjusted file, which you can see here. It’s a big file, and you have to gunzip and untar. You can also get a file for max and min. Then you have a text file, which, if you search for 501948660002016TAVG (which includes the Melb code) you see this line:
There is the 19.5 (multiplied by 100, as GHCN does). The other numbers will appear in the GHCN TMAX and TMIN files.
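That fixed-width GHCN-M record can be unpacked in a few lines of Python. The layout assumed here follows my reading of NOAA's v3 format description (11-character station ID, 4-character year, 4-character element, then twelve 5-character values each followed by three flag characters, in hundredths of a degree, with -9999 for missing); the sample line is synthetic, with only December filled in:

```python
def parse_ghcn_line(line):
    """Parse one GHCN-M v3 .dat record into (station, year, element, values).
    Values are converted from hundredths of a degree C; -9999 becomes None."""
    station = line[:11]
    year = int(line[11:15])
    element = line[15:19]
    values = []
    for m in range(12):  # 12 months, 8 characters each (5 value + 3 flags)
        raw = int(line[19 + 8 * m : 24 + 8 * m])
        values.append(None if raw == -9999 else raw / 100.0)
    return station, year, element, values

# Synthetic record for the Melbourne Airport ID, December = 1950 (i.e. 19.5 C)
record = "501948660002016TAVG" + "-9999   " * 11 + " 1950   "
print(parse_ghcn_line(record))
```

Searching the unadjusted file for the prefix and parsing each matching line this way recovers exactly the numbers the CLIMAT form carried.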
You can even go through to the adjusted file and, guess what, it is still unchanged. That is because homogenization rarely modifies recent data. But older data may be modified. GHCN unadjusted does not change, except if the source notifies an error. There are quality controls, which don’t change numbers, but may flag them.
There have been endless articles at WUWT about individual site adjustments, but no-one has tried to calculate the whole picture of the effect of adjustment. With the unadjusted vs adjusted files, it is possible to do that. I have been calculating a global anomaly every month, using the unadjusted GHCN data with ERSST. The June result is here; there is an overview page here, with links to the methods and code. This post compares the results of unadjusted vs adjusted GHCN; the difference is small. Here, from that post, is a plot from 1900 to the start of 2015 showing TempLS (my program) unadjusted (blue) vs adjusted (green) and GISS (brown), as a 12-month running mean. It’s an active plot, so you can see more details at the linked site.
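To give the flavour of that calculation, here is a deliberately crude Python sketch of an area-weighted average of station anomalies, weighting by cos(latitude). TempLS itself does something more careful (a least-squares fit with proper spatial weighting), so treat this only as an illustration of the kind of averaging that can be run identically on the unadjusted and adjusted files:

```python
import math

def global_anomaly(stations):
    """stations: list of (latitude_deg, anomaly_C) pairs.
    Crude area weighting by cos(latitude): a station near the equator
    represents more of the sphere's surface than one near a pole.
    TempLS uses more careful spatial weights; this is illustrative only."""
    num = sum(math.cos(math.radians(lat)) * a for lat, a in stations)
    den = sum(math.cos(math.radians(lat)) for lat, _ in stations)
    return num / den

# Two made-up stations: equator (anomaly 1.0) and 60 degrees (anomaly 2.0)
print(global_anomaly([(0.0, 1.0), (60.0, 2.0)]))
```

Because the same weighting is applied to both files, any difference in the result comes only from the homogenization step, and that difference turns out to be small.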
If you want more convenient access to the station data, I have a portal page here. The heading line looks like this:
The BoM AWS link takes you to this page, listing all station names with links to their current month data page. BoM also posts the metadata for all their stations, and that link takes you to this page, which lists all stations (not just AWS, and including closed stations) with links to metadata. The GHCN Stations button links to this page, which links to the NOAA summary page for each GHCN station by name, or if you click the radio buttons, to station annual data in various formats.
I have shown, for Australia (BoM) at least, that you can follow the unadjusted temperature data right through from within a few minutes of measurement to its incorporation into the global unadjusted GHCN, which is then homogenized for global averages. Of course, I can only show one example of how it goes through without change, but the path is there, and transparent. Those who are inclined to doubt should try to find cases where it is modified.