
Web site of the MULTITEST project


Project funded by the Spanish Ministry of Economy and Competitiveness (CGL2014-52901-P), to be developed between 2015 and 2017.

Principal investigator: Manola Brunet (1)
Official participants: José A. Guijarro (2), José A. López (2), Enric Aguilar (1) and Javier Sigró (1).
External participants: Peter Domonkos and Victor Venema (3).

(1) Universitat Rovira i Virgili, Tarragona, Spain.
(2) State Meteorological Agency (AEMET), Spain.
(3) Meteorological Institute, University of Bonn, Germany.

(Page under construction: Some sections may lack their intended contents.)

(Last update: 11/09/2017)



Presentation

Climatological series are affected by unwanted perturbations due to station relocations and to changes in instrumentation, observation practices or the environment. These perturbations must be removed (homogenization) before analyzing the series, to avoid misleading conclusions about climate variability.

Many homogenization methods have been developed so far. The most relevant were compared in the successful COST Action ES0601 (HOME: 2006-2011). However, homogenization methods, implemented as software packages, have kept evolving since then, so new intercomparisons are needed to assess their performance. Implementing a new edition of that Action would be too costly. The only feasible alternative is therefore to perform automatic comparisons, which restricts the tests to methods that can be run in this mode.

The MULTITEST project was an initiative of Peter Domonkos when he was working at the Centre for Climate Change of the University of Tarragona (Spain). The aim of the project was to update and improve the results of a preliminary comparison exercise (Guijarro, 2011) by using better synthetic datasets of monthly values of temperature and precipitation, and testing more homogenization software packages over a variety of inhomogeneity problems.


Guijarro, JA (2011): Influence of network density on homogenization performance. Seventh Seminar for Homogenization and Quality Control in Climatological Databases jointly organized with the Meeting of COST ES0601 (HOME) Action MC Meeting, Budapest, 24-27/October, WMO WCDMP-No. 78, pp. 11-18.

Methodology

Summary

Synthetic homogeneous databases were generated, each composed of 100 homogeneous series with (mostly) 60 years of monthly values. Then, for every method, database and experimental setting, runs of 100 tests were made by:

  1. Randomly sampling a subset of the series (true solution).
  2. Applying inhomogeneities to them (problem series).
  3. Homogenizing them by the tested method (results, with backward adjustment).
  4. Comparing the results with the true solutions, computing RMSE and errors in trends, means and standard deviations.
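
The four steps above can be sketched as a simple test loop. This is only an illustrative outline, not the scripts used in the project: the function names, the subset size, the single shift applied before a random break point and the "no adjustment" placeholder method are all assumptions made for the example. The shift is applied to the part of the series before the break so that, with backward adjustment, the true solution is the original series.

```python
import math
import random


def rmse(a, b):
    """Root mean squared error between two series of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def add_break(series, position, shift):
    """Step 2 (hypothetical inhomogeneity): shift all values before `position`."""
    return [v + shift if i < position else v for i, v in enumerate(series)]


def no_adjustment(problem):
    """Placeholder for a tested method: returns the problem series unchanged."""
    return problem


def run_tests(master, homogenize, n_tests=100, subset_size=10, seed=0):
    """Run `n_tests` tests and return the mean RMSE of each test."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_tests):
        # Step 1: randomly sample a subset of the master series (true solution).
        truth = rng.sample(master, subset_size)
        # Step 2: apply one assumed break per series (problem series).
        problem = [
            add_break(s, rng.randrange(1, len(s)), rng.gauss(0.0, 1.0))
            for s in truth
        ]
        # Step 3: homogenize the problem series with the tested method.
        result = homogenize(problem)
        # Step 4: compare the results with the true solution.
        scores.append(sum(rmse(r, t) for r, t in zip(result, truth)) / subset_size)
    return scores


if __name__ == "__main__":
    # Toy master network: 100 series of 720 monthly values (60 years).
    toy_rng = random.Random(1)
    master = [[toy_rng.gauss(12.0, 5.0) for _ in range(720)] for _ in range(100)]
    scores = run_tests(master, no_adjustment)
    print(len(scores), "tests run; mean RMSE without adjustment:", sum(scores) / len(scores))
```

Running the loop with the placeholder gives the RMSE of the unadjusted problem series, which is a natural baseline against which any real package could be compared.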

Note that, because these methods are applied automatically, they are run with default settings, and their results may be worse than those obtained when they are properly tuned to each problem network.

Synthetic master networks

Three master networks of mean monthly temperature, named Tm1, Tm2 and Tm3, were generated following these steps:

  1. 100 station locations were distributed randomly over a 4° x 3° lon-lat area.
  2. Mean monthly homogenized temperatures from Valladolid (Duero basin, Spain) were assigned to the first point, located at the center of the area.
  3. The closest location to this first point was assigned the same series plus white noise drawn from C*N(0, 1.5), and the same procedure was applied to attribute data to the closest point to any already assigned points, until the whole network was filled with data.
  4. Three different coefficients C (0.18, 0.30 and 0.65) were used to obtain the three master networks, with decreasing cross-correlation between the simulated stations.
  5. Finally, series were biased to account for simulated elevation, a trend of 2°C/100yr was added, and their yearly cycle amplitude was varied +/-20%.
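
Steps 1 to 4 above can be sketched as follows. This is a minimal illustration under several assumptions, not the code used to build Tm1, Tm2 and Tm3: the seed series is a hypothetical sinusoidal stand-in for the Valladolid data, 1.5 is taken to be the standard deviation of the noise, each new point is assumed to inherit the series of its nearest already assigned neighbor, and step 5 (elevation bias, trend and amplitude changes) is left out.

```python
import math
import random


def seasonal_stand_in(n_months=720):
    """Hypothetical seed series: a plain yearly cycle replacing the real data."""
    return [12.0 - 8.0 * math.cos(2.0 * math.pi * (m % 12) / 12.0) for m in range(n_months)]


def make_network(seed_series, c, n_stations=100, rng=None):
    """Grow a synthetic network outwards from a central seed series."""
    rng = rng or random.Random(0)
    # Step 1: random station locations in a 4 x 3 degree lon-lat box.
    locs = [(rng.uniform(0.0, 4.0), rng.uniform(0.0, 3.0)) for _ in range(n_stations)]
    # Step 2: the first point sits at the center and receives the seed series.
    locs[0] = (2.0, 1.5)
    series = {0: list(seed_series)}
    unassigned = set(range(1, n_stations))
    while unassigned:
        # Step 3: find the unassigned location closest to any assigned one...
        _, new, parent = min(
            (math.dist(locs[u], locs[a]), u, a) for u in unassigned for a in series
        )
        # ...and give it the parent's series plus white noise C * N(0, 1.5)
        # (assumption: 1.5 is a standard deviation, and the parent is the nearest point).
        series[new] = [v + c * rng.gauss(0.0, 1.5) for v in series[parent]]
        unassigned.remove(new)
    return locs, [series[i] for i in range(n_stations)]


if __name__ == "__main__":
    # Step 4: one network per coefficient C.
    for c in (0.18, 0.30, 0.65):
        locs, network = make_network(seasonal_stand_in(), c)
        print("C =", c, "stations:", len(network), "months:", len(network[0]))
```

Because noise accumulates along the chain of neighbors, stations far from the center differ more from the seed series, and a larger C lowers the cross-correlation of the whole network.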

Three other master networks of monthly precipitation were built, simulating three different climates: Atlantic temperate (PEir), Mediterranean (PMca) and Monsoonal (PInd). Real series from Ireland and Mallorca, and gridded series from SW India, were respectively used to derive variograms, gamma coefficients and frequencies of zeroes, which were used to compute their synthetic series by means of the R package gstat, preserving the spatial correlation structure.
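
As a rough illustration of how gamma coefficients and frequencies of zeroes can define a monthly precipitation distribution, the sketch below draws values from a zero-inflated gamma distribution for a single station. It deliberately omits the variogram-based spatial simulation carried out with gstat, so it does not reproduce the spatial correlation structure, and the function name and all parameter values are hypothetical.

```python
import random


def sample_monthly_precip(n_months, p_zero, shape, scale, rng=None):
    """Draw monthly totals: zero with probability `p_zero`, otherwise gamma distributed."""
    rng = rng or random.Random(0)
    return [
        0.0 if rng.random() < p_zero else rng.gammavariate(shape, scale)
        for _ in range(n_months)
    ]


if __name__ == "__main__":
    # Hypothetical parameters for a climate with frequent dry months.
    series = sample_monthly_precip(720, p_zero=0.15, shape=1.2, scale=35.0)
    dry = sum(1 for v in series if v == 0.0)
    print("months:", len(series), "dry months:", dry)
```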

Tested homogenization packages

The homogenization packages to be tested were chosen among those that participated in the COST Action ES0601, with the requirement that they should be able to run in a completely automatic mode, implemented in bash scripts on a Linux PC:

A few implementation details:

(Other packages can be added if developers or advanced users provide scripts that read input problem series and return their homogenized solutions.)


Domonkos P (2015): Homogenization of precipitation time series with ACMANT. Theor. Appl. Climatol., 122:303-314.
Guijarro, JA (2016). https://cran.r-project.org/web/packages/climatol/index.html
Mestre O, Domonkos P, Guijarro J, Aguilar E (2012). http://www.homogenisation.org/Documents/HOMER.R
Szentimrey T. https://www.met.hu/en/omsz/rendezvenyek/homogenization_and_interpolation/software (Accessed 29/9/2011)
Wang XL, Feng Y (2013). http://etccdi.pacificclimate.org/software.shtml

Homogenization results

The results of the tests are evaluated through the comparison of the solutions provided by the tested packages with the true original homogeneous series. Four metrics have been computed from these comparisons:

  1. Root Mean Squared Error (RMSE) measures how far the returned homogenized series are from the original.
  2. Errors in the trends of the returned solutions, compared with those of the original series, are another important metric for checking the impact of homogenization on climate change detection assessments.
  3. Errors in the means of returned series can be important when producing climatic maps from them.
  4. Errors in the standard deviations would show the impact in studies of variability and extreme values, but they would be more relevant in the case of daily series than for the monthly series simulated here.
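
The four metrics can be sketched for a single pair of series as shown below. This is an assumed formulation for illustration only: the source does not state how trends were estimated, so an ordinary least-squares slope per time step is used here, and the function names are hypothetical.

```python
import math
import statistics


def ols_slope(y):
    """Least-squares linear trend of a series, per time step (assumed estimator)."""
    n = len(y)
    x_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def evaluate(homogenized, original):
    """Compare a homogenized series with the true original series."""
    n = len(original)
    return {
        # Metric 1: root mean squared error.
        "rmse": math.sqrt(sum((h - o) ** 2 for h, o in zip(homogenized, original)) / n),
        # Metric 2: error in the linear trend.
        "trend_error": ols_slope(homogenized) - ols_slope(original),
        # Metric 3: error in the mean.
        "mean_error": statistics.fmean(homogenized) - statistics.fmean(original),
        # Metric 4: error in the standard deviation.
        "sd_error": statistics.stdev(homogenized) - statistics.stdev(original),
    }


if __name__ == "__main__":
    original = [10.0, 11.0, 12.5, 12.0, 13.0, 14.5]
    homogenized = [10.4, 11.3, 12.4, 12.1, 13.0, 14.5]
    for name, value in evaluate(homogenized, original).items():
        print(name, round(value, 3))
```

A series that is offset by a constant from the original has a non-zero RMSE and mean error but no trend or standard deviation error, which is why the metrics are reported separately.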

(Results for metrics 3 and 4 will be added soon.)

It is important to note that all these metrics have been computed only on the series whose problem version contained inhomogeneities. New calculations involving all series will be done in the future, to account for cases in which the packages may have corrected false inhomogeneities.

Results are presented in separate web pages, grouped by types of experiments, by means of box-and-whisker plots showing the spread of the solutions provided by every package:

Precipitation

Temperature: | First five experiments | Several experiments with Tm2



Summary

(Summary tables of the results will be presented here in the near future.)



Scripts

(Master networks and scripts needed to run these tests will be available here in the near future for the sake of transparency and reproducibility. All this material first needs to be tidied up and organized to avoid confusing potential users.)