This site is maintained because it has been publicized in various documents and other web sites, but most of its former contents have been moved to other sites.

Data rescue activities

Current observations are generally either acquired automatically or routinely entered into computer files, ready for use in different applications. But a wealth of old data remains only on the original paper records, and Data Rescue (DARE) activities are needed to ensure their preservation and digitization, extending our knowledge of climate further back in time. National Meteorological and Hydrological Services (NMHS) have DARE projects at different stages of development, and several international initiatives promote DARE activities worldwide.

Data Rescue activities can be seen as composed of a few distinct steps:

  1. Imaging of the historic data documents, either by photographing or scanning them into standard digital image formats. Any analogue micro-forms must also be converted to digital files. Afterwards, the paper documents must be preserved in adequate archives, since the digital images should be the primary source for the following steps. Conservation of the digital images will include backup copies and a policy of converting them to newer open formats as these emerge, to avoid obsolescence.
  2. Digitizing the images obtained in step 1 to allow computer processing of the data. Scanned typewritten documents can be treated with an OCR (Optical Character Recognition) program; otherwise the values must be keyed in manually, either with ad hoc input programs or with simple spreadsheets. After some basic quality control checks, the data can be transferred to the preferred database system.
  3. Data rescue ends here, but before being used for climate analysis, rescued series must be further quality controlled and homogenized to remove the frequent alterations due to non-climatic factors. Errors introduced during the digitization process must be corrected, but otherwise the raw series must be kept untouched, since different homogenization procedures can yield (slightly) different homogenized series. Homogenized series, however, will be the preferred source for climate studies.
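
As an illustration of the basic quality controls mentioned in step 2, the following minimal Python sketch flags suspicious keyed values without altering them, in line with the requirement to keep the raw series untouched. The record layout, the function name basic_qc and the plausibility limits (-90 to 60 degrees Celsius) are assumptions chosen for this example, not values prescribed by any guideline.

```python
def basic_qc(records, tmin_limit=-90.0, tmax_limit=60.0):
    """Return a list of (index, message) flags; records are never modified.

    The limits are illustrative assumptions, not official thresholds.
    """
    flags = []
    for i, rec in enumerate(records):
        tmin, tmax = rec["tmin"], rec["tmax"]
        # Gross range check on each value that is present.
        for name, value in (("tmin", tmin), ("tmax", tmax)):
            if value is not None and not (tmin_limit <= value <= tmax_limit):
                flags.append((i, f"{name} out of range: {value}"))
        # Internal consistency: the daily minimum cannot exceed the maximum.
        if tmin is not None and tmax is not None and tmin > tmax:
            flags.append((i, f"tmin {tmin} exceeds tmax {tmax}"))
    return flags


# Hypothetical keyed records (degrees Celsius); None marks an illegible value.
keyed = [
    {"date": "1901-01-01", "tmin": 2.1, "tmax": 9.4},
    {"date": "1901-01-02", "tmin": 12.0, "tmax": 8.7},   # possibly swapped columns
    {"date": "1901-01-03", "tmin": 1.5, "tmax": 105.0},  # possibly a lost decimal point
    {"date": "1901-01-04", "tmin": None, "tmax": 7.9},
]

flags = basic_qc(keyed)
for index, message in flags:
    print(keyed[index]["date"], message)
```

Flagged values would then be checked against the digital image of the original document, so that only genuine keying errors are corrected.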

According to an old Chinese proverb, an image is worth more than a thousand words, but it also occupies more space in digital storage! To limit the file size of images, binary black-and-white scan modes are often used. But since the price of computer storage is continuously decreasing, grey-scale or color modes are to be preferred: a single black-and-white threshold cannot suit every page of a document, so information will be lost in the faintest sheets, where a grey-scale image could still be interpreted by the human eye.
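
The trade-off can be illustrated with a small Python sketch. The pixel values, the fixed threshold of 128 and the page dimensions are hypothetical, and the sizes are for uncompressed images only, so real files will usually be smaller.

```python
def binarize(row, threshold=128):
    """Map 8-bit grey values (0 = black, 255 = white) to 1 bit: 0 = ink, 1 = paper."""
    return [0 if value < threshold else 1 for value in row]


def uncompressed_bytes(width, height, bits_per_pixel):
    """Raw image size in bytes, ignoring any compression or file headers."""
    return width * height * bits_per_pixel // 8


# Hypothetical pixel rows: paper background crossed by three ink strokes.
strong_page = [235, 40, 230, 35, 240, 45, 238]    # dark ink on clean paper
faint_page = [240, 200, 238, 195, 242, 205, 239]  # faded ink

strong_bw = binarize(strong_page)
faint_bw = binarize(faint_page)

print("strong page, ink pixels kept:", strong_bw.count(0))
print("faint page, ink pixels kept:", faint_bw.count(0))
print("faint page, grey-level contrast:", max(faint_page) - min(faint_page))

# Assumed page of 2480 x 3508 pixels, in three scan modes.
for label, bits in (("black-and-white", 1), ("grey scale", 8), ("color", 24)):
    print(label, uncompressed_bytes(2480, 3508, bits), "bytes")
```

With the same threshold, the strokes on the strong page survive binarization while those on the faint page vanish entirely, although the grey values still differ enough from the background to be read. The price is that grey scale needs eight times, and 24-bit color twenty-four times, the raw storage of the binary mode.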

Additional information can be found at the WCDMP web page, including the publication Guidelines on Climate Data Rescue (WMO/TD No. 1210).