
GHCN v3 in SQL

The Global Historical Climatology Network-Monthly (GHCN-M) dataset from NCDC is a particularly important dataset if your research deals with climate data. It is widely accepted, and its major advantages are quality control and the combination of a variety of data sources. I have used it several times as reference data for validating calculated climate surfaces. It is also great for assessing the uncertainty of climate interpolation methods.

But it is distributed only as text files in a specific format, so you have to write a parser to fetch the data.
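To give an idea of what such a parser involves, here is a minimal Python sketch for a single line of a GHCN-M v3 data file. It assumes the fixed-width layout as I understand it from the v3 README: an 11-character station ID, a 4-digit year, a 4-character element code, then twelve monthly fields of 8 characters each (a 5-character value followed by three 1-character flags), with -9999 marking a missing value and temperatures stored in hundredths of a degree Celsius. Check these offsets against the README shipped with your copy of the data. The names `MonthlyRecord` and `parse_ghcnm_line` are made up for this example.

```python
from typing import List, NamedTuple, Optional

MISSING = -9999  # missing-value marker (assumed from the v3 README)


class MonthlyRecord(NamedTuple):
    station_id: str
    year: int
    element: str
    values: List[Optional[float]]  # 12 monthly values, None where missing


def parse_ghcnm_line(line: str) -> MonthlyRecord:
    """Parse one fixed-width line of a GHCN-M v3 .dat file (assumed layout)."""
    station_id = line[0:11]
    year = int(line[11:15])
    element = line[15:19]
    values: List[Optional[float]] = []
    for month in range(12):
        start = 19 + month * 8            # each month takes 8 characters
        raw = int(line[start:start + 5])  # 5-char value; the 3 flags are skipped
        # Temperature elements are assumed to be in hundredths of a degree C.
        values.append(None if raw == MISSING else raw / 100.0)
    return MonthlyRecord(station_id, year, element, values)
```

The three flag characters after each value are ignored here; a real loader would probably want to keep at least the quality control flag.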

This week I decided to load GHCN v3 into MySQL to make fetching more flexible. Now I can export different subsets of the data to CSV files just by composing the proper SELECT query. That significantly sped up my experiments with interpolation techniques.
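The query-to-CSV step can be sketched in Python roughly as follows. To keep the example runnable without a server it uses the standard library's sqlite3 with an in-memory database instead of MySQL, and the `monthly` table with its columns is purely hypothetical: it is not the schema of the shared scripts. With a MySQL driver the export function would look much the same, although the parameter placeholder style may differ.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, params, out):
    """Run a SELECT and write the result, with a header row, as CSV to `out`."""
    cur = conn.execute(query, params)
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # column names
    writer.writerows(cur)


# Stand-in database with a hypothetical table layout (not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly (station_id TEXT, year INTEGER, month INTEGER, tavg REAL)"
)
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?, ?, ?)",
    [
        ("10160355000", 1878, 1, 8.9),
        ("10160355000", 1878, 2, 9.5),
        ("10160355000", 1879, 1, 10.2),
    ],
)

# Export one year of one station; write to a file object in real use.
buf = io.StringIO()
export_query_to_csv(
    conn,
    "SELECT station_id, year, month, tavg FROM monthly "
    "WHERE year = ? ORDER BY month",
    (1878,),
    buf,
)
```

Changing the subset then only means changing the WHERE clause, which is the flexibility the database gives over the raw text files.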

I am sharing these SQL scripts so that other researchers can load GHCN v3 into their own SQL servers. You can restore GHCN on your server and run queries against it. Just download the script and execute it, and you can get the data you need. Fast :)

The scripts do not contain CREATE DATABASE statements, so create an empty database by hand and then execute the appropriate script.

Atmospheric pressure data archive

For those who want to practice data processing and time series model fitting, I am publishing the following archives:

2011 whole year atmospheric pressure archive
2012 whole year atmospheric pressure archive
2013 whole year atmospheric pressure archive

The files are compressed CSV. Each line is a one-minute average of measurements from a sensor that reports values every 5 seconds.

The sensor is located at 55.73080°N 37.42206°E, at an altitude of 221 m.
I use a Toradex OakP v1.2a sensor.

The data is raw in the sense that it may contain gaps and slight time shifts.
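A first processing exercise could be locating those gaps. The Python sketch below assumes you have already parsed the timestamp column of a file into a sorted list of datetime objects (the exact column layout is not described here, so the parsing step is left out). It reports every pair of consecutive records spaced further apart than one minute plus a tolerance for the slight time shifts. The function name `find_gaps` and the 5-second default tolerance are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step=timedelta(minutes=1), tolerance=timedelta(seconds=5)):
    """Return (previous, current) pairs whose spacing exceeds step + tolerance."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > step + tolerance:
            gaps.append((prev, cur))
    return gaps


# Tiny made-up series: a 2-second shift (tolerated), then a gap of several minutes.
sample = [
    datetime(2011, 1, 1, 0, 0, 0),
    datetime(2011, 1, 1, 0, 1, 2),
    datetime(2011, 1, 1, 0, 5, 0),
]
sample_gaps = find_gaps(sample)
```

Once the gaps are known you can decide whether to interpolate across them or to split the series into continuous segments before fitting a model.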

Feel free to use it. An acknowledgement is appreciated if you use the data in your research ;-)