Combining timeseries data with RDF

The Resource Description Framework (RDF) was initially designed as a metadata model to represent information on the Web.

In the MELODIES project we are using RDF for our data, with the goal of being able to link our measurements and results with other open datasets. We are using Strabon as the repository for RDF triples, and SPARQL as the query language to retrieve them.

But is it necessary to store every single data value as RDF?

Storing all the values as RDF makes sense if we need to filter data based on its values, and that is often the case. However, since our objective is to filter and link data at the level of raster maps or time series, we decided to refer to them as groups of values rather than to each single value (e.g. each observation value at a given location and time, or each raster pixel).

Part of our data consists of time series with thousands of individual values (such as the evolution of water level or water quality). The main purpose of linking time series data is not necessarily to access each individual value as RDF, but to access the whole series. For this reason, only the catalog is stored as RDF, while the individual values are stored in relational databases, which are more convenient for this kind of data.

In the RDF catalog, a link is stored instead of the actual values of a time series. This link points to a web service and includes the parameters needed to retrieve the values of that time series as WaterML.
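As a rough illustration of this idea, the sketch below writes one catalog entry as N-Triples, where the last triple holds the link to the web service instead of the values. The namespaces, the `valuesService` property, the series identifier and the service URL are all hypothetical and are not the vocabulary actually used in MELODIES.

```python
# Hypothetical namespaces and identifiers, for illustration only.
CATALOG_NS = "http://example.org/catalog/"
VOCAB_NS = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def catalog_triples(series_id, label, values_url):
    """Return N-Triples lines describing one time series in the catalog."""
    subject = f"<{CATALOG_NS}{series_id}>"
    return [
        f"{subject} <{RDF_TYPE}> <{VOCAB_NS}TimeSeries> .",
        f'{subject} <{RDFS_LABEL}> "{label}" .',
        # The values themselves are not stored, only the link to fetch them.
        f"{subject} <{VOCAB_NS}valuesService> <{values_url}> .",
    ]


triples = catalog_triples(
    "series-42",
    "Water level at station A",
    "http://example.org/rest/GetValues?location=NET:A&variable=NET:level",
)
print("\n".join(triples))
```

A client that finds this entry through a SPARQL query only needs to follow the URL in the last triple to obtain the values.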

WaterML stands for Water Markup Language. It is an XML schema designed for the transfer of hydrologic data over the Internet. It follows the information model of the ODM (Observations Data Model) database, with the addition of elements for querying and transferring data. It was developed by CUAHSI as the data format for the WaterOneFlow web services.
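To give an idea of what a client does with such a response, here is a minimal sketch that reads timestamps and values from a heavily simplified WaterML fragment. It assumes the WaterML 1.1 namespace and the `value` elements with a `dateTime` attribute; a real response carries many more elements (site, variable, units, qualifiers) that are ignored here, and the sample numbers are made up.

```python
import xml.etree.ElementTree as ET

# Assumed WaterML 1.1 namespace; adjust it if the service uses another version.
WML_NS = {"wml": "http://www.cuahsi.org/waterML/1.1/"}

# Simplified, made-up fragment of a WaterML response.
SAMPLE = """<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
  <timeSeries>
    <values>
      <value dateTime="2014-03-01T00:00:00">1.25</value>
      <value dateTime="2014-03-01T01:00:00">1.31</value>
    </values>
  </timeSeries>
</timeSeriesResponse>"""


def read_values(waterml_text):
    """Return a list of (timestamp, value) pairs found in the document."""
    root = ET.fromstring(waterml_text)
    return [
        (element.get("dateTime"), float(element.text))
        for element in root.iterfind(".//wml:value", WML_NS)
    ]


observations = read_values(SAMPLE)
for timestamp, value in observations:
    print(timestamp, value)
```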

CUAHSI’s WaterOneFlow is a family of SOAP web services for the transfer of hydrologic data between hydrologic data servers and users’ computers.

To provide a link in the catalog for retrieving the time series values, we have implemented a set of REST web services that follow the same structure and parameters as the WaterOneFlow web services.
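The kind of link stored in the catalog could be built as in the following sketch. The parameter names mirror those of the WaterOneFlow GetValues operation (location, variable, startDate, endDate); the base URL, the path layout and the site and variable codes are assumptions made for the example, not the actual MELODIES endpoints.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base URL of the REST services.
BASE_URL = "http://example.org/rest"


def get_values_url(location, variable, start_date, end_date):
    """Build a GetValues-style REST link for one time series."""
    query = urlencode(
        {
            "location": location,
            "variable": variable,
            "startDate": start_date,
            "endDate": end_date,
        }
    )
    return f"{BASE_URL}/GetValues?{query}"


url = get_values_url("NET:A", "NET:level", "2014-03-01", "2014-03-31")
print(url)

# Decoding the query string gives back the original parameters.
print(parse_qs(urlsplit(url).query))
```

Because the parameters are part of the URL, the link can be stored as a plain resource in the catalog and followed by any HTTP client.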

One advantage of this setup is that, if the data is already available online in a standard format, the existing access methods can still be used. The only thing needed is an RDF catalog that allows other applications to discover which data is available and how to obtain the desired values.

Since the data does not have to be converted to a new format, this approach is useful for making existing data available as Linked Open Data without having to agree on a new data model or define new ontologies for the values.
