What is Linked Data?

Well, that’s a tricky question. A quick online search will turn up lots of explanations of linked data, but most of these dive headlong into technical details and processes without coming up for air. When learning about something new from scratch, I personally find that non-technical introductions are the key, and Martin Moore’s article “A dummy’s introduction to linked data” (me being the dummy) is a great start.

So, what did we learn from this article? Well, we learnt that at its heart, linked data is based on the same system of links that we’re used to using in our browsers every day – but with two key differences:

Firstly, the things that we’re linking are different. In our everyday use, a link is a connection between two web pages – a one-way wormhole between two HTML files. In the linked data world, the link is a connection between two specific, discrete “things”. These things can be anything from “a cat” to “youth” to “happiness” to “your mum”.

Of course, there’s no actual wormhole between a cat and your mum – which may be great news for your mum – but in the linked data world there is a web-based definition of your mum (e.g. her Facebook page) and a similar one for a cat (e.g. the Wikipedia article on cats). If created properly, these web-based definitions are unique and unequivocal; the link that we build in linked data actually joins together these two definitions, one of your mum and one of a cat.

The second difference between “normal” links and the links in linked data is that the link itself provides you with contextual information, letting you know what the relationship between the two things is. In webpages, the context for a link is usually clear from the text which surrounds it, for example:

Your mum is hostile towards cats. For more information about cats click here

In the linked data world, this same information could be provided as follows:

Your mum dislikes Cats

It doesn’t have the poetry of the first sentence, but it actually tells us more. It defines your mum (or in this case the concept of “your mum”, via the hilarious Urban Dictionary) and a cat, as before. It also uses a specific definition of the term “dislikes”, i.e. “hostile” in this case. So, in three pieces of information, we have uniquely defined the subject (your mum), the object of the relationship (a cat) and the nature of the relationship between them (is hostile). These three pieces of information form the core of all linked data.
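If you’d like to see those three pieces written down the way a program might hold them, here is a minimal Python sketch. The shape (subject, relationship, object) comes straight from the sentence above and is often called a “triple”. Everything else is an assumption made for illustration: the `example.org` addresses, the `Triple` class and the `describe` helper are placeholders of my own, not the real links behind the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One linked data statement: a subject, a relationship and an object."""
    subject: str
    predicate: str  # the nature of the relationship
    object: str


# Hypothetical placeholder addresses, standing in for the real
# web-based definitions that the links in the article pointed to.
YOUR_MUM = "https://example.org/things/your-mum"
HOSTILE = "https://example.org/vocab/hostile"
CAT = "https://example.org/things/cat"

statement = Triple(subject=YOUR_MUM, predicate=HOSTILE, object=CAT)


def describe(triple: Triple) -> str:
    """Write the statement as three bracketed addresses on one line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.object}> ."


print(describe(statement))
```

The point is simply that each of the three parts is an address rather than a word, so a machine never has to guess which “cat” or which “hostile” is meant.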

Now, you may have noticed that the term “definition” has come up quite a lot in this article so far – and for good reason. When we are creating semantic links between things in the linked data world, the references that we use to define these things are absolutely vital. We can identify people and objects by the addresses of their Facebook and Wikipedia pages. Beyond this, we can define concepts, relationships, technical terms etc. using specialist “vocabularies”.

The linked data community is working hard to develop systematic, unequivocal, machine-readable vocabularies – called “ontologies” – which we can use to define things and the relationships between them. The ontological landscape is changing and growing every day. In the example above, I used the semantic science ontology’s definition of the term “hostile”. The ontology itself was found using the Linked Open Vocabularies engine – a quick look there will show you the scale of the libraries that are being created.

And these ontologies are also the key which unlocks the enormous potential of linked data. Once you have unique, machine-readable definitions for all conceivable things and the relationships between them then, using the right software, you can ultimately analyse anything which is related – in any way – to anything else. Once all data is linked, you just have to follow the links.
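To make “following the links” a little more concrete, here is a rough Python sketch under some loud assumptions: the handful of statements, the `ex:` shorthand names and the `reachable` function are all invented for illustration, and real linked data software is far more capable than this. It just starts at one thing and keeps hopping from subject to object until there is nowhere new to go.

```python
from collections import deque

# A tiny, made-up set of statements (subject, relationship, object).
# The "ex:" names are hypothetical shorthand, not real identifiers.
triples = [
    ("ex:your-mum", "ex:hostileTo", "ex:cat"),
    ("ex:cat", "ex:isA", "ex:mammal"),
    ("ex:mammal", "ex:isA", "ex:animal"),
    ("ex:dog", "ex:isA", "ex:mammal"),
]


def reachable(start, statements):
    """Return every thing that can be reached from `start` by
    repeatedly following links from a subject to its object."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for subject, _relationship, obj in statements:
            if subject == current and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


print(sorted(reachable("ex:your-mum", triples)))
```

Starting from your mum, the sketch finds the cat, then mammals, then animals, but never the dog, because no link leads there from where we started.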

The potential is mind-boggling, isn’t it?
