RDF and the Time Dimension – Part 2

December 10, 2008

In part 1 I claimed that you will run into problems when you try to model dimensional data in RDF: In basic RDF there is no way to properly model any form of dimension and with Named Graphs we are only able to model discrete dimensions. I received some feedback saying that the way I described the problem (in particular the distinction between discrete and continuous dimensions) was a bit unfortunate (and I agree). So before I get to describing my solutions, I’d like to rephrase the problem.

RDF and Dimensions – The Problem

What we want to model in OxPoints is the development of the University of Oxford over time. A very simple example would be the name of the Oxford University Computing Services (OUCS). OUCS exists since 1957 and was originally named “Computing Laboratory”. In 1969 OUCS was split in two, one branch becoming “Oxford University Computing Services”. Let’s say we simply want to model, that a resource identified by “OUCS” existed since 1957 and was named “Computing Laboratory” from 1957 to 1969. It then changed its name to “Oxford University Computing Services”. Now, why can’t we encode this in RDF? In part 1 I’ve outlined a proof sketch describing why the basic form of RDF does not support describing this kind of conditional data: The problem is RDFs notion of entailment. However, with the extension of “named graphs”, encoding this data should be an easy exercise: Create one graph named “1957-1969” that describes OUCS as “Computing Laboratory” and one graph named “1969-“ that describes it as “Oxford University Computing Services”. As I was pointed out (and I have to agree with it), not having enough names for all possible values does not stop us from talking and reasoning about a dimension (or call it a set of values). Take the real numbers as an example. There are many more real numbers than we can make up names for them. However, this does not stop us from happily talking and proving concepts about them. So what is the difference with RDF? If I asked you whether 2 is in the set of ]1,2[ \subset \mathbb{R}, then you would probably tell me, that it is not. However, you can only give me what I see as really the correct answer if you have the same understanding of the name ]1,2[ as I have (which is that ]a,b[ describes the open interval: $latex x \in \mathbb{R} | a < x < b}). If we have the common understanding, then we can deduce information from a given name. However, for a computer the name “1957-1969” is only a bunch of characters without any more meaning attached to it. Therefore a computer could not tell me whether or not 1960 is part of “1957-1969”. The name of a named graph is just a name (or more precisely a URI), and nothing that you could (or should) deduce information from. So how are we to encode our data in such a way, that allows us to query for:

  • What is (was) the name of OUCS today (in 1960)
  • What names did OUCS have from 1060-1980
  • When is the following true: OUCS is called “Oxford University Computing Services”

I have come up with two ideas on how to solve this problem. One using reification and relaxing the notion of entailment and one using named graphs.

Read the rest of this entry »

RDF and the Time Dimension – Part 1

November 28, 2008

In RDF – an Introduction I claimed that introducing any kind of continuous dimension (for example, a time dimension) is not possible, if you follow the official interpretation given in the RDF specifications. Actually it is even worse: In basic RDF even discrete dimensions cannot be modeled.

In this post I will elaborate on my claims giving a detailed description of the problem. In part 2 I will propose a new interpretation of RDF Graphs, allow for dimensions into RDF. If you are new to RDF, or terms such as reification, entailment, fact or model don’t mean much to you, you might want to read my introduction to RDF since we need these terms to talk about RDF’s incapability of modeling dimensions. I will try to present everything in a semi formal way, using some mathematical notation, but to always try to keep the post understandable for those that would not define themselves as “math people”. However, I feel that a certain amount of formality is necessary, to outline the problem and proposed solution.

Continuous and Discrete Dimensions

Let’s start by trying to give you an idea, of what I mean by continuous and discrete dimensions in RDF. Think of a dimension as a variable d that can take values from a specified set (e.g. 1 and 2). You now define your triples (or facts) relative to the d. This means, that for d = 1 you have a different set of facts than for d = 2. Whether I now speak of a continuous or discrete dimension depends on the cardinality (number of elements) of the value set for d. If the value set contains an infinite number of elements I speak of a continuous dimension and if the number of elements is finite I speak of a discrete dimension. Since in our example the cardinality of the value set was 2 (|{1,2}|=2) we have a discrete dimension. Read the rest of this entry »

OxPoints and the Semantic Web

November 22, 2008

In OxPoints – Providing geodata for the University of Oxford I told you about the old OxPoints system which is currently providing geolinking information for the University of Oxford and talked about what is wrong with it and why we want to start from scratch to create a new OxPoints.

Before we start talking about solutions let’s start off by defining what we want the new system to look like:

Blackfriars College on Google Maps

Blackfriars Hall on Google Maps

As we have seen, the old OxPoints system stores geo- and some additional information (such as for example images and postal addresses) on all 38 colleges and the other important university entities. It is able to export its information as KML (an XML based language for expressing geographic annotations) which can be imported into, for example, Google Maps or Google Earth. A simple frontend allows users to query the data and display the results directly in either Google Maps or Google Earth, or as KML.

But even though it wouldn’t tell you, the old system is already a bit more powerful than that. Let’s have a look at a typical OxPoints record like the one on Blackfriars: Read the rest of this entry »