Erewhon workshop

December 15, 2008

On Friday December 5th we held our first Erewhon workshop — an opportunity for us to tell people about the aims of the project, get their feedback on some of our initial ideas, and give them a chance to make suggestions of their own. Despite a few last-minute cancellations we still had about 40 attendees (staff and students) — not a bad turnout for the last day of term!

The first half of the workshop was all about ‘setting the scene’, showing the technological landscape we’re working in. Tim started this off with a lively overview of the capabilities of smartphones, with demonstrations of a wide variety of tools on the iPhone, the G1 and the HTC TyTN — the aim being to show people just how much functionality is already available and in use now (and, by extension, what imagined possibilities might be reality by this time next year…). We wanted to make it clear that we’re not just talking about browsing the web on a small screen; we’re talking about the phone as a platform and an interface in its own right.

From the technological landscape we moved to the physical landscape, and our attempts to map it. I gave an overview of the work we’d done so far on ‘OxPoints’ (the original name for our fledgling geo database): the data we’d amassed, the simple services already available that make use of that data (more about that on the handout; see link below), and the direction the new data model was taking. Building on this, Sebastian then talked about some of the more exciting future possibilities for mapping, creating visualisations, and enhancing existing services.



RDF and the Time Dimension – Part 2

December 10, 2008

In part 1 I claimed that you will run into problems when you try to model dimensional data in RDF: in basic RDF there is no way to properly model any form of dimension, and with Named Graphs we are only able to model discrete dimensions. I received some feedback saying that the way I described the problem (in particular the distinction between discrete and continuous dimensions) was a bit unfortunate, and I agree. So before I describe my solutions, I’d like to rephrase the problem.

RDF and Dimensions – The Problem

What we want to model in OxPoints is the development of the University of Oxford over time. A very simple example is the name of the Oxford University Computing Services (OUCS). OUCS has existed since 1957 and was originally named “Computing Laboratory”. In 1969 it was split in two, one branch becoming “Oxford University Computing Services”. Let’s say we simply want to model that a resource identified by “OUCS” has existed since 1957, was named “Computing Laboratory” from 1957 to 1969, and then changed its name to “Oxford University Computing Services”.

Now, why can’t we encode this in RDF? In part 1 I outlined a proof sketch of why the basic form of RDF does not support this kind of conditional data: the problem is RDF’s notion of entailment. With the extension of “named graphs”, however, encoding this data should be an easy exercise: create one graph named “1957-1969” that describes OUCS as “Computing Laboratory” and one graph named “1969-” that describes it as “Oxford University Computing Services”.

As was pointed out to me (and I have to agree), not having enough names for all possible values does not stop us from talking and reasoning about a dimension (or call it a set of values). Take the real numbers as an example. There are many more real numbers than we could ever make up names for, yet this does not stop us from happily talking about them and proving things about them. So what is the difference with RDF? If I asked you whether 2 is in the set $latex ]1,2[ \subset \mathbb{R}$, you would probably tell me that it is not. However, you can only give me what I consider the correct answer if you understand the name ]a,b[ the same way I do, namely as the open interval $latex \{x \in \mathbb{R} \mid a < x < b\}$. If we share that understanding, we can deduce information from a given name.

For a computer, however, the name “1957-1969” is only a bunch of characters with no further meaning attached. A computer therefore could not tell me whether or not 1960 is part of “1957-1969”. The name of a named graph is just a name (or, more precisely, a URI), and nothing that you could (or should) deduce information from. So how are we to encode our data in a way that allows us to query for:

  • What is (was) the name of OUCS today (in 1960)?
  • What names did OUCS have from 1960 to 1980?
  • When is the following true: OUCS is called “Oxford University Computing Services”?
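
To make these three queries concrete, here is a minimal sketch in plain Python. It is neither of the two solutions mentioned in this post, and it uses no RDF library; it only illustrates what becomes answerable once the interval is structured data rather than an opaque name. The class `NamePeriod`, the helper functions, the year granularity, and the half-open interval convention are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NamePeriod:
    """Hypothetical record: a name that is valid for a half-open span of years."""
    name: str
    start: int           # first year the name applies (inclusive)
    end: Optional[int]   # first year it no longer applies; None means still current

    def contains(self, year: int) -> bool:
        return self.start <= year and (self.end is None or year < self.end)

    def overlaps(self, start: int, end: int) -> bool:
        # The query range [start, end] is treated as inclusive on both sides.
        return self.start <= end and (self.end is None or start < self.end)


# The OUCS example from the text, encoded with explicit interval bounds.
OUCS_NAMES = [
    NamePeriod("Computing Laboratory", 1957, 1969),
    NamePeriod("Oxford University Computing Services", 1969, None),
]


def name_at(year: int) -> List[str]:
    """Query 1: what was OUCS called in a given year?"""
    return [p.name for p in OUCS_NAMES if p.contains(year)]


def names_between(start: int, end: int) -> List[str]:
    """Query 2: which names did OUCS have at any point in [start, end]?"""
    return [p.name for p in OUCS_NAMES if p.overlaps(start, end)]


def periods_named(name: str) -> List[Tuple[int, Optional[int]]]:
    """Query 3: during which periods did OUCS carry the given name?"""
    return [(p.start, p.end) for p in OUCS_NAMES if p.name == name]


# An opaque name carries none of this structure: a naive string test
# cannot tell that 1960 lies inside "1957-1969".
opaque_says_1960 = "1960" in "1957-1969"
```

Here `name_at(1960)` selects the “Computing Laboratory” record because 1960 falls inside its bounds, whereas the substring test on the opaque name “1957-1969” finds nothing. Any real encoding has to give the machine the bounds in a form it can compare, which is exactly what a graph name alone does not provide.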

I have come up with two ideas on how to solve this problem: one uses reification and relaxes the notion of entailment, the other uses named graphs.
