RDF and the Time Dimension – Part 1

November 28, 2008

In RDF – an Introduction I claimed that introducing any kind of continuous dimension (for example, a time dimension) is not possible if you follow the official interpretation given in the RDF specifications. Actually, it is even worse: in basic RDF, even discrete dimensions cannot be modeled.

In this post I will elaborate on my claims and give a detailed description of the problem. In part 2 I will propose a new interpretation of RDF graphs that allows dimensions to be introduced into RDF. If you are new to RDF, or terms such as reification, entailment, fact or model don’t mean much to you, you might want to read my introduction to RDF, since we need these terms to talk about RDF’s inability to model dimensions. I will try to present everything in a semi-formal way, using some mathematical notation, while keeping the post understandable for those who would not describe themselves as “math people”. However, I feel that a certain amount of formality is necessary to outline the problem and the proposed solution.

Continuous and Discrete Dimensions

Let’s start by giving you an idea of what I mean by continuous and discrete dimensions in RDF. Think of a dimension as a variable d that can take values from a specified set (e.g. {1, 2}). You now define your triples (or facts) relative to d. This means that for d = 1 you have a different set of facts than for d = 2. Whether I speak of a continuous or a discrete dimension depends on the cardinality (number of elements) of the value set for d. If the value set contains an infinite number of elements, I speak of a continuous dimension; if the number of elements is finite, I speak of a discrete dimension. Since in our example the cardinality of the value set is 2 (|{1,2}| = 2), we have a discrete dimension. Read the rest of this entry »
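To make the idea of "facts relative to d" a bit more tangible, here is a minimal Python sketch. The triples, the `ex:` names and the helper `facts_at` are all invented for illustration; this is not RDF syntax and not the interpretation proposed in part 2, just one way to picture a discrete dimension with the value set {1, 2}.

```python
# Hypothetical facts, indexed by the value of a dimension d.
# All triples and names below are made up for illustration only.
facts_by_d = {
    1: {("ex:building", "ex:usedBy", "ex:collegeA")},
    2: {("ex:building", "ex:usedBy", "ex:collegeB")},
}

def facts_at(d):
    """Return the set of triples that hold for the given value of d."""
    return facts_by_d.get(d, set())

# The value set of d; its cardinality is finite, so the dimension is discrete.
value_set = set(facts_by_d)
cardinality = len(value_set)

# Different values of d give different sets of facts.
differs = facts_at(1) != facts_at(2)
```

A lookup table like this only works because the value set is finite. A continuous dimension has infinitely many values, so its facts could not be enumerated one value at a time.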


RDF – an Introduction

November 26, 2008

After deciding to implement the new OxPoints system with Semantic Web technologies (see OxPoints and the Semantic Web) I started to read up on all I could find on RDF (Resource Description Framework) and related technologies like RDFS and OWL. In particular I was looking for

  • specifications,
  • best practices and
  • reports on projects using RDF.

I was astonished to find that, even though many people talk about RDF, only very few seem to have actually used it (i.e. outside academic studies). Or if they have, they at least did not tell anyone about it.
However, one thing that I definitely did not expect to find was what seems to be a fundamental design flaw in RDF. I thought about this a lot, and I hope that by blogging about it, either you will tell me that I am wrong and how to do it right, or we might find a way to solve the problem together.

But before talking about what I think is wrong with RDF and proposing one way to solve that problem (yes, luckily I think there is a solution), we need to establish a common language, which is what I want to achieve with this introduction. If you are already familiar with RDF, you might want to have a look at the sections Triples are Facts, Reification and Entailment. If you are new to RDF, I hope that this will give you a first start. However, I have kept this introduction very short, so many aspects are missing. If you want to learn more about RDF, I recommend starting with the RDF Primer, the introduction to RDF from the W3C. In most sections I have also linked to the relevant sections of the RDF specifications.

I will try to assume as little previous knowledge as possible, but since RDF is not a trivial topic, I have to start somewhere. Basic knowledge of XML and some knowledge of mathematical notation would therefore probably be of help.

RDF (Resource Description Framework)

The Resource Description Framework (RDF for short) is a set of W3C specifications which were first published in 1999 and revised in 2004 (more information on the history of RDF can be found on its Wikipedia page or on the W3C pages on RDF). RDF is “a language for representing information about resources in the World Wide Web” (RDF Primer [http://www.w3.org/TR/REC-rdf-syntax/]).

So what are resources in the World Wide Web?

Read the rest of this entry »


Life with a Google phone

November 22, 2008

When the much-heralded Google phone was launched in the UK towards the end of October, and we started the Erewhon project at about the same time, I decided that fate had intended me to acquire my first smartphone. I therefore went out on day 1, signed up for the exorbitant initial contract offering from T-Mobile, took home a white G1 and have been using it ever since.

If you have not heard of it, the G1 phone is the first device to use Google’s Android operating system, intended to be freely available to any phone manufacturer, and enabling application developers to write high-quality tools which will run on a wide variety of phones. If it is successful, you can expect to see more manufacturers picking up on Android. This first model is made by the respected HTC company from Taiwan.

In summary, what the G1 and Android give us today is

  • phone (as you might expect!)
  • always-on internet (seamless switch between 3G and wireless networks)
  • touch-screen interface, enhanced by slide-out keyboard
  • storage on micro SD card (I added an 8 GB one)
  • complete syncing integration with Google apps (mail, contacts, calendar, maps, search)
  • camera
  • GPS
  • web-browsing
  • media playing
  • .. and as many other applications as people care to write

G1 Google phone desktop

Reviews of the G1 are widely available, and vary wildly in their assessment depending on the standpoint of the reviewer. To those who love the Apple iPhone, it is a cranky weak imitation; to long-time phone-watchers, it’s a ho-hum bit of hardware with an interesting operating system; to “openness” advocates, it is the second coming, real-Linux-onna-phone. Read the rest of this entry »


OxPoints and the Semantic Web

November 22, 2008

In OxPoints – Providing geodata for the University of Oxford I told you about the old OxPoints system which is currently providing geolinking information for the University of Oxford and talked about what is wrong with it and why we want to start from scratch to create a new OxPoints.

Before we start talking about solutions let’s start off by defining what we want the new system to look like:

Blackfriars Hall on Google Maps

As we have seen, the old OxPoints system stores geodata and some additional information (for example, images and postal addresses) on all 38 colleges and the other important university entities. It is able to export its information as KML (an XML-based language for expressing geographic annotations), which can be imported into, for example, Google Maps or Google Earth. A simple frontend allows users to query the data and display the results directly in either Google Maps or Google Earth, or as KML.

But even though it wouldn’t tell you, the old system is already a bit more powerful than that. Let’s have a look at a typical OxPoints record like the one on Blackfriars: Read the rest of this entry »


OxPoints – Providing geodata for the University of Oxford

November 18, 2008

Two of the core deliverables for the Erewhon project are to create technical specifications for using geolocation for university resources and to compile a report on dynamic location-dependent information delivery services. Now, this certainly sounds very nice, and there is even more information on the two deliverables to be found in the JISC application, but I thought it would be a good idea to tell you a bit more about what it is that we actually want to do and how we plan to meet the deliverables.

In this post I will tell you about OxPoints, a simple geodatabase that is currently in use at the University of Oxford and that we intend to redo, since it does not fulfill our requirements for a geodatabase for university resources.

OxPoints – the current system

Map of all colleges on http://www.ox.ac.uk

If you browse the University websites, you might come across a dynamically generated map of all of Oxford’s colleges. This map is generated using the Google Maps API and data (the longitude and latitude for each college) provided by a system called OxPoints. OxPoints was developed at OUCS to provide geolinking information for the University of Oxford and is able to output its data as, for example, KML, which is the input format used by Google Maps and Google Earth.

A good question to ask now would be: “It seems to do the job. So why do you want to create a new one?”

To answer this, we have to dig a bit deeper into the current system and have a look at how it stores its data.
OxPoints uses an XML language called TEI (more information on TEI) to store information about colleges, units and their associated buildings, rooms etc. A typical OxPoints record looks something like this:

<place type="college" xml:id="alls">
   <placeName>All Souls College</placeName>
   <place subtype="primary" type="building">
       <placeName>Lodge</placeName>
       <location when="2007-01-29T13:08:55.535Z">
           <geo rend="0">-1.253042221069336 51.75278555467572</geo>
       </location>
   </place>
   <place type="building">
       <place type="room">
           <placeName>Wharton Room</placeName>
       </place>
   </place>
</place>

What this bit of XML tells us is that there is a college called All Souls College and that it owns two buildings: one located at -1.25 51.75 (longitude, latitude), and another without any geoinformation but with a room called Wharton Room.
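To show how that reading falls out of the nesting alone, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The helper names `name_of` and `describe` are my own and are not part of OxPoints. The sketch assumes, as the record above suggests, that a building nested directly inside a college is owned by it.

```python
import xml.etree.ElementTree as ET

# The record from the post, unchanged.
RECORD = """
<place type="college" xml:id="alls">
   <placeName>All Souls College</placeName>
   <place subtype="primary" type="building">
       <placeName>Lodge</placeName>
       <location when="2007-01-29T13:08:55.535Z">
           <geo rend="0">-1.253042221069336 51.75278555467572</geo>
       </location>
   </place>
   <place type="building">
       <place type="room">
           <placeName>Wharton Room</placeName>
       </place>
   </place>
</place>
"""

def name_of(place):
    """Return the text of the direct <placeName> child, or None if there is none."""
    node = place.find("placeName")
    return node.text if node is not None else None

def describe(college):
    """For each building nested directly in the college, return
    (building name, (longitude, latitude) or None, list of room names)."""
    result = []
    for building in college.findall("place[@type='building']"):
        geo = building.find("location/geo")
        coords = tuple(float(v) for v in geo.text.split()) if geo is not None else None
        rooms = [name_of(room) for room in building.findall("place[@type='room']")]
        result.append((name_of(building), coords, rooms))
    return result

college = ET.fromstring(RECORD.strip())
buildings = describe(college)
for entry in buildings:
    print(entry)
```

Note that ownership never appears as data here: it is only implied by which element contains which.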

It is easy to see that this system allows us to store colleges, information on all the buildings that a college owns and even all the rooms inside each building. So we should be able to answer queries of the form: “Give me a list of all rooms owned by college A that have a capacity greater than X and show them on a map”. But what about this query: “Give me a list of all the rooms used by college A”?
The problem with this query is that colleges tend to use buildings that they do not own, which is something that we cannot express directly in the current storage format. Since the information that college A owns building B is stored implicitly through the XML hierarchy, one solution would be to copy the building record of each used building into our college record, ending up with something like this:

<place type="college" xml:id="alls">
   <placeName>All Souls College</placeName>

   <!-- our own buildings -->
   <place subtype="primary" type="building" ownershipStatus="owned-by-us">
       <placeName>Lodge</placeName>
       <location when="2007-01-29T13:08:55.535Z">
           <geo rend="0">-1.253042221069336 51.75278555467572</geo>
       </location>
   </place>

   <!-- buildings that we use -->
   <place subtype="primary" type="building" ownershipStatus="used-by-us">
       <placeName>Museum</placeName>
       <location when="2007-01-23T10:21:44.462Z">
           <geo>-1.26018762588500 51.75536912069192</geo>
       </location>
   </place>
</place>

Now suppose that the University consisted of only 10 colleges, each owning only one building but using the buildings of all the other colleges. Instead of having 10 records, one for each building, we’d end up with 100 records, 10 for each building, duplicating all the information. Obviously, this is not a very good solution.
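The arithmetic behind those numbers, written out as a small Python sketch (the variable names are mine):

```python
# Numbers taken from the example in the text.
colleges = 10
buildings_per_college = 1

# Distinct buildings that actually exist.
total_buildings = colleges * buildings_per_college

# With copying, every college record holds a copy of every building
# it owns or uses, which in this example means all of them.
copied_records = colleges * total_buildings

# Copies that carry no new information.
redundant_records = copied_records - total_buildings

print(total_buildings, copied_records, redundant_records)
```

In this worst case the number of stored records grows with the square of the number of colleges, while the number of real buildings grows only linearly.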

You might now say: “Well, XML knows about IDs. Why not use them to link to other elements?” Let’s have a look at what this might look like:

<place type="college" xml:id="alls">
   <placeName>All Souls College</placeName>

   <!-- our own buildings -->
   <place xml:id="some-building-id" subtype="primary" type="building" ownershipStatus="owned-by-us">
       <placeName>Lodge</placeName>
       <location when="2007-01-29T13:08:55.535Z">
           <geo rend="0">-1.253042221069336 51.75278555467572</geo>
       </location>
   </place>

   <!-- buildings that we use -->
   <place linksto="#some-building-id" ownershipStatus="used-by-us"/>
</place>

This is clearly a much better design, since we no longer store any redundant information in our system. However, suppose one of our colleges stops using the rooms of college A. How would we reflect that in the database? One simple and efficient way would be to remove the link. After that change our database would again reflect the current status of the University, but the information that the college once used those rooms would be gone forever.
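Here is a minimal Python sketch of both halves of that argument: resolving a link by ID, and what is lost when the link is removed. The `<places>` wrapper, the second college `other-college` and the helper `used_buildings` are hypothetical additions for illustration. `linksto` and `ownershipStatus` are the illustrative attributes from the example above, not something I claim TEI defines.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical data set: one college owns the Lodge, another links to it.
DATA = """
<places>
  <place type="college" xml:id="alls">
    <placeName>All Souls College</placeName>
    <place xml:id="some-building-id" type="building" ownershipStatus="owned-by-us">
      <placeName>Lodge</placeName>
    </place>
  </place>
  <place type="college" xml:id="other-college">
    <placeName>Other College</placeName>
    <place linksto="#some-building-id" ownershipStatus="used-by-us"/>
  </place>
</places>
"""

root = ET.fromstring(DATA.strip())

# Index every <place> that carries an xml:id so links can be resolved.
by_id = {el.get(XML_ID): el for el in root.iter("place") if el.get(XML_ID)}

def used_buildings(college):
    """Resolve the college's links and return the names of the linked buildings."""
    names = []
    for link in college.findall("place[@linksto]"):
        target = by_id.get(link.get("linksto").lstrip("#"))
        if target is not None:
            names.append(target.findtext("placeName"))
    return names

other = by_id["other-college"]
before = used_buildings(other)

# The college stops using the building: we simply delete the link.
for link in other.findall("place[@linksto]"):
    other.remove(link)
after = used_buildings(other)

print(before, after)
```

After the removal nothing in the data distinguishes "never used the Lodge" from "used it until yesterday".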

When we thought about that problem and realized that it would indeed be very nice to have that extra dimension (allowing for queries like: “Give me a list of all the colleges that were present from 1500 to 1600”), we had to admit that the old system’s XML (and indeed any hierarchical XML) would not give us the flexibility that we want for our geolocation database.

One of the first tasks in Erewhon is therefore to create a new database schema. It should give us great flexibility in expressing relationships between various university entities, know about time and be able to annotate any statement with time information, and be extensible, so that information we cannot yet think of, but that really should be in the system, can be added without changing the underlying schema. Otherwise we’d end up where we are at the moment, having to redo everything again, which is clearly something we would like to avoid.
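To illustrate what "annotating any statement with time information" could mean, here is a minimal Python sketch. It is not the schema we will end up with. The `Statement` class, its field names, the year-based validity interval and all the data are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Statement:
    """A relationship between two entities plus the period in which it holds."""
    subject: str
    predicate: str
    obj: str
    valid_from: int                  # first year in which the statement holds
    valid_to: Optional[int] = None   # last year in which it holds; None = still true

def holds_during(s: Statement, start: int, end: int) -> bool:
    """True if the statement holds at some point in the years start..end."""
    last = s.valid_to if s.valid_to is not None else float("inf")
    return s.valid_from <= end and last >= start

def subjects(statements: List[Statement], predicate: str, obj: str,
             start: int, end: int) -> List[str]:
    """Subjects related to obj via predicate at some point in start..end."""
    return sorted({s.subject for s in statements
                   if s.predicate == predicate and s.obj == obj
                   and holds_during(s, start, end)})

# Invented data. Ending a relationship sets valid_to instead of deleting it.
statements = [
    Statement("college-a", "uses", "building-b", 1550, 1580),
    Statement("college-c", "uses", "building-b", 1700),
    Statement("college-a", "owns", "building-a", 1400),
]

users_16th_century = subjects(statements, "uses", "building-b", 1500, 1600)
users_today = subjects(statements, "uses", "building-b", 2008, 2008)
print(users_16th_century, users_today)
```

Because a relationship is closed rather than deleted, the current state and the history can both be queried from the same data, and new kinds of relationships are just new predicate values rather than schema changes.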

So much for the old OxPoints. I’ll try to keep you posted on any developments, and I’d be more than happy to receive any comments.


Gripe: Surveymonkey and iPhone

November 13, 2008

It won't let me input!

Whilst travelling on public transport this morning, I wanted to check how one of our surveys was doing on surveymonkey.com. I opened up mobile Safari on an iPhone, headed over to SurveyMonkey and proceeded to log in, only I couldn’t actually type anything into the login window. It appears that SurveyMonkey’s login window is actually a JavaScript window which creates a rather nice opaque overlay over the rest of the page to draw attention to the box. But for some reason, Apple’s Safari will not allow input into the said box despite displaying it perfectly.

A quick investigation using lynx shows that the login functions are entirely JavaScript-based. Apple’s mobile Safari is supposed to implement virtually every aspect of the desktop version (which, incidentally, handles the page fine). If this is true, then the problem lies with the UI aspect of the browser. However, it’s worth noting that SurveyMonkey’s XHTML does not validate well on W3C’s checks, so perhaps it’s a combination of problems.

If you know of any websites that appear to work fine on desktop Safari but not on the mobile version (excluding Flash and Java, obviously), please drop a comment below.