Geolocating ducks in Essex

July 31, 2009

Earlier this week, Sebastian and I gave a workshop about geolocation at IWMW 2009. Despite ongoing struggles with the wireless networking, it all went fairly smoothly, and the 12 or so workshop attendees seemed interested and engaged — and even willing to do the ‘audience participation’ section! This was a re-run of what we did in a local workshop, but with the added advantage that this time the participants came from a range of institutions — so we were keen to see whether our examples and suggestions were things they could all relate to.

Happily, it seems we weren’t being too Oxford-centric, as there was plenty of discussion around our ideas (particularly on the topics of library books and energy usage) and several interesting new suggestions.

Whiteboard notes: the suggestions made by the three groups in our geolocation workshop.

We particularly liked:

Analysing PC/wireless provision and usage to help users determine the likelihood of finding a free PC nearby
It’s easy enough to show the location of currently free PCs, but by the time you’ve got there, what are the chances of there still being one available? Enhancing existing usage metrics with geodata would help users head for the best ‘hotspots’ without wasting time trekking from one bit of campus to another in search of a workstation. However, there was a concern that this might also look like an open invitation to burglars, showing them a map of all the unattended computers on campus!
SMS reminders for courses/meetings with directions tailored to user preferences
Enhance course reminders (already provided by EduTxt) with directions appropriate to the user’s location, mobility, mode of transport, etc. It’d be difficult to do this dynamically based on the user’s location at the time, but possible to allow users to set more general preferences for the sort of reminders/directions they want.

But the firm favourite was one delegate’s suggestion of geolocating a duck: apparently students at York have a pet duck and would love to be able to find its current location and follow its progress! Ducks have generally been less quick to join the smartphone revolution than students, but this problem could be overcome by attaching a lightweight GPS data-logger to the duck. While of course this service would have clear benefits for the duck-watchers, opinion was divided over the benefit to the duck itself: on the one hand it might be more likely to get fed and looked after in a timely fashion, but on the other hand it might not want the constant attention…

Ducks by the lake at Essex University's Colchester campus: how can institutional geolocation services benefit them?

See the IWMW2009 website for details of the workshop (including all our slides). Thanks again to everybody who attended – please feel free to comment here with follow-up, further suggestions or discussion!


A simple library mashup

June 1, 2009

At the Erewhon workshop in December we asked people to choose/suggest applications for geodata. One of the favourites was: “Find the nearest copy of a book from a reading list (bearing in mind which libraries you can use, and the opening hours of libraries)”, so we decided to use this as an example of how we’d begin to use Oxpoints data to enhance other services.

Ingredients:

A library search results page

We couldn’t easily get hold of the patron data (i.e. which libraries a user has access to), and the opening hours looked fairly indigestible in their current form (see example), so we decided to leave both out of this mashup.
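To make the idea concrete, here is a minimal sketch of the “nearest copy” step, assuming we already have the codes of the libraries that hold a book plus a few coordinates taken from Oxpoints. The codes, coordinates and function names below are invented for illustration rather than real Oxpoints output.

    # Rough sketch: given the library codes that hold a book and the user's
    # position, pick the closest library using coordinates from Oxpoints.
    # The codes and coordinates here are placeholders, not real Oxpoints data.
    from math import radians, sin, cos, asin, sqrt

    # Hypothetical subset of library locations: code -> (lat, lon)
    LIBRARIES = {
        "LIB-A": (51.7540, -1.2540),
        "LIB-B": (51.7589, -1.2557),
        "LIB-C": (51.7486, -1.2630),
    }

    def haversine_km(a, b):
        """Great-circle distance between two (lat, lon) pairs, in kilometres."""
        lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
        h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(h))

    def nearest_copy(holding_codes, user_location):
        """Return (code, distance_km) of the closest library holding the book."""
        candidates = [(code, haversine_km(user_location, LIBRARIES[code]))
                      for code in holding_codes if code in LIBRARIES]
        return min(candidates, key=lambda c: c[1]) if candidates else None

    # The holding codes would really come from the library search results page.
    print(nearest_copy(["LIB-A", "LIB-C"], (51.7520, -1.2577)))

A fuller version would draw on the complete Oxpoints library set and, eventually, the patron and opening-hours data we left out above.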


Exploring Oxpoints: directions from place to place

June 1, 2009

In order to demonstrate the data now available in Erewhon’s geodata store, Oxpoints, I wanted a way to quickly browse the holdings on a simple web page. Oxpoints already has a way to show a map of all departments (for example), by opening http://m.ox.ac.uk/oxpoints/departments.kml, but that means opening Google Earth or loading it into Google Maps. If instead we specify http://m.ox.ac.uk/oxpoints/departments.xml, an XML representation of all the data is returned, which can be fairly easily transformed into HTML for display.
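As a rough illustration of that last step, here is a quick Python sketch that fetches the XML and boils it down to a bare HTML list. This is not the Erewhon service itself (which, as described below, is an XSL transform behind a Perl CGI wrapper), and the element names it looks for are guesses; the real feed’s schema should be checked before relying on anything like this.

    # Illustrative only: fetch the Oxpoints XML feed and render a plain HTML list.
    # The tag names treated as "names" below are assumptions about the schema.
    import urllib.request
    import xml.etree.ElementTree as ET
    from html import escape

    URL = "http://m.ox.ac.uk/oxpoints/departments.xml"

    def departments_to_html(url=URL):
        with urllib.request.urlopen(url) as resp:
            root = ET.fromstring(resp.read())
        items = []
        for elem in root.iter():
            tag = elem.tag.rsplit("}", 1)[-1]   # strip any XML namespace prefix
            if tag.lower() in ("name", "title") and (elem.text or "").strip():
                items.append("<li>%s</li>" % escape(elem.text.strip()))
        return "<ul>\n" + "\n".join(items) + "\n</ul>"

    if __name__ == "__main__":
        print(departments_to_html())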

Since we have a lot of XSLT expertise to hand in the Erewhon project, I decided to write this service as an XSL transform, mediated by a Perl CGI wrapper. The result is shown at http://m.ox.ac.uk/cgi-bin/oxshow.pl:

Screenshot of the oxshow.pl output


New OxPoints and how to use it!

May 29, 2009

It’s been a while since we’ve posted anything about OxPoints, but at last a lot of Arno Mittelbach’s work on Gaboto has been incorporated into our new OxPoints system, which has a new home at http://m.ox.ac.uk/oxpoints !

Last Friday, at an OxPoints workshop in OUCS, we announced a few ways of querying the new system. Here’s a brief overview of what you can get out of the new OxPoints system as of today.

We’re aiming to improve features incrementally and as demand dictates, so for now we’re offering a simple URL access method to get at a few pre-defined queries (more on complex queries later). The stem for all OxPoints queries is:

http://m.ox.ac.uk/oxpoints/

Following this, you’ll have to specify the query. As of today, we support two main query types:

  1. Query by item type (returns a list of all items of a given type).
  2. Query by OUCS/OLIS code (returns a list of all items associated with a unit, identified by its unique ID from the computing or library services).

Item types available:

  1. colleges
  2. departments
  3. museums
  4. libraries
  5. carparks

Next we need to add a file type extension. We support a number of formats, but these are the primary ones:

Full Data Output

  1. .json (Direct JSON transformation of all data related to a query)
  2. .xml (RDF/XML – not completely compliant… yet)

Simplified Output (usually adequate for 95% of uses) – omits rarely used fields

  1. .kml (Keyhole Markup Language – perhaps the most common, used by Google Earth & Maps)
  2. .gjson (a simplified version of .json that omits rarely used fields and is easier to read)
  3. Almost any format supported by GPSBabel

Some simplified formats output via GPSBabel (not officially supported, but they should work):

  • .tomtom
  • .csv
  • .yahoo

Examples:

All college data represented in JSON: http://m.ox.ac.uk/oxpoints/colleges.json

Oxford University Car Parks in TomTom format: http://m.ox.ac.uk/oxpoints/carparks.tomtom
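For anyone wanting to script against these URLs, here is a small Python sketch of the pattern. The helper function is ours rather than part of OxPoints; it simply glues together the stem, query and extension exactly as described above.

    # Fetch OxPoints data by building URLs of the form <stem><query>.<extension>.
    import json
    import urllib.request

    STEM = "http://m.ox.ac.uk/oxpoints/"

    def oxpoints(query, fmt="json"):
        """Fetch an OxPoints query, e.g. oxpoints('colleges') or oxpoints('carparks', 'kml')."""
        with urllib.request.urlopen("%s%s.%s" % (STEM, query, fmt)) as resp:
            data = resp.read()
        # JSON-ish formats are parsed; anything else is returned as raw bytes.
        return json.loads(data) if fmt in ("json", "gjson") else data

    colleges = oxpoints("colleges")             # all college data, parsed from JSON
    carparks_kml = oxpoints("carparks", "kml")  # car parks as raw KML for Google Earth/Maps
    print(type(colleges), len(carparks_kml))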


The new OxPoints

March 20, 2009

After spending a lot of time looking into RDF and how we could use it as the data model for the new OxPoints system, we finally decided to give it a try. One of the big problems we had with RDF was the difficulty of storing dimensional data, since one of the key features of the new OxPoints system should be its ability to record how the University of Oxford changes over time. However, RDF does not really cope with change, and we had to think about how to work around this. We described the problem and two possible workarounds in detail in previous posts (RDF and the Time Dimension, parts 1 and 2).

Both solutions are similar in that they add extra RDF statements to define the validity of other statements. The first was to use statement reification and attach temporal information to individual statements; the second was to use named graphs, grouping statements that hold over the same time span into the same graph. In the end we decided to go with the second solution, as it seemed the cleaner and more efficient of the two.

As the platform for the new system we chose Java with Jena as the underlying RDF triple store.

We have spent the last few weeks developing the prototype, and so far the results are promising. The minimal requirement for the new system was to do everything the old system does, which was basically to allow point queries (ask for a specific unit, or for a specific type) and to transform the results into KML.
Besides KML, the new system can transform any result set it produces into JSON and RDF, and it is able to handle the temporal aspect of the data. In fact the new system is much more powerful and not specific to OxPoints at all: it is an RDF-to-Java object mapper (comparable to object-relational mapping systems). We are currently working on documentation and hope to make more information available soon.
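To give a flavour of what “RDF-to-Java object mapper” means, here is a toy sketch of the idea in Python with rdflib rather than the Java/Jena stack Gaboto actually uses, so the real API will look quite different: take the triples about a resource and expose them as fields of an ordinary object. The vocabulary URIs, the Unit class and the coordinates are invented for the example.

    # Toy RDF-to-object mapping: not Gaboto, just the general idea.
    from dataclasses import dataclass
    from typing import Optional
    from rdflib import Graph, Namespace, URIRef, Literal

    EX = Namespace("http://example.org/oxpoints/")   # placeholder vocabulary

    @dataclass
    class Unit:
        uri: str
        name: Optional[str] = None
        latitude: Optional[float] = None
        longitude: Optional[float] = None

    def load_unit(graph: Graph, uri: URIRef) -> Unit:
        """Map the triples about `uri` onto a Unit instance."""
        unit = Unit(uri=str(uri))
        for _, predicate, obj in graph.triples((uri, None, None)):
            if predicate == EX.name:
                unit.name = str(obj)
            elif predicate == EX.lat:
                unit.latitude = float(obj)
            elif predicate == EX.long:
                unit.longitude = float(obj)
        return unit

    g = Graph()
    oucs = URIRef("http://example.org/oxpoints/oucs")
    g.add((oucs, EX.name, Literal("Oxford University Computing Services")))
    g.add((oucs, EX.lat, Literal(51.76)))    # made-up coordinates
    g.add((oucs, EX.long, Literal(-1.26)))
    print(load_unit(g, oucs))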

A first demo built on the new OxPoints is a Greasemonkey script that adds links to OLIS (Oxford Libraries Information System) result pages, showing where a particular library is on a Google Map. The demo is not yet publicly available, but we are working hard on publishing this and more demos soon.

Screenshot of library demo


RDF and the Time Dimension – Part 2

December 10, 2008

In part 1 I claimed that you will run into problems when you try to model dimensional data in RDF: in basic RDF there is no way to properly model any form of dimension, and with named graphs we can only model discrete dimensions. I received some feedback saying that the way I described the problem (in particular the distinction between discrete and continuous dimensions) was a bit unfortunate (and I agree). So before describing my solutions, I’d like to rephrase the problem.

RDF and Dimensions – The Problem

What we want to model in OxPoints is the development of the University of Oxford over time. A very simple example is the name of the Oxford University Computing Services (OUCS). OUCS has existed since 1957 and was originally named “Computing Laboratory”. In 1969 OUCS was split in two, one branch becoming “Oxford University Computing Services”. Let’s say we simply want to model that a resource identified by “OUCS” has existed since 1957, was named “Computing Laboratory” from 1957 to 1969, and then changed its name to “Oxford University Computing Services”. Now, why can’t we encode this in RDF?

In part 1 I outlined a proof sketch describing why the basic form of RDF does not support this kind of conditional data: the problem is RDF’s notion of entailment. However, with the extension of named graphs, encoding this data should be an easy exercise: create one graph named “1957-1969” that describes OUCS as “Computing Laboratory” and one graph named “1969-” that describes it as “Oxford University Computing Services”.

As was pointed out to me (and I have to agree), not having enough names for all possible values does not stop us from talking and reasoning about a dimension (or call it a set of values). Take the real numbers as an example. There are many more real numbers than we can make up names for, yet this does not stop us from happily talking and proving things about them. So what is the difference with RDF? If I asked you whether 2 is in the set $latex ]1,2[ \subset \mathbb{R}$, you would probably tell me that it is not. However, you can only give me what I see as the correct answer if you have the same understanding of the name $latex ]1,2[$ as I have (which is that $latex ]a,b[$ denotes the open interval $latex \{ x \in \mathbb{R} \mid a < x < b \}$). If we share that understanding, we can deduce information from a given name. For a computer, however, the name “1957-1969” is only a bunch of characters with no further meaning attached to it, so a computer could not tell me whether or not 1960 is part of “1957-1969”. The name of a named graph is just a name (or more precisely a URI), and nothing you could (or should) deduce information from. So how are we to encode our data in a way that allows us to ask:

  • What is (was) the name of OUCS today (in 1960)?
  • What names did OUCS have from 1960 to 1980?
  • When is the following true: OUCS is called “Oxford University Computing Services”?

I have come up with two ideas for solving this problem: one using reification and a relaxed notion of entailment, and one using named graphs.
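To make the named-graph route a little more tangible, here is a toy sketch in Python with rdflib (not the solution spelled out in the rest of this post, just one possible shape of it): statements that hold over the same period go into one named graph, and the graph itself is given machine-readable start and end dates in the default graph, so that a program can answer questions like “what was OUCS called in 1960?”. All the URIs and property names are invented for the example.

    # Toy sketch: named graphs grouped by time span, with explicit validity dates.
    from rdflib import Dataset, Namespace, URIRef, Literal
    from rdflib.namespace import XSD

    EX = Namespace("http://example.org/ns/")        # made-up vocabulary
    OUCS = URIRef("http://example.org/oucs")

    ds = Dataset()

    def add_period(graph_uri, start, end, name):
        g = ds.graph(URIRef(graph_uri))
        g.add((OUCS, EX.name, Literal(name)))
        # Validity of the whole graph, recorded in the default graph:
        ds.add((URIRef(graph_uri), EX.validFrom, Literal(start, datatype=XSD.gYear)))
        ds.add((URIRef(graph_uri), EX.validUntil, Literal(end, datatype=XSD.gYear)))

    add_period("http://example.org/g/1957-1969", "1957", "1969", "Computing Laboratory")
    # "9999" stands in for "still current".
    add_period("http://example.org/g/1969-", "1969", "9999", "Oxford University Computing Services")

    def name_in_year(year):
        """Return the name recorded in a graph whose validity interval contains `year`."""
        for graph_uri, start in ds.subject_objects(EX.validFrom):
            end = ds.value(graph_uri, EX.validUntil)
            if str(start) <= str(year) <= str(end):
                return str(ds.graph(graph_uri).value(OUCS, EX.name))

    print(name_in_year(1960))   # -> Computing Laboratory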



RDF and the Time Dimension – Part 1

November 28, 2008

In RDF – an Introduction I claimed that introducing any kind of continuous dimension (for example, a time dimension) is not possible if you follow the official interpretation given in the RDF specifications. Actually it is even worse: in basic RDF even discrete dimensions cannot be modeled.

In this post I will elaborate on my claims, giving a detailed description of the problem. In part 2 I will propose a new interpretation of RDF graphs that allows dimensions to be introduced into RDF. If you are new to RDF, or terms such as reification, entailment, fact or model don’t mean much to you, you might want to read my introduction to RDF first, since we need these terms to talk about RDF’s inability to model dimensions. I will present everything in a semi-formal way, using some mathematical notation, while trying to keep the post understandable for those who would not describe themselves as “math people”. However, I feel that a certain amount of formality is necessary to outline the problem and the proposed solution.

Continuous and Discrete Dimensions

Let’s start by trying to give you an idea of what I mean by continuous and discrete dimensions in RDF. Think of a dimension as a variable d that can take values from a specified set (e.g. 1 and 2). You now define your triples (or facts) relative to d: for d = 1 you have a different set of facts than for d = 2. Whether I speak of a continuous or a discrete dimension depends on the cardinality (number of elements) of the value set for d. If the value set contains an infinite number of elements I speak of a continuous dimension; if the number of elements is finite I speak of a discrete dimension. Since in our example the cardinality of the value set is 2 ($latex |\{1,2\}|=2$), we have a discrete dimension.
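Restating that symbolically (nothing beyond the paragraph above; $latex F$ is simply a name I am introducing for the assignment of fact sets to dimension values):

$latex F : D \to \mathcal{P}(\mathrm{Facts}), \qquad F(d) = \text{the facts that hold when the dimension takes the value } d$

The dimension is discrete if $latex |D|$ is finite, and continuous (in the sense used here) if $latex |D|$ is infinite.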