Recently, we delivered an IST Seminar on Open Data here at the University of Waterloo. Today we explore the public/goosewatch endpoint in our Open Data API through the lens of 5 Star data.

What is 5 Star data?

Developed by Tim Berners-Lee in 2010, 5 Star Open Data is a system designed "to encourage people […] along the road to [improved] data. […] More stars [are awarded] as [data providers] make [their data] progressively more powerful [and] easier for people to use."

5-star data diagram, 1 star = PDF, 2 stars = XLS, 3 stars = CSV, 4 stars = RDF, 5 stars = LOD.

5 Star data: a progressive enhancement approach to improving data.

In more detail:

★ Open License (OL): make your stuff available on the Web (whatever format) under an open license. Most importantly, to get any star at all, the data must be explicitly licensed as open.

★★ (Machine) Readable (RE): make it available as structured data (e.g., Excel instead of an image scan of a table).

★★★ Open Formats (OF): use non-proprietary formats (e.g., CSV instead of Excel).

★★★★ Uniform Resource Identifier (URI): use URIs to denote things, so that people can point at your stuff.

★★★★★ Linked Data (LD): link your data to other data to provide context. 

A Goose Story

As the geese are on campus for spring mating season, we can highlight one of our more light-hearted (albeit very useful) services: the public/goosewatch endpoint.

For ★ data, we look at data as it was first published back in 2013 before the Student Success Office's collaboration with the Open Data initiative.

Below, we have an image with information about the locations of the nests. The image quickly went viral, and student developers soon asked whether the data could be released as Open Data.

One option would have been to take the data to ★★ by moving it into a format such as .xls, as shown below. While the data itself is open, the format is proprietary. This issue seems less pressing in higher education, where software for working with proprietary files is commonplace or heavily discounted. Still, while proprietary formats can be the easiest for most users, proprietary software presents a barrier to expert users.

image of Excel spreadsheet detailing the location of goose nests.

The XLS version of the location of goose nests: proprietary but usable for the average user.

We could have taken the data to ★★★ if we had delivered it as a Comma-Separated Values (CSV) file.

The CSV version of the location of goose nests: Open Format and preferred by expert users.
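To illustrate why expert users tend to prefer CSV, here is a minimal sketch that reads nest locations using only the Python standard library. The column names (`location`, `latitude`, `longitude`) and the sample rows are hypothetical and are not taken from the actual Goose Watch file.

```python
import csv
import io

# Hypothetical sample data; the real file's columns and values may differ.
SAMPLE_CSV = """location,latitude,longitude
Nest A,43.4723,-80.5449
Nest B,43.4701,-80.5420
"""


def read_nests(csv_text):
    """Parse CSV text into a list of dicts with numeric coordinates."""
    nests = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        nests.append({
            "location": row["location"],
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
        })
    return nests


nests = read_nests(SAMPLE_CSV)
for nest in nests:
    print(nest["location"], nest["latitude"], nest["longitude"])
```

No spreadsheet software is needed: any language with a CSV parser can work with the file directly.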

Since we had the resources and ability to raise the bar, we quickly moved to take the data to ★★★★ by making it available at a Uniform Resource Identifier (URI) via our Application Programming Interface (API). We also invited users to submit goose nests on GitHub. Through the API, users can easily take the data, mash it up with other services (such as the Google Maps API), and make new uses for it.
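As a rough sketch of what such a mash-up might involve, the snippet below converts nest records into a GeoJSON FeatureCollection, a format that many mapping tools can display. The record fields (`location`, `latitude`, `longitude`) and the sample values are assumptions made for illustration; the actual API response may be shaped differently.

```python
import json


def nests_to_geojson(nests):
    """Convert nest records into a GeoJSON FeatureCollection."""
    features = []
    for nest in nests:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [nest["longitude"], nest["latitude"]],
            },
            "properties": {"location": nest["location"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical records standing in for data fetched from the API.
sample_nests = [
    {"location": "Nest A", "latitude": 43.4723, "longitude": -80.5449},
    {"location": "Nest B", "latitude": 43.4701, "longitude": -80.5420},
]

geojson = nests_to_geojson(sample_nests)
print(json.dumps(geojson, indent=2))
```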

Why is an API better? It serves as a way for developers to request bite-sized pieces of data when they are needed on a mobile device.

For example, imagine there was a list of all nests on campus for all time. An API lets an expert user build a mobile service that loads only the nests within the desired time frame, rather than downloading and processing the entire list and filtering it afterwards.

As described in the docs, one request pulls up everything related to term 1145 only, while another pulls up the current term (1141).
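The difference between asking the API for one term and filtering a full download yourself can be sketched as follows. The base URL, the `term` query parameter, and the `term` field on each record are all hypothetical placeholders; consult the docs for the real request format.

```python
from urllib.parse import urlencode

# Hypothetical base URL; not the actual Open Data API address.
BASE_URL = "https://api.example.org/public/goosewatch"


def build_request_url(term=None):
    """Build a request URL, optionally scoped to a single term."""
    if term is None:
        return BASE_URL
    return BASE_URL + "?" + urlencode({"term": term})


def filter_by_term(nests, term):
    """Client-side alternative: keep only the nests for one term."""
    return [nest for nest in nests if nest["term"] == term]


# Hypothetical full list, as if every nest for all time had been downloaded.
all_nests = [
    {"location": "Nest A", "term": 1141},
    {"location": "Nest B", "term": 1145},
    {"location": "Nest C", "term": 1145},
]

print(build_request_url(1145))
print(filter_by_term(all_nests, 1145))
```

With a term-scoped request, the mobile client receives only the records it needs; with the client-side filter, it must first download everything.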

Shortly after the data was released, James McCarthy of the MAD lab built a site that used the Open Data API to guide students from location A to location B across campus, either away from or towards the nests as per their preference.

This year, the Goose Watch tool has gone a step further and lets users more easily submit goose locations via its interface. Instead of only consuming the data, it is now a source of moderated, crowd-reported Open Data!

There you have it! With our partners, the Student Success Office and the Faculty of Environment Mapping/Analysis/Design Lab (MAD), Open Data took a ★ data set to ★★★★, unleashing a torrent of utility beyond its original form.

Tell us what sort of data you think could benefit from some Star Power in the comments below!

Thanks to our guest blogger, Nathan Vexler.
