Cleaning up messy metadata with OpenRefine

Tuesday, May 21, 2019 — 12:00 PM to 1:00 PM EDT

The largest collection in UWSpace, the Library-hosted repository of Waterloo research, is the university’s ever-growing collection of electronic theses and dissertations. As these are added to the collection by their authors—not library staff—their metadata are of varying quality, and the subject keyword vocabulary is uncontrolled.

In this presentation, Jordan Hale and Larisa Smyk will discuss their semi-automated approach to cleaning up the subject index, from the technical constraints posed by the DSpace repository platform to the special considerations of working within a STEM environment, and will demonstrate how OpenRefine provides us with a starting point for tidying up this research collection. OpenRefine is a freely available application for data wrangling: cleaning up data and transforming it into other formats.

Location 
LIB - Dana Porter Library
Meeting room 428
200 University Avenue West
Waterloo, ON N2L 3G1
Canada