La Faculté des Lettres
PAS building room 2401
I have no data yet. It is a capital mistake to theorize before one has data.
A. Conan Doyle
Adventures of Sherlock Holmes
Adventure 1. A Scandal in Bohemia
Professor emeritus in the David R. Cheriton School of Computer Science and a member of the Database Research Group.
Teaching and research interests span the fields of data structures and databases, particularly the design of text management systems suitable for maintaining large reference texts (including the Oxford English Dictionary) and large, heterogeneous text collections.
Alongside the advising and supervision of students, courses taught cover introductory to advanced topics and are aimed at specialists, non-specialists, and software professionals.
Earlier research was conducted through the UW Centre for the New OED and Text Research. More recently, research activities fall in the areas of XML data management and database security.
The long-range objective of my research has been (and will continue to be) to develop a unified methodology for designing data structures from the individual users' models through the enterprise model to the storage structures. This has involved the development of formal models, the development and analysis of effective algorithms, and the application of the ideas to solving large, practical problems.
We pursued this objective first as it applies to conventional record-oriented databases. Doctoral students working with me concentrated on defining and analyzing properties of "normal forms" as part of the design of a conceptual model [Osborn PhD:77, Ling PhD:78]. Collaborating in part with Professor Gaston Gonnet, we examined formalisms for describing data structures to support efficient algorithms [Tompa 1977, Gonnet-Tompa 1983], concentrating particularly on the specification of their abstract structures [Santoro PhD:79, Tompa 1980] and on the design of efficient storage structures and policies [Ramirez PhD:80, Ziviani-Tompa 1982]. More recently, students working with me have concentrated on supporting user models of the data: examining the problems of processing database updates that are expressed in terms of a partial view of the data [Medeiros PhD:85, Brodnik-Tompa 1993] and of keeping a materialized view up-to-date in the presence of change to the underlying stored data [Blakeley PhD:87]. We have also examined algorithms to process users' queries by the most efficient means available [Icaza PhD:87].
Since 1981, we have examined database concerns for non-standard databases, first concentrating on videotex databases. Because the fundamental assumptions about the nature of the data and its uses distinguish videotex databases from conventional ones, we developed a page-oriented database model that includes query and update facilities [Tanguay MMath:86]. Because of videotex's orientation towards large public-access systems for casual users, several students working with me considered the support of individualized views. We investigated powerful browsing facilities [Raymond MMath:84], graphical query languages [Böggild MMath:86], and the effectiveness of users' classification systems [Raymond-Cañas-Tompa-Safayeni 1989]. Although declining interest in videotex reduced the impact of our work, it served well as a precursor for ongoing work with hypertexts [Tompa 1990, Tompa-Raymond 1991, Tompa-Blake-Raymond 1993].
Since 1984, we have concentrated on more general text-dominated databases. The thrust of this research has been directed towards the development of a database system that will be capable of supporting the needs of text creators (such as the editors at the Oxford University Press who will maintain and enhance the Oxford English Dictionary), while simultaneously supporting the needs of text users (writers, editors, humanist scholars) who will access such machine-readable texts interactively. Again the conventional assumptions about data were found to be inappropriate (even the fundamental concept of "entity" does not apply) and, in close collaboration with Professor Gaston Gonnet, we developed two new models of text data [Blake-Bray-Tompa 1992, Salminen-Tompa 1994]. Because of the great potential, an Ontario company, Open Text Corporation, began operations in July 1989 to develop and market products based on our research. Open Text, which currently employs over 8000 individuals worldwide, has developed the Livelink Intranet application suite including Livelink Search, which has evolved from our research.
Later research includes two projects: the design and development of a text/relational database management system, based on a federated model whose hybrid query processor supports extensions to SQL to accommodate structured text such as that described using SGML; and the design and development of a system for data resource discovery for deployment on the internet.
The following list of major collaborations is indicative of the value of the research to industry:
A. Salminen and F. Tompa, Communicating with XML, Springer, 2011, 224+xiii pp.
Recent research on combining text and relational databases and publications arising from that work are described elsewhere.
Except where indicated, proceedings are available for $20.00 each.
Every experienced and budding researcher in computer science should read Jennifer Widom's tips on writing and presenting conference papers:
Administrative interests include promoting university-level education and curricula through the multi-university Consortium on Graduate Education in Software Engineering (ConGESE: aimed at providing continuing education for practicing computing professionals). Other recent service includes serving as the founding Director of the School of Computer Science, membership on various School and University committees, membership on the University Computer Science Accreditation Council (sponsored by CIPS), and serving as a founding member of the Boards of Directors of Communications and Information Technology Ontario (CITO) and of Open Text Corporation.