Speaker: Greg Drzadzewski
Abstract:
On-Line Analytical Processing (OLAP) systems are commonly used on top of structured data to help users make sense of large data collections by providing summary information that can be examined at various levels of detail. Partial materialization is used in these systems to reduce the time required to calculate summaries while satisfying constraints on storage and on the time available for updates. Users of large collections of tagged documents would also benefit from the summarization operations that an OLAP system provides: such a system could make exploring and understanding the information in a large document collection less time-consuming. Tagged document collections, however, require different types of measures for summarizing the data, and their data exhibits considerably different properties from the data in traditional OLAP. To address these issues, an OLAP system for documents will require a different design and a different approach to partial materialization.
200 University Avenue West
Waterloo, ON N2L 3G1
Canada