Speaker: | Greg Drzadzewski |
Abstract: |
On-Line Analytical Processing (OLAP) systems are commonly used on top of structured data to help users make sense of large data collections by providing them with summary information that can be examined at various levels of detail. Partial materialization has been used as part of these OLAP systems as a way of reducing the time required to calculate summaries as well as satisfying the constraints of limited storage and available time for updates.
When dealing with large collections of tagged documents, one would also benefit from the summarization operations provided by an OLAP system. Such a system could make it less time consuming for users to explore and understand the information contained in large document collections. Tagged document collections, however, require different types of measures for summarizing the data, and the data exhibits considerably different properties than is the case with the data in traditional OLAP. To address these issues, an OLAP system for documents will require a different design and partial materialization approach. |