M3
2134
Candidate
Manda Winlaw , Applied Mathematics, University of Waterloo
Title
Algorithms and Models for Tensors and Networks with Applications in Data Science
Abstract
Big
data
plays
an
increasingly
central
role
in
many
areas
of
research
including
optimization
and
network
modeling.
We
consider
problems
applicable
to
large
datasets
within
these
two
branches
of
research.
We
begin
by
presenting
a
nonlinearly
preconditioned
nonlinear
conjugate
gradient
(PNCG)
algorithm
to
increase
the
convergence
speed
of
iterative
unconstrained
optimization
methods.
We
provide
a
concise
overview
of
several
PNCG
variants
and
their
properties
and
obtain
a
new
convergence
result
for
one
of
the
PNCG
variants
under
suitable
conditions.
We
then
use
the
PNCG
algorithm
to
solve
two
different
problems:
computing
the
rank-R
canonical
tensor
decomposition
and
finding
the
solution
to
a
latent
factor
model
where
latent
factor
models
are
often
used
as
important
building
blocks
in
many
practical
recommendation
systems.
For
both
problems,
the
alternating
least
squares
(ALS)
algorithm
is
typically
used
to
find
a
solution
and
as
such
we
consider
it
as
a
nonlinear
preconditioner.
Note
that
the
ALS
algorithm
can
be
viewed
as
a
nonlinear
preconditioner
for
the
NCG
algorithm
or
alternatively,
NCG
can
be
viewed
as
an
acceleration
process
for
ALS.
We
demonstrate
numerically
that
the
convergence
acceleration
mechanism
in
PNCG
often
leads
to
important
pay-offs
for
difficult
tensor
decomposition
problems,
with
convergence
that
is
significantly
faster
and
more
robust
than
for
the
stand-alone
NCG
or
ALS
algorithms.
As
well,
we
show
numerically
that
the
PNCG
algorithm
requires
much
fewer
iterations
and
less
time
to
reach
desired
ranking
accuracies
than
stand-alone
ALS
in
solving
latent
factor
models.
We
next
turn
to
problems
within
the
field
of
network
or
graph
modeling.
A
network
is
a
collection
of
points
joined
together
by
lines
and
networks
are
used
in
a
broad
variety
of
fields
to
represent
connections
between
objects.
Many
large
real-world
networks
share
similar
properties
which
has
garnered
considerable
interest
in
developing
models
that
can
replicate
these
properties.
We
begin
our
discussion
of
graph
models
by
closely
examining
the
Chung-Lu
model.
The
Chung-Lu
model
is
a
very
simple
model
where
by
design
the
expected
degree
sequence
of
a
graph
generated
by
the
model
is
equal
to
a
user-supplied
degree
sequence,
k
=
(k1,...,kn),
where
ki
is
the
user-supplied
degree
of
node
i
and
n
is
the
number
of
nodes
in
the
graph.
We
explore
what
happens
both
theoretically
and
numerically
when
simple
changes
are
made
to
the
model
and
when
the
model
assumptions
are
violated.
As
well,
we
consider
an
algorithm
used
to
generate
instances
of
the
Chung-Lu
model
that
is
designed
to
be
faster
than
the
traditional
algorithm
but
find
that
it
only
generates
instances
of
an
approximate
Chung-Lu
model.
We
explore
the
properties
of
this
approximate
model
under
a
variety
of
conditions
and
examine
how
different
the
expected
degree
sequence
is
from
the
user-supplied
degree
sequence.
We
also
explore
several
ways
of
improving
this
approximate
model
to
reduce
the
approximation
error
in
the
expected
degree
sequence
and
note
that
when
the
assumptions
of
the
original
model
are
violated
this
error
remains
very
large.
We
next
design
a
new
graph
generator
to
match
the
community
structure
found
in
real-world
networks
as
measured
using
the
clustering
coefficient
and
assortativity
coefficient.
Our
graph
generator
uses
information
generated
from
a
clustering
algorithm
run
on
the
original
network
to
build
a
synthetic
network.
Using
several
real-world
networks,
we
test
our
algorithm
numerically
by
creating
a
synthetic
network
and
then
comparing
the
properties
to
the
real
network
properties
as
well
as
to
the
properties
of
another
popular
graph
generator,
BTER,
developed
by
Seshadhri,
Kolda
and
Pinar.
Our
graph
generator
does
well
at
preserving
the
clustering
coefficient
and
typically
outperforms
BTER
in
matching
the
assortativity
coefficient,
particularly
when
the
assortativity
coefficient
is
negative.