Wednesday, November 10, 2021, 12:00 pm EST (GMT -05:00)
Probability/Statistics & Biostatistics seminar series
Pragya Sur
Link to join seminar: Hosted on Zoom
Precise high-dimensional asymptotics for AdaBoost via max-margins & min-norm interpolants
This talk will introduce a precise high-dimensional asymptotic theory for AdaBoost on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the max-min-L1-margin and the min-L1-norm interpolant.
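As a reference point, these two objects can be written as follows (our notation, not necessarily the speaker's; $x_i \in \mathbb{R}^p$ are the features and $y_i \in \{\pm 1\}$ the labels):

\[
  \kappa_n \;=\; \max_{\|\theta\|_1 \le 1} \; \min_{1 \le i \le n} \, y_i \langle x_i, \theta \rangle
  \qquad \text{(max-min-L1-margin)}
\]
\[
  \hat\theta_n \;=\; \operatorname*{arg\,min}_{\theta} \Big\{ \|\theta\|_1 \;:\; y_i \langle x_i, \theta \rangle \ge 1 \ \text{for all } i \Big\}
  \qquad \text{(min-L1-norm interpolant)}
\]

When the data is separable ($\kappa_n > 0$), the two are dual to each other, with $\|\hat\theta_n\|_1 = 1/\kappa_n$, which is why an exact characterization of one yields the other.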
In turn, this will characterize the generalization error of AdaBoost when the algorithm interpolates the training data and maximizes an empirical L1 margin. On the computational front, we will provide a sharp analysis of the stopping time when boosting approximately maximizes the empirical L1 margin.
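To make the boosting-to-margin connection concrete, here is a minimal toy sketch (not the speaker's code) of epsilon-boosting, i.e., small-step coordinate descent on the exponential loss with coordinates as weak learners, run until it interpolates a synthetic separable dataset; all names and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: n samples, p features with p > n, so the
# data is linearly separable with high probability.
n, p = 100, 400
X = rng.standard_normal((n, p))
theta_true = np.zeros(p)
theta_true[:5] = 1.0
y = np.sign(X @ theta_true + 0.1 * rng.standard_normal(n))

eps = 0.1            # small step size ("shrinkage"); tracks the L1 path
theta = np.zeros(p)

for t in range(5000):
    margins = y * (X @ theta)
    if margins.min() > 0:
        # Interpolation reached: report the normalized empirical L1 margin
        print(f"interpolated at round {t}; "
              f"min margin / ||theta||_1 = {margins.min() / np.abs(theta).sum():.4f}")
        break
    # AdaBoost's exponential reweighting of the training points
    w = np.exp(-margins)
    w /= w.sum()
    # Pick the feature (weak learner) most correlated with the weighted labels
    corr = X.T @ (w * y)
    j = np.argmax(np.abs(corr))
    theta[j] += eps * np.sign(corr[j])
```

Rerunning the sketch with a larger p at fixed n typically reaches interpolation in fewer rounds, consistent with the p/n insight discussed next.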
Our theory provides several insights into properties of AdaBoost; for instance, the larger the dimensionality ratio p/n, the faster the optimization reaches interpolation.
Our statistical and computational arguments can handle (1) finite-rank spiked covariance models for the feature distribution and (2) variants of AdaBoost corresponding to general Lq-geometry, for q in [1,2].
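In the same notation as above (again ours), the Lq variants replace the L1 ball in the margin problem by an Lq ball:

\[
  \kappa_n^{(q)} \;=\; \max_{\|\theta\|_q \le 1} \; \min_{1 \le i \le n} \, y_i \langle x_i, \theta \rangle, \qquad q \in [1, 2],
\]

so that q = 1 recovers the AdaBoost margin above, while q = 2 gives the max-L2-margin familiar from hard-margin support vector machines.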
This is based on joint work with Tengyuan Liang.