Candidate: Ruizhou Xu
Title: Compression and Analysis of Pre-trained Language Model with Neural Slimming
Date: August 3, 2022
Time: 11:00
Place: online
Supervisor(s): Karray, Fakhri

Abstract:
In recent years, neural networks have become powerful tools for supporting decision making and solving complex problems. In the domain of natural language processing, BERT and its variants significantly outperform other network structures. These models learn general linguistic knowledge from a large corpus in the pre-training stage and then use it to solve downstream tasks. However, such models are enormous, which leads to overparameterization and makes deployment on small edge devices less scalable and flexible. In this thesis, we study how to compress the BERT model in a structured pruning manner. We propose the neural slimming technique to assess the importance of each neuron and design a cost function and pruning strategy to remove neurons that contribute little or nothing to the prediction. After fine-tuning on downstream tasks, the model learns a more compact structure, which we name SlimBERT. We tested our method on 7 GLUE tasks and recovered 94% of the original accuracy using only 10% of the original parameters. It also reduced runtime memory and increased inference speed. Compared with knowledge distillation and other structured pruning methods at the same compression ratio, the proposed approach achieved better performance under different metrics. Moreover, our method also improved the interpretability of BERT. By analyzing the neurons with significant contributions, we observe that BERT utilizes different components and subnetworks depending on the task.
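
The following is a minimal illustrative sketch of the general idea of structured pruning with learnable per-neuron importance scores and an L1 sparsity cost, not the exact formulation from the thesis; the class name, threshold, and penalty weight are assumptions for illustration only.

    # Sketch only: learnable per-neuron importance scores with an L1
    # sparsity penalty, a common pattern for structured pruning.
    # Names and values are illustrative, not the thesis implementation.
    import torch
    import torch.nn as nn


    class SlimmedLinear(nn.Module):
        """Linear layer whose output neurons are gated by learnable scores."""

        def __init__(self, in_features: int, out_features: int):
            super().__init__()
            self.linear = nn.Linear(in_features, out_features)
            # One importance score per output neuron, initialized to 1.
            self.scores = nn.Parameter(torch.ones(out_features))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.linear(x) * self.scores

        def sparsity_cost(self) -> torch.Tensor:
            # L1 penalty drives the scores of unimportant neurons toward zero.
            return self.scores.abs().sum()

        def prune(self, threshold: float = 1e-2) -> None:
            # Zero out neurons whose learned score falls below the threshold;
            # a full implementation would physically remove those rows.
            mask = (self.scores.abs() >= threshold).float()
            with torch.no_grad():
                self.scores.mul_(mask)


    # Training outline: add the sparsity cost to the downstream task loss.
    layer = SlimmedLinear(768, 3072)
    x = torch.randn(8, 768)
    task_loss = layer(x).pow(2).mean()          # stand-in for a real task loss
    loss = task_loss + 1e-4 * layer.sparsity_cost()
    loss.backward()
    layer.prune()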