Please note: This seminar will take place in DC 1304.
Ibrahim Numanagić
Canada Research Chair in Data Science and Computational Biology
Assistant Professor, Department of Computer Science
University of Victoria
The scale of biological data has increased exponentially over the last two decades. As Moore’s Law continues to slow, scientists can no longer rely on hardware improvements to keep pace with the ever-increasing size of biological data; they need both scalable algorithms and effective programming environments in which to implement and test them.
In the first part of the talk, I will take a short walk through “conventional” bioinformatics by presenting several computational methods for analyzing repeat regions of the human genome at scale. These methods (large-scale repeat detection, pharmacogene genotyping, and read alignment resolution) employ a range of algorithmic strategies to deal efficiently with NP-hard optimization problems and large-scale datasets.
Later, I will shift focus to the engineering side of the equation and argue that substantial improvements can be made if we pay closer attention to the programming languages used for bioinformatics software development. To that end, I will introduce Codon, a framework for accelerating Python codebases that can use domain-specific knowledge to improve software performance, and will show how we used it to accelerate bioinformatics pipelines. I will also introduce Sequre, a Codon-enabled Pythonic framework for secure multi-party computation that allows efficient yet secure computation on sensitive biomedical datasets. Finally, I will discuss our practical experience with using Codon in bioinformatics and secure computing, as well as its future prospects.
Bio: Ibrahim Numanagić is a Canada Research Chair in Data Science and Computational Biology and an Assistant Professor at the University of Victoria’s Department of Computer Science. Before that, he was a postdoctoral associate in the Computation and Biology Group at MIT CSAIL. He obtained his academic degrees from Simon Fraser University in Canada and the University of Sarajevo in Bosnia and Herzegovina.
His research interests include the development of scalable combinatorial algorithms, programming languages, tools, and development frameworks for secure and rapid analysis of genomic sequencing data.