2017 Cheriton Research Symposium

Thursday, September 28, 2017 3:00 pm - Friday, September 29, 2017 4:30 pm EDT (GMT -04:00)

The David R. Cheriton School of Computer Science will hold its annual Cheriton Research Symposium September 28–29, 2017 in the Davis Centre.

This year's research symposium will consist of talks by industry leaders and members of the School. If you missed a talk, don't worry: we recorded the talks on video and have embedded them at the bottom of this page.

Posters by David R. Cheriton Graduate Scholarship recipients will be on display in the Great Hall, Davis Centre from 10:00 am to 3:00 pm on September 29, 2017.

Schedule 

Thursday, September 28, 2017

3:00 p.m. – DC 1302 – Mark Giesbrecht – Welcome and Opening Remarks

3:15 p.m. – DC 1302 – David R. Cheriton, Stanford University

The Age of Incompetence and Software Evolution

Human beings have become increasingly incompetent at performing almost any useful task because of automation, and this trend is destined to continue, if not accelerate. You would not hire a person to do a job if they were 10 times slower at it, could work less than a quarter of the time, made far more mistakes, and cost you far more than an alternative hire; the comparison to automation is even more extreme. Realistically, humans are pathetic and are now becoming dangerous. The monkeys could run the jungle, but they are not competent to run a zoo, and they are dangerous if they have any control over an automated zoo. How can human civilization survive this age as we transition to full automation? But wait, don’t these humans write the software that is at the core of automation?

I claim we are rapidly moving to a model in which software evolves in the Darwinian sense, and is not “written” in the conventional sense. Humans feed in software mutations and the environment decides which mutations survive, and which don’t. Thus, we pathetic human beings don’t have to be smarter than the software to make the software smarter. I will talk about some approaches I see for better software evolution.
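
As a rough, hypothetical illustration of that mutate-and-select loop (a toy sketch only; the talk does not describe a concrete system, and the mutation operator, fitness test and population below are invented for illustration):

    import random

    def evolve(population, mutate, fitness, generations=100):
        """Toy Darwinian loop: propose mutations, then let the 'environment'
        (here, a fitness function) decide which variants survive."""
        for _ in range(generations):
            # Humans (or tools) feed in candidate mutations...
            candidates = population + [mutate(p) for p in population]
            # ...and the environment keeps only the fittest variants.
            candidates.sort(key=fitness, reverse=True)
            population = candidates[:len(population)]
        return population

    # Hypothetical example: evolve a constant toward a target value.
    target = 42.0
    best = evolve(
        population=[0.0] * 8,
        mutate=lambda x: x + random.uniform(-1.0, 1.0),
        fitness=lambda x: -abs(x - target),
    )
    print(best[0])  # drifts toward 42.0 as generations accumulate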

4:00 p.m. – DC 1302 – Steve Woods and Verna Friesen, Google Waterloo

28 Years of UW Community from Two CS Grads (1989–2017)

The University of Waterloo has been credited as “...a source of [highly diverse] graduates with sparkling new ideas.” Industry giants like Google, along with burgeoning startups in the Waterloo Region, Silicon Valley and beyond, have been shaped by UW Math/CS alumni. In this talk, Math/CS alumni Dr. Steven Woods (Sr Engineering Director & Engineering Site Lead, Google) and Verna Friesen (Sr Software Engineering Manager, Google) will share their stories, exploring the value that UW and the Computer Science school have added to the global tech community, the community values being exported around the world with each alumnus, and the challenges we must all face and overcome as we work to see this community reach its true potential for global impact.

Friday, September 29, 2017

10:00 a.m. – DC 1302 – Matei Zaharia, Stanford University

Composable Parallel Processing in Apache Spark and Weld

Giving every developer easy access to modern, massively parallel hardware, whether at the scale of a datacenter or a single modern server, remains a daunting challenge. In this talk, I’ll cover one powerful weapon we can use to meet this challenge: enabling efficient “composition” of parallel programs. Composition is arguably the main way developers are productive writing software, but unfortunately, it has taken a back seat in the design of many parallel processing APIs. For example, composing MapReduce jobs required writing data to files between each job, which was slow and error-prone, and many single-machine parallel libraries face similar problems. I’ll show how composability enabled much higher productivity in the Apache Spark API, and how this idea has been taken much further in recent versions of Spark with “structured” APIs such as DataFrames and Spark SQL. In addition, I’ll discuss Weld, a research project at Stanford that aims to enable much more efficient composition between parallel libraries on a single server (on either CPUs or GPUs). We show that the traditional way of composing libraries in this setting, through function calls that exchange data through memory, can create order-of-magnitude slowdowns. In contrast, Weld can transparently speed up applications using libraries such as NumPy, Pandas and TensorFlow by up to 30x through a novel API that lets it optimize across the library calls used in each program.
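
As a rough illustration of the composable, “structured” style the abstract refers to, here is a minimal PySpark sketch (the data set and column names are hypothetical, not from the talk):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("compose-demo").getOrCreate()

    # Hypothetical event log: each row is (user, country, latency_ms).
    events = spark.createDataFrame(
        [("alice", "CA", 120), ("bob", "US", 80), ("alice", "CA", 95)],
        ["user", "country", "latency_ms"],
    )

    # DataFrame operations compose without materializing intermediate files:
    # Spark plans the filter, groupBy and aggregate together as one job.
    slow_by_country = (
        events
        .filter(F.col("latency_ms") > 90)
        .groupBy("country")
        .agg(F.avg("latency_ms").alias("avg_latency_ms"))
    )

    slow_by_country.show()

Contrast this with chaining separate MapReduce jobs, where each stage would have to write its output to files before the next stage could read it.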

11:00 a.m. Poster Session – David R. Cheriton Graduate Scholarship recipients
12:00 p.m. Lunch
1:00 p.m. Poster Session – David R. Cheriton Graduate Scholarship recipients

2:00 p.m. – DC 1302 – Tim Brecht, University of Waterloo

Understanding and Improving 802.11 (WiFi) Network Performance

Forecasts predict that 3 billion WiFi devices will be shipped this year and more than 9 billion devices will be in use by the end of the year. These large numbers are driven by smart phones, tablets, laptops, entertainment devices, and a growing number of devices used in the Internet of Things. Despite the introduction of 4G and 5G cellular data technologies, 802.11 (WiFi) networks are the dominant technology used for wireless communications on mobile devices. In 2016, 60% of potential cellular network traffic was off-loaded to WiFi networks and this number is expected to increase.

Obtaining peak throughput in 802.11 networks depends on software that chooses the combination of physical layer features (the transmission data rate) best suited for the current channel conditions. For 802.11n and 802.11ac devices there can be up to 128 and 640 combinations, respectively. The goal, for each packet, is to choose the combination that maximizes throughput by trading off high transmission data rates with low frame error rates.
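
A minimal sketch of that trade-off (illustrative only; the candidate rates and frame error rates below are invented numbers, not measurements from this work):

    # Candidate physical-layer settings: (data rate in Mb/s, estimated frame error rate).
    # A real rate-adaptation algorithm estimates the error rate from recent
    # transmissions under the current channel conditions.
    candidates = [
        (6.5, 0.00),
        (65.0, 0.05),
        (130.0, 0.25),
        (300.0, 0.80),
    ]

    def expected_throughput(rate_mbps, fer):
        # Higher rates carry more bits per frame, but only frames that
        # arrive intact count toward useful throughput.
        return rate_mbps * (1.0 - fer)

    best_rate, best_fer = max(candidates, key=lambda c: expected_throughput(*c))
    print(f"Pick {best_rate} Mb/s (FER {best_fer:.0%}), "
          f"expected throughput {expected_throughput(best_rate, best_fer):.1f} Mb/s")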

In this talk, I will describe research we are conducting to characterize, better understand, and improve the operation of WiFi networks. We have found that interesting relationships exist between the large number of transmission data rates that can be chosen when sending data. Some exciting and interesting properties of these relationships include: (1) they persist even when mobile devices create highly variable channel conditions; (2) they may change over time; (3) despite such changes, relationships have been found to exist over periods of up to one hour. After describing these relationships and our findings, I will show some results from an example application of this work, describe other potential implications of this research, and outline several compelling avenues for future work.

This is joint work with Ali Abedi.

2:45 p.m. Coffee break

3:00 p.m. – DC 1302 – Shai Ben-David, University of Waterloo

Clustering — What both theoreticians and practitioners are doing wrong

Unsupervised learning is widely recognized as one of the most important challenges facing machine learning nowadays. However, in spite of hundreds of papers on the topic being published every year, current theoretical understanding and practical implementations of such tasks, and in particular of clustering, are very rudimentary.

My talk will focus on clustering. The first challenge I will address is model selection — how should a user pick an appropriate clustering tool for a given clustering problem, and how should the parameters of such an algorithmic tool be tuned? In contrast with other common computational tasks, in clustering, different algorithms often yield drastically different outcomes. Therefore, the choice of a clustering algorithm may play a crucial role in the usefulness of an output clustering solution. However, currently there exists no methodical guidance for clustering tool selection for a given clustering task. I will explain the severity of this problem and describe some recent proposals aiming to address this crucial lacuna.

The second aspect of clustering that I will address is the complexity of computing a cost-minimizing clustering (given some clustering objective function). While most of the clustering objective optimization problems are computationally infeasible, they are being carried out routinely in practice. This theory-practice gap has attracted significant research attention recently. I will survey some of the theoretical attempts to address this gap and discuss how close they bring us to a satisfactory understanding of the computational resources needed for achieving good clustering solutions.
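
As a concrete instance of such an objective (the abstract does not single one out; k-means is simply a standard example), the cost of a clustering is the sum of squared distances from each point to its assigned center. Minimizing this cost exactly is NP-hard in general, yet heuristics for it run routinely in practice. A minimal sketch of the cost computation, with made-up data:

    import numpy as np

    def kmeans_cost(points, centers, labels):
        """k-means objective: sum of squared distances from each point
        to the center of the cluster it is assigned to."""
        return float(np.sum((points - centers[labels]) ** 2))

    # Tiny hypothetical data set with two obvious clusters.
    points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
    centers = np.array([[0.05, 0.1], [5.1, 4.95]])
    labels = np.array([0, 0, 1, 1])

    print(kmeans_cost(points, centers, labels))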

Poster session winners

In total, 29 graduate students participated in the 2017 Cheriton Research Symposium poster competition.

Congratulations to the competition winner and runners-up!

  • Corwin Sinnamon — Winner, $300 prize • A Faster Data Structure for Distributive Lattices
  • Ivana Kajić — Runner up, $200 prize • A Biologically Plausible Neural Network Model of Semantic Memory Search 
  • Akshay Ramachandran — Runner up, $200 prize • Paulsen Problem and Operator Scaling

Videos of the symposium presenters

David R. Cheriton • Stanford University • The Age of Incompetence and Software Evolution

Steve Woods and Verna Friesen • Google Waterloo • 28 Years of UW Community from Two CS Grads (1989–2017)

Matei Zaharia • Stanford University • Composable Parallel Processing in Apache Spark and Weld

Tim Brecht • University of Waterloo • Understanding and Improving 802.11 (WiFi) Network Performance

Shai Ben-David • University of Waterloo • Clustering — What Both Theoreticians and Practitioners are Doing Wrong
