Meet Yuntian Deng, a computer scientist who studies natural language processing and machine learning

Thursday, October 31, 2024

Yuntian Deng joined the Cheriton School of Computer Science as an Assistant Professor in August 2024. Before coming to the University of Waterloo, he was a Postdoctoral Researcher at the Allen Institute for AI (AI2), a non-profit research institute that conducts high-impact AI research and engineering in service of the common good.

Professor Deng holds a PhD in Computer Science from Harvard University, a Master’s degree in Language Technologies from Carnegie Mellon University, and a Bachelor of Engineering from Tsinghua University. His research focuses on advancing natural language processing (NLP) and machine learning.

He has received several fellowships and awards, including the NVIDIA Fellowship, the Baidu Fellowship, recognition as a University of Chicago Rising Star in Data Science, the ACL 2017 Best Demo Paper Runner-Up award, the ACM Gordon Bell Special Prize for COVID-19 Research, the Argonne National Laboratory Impact Award, and the DAC 2020 Best Paper Award. As of October 2024, his research papers have collectively been cited more than 5,100 times according to Google Scholar, and his h-index is 19.

What follows is a lightly edited transcript of a conversation with Professor Deng, where he discusses his research, advice for aspiring computer scientists, and his excitement about joining the Cheriton School of Computer Science.

Professor Yuntian Deng in the Davis Centre

Tell us a bit about your research. What do you see as your most significant contribution?

I work on language models, which are the underlying technology for chatbots like ChatGPT.

One of my recent projects is WildChat, where we built a chatbot that provided users with free access to ChatGPT in exchange for their consent to publish their data. We collected over a million chatbot conversations — actual ChatGPT conversations users had — to allow researchers to study real-world interactions and identify mistakes and model limitations.
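For researchers who want to explore the released conversations, a minimal sketch of loading and skimming the data might look like the following. It assumes the dataset is hosted on the Hugging Face Hub; the dataset identifier, split name, and per-record field names below are assumptions rather than details from this article.

```python
# A minimal sketch of loading WildChat conversations for analysis.
# Assumptions: the dataset lives on the Hugging Face Hub under
# "allenai/WildChat-1M" (identifier assumed), any access terms have been
# accepted, and the `datasets` library is installed (pip install datasets).
from datasets import load_dataset

# Stream rather than download: the corpus holds over a million conversations.
wildchat = load_dataset("allenai/WildChat-1M", split="train", streaming=True)

# Print the turns of the first three conversations, truncated for readability.
for i, record in enumerate(wildchat):
    # Field names ("conversation", "role", "content") follow common
    # chat-data conventions and are assumptions here.
    for turn in record.get("conversation", []):
        print(f"{turn.get('role', '?')}: {str(turn.get('content', ''))[:80]}")
    print("-" * 40)
    if i >= 2:
        break
```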

Another line of my work is called “implicit chain of thought reasoning,” which teaches language models to reason about complex problems by internalizing the reasoning process. A parallel can be drawn to human learning. When babies learn to walk, for example, they focus on individual muscle movements, but over time the process becomes internalized and unconscious, allowing them to build on this skill to learn more complex tasks, such as dancing. Similarly, we’ve demonstrated that, by internalizing these reasoning processes, smaller language models can be trained to predict outcomes directly, such as solving 20-digit by 20-digit multiplications, without writing out any intermediate steps.
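To make this concrete, here is a toy sketch of one way such a curriculum could be set up: training examples whose written-out reasoning steps are deleted a few at a time, so that by the final stage the model is asked to map the problem directly to its answer. The example problem, the prompt format, and the remove-from-the-front schedule are illustrative assumptions, not Professor Deng’s actual training setup.

```python
# A toy illustration of internalizing chain-of-thought reasoning:
# fine-tuning data is rebuilt in stages, each stage dropping one more
# reasoning step, until only the question and answer remain.
# Illustrative only; not Professor Deng's actual implementation.

def make_example(question: str, steps: list[str], answer: str,
                 steps_removed: int) -> str:
    """Format one training example with the first `steps_removed`
    reasoning steps deleted."""
    kept = " ".join(steps[steps_removed:])
    return f"Q: {question} {kept} A: {answer}".replace("  ", " ").strip()

question = "What is 12 * 34?"
steps = [
    "12 * 34 = 12 * 30 + 12 * 4.",
    "12 * 30 = 360.",
    "12 * 4 = 48.",
    "360 + 48 = 408.",
]
answer = "408"

# Curriculum: stage 0 keeps every step; the final stage keeps none, so the
# model must predict the answer directly. In practice one would fine-tune
# the model to convergence at each stage before removing the next step.
for stage in range(len(steps) + 1):
    print(f"stage {stage}: {make_example(question, steps, answer, stage)}")
```

The intent of the staged removal is that the model never loses the answer signal abruptly; each deleted step is gradually absorbed into the model’s internal computation.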

What challenges in natural language processing and machine learning do you find most exciting to tackle?

One of the challenges I find exciting is figuring out how to teach language models to reason and communicate in ways different from humans. Since their architecture is fundamentally different from that of the human brain, the optimal way for them to solve problems and communicate with each other may not resemble how we do it. For instance, while humans need step-by-step reasoning for complex tasks, language models can sometimes internalize these steps entirely, as seen in my implicit chain of thought work. The challenge is discovering the best way to teach them when we don’t know the optimal method ourselves.

The hope is to leverage the potential of language models by teaching them to reason in the most effective way, which is very likely to be different from the way humans reason.

What advice would you give to students interested in pursuing research in your area?

I would advise students to gain some research experience before committing to a PhD program in NLP. The landscape of NLP research is evolving rapidly, with simple, scalable methods often proving more impactful than intricate, hand-designed systems. A significant portion of progress comes from system engineering efforts, often driven by large collaborative teams.

Before committing to a PhD, I recommend that students explore what doing NLP research feels like today by joining a lab or replicating papers they like, and only commit if they find themselves deeply excited by the research and willing to spend their most productive years on it.

Do you see opportunities for collaborative research at the Cheriton School of Computer Science?

Absolutely. Waterloo’s CS department is large, with experts in many different areas. I’m already collaborating with Professor Pengyu Nie, an expert in software engineering, on a coding-related project. Software engineering is a promising application area for testing the reasoning abilities of language models. I see many more opportunities to collaborate as I get to know my other colleagues.

What aspect of joining the Cheriton School of Computer Science excites you the most?

I’m most excited to be joining a department with so many experts doing impactful work. It’s an honour to become colleagues with them.

Who has inspired you in your career?

My PhD advisor, Professor Alexander (Sasha) Rush, inspired me to pursue a career in academia. I greatly enjoyed my PhD, and I hope to become an advisor like him. My co-advisor, Professor Stuart Shieber, also had a big influence on me, teaching me to think with a long-term perspective and consider what will matter in the long run. I was also inspired by my postdoc advisor, Professor Yejin Choi, whose career path and perseverance showed me how pursuing an initially underappreciated line of work can eventually lead to impactful, widely accepted research.

They encouraged me to pay it forward, and I now strive to support my students with the same dedication and mentorship I received.

What do you do in your spare time?

I used to spend my spare time swimming, hiking, skiing, reading, and road tripping. But since becoming a parent, most of my spare time is now spent enjoying activities with my son.