Seminar • Artificial Intelligence • Robust and Trustworthy NLP Through the Lens of Text Summarization

Thursday, February 10, 2022 11:30 am - 11:30 am EST (GMT -05:00)

Please note: This seminar will be given online.

Yue Dong
School of Computer Science, McGill University
Mila

Natural language processing (NLP) offers incredible opportunities for automating tasks that involve human languages. However, numerous studies show that instead of learning, modern systems frequently memorize artifacts and biases. Furthermore, the texts produced by such models often contain factual errors.

In this talk, I’ll examine these fundamental issues through the lens of text summarization, a crucial but challenging task in NLP that involves condensing texts while retaining important information. I’ll discuss new training and inference algorithms that leverage reinforcement learning to ensure the model’s robustness. I will also highlight our efforts to improve generalization to data-scarce domains, such as scientific and medical texts. Towards trustworthy generation, I will discuss strategies for hallucination reduction that use edit-based models and adversarial training. Finally, I will wrap up with a discussion of our recent work on knowledge-enhanced text generation, which incorporates knowledge graphs (KGs) into text summarization.


Bio: Yue Dong is a final-year Ph.D. student in Computer Science at McGill University and Mila. Her primary research interests are text summarization and conditional text generation, with a focus on building robust and trustworthy language technology. She has gained extensive research experience in academic labs and through industrial internships (Google AI, AI2, Microsoft, and Huawei Noah’s Ark), with 13 papers (8 as first author or co-first author) published at top-tier NLP and ML conferences, including ACL, EMNLP, and AAAI. One of her first-author papers won the best paper award at the Canadian AI conference. She has also co-organized workshops and tutorials, including NewSum at EMNLP 2021, ENLSP at NeurIPS 2021, and the Text Editing Models tutorial at NAACL 2022.