PhD Seminar • Information Retrieval • Enhancing Zero-Shot Text Retrieval with Large Language Models

Wednesday, January 17, 2024 — 12:30 PM to 1:30 PM EST

Please note: This PhD seminar will take place in DC 1304.

Xueguang Ma, PhD candidate
David R. Cheriton School of Computer Science

Supervisor: Professor Jimmy Lin

Neural retrieval systems have proven effective across a range of tasks and languages. However, creating a fully zero-shot neural retrieval pipeline remains a challenge when relevance labels are not available.

In this presentation, I will introduce two of our works, Hypothetical Document Embeddings (HyDE) and Listwise Reranker with a Large Language Model (LRL), which leverage large language models to enhance text retrieval without the need for human relevance judgments.

HyDE uses a large language model to generate a ‘hypothetical’ document for a given query. Such a document captures relevance patterns but is not real and may contain hallucinations. The hypothetical document is then encoded into an embedding vector by an unsupervised dense retriever, such as Contriever. This vector identifies a neighbourhood in the corpus embedding space, from which similar real documents are retrieved. HyDE significantly outperforms the state-of-the-art unsupervised dense retriever, Contriever, and demonstrates effectiveness comparable to that of supervised dense retrievers.

LRL introduces a listwise reranking paradigm in which a large language model is prompted to generate a reordered list of document identifiers from the given candidate documents. LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also function as a final-stage reranker to enhance the top-ranked results of a pointwise method. Experiments on web search and multilingual information retrieval datasets show the effectiveness of our proposed methods.
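
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the speakers' implementation. The bag-of-words `embed` function is a toy stand-in for a dense retriever such as Contriever, the `fake_generate` and `fake_rerank_llm` functions are hard-coded stand-ins for a large language model, and the prompt wording, the `[n] > [m]` response format, and all function names are assumptions made for illustration only.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

# Toy corpus (hypothetical data, for illustration only).
CORPUS: Dict[str, str] = {
    "d1": "Bread rises because yeast ferments sugar and releases carbon dioxide gas",
    "d2": "The stock market closed higher on Tuesday",
    "d3": "Sourdough starter contains wild yeast and bacteria",
}


def embed(text: str) -> Counter:
    """Toy bag-of-words 'encoder'; a stand-in for a dense retriever."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(query: str, corpus: Dict[str, str],
                generate: Callable[[str], str], k: int = 2) -> List[str]:
    """HyDE-style retrieval: embed a generated document instead of the query."""
    hypothetical = generate(query)   # may contain hallucinations
    vec = embed(hypothetical)        # encode the hypothetical document
    ranked = sorted(corpus, key=lambda d: cosine(vec, embed(corpus[d])),
                    reverse=True)
    return ranked[:k]                # nearest real documents


def listwise_rerank(query: str, candidates: List[str], corpus: Dict[str, str],
                    llm: Callable[[str], str]) -> List[str]:
    """Listwise reranking: ask the model for an ordering of numbered candidates."""
    lines = [f"[{i}] {corpus[d]}" for i, d in enumerate(candidates, start=1)]
    prompt = (f"Query: {query}\n" + "\n".join(lines) +
              "\nRank the passages by relevance, e.g. [2] > [1]:")
    response = llm(prompt)
    reordered: List[str] = []
    for n in (int(m) for m in re.findall(r"\[(\d+)\]", response)):
        # Ignore out-of-range or repeated identifiers in the model output.
        if 1 <= n <= len(candidates) and candidates[n - 1] not in reordered:
            reordered.append(candidates[n - 1])
    # Keep any candidates the model omitted, in their original order.
    reordered += [d for d in candidates if d not in reordered]
    return reordered


def fake_generate(query: str) -> str:
    """Hard-coded stand-in for an LLM writing a hypothetical answer document."""
    return "Dough rises because yeast ferments sugar and produces carbon dioxide."


def fake_rerank_llm(prompt: str) -> str:
    """Hard-coded stand-in for an LLM returning a partial ranking."""
    return "[3] > [2]"


hyde_hits = hyde_search("why does dough get bigger", CORPUS, fake_generate)
reranked = listwise_rerank("why does dough get bigger",
                           ["d2", "d3", "d1"], CORPUS, fake_rerank_llm)
```

In this toy setup the raw query shares no terms with any document, while the generated hypothetical document overlaps heavily with the relevant one, which is the intuition behind searching with the hypothetical document's embedding. The reranking function also shows one way to tolerate an incomplete model response by appending any candidates the model left out.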

Location 
DC - William G. Davis Computer Research Centre
DC 1304
200 University Avenue West

Waterloo, ON N2L 3G1
Canada