Engineering 2, 1156 High Street, Santa Cruz, California 95064

Presenter: Honglei Zhuang

Description: While large language models (LLMs) exhibit strong zero-shot capabilities, carefully designed inference-time strategies are crucial for unlocking their full potential. This talk delves into two tasks where this is particularly evident: text ranking and retrieval-augmented generation (RAG). In text ranking, we investigate inference-time prompting strategies that elicit relevance judgments and ranking preferences from LLMs, illustrating how to build effective zero-shot text rankers. In RAG, we show that combining inference-time strategies such as in-context demonstrations and iterative prompting enables LLMs to make better use of long context windows, improving performance in the long-context regime beyond what simply increasing the number of retrieved documents achieves. We also develop a quantitative model that predicts the optimal strategy for a given context window budget.


Bio: Honglei Zhuang is a research scientist at Google DeepMind. His research interests include information retrieval, natural language processing, data mining, and machine learning. He is particularly interested in building next-generation technology that changes how people access and leverage information with and for LLMs.


Hosted by: Professor Hao Ye / ECE


Zoom link: https://ucsc.zoom.us/j/99916319246?pwd=wlipSeBqozlxwWF9KkqutaHnVy8gN9.1
 
