Monday, October 21, 2024 10:40am to 11:45am
About this Event
Engineering 2, 1156 High Street, Santa Cruz, California 95064
Presenter: Honglei Zhuang
Description: While Large Language Models (LLMs) exhibit strong zero-shot capabilities, carefully designed inference-time strategies are crucial for unlocking their full potential. This talk delves into two tasks where this is particularly evident: text ranking and retrieval-augmented generation (RAG). In text ranking, we investigate inference-time prompting strategies to elicit relevance judgments and ranking preferences from LLMs, illustrating how to create effective zero-shot text rankers from LLMs. In RAG, we show that combining sophisticated inference-time strategies, such as incorporating demonstrations and iterative prompting, enables LLMs to make fuller use of long context windows, yielding gains in the long-context regime beyond those from simply increasing the number of retrieved documents. We also develop a quantitative model to predict the optimal strategies for a given context window budget.
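To make the text-ranking idea concrete, the sketch below shows one way an LLM can be prompted for graded relevance judgments and the resulting scores used to rank retrieved passages. It is a minimal illustration only: the prompt wording, the 0-3 relevance scale, and the `call_llm` stand-in for a completion API are assumptions for this example, not the specific method presented in the talk.

```python
# Minimal sketch of a zero-shot pointwise LLM ranker.
# `call_llm` is a hypothetical stand-in for whatever text-completion
# API you use; the prompt and 0-3 scale are illustrative assumptions.
from typing import Callable, List, Tuple

PROMPT = (
    "Judge the relevance of the passage to the query on a scale of 0 to 3,\n"
    "where 0 = not relevant and 3 = perfectly relevant. "
    "Reply with a single number.\n\n"
    "Query: {query}\n"
    "Passage: {passage}\n"
    "Relevance:"
)

def score(call_llm: Callable[[str], str], query: str, passage: str) -> float:
    """Elicit a graded relevance judgment for one query-passage pair."""
    reply = call_llm(PROMPT.format(query=query, passage=passage))
    try:
        return float(reply.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0  # fall back to "not relevant" if the reply is unparseable

def rank(call_llm: Callable[[str], str], query: str,
         passages: List[str]) -> List[Tuple[float, str]]:
    """Sort retrieved passages by their elicited relevance scores."""
    scored = [(score(call_llm, query, p), p) for p in passages]
    return sorted(scored, key=lambda x: x[0], reverse=True)
```

Pairwise and listwise prompting variants, which ask the model to compare or order passages directly, follow the same pattern with a different prompt and output parser.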
Bio: Honglei Zhuang is a research scientist at Google DeepMind. His research interests include information retrieval, natural language processing, data mining, and machine learning. He is particularly interested in building next-generation technology to revolutionize how people access and leverage information with and for LLMs.
Hosted by: Professor Hao Ye / ECE
Meeting ID: 999 1631 9246
Passcode: 503045