Improving the Control and Evaluation of Large Models

Published: 2025-09-22 | Source: ShanghaiTech University | Editor: System Administrator

Given the increasing ubiquity of AI systems, it is of the utmost importance to ensure that they can represent a diverse set of human values and perspectives. Meanwhile, progress on this type of control of Large Language Models (LLMs) has been hampered by a fundamental underlying problem in generative models: the difficulty of evaluating their behavior. With their enormous range of inputs and responses, and the wide variety of intended applications and dimensions along which to judge the responses, evaluation is extremely challenging. I will describe two strands of research to help address these needs. The first aims to develop approaches that permit reliable and fine-grained control over LLM generations, allowing them to be steered towards a broad set of viewpoints. The second aims to rigorously combine model-generated labels of an LLM's responses with a small pool of ground-truth labels, to allow estimates of different aspects of the range of responses. I will describe quantitative and qualitative evaluations of these methods on a number of benchmark datasets, spanning public opinion data, educational tutoring, economic disparities, and LLM toxicity and summarization quality.

Short Bio of the Speaker: Richard Zemel is the Trianthe Dakolias Professor of Engineering and Applied Science in the Computer Science Department at Columbia University. He is the Director of the NSF AI Institute for Artificial and Natural Intelligence (ARNI), and was the co-founder and inaugural Research Director of the Vector Institute for Artificial Intelligence. He is a Canadian Institute for Advanced Research AI Chair and is on the Advisory Board of the Neural Information Processing Society. His awards include an AI Lifetime Achievement Award (CAIA) and a Pioneer of AI Award (NVIDIA). His research contributions include foundational work on systems that learn useful representations of data with little or no supervision; graph-based machine learning; and algorithms for fair and robust machine learning.