1 hour
Data Sciences Institute, University of Toronto
Free Tickets Available
Mon, 26 Jan, 2026 at 11:00 am to 12:00 pm (GMT-05:00)
Data Sciences Institute, University of Toronto
700 University Avenue, Toronto, Canada
LLM Post-Training and Reasoning via Efficient Value-Based RL
Reinforcement learning (RL) has a newfound killer application: post-training LLMs, which are pre-trained to predict the next token, so that they adapt to tasks like instruction following, math problem solving, and generating content or recommendations that maximize user outcomes. But are the same RL algorithms that animated robots and conquered Atari the right ones for post-training LLMs? In this talk I will present new value-based algorithms for post-training and for scaling test-time compute. These algorithms leverage both the unique structure of autoregressive LLMs and recent advances in improving efficiency by changing the Q-learning loss function. I will show how (and argue why) these new algorithms achieve state-of-the-art performance on frontier math reasoning tasks with smaller models and at a fraction of the test-time FLOPs.
Biography:
Prof. Kallus’ research interests include causal inference, especially when combined with machine learning; the statistics of optimization under uncertainty; sequential and dynamic decision making; and algorithmic fairness. He is the author of the book “Applied Causal Inference Powered by ML and AI”. Before coming to Cornell, Nathan was a Visiting Scholar at USC’s Department of Data Sciences and Operations and a Postdoctoral Associate at MIT’s Operations Research and Statistics group.
This talk is co-sponsored by the Data Sciences Institute and the Master of Management Analytics Program (MMA), Rotman School of Management, University of Toronto.
For more information, please visit https://datasciences.utoronto.ca/dsi-home/data-sciences-speaker-series/.
Tickets for Data Sciences Institute - Data Speaker Series can be booked here.
| Ticket type | Ticket price |
|---|---|
| General Admission | Free |