Many recent studies of LLM performance have focused on the ability of LLMs to achieve outcomes comparable to humans on academic and professional exams. However, it is not clear whether such studies shed light on the extent to which models show reasoning ability, and there is controversy about the significance and implications of such results. We look more deeply at whether, and how, the performance of LLMs on exams designed for humans reflects true aptitude inherent in the models. We do so using the tools of psychometrics, which are designed for meaningful measurement in test taking. In the first part of the talk, we will demonstrate how we do this using a unique dataset that captures the detailed performance of over 5M students across 8 college-entrance exams given over a span of two years in Brazil. In the second part of the talk, we will discuss some open questions and problems we are considering in the area of LLM auditing.
Evimaria Terzi (Boston University)
Professor of Computer Science
Evimaria Terzi is a professor in the Department of Computer Science at Boston University (BU). She is also a founding faculty member of the Faculty of Computing and Data Sciences at BU. Before joining BU in 2009, she was a Research Staff Member at IBM Almaden Research Center. She received her PhD from the University of Helsinki in 2007, working under the supervision of Heikki Mannila. Evimaria works in the general area of algorithmic data mining, with emphasis on social networks, team formation, urban informatics and recommender systems. She has published more than 50 papers in premier conferences in data mining and data management (SIGKDD, SIGMOD, VLDB, WSDM, WWW, SDM). She has received multiple NSF awards, including the NSF CAREER award (2012), as well as the Microsoft Faculty Fellowship (2010) and numerous gifts from companies such as Google, Yahoo and Nokia. Evimaria has been the primary advisor of 7 graduated PhD students (4 female) and has supervised 2 post-docs. Her advisees are now faculty members or work at companies such as Meta, Amazon, Twitter and Apple. Evimaria has also acted as a research advisor to numerous undergraduates. In the past, Evimaria has been a visiting researcher at Microsoft Research, Sapienza University of Rome, Aalto University, MPI-SWS and the National Technical University of Athens.