The talk presents open problems in the study of trustworthy machine learning. We begin by broadly characterizing the attack surface of modern machine learning algorithms. We then illustrate the challenges of having end users trust that machine learning algorithms were deployed responsibly, i.e., of letting them verify the algorithms' trustworthiness, through a deep dive into the problem of unlearning. The need for machine unlearning, i.e., obtaining the model one would get without training on a subset of data, arises from privacy legislation and as a potential solution to data poisoning or copyright claims. As we present different approaches to unlearning, it becomes clear that they fail to answer the following question: how can end users verify that unlearning was successful? We show how an entity can claim plausible deniability when challenged about an unlearning request it claimed to have processed, and conclude that, at the level of model weights, being unlearnt is not always a well-defined property. Put another way, we find that unlearning is an algorithmic property. Taking a step back, we draw lessons for the broader area of trustworthy machine learning. In order for companies, regulators, and countries to verify meaningful properties at the scale required for stable governance of AI algorithms, both nationally and internationally, our insight is that ML algorithms need to be co-designed with cryptographic protocols.
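As a point of reference for the definition used above, the sketch below illustrates "exact" unlearning by retraining from scratch on the retained data; this is not the talk's method, and the names (train_model, exact_unlearn, forget_idx) and the scikit-learn estimator are illustrative assumptions.

```python
# Minimal sketch: exact unlearning as retraining without the forgotten points.
# Assumes a generic scikit-learn-style estimator; all names are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_model(X, y):
    """Train a fresh model on the given data."""
    return LogisticRegression(max_iter=1000).fit(X, y)

def exact_unlearn(X, y, forget_idx):
    """Return the model one would have obtained without the forgotten points."""
    keep = np.setdiff1d(np.arange(len(X)), forget_idx)
    return train_model(X[keep], y[keep])

# Usage: retrain after removing the first 10 training points.
X, y = np.random.randn(200, 5), np.random.randint(0, 2, 200)
model_after_unlearning = exact_unlearn(X, y, forget_idx=np.arange(10))
```

Even with this simple baseline, an auditor inspecting only the resulting weights generally cannot tell whether the forgotten points were ever used, which is the verification gap the abstract points to.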
Nicolas Papernot (University of Toronto)
Assistant Professor
Dr. Nicolas Papernot is an Assistant Professor of Electrical and Computer Engineering at the University of Toronto and a Faculty Member at the Vector Institute, where he holds a Canada CIFAR AI Chair. His research interests lie broadly at the intersection of computer security, privacy, and machine learning.