Special Talk: Verifiable Approaches to Trustworthy Machine Learning: Lessons from Researching Unlearning
The talk presents open problems in the study of trustworthy machine learning. We begin by broadly characterizing the attack surface of modern machine learning algorithms. We then illustrate the challenges of having end users trust that machine learning algorithms were deployed responsibly, i.e., of verifying their trustworthiness, through a deep dive into the problem of unlearning. The need for machine unlearning, i.e., obtaining the model one would have gotten without training on a subset of the data, arises from privacy legislation and as a potential solution to data poisoning or copyright claims. As we present different approaches to unlearning, it becomes clear that they fail to answer the following question: how can end users verify that unlearning was successful? We show how an entity can claim plausible deniability when challenged about an unlearning request it claimed to have processed, and conclude that, at the level of model weights, being unlearnt is not always a well-defined property. Put another way, we find that unlearning is an algorithmic property. Taking a step back, we draw lessons for the broader area of trustworthy machine learning. Our insight is that, for companies, regulators, and countries to verify meaningful properties at the scale required for stable governance of AI algorithms both nationally and internationally, ML algorithms need to be co-designed with cryptographic protocols.
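To make the retrain-from-scratch definition of unlearning concrete, the short Python sketch below (not material from the talk) uses a toy least-squares fit as a stand-in training algorithm; the names (train, forget_idx) and the final weight comparison are illustrative assumptions. It also hints at the abstract's point that inspecting final weights alone cannot certify that a subset was excluded, since unlearning is a property of the training procedure.

    # Minimal, illustrative sketch of the "exact unlearning" baseline:
    # the unlearned model is defined as the model obtained by retraining
    # from scratch without the forgotten subset. Toy setup, assumed names.
    import numpy as np

    rng = np.random.default_rng(0)

    def train(X, y):
        """Closed-form least-squares fit; stands in for any training algorithm."""
        w, *_ = np.linalg.lstsq(X, y, rcond=None)
        return w

    # Toy dataset: 100 points, 5 features.
    X = rng.normal(size=(100, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)

    forget_idx = np.arange(10)                        # subset requested to be unlearned
    keep = np.setdiff1d(np.arange(100), forget_idx)   # retained data

    w_full = train(X, y)                    # model trained on all data
    w_unlearned = train(X[keep], y[keep])   # exact unlearning: retrain without the subset

    # The two weight vectors are typically very close, illustrating why
    # looking at weights alone cannot verify that unlearning took place:
    # the property lives in the algorithm, not in the final parameters.
    print(np.linalg.norm(w_full - w_unlearned))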
Biography: Dr. Nicolas Papernot is an Assistant Professor of Electrical and Computer Engineering at the University of Toronto and a Faculty Member at the Vector Institute, where he holds a Canada CIFAR AI Chair. His research interests are broadly at the intersection of computer security, privacy, and machine learning. Professor Papernot earned his PhD in Computer Science and Engineering from the Pennsylvania State University, supported by a Google PhD Fellowship in Security and Privacy. His PhD research focused on characterizing the attack surface of machine learning systems and inventing defense mechanisms to improve their security and privacy. Prior to joining the University of Toronto, Nicolas spent a year as a research scientist at Google Brain. His work has been applied in industry and academia to evaluate and improve the robustness of machine learning models to input perturbations known as adversarial examples, as well as to deploy machine learning at industry scale with privacy guarantees for training data. He was invited to write technical articles for the CACM and IEEE Security and Privacy Magazine. Nicolas is a program committee member of ACM CCS, IEEE S&P, and USENIX Security. He also reviews for ICML, ICLR, and NeurIPS. He has chaired or co-organized workshops on security and privacy for machine learning at ICML, DSN, and NeurIPS.