The ability to extend a model beyond the domain of its training data is central to building robust computer vision models. Methods for handling unseen test distributions or biased training data often require additional image data, whereas linguistic knowledge of the task and potential domain shifts is far cheaper and easier to obtain. In this talk, I will present three recent works, each exploring a different way to improve accuracy from language advice and incomplete training data using large-scale vision-and-language models.
Lisa Dunlap (University of California, Berkeley)
PhD student
Lisa Dunlap is a second-year PhD student at UC Berkeley working on understanding large datasets and improving robustness in vision systems with language. Recently, her focus has been on using diffusion-based image editing to improve accuracy in fine-grained classification and on summarizing differences between image datasets with text. She is co-advised by Trevor Darrell in BAIR and Joseph Gonzalez in the Sky Computing Lab.