To meaningfully study out-of-distribution (OOD) generalization in web-scale datasets, we split LAION into natural images, sketches, and ambiguous images.
Here, we create OOD-specific datasets from LAION and use them to reveal that much of CLIP’s perceived generalization is driven by in-domain examples. This shows that challenges from the ImageNet era persist, and it offers a framework for meaningful assessment and optimization of OOD robustness.