Empirical Inference Perceiving Systems Conference Paper 2023

Controlling Text-to-Image Diffusion by Orthogonal Finetuning


Large text-to-image diffusion models have impressive capabilities in generating photorealistic images from text prompts. How to effectively guide or control these powerful models to perform different downstream tasks becomes an important open problem. To tackle this challenge, we introduce a principled finetuning method -- Orthogonal Finetuning (OFT), for adapting text-to-image diffusion models to downstream tasks. Unlike existing methods, OFT can provably preserve hyperspherical energy which characterizes the pairwise neuron relationship on the unit hypersphere. We find that this property is crucial for preserving the semantic generation ability of text-to-image diffusion models. To improve finetuning stability, we further propose Constrained Orthogonal Finetuning (COFT) which imposes an additional radius constraint to the hypersphere. Specifically, we consider two important finetuning text-to-image tasks: subject-driven generation where the goal is to generate subject-specific images given a few images of a subject and a text prompt, and controllable generation where the goal is to enable the model to take in additional control signals. We empirically show that our OFT framework outperforms existing methods in generation quality and convergence speed.
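
As a rough illustration of the energy-preservation claim in the abstract, the following NumPy sketch compares an orthogonal update of a weight matrix with a plain additive update. It is not the authors' implementation. The energy definition used here (sum of inverse pairwise distances between unit-normalized neurons), the Cayley construction of the orthogonal matrix, the convention that neurons are the columns of the weight matrix, and all names (`hyperspherical_energy`, `cayley_orthogonal`, `W0`, `R`) are assumptions made for illustration.

```python
import numpy as np


def hyperspherical_energy(W):
    """Sum of inverse pairwise distances between unit-normalized neurons.

    Assumption: each column of W is one neuron.
    """
    W_hat = W / np.linalg.norm(W, axis=0, keepdims=True)
    # diffs[:, i, j] = W_hat[:, i] - W_hat[:, j]
    diffs = W_hat[:, :, None] - W_hat[:, None, :]
    dists = np.linalg.norm(diffs, axis=0)
    off_diag = ~np.eye(W.shape[1], dtype=bool)  # skip the i == j pairs
    return float(np.sum(1.0 / dists[off_diag]))


def cayley_orthogonal(A):
    """Build an orthogonal matrix from the skew-symmetric part of A."""
    Q = 0.5 * (A - A.T)  # skew-symmetric, so (I - Q) is always invertible
    I = np.eye(A.shape[0])
    return (I + Q) @ np.linalg.inv(I - Q)


rng = np.random.default_rng(0)
d, n = 8, 5  # neuron dimension, number of neurons (toy sizes)

W0 = rng.standard_normal((d, n))  # stand-in for pretrained weights
R = cayley_orthogonal(0.1 * rng.standard_normal((d, d)))

W_orthogonal = R @ W0  # orthogonal update: rotates every neuron together
W_additive = W0 + 0.1 * rng.standard_normal((d, n))  # additive update

he_pretrained = hyperspherical_energy(W0)
he_orthogonal = hyperspherical_energy(W_orthogonal)
he_additive = hyperspherical_energy(W_additive)

print("pretrained:", he_pretrained)
print("orthogonal:", he_orthogonal)
print("additive:  ", he_additive)
```

Because an orthogonal matrix preserves norms and inner products, every pairwise distance between normalized neurons is unchanged, so the first two printed values agree up to floating-point error, while the additive update generally shifts the energy.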

Author(s): Qiu*, Z. and Liu*, W. and Feng, H. and Xue, Y. and Feng, Y. and Liu, Z. and Zhang, D. and Weller, A. and Schölkopf, B.
Book Title: Advances in Neural Information Processing Systems
Volume: 36
Pages: 79320--79362
Year: 2023
Month: December
Editors: A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine
Publisher: Curran Associates, Inc.
Bibtex Type: Conference Paper (inproceedings)
Event Name: Advances in Neural Information Processing Systems 36
Event Place: New Orleans, USA
State: Published
URL: https://proceedings.neurips.cc/paper_files/paper/2023/file/faacb7a4827b4d51e201666b93ab5fa7-Paper-Conference.pdf
Note: *equal contribution

BibTeX

@inproceedings{Qiuetal23,
  title = {Controlling Text-to-Image Diffusion by Orthogonal Finetuning},
  booktitle = {Advances in Neural Information Processing Systems},
  abstract = {Large text-to-image diffusion models have impressive capabilities in generating photorealistic images from text prompts. How to effectively guide or control these powerful models to perform different downstream tasks becomes an important open problem. To tackle this challenge, we introduce a principled finetuning method -- Orthogonal Finetuning (OFT), for adapting text-to-image diffusion models to downstream tasks. Unlike existing methods, OFT can provably preserve hyperspherical energy which characterizes the pairwise neuron relationship on the unit hypersphere. We find that this property is crucial for preserving the semantic generation ability of text-to-image diffusion models. To improve finetuning stability, we further propose Constrained Orthogonal Finetuning (COFT) which imposes an additional radius constraint to the hypersphere.
  
  Specifically, we consider two important finetuning text-to-image tasks: subject-driven generation where the goal is to generate subject-specific images given a few images of a subject and a text prompt, and controllable generation where the goal is to enable the model to take in additional control signals. We empirically show that our OFT framework outperforms existing methods in generation quality and convergence speed.},
  volume = {36},
  pages = {79320--79362},
  editor = {A. Oh and T. Naumann and A. Globerson and K. Saenko and M. Hardt and S. Levine},
  publisher = {Curran Associates, Inc.},
  month = dec,
  year = {2023},
  note = {Z. Qiu and W. Liu contributed equally},
  slug = {qiuetal23-d34f39bb-0b78-4d33-a85f-3a9057cb4144},
  author = {Qiu, Z. and Liu, W. and Feng, H. and Xue, Y. and Feng, Y. and Liu, Z. and Zhang, D. and Weller, A. and Sch{\"o}lkopf, B.},
  url = {https://proceedings.neurips.cc/paper_files/paper/2023/file/faacb7a4827b4d51e201666b93ab5fa7-Paper-Conference.pdf},
  month_numeric = {12}
}