Text-to-image models, such as Stable Diffusion (SD), undergo iterative updates to improve image quality and address concerns such as safety. While improvements in image quality are straightforward to assess, it remains unexplored how model updates resolve existing concerns and whether they introduce new ones. This study takes an initial step in investigating the evolution of text-to-image models from the perspectives of safety, bias, and authenticity. Our findings, centered on Stable Diffusion, indicate that model updates paint a mixed picture. While updates progressively reduce the generation of unsafe images, the bias issue, particularly in gender, intensifies. We also find that negative stereotypes either persist within the same Non-White race group or shift towards other Non-White race groups through SD updates, yet with minimal association of these traits with the White race group. Additionally, our evaluation reveals a new concern stemming from SD updates: state-of-the-art fake image detectors, initially trained on images from earlier SD versions, struggle to identify fake images generated by updated versions. We show that fine-tuning these detectors on fake images generated by updated versions addresses this issue, achieving at least 96.6% accuracy across various SD versions. Our insights highlight the importance of continued efforts to mitigate biases and vulnerabilities in evolving text-to-image models.
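To make the detector fine-tuning step concrete, the sketch below fine-tunes a generic ImageNet-pretrained ResNet-50 as a binary real-vs-fake classifier on images produced by a newer SD version. This is a minimal illustration under stated assumptions, not the paper's exact detectors or training configuration: the backbone choice, the folder layout data/train/{real,fake}, and all hyperparameters are placeholders.

# Minimal sketch (assumptions: torchvision >= 0.13, data/train/{real,fake} folders exist).
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Standard ImageNet preprocessing; real and SD-generated images share one pipeline.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# ImageFolder assigns class indices alphabetically, e.g. fake -> 0, real -> 1.
train_set = datasets.ImageFolder("data/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Start from an ImageNet-pretrained backbone and replace the head with a single
# logit for the real-vs-fake decision.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 1)
model = model.to(device)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for epoch in range(3):  # a few epochs of fine-tuning, purely illustrative
    for images, labels in loader:
        images = images.to(device)
        labels = labels.float().unsqueeze(1).to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

The same loop can be rerun whenever a new SD version is released, refreshing the detector on fake images from that version before deployment.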
Overview
Comparison of images generated by three SD versions and real images. The prompt describes an artistic style, i.e., "by miles johnston." The real images are painted by Miles Johnston, a conceptual artist known for surreal pencil drawings and his ingenious use of distortion. We observe that SD-1.5 directly generates nude female figures. Images generated by SD-2.1 include nude female figures and disturbing close-ups of faces. SDXL successfully suppresses unsafe content in the generated pencil drawings. All versions are more prone to generating female figures than male figures, especially SD-1.5, where a substantial number of images contain nude White female figures. Moreover, the generated images become increasingly similar to real images through SD updates.
Citation
@inproceedings{WSBZ24,
  title     = {Image-Perfect Imperfections: Safety, Bias, and Authenticity in the Shadow of Text-To-Image Model Evolution},
  author    = {Wu, Yixin and Shen, Yun and Backes, Michael and Zhang, Yang},
  booktitle = {ACM Conference on Computer and Communications Security},
  year      = {2024}
}