TY - GEN
T1 - Personalizing Text-to-Image Diffusion Models by Fine-Tuning Classification for AI Applications
AU - Hidalgo, Rafael
AU - Salah, Nesreen
AU - Chandra Jetty, Rajiv
AU - Jetty, Anupama
AU - Varde, Aparna S.
N1 - Publisher Copyright: © 2024, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2024
Y1 - 2024
AB - Stable Diffusion is a captivating text-to-image model that generates images based on text input. However, a major challenge is that it is pretrained on a specific dataset, limiting its ability to generate images outside of the given data. In this paper, we propose to harness two models based on neural networks, Hypernetworks and DreamBooth, to allow the introduction of any image into Stable Diffusion, addressing versatility with minimal additional training data. This work targets AI applications such as augmenting next-generation multipurpose robots, enhancing human-robot collaboration, feeding intelligent tutoring systems, training autonomous cars, injecting subjects for photo personalization, producing high-quality movie animations, etc. It can contribute to AI in smart cities: facets such as smart living and smart mobility.
KW - ANN
KW - Data mining
KW - Image processing
KW - Movie animations
KW - Photo personalization
KW - Stable diffusion
KW - Text-to-image creation
UR - http://www.scopus.com/inward/record.url?scp=85182514364&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-47721-8_44
DO - 10.1007/978-3-031-47721-8_44
M3 - Conference contribution
AN - SCOPUS:85182514364
SN - 9783031477201
T3 - Lecture Notes in Networks and Systems
SP - 642
EP - 658
BT - Intelligent Systems and Applications - Proceedings of the 2023 Intelligent Systems Conference IntelliSys Volume 1
A2 - Arai, Kohei
PB - Springer Science and Business Media Deutschland GmbH
T2 - Intelligent Systems Conference, IntelliSys 2023
Y2 - 7 September 2023 through 8 September 2023
ER -