Personalizing Text-to-Image Diffusion Models by Fine-Tuning Classification for AI Applications

Rafael Hidalgo, Nesreen Salah, Rajiv Chandra Jetty, Anupama Jetty, Aparna S. Varde

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Stable Diffusion is a captivating text-to-image model that generates images based on text input. However, a major challenge is that it is pretrained on a specific dataset, limiting its ability to generate images outside of the given data. In this paper, we propose to harness two models based on neural networks, Hypernetworks and DreamBooth, to allow the introduction of any image into Stable Diffusion, addressing versatility with minimal additional training data. This work targets AI applications such as augmenting next-generation multipurpose robots, enhancing human-robot collaboration, feeding intelligent tutoring systems, training autonomous cars, injecting subjects for photo personalization, and producing high-quality movie animations. It can contribute to AI in smart cities, in facets such as smart living and smart mobility.

Original language: English
Title of host publication: Intelligent Systems and Applications - Proceedings of the 2023 Intelligent Systems Conference IntelliSys Volume 1
Editors: Kohei Arai
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 642-658
Number of pages: 17
ISBN (Print): 9783031477201
State: Published - 2024
Event: Intelligent Systems Conference, IntelliSys 2023 - Amsterdam, Netherlands
Duration: 7 Sep 2023 to 8 Sep 2023

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 822
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: Intelligent Systems Conference, IntelliSys 2023
Country/Territory: Netherlands
City: Amsterdam
Period: 7/09/23 to 8/09/23

Keywords

  • ANN
  • Data mining
  • Image processing
  • Movie animations
  • Photo personalization
  • Stable diffusion
  • Text-to-image creation
