Abstract

Diffusion models have achieved state-of-the-art results in image synthesis, yet unlike GANs they lack a well-structured latent space for intuitive image editing. Existing diffusion-based editing methods often rely on supervised fine-tuning or text-based guidance, while recent unsupervised techniques that leverage the model’s bottleneck layer suffer from one or more key limitations: (i) they address only global attributes, (ii) they fail to disentangle local and global semantics, or (iii) they require extensive human intervention. To fill this gap, we first propose an unsupervised method for localized image editing in pre-trained unconditional diffusion models that disentangles local and global semantics in the model’s latent space. Given an input image and a user-specified region of interest, our approach uses the denoising network’s Jacobian to map that region to a corresponding latent subspace. We then separate this subspace into shared (global) and region-specific components to uncover latent directions that control local attributes. These directions generalize across images, enabling semantically consistent edits without retraining. We go one step further, extending our method to minimize manual supervision by automatically inferring edit directions from a single reference image and generating region masks without human input. Experiments on multiple datasets show that our method yields more localized, high-fidelity edits than state-of-the-art approaches.
Published in: International Journal of Computer Vision
Volume 134, Issue 4
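The abstract’s central step, mapping a user-specified region to a latent subspace through the denoising network’s Jacobian, can be sketched concretely. The snippet below is a minimal illustration, not the paper’s implementation: `decode` (a hypothetical helper that runs a pre-trained UNet from its bottleneck activation `h` to the denoised prediction), the binary region `mask`, and the deflated power iteration are all assumptions made for the example. It looks for bottleneck directions whose effect on the output is concentrated inside the masked region by iterating v ← JᵀMJv, which (for a binary mask, where MᵀM = M) converges to the leading right singular vectors of the masked Jacobian MJ.

```python
# Minimal sketch: find bottleneck ("h-space") directions whose Jacobian
# effect is concentrated in a region of interest, via power iteration on
# the masked Jacobian. All names here are illustrative assumptions.

import torch
from torch.func import jvp, vjp


def masked_top_directions(decode, h, mask, n_dirs=3, n_iters=30):
    """Approximate the top right singular vectors of M @ J, J = d decode / d h.

    decode: pure function mapping a bottleneck activation `h` to the
            denoiser's output (hypothetical; stands in for the second
            half of a pre-trained unconditional diffusion UNet).
    h:      bottleneck activation tensor at some timestep.
    mask:   binary region-of-interest mask, broadcastable to decode(h).
    """
    found = []
    for _ in range(n_dirs):
        v = torch.randn_like(h)
        v = v / v.norm()
        for _ in range(n_iters):
            # Forward mode: u = J v, restricted to the region of interest.
            _, u = jvp(decode, (h,), (v,))
            u = u * mask
            # Reverse mode: v <- J^T u = J^T M J v.
            _, pullback = vjp(decode, h)
            (v,) = pullback(u)
            # Deflate against directions already found, then renormalize.
            for w in found:
                v = v - (v.flatten() @ w.flatten()) * w
            v = v / v.norm()
        found.append(v)
    return found
```

Under the same assumptions, the shared (global) component could be estimated by running the procedure with a full-image mask and projecting those directions out of the region’s subspace, leaving region-specific directions; an edit would then add a scaled direction to `h` during denoising. This mirrors the abstract’s description only at a sketch level, not the authors’ exact algorithm.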