Numerous diffusion models have been developed for 2D image synthesis and editing, and they have recently been extended to 3D scene editing. However, 3D scene editing remains at an early stage, with open challenges in scene representation and multi-view consistency. A notable limitation of existing approaches is their reliance on edit-specific modules and per-scene retraining. To tackle these issues, we propose a novel and versatile text-driven 3D scene editing method, termed DN2N, which produces editing results directly, without retraining. Our method employs off-the-shelf text-based 2D image editing models to modify the multi-view images of a 3D scene. A content filtering process then discards poorly edited images that disrupt 3D consistency. We cast the remaining inconsistency as a problem of removing noise perturbations and address it by generating training data with similar perturbation characteristics. We develop a versatile NeRF model structure and propose two novel cross-view regularization terms to help DN2N mitigate these perturbations. Empirical results show that our method achieves multiple editing types from text prompts alone, including but not limited to appearance editing, weather transition, object replacement, and style transfer. Most importantly, DN2N generalizes across scenes and editing types, eliminating the need to customize or retrain editing models for each. Moreover, its total editing time is comparable to that of 3DGS-based editing methods, enhancing its practical value.
Published in: IEEE Transactions on Visualization and Computer Graphics
vol. PP, pp. 1–15
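To make the content-filtering step described in the abstract concrete, below is a minimal, hypothetical sketch: it scores each edited view by its mean absolute per-pixel change from the original and discards statistical outliers across the view set. The function names, the scoring rule, and the z-score threshold are illustrative assumptions, not the paper's actual filter.

```python
import numpy as np

def edit_magnitude(original: np.ndarray, edited: np.ndarray) -> float:
    """Mean absolute per-pixel change between an original view and its edit."""
    return float(np.mean(np.abs(edited.astype(np.float32) - original.astype(np.float32))))

def filter_edited_views(originals, editeds, z_thresh=2.0):
    """Keep edited views whose edit magnitude lies within z_thresh standard
    deviations of the mean magnitude over all views (a stand-in for the
    paper's content filtering)."""
    scores = np.array([edit_magnitude(o, e) for o, e in zip(originals, editeds)])
    mu, sigma = scores.mean(), scores.std() + 1e-8
    keep = np.abs(scores - mu) / sigma < z_thresh
    return [e for e, k in zip(editeds, keep) if k], keep

# Usage with random stand-in images (H x W x 3, uint8):
rng = np.random.default_rng(0)
originals = [rng.integers(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(8)]
editeds = [img.copy() for img in originals]
editeds[3] = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)  # a badly edited outlier
kept, mask = filter_edited_views(originals, editeds)
print(mask)  # view 3 should be flagged False and dropped
```

A per-view outlier test like this only catches edits that deviate in magnitude; a faithful implementation would also need a cross-view criterion, since the goal in the paper is to remove views that break 3D consistency.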