Image stitching, a fundamental problem in computer vision, aims to generate panoramic images with an extended field of view. However, existing methods often struggle with large parallax, weak textures, and illumination variations, leading to misalignment and visible seams. In this work, we propose semantics-guided hybrid stitching (SGHS), a framework that recasts image stitching as a coarse-to-fine generative inpainting problem. SGHS combines robust semantic alignment with a semantic-aware mask and multimodal conditioning to steer a pretrained diffusion model. A lightweight low-rank adaptation (LoRA) module adapts the generator to the stitching task, and an edge-guided enhancement module sharpens seams. This design is resilient to large parallax, texture sparsity, and illumination changes, producing geometry-faithful panoramas with fewer artifacts. Extensive experiments on the challenging unsupervised deep image stitching dataset across high-, medium-, and low-overlap regimes, together with cross-dataset evaluations on as-projective-as-possible and real unmanned aerial vehicle mosaics, demonstrate that SGHS consistently improves peak signal-to-noise ratio and structural similarity index measure over representative warping- and inpainting-based baselines while yielding visibly cleaner seams, better straight-line preservation, and stronger semantic consistency. Ablation studies further validate the effectiveness of the semantic alignment, the semantic-aware mask and conditioning, the LoRA adaptation, and the edge-guided enhancement components.
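For readers unfamiliar with the LoRA component mentioned above, the following is a minimal PyTorch sketch of the generic low-rank adaptation recipe applied to a single linear layer. It is an illustration of the general technique, not SGHS's implementation: the class name `LoRALinear`, the rank, and the scaling factor are assumptions, and the paper's actual placement of LoRA modules inside the diffusion generator is not specified in the abstract.

```python
# Generic low-rank adaptation (LoRA) sketch: a frozen pretrained linear layer
# augmented with a trainable low-rank update. Names and hyperparameters are
# illustrative assumptions, not taken from SGHS.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Computes y = W x + (alpha / r) * B A x, where W is frozen,
    A is (r x d_in), and B is (d_out x r); only A and B are trained."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad_(False)
        d_in, d_out = base.in_features, base.out_features
        self.lora_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        # B is zero-initialized so training starts from the pretrained behavior.
        self.lora_B = nn.Parameter(torch.zeros(d_out, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_A.T) @ self.lora_B.T

# Usage: only the low-rank factors receive gradients, which is what keeps
# this style of adaptation lightweight relative to full fine-tuning.
layer = LoRALinear(nn.Linear(320, 320), rank=8)
out = layer(torch.randn(4, 320))
```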