MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

2024 · 45 citations · Journal Article

Authors

Xiaoting Zhang · Huawei Technologies (China)

Abstract

We present a Multi-Instance Generation (MIG) task, simultaneously generating multiple instances with diverse controls in one image. Given a set of predefined coordinates and their corresponding descriptions, the task is to ensure that generated instances are accurately placed at the designated locations and that all instances' attributes adhere to their corresponding descriptions. This broadens the scope of current research on single-instance generation, elevating it to a more versatile and practical dimension. Inspired by the idea of divide and conquer, we introduce an innovative approach named Multi-Instance Generation Controller (MIGC) to address the challenges of the MIG task. Initially, we break down the MIG task into several subtasks, each involving the shading of a single instance. To ensure precise shading for each instance, we introduce an instance enhancement attention mechanism. Lastly, we aggregate all the shaded instances to provide the necessary information for accurately generating multiple instances in Stable Diffusion (SD). To evaluate how well generation models perform on the MIG task, we provide a COCO-MIG benchmark along with an evaluation pipeline. Extensive experiments were conducted on the proposed COCO-MIG benchmark, as well as on various commonly used benchmarks. The evaluation results illustrate the exceptional control capabilities of our model in terms of quantity, position, attribute, and interaction. Code and demos will be released at https://migcproject.github.io/.
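
The divide-and-conquer structure described in the abstract (split into per-instance subtasks, shade each instance, aggregate the results) can be illustrated with a toy sketch. This is not the MIGC implementation: the abstract does not specify data structures or the aggregation rule, so everything below is an assumption made for illustration. The names `Instance`, `box_mask`, `shade_instance`, `aggregate`, and `compose` are hypothetical, boxes are assumed to be normalized `(x0, y0, x1, y1)` coordinates, the "shading" step is a placeholder that writes a constant per instance, and the aggregation is a plain mask-weighted average rather than the attention-based mechanism the paper refers to.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    """One MIG control: a text description plus a target box (assumed format)."""
    description: str
    box: Tuple[float, float, float, float]  # (x0, y0, x1, y1), normalized to [0, 1]


def box_mask(box, height: int, width: int) -> List[List[float]]:
    """Binary mask: 1.0 where a grid cell's centre lies inside the box."""
    x0, y0, x1, y1 = box
    mask = []
    for r in range(height):
        cy = (r + 0.5) / height
        row = []
        for c in range(width):
            cx = (c + 0.5) / width
            row.append(1.0 if (x0 <= cx < x1 and y0 <= cy < y1) else 0.0)
        mask.append(row)
    return mask


def shade_instance(index: int, mask: List[List[float]]) -> List[List[float]]:
    """Placeholder for a single-instance subtask.

    Assumption: each instance simply writes the constant (index + 1) inside
    its mask. A real system would produce image features here.
    """
    value = float(index + 1)
    return [[value * m for m in row] for row in mask]


def aggregate(shaded, masks, height: int, width: int, background: float = 0.0):
    """Combine per-instance results with a mask-weighted average per cell.

    Cells covered by no instance keep the background value.
    """
    out = [[background] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            weight = sum(m[r][c] for m in masks)
            if weight > 0:
                out[r][c] = sum(s[r][c] for s in shaded) / weight
    return out


def compose(instances: List[Instance], height: int, width: int):
    """Divide (one mask and one shading result per instance), then aggregate."""
    masks = [box_mask(inst.box, height, width) for inst in instances]
    shaded = [shade_instance(i, m) for i, m in enumerate(masks)]
    return aggregate(shaded, masks, height, width)
```

With two overlapping boxes on a 2 x 4 grid, cells covered by a single instance take that instance's value and cells covered by both take the average, which shows why an aggregation step is needed once per-instance results overlap.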

Topics & Keywords

Generative Adversarial Networks and Image Synthesis · Handwritten Text Recognition Techniques · Multimodal Machine Learning Applications

Publication Details

DOI: 10.1109/cvpr52733.2024.00651

Field-Weighted Citation Impact: 11.31
