DCASE2026Task4Dataset: The Dataset for Spatial Semantic Segmentation of Sound Scenes

2026 · 0 citations · Dataset · Green Open Access

Authors

Masahiro Yasuda · NTT (United States)

Binh Thien Nguyen · NTT (United States)

Noboru Harada · NTT (United States)

Daiki Takeuchi · NTT (United States)

Abstract

The DCASE 2026 Task 4 development set was created for the Spatial Semantic Segmentation of Sound Scenes task, which focuses on detecting and separating target sound events from multichannel spatial audio mixtures. The dataset was constructed by combining newly recorded data, screened recordings from the dataset released for DCASE 2025 Challenge Task 4, and selected publicly available recordings. It was designed to reflect the revised task setting for DCASE 2026, including mixtures that may contain multiple target sources from the same class as well as mixtures that contain no target sound events.

The development set includes the recorded components used for mixture synthesis. These consist of isolated target sound events from 18 classes, first-order Ambisonics room impulse responses recorded in real rooms, multichannel background-noise recordings, and interference sounds. The target-event recordings were prepared for source-level mixture construction, while the room impulse responses and background recordings provide realistic spatial and environmental conditions for the generated sound scenes.

In addition to the recorded components, this release also includes pre-synthesized test mixtures for the development set. These mixtures were generated from the recorded components under the official task conditions and are intended for reproducible evaluation and system comparison. Together, the recorded components and the synthesized test data provide a practical development resource for training, validation, and benchmarking systems for DCASE 2026 Task 4.

License: see the file named LICENSE.pdf
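The mixture synthesis described in the abstract can be illustrated with a minimal sketch: each isolated target event is convolved with a multichannel room impulse response, and the spatialized events are summed with a multichannel background recording. This is not the official generation procedure. The function names `spatialize` and `mix_scene`, the `snr_db` parameter, and the choice to scale the background to a signal-to-noise ratio are all assumptions made for illustration, and interference sounds are omitted for brevity.

```python
import numpy as np

def spatialize(source, rir):
    """Convolve a mono source (n_samples,) with a multichannel RIR
    of shape (n_channels, rir_len), e.g. 4 channels for first-order Ambisonics."""
    return np.stack([np.convolve(source, h) for h in rir])

def mix_scene(sources, rirs, background, snr_db):
    """Hypothetical mixer: sum spatialized sources and add the background,
    scaled so the target-to-background power ratio equals snr_db.

    background has shape (n_channels, n_samples) and fixes the mixture length.
    Returns (mixture, target_sum), both of shape (n_channels, n_samples).
    """
    n_channels, n_samples = background.shape
    targets = np.zeros((n_channels, n_samples))
    for src, rir in zip(sources, rirs):
        wet = spatialize(src, rir)[:, :n_samples]  # truncate reverberant tail
        targets[:, : wet.shape[1]] += wet

    sig_pow = np.mean(targets ** 2)
    noise_pow = np.mean(background ** 2)
    if sig_pow == 0.0 or noise_pow == 0.0:
        # Zero-target mixture (or silent background): leave background unscaled.
        gain = 1.0
    else:
        gain = np.sqrt(sig_pow / (noise_pow * 10.0 ** (snr_db / 10.0)))
    return targets + gain * background, targets

# Usage with synthetic stand-ins for the recorded components.
rng = np.random.default_rng(0)
src = rng.standard_normal(1000)        # stand-in for an isolated target event
rir = rng.standard_normal((4, 64))     # stand-in for a 4-channel RIR
bg = rng.standard_normal((4, 2000))    # stand-in for multichannel background noise
mix, tgt = mix_scene([src], [rir], bg, snr_db=10.0)
mix0, tgt0 = mix_scene([], [], bg, snr_db=10.0)  # mixture with no target events
```

Passing two sources of the same class with different impulse responses would model the same-class case mentioned in the abstract, and an empty source list models a mixture with no target sound events.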

Topics & Keywords

Publication Details

Published in: Zenodo (CERN European Organization for Nuclear Research)

DOI: 10.5281/zenodo.19328045
