General description

This repository contains all development data for Task 2, "Supervised Detection of Strongly-Labelled Whale Calls", of the BioDCASE challenge (2026 and 2025 editions). The data are derived from the ATBFL library and adapted to better fit the challenge. ATBFL is one of the largest annotated datasets in marine bioacoustics, gathering underwater recordings made around Antarctica from 2005 to 2017.

Overall, the data are composed of:
- 6591 audio files (6004 in the train set + 587 in the validation set) totaling 1880 hours of recordings from 11 different deployments, organized in site-year datasets (e.g., kerguelen2015);
- 11 CSV annotation files, each named after its corresponding site-year dataset. See "Annotation structure" for more details.

Folder structure

2026_BioDCASE_development_set.zip
|___2026_BioDCASE_development_set/
    |____train/
        |____annotations/
            |____siteYYYY1.csv
            |____siteYYYY2.csv
            |____...
        |____audio/
            |____siteYYYY1/
                |____*.wav
                |____...
            |____siteYYYY2/
                |____...
    |____validation/
        |____annotations/
            |____siteYYYY3.csv
            |____siteYYYY4.csv
            |____...
        |____audio/
            |____siteYYYY3/
                |____*.wav
            |____siteYYYY4/
                |____...

Annotation structure

Each annotated sound event is defined by the tuple (dataset, filename, annotation, annotator, low_frequency, high_frequency, start_datetime, end_datetime), where annotation is the class label and takes a single value in {bma, bmb, bmz, bmd, bpd, bp20, bp20plus}. These labels are a more machine-readable version of the list {BmA, BmB, BmZ, BmD, BpD, Bp20, Bp20plus} described below in the section "Calls description". The annotator field holds the short name of the expert annotator who produced the annotation file. There is a single annotator per dataset, but the same annotator may have annotated several datasets. As calls may overlap, the setup is multi-label: a file, or a segment of a file, is likely to contain several classes.
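As an illustration of the annotation structure described above, here is a minimal Python sketch that reads annotation rows and groups the labels found in each audio file. It assumes that each CSV file has a header row whose column names match the tuple fields, and that timestamps follow an ISO 8601 layout with a UTC offset. The sample rows, the site name "site2015" and the annotator name "expert1" are hypothetical and only serve the example.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical rows for illustration only (not taken from the real annotation files).
# Assumption: the CSV header uses the tuple field names listed in the text.
SAMPLE_CSV = """dataset,filename,annotation,annotator,low_frequency,high_frequency,start_datetime,end_datetime
site2015,2015-02-04T03-00-00_000.wav,bma,expert1,24.0,28.0,2015-02-04T03:00:10.000000+00:00,2015-02-04T03:00:20.000000+00:00
site2015,2015-02-04T03-00-00_000.wav,bp20,expert1,15.0,30.0,2015-02-04T03:00:15.000000+00:00,2015-02-04T03:00:16.000000+00:00
"""


def read_annotations(handle):
    """Parse annotation rows into dicts with typed fields."""
    events = []
    for row in csv.DictReader(handle):
        events.append({
            "dataset": row["dataset"],
            "filename": row["filename"],
            "label": row["annotation"],
            "annotator": row["annotator"],
            "low_frequency": float(row["low_frequency"]),
            "high_frequency": float(row["high_frequency"]),
            # fromisoformat keeps the UTC offset, so the datetimes are timezone-aware.
            "start": datetime.fromisoformat(row["start_datetime"]),
            "end": datetime.fromisoformat(row["end_datetime"]),
        })
    return events


def labels_per_file(events):
    """Collect the set of labels present in each audio file (multi-label view)."""
    by_file = defaultdict(set)
    for event in events:
        by_file[event["filename"]].add(event["label"])
    return dict(by_file)


events = read_annotations(io.StringIO(SAMPLE_CSV))
print(labels_per_file(events))
```

To read a real annotation file, the `io.StringIO` object can be replaced by a handle opened with `open(path, newline="")`.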
Please note that the evaluation metric is a 1D IoU computed over the temporal axis: the frequency component is provided but will not be part of the final evaluation. Please also note that 7 classes are provided but the evaluation will only take 3 into account, as calls can be grouped by similarity according to this table:

Annotation label    Evaluated class
bma                 ABZ call
bmb                 ABZ call
bmz                 ABZ call
bmd                 Downsweep
bpd                 Downsweep
bp20                Bp call
bp20plus            Bp call

Calls description

BmA, BmB, and BmZ calls are specific to blue whales (Balaenoptera musculus intermedia, Bm), while Bp20 and Bp20Plus calls are characteristic of fin whales (Balaenoptera physalus quoyi, Bp). Both species also produce downsweeps. As described by Miller et al. (see "Related Works"), BmA calls consist of a constant-frequency tone between 25 and 28 Hz, without additional units. BmB calls are similar but followed by a partial or full inter-tone downsweep. BmZ calls contain two tonal units: A (higher frequency) and C (lower frequency). Occasionally, a B downsweep unit appears between them, forming a "Z" shape on spectrograms. Bp20 and Bp20Plus vocalizations are pulsed calls with peak energy at 20 Hz (Bp20) and, for Bp20Plus, additional energy at higher frequencies (80–100 Hz). Downsweeps are characterized by a continuous frequency modulation from f₁ to f₂, where f₁ > f₂. Example spectrograms are available here, with more detailed references in Miller et al.
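The grouping of the seven labels into three evaluated classes and the temporal IoU mentioned in the evaluation notes can be sketched in Python as follows. This is only an illustration: the function name `iou_1d` is introduced here, and the official evaluation code (including how predictions are matched to annotations and any IoU threshold) is not specified in this description.

```python
# Grouping of the 7 annotated labels into the 3 evaluated classes,
# copied from the grouping table in the evaluation notes.
LABEL_TO_GROUP = {
    "bma": "ABZ call",
    "bmb": "ABZ call",
    "bmz": "ABZ call",
    "bmd": "Downsweep",
    "bpd": "Downsweep",
    "bp20": "Bp call",
    "bp20plus": "Bp call",
}


def iou_1d(start_a, end_a, start_b, end_b):
    """Intersection over union of two time intervals (same unit, e.g. seconds).

    Hypothetical helper: frequency bounds are ignored, as in the evaluation.
    """
    intersection = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - intersection
    return intersection / union if union > 0 else 0.0


# Two 10 s boxes overlapping by 5 s: intersection 5 s, union 15 s.
print(iou_1d(0.0, 10.0, 5.0, 15.0))
print(LABEL_TO_GROUP["bp20plus"])
```

Mapping both the predicted and the annotated labels through `LABEL_TO_GROUP` before comparing them reduces the problem to the three evaluated classes.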
Dataset statistics

Train set

Dataset               Number of audio recordings   Total audio duration (h)   Total sound events   Ratio event presence (%)
ballenyisland2015     205                          204                        2222                 1.4
casey2014             194                          194                        6866                 7.3
elephantisland2013    2247                         187                        21966                8.6
elephantisland2014    2595                         216                        20964                13
greenwich2015         190                          32                         1128                 6.5
kerguelen2005         200                          200                        2960                 1.8
maudrise2014          200                          83                         2360                 7
rosssea2014           176                          176                        104                  0.1
TOTAL / AVERAGE       6004                         1292                       58570                5.7

Validation set

Dataset               Number of audio recordings   Total audio duration (h)   Total sound events   Ratio event presence (%)
casey2017             187                          187                        3263                 3.3
kerguelen2014         200                          200                        8822                 5.7
kerguelen2015         200                          200                        5542                 3.7
TOTAL / AVERAGE       587                          587                        17627                4.2

A more complete version of this table is available here, with more statistics within the different classes and more information on the recording deployments.

Supplementary resources

- All datasets, annotations and pre-trained models from this list are allowed. In particular, we inform participants that additional unlabeled data originating from the same scientific program are available here, and that other comparable datasets from the same geographical regions can be accessed on the OPUS platform;
- The use of other external data (e.g. audio files, annotations) and pre-trained models is allowed only after approval from the task coordinators (contact: dorian.cazau@ensta.fr). Note that these external data and models must at least be publicly available.

Evaluation set

The evaluation set for this task will be released on June 1, 2026.

Data collection

Original data were collected, curated, annotated and published by Miller, B.S., The IWC-SORP/SOOS Acoustic Trends Working Group., Balcazar, N. et al. An open access dataset for developing automated detectors of Antarctic baleen whale sounds and performance evaluation of two commonly used detectors. Sci Rep 11, 806 (2021).
https://doi.org/10.1038/s41598-020-78995-8 (see the "Related works" section)

Dataset version control

Below we track the changes applied to our challenge datasets over time.

Original data

The original data can be accessed via the Australian Antarctic Data Centre at https://data.aad.gov.au/metadata/records/AcousticTrends_BlueFinLibrary

BioDCASE2025

Minor changes were made to the original data to fit the challenge formatting conventions:
- adopting more consistent naming conventions, particularly for file paths. All .wav filenames now follow the format "YYYY-mm-ddTHH-MM-SS_fff.wav", and the same convention is used in the CSV files under the "filename" column. Additionally, the start and end timestamps of temporal annotation boxes are formatted as "YYYY-mm-ddTHH:MM:SS.ffffff+zz:zz", all in Zulu Time;
- pooling the call-type-specific Raven tables into one CSV file per subdataset, named after the annotated site-year;
- resampling all audio files to 250 Hz, the lowest original sample rate, which is shared by several subdatasets. This was done to standardize sample rates. Because the vocalizations of interest are very low-frequency calls (15–120 Hz), no information is lost at a sampling rate of 250 Hz.

BioDCASE2026

During the 2025 edition, several minor bugs were identified and reported here. The following fixes have been applied for the 2026 version:
- Wrong file names: 5 annotations in the dataset elephantisland2014 had the wrong filename "2014-10-05T02-00-00_000.wav"; it was replaced with the correct one, "2014-10-05T03-00-00_000.wav";
- Truncated annotation boxes at the end of audio files: some annotation boxes extended beyond the end of their audio files, either because the annotator dragged the Raven box slightly too far or because a longer audio file was used during annotation.
The fixes applied were: 1) the end timestamp of a truncated annotation box is set to the end of the audio file; 2) if the resulting truncated box is shorter than 0.286 seconds (corresponding to the smallest annotation durations from BmABZ, downsweeps and Bp20 labels), it is removed from the annotation set. In total, 76 annotation boxes were shortened and only 2 were removed. Changes in the datasets between the 2025 and 2026 editions can be further inspected through the summary statistics tables (2025 and 2026).

Open access

Data used in this study are publicly available under a Creative Commons 4.0 Attribution licence. They are attributed to Miller, B.S., The IWC-SORP/SOOS Acoustic Trends Working Group., Balcazar, N. et al.

Contact info

Please send any feedback or questions to: Dorian CAZAU (dorian.cazau@ensta.fr) | Lucie JEAN-LABADYE (lucie.jean-labadye@sorbonne-universite.fr) | Cléa PARCERISAS (clea.parcerisas@vliz.be)
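For readers who want to reproduce the truncation fix listed under BioDCASE2026 on their own data, the rule can be sketched in Python as below. This is not the organizers' script. The helper names `file_start_from_name` and `clip_box` are hypothetical, and the sketch assumes that the timestamp in a filename is the UTC start time of the recording and that the file duration in seconds is known.

```python
from datetime import datetime, timedelta, timezone

# Minimum duration of a truncated box, as quoted in the text.
MIN_DURATION = timedelta(seconds=0.286)


def file_start_from_name(filename):
    """Parse "YYYY-mm-ddTHH-MM-SS_fff.wav" into an aware UTC datetime.

    Assumption: the filename timestamp is the start of the recording.
    """
    stem = filename.rsplit(".", 1)[0]
    start = datetime.strptime(stem, "%Y-%m-%dT%H-%M-%S_%f")
    return start.replace(tzinfo=timezone.utc)


def clip_box(start, end, filename, file_duration_s):
    """Apply the two-step fix: clip the end to the file end, drop if too short.

    Returns (start, end), or None when the clipped box must be removed.
    """
    file_end = file_start_from_name(filename) + timedelta(seconds=file_duration_s)
    if end <= file_end:
        return (start, end)  # the box fits in the file: left unchanged
    if file_end - start < MIN_DURATION:
        return None  # clipped box shorter than 0.286 s: removed
    return (start, file_end)


name = "2014-10-05T03-00-00_000.wav"
t0 = file_start_from_name(name)
# A box running 5 s past the end of a hypothetical 60 s file is clipped at 60 s.
print(clip_box(t0 + timedelta(seconds=55), t0 + timedelta(seconds=65), name, 60.0))
```

Only boxes that overrun the file are tested against the 0.286 s threshold, which matches the wording of the fix: short boxes that already fit in their file are kept.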