Environmental DNA (eDNA) is increasingly used to monitor biodiversity, biosecurity and invasive species, providing insights into species presence across ecosystems. As eDNA datasets grow, interoperability and accessibility become crucial. OBIS Australia (OBIS-AU) is Australia's node of the Ocean Biodiversity Information System (OBIS), which operates under the International Oceanographic Data and Information Exchange (IODE) of the United Nations Educational, Scientific and Cultural Organization (UNESCO). Hosted by the Commonwealth Scientific and Industrial Research Organisation (CSIRO) National Collections and Marine Infrastructure (NCMI), OBIS-AU promotes use of the DNA Derived Data Extension in Darwin Core (DwC) to standardise publication of eDNA and metabarcoding data (Wieczorek et al. 2012, Abarenkov et al. 2023). OBIS-AU has published more than 26 datasets comprising 21 million eDNA records to OBIS.
OBIS-AU is developing a scalable and interoperable eDNA data publishing pipeline that integrates tools such as the Findable, Accessible, Interoperable, Reusable eDNA (FAIRe) suite and the Global Biodiversity Information Facility (GBIF) Metabarcoding Data Toolkit (MDT) to transform diverse eDNA source data into Darwin Core Archives (DwC-A) for publication to OBIS, GBIF, and the Atlas of Living Australia (ALA). By leveraging metadata standards including DwC, the DNA Derived Data Extension, Minimum Information about any (X) Sequence (MIxS) and the FAIRe metadata checklist, the pipeline enables standardised, FAIR-compliant data publishing (Takahashi et al. 2025, Meyer et al. 2023). It supports multiple transformation pathways, ensuring that eDNA datasets are consistent, reusable, and aligned with global biodiversity data infrastructures. The FAIR eDNA initiative enhances the FAIRness of eDNA data by extending standards such as DwC and MIxS with eDNA-tailored metadata terms (Takahashi et al. 2025).
The FAIRe tools (FAIRe-ator, FAIRe-fier, and FAIRe2MDT) facilitate creation, validation, and conversion of standardised metadata to improve interoperability and reusability across platforms. The Metabarcoding Data Toolkit (MDT) is an open-source web tool that streamlines publishing of eDNA metabarcoding data by converting common data structures [e.g., Operational Taxonomic Unit (OTU) tables, taxonomy, metadata, Format for All Sequences from All Species (FASTA) files] into DwC-A (GBIF Secretariat 2024). The pipeline's modular design allows it to accommodate diverse data types and processing workflows while ensuring compatibility with global biodiversity data standards.
The pipeline, as shown in Fig. 1, provides four main pathways for converting source data into DwC-A and publishing them via the Integrated Publishing Toolkit (IPT): (1) directly publishing data already formatted as DwC-A; (2) transforming source data using a simple DwC pipeline with a custom DwC transformation script; (3) generating a DwC-A file from source data via the MDT tool; and (4) standardising metadata with the FAIRe tools and converting the result to DwC-A via the MDT tool or a custom transformation script.
The Australian Microbiome (AM) Initiative is a national collaborative research program characterising microbial diversity across Australia's terrestrial, freshwater, coastal, and marine environments. The AM data pipeline depicted in Fig. 2 transforms data stored in the AM Data Portal using the MDT tool to generate DwC-A for publication to global repositories.
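As an illustration of the second pathway (a custom DwC transformation script), the following is a minimal sketch rather than the OBIS-AU implementation. It assumes a hypothetical wide OTU table (rows are sequence variants, columns are samples) plus invented taxonomy and sample metadata, and maps them onto Darwin Core occurrence terms and DNA Derived Data Extension terms (`DNA_sequence`, `target_gene`). The function names `transform` and `to_csv`, the `sample:variant` identifier scheme, and all values are assumptions made for the example.

```python
import csv
import io

# Hypothetical inputs; every identifier and value below is invented.
OTU_TABLE = """asv_id,S1,S2
ASV1,120,0
ASV2,15,42
"""
TAXONOMY = {
    "ASV1": {"scientificName": "Sardinops sagax", "DNA_sequence": "ACGTACGT"},
    "ASV2": {"scientificName": "Actinopterygii", "DNA_sequence": "TTGACCAA"},
}
SAMPLES = {
    "S1": {"eventDate": "2024-03-01", "decimalLatitude": -42.9, "decimalLongitude": 147.3},
    "S2": {"eventDate": "2024-03-02", "decimalLatitude": -43.1, "decimalLongitude": 147.8},
}


def transform(otu_csv, taxonomy, samples, target_gene="12S"):
    """Turn each non-zero OTU-table cell into one occurrence row and one
    DNA Derived Data Extension row, linked by a shared occurrenceID."""
    occurrences, dna_rows = [], []
    for row in csv.DictReader(io.StringIO(otu_csv)):
        asv = row["asv_id"]
        for sample_id, meta in samples.items():
            reads = int(row[sample_id])
            if reads == 0:
                continue  # no detection in this sample, so no occurrence
            occurrence_id = f"{sample_id}:{asv}"  # assumed identifier scheme
            occurrences.append({
                "occurrenceID": occurrence_id,
                "basisOfRecord": "MaterialSample",
                "occurrenceStatus": "present",
                "scientificName": taxonomy[asv]["scientificName"],
                "organismQuantity": reads,
                "organismQuantityType": "DNA sequence reads",
                **meta,  # eventDate and coordinates from the sample metadata
            })
            dna_rows.append({
                "occurrenceID": occurrence_id,
                "DNA_sequence": taxonomy[asv]["DNA_sequence"],
                "target_gene": target_gene,
            })
    return occurrences, dna_rows


def to_csv(rows):
    """Serialise a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


occurrences, dna_rows = transform(OTU_TABLE, TAXONOMY, SAMPLES)
print(to_csv(occurrences))
print(to_csv(dna_rows))
```

The two CSV outputs correspond to the core and extension data files of an archive. A complete DwC-A additionally requires a `meta.xml` descriptor and dataset metadata, which this sketch omits.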
The Globalising Marine Biodiversity Observations (GLOMBO) Partnership is a collaboration between CSIRO and the Minderoo Foundation aimed at improving the large-scale monitoring of Australia's vast marine ecosystems by deploying automated eDNA sampling systems that gather samples continuously during voyages, with the first installation taking place on CSIRO's research vessel Investigator. This approach will be expanded through a network of "ships of opportunity," encompassing research, commercial, and tourist vessels that contribute to a nationwide eDNA monitoring effort. A scalable data pipeline is proposed to automate, integrate, and disseminate eDNA datasets, enabling comprehensive, real-time insights into marine biodiversity across Australia's oceans, as illustrated in Fig. 3.
Recording eDNA-derived species occurrences presents several challenges. One example is taxonomic ambiguity, often caused by incomplete reference databases such as GenBank, the World Register of Marine Species (WoRMS), Barcode of Life Data (BOLD), or SILVA. Linking eDNA sequence reads to biodiversity occurrence records is complex and requires expert knowledge and infrastructure that integrates sequence data, metadata, and taxonomy. Technical barriers, limited engagement, and a lack of incentives to make data accessible hinder open access to eDNA data.
OBIS-AU is addressing these challenges by exploring tools such as GBIF's MDT, OBIS's Pacific Islands Marine bioinvasions Alert Network (PacMAN) pipeline, the FAIRe tools, and AI-based tools, by providing expert support, and by developing new automation pipelines to assist data publishers. OBIS-AU has published eDNA data using the DwC Occurrence Core and DNA Derived Data Extension and is now testing a publication model based on the DwC Event Core to better capture sampling context and improve integration, interoperability, and reuse of complex eDNA datasets. OBIS-AU intends to align with the new DwC Data Package to support modular publishing of marine biodiversity data.
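To illustrate how an Event Core model can carry sampling context, the sketch below assumes a hypothetical two-level event hierarchy (a voyage and one sample taken during it) linked through the Darwin Core terms `eventID` and `parentEventID`. Occurrences reference only their `eventID`, and shared context such as dates and coordinates is resolved from the event and its ancestors. The identifiers, values, and the helper names `resolve_context` and `flatten` are invented for the example and do not describe the model OBIS-AU is testing.

```python
# Hypothetical events: a parent voyage and one child sampling event.
EVENTS = {
    "voyage-1": {
        "parentEventID": None,
        "eventDate": "2024-03-01/2024-03-10",
    },
    "voyage-1:S1": {
        "parentEventID": "voyage-1",
        "decimalLatitude": -42.9,
        "decimalLongitude": 147.3,
        "samplingProtocol": "automated eDNA filtration",
    },
}

# Occurrences carry only taxon-level fields plus a link to their event.
OCCURRENCES = [
    {
        "occurrenceID": "voyage-1:S1:ASV1",
        "eventID": "voyage-1:S1",
        "scientificName": "Sardinops sagax",
    },
]


def resolve_context(event_id, events):
    """Merge the fields of an event and all its ancestors.

    Values set on a child event override values inherited from a parent.
    """
    chain, seen = [], set()
    while event_id is not None:
        if event_id in seen:
            raise ValueError(f"cycle in event hierarchy at {event_id}")
        seen.add(event_id)
        chain.append(events[event_id])
        event_id = events[event_id].get("parentEventID")
    context = {}
    for event in reversed(chain):  # apply the root first, the child last
        for key, value in event.items():
            if key != "parentEventID" and value is not None:
                context[key] = value
    return context


def flatten(occurrences, events):
    """Attach the resolved event context to each occurrence record."""
    return [{**resolve_context(o["eventID"], events), **o} for o in occurrences]


flat = flatten(OCCURRENCES, EVENTS)
print(flat[0])
```

Storing context once per event, rather than repeating it on every occurrence row, is what lets the sampling design stay explicit when a single sample yields thousands of sequence-derived records.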