Biodiversity data linkage is a prerequisite for addressing pressing societal and research questions. Digitisation of resources has progressed rapidly, but this has led to a number of parallel approaches for datasets and services, making it difficult for users to identify the appropriate tool for their requirements. While scientific names are a key element for data linkage, they are not fully suited to this role. Many of them are not unique identifiers for taxa (such as the species that form the basic building block of a large part of biodiversity information) because taxonomic research has resulted in name changes. Likewise, usage is prone to errors such as misspellings or misapplication of nomenclatural rules. This complexity and unreliability may demotivate data providers from readying their valuable datasets for integration into an interlinked landscape of biodiversity information. Correct linking of scientific names is an important element in this process of linking data from various sources. Aggregated taxonomic datasets and name-matching services are essential infrastructure for building tools for this purpose, and they increasingly use unique, resolvable and persistent name identifiers. Accessing the aggregated dataset via these identifiers can also display connections between the names within the dataset, for example linking orthographic variants to the correct spelling and synonyms to the accepted taxon name. External datasets that hold valuable research and usage information as taxon-associated data can also be linked directly via these identifiers. This helps them to meet the FAIR data criteria (Findable, Accessible, Interoperable, Reusable). Name matching finds the aggregator's name identifier; adding this identifier to the local data record may help to further interconnect biodiversity information, provided that data providers are motivated to take part in this process of enabling data linkage.
We posit that this motivation can be increased by simplifying the process of choosing the right method for data providers' specific needs, which may vary widely. We thus propose a framework that helps users to select the optimal workflow to address these needs. This involves: (i) identifying and categorising the specific criteria that characterise different user needs; (ii) based on these criteria, defining the necessary metadata both for taxonomic datasets accessed by name-matching services and for the name-matching services themselves; (iii) providing tools based on these metadata that will allow users to interactively and confidently choose the best-fit target dataset and name-matching service for their specific purpose.
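
As a rough illustration of the name-matching step described above, the following Python sketch matches a possibly misspelled name from a local record against a small in-memory stand-in for an aggregated dataset, then attaches both the matched name identifier and the accepted name identifier to the record. Everything in it is an assumption made for illustration: the taxon names and identifiers are invented placeholders, the record layout and the functions `match_name` and `link_record` are hypothetical, and the fuzzy matching relies on the standard-library `difflib` module rather than on any particular name-matching service.

```python
from difflib import get_close_matches

# Invented placeholder records; not taken from any real aggregator.
# Each name identifier points to the identifier of its accepted name.
AGGREGATED_NAMES = {
    "name:0001": {"name": "Examplea alba", "accepted_id": "name:0001"},
    "name:0002": {"name": "Examplea candida", "accepted_id": "name:0001"},  # synonym
    "name:0003": {"name": "Fictitia rubra", "accepted_id": "name:0003"},
}


def match_name(query, cutoff=0.85):
    """Return (matched_id, accepted_id) for the closest name, or None."""
    by_name = {rec["name"].lower(): name_id
               for name_id, rec in AGGREGATED_NAMES.items()}
    # Collapse repeated whitespace and ignore case before comparing.
    normalised = " ".join(query.split()).lower()
    candidates = get_close_matches(normalised, list(by_name), n=1, cutoff=cutoff)
    if not candidates:
        return None
    matched_id = by_name[candidates[0]]
    return matched_id, AGGREGATED_NAMES[matched_id]["accepted_id"]


def link_record(record):
    """Return a copy of a local record with name identifiers attached."""
    result = match_name(record["scientific_name"])
    linked = dict(record)
    linked["matched_name_id"], linked["accepted_name_id"] = result or (None, None)
    return linked


if __name__ == "__main__":
    # A misspelled synonym resolves to its own identifier and to the accepted name.
    print(link_record({"scientific_name": "Examplea candidda", "count": 3}))
```

Keeping the matched identifier separate from the accepted identifier preserves the name as the provider recorded it while still allowing linkage at the taxon level. The similarity cutoff is an arbitrary choice here; a real service would document its own matching rules.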