This paper proposes a novel method for measuring music similarity. Whereas existing music similarity measures have typically targeted music appreciation, the proposed method measures similarity between music samples used in music production. Conventional music recommendation approaches often rely on either metadata-based similarity or audio-feature-based similarity in isolation, which limits their effectiveness in sample-based recommendation scenarios where both compositional context and acoustic characteristics matter. To address this limitation, the proposed framework combines a hypergraph-based information similarity module with a feature-based similarity module learned using Siamese networks and triplet loss. In the information-based module, metadata attributes such as beats per minute (BPM), genre, chord, key, and instrument are modeled as vertices in a hypergraph, and Random Walk–Word2Vec embeddings are learned to capture structural relationships between music samples and their attributes. In parallel, the feature-based module employs vertex-specific Siamese networks trained on instrument and key classification tasks to learn perceptual similarity directly from audio signals. The two modules are trained independently and used jointly at the recommendation stage to provide attribute-specific similarity results for a given query sample. Results show that the proposed system achieves high Precision@k across multiple attributes and forms stable similarity structures in the embedding space, even without relying on user interaction data. These results reflect embedding consistency evaluated over the entire dataset, where training and retrieval use the same sample pool, rather than generalization to unseen samples.
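To make the information-based module concrete, the following is a minimal sketch of the random-walk step on a toy hypergraph. The attribute names, sample IDs, and walk parameters are illustrative assumptions, not the paper's actual dataset or settings; the resulting walk "sentences" would then be fed to a skip-gram Word2Vec model to obtain vertex embeddings.

```python
import random

# Toy hypergraph: each hyperedge is an attribute vertex (BPM, genre, key, ...)
# connecting the sample vertices that share that attribute.
# All names here are hypothetical placeholders.
hyperedges = {
    "bpm:120":     ["s1", "s2", "s3"],
    "genre:house": ["s1", "s3"],
    "key:Amin":    ["s2", "s3", "s4"],
}

# Inverse incidence map: sample vertex -> attribute vertices it belongs to.
sample_attrs = {}
for attr, samples in hyperedges.items():
    for s in samples:
        sample_attrs.setdefault(s, []).append(attr)

def random_walk(start, length, rng=random):
    """Alternate sample -> attribute -> sample hops to build one walk."""
    walk = [start]
    node = start
    for _ in range(length - 1):
        if node in sample_attrs:            # at a sample: hop to one of its attributes
            node = rng.choice(sample_attrs[node])
        else:                               # at an attribute: hop to a member sample
            node = rng.choice(hyperedges[node])
        walk.append(node)
    return walk

# A corpus of walks, each usable as a Word2Vec training "sentence".
walks = [random_walk("s1", 7) for _ in range(10)]
```

Because walks alternate between sample and attribute vertices, co-occurrence within a Word2Vec context window pulls samples that share metadata toward nearby embeddings.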
These results demonstrate that the proposed hybrid framework effectively captures both structural and perceptual similarity among music samples and is well suited for sample-based music recommendation in music production environments.
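For the feature-based module, the core training signal is the triplet loss used with the Siamese networks. Below is a minimal NumPy sketch of the standard hinge-form triplet loss on embedding vectors; the margin value and the toy embeddings are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss: push the anchor-positive distance below the
    anchor-negative distance by at least `margin` (squared Euclidean)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D embeddings of three audio samples.
a = np.array([0.0, 0.0])   # anchor
p = np.array([0.1, 0.0])   # same instrument/key class -> should embed nearby
n = np.array([1.0, 0.0])   # different class -> should embed far away
# d_pos = 0.01, d_neg = 1.0 -> loss = max(0, 0.01 - 1.0 + 0.2) = 0.0
```

Minimizing this loss over many (anchor, positive, negative) triplets drawn from the instrument and key classification labels shapes an embedding space in which perceptually similar samples cluster, which is what the retrieval stage then queries.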
Published in: Big Data and Cognitive Computing
Volume 10, Issue 3, Article 96
DOI: 10.3390/bdcc10030096