Uncertain knowledge graphs characterize real-world knowledge more realistically by associating a confidence score with each fact. Existing uncertain knowledge graph embedding methods predominantly use deep neural networks to learn confidence distributions, and they generally adopt the negative sampling strategies of deterministic knowledge graph embeddings. However, whether the effectiveness of negative sampling in deterministic settings transfers to uncertain settings remains insufficiently explored. This paper systematically examines the role of negative sampling in uncertain knowledge graph embeddings from both theoretical and experimental perspectives. The study finds that negative sampling has an inherent defect when applied to uncertain knowledge: forcibly labeling unobserved triples with zero confidence violates the open-world assumption inherent to uncertain knowledge graphs and may discard potentially valuable information. Systematic experiments on two benchmark datasets, using four pre-trained language models combined with downstream prediction networks, show that model performance consistently improves across all configurations once negative sampling is removed. The experiments further show that when triples are verbalized as short sentences, the choice of pre-trained language model has relatively little impact on final performance. These findings point to directions for improving the confidence prediction module and the encoding method.
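To make the criticized step concrete, the following is a minimal sketch of how negative sampling is typically grafted onto confidence regression in uncertain knowledge graph embedding. The toy triples, confidence values, and the constant dummy predictor are all illustrative assumptions, not data or models from the paper; the point is only that negative sampling injects artificial zero-confidence labels for unobserved triples, whereas the paper's preferred setting regresses only the observed scores.

```python
import random

# Hypothetical toy data: observed triples with confidence scores in (0, 1].
# Names and values are illustrative only.
observed = {
    ("lion", "is_a", "cat"): 0.95,
    ("lion", "lives_in", "savanna"): 0.80,
    ("cat", "likes", "milk"): 0.60,
}
entities = sorted({e for (h, _, t) in observed for e in (h, t)})

def negative_samples(triples, entities, k=1, seed=0):
    """Corrupt the tail of each observed triple and force confidence 0.
    This is the step the paper argues is defective: an unobserved triple
    may simply be missing under the open-world assumption, not false."""
    rng = random.Random(seed)
    negs = {}
    for (h, r, t) in triples:
        for _ in range(k):
            t2 = rng.choice([e for e in entities if e != t])
            if (h, r, t2) not in triples:
                negs[(h, r, t2)] = 0.0  # artificial zero-confidence label
    return negs

def mse_loss(predictions, targets):
    """Mean squared error between predicted and target confidences."""
    return sum((predictions[x] - c) ** 2 for x, c in targets.items()) / len(targets)

# With negative sampling: real confidences mixed with forced zeros.
train_with_ns = dict(observed)
train_with_ns.update(negative_samples(observed, entities))

# Without negative sampling (the setting the paper finds stronger):
# the model regresses only the observed confidence scores.
train_without_ns = dict(observed)

# A constant dummy predictor stands in for the PLM + prediction network.
pred = {x: 0.5 for x in train_with_ns}
print(mse_loss(pred, train_with_ns), mse_loss(pred, train_without_ns))
```

Under this setup the zero-labeled corruptions pull the regression target toward zero for triples the graph never asserts to be false, which is the information loss the abstract describes.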
DOI: 10.1117/12.3107337