ComLQ: Benchmarking Complex Logical Queries in Information Retrieval

2026 · 0 citations · Journal Article · Diamond Open Access

Authors

Ganlin Xu · Fudan University

Zhitao Yin · Fudan University

Linghao Zhang · Fudan University

Jiaqing Liang · Fudan University

Weijia Lu · TRW Automotive (United States)

Xiaodong Zhang · TRW Automotive (United States)

Zhifei Yang · TRW Automotive (United States)

Abstract

Information retrieval (IR) systems play a critical role in navigating information overload across various applications. Existing IR benchmarks primarily focus on simple queries that are semantically analogous to single- and multi-hop relations, overlooking complex logical queries involving first-order logic operations such as conjunction (∧), disjunction (∨), and negation (¬). As a result, these benchmarks cannot adequately evaluate the performance of IR models on complex queries in real-world scenarios. To address this problem, we propose a novel method that leverages large language models (LLMs) to construct a new IR dataset, ComLQ, for Complex Logical Queries, which comprises 2,909 queries and 11,251 candidate passages. A key challenge in constructing the dataset lies in capturing the underlying logical structures within unstructured text. We therefore design a subgraph-guided prompt with a subgraph indicator, which guides an LLM (such as GPT-4o) to generate queries with specific logical structures based on selected passages. Expert annotation verifies structure conformity and evidence distribution for all query-passage pairs in ComLQ. To better evaluate whether retrievers can handle queries with negation, we further propose a new evaluation metric, Log-Scaled Negation Consistency (LSNC@K). As a supplement to standard relevance-based metrics (such as nDCG and mAP), LSNC@K measures whether the top-K retrieved passages violate negation conditions in queries. Our experimental results under zero-shot settings demonstrate that existing retrieval models perform poorly on complex logical queries, especially on queries with negation, exposing their weak ability to model exclusion. In summary, ComLQ offers a comprehensive and fine-grained exploration, paving the way for future research on complex logical queries in IR.
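The abstract does not give the formula for LSNC@K, so the following is only an illustrative sketch of the general idea of checking top-K results against a query's negation conditions. It assumes a DCG-style logarithmic rank discount, and the names `violates_negation` and `negation_consistency_at_k` are hypothetical. It should not be read as the paper's definition of the metric.

```python
import math
from typing import AbstractSet, Sequence


def violates_negation(passage_atoms: AbstractSet[str],
                      negated_atoms: AbstractSet[str]) -> bool:
    """Return True if the passage mentions anything the query excludes.

    Hypothetical helper: a passage is modeled as the set of facts
    (atoms) it supports, and the query's negated atoms are given.
    """
    return bool(passage_atoms & negated_atoms)


def negation_consistency_at_k(violations: Sequence[bool], k: int) -> float:
    """Illustrative log-discounted negation consistency in [0, 1].

    ASSUMPTION: this is NOT the paper's LSNC@K formula. Each violating
    passage in the top k is penalized with weight 1 / log2(rank + 1)
    (rank is 1-based), so violations near the top cost more. A score of
    1.0 means no top-k passage violates a negation condition.
    """
    top = list(violations[:k])
    if not top:
        return 1.0
    weights = [1.0 / math.log2(i + 2) for i in range(len(top))]
    penalty = sum(w for w, v in zip(weights, top) if v)
    return 1.0 - penalty / sum(weights)


# Query of the form A ∧ ¬B: wants "electric_car", excludes "tesla".
negated = {"tesla"}
ranked_passages = [
    {"electric_car", "tesla"},   # rank 1: violates the negation
    {"electric_car", "nissan"},  # rank 2: consistent
    {"electric_car", "byd"},     # rank 3: consistent
]
flags = [violates_negation(p, negated) for p in ranked_passages]
score = negation_consistency_at_k(flags, k=3)
```

In this sketch a violation at rank 1 lowers the score more than the same violation at rank 3, which is the behavior a log-scaled metric is presumably meant to capture alongside nDCG and mAP.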

Topics & Keywords

Information Retrieval and Search Behavior · Semantic Web and Ontologies · Advanced Graph Neural Networks

UN Sustainable Development Goals

Reduced inequalities

Publication Details

Published in: Proceedings of the AAAI Conference on Artificial Intelligence

Volume 40, Issue 40, pp. 34115-34123

DOI: 10.1609/aaai.v40i40.40706