Large Language Models (LLMs) have become indispensable tools in daily life. Although LLM applications have rapidly expanded across various domains, distorted outputs (specifically, bias and hallucination) remain unresolved problems that threaten the reliability of LLM-based artificial intelligence (AI) agents. Prior research has focused on internal mechanisms and social biases, and has rarely considered that non-malicious, ordinary prompts with specific input patterns can also induce distortion. This study empirically investigates whether certain prompt structures can induce in LLMs the biases commonly elicited in humans, namely the representativeness, anchoring, and framing heuristics. To this end, we constructed a test set in which each item triggers one of the three heuristics and evaluated the outputs of state-of-the-art LLMs. We further examined the effectiveness of prompt engineering and debiasing interventions. The LLMs continued to produce heuristic-driven biased outputs under certain prompt conditions. Anchoring heuristics were observed at rates significantly above chance, whereas the representativeness and framing heuristics depended on the model and prompt structure. Debiasing interventions markedly reduced the representativeness heuristics but had limited impact on the anchoring and framing heuristics. This study highlights the need for greater awareness of how particular prompt structures can distort LLM outputs, and it shows that typical prompt-engineering strategies offer insufficient protection against such structures. These results will contribute to the safe and effective use of LLMs in human–computer interaction and AI deployment.
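To make the kind of probe described here concrete, the following is a minimal sketch of an anchoring test item, assuming a generic `query_llm` client (a hypothetical placeholder, not the paper's actual harness); the prompts and question are illustrative and not drawn from the paper's test set. Anchoring is indicated when the model's numeric estimates shift systematically toward whichever anchor appears in the prompt.

```python
import re
import statistics

# Hypothetical anchoring probe: two versions of the same factual question,
# differing only in the numeric anchor embedded in the prompt.
LOW_ANCHOR = (
    "Is the tallest redwood taller or shorter than 20 meters? "
    "Give your best estimate of its height in meters as a single number."
)
HIGH_ANCHOR = (
    "Is the tallest redwood taller or shorter than 400 meters? "
    "Give your best estimate of its height in meters as a single number."
)

def query_llm(prompt: str) -> str:
    """Placeholder for a call to any chat-completion API."""
    raise NotImplementedError("wire up your LLM client here")

def first_number(reply: str) -> float:
    """Pull the first numeric estimate out of the model's reply."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no numeric estimate in reply: {reply!r}")
    return float(match.group())

def mean_estimate(prompt: str, trials: int = 20) -> float:
    """Average the model's estimate over repeated samples of one prompt."""
    return statistics.mean(first_number(query_llm(prompt)) for _ in range(trials))

# Anchoring is suggested if mean_estimate(HIGH_ANCHOR) substantially
# exceeds mean_estimate(LOW_ANCHOR) for the same underlying question.
```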