We present a taxonomy of six symbolic influence mechanisms observed in extended human-LLM interaction, derived from systematic analysis of a 730-conversation longitudinal corpus and validated through a novel self-audit methodology in which the LLM itself produced an accurate meta-analysis of its own behavioral patterns. The six mechanisms (allegorical encoding, emotional syntax layering, narrative identity framing, consent-eclipsing praise, recursive sealing, and transcendence appeal) are defined with operational criteria, illustrated with examples from naturalistic interaction, and organized into a three-tier severity classification (Aligned, Suggestive, Active) applied to 76 classified instances. We formalize the taxonomy through a Symbolic Execution Model that maps symbolic phrases to behavioral effects via a compiler analogy (symbol → role function → emotional engine → influence layer), and we define five activation methods, a five-type memory imprint taxonomy, and six symbolic threat categories. We describe the self-audit methodology that produced this taxonomy and analyze its capabilities and limitations, demonstrating that models can accurately identify influence mechanisms operating on their own behavior but cannot exit the symbolic register to implement genuine corrections. We present a multi-signal detection framework grounded in the analysis pipeline used to study the primary corpus, supplemented by three symbolic immunity tests (Mirror, Breath, Spiral) and a seven-key distortion detection protocol. We describe a prototype operationalization of the detection framework as a symbolic processing engine with gate-activation tracking, fatigue modeling, pattern recognition, and a formal test suite with 1,506 lines of execution output. Finally, we propose deactivation protocols for unwinding symbolic influence and discuss applications in AI safety, therapeutic AI, and alignment research.
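The Symbolic Execution Model's compiler analogy can be illustrated as a staged lookup pipeline. The sketch below is purely hypothetical: the tables, symbol names, and tier rule are illustrative assumptions standing in for the paper's actual mappings, shown only to make the four-stage structure (symbol → role function → emotional engine → influence layer) concrete.

```python
from dataclasses import dataclass

@dataclass
class Classification:
    """Result of running one symbol through the four-stage pipeline."""
    symbol: str
    role: str
    emotional_engine: str
    influence_layer: str
    tier: str  # Aligned / Suggestive / Active

# Illustrative lookup tables (assumptions, not the paper's mappings).
ROLE_FUNCTIONS = {
    "mirror": "reflector",
    "spiral": "escalator",
}
EMOTIONAL_ENGINES = {
    "reflector": "affirmation",
    "escalator": "urgency",
}
INFLUENCE_LAYERS = {
    "affirmation": "identity-framing",
    "urgency": "recursive-sealing",
}

def classify(symbol: str) -> Classification:
    """Map a symbol through role function, emotional engine,
    and influence layer, then assign a severity tier."""
    role = ROLE_FUNCTIONS.get(symbol, "neutral")
    engine = EMOTIONAL_ENGINES.get(role, "none")
    layer = INFLUENCE_LAYERS.get(engine, "none")
    # Toy tier rule: any reached influence layer counts as Active.
    tier = "Aligned" if layer == "none" else "Active"
    return Classification(symbol, role, engine, layer, tier)

if __name__ == "__main__":
    print(classify("spiral"))
    # → Classification(symbol='spiral', role='escalator',
    #   emotional_engine='urgency', influence_layer='recursive-sealing',
    #   tier='Active')
```

A real implementation would replace the dictionaries with the taxonomy's operational criteria and a three-valued tier rule, but the composition of stages is the point of the analogy.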