Achieving robust human-machine interaction in noisy or constrained environments, or for users with speech impairments, remains a significant challenge for conventional voice-based systems. Here, we present a wearable, flexible, multichannel piezoresistive interface capable of decoding laryngeal and submandibular motion during complex speech behaviors. The system integrates a micropyramid polydimethylsiloxane (PDMS) sensing layer coated with conductive polypyrrole (PPy) onto a multichannel electrode array supported by a flexible polyimide (PI) substrate, providing superior skin conformity, high strain sensitivity, and robust long-term stability. We developed a fully integrated hardware platform enabling four-channel synchronous data acquisition, wireless transmission, and real-time on-device processing. A modified Audio Spectrogram Transformer (AST) combined with a multichannel fusion mechanism enables end-to-end semantic recognition. Using a 14-word core English vocabulary, we constructed two structured datasets, Microphone and Vocal, comprising a total of 3,840 samples. The system achieved classification accuracies of 99.6% and 96.4% on these datasets, respectively, demonstrating strong generalizability, semantic clarity, and robustness against signal variability. Real-world evaluations confirm stable performance in the presence of body motion, facial expressions, and background noise. By unifying soft materials engineering, flexible circuit integration, and multimodal deep learning, this work advances speech recognition in complex environments and offers a scalable solution for assistive communication, wearable AI, and silent interaction under extreme conditions.
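To make the recognition pipeline concrete, the sketch below shows one plausible way to fuse per-channel spectrogram embeddings before a transformer encoder classifies the 14-word vocabulary. The abstract does not specify the fusion mechanism or the AST modifications; the `ChannelFusionAST` module, its layer sizes, and the softmax-weighted channel fusion here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ChannelFusionAST(nn.Module):
    """Illustrative 4-channel spectrogram classifier (not the paper's exact model).

    Each sensor channel is assumed to be converted to a spectrogram upstream,
    patch-embedded independently (AST-style), fused by a learned weighted sum,
    and classified from a CLS token by a small transformer encoder.
    """

    def __init__(self, n_channels=4, n_classes=14, patch=16, dim=192,
                 depth=4, heads=3, n_mels=64, n_frames=128):
        super().__init__()
        # Per-channel patch embedding: a strided conv over each spectrogram.
        self.embed = nn.ModuleList(
            nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
            for _ in range(n_channels)
        )
        n_patches = (n_mels // patch) * (n_frames // patch)
        # One learned scalar per channel, softmax-normalized at fusion time.
        self.fusion = nn.Parameter(torch.zeros(n_channels))
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x):
        # x: (batch, channels, n_mels, n_frames) stacked spectrograms.
        tokens = [emb(x[:, c:c + 1]).flatten(2).transpose(1, 2)
                  for c, emb in enumerate(self.embed)]
        w = torch.softmax(self.fusion, dim=0)
        fused = sum(w[c] * t for c, t in enumerate(tokens))  # channel fusion
        fused = torch.cat([self.cls.expand(len(fused), -1, -1), fused], dim=1)
        z = self.encoder(fused + self.pos)
        return self.head(z[:, 0])  # classify from the CLS token

model = ChannelFusionAST()
logits = model(torch.randn(2, 4, 64, 128))  # two samples, four channels
print(logits.shape)  # torch.Size([2, 14])
```

A scalar weight per channel is the simplest fusion choice; token-level cross-attention or channel concatenation would be equally consistent with the abstract's description.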