Abstract

An increasingly popular approach to investigating the neural bases of speech processing is forward modeling via a multivariate temporal response function (mTRF). This approach uses stimulus characteristics to predict neural responses, particularly in EEG and MEG. A central question is how best to represent the input stimulus. For speech processing, established representations include the speech envelope or spectrogram, as well as feature-based linguistic representations of phonetic content. However, when multiple representations are used as input, a key challenge is how to isolate their relative effects. This is particularly difficult because such representations have nonvanishing mutual information. To address this problem, we propose optimizations to the mTRF framework via a novel statistical approach of cyclic permutation. We additionally propose methodological improvements to the mTRF model targeting three key challenges: effectively managing the spatial and temporal autocorrelations endemic to multi-sensor EEG data; mitigating the effects of endogenous drift; and introducing robust artifact rejection to enhance data quality. To demonstrate its effectiveness, the method was applied to a new EEG data set of natural language listening in 27 adults with normal hearing. Our data showed that including ICA decomposition, artifact rejection, and cyclic permutations in an mTRF analysis improves the isolation of neural responses specific to phonetic and acoustic input variables.

Author Summary

Speech processing happens in stages: it starts with recognizing basic sounds, then groups them into discrete categories called phonemes, and goes on to understanding words and sentences. The multivariate temporal response function (mTRF) is a method for predicting brain activity from different features of the speech stimulus. Features that can serve as input to the mTRF model include acoustic features, such as the sound envelope, as well as more abstract language features, such as phonemes, the fundamental building blocks of words. One problem in speech research is distinguishing neural responses to different features. This is challenging because knowing one feature of the speech stimulus allows educated guesses about other features, as well as about how that feature will evolve over time. Both of these properties of speech make multivariate temporal statistical analysis more difficult. To address this, we propose changes to the preprocessing of the EEG recordings and a new mathematical model that uses a partially rearranged version of the speech features to isolate the predictive power of a particular type of feature.
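The idea of "partially rearranging" a feature can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' implementation) of a cyclic-permutation test in a lagged linear forward model: one feature is cyclically shifted so that its autocorrelation structure is preserved while its temporal alignment with the neural response is destroyed, giving a chance-level baseline for that feature's predictive contribution. All names, sizes, and the simulated data are illustrative assumptions.

```python
# Hypothetical sketch of a cyclic-permutation test for one stimulus feature
# in an mTRF-style lagged linear forward model. Simulated data only.
import numpy as np

rng = np.random.default_rng(0)
n, lags = 2000, 10

envelope = rng.standard_normal(n)   # stand-in acoustic feature
phoneme = rng.standard_normal(n)    # stand-in phonetic feature

def lagged(x, lags):
    """Stack time-lagged copies of x into an (n, lags) design matrix."""
    return np.column_stack([np.roll(x, k) for k in range(lags)])

# Simulated EEG channel driven by both features plus noise
y = (lagged(envelope, lags) @ rng.standard_normal(lags)
     + lagged(phoneme, lags) @ rng.standard_normal(lags)
     + 0.5 * rng.standard_normal(n))

def fit_score(env, pho, y):
    """Fit a joint lagged linear model and return the prediction accuracy r."""
    X = np.column_stack([lagged(env, lags), lagged(pho, lags)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.corrcoef(y, X @ beta)[0, 1]

full = fit_score(envelope, phoneme, y)

# Cyclic shifts preserve the phonetic feature's internal structure but
# break its alignment with the response, yielding a null distribution.
null = [fit_score(envelope, np.roll(phoneme, s), y)
        for s in rng.integers(lags, n - lags, size=20)]

print(f"full model r = {full:.3f}")
print(f"mean null r (cyclic shifts) = {np.mean(null):.3f}")
```

If the full model outperforms the cyclic-shift null distribution, the shifted feature carries time-locked predictive information beyond what the remaining features explain.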