Mapping Whisper Representations to Human ECoG Responses with Interpretable Time-Resolved Neural Encoding

June 01, 2026 · Grace Period · 🏛 ICLR 2026 Workshop on Representational Alignment

โณ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors: Matteo Ciferri, Tommaso Boccato, Michal Olak, Matteo Ferrante, Nicola Toschi
arXiv ID: 2606.02305
Category: q-bio.NC
Cross-listed: cs.HC
Citations: 0
Venue: ICLR 2026 Workshop on Representational Alignment
Abstract
Understanding how speech foundation models relate to human cortical activity is a key challenge for computational neuroscience. Here, we investigate how internal representations from Whisper predict intracranial ECoG responses during naturalistic speech perception. We introduce a time-resolved neural encoder that combines speech embeddings with a recurrent temporal model and soft attention, allowing us to examine layer-wise brain alignment. Intermediate Whisper layers provide the strongest correspondence with neural activity, supporting a hierarchical match between model representations and cortical speech processing. Comparisons with baselines show that high-resolution ECoG responses benefit from temporally structured modelling beyond linear mappings from the same speech representations. In addition, attention maps reveal temporally local alignment between speech embeddings and neural responses, while a phonemic interpretability analysis identifies anatomically coherent phoneme-category organization among encoding-informative electrodes. Together, these results suggest that speech foundation models offer a useful framework for studying time-resolved cortical speech representations.
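The abstract mentions soft attention over speech embeddings but does not specify the architecture. As a rough illustration only, the sketch below shows generic scaled dot-product soft attention over a short sequence of embedding vectors. The function name `soft_attention`, the toy vectors, and the choice of dot-product scoring are all assumptions made for this example and are not taken from the paper.

```python
import math


def soft_attention(query, keys, values):
    """Generic scaled dot-product soft attention (illustrative, not the paper's model).

    query:  list of d floats
    keys:   list of T vectors, each of length d
    values: list of T vectors, each of length m
    Returns (context, weights): the weighted sum of values and the T attention weights.
    """
    # One scaled dot-product score per time step.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]

    # Numerically stable softmax turns scores into weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


# Toy stand-ins for three time steps of speech embeddings (hypothetical values).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
context, weights = soft_attention(query, embeddings, embeddings)
```

In this toy setting, `weights` plays the role of one row of an attention map: it shows how strongly each time step of the embedding sequence contributes to the prediction at a given moment.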