Nature Human Behaviour | Article
https://doi.org/10.1038/s41562-024-01867-y
Representation of internal speech by single neurons in human supramarginal gyrus

Sarah K. Wandelt 1,2, David A. Bjånes 1,2,3, Kelsie Pejsa 1,2, Brian Lee 1,4,5, Charles Liu 1,3,4,5 & Richard A. Andersen 1,2

1 Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA. 2 T&C Chen Brain-Machine Interface Center, California Institute of Technology, Pasadena, CA, USA. 3 Rancho Los Amigos National Rehabilitation Center, Downey, CA, USA. 4 Department of Neurological Surgery, Keck School of Medicine of USC, Los Angeles, CA, USA. 5 USC Neurorestoration Center, Keck School of Medicine of USC, Los Angeles, CA, USA. e-mail: skwandelt@gmail.com

Received: 15 May 2023 | Accepted: 16 March 2024 | Published online: xx xx xxxx
Speech brain–machine interfaces (BMIs) translate brain signals into words or audio outputs, enabling communication for people who have lost the ability to speak as a result of disease or injury. While important advances in vocalized, attempted and mimed speech decoding have been achieved, results for internal speech decoding are sparse and have yet to achieve high functionality. Notably, it is still unclear from which brain areas internal speech can be decoded. Here, two participants with tetraplegia, with implanted microelectrode arrays located in the supramarginal gyrus (SMG) and primary somatosensory cortex (S1), performed internal and vocalized speech of six words and two pseudowords. In both participants, we found significant neural representation of internal and vocalized speech at the single-neuron and population level in the SMG. From recorded population activity in the SMG, the internally spoken and vocalized words were significantly decodable. In an offline analysis, we achieved average decoding accuracies of 55% and 24% for the two participants, respectively (chance level 12.5%), and during an online internal speech BMI task, we averaged 79% and 23% accuracy, respectively. Evidence of shared neural representations between internal speech, word reading and vocalized speech processes was found in participant 1. The SMG represented words as well as pseudowords, providing evidence for phonetic encoding. Furthermore, our decoder achieved high classification accuracy with multiple internal speech strategies (auditory imagination/visual imagination). Activity in S1 was modulated by vocalized but not internal speech in both participants, suggesting that no articulator movements of the vocal tract occurred during internal speech production. This work represents a proof of concept for a high-performance internal speech BMI.
Speech is one of the most basic forms of human communication, a natural and intuitive way for humans to express their thoughts and desires. Neurological diseases like amyotrophic lateral sclerosis (ALS) and brain lesions can lead to the loss of this ability. In the most severe cases, patients who experience full-body paralysis might be left without any means of communication. Patients with ALS self-report loss of speech as their most serious concern^1. Brain–machine interfaces (BMIs) are devices offering a promising technological path to bypass neurological impairment by recording neural activity directly from the cortex.
Cognitive BMIs have demonstrated potential to restore independence to participants with tetraplegia by reading out movement intent directly from the brain^2–5. Similarly, reading out internal (also reported as inner, imagined or covert) speech signals could allow the restoration of communication to people who have lost it.

Decoding speech signals directly from the brain presents its own unique challenges. While non-invasive recording methods such as functional magnetic resonance imaging (fMRI), electroencephalography (EEG) or magnetoencephalography^6 are important tools to locate speech and internal speech production, they lack the necessary temporal and spatial resolution, adequate signal-to-noise ratio or portability for building an online speech BMI^7–9. For example, state-of-the-art EEG-based imagined speech decoding performance in 2022 ranged from approximately 60% to 80% binary classification accuracy^10. Intracortical electrophysiological recordings have higher signal-to-noise ratios and excellent temporal resolution^11 and are a more suitable choice for an internal speech decoding device.

Invasive speech decoding has predominantly been attempted with electrocorticography (ECoG)^9 or stereo-electroencephalographic depth arrays^12, as they allow sampling of neural activity from different parts of the brain simultaneously. Impressive results in vocalized and attempted speech decoding and reconstruction have been achieved using these techniques^13–18. However, vocalized speech has also been decoded from localized regions of the cortex. In 2009, the use of a neurotrophic electrode^19 demonstrated real-time speech synthesis from the motor cortex. More recently, speech neuroprosthetics were built from small-scale microelectrode arrays located in the motor cortex^20,21, premotor cortex^22 and supramarginal gyrus (SMG)^23, demonstrating that vocalized speech BMIs can be built using neural signals from localized regions of cortex.
While important advances in vocalized speech^16, attempted speech^18 and mimed speech^17,22,24–26 decoding have been made, highly accurate internal speech decoding has not been achieved. Lack of behavioural output, lower signal-to-noise ratio and differences in cortical activation compared with vocalized speech are speculated to contribute to the lower classification accuracies of internal speech^7,8,13,27,28. In ref. 29, patients implanted with ECoG grids over frontal, parietal and temporal regions silently read or vocalized written words from a screen. They significantly decoded vowels (37.5%) and consonants (36.3%) from internal speech (chance level 25%). Ikeda et al.^30 decoded three internally spoken vowels using ECoG arrays and frequencies in the beta band, with up to 55.6% accuracy from the Broca area (chance level 33%). Using the same recording technology, ref. 31 investigated the decoding of six words during internal speech. The authors demonstrated an average pairwise classification accuracy of 58%, reaching 88% for the highest pair (chance level 50%). These studies were so-called open-loop experiments, in which the data were analysed offline after acquisition. A recent paper demonstrated real-time (closed-loop) speech decoding using stereotactic depth electrodes^32. The results were encouraging as internal speech could be detected; however, the reconstructed audio was not discernible and required audible speech to train the decoding model.

While, to our knowledge, internal speech has not previously been decoded from SMG, evidence for internal speech representation in the SMG exists. A review of 100 fMRI studies^33 not only described SMG activity during speech production but also suggested its involvement in subvocal speech^34,35. Similarly, an ECoG study identified high-frequency SMG modulation during vocalized and internal speech^36. Additionally, fMRI studies have demonstrated SMG involvement in phonological processing, for instance, during tasks in which participants reported whether two words rhyme^37. Performing such tasks requires the participant to internally 'hear' the word, indicating potential internal speech representation^38. Furthermore, a study performed in people suffering from aphasia found that lesions in the SMG and its adjacent white matter affected inner speech rhyming tasks^39. Recently, ref. 16 showed that electrode grids over SMG contributed to vocalized speech decoding. Finally, vocalized grasps and colour words were decodable from SMG in one of the same participants involved in this work^23. These studies provide evidence for the possibility of an internal speech decoder based on neural activity in the SMG.

The relationship between inner speech and vocalized speech is still debated. The general consensus posits similarities between internal and vocalized speech processes^36, but the degree of overlap is not well understood^8,35,40–42. Characterizing similarities between vocalized and internal speech could provide evidence that results found with vocalized speech translate to internal speech. However, such a relationship is not guaranteed. For instance, some brain areas involved in vocalized speech might be poor candidates for internal speech decoding.

In this Article, two participants with tetraplegia performed internal and vocalized speech of eight words while neurophysiological responses were captured from two implant sites. To investigate neural semantic and phonetic representation, the words comprised six lexical words and two pseudowords (words that mimic real words without semantic meaning). We examined representations of various language processes at the single-neuron level using microelectrode arrays implanted in the SMG, located in the posterior parietal cortex (PPC), and in the arm and/or hand regions of the primary somatosensory cortex (S1). S1 served as a control for movement, owing to emerging evidence of its activation beyond defined regions of interest^43,44. Words were presented with an auditory or a written cue and were produced internally as well as orally. We hypothesized that SMG and S1 activity would modulate during vocalized speech and that SMG activity would modulate during internal speech. Shared representation between internal speech, vocalized speech, auditory comprehension and word reading processes was investigated.

Results

Task design
We characterized neural representations of four different language processes within a population of SMG and S1 neurons: auditory comprehension, word reading, internal speech and vocalized speech production.
Fig. 1 | Multielectrode implant locations. a,b, SMG implant locations in participant 1 (1 × 96 multielectrode array) (a) and participant 2 (1 × 64 multielectrode array) (b). c,d, S1 implant locations in participant 1 (2 × 96 multielectrode arrays) (c) and participant 2 (2 × 64 multielectrode arrays) (d).
In this manuscript, internal speech refers to engaging a prompted word internally ('inner monologue'), without correlated motor output, while vocalized speech refers to audibly vocalizing a prompted word. Participants were implanted in the SMG and S1 on the basis of grasp localization fMRI tasks (Fig. 1).
The task contained six phases: an inter-trial interval (ITI), a cue phase (cue), a first delay (D1), an internal speech phase (internal), a second delay (D2) and a vocalized speech phase (speech). Words were cued with either an auditory or a written version of the word (Fig. 2a). Six of the words were informed by ref. 31 (battlefield, cowboy, python, spoon, swimming and telephone). Two pseudowords (nifzig and bindip) were added to explore phonetic representation in the SMG. The first participant completed ten session days, composed of both the auditory and the written cue tasks. The second participant completed nine sessions, focusing only on the written cue task. The participants were instructed to internally say the cued word during the internal speech phase and to vocalize the same word during the speech phase.
Fig. 2 | Neurons in the SMG represent language processes. a, Written words and sounds were used to cue six words and two pseudowords in a participant with tetraplegia. The 'audio cue' task was composed of an ITI, a cue phase during which the sound of one of the words was emitted from a speaker (between 842 and 1,130 ms), a first delay (D1), an internal speech phase, a second delay (D2) and a vocalized speech phase. The 'written cue' task was identical to the 'audio cue' task, except that written words appeared on the screen for 1.5 s. Eight repetitions of eight words were performed per session day and per task for the first participant. For the second participant, 16 repetitions of eight words were performed for the written cue task. b–e, Example smoothed firing rates of neurons tuned to four words in the SMG for participant 1 (auditory cue, python (b), and written cue, telephone (c)) and participant 2 (written cue, nifzig (d), and written cue, spoon (e)). Top: the average firing rate over 8 or 16 trials (solid line, mean; shaded area, 95% bootstrapped confidence interval). Bottom: one example trial with associated audio amplitude (grey). Vertically dashed lines indicate the beginning of each phase. Single neurons modulate firing rate during internal speech in the SMG.
For each of the four language processes, we observed selective modulation of individual neurons' firing rates (Fig. 2b–e). In general, the firing rates of neurons increased during the active phases (cue, internal and speech) and decreased during the rest phases (ITI, D1 and D2). A variety of activation patterns were present in the neural population. Example neurons were selected to demonstrate increases in firing rates during internal speech, cue and vocalized speech. Both the auditory (Fig. 2b) and the written cue (Fig. 2c–e) evoked highly modulated firing rates of individual neurons during internal speech.

These stereotypical activation patterns were evident at the single-trial level (Fig. 2b–e, bottom). When the auditory recording was overlaid with firing rates from a single trial, a heterogeneous neural response was observed (Supplementary Fig. 1a), with some SMG neurons preceding or lagging peak auditory levels during vocalized speech.
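As an illustrative sketch of how such trial-averaged traces can be produced (not the preprocessing code used in this study), binned spike counts can be converted to rates, smoothed, and summarized with a bootstrapped confidence interval of the mean; the bin width, smoothing kernel and function names below are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smoothed_firing_rate(spike_counts, bin_s=0.05, sigma_bins=2):
    """Convert binned spike counts (trials x bins) to smoothed firing rates in Hz."""
    rates = spike_counts / bin_s
    return gaussian_filter1d(rates, sigma=sigma_bins, axis=-1)

def mean_with_bootstrap_ci(rates, n_boot=1000, ci=95, seed=None):
    """Trial-averaged rate and a bootstrapped CI of the mean, per time bin."""
    rng = np.random.default_rng(seed)
    n_trials = rates.shape[0]
    boot_means = np.empty((n_boot, rates.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n_trials, size=n_trials)  # resample trials with replacement
        boot_means[b] = rates[idx].mean(axis=0)
    lo, hi = np.percentile(boot_means, [(100 - ci) / 2, 100 - (100 - ci) / 2], axis=0)
    return rates.mean(axis=0), lo, hi

# Toy example: 16 trials of an 8-s trial at 50-ms bins (160 bins) of Poisson counts.
counts = np.random.default_rng(0).poisson(lam=0.5, size=(16, 160))
mean_rate, ci_lo, ci_hi = mean_with_bootstrap_ci(smoothed_firing_rate(counts))
```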
Fig. 3 | Neuronal population activity modulates for individual words. a, The average percentage of neurons tuned to words in 50-ms time bins in the SMG over the trial duration for the 'auditory cue' (blue) and 'written cue' (green) tasks for participant 1 (solid line, mean over ten sessions; shaded area, 95% confidence interval of the mean). During the cue phase of auditory trials, neural data were aligned to audio onset, which occurred within 200–650 ms following initiation of the cue phase. b, The average percentage of tuned neurons computed on firing rates per task phase, with 95% confidence interval over ten sessions. Tuning during action phases (cue, internal and speech) following rest phases (ITI, D1 and D2) was significantly higher (paired two-tailed t-test, d.f. 9, P_ITI_CueWritten < 0.001, Cohen's d = 2.31; P_ITI_CueAuditory = 0.003, Cohen's d = 1.25; P_D1_InternalWritten = 0.008, Cohen's d = 1.08; P_D1_InternalAuditory < 0.001, Cohen's d = 1.71; P_D2_SpeechWritten < 0.001, Cohen's d = 2.34; P_D2_SpeechAuditory < 0.001, Cohen's d = 3.23). c, The number of neurons tuned to each individual word in each phase for the 'auditory cue' and 'written cue' tasks. d, The average percentage of neurons tuned to words in 50-ms time bins in the SMG over the trial duration for the 'written cue' (green) task for participant 2 (solid line, mean over nine sessions; shaded area, 95% confidence interval of the mean). Due to a reduced number of tuned units, only the 'written cue' task variation was performed. e, The average percentage of tuned neurons computed on firing rates per task phase, with 95% confidence interval over nine sessions. Tuning during the cue and internal phases following rest phases ITI and D1 was significantly higher (paired two-tailed t-test, d.f. 8, P_ITI_CueWritten = 0.003, Cohen's d = 1.38; P_D1_Internal = 0.001, Cohen's d = 1.67). f, The number of neurons tuned to each individual word in each phase for the 'written cue' task.
In contrast, neural activity from the primary somatosensory cortex (S1) modulated only during vocalized speech and produced similar firing patterns regardless of the vocalized word (Supplementary Fig. 1b).

Population activity represented selective tuning for individual words
Population analysis in the SMG mirrored single-neuron patterns of activation, showing increases in tuning during the active task phases (Fig. 3a,d). Tuning of a neuron to a word was determined by fitting a linear regression model to its firing rate in 50-ms time bins (Methods). Distinctions between participant 1 and participant 2 were observed. Specifically, participant 1 exhibited strong tuning, whereas the number of tuned units was notably lower in participant 2. On the basis of these findings, we ran only the written cue task with participant 2. In participant 1, representation of the auditory cue was lower compared with the written cue (Fig. 3b, cue). However, this difference was not observed for the other task phases. In both participants, tuned population activity in S1 increased during vocalized speech but not during the cue and internal speech phases (Supplementary Fig. 3a,b).
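As an illustrative sketch of this type of tuning analysis (not the analysis code used in this study), word tuning of a single neuron in each 50-ms bin can be assessed by regressing the firing rate on word-identity indicator variables and testing the regression against an intercept-only model; the indicator coding and the F-test below are assumptions, and the Methods specify the exact model.

```python
import numpy as np
from scipy import stats

def word_tuning_per_bin(firing_rates, word_labels, alpha=0.05):
    """Test, in each time bin, whether one neuron's firing rate depends on word identity.

    firing_rates: (n_trials, n_bins) firing rates in 50-ms bins.
    word_labels:  (n_trials,) integer word index per trial.
    Returns a boolean array (n_bins,) marking bins with significant word tuning.
    """
    firing_rates = np.asarray(firing_rates, dtype=float)
    word_labels = np.asarray(word_labels)
    n_trials, n_bins = firing_rates.shape
    words = np.unique(word_labels)
    # Design matrix: intercept plus one indicator column per word (first word dropped
    # to avoid collinearity with the intercept).
    X = np.column_stack([np.ones(n_trials)] +
                        [(word_labels == w).astype(float) for w in words[1:]])
    p_model = X.shape[1]
    tuned = np.zeros(n_bins, dtype=bool)
    for b in range(n_bins):
        y = firing_rates[:, b]
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        ss_res = np.sum(resid ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        if ss_tot == 0:
            continue  # neuron silent and constant in this bin
        # F-test of the word-indicator model against the intercept-only model.
        df1, df2 = p_model - 1, n_trials - p_model
        f_stat = ((ss_tot - ss_res) / df1) / (ss_res / df2)
        tuned[b] = stats.f.sf(f_stat, df1, df2) < alpha
    return tuned
```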
To quantitatively compare activity between phases, we assessed the differential response patterns for individual words by examining the variations in average firing rate across different task phases (Fig. 3b,e). In both participants, tuning during the cue and internal speech phases was significantly higher compared with the preceding rest phases ITI and D1 (paired t-test between phases. Participant 1: d.f. 9, P_ITI_CueWritten < 0.001, Cohen's d = 2.31; P_ITI_CueAuditory = 0.003, Cohen's d = 1.25; P_D1_InternalWritten = 0.008, Cohen's d = 1.08; P_D1_InternalAuditory < 0.001, Cohen's d = 1.71. Participant 2: d.f. 8, P_ITI_CueWritten = 0.003, Cohen's d = 1.38; P_D1_Internal = 0.001, Cohen's d = 1.67). For participant 1, we also observed significantly higher tuning during vocalized speech than during D2 (d.f. 9, P_D2_SpeechWritten < 0.001, Cohen's d = 2.34; P_D2_SpeechAuditory < 0.001, Cohen's d = 3.23). Representation of all words was observed in each phase, including the pseudowords (bindip and nifzig) (Fig. 3c,f). To identify neurons with selective activity for unique words, we performed a Kruskal–Wallis test (Supplementary Fig. 3c,d). The results mirrored the findings of the regression analysis in both participants, albeit more weakly in participant 2. These findings suggest that, while neural activity during active phases differed from activity during the ITI phase, neural responses of only a few neurons varied across different words for participant 2.
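For illustration only (not the statistics code used in this study), the phase comparisons and the per-neuron word-selectivity test can be computed along the following lines; the paired-sample definition of Cohen's d shown here is one common convention and may differ from the exact formula used in the paper.

```python
import numpy as np
from scipy import stats

def paired_comparison(rest_phase, active_phase):
    """Paired two-tailed t-test between per-session tuning values, plus a paired-sample
    Cohen's d (mean difference divided by the s.d. of the differences)."""
    rest_phase = np.asarray(rest_phase, dtype=float)
    active_phase = np.asarray(active_phase, dtype=float)
    t_stat, p_val = stats.ttest_rel(active_phase, rest_phase)
    diff = active_phase - rest_phase
    cohens_d = diff.mean() / diff.std(ddof=1)
    return t_stat, p_val, cohens_d

def word_selective(firing_rates, word_labels, alpha=0.05):
    """Kruskal-Wallis test: does one neuron's firing rate in a phase differ across words?"""
    firing_rates = np.asarray(firing_rates, dtype=float)
    word_labels = np.asarray(word_labels)
    groups = [firing_rates[word_labels == w] for w in np.unique(word_labels)]
    h_stat, p_val = stats.kruskal(*groups)
    return p_val < alpha

# Toy example with per-session percentages of tuned units (10 sessions).
rng = np.random.default_rng(1)
iti = rng.uniform(2, 8, size=10)          # % tuned units during ITI
cue = iti + rng.uniform(10, 25, size=10)  # % tuned units during cue
print(paired_comparison(iti, cue))
```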
The neural population in the SMG simultaneously represented several distinct aspects of language processing: temporal changes, input modality (auditory or written, for participant 1) and the unique words of our vocabulary list. We used demixed principal component analysis (dPCA) to decompose and analyse the contributions of each individual component: timing, cue modality and word. In Fig. 4, the demixed principal components (PCs) explaining the highest amount of variance were plotted by projecting the data onto their respective dPCA decoder axes.

For participant 1, the 'timing' component revealed that temporal dynamics in the SMG peaked during all active phases (Fig. 4a). In contrast, temporal S1 modulation peaked only during vocalized speech production, indicating a lack of synchronized lip and face movement of the participant during the other task phases. While 'cue modality' components were separable during the cue phase (Fig. 4b), they overlapped during subsequent phases. Thus, internal and vocalized speech representation may not be influenced by the cue modality. Pseudowords had similar separability to lexical words (Fig. 4c). The explained variance between words was high in the SMG and close to zero in S1. In participant 2, the temporal dynamics of the task were preserved ('timing' component). However, variance attributable to words was reduced, suggesting a lower neuronal ability to represent individual words in participant 2. In S1, the results mirrored the findings from S1 in participant 1 (Fig. 4d,e, right).
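dPCA itself involves regularized demixing of population activity into marginalizations for each task variable. The simplified sketch below is not dPCA and is not the analysis used in this study; it only illustrates the underlying idea of splitting trial-averaged activity into a condition-independent ('timing') part and a word-dependent part and measuring how much variance each part explains.

```python
import numpy as np

def marginalized_variance(X):
    """Rough, dPCA-flavoured variance decomposition.

    X: trial-averaged firing rates with shape (n_neurons, n_words, n_time).
    Returns the fraction of total variance captured by the condition-independent
    ('timing') marginalization and by the word-dependent remainder ('word').
    """
    Xc = X - X.mean(axis=(1, 2), keepdims=True)   # centre each neuron
    time_marg = Xc.mean(axis=1, keepdims=True)    # average over words -> pure time course
    word_marg = Xc - time_marg                    # remainder depends on word (and word x time)
    total = np.sum(Xc ** 2)
    return {
        "timing": np.sum(np.broadcast_to(time_marg, Xc.shape) ** 2) / total,
        "word": np.sum(word_marg ** 2) / total,
    }

# Toy example: 50 neurons, 8 words, 160 time bins with a shared temporal ramp
# plus small word-specific offsets.
rng = np.random.default_rng(0)
ramp = np.linspace(0, 1, 160)
X = (rng.normal(0, 0.05, (50, 8, 160))
     + ramp[None, None, :]
     + rng.normal(0, 0.2, (50, 8, 1)))
print(marginalized_variance(X))
```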
Internal speech is decodable in the SMG
Separable neural representations of both internal and vocalized speech processes implicate the SMG as a rich source of neural activity for real-time speech BMI devices. The decodability of words correlated with the percentage of tuned neurons (Fig. 3a–f) as well as with the explained dPCA variance (Fig. 4c,e) observed in the participants. In participant 1, all words in our vocabulary list were highly decodable, averaging 55% offline decoding and 79% (16–20 training trials) online decoding from neurons during internal speech (Fig. 5a,b). Words spoken during the vocalized phase were also highly discriminable, averaging 74% offline (Fig. 5a). In participant 2, offline internal speech decoding averaged 24% (Supplementary Fig. 4b) and online decoding averaged 23% (Fig. 5a), with preferential representation of the words 'spoon' and 'swimming'.

In participant 1, trial data from both types of cue (auditory and written) were concatenated for offline analysis, since SMG activity was only differentiable between the types of cue during the cue phase (Figs. 3a and 4b). This resulted in 16 trials per condition. Features were selected via principal component analysis (PCA) on the training dataset, and the PCs that explained 95% of the variance were kept. A linear discriminant analysis (LDA) model was evaluated with leave-one-out cross-validation (CV). Significance was computed by comparing results with a null distribution (Methods).
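An illustrative reconstruction of this offline pipeline (PCA retaining 95% of the variance, an LDA classifier, leave-one-out cross-validation and a label-shuffle null distribution) could look as follows with scikit-learn; the dataset shapes, shuffle count and function names are assumptions rather than the authors' implementation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

def decode_accuracy(X, y):
    """Leave-one-out accuracy of a PCA(95% variance) + LDA decoder.

    X: (n_trials, n_features) phase-averaged firing rates; y: (n_trials,) word labels.
    """
    clf = make_pipeline(PCA(n_components=0.95), LinearDiscriminantAnalysis())
    return cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()

def shuffle_null(X, y, n_shuffles=100, seed=None):
    """Null distribution of accuracies obtained after shuffling the word labels."""
    rng = np.random.default_rng(seed)
    return np.array([decode_accuracy(X, rng.permutation(y)) for _ in range(n_shuffles)])

# Toy example: 16 trials per word x 8 words, 100 'neural' features.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(8), 16)
X = rng.normal(size=(len(y), 100)) + 0.6 * np.eye(8)[y] @ rng.normal(size=(8, 100))
acc = decode_accuracy(X, y)
null = shuffle_null(X, y, n_shuffles=100)
print(acc, np.percentile(null, 99.5))  # significant if acc exceeds the 99.5th percentile
```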
Significant word decoding was observed during all phases except the ITI (Fig. 5a, n = 10; a mean decoding value above the 99.5th percentile of the shuffle distribution corresponds to P < 0.01; per phase, Cohen's d = 0.64, 6.17, 3.04, 6.59, 3.93 and 8.26, confidence interval of the mean ±1.73, 4.46, 5.21, 5.67, 4.63 and 6.49). Decoding accuracies were significantly higher in the cue, internal speech and speech conditions compared with the rest phases ITI, D1 and D2 (Fig. 5a, paired t-test, n = 10, d.f. 9, all P < 0.001, Cohen's d = 6.81, 2.29 and 5.75). Significant cue phase decoding suggested that modality-independent linguistic representations were present early within the task^45. Internal speech decoding averaged 55% offline, with the highest session at 72% and a chance level of ~12.5% (Fig. 5a, red line). Vocalized speech averaged even higher, at 74%. All words were highly decodable (Fig. 5c). As suggested by our dPCA results, individual words were not significantly decodable from neural activity in S1 (Supplementary Fig. 4a), indicating generalized activity for vocalized speech in the S1 arm region (Fig. 4c).
For participant 2, significant SMG word decoding was observed during the cue, internal and vocalized speech phases (Supplementary Fig. 4b, n = 9; a mean decoding value above the 97.5th/99.5th percentile of the shuffle distribution corresponds to P < 0.05/P < 0.01; per phase, Cohen's d = 0.35, 1.15, 1.09, 1.44, 0.99 and 1.49, confidence interval of the mean ±3.09, 5.02, 6.91, 8.14, 5.45 and 4.15). Decoding accuracies were significantly higher in the cue and internal speech conditions compared with the rest phases ITI and D1 (Supplementary Fig. 4b, paired t-test, n = 9, d.f. 8, P_ITI_Cue = 0.013, Cohen's d = 1.07; P_D1_Internal = 0.01, Cohen's d = 1.11). S1 decoding mirrored the results in participant 1, suggesting that no synchronized face movements occurred during the cue phase or the internal speech phase (Supplementary Fig. 4c).
High-accuracy online speech decoder
We developed an online, closed-loop internal speech BMI using an eight-word vocabulary (Fig. 5b). On three separate session days, training datasets were generated using the written cue task, with eight repetitions of each word for each participant. An LDA model was trained on the internal speech data of the training set, corresponding to only 1.5 s of neural data per repetition for each class. The trained decoder predicted internal speech during the online task. During the online task, the vocalized speech phase was replaced with a feedback phase. The decoded word was shown in green if correctly decoded, and in red if wrongly decoded (Supplementary Video 1). The classifier was retrained after each run of the online task, adding the newly recorded data. Several online runs were performed on each session day, corresponding to different datapoints in Fig. 5b. When using between 8 and