This paper describes our initial research into the use of
the P300 event-related potential as a control signal in a
computer interface for locked-in patients. These
experiments are directed toward development of a
device which uses the occurrence of a P300 to control
motion of a cursor on a computer screen. Simultaneous
target-detection tasks are presented in four
compass positions (N, E, S, W) on a computer screen.
The subject is instructed to detect targets in one of the
positions corresponding to the intended direction of
cursor movement. EEG signals from scalp electrodes
at Fz, Cz and Pz are collected and averaged for targets
in each of the four locations and the direction of
movement is indicated by the highest P300 peak
amplitude. Variables such as target on-time, ISI, null-
target inclusion, and trial averaging are manipulated
and the results are discussed in terms of a rate vs.
accuracy trade-off.
There exists a significant population who, due to disease
or injury, are totally paralyzed but have normal or
near-normal brain function. In such cases, known as
"locked-in syndrome," the individual is aware of his or
her surroundings, but has no way of communicating
with the outside world. In cases where the person has
even a slight degree of voluntary movement (e.g., eyebrow motion),
it is possible to use that movement as a
switch for controlling a computer. Likewise, when the
person has good eye control, he or she can be fitted
with an eye-tracking device to control cursor movement
on a computer screen. In many cases, however,
the individual may have no reliable voluntary motion
to attach a switch to, and eye-movement may not be
precise enough to use with an eye-tracking device. In
such cases, the only possible method of communication
would be to use electrical signals produced by the
brain as a switching device for computer interaction.
In order to achieve this, a reliable, detectable brain
signal must be found. Wolpaw et al. (1991) were able
to train subjects to voluntarily adjust the amplitude of
scalp-recorded mu rhythm. The mu rhythm amplitude
was assessed and translated into up and down cursor
movement. Although a fair amount of training is
required, this method has the advantage of using signals
that are endogenous to the subject. At present,
however, only one-dimensional cursor movement can
be achieved.
Because of its robustness, we believe that an evoked
electrical potential, called the P300, may serve as a good
candidate for an EEG-based computer interface. The
P300 is a late positive wave that occurs between 250
and 800 milliseconds after the onset of a meaningful
stimulus. It was first reported in 1965 as a late positive
component occurring in response to task-relevant
stimuli (Sutton et al., 1965; Walter, 1965). Sutton et al.
concluded that the late positive component was related
to the subject's psychological reaction to the stimuli
and not to the physical characteristics of the stimulus. In
fact, two years later, Sutton et al. (1967) discovered
that a P300 could be elicited by the omission of a
stimulus if its omission was task-relevant. Attempts
have been made to use P300 as a control device. Farwell
and Donchin (1988) used P300s to operate a
computer-based communication device by presenting
letters and commands in a matrix and repeatedly flashing
each row and column. P300s were elicited when
the row or column of the element that the subject was
focusing on was flashed.
In the present experiment, we explore the possibility
of using P300s to control cursor movement on a
computer screen by presenting simultaneous visual
target-detection tasks (oddball paradigms) and
measuring peak P300 amplitudes to targets occurring at
each of four compass locations (N, E, S, W). Peak
amplitudes are expected to be greatest for targets at the
location to which the subject is attending.
Subjects were seated in a sound-attenuated chamber
facing a monochrome monitor 18 inches from the subject's
face. The central fixation point was a cross.
There were four target arms (compass positions N, E,
S, W) with a target (a cross) at the end of each arm
and one centimeter away from the central fixation
point. Each stimulus was presented for 250 msec with
an interstimulus interval of either 750 or 1,000 msec.
There were two different stimulus sets: in the first,
each of the four target crosses was replaced by an
asterisk one at a time and in random order; in the
second, a null-stimulus was included, in which no asterisk
appeared. The subject was instructed to fixate the
central point and count the number of times one particular
cross (N, E, S, or W) was replaced by an asterisk.
The order of asterisk substitution was random
without replacement within each set of four (asterisk
always appearing) or five (null included) stimuli. The
target stimulus occurred with a probability of 0.25 in
the first case, and 0.2 in the second. When a blink was
detected (any signal beyond a preset threshold on the
EOG channel), that stimulus trial was discarded, and
was presented again later in the set. No set was
complete until at least one good (non-blink) trial was
recorded for each target position. Thus, each set consisted
of at least four or five trials (more if the subject
blinked). Sessions consisted of 50 complete sets. For
the pilot data presented here, two subjects
recorded data for the condition in which target probability
was 0.2, and three subjects recorded data for the target
probability 0.25 condition.
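The set construction described above can be sketched as follows. This is a minimal illustration, not the presentation software used in these experiments: the names `make_set`, `run_set`, and `blink_detected` are hypothetical, and a discarded trial is simply moved to the end of the set, which is one assumed reading of "presented again later in the set."

```python
import random

POSITIONS = ["N", "E", "S", "W"]

def make_set(include_null, rng):
    """Return one set of stimuli in random order without replacement."""
    order = POSITIONS + (["null"] if include_null else [])
    rng.shuffle(order)
    return order

def run_set(include_null, blink_detected, rng):
    """Present one set, re-queuing any trial contaminated by a blink.

    blink_detected(stimulus) is a hypothetical callback that returns True
    when the EOG channel exceeded its preset threshold on that trial.
    Returns (good_trials, total_presentations).
    """
    queue = make_set(include_null, rng)
    good = []
    presented = 0
    while queue:
        stimulus = queue.pop(0)
        presented += 1
        if blink_detected(stimulus):
            queue.append(stimulus)  # assumed: retry at the end of the set
        else:
            good.append(stimulus)
    return good, presented
```

With four stimuli per set the attended target makes up 1/4 of the trials, and with the null stimulus included it makes up 1/5, matching the target probabilities of 0.25 and 0.2 given above.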
Data Acquisition. Grass silver-silver chloride electrodes
were placed according to the international 10-20
system at Fz, Cz, and Pz and referenced to bilateral
(joined) earlobe electrodes. The EOG was recorded
from an electrode at SO2 (inferior and lateral to the
right eye) also referenced to bilateral earlobes. The
three EEG channels and single EOG channel were
amplified 50,000 times (Med Associates AG-100
physiological amplifiers), bandpass filtered between
0.15 Hz and 150 Hz, and digitized (12-bit resolution)
at a 300 Hz sampling rate on an IBM compatible 486
computer with an 8-channel DSP card. Data recording
for each trial began 50 msec before presentation of the
target stimulus and continued for a total of 650 msec.
Thus, 600 msec of EEG data were recorded for each
channel following target onset. These data were saved
for subsequent analysis.
Data Analysis. While the data to be presented here are
based on off-line analysis of collected data, we are
modeling a real time process in which the computer
estimates the direction in which the subject wishes to
move the cursor, moves the cursor one step in the estimated
direction, obtains another estimate of the
desired direction, and so forth. It is in the nature of the
task that each estimate must be independent of the last
estimate since the subject must be free to change cursor
direction at will. The estimated direction is based
on comparing P300 levels for targets on each arm of
the cursor and selecting the largest P300 level as the
most likely direction for cursor motion. This comparison
can be made as soon as a single set (i.e., four target positions)
has been obtained, or EEG activity for
each target location can be summed over a series of
sets to obtain a more stable P300 estimate. One issue
we address in our analyses of the data is the trade-off
between fast, but more error-prone, and slow, but
more accurate cursor motion.
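The decision step described above might be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used here: `estimate_direction` and `level` are hypothetical names, each epoch is taken to be a list of samples from a single channel, and `level` stands for whatever function maps a summed epoch to a P300 level (the plain maximum is used as a placeholder default).

```python
DIRECTIONS = ("N", "E", "S", "W")

def estimate_direction(sets, level=max):
    """Pick the direction whose summed epoch has the largest P300 level.

    sets  : list of dicts, one per set, mapping direction -> epoch
            (a list of samples; all epochs have the same length).
    level : hypothetical function mapping an epoch to a scalar level.
    """
    scores = {}
    for d in DIRECTIONS:
        epochs = [s[d] for s in sets]
        # Sum sample-by-sample across the sets being combined.
        summed = [sum(samples) for samples in zip(*epochs)]
        scores[d] = level(summed)
    return max(DIRECTIONS, key=lambda d: scores[d])
```

Passing a single set gives the fastest estimate; passing several consecutive sets trades time for a more stable estimate. Each call uses only the sets handed to it, so successive estimates remain independent.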
Another speed-accuracy trade-off often cited in P300
studies relates to the frequency with which a target is
presented. Low frequency targets generally produce
higher amplitude P300s than higher frequency targets.
Hence, having low frequency targets may result in better
detection of P300, but again, at the cost of longer
intervals between cursor movements.
From preliminary analyses, it was determined that the
interval from 300 to 600 msec following target onset
was the best interval for P300 detection in these
subjects. Consequently, we defined a P300 level on each
trial to be the maximum signal level in the interval
from 300 to 600 msec following target onset.
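As a sketch of this definition, the sample indices below follow from the recording parameters given under Data Acquisition (300 Hz sampling, recording starting 50 msec before target onset, 650 msec in total). The names `ms_to_index` and `p300_level` are hypothetical, and the epoch is assumed to be a list of samples from a single channel; which channel was used is not specified here.

```python
SAMPLE_RATE_HZ = 300  # from Data Acquisition
PRESTIM_MS = 50       # recording starts 50 msec before target onset

def ms_to_index(t_ms):
    """Convert a time relative to target onset (msec) to a sample index."""
    return round((t_ms + PRESTIM_MS) * SAMPLE_RATE_HZ / 1000)

def p300_level(epoch, start_ms=300, end_ms=600):
    """Maximum signal level in the window from start_ms to end_ms after onset."""
    first = ms_to_index(start_ms)
    last = ms_to_index(end_ms)
    window = epoch[first:last + 1]  # the slice is clipped at the end of the epoch
    return max(window)
```

A 650 msec epoch at 300 Hz holds 195 samples, so target onset falls at index 15 and the 300 to 600 msec window runs from index 105 to the end of the epoch.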
In general, P300 detection was better for tasks in which
target frequency was 0.20 than for those in which it was
0.25 (54.02% and 40.85% correct, respectively, averaged
across the two subjects who participated in both task
conditions). All
subsequent results, however, represent data averaged
over these two task conditions.
Turning to the issue of the gain in detection accuracy
due to summing EEG signals over a sequence of trials,
percentage correct direction scores were also calculated
for conditions in which one, two, or three successive sets
were summed before determining peak
P300 amplitudes. Figure 1 shows results of these
analyses for the three subjects separately, averaged over all
data obtained from each subject. There appears to be a
constant increase in percentage correct detection of
P300 for each additional trial summed. There is no
indication of an interaction among these three
subjects.
Figure 2 carries this analysis further for one subject
(subject A, whose performance was lowest overall). In
this figure, summation is carried out over up to 9
consecutive trials. As Figure 2 shows, the advantage (in
percentage correct) of summing trials begins to flatten
out after about 6 trials, reaching a level of over 80%
correct.
Overall, P300 detection was correct about 50% of the
time based on comparisons among peak levels within
a single set of trials. Correct detection occurred about
60% of the time when three successive sets of trials
were summed before comparing P300 levels. However, this gain of
10 percentage points in correct detection
would be accompanied by a factor of three
increase in the amount of time necessary to arrive at a
decision. Consider how this would affect a subject
attempting to move the cursor 10 steps in a particular
direction. With 50% accuracy in the P300 detection
(and assuming that errors in P300 detection are uniformly
distributed over the three unattended targets),
the subject would be expected to need 30 steps to
reach the goal of ten steps in the attended target
direction. These 30 steps, at four seconds per step, would
require about 2 minutes of actual time on task. For a
60% accuracy, the expected number of steps to reach a
ten-step goal drops to about 21, but at 12 seconds per
step, this would require over four minutes to reach the
same goal. In no data that we have examined to date
have we observed an instance in which the trade-off
between accuracy of P300 detection and time would
result in advantages for summing trials; the cost in
time far exceeds the benefits of accuracy in this task.
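The arithmetic behind these figures can be reproduced with a short sketch. It assumes, as stated above, that errors are spread evenly over the three unattended directions, so that one third of the errors move the cursor directly backward while the rest move it sideways and leave progress toward the goal unchanged. The function names are hypothetical.

```python
def expected_steps(goal_steps, accuracy):
    """Expected number of cursor steps needed to net goal_steps forward."""
    error_each = (1.0 - accuracy) / 3.0   # probability of each wrong direction
    net_per_step = accuracy - error_each  # forward minus backward; sideways nets zero
    return goal_steps / net_per_step

def time_on_task(goal_steps, accuracy, seconds_per_step):
    """Expected time in seconds to reach the goal."""
    return expected_steps(goal_steps, accuracy) * seconds_per_step

single = time_on_task(10, 0.50, 4)   # one set per decision
summed = time_on_task(10, 0.60, 12)  # three sets per decision
print(round(expected_steps(10, 0.50)), round(single / 60, 1))  # 30 steps, 2.0 minutes
print(round(expected_steps(10, 0.60)), round(summed / 60, 1))  # 21 steps, 4.3 minutes
```

Under these assumptions, the net forward progress per step rises only from 1/3 to about 0.47 when accuracy goes from 50% to 60%, while the time per step triples, which is why summing sets does not pay off here.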
Previous work by Farwell and Donchin (1988) on a
P300-based communication device concluded that
their subjects were able to achieve a communication
rate of about 0.20 bits per second. This rate is slightly
better than the rates we see in the present data when
averaged over all task conditions. Overall average
performance was 0.13 bits per second (44.27% correct)
for our slowest subject and 0.18 bits per second
(51.92% correct) for our fastest subject. However, considering
only the task in which target frequency was
0.2 for these two subjects, bit rates were 0.15 and 0.27
respectively.
To conclude, we find that averaging over trials in the
present task does not appear to be a productive way to
improve bit rate. However, varying other task variables such as
target frequency and presentation rate may
lead to moderate improvements in the accuracy with
which P300 events are detected. In future studies we
will continue to explore these and other task variables
to find conditions which lead to optimum
performance. As a signal for control of communication
devices and interfaces, the P300 has several limitations.
The most serious of these is the relatively low
bit rate associated with its use. However, for some
potential users, this low bit rate may still exceed the
rates available via other communication channels, and
at present, communication rates associated with P300
detection seem equivalent to those associated with the
detection of other brain events or states.
Duncan-Johnson, C. C., and Donchin, E. (1977). On
quantifying surprise: the variation of event-related
potentials with subjective probability.
Psychophysiology, 14, 456-467.
Farwell, L. A., and Donchin, E. (1988). Talking off the
top of your head: toward a mental prosthesis
utilizing event-related brain potentials.
Electroenceph. and Cl. Neurophys., 70, 510-523.
Sutton, S., Braren, M., Zubin, J., and John, E. R. (1965).
Evoked potential correlates of stimulus
uncertainty. Science, 150, 1187-1188.
Sutton, S., Tueting, P., Zubin, J., and John, E. R. (1967).
Information delivery and the sensory evoked
potential. Science, 155, 1436-1439.
Tueting, P., Sutton, S., and Zubin, J. (1971). Quantitative
evoked potential correlates of the probability of
events. Psychophysiology, 7, 385-394.
Walter, W. G. (1965). Brain responses to semantic
stimuli. Journal of Psychosom. Research, 9, 51-61.
Wolpaw, J. R., McFarland, D. J., Neat, G. W., and
Forneris, C. A. (1991). An EEG-based brain-computer
interface for cursor control.
Electroenceph. and Cl. Neurophys., 78, 252-259.
This work was supported by the Nemours Research
Programs.
James B. Polikoff
Applied Science and Engineering Laboratories
A. I. duPont Institute
P.O. Box 269
Wilmington, DE 19899
(302)651-6844
e-mail: polikoff@asel.udel.edu