Toward a P300-based Computer Interface

James B. Polikoff, H. Timothy Bunnell, & Winslow J. Borkowski Jr.
Applied Science and Engineering Laboratories
Alfred I. duPont Institute

Abstract

This paper describes our initial research into the use of the P300 event-related potential as a control signal in a computer interface for locked-in patients. These experiments are directed toward development of a device that uses the occurrence of a P300 to control motion of a cursor on a computer screen. Simultaneous target-detection tasks are presented in four compass positions (N, E, S, W) on a computer screen. The subject is instructed to detect targets in the one position corresponding to the intended direction of cursor movement. EEG signals from scalp electrodes at Fz, Cz, and Pz are collected and averaged for targets in each of the four locations, and the direction of movement is indicated by the highest P300 peak amplitude. Variables such as target on-time, ISI, null-target inclusion, and trial averaging are manipulated, and the results are discussed in terms of a rate vs. accuracy trade-off.

Introduction

There exists a significant population of people who, due to disease or injury, are totally paralyzed but have normal or near-normal brain function. In such cases, called "locked-in syndrome", the individual is aware of his or her surroundings but has no way of communicating with the outside world. In cases where the person has even a slight degree of voluntary movement (e.g., eyebrow motion), it is possible to use that movement as a switch for controlling a computer. Likewise, when the person has good eye control, he or she can be fitted with an eye-tracking device to control cursor movement on a computer screen. In many cases, however, the individual may have no reliable voluntary motion to which a switch can be attached, and eye movement may not be precise enough for use with an eye-tracking device. In such cases, the only possible method of communication would be to use electrical signals produced by the brain as a switching device for computer interaction. To achieve this, a reliable, detectable brain signal must be found. Wolpaw et al. (1991) were able to train subjects to voluntarily adjust the amplitude of the scalp-recorded mu rhythm. The mu rhythm amplitude was assessed and translated into up and down cursor movement. Although a fair amount of training is required, this method has the advantage of using signals that are endogenous to the subject. At present, however, only one-dimensional cursor movement can be achieved.

Because of its robustness, we believe that an evoked electrical potential called the P300 may serve as a good candidate for an EEG-based computer interface. The P300 is a late positive wave that occurs between 250 and 800 milliseconds after the onset of a meaningful stimulus. It was first reported in 1965 as a late positive component occurring in response to task-relevant stimuli (Sutton et al., 1965; Walter, 1965). Sutton et al. concluded that the late positive component was related to the subject's psychological reaction to the stimulus and not to the physical characteristics of the stimulus. In fact, two years later, Sutton et al. (1967) discovered that a P300 could be elicited by the omission of a stimulus if its omission was task-relevant. Attempts have been made to use the P300 as a control signal. Farwell and Donchin (1988) used P300s to operate a computer-based communication device by presenting letters and commands in a matrix and repeatedly flashing each row and column. P300s were elicited when the row or column containing the element on which the subject was focusing was flashed.

In the present experiment, we will explore the possibility of using P300s to control cursor movement on a computer screen by presenting simultaneous visual target-detection tasks (oddball paradigms) and measuring peak P300 amplitudes to targets occurring at each of four compass locations (N, E, S, W). Peak amplitudes are expected to be greatest for P300s in response to the direction that the subject is attending to.

Method

Subjects were seated in a sound-attenuated chamber facing a monochrome monitor 18 inches from the subject's face. The central fixation point was a cross. There were four target arms (compass positions N, E, S, W) with a target (a cross) at the end of each arm, one centimeter away from the central fixation point. Each stimulus was presented for 250 msec with an interstimulus interval of either 750 or 1,000 msec. There were two different stimulus sets: in the first, each of the four target crosses was replaced by an asterisk one at a time and in random order; in the second, a null stimulus was included, in which no asterisk appeared. The subject was instructed to fixate the central point and count the number of times one particular cross (N, E, S, or W) was replaced by an asterisk. The order of asterisk substitution was random without replacement within each set of four (asterisk always appearing) or five (null included) stimuli. The target stimulus occurred with a probability of 0.25 in the first case and 0.2 in the second. When a blink was detected (any signal beyond a preset threshold on the EOG channel), that stimulus trial was discarded and was presented again later in the set. No set was complete until at least one good (non-blink) trial was recorded for each target position. Thus, each set consisted of at least four or five trials (more if the subject blinked). Sessions consisted of 50 complete sets. For the pilot data to be presented here, two subjects recorded data for the case where the target probability was 0.2, and three subjects recorded data for the target probability 0.25 condition.
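The set structure described above (random order without replacement, with blink-contaminated trials repeated later in the same set) can be sketched as follows. This is only an illustrative sketch, not the software used in the experiment: the function name `run_set`, the `blink_detected` callback, and the choice to re-present a discarded stimulus at the end of the set are all assumptions.

```python
import random

POSITIONS = ["N", "E", "S", "W"]


def run_set(include_null, blink_detected, rng):
    """Present one set of stimuli and return (stimulus, accepted) pairs.

    include_null   -- True for the five-stimulus set (target probability 0.2)
    blink_detected -- hypothetical callback: returns True if the EOG channel
                      exceeded its threshold during this presentation
    rng            -- a random.Random instance
    """
    pending = POSITIONS + (["null"] if include_null else [])
    rng.shuffle(pending)  # random order without replacement
    presented = []
    while pending:
        stimulus = pending.pop(0)
        if blink_detected(stimulus):
            # Discard the trial and present the stimulus again later in the
            # set (assumption: it is re-queued at the end of the set).
            pending.append(stimulus)
            presented.append((stimulus, False))
        else:
            presented.append((stimulus, True))
    return presented


if __name__ == "__main__":
    trials = run_set(include_null=True,
                     blink_detected=lambda s: False,
                     rng=random.Random())
    print(trials)
```

With no blinks the set contains exactly four or five trials; each blink adds one extra presentation, which is why a set can be longer than its nominal length.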

Data Acquisition. Grass silver-silver chloride electrodes were placed according to the international 10-20 system at Fz, Cz, and Pz and referenced to bilateral (joined) earlobe electrodes. The EOG was recorded from an electrode at SO2 (inferior and lateral to the right eye), also referenced to bilateral earlobes. The three EEG channels and single EOG channel were amplified 50,000 times (Med Associates AG-100 physiological amplifiers), bandpass filtered between 0.15 Hz and 150 Hz, and digitized (12-bit resolution) at a 300 Hz sampling rate on an IBM-compatible 486 computer with an 8-channel DSP card. Data recording for each trial began 50 msec before presentation of the target stimulus and continued for a total of 650 msec. Thus, 600 msec of EEG data were recorded for each channel following target onset. These data were saved for subsequent analysis.
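As a quick check on the recording parameters, the epoch timing can be converted to sample counts at the stated 300 Hz rate. The helper names below are hypothetical and serve only to make the arithmetic explicit.

```python
SAMPLING_RATE_HZ = 300   # stated sampling rate
PRE_STIMULUS_MS = 50     # recording starts 50 msec before target onset
EPOCH_MS = 650           # total recording length per trial


def ms_to_samples(ms, rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * rate_hz / 1000)


def sample_index(latency_ms):
    """Index within the epoch of a latency measured from target onset."""
    return ms_to_samples(PRE_STIMULUS_MS) + ms_to_samples(latency_ms)


if __name__ == "__main__":
    pre = ms_to_samples(PRE_STIMULUS_MS)
    total = ms_to_samples(EPOCH_MS)
    print("pre-stimulus samples:", pre)
    print("post-onset samples:", total - pre)
    print("samples per channel per trial:", total)
```

Under these figures each channel contributes 15 pre-stimulus samples and 180 post-onset samples per trial, 195 in all.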

Data Analysis. While the data to be presented here are based on off-line analysis of collected data, we are modeling a real-time process in which the computer estimates the direction in which the subject wishes to move the cursor, moves the cursor one step in the estimated direction, obtains another estimate of the desired direction, and so forth. It is in the nature of the task that each estimate must be independent of the last estimate, since the subject must be free to change cursor direction at will. The estimated direction is based on comparing P300 levels for targets on each arm of the cursor and selecting the arm with the largest P300 level as the most likely direction for cursor motion. This comparison can be made as soon as a single set (i.e., four target positions) has been obtained, or EEG activity for each target location can be summed over a series of sets to obtain a more stable P300 estimate. One issue we address in our analyses of the data is the trade-off between fast but more error-prone cursor motion and slow but more accurate cursor motion.

Another speed-accuracy trade-off often cited in P300 studies relates to the frequency with which a target is presented. Low-frequency targets generally produce higher-amplitude P300s than higher-frequency targets. Hence, having low-frequency targets may result in better detection of the P300 but, again, at the cost of longer intervals between cursor movements.

From preliminary analyses, it was determined that the interval from 300 to 600 msec following target onset was the best interval for P300 detection in these subjects. Consequently, we defined the P300 level on each trial to be the maximum signal level in the interval from 300 to 600 msec following target onset.
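The decision rule described in this section (sum the epochs for each target position over one or more sets, take the maximum in the 300 to 600 msec window, and pick the position with the largest value) might be sketched as below. This is a minimal sketch under stated assumptions: it operates on a single channel, since the text does not say how Fz, Cz, and Pz were combined, and the function names and data layout are hypothetical.

```python
def p300_level(epoch, rate_hz=300, pre_ms=50, start_ms=300, end_ms=600):
    """Maximum signal level between start_ms and end_ms after target onset.

    epoch is a list of samples that begins pre_ms before target onset.
    """
    first = (pre_ms + start_ms) * rate_hz // 1000
    last = (pre_ms + end_ms) * rate_hz // 1000
    return max(epoch[first:last])


def sum_epochs(epochs):
    """Sample-by-sample sum of equal-length epochs."""
    return [sum(samples) for samples in zip(*epochs)]


def estimate_direction(sets):
    """Estimate the attended direction from one or more consecutive sets.

    sets is a list of dicts, one per set, each mapping a position
    ("N", "E", "S", "W") to the epoch recorded for that position
    (assumption: one channel only).
    """
    levels = {}
    for position in sets[0]:
        summed = sum_epochs([one_set[position] for one_set in sets])
        levels[position] = p300_level(summed)
    # The position with the largest P300 level is taken as the direction.
    return max(levels, key=levels.get)
```

Passing a single set gives the fastest but noisiest estimate; passing two or three consecutive sets corresponds to the summed conditions, at the cost of proportionally more time per decision.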

Results

In general, P300 detection was better for tasks in which target frequency was 0.20 than for tasks in which it was 0.25 (54.02% versus 40.85% correct, respectively, averaged across the two subjects who participated in both task conditions). All subsequent results, however, represent data averaged over these two task conditions.

Turning to the issue of the gain in detection accuracy due to summing EEG signals over a sequence of trials, percentage correct direction scores were also calculated for conditions in which one, two, or three successive sets were summed before determining peak P300 amplitudes. Figure 1 shows results of these analyses for the three subjects separately, averaged over all data obtained from each subject. There appears to be a constant increase in percentage correct detection of P300 for each additional trial summed. There is no indication of an interaction among these three subjects.

Figure 2 carries this analysis further for one subject (subject A, whose performance was lowest overall). In this figure, summation is carried out over up to 9 consecutive trials. As Figure 2 shows, the advantage (in percentage correct) of summing trials begins to flatten out after about 6 trials, reaching a level of over 80% correct.

Conclusion

Overall, P300 detection was correct about 50% of the time based on comparisons among peak levels within a single set of trials. Correct detection occurred about 60% of the time when three successive sets of trials were summed before comparing P300 levels. However, this gain of 10 percentage points in correct detection would be accompanied by a threefold increase in the amount of time necessary to arrive at a decision. Consider how this would affect a subject attempting to move the cursor 10 steps in a particular direction. With 50% accuracy in P300 detection (and assuming that errors in P300 detection are uniformly distributed over the three unattended targets), the cursor moves toward the goal half the time and away from it about one sixth of the time, for a net gain of about one third of a step per decision; the subject would therefore be expected to need 30 steps to reach the goal of ten steps in the attended target direction. These 30 steps, at four seconds per step, would require about 2 minutes of actual time on task. For 60% accuracy, the expected number of steps to reach a ten-step goal drops to about 21, but at 12 seconds per step (three sets of four seconds each), this would require over four minutes to reach the same goal. In none of the data that we have examined to date have we observed an instance in which the trade-off between accuracy of P300 detection and time would favor summing trials; the cost in time far exceeds the benefits of accuracy in this task.
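The step counts quoted above can be reproduced with a simple drift calculation. The sketch below is one reading of the estimate, not necessarily the authors' exact computation: it assumes that a wrong decision is equally likely to pick each of the three unattended directions, that only a move in the opposite direction undoes progress, and that sideways moves are ignored. The function names are hypothetical.

```python
def expected_steps(goal_steps, accuracy):
    """Expected number of decisions to gain goal_steps in the attended direction."""
    error_each = (1.0 - accuracy) / 3.0   # chance of each unattended direction
    net_gain = accuracy - error_each      # forward moves minus backward moves
    return goal_steps / net_gain


def expected_seconds(goal_steps, accuracy, seconds_per_step):
    """Expected time on task for the same goal."""
    return expected_steps(goal_steps, accuracy) * seconds_per_step


if __name__ == "__main__":
    # Single set per decision: 50% correct, 4 seconds per step.
    print(round(expected_steps(10, 0.5), 1), round(expected_seconds(10, 0.5, 4)))
    # Three summed sets per decision: 60% correct, 12 seconds per step.
    print(round(expected_steps(10, 0.6), 1), round(expected_seconds(10, 0.6, 12)))
```

On this reading, raising accuracy from 50% to 60% cuts the step count by less than a third while tripling the time per step, which is why summing does not pay off here.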

Previous work by Farwell and Donchin (1988) on a P300-based communication device concluded that their subjects were able to achieve a communication rate of about 0.20 bits per second. This rate is slightly better than the rates we see in the present data when averaged over all task conditions. Overall average performance was 0.13 bits per second (44.27% correct) for our slowest subject and 0.18 bits per second (51.92% correct) for our fastest subject. However, considering only the task in which target frequency was 0.2 for these two subjects, bit rates were 0.15 and 0.27, respectively.

To conclude, we find that averaging over trials in the present task does not appear to be a productive way to improve bit rate. However, varying other task variables like target frequency and presentation rate may lead to moderate improvements in the accuracy with which P300 events are detected. In future studies we will continue to explore these and other task variables to find conditions which lead to optimum performance. As a signal for control of communication devices and interfaces, the P300 has several limitations. The most serious of these is the relatively low bit rate associated with its use. However, for some potential users, this low bit rate may still exceed the rates available via other communication channels, and at present, communication rates associated with P300 detection seem equivalent to those associated with the detection of other brain events or states.

References

Duncan-Johnson, C. C., and Donchin, E. (1977). On quantifying surprise: the variation of event-related potentials with subjective probability. Psychophysiology, 14, 456-467.

Farwell, L. A., and Donchin, E. (1988). Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroenceph. and Cl. Neurophys., 70, 510-523.

Sutton, S., Braren, M., Zubin, J., and John, E. R. (1965). Evoked potential correlates of stimulus uncertainty. Science, 150, 1187-1188.

Sutton, S., Tueting, P., Zubin, J., and John, E. R. (1967). Information delivery and the sensory evoked potential. Science, 155, 1436-1439.

Tueting, P., Sutton, S., and Zubin, J. (1971). Quantitative evoked potential correlates of the probability of events. Psychophysiology, 7, 385-394.

Walter, W. G. (1965). Brain responses to semantic stimuli. Journal of Psychosom. Research, 9, 51-61.

Wolpaw, J. R., McFarland, D. J., Neat, G. W., and Forneris, C. A. (1991). An EEG-based brain-computer interface for cursor control. Electroenceph. and Cl. Neurophys., 78, 252-259.

Acknowledgements

This work was supported by the Nemours Research Programs.

Author address

James B. Polikoff
Applied Science and Engineering Laboratories
A. I. duPont Institute
P.O. Box 269
Wilmington, DE 19899
(302)651-6844
e-mail: polikoff@asel.udel.edu