Technical note
Thoughts turned into high-level commands: Proof-of-concept study of a vision-guided robot arm driven by functional MRI (fMRI) signals

https://doi.org/10.1016/j.medengphy.2012.02.004

Abstract

Previous studies have demonstrated the possibility of using functional MRI to control a robot arm through a brain–machine interface by directly coupling haemodynamic activity in the sensory–motor cortex to the position of two axes. Here, we extend this work by implementing interaction at a more abstract level, whereby imagined actions deliver structured commands to a robot arm guided by a machine vision system. Rather than extracting signals from a small number of pre-selected regions, the proposed system adaptively determines, at the individual level, how to map representative brain areas to the input nodes of a classifier network. In this initial study, a median action recognition accuracy of 90% was attained on five volunteers playing a game that consisted of collecting randomly positioned coloured pawns and placing them into cups. The "pawn" and "cup" instructions were imparted through four mental imagery tasks, linked to robot arm actions by a state machine. With the current MATLAB implementation, the median action recognition time was 24.3 s and the robot execution time was 17.7 s. We demonstrate that combining haemodynamic brain–machine interfacing with computer vision enables interaction at the level of high-level commands rather than individual movements, which may find application in future fMRI approaches relevant to brain-lesioned patients. The complete source code is provided to support further work on larger command sets and real-time processing.

Introduction

Brain–computer interfaces (BCI) and brain–machine interfaces (BMI) aim to provide control over a software process or hardware device by extracting information directly from brain activity, bypassing the muscles and peripheral nervous system. Since their inception in the 1970s, mainly for augmented reality applications in space exploration and defence [1], BCIs and BMIs have received increasing attention also as a means of supporting communication with severely disabled patients and as neurorehabilitation tools. Successful interaction with healthy participants and patients has been demonstrated using a variety of physiological signals, including local field potentials recorded directly from the cortex, electroencephalogram (EEG) measured at the scalp surface, and haemodynamic activity detected by functional magnetic resonance imaging (fMRI) or near-infrared spectroscopy (NIRS) [2], [3].

While cortical electrode arrays provide the most accurate representation of neural activity and even enable two-way communication, implantable systems raise obvious safety concerns and may only be considered for patients in whom the long-term advantages outweigh the risks [4]. Despite severe limitations to activity localization caused by the ill-posed inverse problem, EEG-based interfaces have evolved substantially over the last decade and currently provide viable systems for spelling, wheelchair control and domotics [5], [6]; yet, some individuals are unable to attain satisfactory performance with such devices, and considerable prior training may be required [7].

On the other hand, the haemodynamic response to neuronal activity possesses many features that would, in principle, make it an ideal substrate for BCIs and BMIs: it is comparatively large with respect to baseline fluctuations, easily localizable and highly region-specific during sensory, motor and cognitive tasks [8]. The high spatial resolution of fMRI compared to EEG can provide fine discrimination between mental states, as typified by recent work demonstrating the reconstruction of detailed perceptual stimuli from blood oxygen level dependent (BOLD) signal time-courses [9]. However, the development of haemodynamic interfaces has been hampered by the fact that fMRI technology is inherently unsuitable for portable applications, and that the imaging datasets are very large and traditionally require off-line processing over several hours [10].

Recent technological developments have led to a resurgence of interest in haemodynamic BCIs and BMIs. As reviewed in [11], algorithmic advances and the widespread availability of high performance computers have made it possible to implement real-time fMRI analysis on the scanner console itself or on networked workstations. Based on this approach, recent studies have demonstrated successful haemodynamic control of simple computer games such as navigating in a maze, attained using similarity analysis [12] as well as support vector machines [13]. In parallel, small-sized multichannel NIRS devices are gradually becoming available, and one can now envisage translating results initially obtained with fMRI to portable BCI and BMI systems [14], [15].

A major interest in general BMI research is the control of robots, aiming to replace function lost due to injury or to enhance interaction with the environment [16]. For example, the P300 event-related EEG potential has been used as a basis to drive a remote robot arm writing short messages on paper [6] and to navigate a platform in an indoor environment [17]. To date, very limited research has been done on haemodynamic BMIs for robot control. In the only available fMRI study, the BOLD signal was extracted from the left and right sensory–motor cortex and coupled in real time to the two axes of a manipulator; three healthy participants learnt to control the arm with the purpose of reaching a target object [18].
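The direct-coupling paradigm of the cited fMRI study [18] can be illustrated with a minimal sketch: the BOLD percent signal change measured in each sensorimotor region is linearly rescaled to the angle of one robot axis. The function name, signal range and angle limits below are illustrative assumptions, not values taken from the cited work.

```python
import numpy as np

def bold_to_axis_angle(psc, psc_min=0.0, psc_max=2.0,
                       angle_min=-45.0, angle_max=45.0):
    """Linearly rescale BOLD percent signal change (psc) to an axis
    angle in degrees; ranges here are illustrative assumptions."""
    frac = np.clip((psc - psc_min) / (psc_max - psc_min), 0.0, 1.0)
    return angle_min + frac * (angle_max - angle_min)

# One such mapping per hemisphere yields two-axis direct control:
left_angle = bold_to_axis_angle(1.0)   # mid-range activation -> 0 degrees
right_angle = bold_to_axis_angle(2.0)  # strong activation -> full deflection
```

Clipping keeps the axis within its mechanical limits even when baseline drift pushes the signal outside the calibrated range, which is one reason such direct schemes require per-subject calibration.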

Because the haemodynamic response is associated with a well-posed localization problem and displays high regional specificity, it is an ideal substrate for distinguishing among large sets of commands, each associated with a specific mental state. On the other hand, since its temporal dynamics are on the order of several seconds due to inherent physiological limitations, it cannot offer rapid direct control of target dynamical systems [8], [18], [19], [20]. Hence, abstracting the interaction from the level of controlling individual movements to specifying structured goals is necessary to fully exploit the potential of this bio-signal for robotic control [2], [5], [10].
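The sluggish temporal dynamics noted above can be made concrete by convolving a brief neural event with a canonical double-gamma haemodynamic response function (HRF). The sketch below uses conventional SPM-style shape parameters (peak near 5 s, undershoot near 15 s) purely as an assumption-laden illustration of why the BOLD signal cannot support rapid direct control.

```python
import numpy as np
from math import gamma

def hrf(t):
    # Canonical double-gamma HRF (SPM-style shape parameters 6 and 16;
    # illustrative, not fitted to any participant in this study).
    return (t ** 5 * np.exp(-t) / gamma(6)
            - (1.0 / 6.0) * t ** 15 * np.exp(-t) / gamma(16))

dt = 0.1                               # sampling step, seconds
t = np.arange(0, 30, dt)
stim = (t < 1.0).astype(float)         # a brief 1-s neural "command"
bold = np.convolve(stim, hrf(t))[: len(t)] * dt
peak_delay = t[np.argmax(bold)]        # BOLD peaks several seconds late
```

Even for an instantaneous command, the measurable response peaks roughly 5–6 s after onset, which is why interaction must be abstracted to structured goals rather than continuous trajectory control.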

A key feature of contemporary robot systems is the ability to execute high-level commands. Based on machine vision and multisensory feedback implemented through biologically inspired architectures, robot arms and vehicles can exhibit goal-oriented behaviour and realize structured actions [21], [22]. Hence, the analysis of distributed patterns of cortical activity could be used to program a robot at a considerably higher level than establishing a direct correspondence between the intensity of regional activity and the position of each axis. Such an approach would be particularly applicable to fMRI-based BMIs, where the fine anatomical detail could be drawn upon to select among large numbers of high-level commands and action sequences.

Extending previous work in this area, we present a novel prototype implementation of an fMRI-based BMI enabling interaction at the level of a small set of high-level commands. Four imagined actions were associated through a state machine to operations performed by a bench-top robot arm guided by a machine vision system. Instead of preselecting the BOLD signal from specific a priori regions, the system adaptively determined, at the individual level, how to map representative brain areas to the input nodes of a classifier network [18]. The complete source code is provided as supplementary material.
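The coupling between decoded imagery and robot operations can be sketched as a simple two-phase state machine: one decoded class first selects a pawn, the next selects a cup, and the cycle repeats. The class names, command strings and alternating grammar below are illustrative assumptions, not the paper's actual protocol.

```python
class CommandStateMachine:
    """Hypothetical sketch: four decoded imagery classes ("A"-"D")
    are mapped to high-level robot actions, alternating between
    pawn selection and cup selection."""

    def __init__(self):
        self.phase = "choose_pawn"
        self.log = []

    def on_decoded_action(self, action):
        # `action` is one of "A", "B", "C", "D", e.g. the output of
        # the fMRI classifier network after an imagery epoch.
        if self.phase == "choose_pawn":
            cmd = f"pick_pawn_{action}"      # vision system locates the pawn
            self.phase = "choose_cup"
        else:
            cmd = f"place_in_cup_{action}"   # vision system locates the cup
            self.phase = "choose_pawn"
        self.log.append(cmd)
        return cmd
```

Because the machine vision system resolves where each pawn and cup actually sits, the brain-derived command only needs to convey *which* object to act on, keeping the required command set small despite the slow haemodynamic channel.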


Participants

Five healthy participants (3 female; age range 26–35, median 28 years; education 18–22, median 21 years) were recruited for the experiment, which was conducted at the Fondazione IRCCS Istituto Neurologico Carlo Besta and formally approved by the local institutional review board (project ref. fMRI-DM, part B). All subjects were hospital staff, right-handed and free from neurological and psychiatric pathology. The purpose of the study was explained and written informed consent was obtained.

Task and experimental procedure

Upon

Results

The imagined actions A and B, corresponding to left and right finger tapping, consistently activated the hand knob in the contralateral pre- and post-central gyri. Action A also evoked significant activity in the supplementary motor area and ipsilateral motor cortex. Action C (word generation) consistently engaged Broca's area (left inferior frontal gyrus); in participant #2 activation extended medially towards the ventro-medial prefrontal cortex, and in #1 and #4 it included the supplementary

Discussion and conclusions

In this initial experiment, only four commands were defined. Left and right motor imagery and speech generation were chosen as they are closely related to well-established clinical fMRI tasks, and have been successfully employed in previous haemodynamic BCI studies [12], [13], [29]. The indoor exploration task was predicated on established visual imagery paradigms, and preferred over alternatives such as mental calculation as it mainly activates well-separated, posterior regions of the

Conflicts of interest statement

All authors declare that they do not have any real or perceived conflicts of interest. The results of the present study are generally applicable to fMRI brain-state analysis and not dependent on any specific hardware implementation, and all source code is provided as supplementary material.

Acknowledgements

The authors are grateful to Jim Frye of Lynxmotion Inc. (Pekin, IL, USA) for generously donating to L.M. mechanical parts of the robot arm, to Matteo Pavanello for outstanding workshop assistance, to Francesca Epifani for radiographic support, and to the personnel of the technical and estates unit of their Institution for testing and clearing the prototype robot system for experimental use. The authors are also grateful to two anonymous reviewers for offering insightful feedback on an earlier

References (45)

  • M.F. Møller, A scaled conjugate gradient algorithm for fast supervised learning, Neural Networks (1993).
  • N. Bunzeck et al., Scanning silence: mental imagery of complex sounds, Neuroimage (2005).
  • M. Arsalidou et al., Is 2 + 2 = 4? Meta-analyses of brain areas needed for numbers and calculations, Neuroimage (2011).
  • K.A. Norman et al., Beyond mind-reading: multi-voxel pattern analysis of fMRI data, Trends Cogn Sci (2006).
  • W. Penny et al., Variational Bayesian inference for fMRI time series, Neuroimage (2003).
  • C. Guger et al., How many people are able to control a P300-based brain–computer interface (BCI)?, Neurosci Lett (2009).
  • B.D. Berman et al., Self-modulation of primary motor cortex activity with motor and motor imagery tasks using real-time fMRI-based neurofeedback, Neuroimage (2012).
  • J.J. Vidal, Toward direct brain–computer communication, Ann Rev Biophys Bioeng (1973).
  • M.A. Nicolelis, Brain–machine interfaces to restore motor function and probe neural circuits, Nat Rev Neurosci (2003).
  • I.S. Kotchetkov et al., Brain–computer interfaces: military, neurosurgical, and ethical perspective, Neurosurg Focus (2010).
  • M.M. Moore, Real-world applications for brain–computer interface technology, IEEE Trans Neural Syst Rehabil Eng (2003).
  • C. Vidaurre et al., Towards a cure for BCI illiteracy, Brain Topogr (2010).