The current work contributes to the foundation for guidelines needed to develop advanced displays that support exploration missions involving extravehicular activities (EVAs) and telerobotic operations. Such displays are intended to mitigate performance decrements caused by the perceptually impoverished environment of spaceflight, which includes limited visibility, reduced sensory information channels, and communication delays. In particular, multimodal displays could improve information sharing between humans and semi-autonomous telerobotic agents by enhancing the operator’s situational awareness and perception of the operational space.
Background and Objectives: During surface exploration missions, the availability and reliability of most sensory inputs available on Earth are reduced, and the processing of visual information typically depends heavily on 2D displays such as visual maps. The current research addressed the information organization of displays that integrate navigation information as well as the health and status of crew and mission systems. It is likely that telerobotic exploration missions will require operators to perform multiple tasks simultaneously, using a different type of display for each task. This increased workload may degrade operator performance and may also interact with the effects of display modality. Time delays in the control loop of human teleoperation in space can be considered another aspect of increased task workload and can have a critical impact on human performance and mission effectiveness. The current research consisted of two studies that investigated the impact of these two forms of workload, multi-tasking (Dual Task study) and latency (Docking Task study), on performance with different types of unimodal or bimodal displays.
Dual Task Method: The first study extended previous work (Wenzel et al., 2012), which used a single orientation task (ST), to a more complex multi-tasking environment with a dual task (DT) paradigm that included both the original orientation task and a second monitoring task. During a simulated extravehicular exploration on a planetary surface, performance was compared across three types of navigation aid (NavAid): a 3D spatial auditory NavAid (A), a 2D North-Up visual map (V), and the combination of the two in a bimodal NavAid (B). Four different environmental conditions were tested, combining high and low levels of visibility and ambiguity. To facilitate comparison with the previous work, performance was analyzed separately for the Single Task (ST) and Dual Task (DT) studies, and then compared between the two tasks for all the independent variables (factors). For both studies, the quantitative dependent variables were percent correct orientation, left/right decision time, localization accuracy, and localization time. Qualitative measures (subjective ratings) were also collected after the experiment.
Dual Task Results: Overall, for both ST and DT, the bimodal NavAid was associated with the best performance in terms of both correct orientation and response times. For orientation, the auditory information channel proved to be an efficient countermeasure to the conflict generated by the difference between egocentric and allocentric reference frames in the displays tested. For localization, an “audio locator” display also greatly improved accuracy and response time. Taken together, the observations made for ST were corroborated in DT and, in fact, the bimodal advantage was more pronounced under high visual workload.
Docking Task Method: In the second study, a single-task paradigm was used that involved performing a docking task associated with telerobotic planetary exploration on Mars (linkage of a remotely controlled vehicle to a surface habitat). Three different types of docking aids (DockAid) analogous to the NavAids of the previous studies were evaluated: a 2D visual DockAid (V), an auditory DockAid (A), and a combined bimodal DockAid (B). The experiment investigated the impact of display modality of the docking aids on operator performance (docking accuracy and response time) with increased workload introduced via additional control latencies ranging from 0 to 1000 msec. Performance was also assessed under high and low visibility conditions. The quantitative dependent measures were docking accuracy (radial distance between the centers of the targeting reticle and the docking target) and the docking response time. Qualitative measures (subjective ratings) were also collected after the experiment.
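The docking accuracy measure described above (radial distance between the centers of the targeting reticle and the docking target) can be illustrated with a minimal sketch. The function names, the 2D coordinate convention, the accuracy tolerance, and the time limit used to label a trial as "on-time" are all hypothetical; the source does not give the actual scoring thresholds.

```python
import math


def docking_error(reticle_xy, target_xy):
    """Radial distance between the reticle center and the target center.

    Both arguments are (x, y) pairs in the same (hypothetical) display units.
    """
    dx = reticle_xy[0] - target_xy[0]
    dy = reticle_xy[1] - target_xy[1]
    return math.hypot(dx, dy)


def score_trial(reticle_xy, target_xy, response_time_s, tolerance, time_limit_s):
    """Score one docking trial.

    `tolerance` (maximum error counted as accurate) and `time_limit_s`
    (maximum time counted as on-time) are illustrative assumptions,
    not values taken from the study.
    """
    error = docking_error(reticle_xy, target_xy)
    accurate = error <= tolerance
    return {
        "error": error,
        "accurate": accurate,
        "on_time": accurate and response_time_s <= time_limit_s,
    }


if __name__ == "__main__":
    # Hypothetical trial: reticle 3 units right and 4 units above the target.
    print(score_trial((3.0, 4.0), (0.0, 0.0), response_time_s=25.0,
                      tolerance=5.0, time_limit_s=30.0))
```

Separating the accurate and on-time flags mirrors the distinction the results draw between total accurate dockings and on-time accurate dockings.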
Docking Task Results: On-time docking accuracy was greater in the bimodal condition than in the best unimodal condition, again supporting some form of positive multisensory effect. Although longer docking times pointed to the apparent difficulty of purely auditory docking, the high rate of accurate docking (94% total, 54% on-time) shows that operators learn to process spatialized auditory sonification very quickly. As expected, the introduction of latencies was associated with performance degradation. However, this effect was shown to be modality specific, and the presence of auditory cues provided some degree of protection against the negative impact of latencies of 500 msec or less. This performance inflection point may be related to the limits of a “cognitive horizon” for teleoperation in space, i.e., a latency limit of ~500 msec beyond which performance degrades, as described by Lester & Thronson (2011).
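For readers who want to reproduce the kind of added control latency discussed here, one simple approach is a fixed-length delay line between operator input and vehicle command. The sketch below is an assumption about how such a delay could be simulated, not a description of the study's apparatus; the class name, the update rate, and the neutral command value are hypothetical.

```python
from collections import deque


class DelayedControl:
    """Delay operator commands by a fixed latency (hypothetical helper)."""

    def __init__(self, latency_ms, update_hz, neutral=0.0):
        # Number of simulation steps that corresponds to the requested latency.
        self.delay_steps = round(latency_ms * update_hz / 1000)
        # Pre-fill with a neutral command so early outputs are well defined.
        self._buffer = deque([neutral] * self.delay_steps)

    def step(self, command):
        """Queue the newest command and return the one issued latency_ms ago."""
        self._buffer.append(command)
        return self._buffer.popleft()


if __name__ == "__main__":
    # Assumed 100 Hz control loop with a 500 msec added latency (50 steps).
    link = DelayedControl(latency_ms=500, update_hz=100)
    outputs = [link.step(float(i)) for i in range(1, 61)]
    print(outputs[49], outputs[50])  # last neutral output, then first real command
```

With a latency of 0 the buffer is empty and each command passes straight through, which gives a baseline condition with no added delay.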
Conclusions: Overall, the results of both studies support the idea of integrating 3D audio into displays to aid extravehicular activity on planetary surfaces for tasks as diverse as orientation, localization, and docking. Spatial auditory displays can aid situational awareness, navigation, and wayfinding by reducing the risk of errors and shortening response latencies. On its own, the auditory channel provides a reliable alternate source of information in cases where visual information is degraded or unavailable. In particular, it may mitigate the deleterious effects of the relatively small latencies present during future surface exploration missions, for example lunar telerobotic control from Lagrange points or telerobotic control from Mars orbit. When auditory information is provided in synergy with the visual channel, bimodal performance usually exceeds that of the best unimodal display. This multisensory enhancement proved to be inversely related to the reliability of the individual sensory inputs (inverse effectiveness effect; Meredith & Stein, 1986).
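One common way to quantify the bimodal advantage described above is an enhancement index that expresses the bimodal score as a percentage gain over the best unimodal score. The sketch below assumes a "higher is better" measure such as percent correct; the source does not state which index was used, and the numbers in the example are invented purely for illustration.

```python
def multisensory_enhancement(bimodal, auditory, visual):
    """Percentage gain of the bimodal score over the best unimodal score.

    Assumes higher scores mean better performance (e.g., proportion correct).
    """
    best_unimodal = max(auditory, visual)
    if best_unimodal <= 0:
        raise ValueError("best unimodal score must be positive")
    return 100.0 * (bimodal - best_unimodal) / best_unimodal


if __name__ == "__main__":
    # Invented scores: reliable unimodal cues versus degraded unimodal cues.
    reliable = multisensory_enhancement(bimodal=0.90, auditory=0.60, visual=0.75)
    degraded = multisensory_enhancement(bimodal=0.50, auditory=0.20, visual=0.25)
    # Inverse effectiveness predicts a larger relative gain for degraded cues.
    print(round(reliable, 1), round(degraded, 1))
```

For measures where lower is better, such as response time, the sign of the numerator would need to be reversed and the best unimodal score would be the minimum rather than the maximum.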
Recommendations for Guidelines: The results presented here demonstrate that spatial audio displays, both alone and in combination with a visual display, enhance performance and situational awareness, mitigate the impact of visual environment degradation and increased workload, and add to the intuitiveness of the information display:
• User acceptability for 3D audio is very high.
• 3D audio provides an intuitive, ecological, and low-workload solution for the presentation of spatial information.
• 3D audio can be used as an efficient substitute for visual information that is missing or degraded, as an aid when workload is increased, and as an effective countermeasure for mental remapping.
• In the orientation/localization task, bimodal presentation of the auditory (A) and visual (V) spatial information leads to a significant reduction in incorrect orientation responses and in decision times.
• The use of an auditory localizer, a type of dynamic sonification display, has proved its efficacy, particularly under degraded visual conditions.
• The results of the docking task study indicate that bimodal displays can also mitigate the negative impact of workload in the form of moderate (≤ 500 msec) control latencies.
• The docking study represents a proof of concept for a purely auditory docking aid. Although docking took longer with the auditory aid than with the visual aid, the fact that accurate auditory docking required only ~30 sec suggests that a purely auditory DockAid is a viable display solution.
Thus, it is recommended that bimodal and/or multimodal displays be used for EVA missions. It is increasingly evident that the auditory channel will need to convey spatial information about localization and navigation in an environment where the visual channel is already saturated by the display of symbology and checklists. The ecological validity of using sound for localization, combined with the possibility of learning to use virtual auditory signals to navigate between virtual waypoints, supports the integration of auditory displays in advanced EVA display systems. Similarly, the viability of auditory and/or bimodal aids for tasks such as docking argues for their application to closed-loop tasks involving moderate control latencies.
The integration of alternative ways to present information raises additional questions, such as the best methods for switching between different modes within and/or between the sensory channels made available to the operator. Issues that must be investigated include the following requirements: the use of each sensory channel must be prescribed for a given type of activity; the different functions available cannot overlap; and the sensory channels must combine appropriately to reduce overall workload while increasing the sense of presence and situational awareness.
For example, 3D audio could provide a higher level of immersion and improved perception of the six-degree-of-freedom (6-DOF) operational space. The combination of spatial and/or moving sound images with visual stimuli may increase vection, improve the sense of spatial presence, and mitigate spatial disorientation. Two potential benefits may be of particular interest: (1) providing immediate feedback on operator location as well as the actions performed in space, and (2) providing an auditory “frame of reference” such as an artificial auditory horizon combined with “auditory security boundaries” that define the crew’s position in space in relation to the external features of the environment. Further, during training, the use of spatial audio could provide an additional countermeasure against cyber-sickness (nausea, disorientation, and oculomotor disturbances) induced by scene oscillation along the different axes of motion (pitch, roll, and yaw).
Finally, future investigation of multimodal displays for surface exploration should be extended to the use of tactile displays, for example, in emergency situations requiring coordinated communications between multiple personnel where both the visual and auditory channels may be overloaded.
Meredith, M. A., & Stein, B. E. (1986). Visual, auditory, and somatosensory convergence on cells in superior colliculus results in multisensory integration. Journal of Neurophysiology, 56(3), 640-662.
Lester, D., & Thronson, H. (2011). Low-Latency Lunar Surface Telerobotics from Earth-Moon Libration Points. Proceedings of the AIAA SPACE 2011 Conference & Exposition, Long Beach, California, September 27-29.
Wenzel, E. M., Godfroy, M., & Miller, J. D. (2012). Prototype Spatial Auditory Display for Remote Planetary Exploration. Proceedings of the 133rd Convention of the Audio Engineering Society, San Francisco, California, October 26-29.