4D Attention: Comprehensive Framework
for Spatio-Temporal Gaze Mapping

IEEE Robotics and Automation Letters (RA-L), 2021



Abstract

This study presents a framework for capturing human attention in the spatio-temporal domain using eye-tracking glasses. Attention mapping is a key technology for human perceptual activity analysis or Human-Robot Interaction (HRI) to support human visual cognition; however, measuring human attention in dynamic environments is challenging because the subject must be localized and moving objects must be handled. To address this, we present a comprehensive framework, 4D Attention, for unified gaze mapping onto static and dynamic objects. Specifically, we estimate the pose of the glasses by loosely coupling direct visual localization with Inertial Measurement Unit (IMU) measurements. Further, by integrating reconstruction components into our framework, dynamic objects not captured in the 3D environment map are instantiated from the input images. Finally, a scene rendering component synthesizes a first-person view with identification (ID) textures and performs direct 2D-3D gaze association. Quantitative evaluations showed the effectiveness of our framework. Additionally, we demonstrated the applications of 4D Attention through experiments in real situations.
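
The ID-texture idea in the abstract can be illustrated with a toy sketch. It assumes each object is drawn in a flat colour that encodes its integer ID, so that the 2D gaze point maps to a 3D object by reading a single pixel of the rendered first-person view. The 24-bit RGB packing, the names `id_to_rgb`, `rgb_to_id`, and `lookup_gaze`, the background ID of 0, and the hand-written image are all assumptions made for illustration; they are not taken from the paper, whose actual rendering pipeline is not described on this page.

```python
import math
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]
BACKGROUND_ID = 0  # assumption: ID 0 marks pixels that belong to no object


def id_to_rgb(object_id: int) -> RGB:
    """Pack a 24-bit object ID into an RGB colour (hypothetical encoding)."""
    if not 0 <= object_id < (1 << 24):
        raise ValueError("object_id must fit in 24 bits")
    return ((object_id >> 16) & 0xFF, (object_id >> 8) & 0xFF, object_id & 0xFF)


def rgb_to_id(rgb: RGB) -> int:
    """Invert id_to_rgb: recover the object ID from a pixel colour."""
    r, g, b = rgb
    return (r << 16) | (g << 8) | b


def lookup_gaze(id_image: List[List[RGB]],
                gaze_xy: Tuple[float, float]) -> Optional[int]:
    """Return the ID of the object under the gaze point.

    id_image is indexed as id_image[row][col]; gaze_xy is (x, y) in pixels.
    Returns None when the gaze falls outside the image or on the background.
    """
    col, row = math.floor(gaze_xy[0]), math.floor(gaze_xy[1])
    if not (0 <= row < len(id_image) and 0 <= col < len(id_image[0])):
        return None
    object_id = rgb_to_id(id_image[row][col])
    return None if object_id == BACKGROUND_ID else object_id


# Toy stand-in for a rendered ID view (3 rows x 4 columns).
# Object 7 plays the role of a static map object, object 300 a dynamic one.
_bg, _a, _b = id_to_rgb(BACKGROUND_ID), id_to_rgb(7), id_to_rgb(300)
ID_IMAGE = [
    [_bg, _a, _a, _bg],
    [_bg, _a, _b, _b],
    [_bg, _bg, _b, _b],
]

if __name__ == "__main__":
    for gaze in [(1.4, 0.2), (2.9, 1.5), (0.0, 0.0), (10.0, 10.0)]:
        print(gaze, "->", lookup_gaze(ID_IMAGE, gaze))
```

Because static and dynamic objects share one ID image in this sketch, the lookup is the same for both, which mirrors the unified gaze mapping the abstract describes.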


Paper

4D Attention: Comprehensive Framework for Spatio-Temporal Gaze Mapping
Shuji Oishi, Kenji Koide, Masashi Yokozuka, Atsuhiko Banno
IEEE Robotics and Automation Letters, vol.6, no.4, pp.7240-7247, 2021


Video