Speech Spotter: On-demand Speech Recognition in Human-Human Conversation on the Telephone or in Face-to-Face Situations



This project was proposed and researched by Masataka Goto, Koji Kitayama, Katunobu Itou, and Tetsunori Kobayashi.


Introduction

This paper describes a novel speech-interface function, called "speech spotter", which enables a user to enter voice commands into a speech recognizer in the midst of natural human-human conversation. In the past, it has been difficult to use automatic speech recognition in human-human conversation because it was not easy to judge, from microphone input alone, whether a user was speaking to another person or to a speech recognizer. We solve this problem by using two kinds of nonverbal speech information: a filled pause (a vowel-lengthening hesitation like "er...") and voice pitch. Only when a user utters a voice command with a high pitch just after a filled pause is the voice command accepted by the speech recognizer. By using this speech-spotter function, we have built two application systems: an on-demand information system for assisting human-human conversation and a music-playback system for enriching telephone conversation. The results from using these systems have shown that the speech-spotter function is robust and convenient enough to be used in face-to-face or cellular-phone conversations.
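The acceptance rule described above (a command counts only when it is spoken at a high pitch immediately after a filled pause) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Segment` type, the `spot_commands` function, the idea that a filled-pause detector has already labeled each segment, and the threshold values `pitch_ratio` and `max_gap_s` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One stretch of speech from the microphone (hypothetical structure)."""
    start: float            # start time in seconds
    end: float              # end time in seconds
    mean_f0_hz: float       # average voice pitch over the segment
    is_filled_pause: bool   # label from an assumed filled-pause detector
    text: str = ""          # transcript, used here only for display


def spot_commands(
    segments: List[Segment],
    baseline_f0_hz: float,
    pitch_ratio: float = 1.3,   # assumed: "high pitch" = 30% above baseline
    max_gap_s: float = 0.5,     # assumed: "just after" = within 0.5 s
) -> List[Segment]:
    """Return the segments that would be passed to the speech recognizer.

    A segment is accepted only if BOTH conditions hold:
      1. the segment right before it is a filled pause that ended
         no more than max_gap_s seconds earlier, and
      2. its mean pitch is at least pitch_ratio times the baseline pitch.
    Everything else is treated as ordinary conversation and ignored.
    """
    accepted = []
    for prev, cur in zip(segments, segments[1:]):
        if cur.is_filled_pause:
            continue  # a filled pause is a trigger, never a command
        follows_pause = (
            prev.is_filled_pause and (cur.start - prev.end) <= max_gap_s
        )
        high_pitch = cur.mean_f0_hz >= baseline_f0_hz * pitch_ratio
        if follows_pause and high_pitch:
            accepted.append(cur)
    return accepted


if __name__ == "__main__":
    conversation = [
        Segment(0.0, 1.5, 120.0, False, "see you tomorrow"),
        Segment(2.0, 2.6, 125.0, True, "er..."),
        Segment(2.7, 3.5, 180.0, False, "weather forecast"),   # accepted
        Segment(4.0, 5.0, 185.0, False, "really?"),            # no filled pause
        Segment(6.0, 6.5, 118.0, True, "uh..."),
        Segment(6.6, 7.5, 122.0, False, "what was I saying"),  # pitch too low
    ]
    for seg in spot_commands(conversation, baseline_f0_hz=120.0):
        print(seg.text)
```

Requiring both cues together is what keeps ordinary speech out of the recognizer in this sketch: a high-pitched remark without a preceding filled pause, or a normal-pitch phrase after a hesitation, is left alone.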


References:

  1. Masataka Goto, Katunobu Itou, Koji Kitayama, and Tetsunori Kobayashi: Speech-Recognition Interfaces for Music Information Retrieval: "Speech Completion" and "Speech Spotter", Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR 2004), pp. 403-408, October 2004.
  2. Masataka Goto, Koji Kitayama, Katunobu Itou, and Tetsunori Kobayashi: Speech Spotter: On-demand Speech Recognition in Human-Human Conversation on the Telephone or in Face-to-Face Situations, Proceedings of the 8th International Conference on Spoken Language Processing (ICSLP-2004), pp. 1533-1536, October 2004.

Acknowledgments:

This research utilized the RWC Music Database "RWC-MDB-P-2001" (Popular Music).



Masataka GOTO <m.goto [at] aist.go.jp>

last update: September 15, 2004