PodCastle: A Spoken Document Retrieval Service Improved by Anonymous User Contributions

Speech Recognition Research 2.0: A Web 2.0 Approach to Speech Recognition Research

This project was proposed and is researched by Masataka Goto and Jun Ogata.

Introduction:

PodCastle is a speech retrieval web service that collects and amplifies voluntary contributions by anonymous users. Our goal is to provide a public web service based on speech recognition and crowdsourcing, so that users can experience state-of-the-art speech recognition performance through a useful service. PodCastle enables users to find speech data (such as podcasts and video clips on video sharing services) that include a search term, read the full text of the recognition results, and easily correct recognition errors by simply selecting from a list of candidates. The resulting corrections are used to improve both speech retrieval and speech recognition performance. In our experience from four years of practical use (since December 2006), anonymous users have corrected over half a million recognition errors in about one hundred thousand pieces of speech data, and we have confirmed that those corrections actually improved the speech recognition performance of PodCastle.
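The search-and-correct workflow described above can be illustrated with a small sketch. This is not PodCastle's implementation: the names `Slot`, `Transcript`, `correct`, and `search` are hypothetical, and the sketch assumes that each word position in a transcript keeps a short list of competing candidates from which a user picks the right one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    """One word position with its competing candidates (hypothetical layout)."""
    candidates: List[str]      # best-scoring candidate first
    chosen: int = 0            # index of the candidate currently shown
    corrected: bool = False    # True once a user has selected a candidate

    @property
    def word(self) -> str:
        return self.candidates[self.chosen]


@dataclass
class Transcript:
    """Recognition result for one piece of speech data (hypothetical layout)."""
    episode_id: str
    slots: List[Slot] = field(default_factory=list)

    def text(self) -> str:
        # Full text as a reader would see it, using the chosen candidates.
        return " ".join(slot.word for slot in self.slots)

    def correct(self, position: int, candidate_index: int) -> None:
        # A user fixes an error by selecting another candidate for one slot.
        slot = self.slots[position]
        if not 0 <= candidate_index < len(slot.candidates):
            raise IndexError("no such candidate")
        slot.chosen = candidate_index
        slot.corrected = True


def search(transcripts: List[Transcript], term: str) -> List[str]:
    # Return the ids of transcripts whose current text contains the term.
    term = term.lower()
    return [t.episode_id for t in transcripts
            if term in t.text().lower().split()]


if __name__ == "__main__":
    t = Transcript("ep1", [Slot(["speech"]),
                           Slot(["wreck", "recognition"]),
                           Slot(["works"])])
    print(search([t], "recognition"))  # the misrecognized word hides the episode
    t.correct(1, 1)                    # user selects "recognition"
    print(search([t], "recognition"))  # the episode is now retrievable
```

In this toy example a single correction makes the episode findable, which shows at small scale how user corrections can improve retrieval. The `corrected` flag marks which slots a user has confirmed; how such confirmed words feed back into recognizer training is outside the scope of this sketch.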

PodCastle is the world's first speech service based on crowdsourcing and the wisdom of crowds, and the first instance of our research approach, "Speech Recognition Research 2.0", which aims to provide users with a web service based on Web 2.0 and to promote speech recognition technologies in cooperation with anonymous users.

Video Clip

Demonstration video of PodCastle
- Full version (high quality):
  Demonstration of PodCastle
  (39,294,980 bytes, 59 sec, MPEG-1 file)
- Full version (low quality):
  Demonstration of PodCastle
  (8,384,516 bytes, 59 sec, MPEG-1 file)

Acknowledgments:

We thank Youhei Sawada, Shunichi Arai (Mellowtone Inc.), Kouichirou Eto (AIST), and Ryutaro Kamitsu (Brazil Inc.) for their web service implementation. We also thank anonymous users of PodCastle for correcting speech recognition errors.

References:

Masataka Goto and Jun Ogata: PodCastle: Recent Advances of a Spoken Document Retrieval Service Improved by Anonymous User Contributions, Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), pp.3073-3076, August 2011.
Masataka Goto and Jun Ogata: Invited talk "PodCastle: A Spoken Document Retrieval Service Improved by User Contributions", Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation (PACLIC 24), pp.3-11, November 2010. (Invited Paper)
Masataka Goto and Jun Ogata: Invited talk "PodCastle: A Spoken Document Retrieval Service Improved by User Contributions" in the 5th Korea-Japan Database Workshop 2010 (KJDB2010), Jeju, Korea, May 28, 2010
Masataka Goto, Jun Ogata, and Kouichirou Eto: "PodCastle: A Spoken Document Retrieval System Improved by User Contributions", Transactions of the Japanese Society for Artificial Intelligence, Vol.25, No.1, pp.104-113, January 2010. (in Japanese)
(JSAI server)
Jun Ogata and Masataka Goto: PodCastle: A Spoken Document Retrieval System for Podcasts and Its Performance Improvement by Anonymous User Contributions, Proceedings of the Third Workshop on Searching Spontaneous Conversational Speech (SSCS 2009), pp.37-38, October 2009.
Jun Ogata and Masataka Goto: PodCastle: Collaborative Training of Acoustic Models on the Basis of Wisdom of Crowds for Podcast Transcription, Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009), pp.1491-1494, September 2009.
Masataka Goto, Jun Ogata, and Kouichirou Eto: PodCastle: A Web 2.0 Approach to Speech Recognition Research, Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), pp.2397-2400, August 2007.
Jun Ogata, Masataka Goto, and Kouichirou Eto: Automatic Transcription for a Web 2.0 Service to Search Podcasts, Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), pp.2617-2620, August 2007.


Masataka GOTO <m.goto [at] aist.go.jp>
All pages are copyrighted by the author. Unauthorized reproduction is strictly prohibited.

last update: September 2, 2011