PodCastle: A Spoken Document Retrieval Service Improved by Anonymous User Contributions
Speech Recognition Research 2.0: A Web 2.0 Approach to Speech Recognition Research
This project is proposed and researched by
Masataka Goto and
Jun Ogata.
Introduction:
PodCastle
is a speech retrieval web service
that collects and amplifies voluntary contributions by anonymous users.
Our goal is to provide users with a public web service based on
speech recognition and crowdsourcing
so that they can experience state-of-the-art speech recognition performance
through a useful service.
PodCastle enables users to
find speech data (such as podcasts and video clips on video sharing services)
that include a search term,
read full texts of their recognition results, and
easily correct recognition errors
by simply selecting from a list of candidates.
The resulting corrections are used
to improve both speech retrieval and speech recognition performance.
In our experience
from its practical use over the past four years (since December 2006),
anonymous users corrected
over half a million recognition errors
in about one hundred thousand speech data items, and
we confirmed that those corrections
actually improved the speech recognition performance of PodCastle.
PodCastle is the world's first speech service based on
crowdsourcing and
the wisdom of crowds,
and the first instance of our research approach,
"Speech Recognition Research 2.0",
which is aimed at providing users with a web service based on Web 2.0
and at promoting
speech recognition technologies in cooperation with anonymous users.
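The search-and-correct workflow described above (finding speech data that include a search term, reading the recognized text, and fixing an error by selecting from a list of candidates) can be illustrated with a small sketch. This is not PodCastle's implementation: the `Slot`, `Transcript`, and `search` names, the candidate words, and the simple word-match search are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One word position: candidate words, best guess first (hypothetical structure)."""
    candidates: list[str]
    chosen: int = 0          # index of the candidate currently shown to readers
    corrected: bool = False  # set once a user has picked a candidate

    @property
    def word(self) -> str:
        return self.candidates[self.chosen]


@dataclass
class Transcript:
    """Recognition result for one speech data item (hypothetical structure)."""
    title: str
    slots: list[Slot] = field(default_factory=list)

    def text(self) -> str:
        # Full text as currently displayed, one chosen word per slot.
        return " ".join(slot.word for slot in self.slots)

    def correct(self, position: int, candidate: str) -> None:
        # A correction is just a selection from the offered candidates;
        # list.index raises ValueError if the word was not offered.
        slot = self.slots[position]
        slot.chosen = slot.candidates.index(candidate)
        slot.corrected = True


def search(transcripts: list[Transcript], term: str) -> list[Transcript]:
    # Return the transcripts whose displayed text contains the term as a word.
    term = term.lower()
    return [t for t in transcripts if term in t.text().lower().split()]


if __name__ == "__main__":
    episode = Transcript("hypothetical episode", [
        Slot(["beach", "speech"]),
        Slot(["recognition", "recondition"]),
    ])
    print(episode.text(), search([episode], "speech"))
    episode.correct(0, "speech")  # user picks the second candidate
    print(episode.text(), [t.title for t in search([episode], "speech")])
```

In this sketch a correction immediately changes what the search can find, which mirrors the retrieval benefit described above; using corrected text to retrain the recognizer is outside the scope of the example.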
Video Clip:
- Demonstration video of PodCastle
Acknowledgments:
We thank Youhei Sawada,
Shunichi Arai (Mellowtone Inc.),
Kouichirou Eto (AIST),
and Ryutaro Kamitsu (Brazil Inc.)
for their web service implementation.
We also thank anonymous users of PodCastle
for correcting speech recognition errors.
References:
- Masataka Goto and Jun Ogata:
PodCastle: Recent Advances of a Spoken Document Retrieval Service
Improved by Anonymous User Contributions,
Proceedings of the 12th Annual Conference of the
International Speech Communication Association (Interspeech 2011),
pp.3073-3076, August 2011.
- Masataka Goto and Jun Ogata:
Invited talk "PodCastle: A Spoken Document Retrieval Service Improved by User Contributions",
Proceedings of
the 24th Pacific Asia Conference on Language, Information and Computation (PACLIC 24),
pp.3-11, November 2010.
(Invited Paper)
- Masataka Goto and Jun Ogata:
Invited talk "PodCastle: A Spoken Document Retrieval Service Improved by User Contributions"
in the
5th Korea-Japan Database Workshop 2010 (KJDB2010),
Jeju, Korea, May 28, 2010.
- Masataka Goto, Jun Ogata, and Kouichirou Eto:
"PodCastle: A Spoken Document Retrieval System Improved by User Contributions",
Transactions of the Japanese Society for Artificial Intelligence,
Vol.25, No.1, pp.104-113, January 2010. (in Japanese)
(JSAI server)
- Jun Ogata and Masataka Goto:
PodCastle: A Spoken Document Retrieval System for Podcasts and Its Performance Improvement by Anonymous User Contributions,
Proceedings of the Third Workshop on Searching Spontaneous Conversational Speech (SSCS 2009),
pp.37-38, October 2009.
- Jun Ogata and Masataka Goto:
PodCastle: Collaborative Training of Acoustic Models on the Basis of Wisdom of Crowds for Podcast Transcription,
Proceedings of the 10th Annual Conference of the
International Speech Communication Association (Interspeech 2009),
pp.1491-1494, September 2009.
- Masataka Goto, Jun Ogata, and Kouichirou Eto:
PodCastle: A Web 2.0 Approach to Speech Recognition Research,
Proceedings of the 8th Annual Conference of the
International Speech Communication Association (Interspeech 2007),
pp.2397-2400, August 2007.
- Jun Ogata, Masataka Goto, and Kouichirou Eto:
Automatic Transcription for a Web 2.0 Service to Search Podcasts,
Proceedings of the 8th Annual Conference of the
International Speech Communication Association (Interspeech 2007),
pp.2617-2620, August 2007.
Masataka GOTO
<m.goto [at] aist.go.jp>
All pages are copyrighted by the author.
Unauthorized reproduction is strictly prohibited.
last update: September 2, 2011