This paper presents a dual approach to the study of breath sounds in singing, consisting of an acoustic analysis of breath sounds and the development of an automatic breath detection system. Previous work on automatic breath detection was based on relatively simple features that were postulated to be relevant to detection. In contrast, this study starts with a detailed acoustic analysis of breath sounds, with the aim of exploring novel characteristics. The results obtained can be used to enhance the capability of automatic breath detection. The acoustic analysis used singing voice recordings of 18 singers with a total length of 128 minutes (1488 breath events). The results of the analysis show that the spectral envelope of breath sounds remains similar within the same song, and that their long-term average spectra have a notable spectral peak at about 1.6 kHz for male singers and 1.7 kHz for female singers. A prototype breath detection system was implemented using a hidden Markov model (HMM) with MFCC, delta-MFCC, and delta-power as acoustic features. In an evaluation experiment with 27 unaccompanied song samples, the system achieved an overall recall of 97.5% and precision of 77.7% for breath sound detection.
This research utilized the RWC Music Database "RWC-MDB-P-2001" (Popular Music).
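
As an illustration of the feature set named in the abstract (MFCC, delta-MFCC, and delta-power), the following is a minimal sketch of how such per-frame observation vectors might be assembled. It is not the authors' implementation. The function names `delta` and `build_observations`, the regression window width of 2 frames, and the use of 12 MFCC coefficients are all assumptions made for this example; the abstract does not specify them. The delta computation uses the common regression formula over neighboring frames, and the MFCC and log-power inputs are assumed to have been computed elsewhere.

```python
import numpy as np


def delta(features, width=2):
    """Regression-based delta features (hypothetical helper).

    features: array of shape (num_frames, num_coeffs).
    width: number of frames on each side used in the regression
           (an assumed value, not taken from the paper).
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    # Repeat the first and last frames so every frame has neighbors.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + num_frames]
        behind = padded[width - n : width - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def build_observations(mfcc, log_power, width=2):
    """Stack MFCC, delta-MFCC, and delta-power into one vector per frame.

    mfcc: (num_frames, num_mfcc); log_power: (num_frames,).
    """
    log_power = np.asarray(log_power, dtype=float)[:, None]
    return np.hstack([mfcc, delta(mfcc, width), delta(log_power, width)])


if __name__ == "__main__":
    # Placeholder data standing in for real MFCC and log-power tracks.
    rng = np.random.default_rng(0)
    mfcc = rng.normal(size=(100, 12))  # 12 coefficients is an assumption
    log_power = rng.normal(size=100)
    obs = build_observations(mfcc, log_power)
    print(obs.shape)
```

Vectors of this kind could then be passed as observation sequences to an HMM toolkit of choice; the model topology and training procedure are not described in the abstract and are left out here.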