SmartMusicKIOSK is a new music-playback interface for trial listening. In music stores, customers typically search for the chorus or ``hook'' of a song by repeatedly pressing the fast-forward button, rather than passively listening to the music. This activity is not well supported by current technology. This research provides a function for jumping to the chorus section and other key parts of a song, plus a function for visualizing song structure. These functions eliminate the hassle of searching for the chorus and make it easier for a listener to find desired parts of a song, thereby facilitating an active listening experience. This interface, which enables a listener to look for a section of interest by interactively changing the playback position, is useful not only for trial listening but also for more general purposes in selecting and using music. The proposed functions are achieved through an automatic audio-based chorus-section detection method, and the results of implementing them in a listening station have demonstrated their usefulness. While entire songs of no interest to the listener can be skipped on conventional music-playback interfaces, SmartMusicKIOSK is the first interface that allows the user to easily skip sections of no interest even within a song.
When ``trial listening'' to prerecorded music on compact discs (CDs) at a music store, a listener often takes an active role in the playback of musical pieces or songs by picking out only those sections of interest. This new type of music interaction differs from passive music appreciation in which people usually listen to entire musical selections. To give some background, music stores in recent years have installed music listening stations to allow customers to listen to CDs on a trial basis to facilitate a purchasing decision. In general, the main objective of listening to music is to appreciate it, and it is common for a listener to play a musical selection from start to finish. In trial listening, however, the objective is to quickly determine whether a selection is the music one has been looking for and whether one likes it, so listening to entire selections in the above manner is rare. In the case of popular music, for example, customers often want to listen to the most representative, uplifting part of a song, i.e., the chorus or refrain, to pass judgment on that song. This desire produces a special way of listening in which the trial listener first listens briefly to a song's ``intro'' and then jumps ahead in search of the chorus by repeatedly pushing the fast-forward button, eventually finding the chorus and listening to it.
The functions provided by conventional listening stations for music CDs, however, do not support this unique way of trial listening very well. These listening stations are equipped with playback-operation buttons typical of an ordinary CD player, and among these, only the fast-forward and rewind buttons can be used to find the chorus section of a song. The digital listening stations that have recently been installed in music stores, by contrast, enable playback of musical selections from a hard disk or over a network. Here, however, only one excerpt of about 30 to 45 seconds (e.g., the beginning) of each musical selection is mechanically extracted and stored, which means that a trial listener may not necessarily hear the chorus section.
Against the above background, I developed SmartMusicKIOSK, in which a trial listener can jump to the beginning of a song's chorus (perform an instantaneous fast-forward to the chorus) by simply pushing the button for this function.
Because music normally cannot be grasped without taking time to listen to it, the problem here is how to let a listener move between specific playback positions before actually listening to them. To solve this problem, I propose the following two methods assuming the main target to be popular music.
To enable the handling of a large number of songs, this research aims for a general and robust chorus-section detection method using no prior information on acoustic features unique to choruses. To this end, I focus on the fact that chorus sections are usually the most repeated sections of a song and adopt the following basic strategy: find sections that repeat and output those that appear most often. It must be pointed out, however, that it is difficult for a computer to judge repetition because it is rare for repeated sections to be exactly the same.
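As a rough illustration of why exact matching is too strict, the following Python sketch compares each frame of a feature sequence with the frame a fixed lag earlier and reports runs of consecutive frames whose similarity exceeds a threshold. It is a toy example and not the detection method itself: the per-frame feature vectors, the cosine measure, the threshold of 0.9, and the names `cosine` and `repeated_runs` are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def repeated_runs(frames, threshold=0.9, min_len=2):
    """Return (start, lag, length) for every run in which frames[t] resembles
    frames[t - lag] for at least min_len consecutive frames.

    'Resembles' means similarity >= threshold, so the repeat need not be exact.
    """
    runs = []
    n = len(frames)
    for lag in range(1, n):
        run_start = None
        # Iterate one step past the end so that an open run is always closed.
        for t in range(lag, n + 1):
            similar = t < n and cosine(frames[t], frames[t - lag]) >= threshold
            if similar and run_start is None:
                run_start = t
            elif not similar and run_start is not None:
                if t - run_start >= min_len:
                    runs.append((run_start, lag, t - run_start))
                run_start = None
    return runs


# Hypothetical feature frames: A, B, C, then slightly altered copies of A and B.
frames = [
    [1.0, 0.0, 0.0],   # A
    [0.0, 1.0, 0.0],   # B
    [0.0, 0.0, 1.0],   # C
    [0.98, 0.1, 0.0],  # A' (not identical to A)
    [0.1, 0.98, 0.0],  # B' (not identical to B)
    [1.0, 1.0, 1.0],   # D
]
print(repeated_runs(frames))  # [(3, 3, 2)]: frames 3-4 repeat frames 0-1
```

With an exact-equality test in place of the threshold, the run starting at frame 3 would not be found, which is the difficulty noted above. Counting how often each repeated section occurs, the second half of the strategy, is left out of this sketch.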
I therefore developed a method that overcomes this difficulty and automatically detects the beginning and end points of chorus sections and repeated sections in compact-disc recordings of popular music. Most previous methods detected a repeated section of a given length as the chorus and had difficulty both in identifying the two ends of a chorus section and in dealing with modulations (key changes). By analyzing relationships between various repeated sections, my method can detect all the chorus sections in a song and estimate both ends of each section. It can also detect modulated chorus sections by introducing a similarity measure that enables modulated repetition to be judged correctly. Experimental results with a popular-music database show that this method detects the correct chorus sections in 80 of 100 songs.
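To make the idea of a modulation-tolerant similarity measure concrete, here is a minimal Python sketch. The text above does not specify the features or the measure, so everything in it is an assumption for illustration: frames are taken to be 12-bin pitch-class vectors (one bin per semitone), a key change is modeled as a circular shift of those bins, and the names `shift_up` and `modulated_similarity` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def shift_up(vector, semitones):
    """Circularly shift a pitch-class vector so that bin i moves to bin i + semitones."""
    k = semitones % len(vector)
    return vector[-k:] + vector[:-k]


def modulated_similarity(a, b):
    """Return (best_similarity, best_shift): the highest cosine similarity between
    b and any transposition of a, and the upward shift in semitones that gives it."""
    best_similarity, best_shift = -1.0, 0
    for k in range(len(a)):
        similarity = cosine(shift_up(a, k), b)
        if similarity > best_similarity:
            best_similarity, best_shift = similarity, k
    return best_similarity, best_shift


# Assumed toy input: a C major triad (bins 0, 4, 7) and a D major triad (bins 2, 6, 9).
c_major = [1.0, 0, 0, 0, 1.0, 0, 0, 1.0, 0, 0, 0, 0]
d_major = [0, 0, 1.0, 0, 0, 0, 1.0, 0, 0, 1.0, 0, 0]

plain = cosine(c_major, d_major)                       # no shared bins, so 0.0
best, semitones = modulated_similarity(c_major, d_major)
print(plain, round(best, 6), semitones)
```

In this toy case the unshifted comparison sees no similarity at all, while the shifted comparison finds a near-perfect match two semitones up, which is the kind of repetition a modulated chorus would produce.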
"NEXT CHORUS" button (jump to chorus) |
(A-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing. |
(A-2) SmartMusicKIOSK keeps playing. |
(A-3) When the user pushes the "NEXT CHORUS" button, SmartMusicKIOSK jumps to the start of the next chorus section in the song from the present cursor position. |
(A-4) When the user pushes the "NEXT CHORUS" button again, it jumps to the chorus section next to the previous one. |
(A-5) When the user pushes the "NEXT CHORUS" button again, it jumps to the chorus section next to the previous one. |
(A-6) When the user pushes the "NEXT CHORUS" button after the final chorus section, it returns to the first chorus section. |
"NEXT SECTION" button (jump to next section in song) |
(B-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing. |
(B-2) SmartMusicKIOSK keeps playing. |
(B-3) When the user pushes the "NEXT SECTION" button, SmartMusicKIOSK jumps to the start of the next repeated section in the song from the present cursor position. |
(B-4) When the user pushes the "NEXT SECTION" button again, it jumps to the repeated section next to the previous one. |
(B-5) When the user pushes the "NEXT SECTION" button again, it jumps to the repeated section next to the previous one. |
"PREV SECTION" button (jump to previous section in song) |
(C-1) When a user pushes the "PLAY" button, SmartMusicKIOSK starts playing. |
(C-2) SmartMusicKIOSK keeps playing. |
(C-3) When the user pushes the "PREV SECTION" button, SmartMusicKIOSK jumps to the start of the previous repeated section in the song from the present cursor position. |
(C-4) When the user pushes the "PREV SECTION" button again, it jumps to the previous repeated section. |
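The button behavior listed above can be sketched in a few lines of Python. This is an illustrative sketch and not the SmartMusicKIOSK implementation: it assumes that section start times are available as sorted lists of seconds, and the function names are hypothetical. Two behaviors are not stated in the steps above and are assumptions here: "NEXT SECTION" leaves the cursor where it is when no later section exists, and "PREV SECTION" pressed in the middle of a section returns to the start of that section.

```python
from bisect import bisect_left, bisect_right


def next_chorus(position, chorus_starts):
    """Steps A-3 to A-6: jump to the first chorus starting after the cursor,
    wrapping around to the first chorus once the final one has been passed."""
    index = bisect_right(chorus_starts, position)
    return chorus_starts[index % len(chorus_starts)]


def next_section(position, section_starts):
    """Steps B-3 to B-5: jump to the first repeated section starting after the cursor.
    Assumption: stay put if there is no later section."""
    index = bisect_right(section_starts, position)
    return section_starts[index] if index < len(section_starts) else position


def prev_section(position, section_starts):
    """Steps C-3 and C-4: jump to the nearest section start strictly before the cursor.
    Assumption: stay put if there is no earlier section start."""
    index = bisect_left(section_starts, position)
    return section_starts[index - 1] if index > 0 else position


# Hypothetical start times in seconds for one song.
chorus_starts = [45.0, 110.0, 180.0]
section_starts = [0.0, 20.0, 45.0, 80.0, 110.0, 150.0, 180.0]

position = 5.0                                    # A-1, A-2: playing from near the start
position = next_chorus(position, chorus_starts)   # A-3: first chorus
position = next_chorus(position, chorus_starts)   # A-4: second chorus
position = next_chorus(position, chorus_starts)   # A-5: third chorus
position = next_chorus(position, chorus_starts)   # A-6: wraps to the first chorus
print(position)  # 45.0
```

Using `bisect_right` for the forward jumps means that pressing a button while the cursor sits exactly on a section start moves on to the following section instead of staying in place, which matches steps A-4 and B-4.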
The main contribution of this research is the proposal of a novel music-playback interface, SmartMusicKIOSK, motivated by the fact that conventional playback-operation buttons on CD players and media-player software have not been improved for a long time. One of the innovations brought by CD players was to enable a listener to immediately skip a song (track) of no interest, i.e., to ``listen to any track of a CD whenever one likes.'' I believe that SmartMusicKIOSK brings a similar innovation at a different level: it enables a listener to immediately skip a structural section (part) of no interest, i.e., to ``listen to any part of a song whenever one likes,'' without having to follow the timeline of the original song. I hope this research opens up new vistas for future research that reexamines the entire functional makeup of music-playback interfaces to make interaction between people and music more active and enriching.
This research utilized the RWC Music Database "RWC-MDB-P-2001" (Popular Music). The author would like to thank Hideki Asoh (National Institute of Advanced Industrial Science and Technology) for his valuable discussions.