AIST Annotation for the RWC Music Database

Guide to the AIST Annotation


The AIST Annotation is a manual annotation of the musical pieces in the RWC (Real World Computing) Music Database, a copyright-cleared music database that is available to researchers as a common foundation for research. To enhance the usefulness of the RWC Music Database, we have been manually annotating its musical pieces since August 2001. Here, we provide a set of music-scene descriptions consisting of the beat structure, melody line, and chorus sections. We also provide standard MIDI files that were manually synchronized with the corresponding audio signals at the beat level.


Please see the References/Citations for details on the annotation for each database. Please note that the AIST Annotation is not perfect and still includes some errors. We hope that researchers around the world will also contribute by adding and improving annotated descriptions in various ways (e.g., by correcting those errors or giving us feedback) and will share their additions and improvements, thus expediting progress in this field of research.

  1. Popular Music Database (100 songs)
    Piece Nos.: RWC-MDB-P-2001 No. 1 – 100
  2. Royalty-Free Music Database (15 songs)
    Piece Nos.: RWC-MDB-R-2001 No. 1 – 15
  3. Classical Music Database (50 pieces)
    Piece Nos.: RWC-MDB-C-2001 No. 1 – 50
  4. Jazz Music Database (50 pieces)
    Piece Nos.: RWC-MDB-J-2001 No. 1 – 50
  5. Music Genre Database (100 pieces)
    Piece Nos.: RWC-MDB-G-2001 No. 1 – 100
Contributions by Other Researchers:

Other researchers have also contributed manual annotations of musical pieces in the RWC (Real World Computing) Music Database. The following is an incomplete list of these contributions. Please note that annotations not made by AIST should not be called the AIST Annotation. When using any of the following, please cite the related references and the names of the contributors. Please let us know if you also make RWC-MDB-related annotations open to the public. To access some of them, you will be asked to enter the original user ID and password that you have already received to download Standard MIDI Files (SMF).

  1. Popular Music Database (100 songs)
    Piece Nos.: RWC-MDB-P-2001 No. 1 – 100
  2. Classical Music Database (50 pieces)
    Piece Nos.: RWC-MDB-C-2001 No. 1 – 50
Notes regarding use:

Details of the AIST Annotation

Using our multipurpose music-scene labeling editor, a music college graduate with absolute pitch annotated the pieces with the following descriptions.

Beat Structure:

The hierarchical beat structure consists of two levels: the quarter-note level, represented as the temporal position of each beat, and the measure level, annotated by labeling the beginning of each measure on the corresponding beat (when the time signature is 4/4, for example, the beginning of a measure is labeled on every fourth beat).
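The two-level structure described above can be illustrated with a minimal sketch. It assumes beat times in seconds and a constant number of beats per measure; the function name `label_measures` and its parameters are hypothetical and are not part of the AIST Annotation or its editor.

```python
def label_measures(beat_times, beats_per_measure=4, first_downbeat=0):
    """Pair each beat time with a flag that marks the beginning of a measure.

    beat_times: beat positions in seconds (quarter-note level).
    beats_per_measure: 4 for a 4/4 time signature (an assumption).
    first_downbeat: index of the first beat that starts a measure.
    """
    labeled = []
    for i, t in enumerate(beat_times):
        # Every beats_per_measure-th beat from the first downbeat starts a measure.
        is_start = i >= first_downbeat and (i - first_downbeat) % beats_per_measure == 0
        labeled.append((t, is_start))
    return labeled


if __name__ == "__main__":
    beats = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
    for t, is_start in label_measures(beats):
        print(f"{t:.2f}", "measure" if is_start else "beat")
```

Pieces whose time signature changes would need one call per region, each with its own `beats_per_measure` and `first_downbeat`.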

Two techniques facilitated this annotation. First, when a pre-mixdown track of a musical piece contained the metronome clicks that were given to the musicians to keep the tempo during recording, that track was analyzed with a simple amplitude-based event detection method. Beat positions were initialized with the detected events, and each position was then manually checked and adjusted on the editor while watching the waveform and listening to audio playback with clicks at beat positions, as well as to short playback excerpts before or after a beat position. Second, given the annotated beat positions and a time-signature assumption, the beginnings of all measures after the current cursor position of the editor were labeled automatically.
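The detection method actually used is not specified beyond "simple" and "amplitude-based". As one possible reading, the sketch below marks an event wherever the absolute amplitude reaches a threshold and then ignores further samples for a short time. The function name, the threshold, and the minimum gap are all assumptions chosen for illustration.

```python
def detect_clicks(samples, sample_rate, threshold=0.5, min_gap=0.2):
    """Return onset times in seconds of threshold crossings in a click track.

    samples: sequence of amplitudes in the range [-1.0, 1.0].
    threshold: absolute amplitude that counts as a click (assumed value).
    min_gap: seconds to ignore after an onset so one click is counted once.
    """
    onsets = []
    last = None
    for n, x in enumerate(samples):
        if abs(x) >= threshold:
            t = n / sample_rate
            if last is None or t - last >= min_gap:
                onsets.append(t)
                last = t
    return onsets


if __name__ == "__main__":
    rate = 1000
    track = [0.0] * rate
    for start in (100, 600):
        for n in range(start, start + 5):
            track[n] = 0.9  # a 5 ms synthetic click
    print(detect_clicks(track, rate))
```

Events found this way are only an initialization; as described above, each position was still checked and adjusted by hand.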

Melody Line:

The melody line is represented as the temporal trajectory of the fundamental frequency (F0). The F0 is measured in hertz and the discrete time step is 10 ms. For time steps where the melody line is absent, the F0 is set to 0 Hz. Note that the melody line is not represented as a series of either musical notes or MIDI note numbers.
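To show how such a trajectory can be consumed, here is a minimal sketch that groups consecutive non-zero frames into voiced segments. It assumes the F0 values have already been loaded into a list with one value per 10 ms frame; the function name `voiced_segments` is hypothetical.

```python
FRAMES_PER_SECOND = 100  # one frame every 10 ms, as stated above


def voiced_segments(f0_frames):
    """Group consecutive non-zero F0 frames into (start_s, end_s, mean_f0_hz)."""
    segments = []
    start = None
    # A trailing 0.0 sentinel closes a segment that runs to the last frame.
    for i, f0 in enumerate(list(f0_frames) + [0.0]):
        if f0 > 0 and start is None:
            start = i
        elif f0 <= 0 and start is not None:
            voiced = f0_frames[start:i]
            mean_f0 = sum(voiced) / len(voiced)
            segments.append((start / FRAMES_PER_SECOND, i / FRAMES_PER_SECOND, mean_f0))
            start = None
    return segments


if __name__ == "__main__":
    f0 = [0.0, 0.0, 220.0, 220.0, 0.0, 440.0]
    for seg in voiced_segments(f0):
        print(seg)
```

The mean F0 of a segment is only a summary and does not turn the trajectory into musical notes; vibrato and glides remain in the per-frame values.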

As we did for the beat annotation, the melody line was also initialized with the F0 estimated on a melody track before mixdown when available. The F0 values were graphically set and adjusted on the editor while watching the spectrogram with the melody line and listening to the melody playback generated using the amplitude of harmonics of the currently labeled F0 as well as the melody-cancelled background playback.

Chorus Sections:

The chorus (refrain) sections, which are the most representative thematic sections of a musical piece, are represented as a list of the beginning and end points of every chorus section. When the music structure is obvious, a musical piece is manually segmented into sections and every section is labeled with a section name of the music structure, such as intro, verse A, verse B, pre-chorus, chorus A, chorus B, post-chorus, bridge A, bridge B, and ending.

By making the most of the beat-structure annotation, the beginning and end points of each section were easily specified on beat positions while moving the cursor only on beat positions, watching both global and local views of labeled sections, and listening to the audio playback in units of measure or section. It was also useful to highlight each section with a color corresponding to the labeled section name, especially when showing the entire piece in the global view.
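Restricting section boundaries to beat positions can be sketched as a nearest-beat lookup. This is an illustration under assumptions, not the editor's implementation: sections are taken to be (start, end, name) tuples in seconds, beat times are a non-empty sorted list, and both function names are hypothetical.

```python
import bisect


def snap_to_beat(t, beat_times):
    """Return the beat time closest to t; beat_times must be sorted and non-empty."""
    i = bisect.bisect_left(beat_times, t)
    # Only the beats on either side of t can be the nearest one.
    candidates = beat_times[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda b: abs(b - t))


def snap_sections(sections, beat_times):
    """Move the start and end of each (start, end, name) section onto a beat."""
    return [
        (snap_to_beat(start, beat_times), snap_to_beat(end, beat_times), name)
        for start, end, name in sections
    ]


if __name__ == "__main__":
    beats = [0.0, 0.5, 1.0, 1.5, 2.0]
    rough = [(0.07, 0.96, "verse A"), (1.02, 2.3, "chorus A")]
    print(snap_sections(rough, beats))
```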

Audio-Synchronized Standard MIDI File (SMF):

We have worked on synchronizing each SMF with the audio signal of the corresponding musical piece. Although the SMFs in the RWC Music Database were transcribed by ear and might not correspond to original scores, they can still be considered a potential source of informative annotated descriptions. For example, the onset times of drum sounds were extracted from the synchronized SMFs of RWC-MDB-P-2001 and used as the ground-truth annotation for the Audio Drum Detection contest in the Music Information Retrieval Evaluation eXchange (MIREX) 2005.

Using the annotated beat positions of audio signals, it is not difficult to synchronize those positions with beat positions in an SMF and generate a synchronized tempo track for the SMF. But since the beat positions around the introduction and ending of a piece sometimes do not match straightforwardly, the editor had to include a function to edit their positions on a wave or MIDI-piano-roll display. The editor also supported interactive and synchronized audio/MIDI playback during editing.
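The core of this synchronization can be sketched as follows. In an SMF, a tempo is stored as microseconds per quarter note, so each interval between two annotated beats yields one tempo value. The sketch assumes that the annotated beats map one-to-one onto quarter notes in the SMF and that the first beat falls on tick 0. Neither assumption necessarily holds around introductions and endings, as noted above. The function name and the resolution of 480 ticks per beat are hypothetical.

```python
def tempo_track(beat_times, ticks_per_beat=480):
    """Return (tick, microseconds_per_quarter_note) events from beat times.

    beat_times: annotated beat positions in seconds, in ascending order.
    ticks_per_beat: SMF time division (assumed value).
    """
    events = []
    for i in range(len(beat_times) - 1):
        interval = beat_times[i + 1] - beat_times[i]
        # One tempo event per beat, placed on the tick where that beat starts.
        events.append((i * ticks_per_beat, round(interval * 1_000_000)))
    return events


if __name__ == "__main__":
    for tick, tempo in tempo_track([0.0, 0.5, 1.0, 1.6]):
        print(tick, tempo, f"{60_000_000 / tempo:.1f} BPM")
```

Writing these events into an actual SMF tempo track would require a MIDI library and is left out here.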

Issues When Sharing Annotated Descriptions:

When making annotated descriptions for sound files ripped from CDs available to researchers around the world, an important issue is how to synchronize their temporal axes, because different CD drives and ripping software produce different temporal offsets or gaps at the beginning of sound files ripped from the same CD. We address this issue by providing the beginning of each sound file ripped for the AIST Annotation as a signature, so that users can adjust the offset of their own files themselves.
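How the adjustment is made is left to each user. One possible approach, sketched here under assumptions, is to slide the signature over the beginning of the user's own file and keep the shift with the smallest mean squared difference. The sketch assumes both signals are already decoded into lists of samples at the same sample rate; the function name and its parameters are hypothetical.

```python
def find_offset(signature, ripped, max_shift=2000):
    """Return the shift d in samples such that ripped[d + i] best matches signature[i].

    A positive d means the user's file has d extra samples at its beginning;
    a negative d means it is missing samples relative to the signature.
    """
    min_overlap = max(1, len(signature) // 2)  # ignore shifts with little overlap
    best_shift, best_err = 0, float("inf")
    for d in range(-max_shift, max_shift + 1):
        err, count = 0.0, 0
        for i, s in enumerate(signature):
            j = i + d
            if 0 <= j < len(ripped):
                err += (ripped[j] - s) ** 2
                count += 1
        if count >= min_overlap and err / count < best_err:
            best_shift, best_err = d, err / count
    return best_shift


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    sig = [rng.uniform(-1.0, 1.0) for _ in range(50)]
    mine = [0.0] * 5 + sig + [0.0] * 10  # the same audio with a 5-sample gap
    d = find_offset(sig, mine, max_shift=20)
    print(d, "samples; add d / sample_rate seconds to each annotated time")
```

This brute-force search is quadratic and is meant only to show the idea; for long signatures a cross-correlation computed with an FFT would be the usual choice.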

All descriptions are stored in text files and can easily be converted to other file formats such as XML or CSV. Each time step or section (temporal region) is represented, on a separate line of the text file, as a pair consisting of its absolute time (with a temporal resolution of 10 ms) and its values/words.
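A conversion to CSV might look like the sketch below. The exact layout of the text files is not specified here, so the sketch assumes, purely for illustration, that fields are separated by whitespace and that the first field is the time as an integer count of 10 ms units. Check the distributed files before relying on either assumption. The function names are hypothetical.

```python
import csv
import io


def parse_annotation(text):
    """Parse lines of 'time value...' into (time_in_seconds, value) pairs.

    Assumed layout: whitespace-separated fields, time in units of 10 ms.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        rows.append((int(fields[0]) / 100, " ".join(fields[1:])))
    return rows


def to_csv(rows):
    """Serialize (time_s, value) pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["time_s", "value"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = "50 chorus A\n\n125 verse B\n"
    print(to_csv(parse_annotation(sample)), end="")
```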


Please see the following publications for more detailed information about the AIST Annotation for the RWC Music Database. We ask that you use these listings as bibliographical references when citing the AIST Annotation in papers, etc.

  1. Masataka Goto: AIST Annotation for the RWC Music Database, Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR 2006), pp.359-360, October 2006.

Contact Information

Information on the AIST Annotation for the RWC Music Database:
Inquiries (in English only):

Author (contents, page design, and distribution system design):
Masataka GOTO (National Institute of Advanced Industrial Science and Technology (AIST))

All pages are copyrighted by the author. Unauthorized reproduction is strictly prohibited.