2.2 Beat Prediction

Proceedings of the 1995 International Computer Music Conference
A Real-time Beat Tracking System for Audio Signals / Masataka Goto and Yoichi Muraoka


Twenty-eight agents each maintain their own hypothesis, which consists of a predicted next-beat time, its beat type, and the current IBI. These hypotheses are gathered by the manager (Figure 1), and the most reliable one is selected as the output. The twenty-eight agents are grouped into fourteen agent-pairs. The agent-pairs differ in that each receives onset information from a different onset-time finder. As mentioned in the first section of this paper, the two agents in a pair try to track beats in different tempo ranges. In other words, the two agents are assigned different ranges of IBI.

The following sections describe the formation and management of hypotheses. First, each agent predicts the next beat time using auto- and cross-correlation, and then evaluates its own reliability (Predicting next beat). Second, the agent infers its beat type and modifies its reliability (Inferring beat type). Finally, the manager selects the most reliable hypothesis from the hypotheses of all agents (Managing hypotheses).


2.2.1 Predicting next beat

Beats are characterized by two properties: IBI (period) and phase. The phase of a beat is the beat position relative to the most recent onset time. We measure phase in radians; for a quarter-note beat, for example, an eighth-note displacement corresponds to a phase shift of π radians (Figure 3).

  
Figure 3: Predicting next beat

Each agent first calculates the current IBI (period). The IBI is given by the lag at which the autocorrelation function of the received onset times reaches its maximum within the assigned IBI range. To determine the beat phase, the agent then calculates the cross-correlation between the onset times and a set of equally spaced pulse sequences whose temporal interval is the IBI. The position of the maximum in the cross-correlation function gives the most plausible beat phase. This calculation corresponds to evaluating all possible beat phases under the current IBI. The next beat time is thus predicted on the basis of the IBI and the current beat phase.
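The two correlation steps can be illustrated with a minimal sketch. It assumes the onset times have been rasterized onto a uniform frame grid (an array holding an onset strength per frame, zero elsewhere); the function name `estimate_ibi_and_phase`, the frame grid, and the integer-frame IBI range are illustrative assumptions, not details given in this paper.

```python
import numpy as np

def estimate_ibi_and_phase(onsets, ibi_min, ibi_max):
    """Sketch of IBI and phase estimation on a frame grid (assumed layout).

    onsets: 1-D array, onset strength per frame (0 where there is no onset).
    ibi_min, ibi_max: the agent's assigned IBI range, in frames (inclusive).
    Returns (ibi, shift, next_beat), all in frames; shift is the offset of
    the best-matching pulse sequence from frame 0.
    """
    onsets = np.asarray(onsets, dtype=float)
    n = len(onsets)
    if not 1 <= ibi_min <= ibi_max < n:
        raise ValueError("IBI range must satisfy 1 <= ibi_min <= ibi_max < len(onsets)")

    # Autocorrelation, evaluated only at lags inside the assigned IBI range.
    lags = np.arange(ibi_min, ibi_max + 1)
    acf = np.array([np.dot(onsets[lag:], onsets[:-lag]) for lag in lags])
    ibi = int(lags[np.argmax(acf)])

    # Cross-correlation with pulse sequences of period `ibi`: one score per
    # possible shift, i.e. every candidate beat phase under this IBI.
    scores = np.array([onsets[shift::ibi].sum() for shift in range(ibi)])
    shift = int(np.argmax(scores))

    # Last pulse of the winning sequence inside the buffer, then one IBI ahead.
    last_beat = shift + ((n - 1 - shift) // ibi) * ibi
    next_beat = last_beat + ibi
    return ibi, shift, next_beat
```

Under these assumptions, the shift can be converted to a phase in radians as `2 * np.pi * shift / ibi`, measured from the start of the buffer rather than from the most recent onset.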

Each agent evaluates the reliability of its own hypothesis. The reliability is determined by how closely the next beat time predicted by the auto- and cross-correlation coincides with the time extrapolated from the past two beat times (Figure 3). If they coincide, the reliability is increased; otherwise, it is decreased.
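A minimal sketch of this reliability update follows. The paper does not state here how close two times must be to "coincide" or by how much the reliability changes, so the `tolerance` and `step` parameters and the clamping to the range 0 to 1 are all assumptions made for illustration.

```python
def update_reliability(reliability, predicted, past_beats, tolerance, step=0.1):
    """Raise or lower a hypothesis's reliability (illustrative sketch).

    predicted:  next beat time from the auto- and cross-correlation (seconds).
    past_beats: (earlier, latest), the past two beat times (seconds).
    tolerance:  assumed maximum gap, in seconds, that still counts as coinciding.
    step:       assumed size of each increase or decrease.
    """
    earlier, latest = past_beats
    # Linear extrapolation: repeat the most recent inter-beat interval.
    extrapolated = latest + (latest - earlier)

    if abs(predicted - extrapolated) <= tolerance:
        reliability += step
    else:
        reliability -= step

    # Keeping the value in [0, 1] is an assumption, not stated in the paper.
    return min(1.0, max(0.0, reliability))
```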


2.2.2 Inferring beat type

Each agent determines the beat type by matching the pre-registered drum patterns of BD and SD with the currently detected drum pattern. Figure 4 shows two examples of the pre-registered patterns. These patterns represent how BD and SD are typically played in rock and pop music. The beginning of a pattern should be a strong beat, and the length of the pattern is restricted to a half note or a measure.

  
Figure 4: Examples of pre-registered drum patterns

The beat type and its reliability are obtained as follows: (1) The onset times of the drums are quantized to form the currently detected pattern, at sixteenth-note resolution obtained by interpolating between successive beat times (Figure 5). (2) The matching score of each pre-registered pattern is calculated by matching it against the currently detected pattern; each match is weighted by the product of the weight in the pre-registered pattern and the reliability of the detected onset. (3) The beat type is inferred from the best-matched pattern, whose beginning indicates the position of the strong beat (Figure 6). The reliability of the beat type is given by the highest matching score.

  
Figure 5: A drum pattern detected from an input

  
Figure 6: Inferring beat type
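The three steps can be sketched as follows. The patterns in `PATTERNS` are hypothetical half-note patterns chosen for illustration and are not the ones shown in Figure 4; the function names, the dictionary layout, and the restriction to two-beat (half-note) patterns are likewise assumptions.

```python
import numpy as np

SLOTS_PER_BEAT = 4  # sixteenth-note slots per quarter-note beat

# Hypothetical pre-registered patterns (8 slots = a half note), NOT Figure 4.
PATTERNS = [
    {"BD": np.array([1.0, 0, 0, 0, 0, 0, 0, 0]),
     "SD": np.array([0, 0, 0, 0, 1.0, 0, 0, 0])},
    {"BD": np.array([1.0, 0, 0.5, 0, 0, 0, 0, 0]),
     "SD": np.array([0, 0, 0, 0, 1.0, 0, 0, 0])},
]

def quantize(onsets, beat_times):
    """Step 1: place (time, reliability) onsets on a sixteenth-note grid.

    The grid is interpolated between successive beat times; onsets outside
    the span of beat_times, or rounding onto the final beat, are ignored.
    """
    n_slots = SLOTS_PER_BEAT * (len(beat_times) - 1)
    grid = np.zeros(n_slots)
    for time, reliability in onsets:
        for k in range(len(beat_times) - 1):
            start, end = beat_times[k], beat_times[k + 1]
            if start <= time < end:
                frac = (time - start) / (end - start)
                slot = k * SLOTS_PER_BEAT + int(round(SLOTS_PER_BEAT * frac))
                if slot < n_slots:
                    grid[slot] = max(grid[slot], reliability)
                break
    return grid

def infer_beat_type(detected, patterns=PATTERNS):
    """Steps 2 and 3: score each pattern at each beat alignment.

    detected: {"BD": grid, "SD": grid}, with slot 0 falling on a beat.
    Returns (beat type of the beat at slot 0, highest matching score).
    """
    best_score, best_start = float("-inf"), 0
    for pattern in patterns:
        n_slots = len(pattern["BD"])
        for start in range(0, n_slots, SLOTS_PER_BEAT):
            # Rolling by `start` puts the pattern's beginning at that slot.
            # Each slot contributes pattern weight * detected reliability.
            score = sum(float(np.dot(np.roll(pattern[d], start), detected[d]))
                        for d in ("BD", "SD"))
            if score > best_score:
                best_score, best_start = score, start
    beat_type = "strong" if best_start == 0 else "weak"
    return beat_type, best_score
```

For example, with beats at 0.0, 0.5, and 1.0 seconds, a snare onset near 0.0 and a bass-drum onset near 0.5 would, under these illustrative patterns, mark the first beat as weak and the second as strong.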

If the reliability of the beat type is high, the IBI in the hypothesis can be considered to correspond to a quarter note. In that case, the reliability of the hypothesis is increased so that a hypothesis with an IBI corresponding to a quarter note is likely to be selected.


2.2.3 Managing hypotheses

The manager classifies all agent-generated hypotheses into groups according to beat time and IBI. Each group has an overall reliability, given by the sum of the reliabilities of its hypotheses. The most reliable hypothesis in the most reliable group is selected as the output and sent to the BI Transmission stage.
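The selection rule can be sketched as below. The paper does not specify here how similar two hypotheses must be to fall into the same group, so grouping by fixed-width bins (`time_bin`, `ibi_bin`) is an assumption, as are the `Hypothesis` class and the function name.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Hypothesis:
    next_beat: float    # predicted next-beat time (seconds)
    beat_type: str      # e.g. "strong" or "weak"
    ibi: float          # current inter-beat interval (seconds)
    reliability: float

def select_hypothesis(hypotheses, time_bin=0.02, ibi_bin=0.02):
    """Pick the most reliable hypothesis in the most reliable group.

    Hypotheses whose beat time and IBI fall into the same bins are grouped;
    the bin widths are illustrative assumptions.
    """
    groups = defaultdict(list)
    for h in hypotheses:
        key = (round(h.next_beat / time_bin), round(h.ibi / ibi_bin))
        groups[key].append(h)

    # A group's overall reliability is the sum over its members.
    best_group = max(groups.values(),
                     key=lambda g: sum(h.reliability for h in g))
    return max(best_group, key=lambda h: h.reliability)
```

Note that a single highly reliable hypothesis can lose to a group of moderately reliable hypotheses that agree with one another, which is the point of summing within groups.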



Masataka Goto
July 20, 1995