Proceedings of the 1995 International Computer Music Conference
A Real-time Beat Tracking System for Audio Signals / Masataka Goto and Yoichi Muraoka

2.1 Frequency Analysis

Onset components and noise components are first extracted from the frequency spectrum calculated by the Fast Fourier Transform. Onset-time finders then detect onset times in different frequency ranges and with different sensitivity levels. In parallel, a separate process, the drum-sound finder, detects the bass drum (BD) and the snare drum (SD).


2.1.1 Extracting onset components / Extracting noise components

Frequency components whose power has been rapidly increasing are extracted as onset components. The onset components and their degree of onset (rapidity of increase in power) are obtained by a process that takes into account the power present in nearby time-frequency regions.
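The section does not spell out the exact extraction rule, so the following is only a minimal sketch of one rule that fits the description. It assumes a power spectrogram `power[t][f]`, treats a bin as an onset component when its power exceeds the largest power in the neighboring bins of the previous frame and the rise persists into the next frame, and takes the degree of onset to be the size of that rise. The function name, the one-bin neighborhood, and the persistence test are illustrative assumptions, not details taken from BTS.

```python
def onset_degrees(power):
    """Return d[t][f], the degree of onset for each time-frequency bin.

    power[t][f] is the spectral power at frame t and frequency bin f.
    Bins without a detected onset get 0.0. The neighborhood (one bin
    on each side, one frame back) is an illustrative assumption.
    """
    n_t = len(power)
    n_f = len(power[0]) if n_t else 0
    d = [[0.0] * n_f for _ in range(n_t)]
    # The first and last frames lack a previous or next frame.
    for t in range(1, n_t - 1):
        for f in range(n_f):
            lo, hi = max(f - 1, 0), min(f + 1, n_f - 1)
            # Largest power just before (t, f) in nearby frequency bins.
            prev_peak = max(power[t - 1][lo:hi + 1])
            rising = power[t][f] > prev_peak
            # Require the increase to persist into the next frame.
            persists = power[t + 1][f] > prev_peak
            if rising and persists:
                d[t][f] = max(power[t][f], power[t + 1][f]) - prev_peak
    return d
```

Comparing against the previous frame's neighboring bins, rather than only the same bin, keeps a slowly gliding partial from being mistaken for a fresh onset.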

BTS extracts noise components as a preliminary step to detecting SD. Because non-noise sounds typically have harmonic structures and peak components along the frequency axis, frequency components whose power is roughly uniform locally are extracted and considered to be potential SD sounds.
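As a rough illustration of "roughly uniform locally", the sketch below marks a bin as a noise component when the smallest power in a small frequency window is at least a fixed fraction of the largest. The window half-width `half_width` and the ratio `flatness` are hypothetical parameters chosen for the example; the section does not give the criterion BTS actually uses.

```python
def noise_components(frame, half_width=2, flatness=0.5):
    """Return n[f], the noise component of each bin in one time frame.

    frame[f] is the power in frequency bin f. A bin is kept when the
    spectrum around it is locally flat (min >= flatness * max within
    the window); its value is then the local mean power. Peaky regions,
    typical of harmonic sounds, yield 0.0. Both parameters are
    illustrative assumptions.
    """
    n = [0.0] * len(frame)
    for f in range(half_width, len(frame) - half_width):
        window = frame[f - half_width:f + half_width + 1]
        hi = max(window)
        if hi > 0 and min(window) >= flatness * hi:
            n[f] = sum(window) / len(window)
    return n
```

A flat frame such as `[4.0] * 7` keeps its interior bins, while a frame with a single sharp peak is rejected everywhere the window touches that peak.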


2.1.2 Onset-time finders

Fourteen onset-time finders use different sets of frequency-analysis parameters. Each finder sends its onset information to a particular agent-pair. Each onset time is given by the peak time found by peak-picking in D(t) along the time axis, where $D(t) = \sum_f d(t,f)$ and d(t,f) is the degree of onset of frequency f at time t. The sum D(t) is linearly smoothed with a convolution kernel before its peak time is calculated.
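The three steps of a single finder (sum the degrees of onset over frequency, smooth, pick peaks) can be sketched as follows. The kernel weights and the `threshold` parameter, which stands in for a finder's sensitivity level, are assumptions made for the example; the section does not state the values BTS uses.

```python
def onset_times(d, kernel=(0.25, 0.5, 0.25), threshold=0.0):
    """Return frame indices where the smoothed D(t) has a local peak.

    d[t][f] is the degree of onset. The symmetric kernel and the
    threshold are illustrative assumptions.
    """
    # D(t): sum of the degree of onset over all frequency bins.
    D = [sum(row) for row in d]

    # Linear smoothing: convolve D with the kernel, treating values
    # outside the signal as zero.
    half = len(kernel) // 2
    smoothed = []
    for t in range(len(D)):
        acc = 0.0
        for k, w in enumerate(kernel):
            i = t + k - half
            if 0 <= i < len(D):
                acc += w * D[i]
        smoothed.append(acc)

    # Peak-picking: a strict rise followed by a non-rise, above threshold.
    return [t for t in range(1, len(smoothed) - 1)
            if smoothed[t] > threshold
            and smoothed[t] > smoothed[t - 1]
            and smoothed[t] >= smoothed[t + 1]]
```

Raising `threshold` makes the finder less sensitive, so weaker onsets are dropped. Restricting the sum to a subset of bins would give a finder for one frequency range.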


2.1.3 Drum-sound finder

A drum-sound finder detects BD from the onset components and SD from the noise components. Note that BTS cannot simply use the detected drums to track beats, because the results of this detection include many mistakes. The detected drums are used only to label a beat time with its beat type.

[Detecting onset times of BD]
Because the sound of BD is not known in advance, BTS learns the characteristic frequency of BD corresponding to a particular song. The finder finds peaks in the onset components along the frequency axis and histograms them (Figure 2). The finder then judges that BD has sounded at times when an onset's peak frequency coincides with the characteristic frequency that is given by the lowest-frequency peak of the histogram.
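The histogram procedure can be sketched as below. This is an offline simplification that builds the histogram over a whole excerpt, whereas a real-time system would have to update it as the song plays. The function names are hypothetical, and treating any spectral peak in a frame as "an onset's peak frequency" is an assumption.

```python
def spectral_peaks(row):
    """Indices of local maxima along the frequency axis (positive values only)."""
    last = len(row) - 1
    return [f for f in range(len(row))
            if row[f] > 0
            and (f == 0 or row[f] > row[f - 1])
            and (f == last or row[f] >= row[f + 1])]


def detect_bass_drum(d):
    """Return (bd_bin, times) from onset components d[t][f].

    bd_bin is the characteristic frequency bin: the lowest-frequency
    peak of the histogram of per-frame onset peaks. times lists the
    frames in which an onset peak falls on that bin.
    """
    if not d:
        return None, []
    peaks_per_frame = [spectral_peaks(row) for row in d]

    # Histogram of peak positions over all frames.
    hist = [0] * len(d[0])
    for peaks in peaks_per_frame:
        for f in peaks:
            hist[f] += 1

    hist_peaks = spectral_peaks(hist)
    if not hist_peaks:
        return None, []
    bd_bin = hist_peaks[0]  # lowest-frequency peak of the histogram

    times = [t for t, peaks in enumerate(peaks_per_frame) if bd_bin in peaks]
    return bd_bin, times
```

Choosing the lowest-frequency histogram peak reflects the expectation that the bass drum is the lowest recurring onset source in the mix.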

[Detecting onset times of SD]
BTS detects noise components widely distributed along the frequency axis as SD. First, the noise components are quantized (Figure 2). Second, the finder calculates how widely noise components are distributed along the frequency axis in the quantized noise components (degree of wide distribution c(t)). Finally, the onset time of SD is obtained by peak-picking of c(t) in the same way as in the onset-time finder.
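One way to read these three steps is sketched below, under stated assumptions. Quantization is modeled as splitting the frequency axis into `n_bands` equal bands and marking a band active when its mean noise power reaches `level`. The degree of wide distribution c(t) is then the fraction of active bands. Both parameters and both function names are hypothetical, and the smoothing step of the onset-time finder is omitted for brevity.

```python
def wide_distribution(noise, n_bands=4, level=1.0):
    """Return c[t], the fraction of frequency bands with noise at frame t.

    noise[t][f] holds the noise components. n_bands and level define a
    coarse quantization and are illustrative assumptions.
    """
    c = []
    for row in noise:
        band_size = len(row) // n_bands
        active = 0
        for b in range(n_bands):
            band = row[b * band_size:(b + 1) * band_size]
            # A band counts as active when its mean power reaches level.
            if band and sum(band) / len(band) >= level:
                active += 1
        c.append(active / n_bands)
    return c


def pick_peaks(c, threshold=0.5):
    """Frames where c(t) has a local peak at or above threshold."""
    return [t for t in range(1, len(c) - 1)
            if c[t] >= threshold
            and c[t] > c[t - 1]
            and c[t] >= c[t + 1]]
```

A frame whose noise fills every band scores 1.0, while noise confined to a single band scores 0.25 and falls below the default threshold.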

  
Figure 2: Detecting BD and SD


Masataka Goto
July 20, 1995