GPU Computing for Object Recognition

Extraction of local invariant features by GPU computing

Local invariant features are widely used as fundamental elements for image matching and object recognition. Dense sampling of local features improves performance in both tasks, but it also increases the computational cost of feature extraction. The purpose of this paper is to develop fast computational techniques for extracting local invariant features with a graphics processing unit (GPU). In particular, we consider an algorithm that uses multiresolutional orientation maps to calculate local descriptors consisting of histograms of gradient orientations. By applying Gaussian filters to the multiresolutional orientation maps, we obtain the voting values for the histograms at every pixel in a scale-space pyramid. We point out that orientation maps have two advantages in GPU computing. First, they improve the efficiency of parallel computing by reducing the number of memory access conflicts in the overlaps among local regions. Second, they allow a fast implementation of Gaussian filters that uses shared memory for the many convolution operations the orientation maps require. We conclude with experimental results that demonstrate the usefulness of multiresolutional orientation maps for fast feature extraction.
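The idea of obtaining histogram voting values by filtering orientation maps can be illustrated with a minimal single-scale sketch in plain Python. It is not the implementation evaluated here: the function names, the 8 orientation bins, the central-difference gradient, the clamped borders and the 3-sigma kernel radius are all assumptions made for illustration, and the scale-space pyramid is omitted.

```python
import math

def gradient(image):
    """Central-difference gradient; returns (magnitude, orientation) grids."""
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ori = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp indices at the border (an assumed boundary rule).
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ori[y][x] = math.atan2(gy, gx) % (2.0 * math.pi)
    return mag, ori

def orientation_maps(image, num_bins=8):
    """One map per orientation bin, holding the gradient magnitude of the
    pixels whose gradient orientation falls into that bin."""
    mag, ori = gradient(image)
    h, w = len(image), len(image[0])
    maps = [[[0.0] * w for _ in range(h)] for _ in range(num_bins)]
    bin_width = 2.0 * math.pi / num_bins
    for y in range(h):
        for x in range(w):
            b = int(ori[y][x] / bin_width) % num_bins
            maps[b][y][x] = mag[y][x]
    return maps

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel with an assumed radius of 3 sigma."""
    radius = max(1, int(math.ceil(3.0 * sigma)))
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def blur(grid, sigma):
    """Separable Gaussian filter: a horizontal pass, then a vertical pass."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(grid), len(grid[0])
    tmp = [[sum(k[i + r] * grid[y][min(max(x + i, 0), w - 1)]
                for i in range(-r, r + 1)) for x in range(w)]
           for y in range(h)]
    return [[sum(k[i + r] * tmp[min(max(y + i, 0), h - 1)][x]
                 for i in range(-r, r + 1)) for x in range(w)]
            for y in range(h)]

def histogram_at(smoothed_maps, x, y):
    """After filtering, the histogram at a pixel is a per-bin lookup."""
    return [m[y][x] for m in smoothed_maps]

# Usage: a horizontal intensity ramp, whose gradients all point along +x.
image = [[float(x) for x in range(8)] for _ in range(8)]
smoothed = [blur(m, sigma=1.0) for m in orientation_maps(image)]
hist = histogram_at(smoothed, 4, 4)
```

Because each smoothed map already holds Gaussian-weighted sums for every pixel, reading a histogram needs no per-region accumulation, so overlapping local regions do not write to shared histogram bins.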

The following images, "Tour de France" and "Graffiti", were used in the experiments to measure the computational time for extracting local invariant features. The top row shows the original images, and the bottom row shows, as yellow rectangles, the local regions in which the local descriptors are calculated. The image resolutions were 720x480 and 320x240.

The computational times for the tasks involved in extracting the local invariant features are listed below. The table covers four implementations. CPU-C and GPU-C stand for the CPU and GPU implementations of the conventional method, which does not use orientation maps to calculate the descriptors. CPU and GPU stand for the CPU and GPU implementations of the method using orientation maps. The CPU was an Intel quad-core Xeon 3.16GHz and the GPU was an NVIDIA GeForce GTX280.

Comparing the computational times for “Dominant orientation”, “Orientation map” and “Descriptor” among the implementations shows that both the introduction of orientation maps and the use of the GPU are very effective for fast feature extraction. Although the effectiveness depends on the resolution of the image and the contents of the scene, the proposed method is more than 30 times as fast as the conventional method on the CPU and about twice as fast as the conventional method on the GPU.

The following graphs show the efficiency of the parallel computation in feature extraction for the image "Tour de France". The horizontal axis represents the factor, which is proportional to the sizes of the local regions. The vertical axis shows the ratio between the computational times on the CPU and the GPU. This ratio reflects the efficiency of parallel computing because the computational cost is the same for both the CPU and the GPU. Altering the factor changes the area of the overlaps among local regions, so the computational dependency among local regions varies with the factor. A smaller factor reduces both the repeated computation of the histograms and the number of memory access conflicts in the overlaps, which increases the efficiency of parallel computing.
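How the overlap grows with region size can be made concrete with a small counting sketch. The square regions, the `half_size` parameter standing in for the factor-scaled region size, and the two example centers are hypothetical choices for illustration, not values from the experiments.

```python
def coverage_counts(centers, half_size, width, height):
    """Count how many square local regions cover each pixel."""
    counts = [[0] * width for _ in range(height)]
    for cx, cy in centers:
        y_lo, y_hi = max(cy - half_size, 0), min(cy + half_size, height - 1)
        x_lo, x_hi = max(cx - half_size, 0), min(cx + half_size, width - 1)
        for y in range(y_lo, y_hi + 1):
            for x in range(x_lo, x_hi + 1):
                counts[y][x] += 1
    return counts

def redundant_visits(centers, half_size, width, height):
    """Pixel visits beyond the first one. With per-region processing these
    are the places where work is repeated and where parallel threads may
    touch the same data."""
    counts = coverage_counts(centers, half_size, width, height)
    return sum(c - 1 for row in counts for c in row if c > 1)

# Usage: two neighboring regions, 4 pixels apart, on a 32x32 image.
centers = [(10, 10), (14, 10)]
overlap_by_size = {s: redundant_visits(centers, s, 32, 32) for s in (1, 2, 4)}
```

In this toy setting the regions are disjoint at half-size 1, share a single column of pixels at half-size 2, and share a 5x9 block at half-size 4, so the amount of shared data rises quickly as the regions grow.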

As the graph on the left shows, when no orientation maps were used, the efficiency of parallel computing fell as the factor was increased. In contrast, the efficiency was clearly maintained when orientation maps were used. One reason for this improvement is the reduction in the repeated computation of the histograms and in the number of memory access conflicts in the overlaps. Another is the fast implementation of the Gaussian filters for the multiresolutional orientation maps. The graph on the right shows the advantage of the parallel implementation of the Gaussian filters with shared memory over the CPU-based implementation. Because the areas of the local regions grow as the factor is increased, Gaussian filters with larger scales are needed, which entails a large number of convolution operations. However, thanks to data reuse in shared memory, no degradation in the efficiency of the parallel computation was observed. Similar results were obtained for the Graffiti image.
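The benefit of data reuse can be illustrated by emulating a tiled 1D convolution in plain Python and counting reads from the input array. This is only a sketch under stated assumptions: the list named `shared` plays the role of GPU shared memory, the tile size of 16 and the box kernels are arbitrary, and an actual CUDA kernel would stage the tile cooperatively across the threads of a block.

```python
def convolve_naive(signal, kernel):
    """Every filter tap reads the input directly ("global memory")."""
    r = len(kernel) // 2
    n = len(signal)
    out, loads = [], 0
    for x in range(n):
        acc = 0.0
        for i in range(-r, r + 1):
            acc += kernel[i + r] * signal[min(max(x + i, 0), n - 1)]
            loads += 1
        out.append(acc)
    return out, loads

def convolve_tiled(signal, kernel, tile=16):
    """Each tile stages its samples plus a halo of r on both sides once,
    then all outputs of the tile reuse the staged copy."""
    r = len(kernel) // 2
    n = len(signal)
    out, loads = [], 0
    for start in range(0, n, tile):
        stop = min(start + tile, n)
        # One read per staged sample; borders are clamped as in the naive case.
        shared = [signal[min(max(j, 0), n - 1)]
                  for j in range(start - r, stop + r)]
        loads += len(shared)
        for x in range(start, stop):
            base = x - start  # position of sample x - r inside `shared`
            out.append(sum(kernel[k] * shared[base + k]
                           for k in range(2 * r + 1)))
    return out, loads

# Usage: the same signal filtered with a 9-tap and a 17-tap box kernel.
signal = [float((7 * i) % 13) for i in range(64)]
small = [1.0 / 9.0] * 9
large = [1.0 / 17.0] * 17
naive_out, naive_loads = convolve_naive(signal, small)
tiled_out, tiled_loads = convolve_tiled(signal, small)
_, naive_loads_large = convolve_naive(signal, large)
_, tiled_loads_large = convolve_tiled(signal, large)
```

In the naive version the number of input reads grows in proportion to the kernel width, whereas in the tiled version only the halo grows, which is consistent with the observation that larger filter scales did not degrade the efficiency of the parallel computation.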

In summary, we have shown that using multiresolutional orientation maps to calculate the descriptors has two advantages in GPU computing. The first is an improvement in the efficiency of parallel computing, which results from reducing the repeated computation of the histograms and the number of memory access conflicts in the overlaps among local regions. The second is the ability to use a fast shared-memory implementation of Gaussian filters for the large number of convolution operations that the orientation maps require.

【References】

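[1] N. Ichimura: "Extraction of Local Invariant Features Using Orientation Maps," The Journal of the Institute of Image Information and Television Engineers, Vol.66, No.10, pp.831-834, 2012 (survey article in the special issue "GPUs and Their Applications"; in Japanese).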

[2] N. Ichimura: "Scalable Local Feature Extraction with Orientation Maps and GPU Computing," GPU Technology Conference, 2012 (pdf, GTC On-Demand).

[3] N. Ichimura: "Extracting Multi-size Local Descriptors by GPU Computing," Proc. The Second IEEE Workshop on Visual Content Identification and Search (VCIDS'11), 2011 (pdf).

[4] N. Ichimura: "Online Extraction of Local Invariant Features Based on GPU and Orientation Maps," IPSJ SIG Technical Report, Vol.2011-CG-142, No.1, 2011 (pdf; in Japanese).

[5] N. Ichimura: "Extraction of Multi-size Local Descriptors by GPU," Meeting on Image Recognition and Understanding (MIRU2010), 2010 (pdf; in Japanese).

[6] N. Ichimura: "GPU Computing with Orientation Maps for Extracting Local Invariant Features," Proc. The Sixth IEEE Workshop on Embedded Computer Vision (ECVW2010), 2010 (pdf).

[7] N. Ichimura: "Extraction of Local Invariant Features Based on Feature Points and Edges by GPU," IPSJ SIG Technical Report, Vol.2009-CG-136, No.11, 2009 (pdf; in Japanese).

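[8] N. Ichimura: "Extraction of Local Invariant Features by GPU for Brand Exposure Surveys," Meeting on Image Recognition and Understanding (MIRU2009), 2009 (pdf; in Japanese).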

[9] N. Ichimura: "Matching with Local Invariant Features Based on Dense Edge Sampling," IEICE Technical Report, No.PRMU2009-51, pp.71-76, 2009 (pdf; in Japanese).

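[10] N. Ichimura: "Extraction of Local Invariant Features Using an Approximate LoG Filter: Implementation on a GPU," IPSJ SIG Technical Report, Vol.2008-CVIM-165, pp.243-250, 2008 (pdf; in Japanese).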

[IEEE copyright notice] © 2011 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.