Wednesday, May 29, 2013

Methods for vehicle detection and recognition

A summary of questions, methods and papers on Stack Overflow: link

Several papers:

1. Car-Rec: A Real Time Car Recognition System

This paper describes how to check whether or not a car is in a database, in this case one that contains the employee cars of a certain parking lot. The framework has four steps:
1. feature extraction
2. word quantization
3. image database search
4. structural matching.
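
To make steps 2 and 3 more concrete, here is a minimal sketch of word quantization and database search. It is not the Car-Rec implementation: the toy 2-D descriptors, the tiny codebook, and the function names `quantize`, `build_index` and `search` are all assumptions made for illustration, and the structural matching step is left out.

```python
import math
from collections import Counter

def quantize(descriptors, codebook):
    """Map each descriptor to the index of its nearest codeword (its "visual word")."""
    words = []
    for d in descriptors:
        dists = [math.dist(d, c) for c in codebook]
        words.append(dists.index(min(dists)))
    return words

def build_index(database, codebook):
    """Inverted index: visual word -> set of image ids that contain it."""
    index = {}
    for image_id, descriptors in database.items():
        for w in quantize(descriptors, codebook):
            index.setdefault(w, set()).add(image_id)
    return index

def search(query_descriptors, index, codebook):
    """Rank database images by the number of visual words shared with the query."""
    votes = Counter()
    for w in set(quantize(query_descriptors, codebook)):
        for image_id in index.get(w, ()):
            votes[image_id] += 1
    return votes.most_common()

# Toy data (hypothetical): 2-D "descriptors" instead of real image features.
codebook = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
database = {"car_a": [(0.1, 0.2), (9.5, 10.1)], "car_b": [(0.2, 9.8)]}
index = build_index(database, codebook)
ranking = search([(0.0, 0.3), (10.2, 9.9)], index, codebook)
```

In a real system the top-ranked candidates would then go through structural matching before the car is accepted as being in the database.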

If I understood it correctly, the method in this paper cannot recognize the type of a vehicle, such as sedan, truck, van or SUV.

2. Robust Classification and Tracking of Vehicles in Traffic Video Streams


Actually, the method in this paper depends too heavily on bounding box information: a "long" bounding box indicates a Semi, while "short" ones indicate a Sedan or TSV, so the system sometimes recognizes a TSV as a Sedan. Moreover, it needs background subtraction, which assumes a completely static background and moving objects; static objects, such as cars in parking lots, would not be detected.


This paper integrates vehicle tracking and classification (three classes: Sedan, Semi, and Truck+SUV+Van, abbreviated TSV) on low-resolution traffic video. The technique is also general enough to be applied to surveillance scenes other than traffic.

It mentions that model-based trackers are robust to illumination changes and occlusion but require models for all vehicles, which limits their scalability.

The text overlay accompanying Fig. 4 shows the track ID, a class identifier number (c0 for TSV, c1 for sedan, c2 for semi), and an estimate of the speed and direction of travel.

The system will have one classifier trained for the entire scene and invariant to the camera pose selected by a remote operator.

The vehicle classifier was built by comparing different classification schemes that use either image based (IB) or image measurement based (IM) features. PCA or LDA was applied to reduce the dimensionality of the data, remove redundant information and project the data to a space better suited for classification. Classification itself was performed by a weighted K nearest neighbor classifier.

Image Based (IB) features: The image of the tracked object is used as the feature vector. To allow a proper comparison, each object was resized to [64x32] pixels, generating a feature vector with 2048 components.
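
As a rough illustration of the IB feature, the sketch below resizes a grayscale image (a list of rows) and flattens it into a 2048-component vector. The nearest-neighbour resizing, the reading of [64x32] as 64 rows by 32 columns, and the names `resize_nearest` and `ib_feature` are my assumptions; the notes above do not say which interpolation or orientation the paper uses.

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list of pixel values."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def ib_feature(image, out_h=64, out_w=32):
    """Resize the object image and flatten it row by row into one feature vector."""
    resized = resize_nearest(image, out_h, out_w)
    return [pixel for row in resized for pixel in row]

# A hypothetical 20x10 object crop; the resulting vector has 64 * 32 = 2048 entries.
crop = [[r * 10 + c for c in range(10)] for r in range(20)]
feature = ib_feature(crop)
```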

Image Measurements Based (IM) features: it is cheaper, both computationally and in storage, to maintain a database of features rather than images. The aim is to obtain as many simple measurements as possible and allow a classifier to decide which are best for classification.
The feature vector consists of:
~ area
~ bounding box [width, height]
~ convex area
~ ellipse [eccentricity, major axis, minor axis]
~ extent - proportion of the pixels in the bounding box that belong to the object
~ solidity - proportion of the pixels in the convex hull that belong to the region
~ perimeter
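
A few of these measurements can be computed directly from a binary object mask. The sketch below covers area, bounding box, extent and perimeter only; convex area, the ellipse parameters and solidity are omitted to keep it short. The function name `im_features` and the perimeter definition (counting object pixels that touch a non-object 4-neighbour) are assumptions, not the paper's definitions.

```python
def im_features(mask):
    """Simple shape measurements from a binary mask given as a list of rows of 0/1."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no object pixels")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = len(pixels)
    filled = set(pixels)
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Boundary pixels: object pixels with at least one 4-neighbour outside the object.
    perimeter = sum(
        1 for r, c in pixels
        if any((r + dr, c + dc) not in filled for dr, dc in neighbours)
    )
    return {
        "area": area,
        "bbox_width": width,
        "bbox_height": height,
        "extent": area / (width * height),  # object pixels / bounding box pixels
        "perimeter": perimeter,
    }

# A hypothetical L-shaped blob: 3 object pixels inside a 2x2 bounding box.
blob = [[1, 0],
        [1, 1]]
features = im_features(blob)
```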

weighted K Nearest Neighbor (wkNN): each sample is assigned a weight for every class, whereas plain NN gives only a binary indication of class membership. The wkNN weight for each class indicates the strength of the match, and the sample is labeled with the class that has the highest weight.

The L2 norm was used as the distance metric to determine the similarity between vectors.
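
A minimal sketch of such a classifier follows. The notes do not record how the paper computes the class weights, so the inverse-distance weighting here is an assumption, as are the names `wknn_weights` and `wknn_label` and the toy 2-D features. The distance is the L2 norm, as stated above.

```python
import math
from collections import defaultdict

def wknn_weights(sample, training, k=3, eps=1e-9):
    """Normalized per-class weights from the k nearest training samples (L2 distance)."""
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    weights = defaultdict(float)
    for features, label in nearest:
        # Assumed weighting: closer neighbours contribute more to their class.
        weights[label] += 1.0 / (math.dist(sample, features) + eps)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

def wknn_label(sample, training, k=3):
    """Assign the class with the highest weight."""
    weights = wknn_weights(sample, training, k)
    return max(weights, key=weights.get)

# Hypothetical 2-D feature vectors, e.g. after projection with LDA.
training = [((0.0, 0.0), "sedan"), ((0.0, 1.0), "sedan"),
            ((5.0, 5.0), "semi"), ((5.0, 6.0), "semi")]
weights = wknn_weights((0.2, 0.1), training)
label = wknn_label((0.2, 0.1), training)
```

Keeping the full weight dictionary, rather than only the winning label, is what makes it possible to accumulate evidence over a whole track later on.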

The LDA-IM classifier was chosen for integration into the tracking software because it yields a simple classifier with low computational complexity that generalizes well due to scene object independence.

An adaptive background subtraction scheme was used to detect potential vehicles in the Object Detection module. Taking the difference between the current video frame and the estimated background produces regions of moving objects. These regions are processed to produce vehicle blob detections.
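
The idea can be sketched with a running-average background model. This is only one possible adaptive scheme and I am assuming it for illustration; the notes do not say which one the paper uses. The parameters `alpha` and `threshold` and the function names are hypothetical.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the current frame into the background estimate (running average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
        for brow, frow in zip(background, frame)
    ]

def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose difference from the background estimate exceeds the threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(brow, frow)]
        for brow, frow in zip(background, frame)
    ]

# Hypothetical 3x3 grayscale frames: a bright object appears at the centre pixel.
background = [[100.0] * 3 for _ in range(3)]
frame = [[100, 100, 100], [100, 200, 100], [100, 100, 100]]
mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

The sketch also shows the limitation noted earlier: an object that stays still is gradually absorbed into the background estimate and stops being detected.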

Tracking is accomplished by using a Kalman filter on the center of mass of the detected object region. The Kalman module outputs a state vector, [x, v]^T, containing the position and velocity of the region. The Kalman filter is a state estimation tool that predicts the position of a vehicle in the next frame.
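
Below is a minimal constant-velocity Kalman filter for a single coordinate with state [x, v]^T. A tracker for the centre of mass would run this for each image axis or use a larger state vector. The noise values `q` and `r`, the initial covariance, and the function names are assumptions, not values from the paper.

```python
def kalman_predict(x, P, dt=1.0, q=0.01):
    """Predict the next state and covariance with F = [[1, dt], [0, 1]], Q = q * I."""
    pos, vel = x
    x_pred = [pos + dt * vel, vel]
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    return x_pred, [[p00, p01], [p10, p11]]

def kalman_update(x_pred, P_pred, z, r=1.0):
    """Correct the prediction with a position measurement z (H = [1, 0])."""
    innovation = z - x_pred[0]
    s = P_pred[0][0] + r                      # innovation covariance
    k0 = P_pred[0][0] / s                     # Kalman gain for position
    k1 = P_pred[1][0] / s                     # Kalman gain for velocity
    x_new = [x_pred[0] + k0 * innovation, x_pred[1] + k1 * innovation]
    # P_new = (I - K H) P_pred
    P_new = [
        [(1 - k0) * P_pred[0][0], (1 - k0) * P_pred[0][1]],
        [P_pred[1][0] - k1 * P_pred[0][0], P_pred[1][1] - k1 * P_pred[0][1]],
    ]
    return x_new, P_new

# Hypothetical track: the blob centre moves 2 pixels per frame along one axis.
x, P = [0.0, 0.0], [[10.0, 0.0], [0.0, 10.0]]
for t in range(1, 21):
    x, P = kalman_predict(x, P)
    x, P = kalman_update(x, P, z=2.0 * t)
```

The predicted position from `kalman_predict` is what a tracker would use to associate a detection in the next frame with an existing track.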

The vehicle label for a track is determined by building a histogram of the class weights from each frame in a track T and selecting the class with the highest membership as the label.

By binning the soft class memberships into a track histogram, the Track Builder is able to recover from misclassified examples, because the final label is assigned only as the most likely class along the entire track.
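
The track histogram can be sketched as a sum of per-frame class weights. The function name `track_label` and the weight values below are made up for illustration; only the idea of accumulating soft memberships over the track comes from the paper.

```python
from collections import defaultdict

def track_label(frame_weights):
    """Accumulate per-frame class weights over a track and pick the strongest class."""
    histogram = defaultdict(float)
    for weights in frame_weights:
        for label, w in weights.items():
            histogram[label] += w
    return max(histogram, key=histogram.get), dict(histogram)

# Hypothetical track of four frames; the second frame on its own would be labelled TSV.
track = [
    {"sedan": 0.6, "TSV": 0.4},
    {"sedan": 0.3, "TSV": 0.7},
    {"sedan": 0.8, "TSV": 0.2},
    {"sedan": 0.7, "TSV": 0.3},
]
label, histogram = track_label(track)
```

Here the single misclassified frame is outvoted by the rest of the track, so the final label is "sedan".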

The TSV and Sedan classes are most often confused because of their proximity in the LDA feature space.

Tracking based classification uses the entire track of a vehicle for classification rather than just an individual frame image. Each frame generates an individual example of a vehicle which can be classified more accurately when all occurrences in a track are combined.

The track based classification results are promising and indicate the value of doing classification over spatio-temporal detections.

3. Fine-Grained Entity Recognition 
This paper proposes a way to define a fine-grained set of 112 tags, formulates the tagging problem as multi-class, multi-label classification, describes an unsupervised method for collecting training data, and presents the FIGER implementation.

In the overview section (2.1), the input is a sentence of plain text; the system segments the sentence and finds the candidates for tagging. It is an NLP paper and cannot be applied to vehicle recognition.

4. Inducing Fine-Grained Semantic Classes via Hierarchical and Collective Classification
This is also an NLP paper and cannot help to recognize vehicle types.

5. A Codebook-Free and Annotation-Free Approach for Fine-Grained Image Categorization 

The codebook method often loses subtle image information that is critical for fine-grained classification.

The annotation-based approach involves a tedious process that is also difficult to generalize to new tasks.


Although the paper uses birds as its example, the method might be applied to vehicle recognition and classification.




















