Thursday, May 30, 2013

Basic concepts in Computer Vision and Machine Learning


Attribute: a semantic, usually human-defined way to describe objects, e.g. color, shape, or type.


Feature: a piece of information that is relevant for solving the computational task related to a certain application. More specifically, features can refer to:
1. the result of a general neighborhood operation (feature extractor or feature detector) applied to the image
2. specific structures in the image itself, ranging from simple structures such as points or edges to more complex structures such as objects.
Other examples of features are related to motion in image sequences, to shapes defined in terms of curves or boundaries between different image regions, or to properties of such a region.
The feature concept is very general and the choice of features in a particular computer vision system may be highly dependent on the specific problem at hand.


Bag-of-words (BoW model): can be applied to image classification, by treating image features as words. In computer vision, a bag of visual words is a sparse vector of occurrence counts of a vocabulary of local image features.
The first step is to extract local features; the second is to represent each feature (patch) as a numerical vector.
The final step of the BoW model is to convert the vector-represented patches into "codewords" (analogous to words in text documents), which also produces a "codebook" (analogous to a word dictionary). A codeword can be considered a representative of several similar patches; in practice it is often obtained by clustering the patch descriptors, e.g. with k-means.
A notorious disadvantage of BoW is that it ignores the spatial relationships among the patches, which are very important in image representation.
Furthermore, the BoW model has not been extensively tested for viewpoint and scale invariance, so its performance there is unclear; the BoW model is also not well understood for object segmentation and localization.
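As a minimal sketch of the final quantization step, the following assumes local descriptors and a k-means codebook are already available, and builds the occurrence-count vector by nearest-codeword assignment (NumPy, toy data; names are illustrative):

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Quantize each local descriptor to its nearest codeword and
    count occurrences, yielding a bag-of-visual-words histogram."""
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assignments = d2.argmin(axis=1)          # index of nearest codeword
    hist = np.bincount(assignments, minlength=len(codebook))
    return hist / hist.sum()                 # normalize to unit sum

# Toy example: 6 two-dimensional "descriptors", codebook of 3 codewords.
codebook = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
descriptors = np.array([[0.1, 0.2], [4.9, 5.1], [5.2, 4.8],
                        [9.8, 0.1], [0.0, 0.3], [10.1, -0.2]])
print(bow_histogram(descriptors, codebook))
```

Note that the resulting histogram is exactly the "sparse vector of occurrence counts" mentioned above, here normalized so images with different numbers of detected features remain comparable.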


Scale-Invariant Feature Transform (SIFT): a feature detection and description algorithm, notable for producing descriptors that are robust to changes in scale and rotation (and, to a degree, illumination and viewpoint).


Histogram of Oriented Gradients (HOG): a descriptor that counts occurrences of gradient orientations in localized cells of an image, with histograms typically contrast-normalized over overlapping blocks; widely used for pedestrian and object detection.
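A minimal sketch of the core idea for a single cell, assuming an unsigned 9-bin orientation histogram weighted by gradient magnitude (the full HOG pipeline adds block normalization and interpolated voting, omitted here):

```python
import numpy as np

def cell_hog(patch, n_bins=9):
    """Orientation histogram for one cell: each pixel votes for the
    bin of its gradient direction, weighted by gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in standard HOG.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0
    bins = (orientation / (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())
    return hist

# A patch with a pure horizontal intensity ramp: every gradient points
# along x, so all votes land in the 0-degree bin.
ramp = np.tile(np.arange(8, dtype=float), (8, 1))
print(cell_hog(ramp))
```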


k-means clustering: a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. This results in a partitioning of the data space into Voronoi cells.
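A compact NumPy sketch of Lloyd's algorithm, the standard iteration behind k-means (random initialization and a fixed iteration count are simplifications):

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm: alternate assigning points to the nearest
    mean and recomputing each mean as the centroid of its points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)            # nearest mean per point
        for j in range(k):
            if np.any(labels == j):           # leave empty clusters as-is
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated blobs: k-means should recover their means.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
centers, labels = kmeans(X, k=2)
print(centers)
```

Each point ends up in the cell of the center it is nearest to, which is precisely the Voronoi partition mentioned above.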
k-nearest neighbor (k-NN): a non-parametric method for classifying objects based on closest training examples in the feature space.
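The classification rule itself is a few lines; this sketch uses a plain majority vote among the k nearest training points (brute-force distances, toy data):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    d2 = ((X_train - x) ** 2).sum(axis=1)    # squared distances to x
    nearest = np.argsort(d2)[:k]             # indices of k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.1], [0.0, 0.2],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.05, 0.05])))  # → 0
print(knn_predict(X_train, y_train, np.array([5.0, 5.05])))   # → 1
```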


regression: a statistical technique for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable changes when any one of the independent variables is varied, while the other independent variables are held fixed.
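The simplest instance, ordinary least-squares linear regression, can be sketched with NumPy's `lstsq` on a noiseless toy line:

```python
import numpy as np

# Ordinary least squares: fit y ≈ a*x + b by minimizing squared error.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0                      # noiseless line: slope 2, intercept 1
A = np.column_stack([x, np.ones_like(x)])   # design matrix [x | 1]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)  # → [2. 1.]
```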


AdaBoost: combines several weak classifiers (each only slightly better than chance) into a strong classifier, reweighting the training examples after each round so that later weak learners focus on the examples earlier ones misclassified.
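A toy sketch of this reweighting loop, using 1-D threshold stumps as the weak classifiers (simplified from the full algorithm; labels are ±1):

```python
import numpy as np

def adaboost_stumps(x, y, n_rounds=10):
    """AdaBoost with 1-D threshold stumps.  Each round fits the stump
    with the lowest weighted error, then up-weights the examples it got
    wrong so the next stump concentrates on them."""
    w = np.full(len(x), 1.0 / len(x))          # uniform example weights
    ensemble = []                              # (alpha, threshold, polarity)
    thresholds = (np.sort(x)[:-1] + np.sort(x)[1:]) / 2.0
    for _ in range(n_rounds):
        best = None
        for t in thresholds:                   # exhaustive stump search
            for polarity in (1, -1):
                pred = polarity * np.where(x > t, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, t, polarity, pred)
        err, t, polarity, pred = best
        err = max(err, 1e-12)                  # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this weak learner
        w *= np.exp(-alpha * y * pred)         # up-weight mistakes
        w /= w.sum()
        ensemble.append((alpha, t, polarity))
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(a * p * np.where(x > t, 1, -1) for a, t, p in ensemble)
    return np.sign(score)

# Toy 1-D problem, separable by a single threshold.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-1, -1, -1, 1, 1, 1])
model = adaboost_stumps(x, y, n_rounds=5)
print(adaboost_predict(model, x))
```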


Support Vector Machine (SVM): a maximum-margin classifier; it seeks the separating hyperplane that maximizes the distance (margin) to the nearest training examples (the support vectors), and can handle non-linear boundaries via the kernel trick.
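One simple way to train a linear SVM is sub-gradient descent on the regularized hinge loss (a simplified, Pegasos-style sketch; practical implementations solve the dual problem or use kernels):

```python
import numpy as np

def linear_svm(X, y, lam=0.01, lr=0.1, n_epochs=200, seed=0):
    """Linear SVM via sub-gradient descent on the regularized hinge
    loss  lam*||w||^2 + mean(max(0, 1 - y*(w.x + b))).  Labels are ±1."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                     # inside margin: hinge active
                w += lr * (y[i] * X[i] - 2 * lam * w)
                b += lr * y[i]
            else:                              # correct with margin: decay only
                w -= lr * 2 * lam * w
    return w, b

# Two linearly separable point clusters.
X = np.array([[0.0, 0.0], [0.5, 0.2], [3.0, 3.0], [3.2, 2.8]])
y = np.array([-1, -1, 1, 1])
w, b = linear_svm(X, y)
print(np.sign(X @ w + b))
```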


Dimensionality reduction
Principal Component Analysis (PCA): aligns the data along the directions of greatest variance (retains the high-variance directions, but large variance is not always what is best for classification).
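A minimal sketch via eigendecomposition of the covariance matrix (SVD of the centered data is the more numerically robust alternative):

```python
import numpy as np

def pca(X, n_components):
    """Project centered data onto the top eigenvectors of its
    covariance matrix, i.e. the directions of greatest variance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]              # columns = principal axes
    return Xc @ components, components

# Points lying almost exactly on the line y = x: the first principal
# direction should be close to (1, 1)/sqrt(2).
X = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.05]])
projected, components = pca(X, n_components=1)
print(components.ravel())
```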
Linear Discriminant Analysis (LDA): projects onto a subspace of best discrimination by maximizing the separation between classes relative to the separation within classes (it takes the actual class labels into account, maximizing the ratio of between-class to within-class scatter).
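For two classes this reduces to Fisher's discriminant, w = Sw⁻¹(μ₁ − μ₀); the sketch below shows why LDA can beat PCA when the high-variance direction is irrelevant to the class labels:

```python
import numpy as np

def fisher_lda(X0, X1):
    """Two-class Fisher discriminant: the direction maximizing
    between-class separation over within-class scatter."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix Sw.
    Sw = np.cov(X0, rowvar=False) * (len(X0) - 1) \
       + np.cov(X1, rowvar=False) * (len(X1) - 1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w / np.linalg.norm(w)

# Classes separated along x, with large class-irrelevant variance along y:
# LDA picks a direction near the x axis (PCA would pick the y axis).
rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], [0.2, 3.0], size=(50, 2))
X1 = rng.normal([2.0, 0.0], [0.2, 3.0], size=(50, 2))
w = fisher_lda(X0, X1)
print(w)
```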

Deformable Part Model (DPM): an object detection approach (Felzenszwalb et al.) that represents an object as a coarse root filter plus a set of higher-resolution part filters whose positions can deform relative to the root; typically built on HOG features and trained with a latent SVM.
