Support vector machines (SVMs) are widely regarded as among the best general-purpose classification algorithms in use today. Their strength lies in how the separating hyperplane is chosen: of all the hyperplanes that separate two classes of data, the SVM fits the one that maximizes the distance between the plane and the nearest elements of each class on either side of it.
In cases where the data is not linearly separable, SVMs rely on projecting the data into a higher-dimensional space where the classes might be separable. Depending on the situation, this approach may or may not work well. Another method, boosting, has been shown to perform well in a variety of scenarios and is worth considering.
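To make the projection idea concrete, here is a minimal sketch on made-up one-dimensional data. The data, the feature map `lift`, and the hand-picked hyperplane `w`, `b` are all illustrative assumptions (an actual SVM would learn the hyperplane and would typically use a kernel rather than an explicit map). No single threshold on x separates the classes, because one class lies on both sides of the other, but after lifting each point to (x, x²) a straight line does.

```python
import numpy as np

# Hypothetical 1-D toy data: class -1 sits near the origin,
# class +1 lies on both sides of it.
x = np.array([-3.0, -2.5, -2.0, -0.5, 0.0, 0.5, 2.0, 2.5, 3.0])
y = np.where(np.abs(x) < 1.0, -1, 1)

def lift(x):
    """Map each point x to (x, x**2), i.e. from 1-D into 2-D."""
    return np.column_stack([x, x ** 2])

# In the lifted space, the hyperplane w.z + b = 0 with w = (0, 1), b = -2
# (the horizontal line x**2 = 2) puts the two classes on opposite sides.
# These values are chosen by hand for illustration, not learned.
w = np.array([0.0, 1.0])
b = -2.0
pred = np.sign(lift(x) @ w + b).astype(int)

print("all points classified correctly:", np.array_equal(pred, y))
```

The same trick works for the concentric toy problem discussed next: adding the squared radius as an extra coordinate would make the two classes separable by a plane.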
Boosting builds a strong classifier from an ensemble of weak classifiers (Freund & Schapire, 1995; Friedman, Hastie & Tibshirani, 1998). Consider the toy problem of building a classifier where one class has a circular distribution and the other is distributed in a ring concentric with the first. The idea is to start with a simple linear classifier that has a better-than-random chance of classifying correctly. An iterative procedure follows: the examples misclassified at the previous stage are “boosted” in weight, and a new hyperplane is fitted to best separate the reweighted dataset. The result is a sequence of hyperplanes fitted to the data such that their aggregate, when duly weighted, separates the classes well.
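The procedure above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: it uses the discrete AdaBoost weight update, restricts the weak classifiers to axis-aligned hyperplanes (decision stumps) picked from a fixed grid of thresholds, and generates its own synthetic concentric data. All function names (`make_rings`, `best_stump`, `adaboost_fit`, and so on) are made up for this example.

```python
import numpy as np

def make_rings(n_per_class=100, seed=0):
    """Toy data: class -1 inside the unit disc, class +1 in a ring of radius 2..3."""
    rng = np.random.default_rng(seed)
    angle = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    radius = np.concatenate([
        rng.uniform(0.0, 1.0, size=n_per_class),   # inner class
        rng.uniform(2.0, 3.0, size=n_per_class),   # outer class
    ])
    X = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
    y = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])
    return X, y

def stump_predict(X, feature, threshold, polarity):
    """Axis-aligned hyperplane: polarity * (+1 if x[feature] > threshold else -1)."""
    return polarity * np.where(X[:, feature] > threshold, 1.0, -1.0)

def best_stump(X, y, w, thresholds):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, np.inf
    for feature in range(X.shape[1]):
        for threshold in thresholds:
            for polarity in (1.0, -1.0):
                pred = stump_predict(X, feature, threshold, polarity)
                err = np.sum(w[pred != y])
                if err < best_err:
                    best, best_err = (feature, threshold, polarity), err
    return best, best_err

def adaboost_fit(X, y, n_rounds=200):
    thresholds = np.linspace(-3.0, 3.0, 25)      # assumed candidate grid
    w = np.full(len(y), 1.0 / len(y))            # start with uniform weights
    ensemble = []
    for _ in range(n_rounds):
        stump, err = best_stump(X, y, w, thresholds)
        err = np.clip(err, 1e-10, 1.0 - 1e-10)   # guard against log(0)
        alpha = 0.5 * np.log((1.0 - err) / err)  # vote weight of this stump
        pred = stump_predict(X, *stump)
        w = w * np.exp(-alpha * y * pred)        # boost misclassified examples
        w = w / w.sum()                          # renormalize to a distribution
        ensemble.append((alpha, stump))
    return ensemble

def adaboost_predict(X, ensemble):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(alpha * stump_predict(X, *stump) for alpha, stump in ensemble)
    return np.where(score >= 0, 1.0, -1.0)

X, y = make_rings()
ensemble = adaboost_fit(X, y)
first_stump_acc = np.mean(stump_predict(X, *ensemble[0][1]) == y)
ensemble_acc = np.mean(adaboost_predict(X, ensemble) == y)
print(f"first stump: {first_stump_acc:.2f}, boosted ensemble: {ensemble_acc:.2f}")
```

Each stump on its own can only cut the plane along one axis, so it misclassifies a large share of the outer ring. The weighted vote combines many such cuts into a roughly box-shaped region around the inner class, which is why the ensemble's training accuracy ends up well above that of the first stump.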
Boosting has been used to great effect in the Viola-Jones face detection algorithm (coming soon… )