We introduce a method of feature selection for Support Vector Machines. The method is based upon finding those features which minimize bounds on the leave-one-out error; these bounds can be efficiently computed and minimized via gradient descent. The resulting algorithms are shown to be superior to some standard feature selection algorithms on both toy data and real-life problems of face recognition, pedestrian detection and analyzing DNA microarray data.

1 Introduction

In many supervised learning problems, feature selection is important for a variety of reasons: generalization performance, running-time requirements, and constraints and interpretational issues imposed by the problem itself. In classification problems we are given $\ell$ data points $x_i \in \mathbb{R}^n$ labeled $y \in \{\pm 1\}$, drawn i.i.d. from a probability distribution $P(x, y)$. We would like to select a subset of features while preserving or improving the discriminative ability of a classifier. A brute force search of all possible subsets is kn...
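To make the combinatorial cost of the brute-force baseline concrete, here is a minimal sketch (not the paper's method): it enumerates all $2^n - 1$ non-empty feature subsets and scores each with a simple nearest-centroid classifier, which stands in for an arbitrary subset scorer. The toy data, the nearest-centroid scorer, and all function names are illustrative assumptions, not part of the original text.

```python
import itertools
import numpy as np

def nearest_centroid_accuracy(X, y, subset):
    """Training accuracy of a nearest-centroid rule on a feature subset
    (a stand-in scorer; any classifier could be plugged in here)."""
    Xs = X[:, subset]
    mu_pos = Xs[y == 1].mean(axis=0)
    mu_neg = Xs[y == -1].mean(axis=0)
    pred = np.where(
        np.linalg.norm(Xs - mu_pos, axis=1) < np.linalg.norm(Xs - mu_neg, axis=1),
        1, -1)
    return (pred == y).mean()

def brute_force_select(X, y):
    """Enumerate every non-empty subset of the n features; return the best.
    Cost is 2^n - 1 evaluations, which is infeasible for large n."""
    n = X.shape[1]
    best_subset, best_acc = None, -1.0
    for k in range(1, n + 1):
        for subset in itertools.combinations(range(n), k):
            acc = nearest_centroid_accuracy(X, y, list(subset))
            if acc > best_acc:
                best_subset, best_acc = subset, acc
    return best_subset, best_acc

# Hypothetical toy data: feature 0 carries the label signal, features 1-2 are noise.
rng = np.random.default_rng(0)
y = np.repeat([1, -1], 50)
X = rng.normal(size=(100, 3))
X[:, 0] += 2.0 * y
```

With only 3 features the search visits 7 subsets; at the dimensionality of, say, microarray data, $2^n - 1$ evaluations are hopeless, which motivates the gradient-based approach pursued in the paper.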