Search for a command to run...
3) networks incorporate both kernels, which efficiently deal with highdimensional features, and the ability to capture correlations in structured data.We present an efficient algorithm for learning M 3 networks based on a compact quadratic program formulation. We provide a new theoretical bound for general-ization in structured domains. Experiments on the task of handwritten character recognition, demonstrate very significant gains over previous approaches. 1 Introduction In supervised classification, our goal is to classify instances into some set of discrete cat-egories. Recently, support vector machines (SVMs) have demonstrated impressive successes on a broad range of tasks, including document classification, character recognition,image recognition, and many more. SVMs owe a great part of their success to their ability to use kernels, allowing the classifier to exploit a very high-dimensional (possibly eveninfinite-dimensional) feature space. In addition to their empirical success, SVMs are also appealing due to the existence of strong generalization guarantees, derived from the margin-maximizing properties of the learning algorithm.