School of Mathematical and Statistical Sciences

Computational and Applied Math Proseminar

Thursday, October 15, 12:00 p.m. ECG 317

Wang Juh Chen

School of Mathematical and Statistical Sciences

A New SVM Model

Abstract: We propose a new formulation of the Support Vector Machine (SVM). It is based on ideas from the method of total least squares, in which assumed errors in the measured data (errors-in-features) are incorporated into the model design. For example, genetic data measured from microarrays are contaminated by noise. Moreover, for genetic data the number of features far exceeds the sample size, owing both to the high cost of the experiments and to the difficulty of recruiting patients with the necessary conditions. Traditional classification methods cannot be applied directly because of the ``curse of dimensionality'': the feature space (parameter space) is high-dimensional, while too few observations are available to obtain good estimates. SVM-based algorithms, which employ dual methods and a data-mapping kernel, have the potential to overcome this difficulty.

In our method, we introduce Lagrange multipliers and solve for the dual variables. Instead of finding the optimal value of the Lagrange function directly, we solve the nonlinear system of equations obtained from the Karush-Kuhn-Tucker (KKT) conditions. We also enforce the complementarity constraints and weight the linear system by the inverse covariance matrix of the measured data. To improve classification accuracy, we regularize the ill-posed linear problem that arises during the calculation. Other ways of improving the algorithm are also considered, such as the choice of initial point and methods to avoid over-fitting.
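To illustrate the general flavor of solving an SVM through its KKT conditions rather than by direct optimization, the following is a minimal sketch of a least-squares SVM (LS-SVM), a standard variant whose KKT conditions reduce to a single linear system. This is an illustrative analogue only, not the speaker's proposed model; the data, kernel choice, and Tikhonov term `reg` are hypothetical assumptions for the sketch.

```python
import numpy as np

def lssvm_train(X, y, gamma=10.0, reg=1e-8):
    """Solve the LS-SVM dual KKT system for the bias b and multipliers alpha.

    gamma: trade-off between margin and training error.
    reg:   small Tikhonov regularization added to stabilize the
           (possibly ill-conditioned) linear system, in the spirit of
           regularizing an ill-posed linear problem.
    """
    n = len(y)
    K = X @ X.T                                  # linear (dot-product) kernel
    Omega = (y[:, None] * y[None, :]) * K        # label-weighted kernel matrix
    # Assemble the KKT linear system [[0, y^T], [y, Omega + I/gamma]].
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + (1.0 / gamma + reg) * np.eye(n)
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                       # b, alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new):
    """Classify new points via sign(sum_i alpha_i y_i K(x, x_i) + b)."""
    K = X_new @ X_train.T
    return np.sign(K @ (alpha * y_train) + b)

# Tiny separable toy problem (hypothetical data).
X = np.array([[0.0, 0.0], [0.5, 0.2], [3.0, 3.0], [3.2, 2.8]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
```

Because the dual system only ever touches the data through the kernel matrix, the same structure scales to kernels over very high-dimensional inputs, which is the property the abstract exploits for data with many more features than samples.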

We apply the proposed algorithm to several public microarray data sets and to Positron Emission Tomography (PET) images. The results indicate that the proposed algorithm is competitive with the standard SVM and performs better in some cases. It also succeeds with the dot-product kernel data mapping, demonstrating the ability to classify data with millions of features, such as PET images, which is classically very difficult. The algorithm classifies the data sets more reliably even when errors are present in the features, and it gives improved results and higher sensitivity when classifying a set of Alzheimer's Disease (AD) PET images.