Kamelia Aryafar Dissertation Defense

Tuesday, December 1, 2015

8:00 AM-10:00 AM

Date/Time: Tuesday, December 1st, 8:00-10:00 AM
Location: University Crossings 153

Abstract:

Multimodal Information Retrieval and Classification

Kamelia Aryafar

Advisor: Ali Shokoufandeh, Ph.D.

Classification optimization is the cornerstone of machine learning models. The main goal of a classifier is to use all available data modalities during training to improve classification performance. This thesis approaches classification and retrieval models from two perspectives: (1) single-modality classification optimization, where only one data channel can be used; and (2) multimodal classification methods, where more than one data channel is available.

A classification system is composed of two main steps: extracting meaningful features that represent the dataset in a feature space, and optimizing the classifier. The first contribution of this thesis is based on the sparse approximation techniques introduced by Donoho, namely l1-regression. We introduce a sparsity-eager support vector machine (SVM) optimization that combines the ideas behind l1-regression and SVM to boost classification performance. We show that the sparsity-eager SVM optimization can be relaxed and formulated as a linear program. This linear program is then solved by fast gradient descent techniques, yielding an optimal set of classifier coefficients. We compare the performance of this classifier with state-of-the-art deep neural networks and baseline models on various public datasets.
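To illustrate how an l1-penalized SVM becomes a linear program, the block below writes out the standard textbook l1-norm SVM. This is an assumption made for illustration only: the abstract does not give the exact sparsity-eager formulation, and the symbols w, b, ξ, C, u, and v are not taken from the thesis.

```latex
% Standard l1-norm SVM (assumed form; the thesis may use a different one)
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \|w\|_1 + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & y_i \,(w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n.
\end{aligned}

% Splitting w = u - v with u, v >= 0 makes the objective linear:
\begin{aligned}
\min_{u,\,v,\,b,\,\xi}\quad & \mathbf{1}^\top (u + v) + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & y_i \,\big((u - v)^\top x_i + b\big) \ge 1 - \xi_i, \quad u, v, \xi \ge 0.
\end{aligned}
```

Because both the objective and the constraints are linear in (u, v, b, ξ), the second problem is a linear program, and the l1 penalty tends to drive many coefficients of w to exactly zero.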

The second contribution of this thesis is a vector space model of feature vectors that boosts classification performance. This representation is similar to the explicit semantic analysis (ESA) modeling of text documents introduced by Evgeniy Gabrilovich and Shaul Markovitch. In essence, the ESA representation extends term frequency-inverse document frequency (TF-IDF) modeling to multi-dimensional feature vectors. Through a set of experiments, we show that this representation can improve classification accuracy. ESA can also provide an efficient framework for cross-domain information retrieval, which we achieve by combining this representation with a canonical correlation optimization.
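For readers unfamiliar with the TF-IDF weighting that this representation builds on, here is a minimal sketch for plain tokenized text. It is not the thesis's extension to multi-dimensional feature vectors; the function name `tf_idf` and the toy documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency and the unsmoothed idf log(N / df).
    This is a textbook variant, assumed here for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy corpus.
docs = [
    ["sparse", "svm", "classifier"],
    ["sparse", "regression"],
    ["audio", "classifier"],
]
vectors = tf_idf(docs)
```

A term that appears in fewer documents receives a larger weight: in the first document, "svm" (one document) outweighs "sparse" (two documents).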

The third contribution of this thesis is based on two multimodal approaches to classification: a sparse linear integration model and l1-SVM. We extend the sparsity-eager support vector machine optimization to handle more than one data modality. As before, we formulate this optimization as a linear combination of the training samples, now in a multimodal setting. We then relax the optimization by replacing the l0-norm with the l1-norm, so that the final problem is solvable using convex optimization methods. We show that combining all available data channels can boost the classification accuracy of the sparsity-eager support vector machine classifier in comparison with baseline classifiers.

Contact Information

Evy Vega
ev56@drexel.edu

Location

Drexel University
University Crossings, room 152
3175 JFK Blvd.
Philadelphia, PA 19104

Audience

Graduate Students
Faculty
Staff