Title: Extreme Classification: A New Paradigm for Ranking & Recommendation
Speaker: Manik Varma, Microsoft Research
Abstract:
The objective in extreme multi-label classification is to learn a
classifier that can automatically tag a data point with the most
relevant subset of labels from a large label set. Extreme multi-label
classification is an important research problem because it not only
makes it possible to tackle applications with many labels but also
allows ranking and recommendation problems to be reformulated, with
certain advantages over existing formulations.
Our objective, in this talk, is to develop an extreme multi-label
classifier that is faster to train and more accurate at prediction than
the state-of-the-art Multi-label Random Forest (MLRF) algorithm [Agrawal
et al. WWW 13] and the Label Partitioning for Sub-linear Ranking (LPSR)
algorithm [Weston et al. ICML 13]. MLRF and LPSR learn a hierarchy to
deal with the large number of labels but optimize task-independent
measures, such as the Gini index or clustering error, to learn
the hierarchy. Our proposed FastXML algorithm achieves significantly
higher accuracies by directly optimizing an nDCG-based ranking loss
function. We also develop an alternating minimization algorithm for
efficiently optimizing the proposed formulation. Experiments reveal that
FastXML can be trained on problems with more than a million labels on a
standard desktop in eight hours using a single core and in an hour using
multiple cores.
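
For readers unfamiliar with nDCG, the sketch below computes the commonly
used nDCG@k measure for a single data point with binary label relevance.
It illustrates only the ranking measure named in the abstract, not the
FastXML objective or its alternating minimization procedure; the function
names and the toy labels are hypothetical and chosen for illustration.

```python
import math


def dcg_at_k(ranked_labels, relevant, k):
    """DCG@k with binary relevance.

    Each relevant label in the top k contributes 1 / log2(position + 1),
    where position is 1-based.
    """
    return sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position = rank + 1
        for rank, label in enumerate(ranked_labels[:k])
        if label in relevant
    )


def ndcg_at_k(ranked_labels, relevant, k):
    """nDCG@k: DCG@k divided by the best DCG@k achievable for this point."""
    ideal_hits = min(k, len(relevant))
    if ideal_hits == 0:
        return 0.0  # convention assumed here: no relevant labels scores 0
    ideal = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg_at_k(ranked_labels, relevant, k) / ideal


# Toy example (hypothetical labels): the classifier ranks four labels,
# and two of them are actually relevant to the data point.
predicted_ranking = ["a", "b", "c", "d"]
true_labels = {"a", "c"}
score = ndcg_at_k(predicted_ranking, true_labels, k=3)
```

Because the discount shrinks with position, placing a relevant label at
rank 3 instead of rank 2 lowers the score. This sensitivity to rank is
what makes such a measure closer to ranking and recommendation tasks than
a node-purity measure such as the Gini index.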
Bio:
Manik Varma is a researcher at Microsoft Research India. Manik received
a bachelor's degree in Physics from St. Stephen's College, University of
Delhi in 1997 and another one in Computation from the University of
Oxford in 2000 on a Rhodes Scholarship. He then stayed on at Oxford on a
University Scholarship and obtained a DPhil in Engineering in 2004.
Before joining Microsoft Research, he was a Post-Doctoral Fellow at the
Mathematical Sciences Research Institute, Berkeley. He has been an
Adjunct Professor at the Indian Institute of Technology (IIT) Delhi in
the Computer Science and Engineering Department since 2009 and jointly
in the School of Information Technology since 2011. His research
interests lie in the areas of machine learning, computational
advertising and computer vision. He has served as an Area Chair for
machine learning and computer vision conferences such as ACCV, CVPR,
ICCV, ICML, ICVGIP and NIPS. He has been awarded the Microsoft Gold Star
award and has won the PASCAL VOC Object Detection Challenge. Classifiers
that he has developed are running live on millions of machines around
the world protecting them from viruses and malware.