Title: Finding the needle in a structured haystack: scalable optimization for rare item detection and applications
Speaker: Purushottam Kar, Microsoft Research India
Abstract:
The modern machine learning application is characterized by enormous
volumes of data, as well as a demand for extreme precision. For
instance, breast cancer detection requires cancerous tissue to be
detected in mammogram patches where cancerous tissue is typically
present in less than 0.61% of the patches. An interest in an exceedingly
rare class of items or patterns is similarly observed in domains such as
anomaly detection, bio-informatics and medicine. Such tasks require
sensitivity to class imbalance as well as prediction characteristics.
Taken together, these extreme requirements preclude a direct application
of traditional machine learning techniques designed for large-scale data
processing, such as online and stochastic gradient methods.
In this talk, I will describe advances that bring scalable learning
techniques to bear on extreme learning problems such as those described
above. Our methods cover a wide variety of learning scenarios, can
operate in situations where only streaming access is available to data,
offer provably optimal solutions, and can be shown to significantly
improve the state-of-the-art in learning tasks in terms of training time
and prediction quality.
Bio:
Purushottam Kar is a post-doctoral Research Fellow with the Machine
Learning and Optimization Group at Microsoft Research India. He obtained
his Ph.D. in Computer Science and Engineering at the Indian Institute of
Technology Kanpur. He is interested in the foundations of machine
learning and in building scalable learning systems. His Ph.D.
dissertation was awarded the 2014 doctoral dissertation award by the
Indian Unit for Pattern Recognition and Artificial Intelligence. He is
also a recipient of the 2010 Microsoft Research India Ph.D. fellowship.