Title: Finding the needle in a structured haystack: scalable optimization for rare item detection and applications

Speaker: Purushottam Kar, Microsoft Research India

Abstract: Modern machine learning applications are characterized by enormous volumes of data as well as a demand for extreme precision. For instance, breast cancer detection requires identifying cancerous tissue in mammogram patches, where such tissue is typically present in less than 0.61% of the patches. A similar interest in an exceedingly rare class of items or patterns arises in domains such as anomaly detection, bioinformatics, and medicine. Such tasks require sensitivity to class imbalance as well as prediction characteristics. Taken together, these extreme requirements preclude a direct application of traditional machine learning techniques designed for large-scale data processing, such as online and stochastic gradient methods.
In this talk I will describe advances that bring scalable learning techniques to bear on extreme learning problems such as those described above. Our methods cover a wide variety of learning scenarios, can operate in situations where only streaming access to the data is available, offer provably optimal solutions, and can be shown to significantly improve on the state of the art in learning tasks in terms of both training time and prediction quality.

Bio: Purushottam Kar is a post-doctoral Research Fellow with the Machine Learning and Optimization Group at Microsoft Research India. He obtained his Ph.D. in Computer Science and Engineering from the Indian Institute of Technology Kanpur. He is interested in the foundations of machine learning and in building scalable learning systems. His Ph.D. dissertation received the 2014 doctoral dissertation award from the Indian Unit for Pattern Recognition and Artificial Intelligence. He is also a recipient of the 2010 Microsoft Research India Ph.D. fellowship.