Title: An Excursion in Probabilistic Hashing Techniques for Big Data

Speaker: Anshumali Shrivastava, Rice University

Abstract: Large-scale machine learning and data mining applications constantly deal with datasets at TB scale, and these are expected to reach PB levels soon. At this scale, conventional algorithms fail, and simple data mining operations such as search, learning, and clustering become challenging. In this talk, I will introduce probabilistic hashing techniques for large-scale search and learning. I will show how the old hashing framework, originally meant for sub-linear search, can be converted into fast learning algorithms. I will talk about our recent success in constructing hash functions for dot products by making use of asymmetry. Such a construction is not possible in the conventional setting and was a known hard problem. I will further show the direct consequence of hashing inner products in speeding up popular learning algorithms. Later, I will discuss recent improvements that I found in some decade-old textbook hashing algorithms, including the fastest way of performing minwise hashing in practice. I will demonstrate the utility of the above techniques on various real applications, including search, learning, collaborative filtering, and record linkage.

Bio: Anshumali Shrivastava is an Assistant Professor in the Department of Computer Science at Rice University, with joint appointments in the Statistics and ECE departments. His broad research interests include large-scale machine learning, randomized algorithms for big data systems, and graph mining. His research on hashing inner products won the Best Paper Award at NIPS 2014, while his work on representing graphs won the Best Paper Award at IEEE/ACM ASONAM 2014. He obtained his Ph.D. in computer science from Cornell University. Before joining Cornell, he worked as a scientist at FICO (Fair Isaac Corp.) Research in Bangalore, India. Anshumali earned his bachelor's and master's degrees in mathematics and computing from the Indian Institute of Technology (IIT) Kharagpur, India.