Title: Flexible Bayesian Models for Complex and Heterogeneous Data
Speaker: Piyush Rai, Duke University
Abstract:
The emergence of the Big Data phenomenon has led to tremendous growth not
only in the scale of the data but also in its complexity. Modern data
analysis problems now routinely involve data that can be (usually a mix
of) high-dimensional, noisy, incomplete, heterogeneous, multi-way
(tensor), multi-relational, time-evolving, etc. Moreover, in many
settings, learning from data involves not just learning a single task
but multiple (possibly related) tasks that may benefit from each other
via a proper "sharing" of information. In this talk, I will start
with a brief overview of my research on how probabilistic
latent variable models, in particular Bayesian and nonparametric
Bayesian approaches, can lead to considerably flexible ways of modeling
such complex data. I will then talk specifically about some of my recent
work on (1) learning from sparsely observed tensor data that encodes
multi-way relations over objects from multiple (typically 3 or more)
sets, and (2) learning from heterogeneous multi-modality data,
comprising multiple feature- and/or similarity-based representations,
with a significant amount of missing data. I will also discuss some
specific applications of these frameworks to problems in recommender
systems, information retrieval, cognitive neuroscience, and in modeling
graph-structured and multi-relational data such as complex networks and
knowledge bases. I will conclude with some directions for future work.
Bio:
Piyush Rai is a postdoctoral researcher at Duke University and is
associated with the interdisciplinary Information Initiative at Duke
(iiD). He received his PhD (2007-2012) in Computer Science from the School of
Computing, University of Utah. Prior to that, he received his B.Tech. in
Computer Science and Engineering from the Indian Institute of
Technology, BHU, Varanasi. His research interests are in statistical
machine learning, primarily (but not limited to) probabilistic modeling
and Bayesian statistics, and also in applying statistical learning
methods to solve problems in computer vision, natural language
processing, information retrieval, robotics, computational biology, and
computer systems.