Title: Ensemble Learning in the Crowd-sourcing Era
Speaker: Gaurav Pandey, Mount Sinai School of Medicine
Abstract:
Crowd-sourcing platforms such as Kaggle, InnoCentive and the DREAM
challenges are transforming our ability to address hard predictive
modeling tasks by leveraging the "wisdom of the crowds". In this talk, I
will present our work on enhancing the predictive power of these
platforms, especially DREAM challenges, through two avenues: (1) a
collaborative-competitive setup of prediction challenges/competitions
and (2) learning heterogeneous ensemble predictors. In the
collaborative-competitive setup, challenge participants are encouraged
both to compete and to collaborate by sharing ideas, and both mechanisms
improve predictive power. Heterogeneous ensembles are a
data-driven method to achieve this improvement by "smartly" assimilating
the knowledge embedded in predictions submitted by individual
participants. I will also share results demonstrating the potential of
these approaches for difficult biomedical problems, such as the
prediction of protein function and cancer phenotypes.
Bio:
Gaurav Pandey is an Assistant Professor in the Department of Genetics
and Genomic Sciences at the Mount Sinai School of Medicine (New York)
and is part of the newly formed Institute for Genomics and Multiscale
Biology. He completed his Ph.D. in computer science and engineering at
the University of Minnesota, Twin Cities in 2010, and subsequently
completed a post-doctoral fellowship at the University of California,
Berkeley. His primary fields of interest are computational biology,
genomics and large-scale data analysis and mining, and he has published
extensively in these areas.