Title: Neural Attention Models for Natural Language Grounding and Generation
Speaker: Mohit Bansal, TTI-Chicago
Abstract:
Neural sequence-to-sequence, encoder-decoder models have recently shown
strong promise in the areas of machine translation and image captioning
as end-to-end models that require little domain-specific knowledge or
resources. Incorporating an attention or alignment step into this
encoder-decoder architecture helps further by learning to focus on parts
of the input sequence that are salient for generating a particular step
in the output sequence. We employ and improve this neural attention
architecture for the tasks of natural language grounding and selective
generation. Our grounding task is that of direction following, a skill
that is essential to realizing effective autonomous agents. Our model
uses long short-term memory recurrent neural networks (LSTM-RNNs) to
encode natural language instructions and decode to action sequences
based upon a world state representation, using a novel multi-input
aligner. Our selective generation task is the joint task of content
selection and surface realization, where we first encode the full set of
over-determined database event records (e.g., in weather forecasting and
sportscasting) via an LSTM-RNN, then utilize a novel coarse-to-fine
(hierarchical) aligner to identify the small subset of salient records
to talk about, and finally employ a decoder to generate free-form
descriptions of the aligned, selected records. In contrast to existing
methods, our models use no specialized resources (e.g., parsers,
lexicons, features). They are therefore generalizable, yet they still
achieve the best results reported to date on multiple benchmark
datasets. We also present various ablations to elucidate the
contributions and limitations of the primary components of our models.
This is joint work with Hongyuan Mei and Matthew R. Walter at TTI-Chicago.
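The attention step described above learns, at each output step, a soft weighting over the input-sequence encodings and summarizes them into a context vector for the decoder. As a minimal illustration, the sketch below implements generic dot-product soft attention over a matrix of encoder states; it is a simplified stand-in, not the talk's exact multi-input or coarse-to-fine (hierarchical) aligner, and all names here are illustrative.

```python
import numpy as np

def soft_attention(encoder_states, decoder_state):
    """Generic soft attention: score each encoder state against the
    current decoder state, normalize the scores with a softmax, and
    return the resulting weights plus the weighted context vector.

    encoder_states: (T, H) array, one hidden state per input step
    decoder_state:  (H,) current decoder hidden state
    """
    scores = encoder_states @ decoder_state          # (T,) alignment scores
    scores = scores - scores.max()                   # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over input steps
    context = weights @ encoder_states               # (H,) weighted sum
    return weights, context

# Toy usage: 4 encoder steps, hidden size 3
rng = np.random.default_rng(0)
enc = rng.normal(size=(4, 3))
dec = rng.normal(size=(3,))
weights, context = soft_attention(enc, dec)
```

The weights form a distribution over input positions, which is what lets the decoder "focus on" the salient records or instruction words when generating each output step.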
Bio:
Dr. Mohit Bansal is a research assistant professor at TTI-Chicago. He
received his Ph.D. and M.S. in CS from UC Berkeley in 2013 (where he was
advised by Dan Klein) and his B.Tech. in CSE from IIT Kanpur in
2008. His research interests are in statistical natural language
processing and machine learning, with a particular interest in semantics
(lexical, compositional, multimodal), syntactic parsing, question
answering, and deep learning. He was the recipient of a Google Faculty
Research Award in 2014, an IBM Faculty Award in 2014, an ACL Long Best
Paper Honorable Mention (top-5 paper) in 2014, and a Qualcomm Innovation
Fellowship in 2011. Webpage: http://ttic.uchicago.edu/~mbansal/