Title: Social Network Extraction from Text
Speaker: Apoorv Agarwal, Columbia University
Abstract:
Social networks are usually built using metadata such as
sender-recipient email links or self-declared friendships. However, a
large part of a social network is expressed and maintained through the
use of language. For example, in an email people might mention whom they
"talk to" or have "dinner with." Because such social interactions are
expressed through a wide variety of linguistic constructs, extracting
them requires natural language processing (NLP) and machine learning
(ML) techniques.
In this talk, I will present a formalization of networks based on
interactions. We have, thus far, extracted networks from three broad
genres of text: 19th-century British literature, movie screenplays, and
the Enron email corpus. I will present technical challenges in dealing
with each genre followed by NLP and ML solutions and applications.
Bio:
Apoorv Agarwal is a sixth-year Ph.D. candidate in the Computer Science
department at Columbia University, NY. His areas of interest and
specialization are Natural Language Processing and Machine Learning. He
is one of the recipients of the 2013-14 IBM PhD fellowship for his work
with the DeepQA team that built Watson (a machine capable of answering
Jeopardy! questions). His work on social network extraction from text
was demonstrated at the DARPA demo day held in May 2014 at the
Pentagon and at the NYC Media Lab Annual Summit 2014. He is a recipient of
the NSF Innovation Corps (I-Corps) award for Aug-Dec 2014. The award
helped him identify commercial applications for his research. He is now
one of the two founders of Text IQ, a start-up that aims to make the
document review process faster for attorneys.
Relevant publications can be found on my Google Scholar page.
Homepage (last updated in 2012!)