CSE Seminar
2024 talks
- Knowledge Guides for Scalable Machine Learning
Abstract:Speaker: Prof. Anish Arora from Ohio State University
Date: 2024-01-10 Time: 16:00:00 (IST) Venue: Bharti-501 This talk illustrates a version of Hoare's Law, which says "inside every large (Machine Learned) model there is a small model waiting to get out". Leveraging domain knowledge, we show, for problems in classical networking and quantum mechanics, how small models can be learned with small amounts of data, relative to those learned by black-box methods, while still achieving generalizability in not just interpolation but also extrapolation, as well as explainability.
Bio:Anish Arora is Chairperson of Computer Science and Engineering at The Ohio State University, where he also directs the 5G-OH Connectivity Center of 5G and Broadband Networking and co-directs the NSF AI-EDGE Institute, which is developing foundations and architecture of intelligence in wireless edge networks. He is CTO of The Samraksh Company, which has commercialized intelligent low-power wireless sensor networks. He graduated from the first batch of Computer Science at IIT, Delhi and is happy to be still playing squash on its campus with anyone who cares to play.
2023 talks
- Text image analysis from low resource dataset
Abstract:Speaker: Prof. Partha Pratim Roy
Date: 2023-08-23 Time: 12:00:00 (IST) Venue: Bharti-501 Text image understanding has long been an active research area because of its complexity and the challenges posed by a wide variety of shapes. For benchmarking such systems, datasets are a necessary and important resource to develop. Deep learning-based text image analytics tasks, such as detection and recognition, have shown impressive results under full supervision on completely labeled datasets. However, creating such datasets with a large volume of samples is a challenging and time-consuming task. This presentation will highlight a few solutions for effective analysis of textual images when data annotation is scarce.
Bio:Dr. Partha Pratim Roy (FIETE, SMIEEE) is presently working as an Associate Professor in the Department of Computer Science and Engineering, Indian Institute of Technology (IIT), Roorkee. He received his Masters in 2006 and Ph.D. in 2010 from Universitat Autonoma de Barcelona, Spain. He did postdoctoral stays in France and Canada from 2010 to 2013. Dr. Roy gathered industrial experience while working for about 4 years in TCS and Samsung. In Samsung, he was a part-leader of the Computer Vision research team. He is the recipient of the "Best Student Paper" awarded by the International Conference on Document Analysis and Recognition (ICDAR), 2009, Spain. He has published more than 200 research papers in various international journals, and conference proceedings. He has co-organized several international conferences and workshops, has been a member of the Program Committee of a number of international conferences, and acts as a reviewer for many journals in the field. His research interests include Pattern Recognition, Document Image Processing, Biometrics, and Human-Computer Interaction. He is presently serving as Associate Editor of ACM Transactions on Asian and Low-Resource Language Information Processing, Neurocomputing, IET Image Processing, IET Biometrics, IEICE Transactions on Information and Systems, and Springer Nature Computer Science.
2022 talks
- LP-Duality Theory and the Cores of Games
Abstract:Speaker: Vijay V. Vazirani, UC Irvine
Date: 2022-12-21 Time: 12:00:00 (IST) Venue: Bharti-501 The core is a quintessential solution concept in cooperative game theory and LP-duality theory has played a central role in its study, right from its early days to the present time. The classic 1971 paper of Shapley and Shubik showed the "right" way of exploiting this theory, in the context of characterizing the core of the assignment game.
The LP-relaxation of this game has the following key property: the polytope defined by its constraints has integral vertices; in this case, they are matchings in the underlying graph. Similar characterizations for several basic combinatorial optimization games followed; throughout, this property was established by showing that the underlying linear system is totally unimodular (TUM).
We will first exploit TUM further via a very general formulation due to Hoffman and Kruskal (1956). The way to take this methodology to its logical next step is to use total dual integrality (TDI). In the process, we address new classes of games which have their origins in two major theories within combinatorial optimization, namely perfect graphs and polymatroids.
Whereas the core of the assignment game is always non-empty, that of the general graph matching game can be empty. We show how to salvage the situation, again using LP-duality in a fundamental way. Based on:
https://arxiv.org/pdf/2202.00619.pdf
https://arxiv.org/pdf/2209.04903.pdf
https://www.sciencedirect.com/science/article/pii/S0899825622000239?via%3Dihub
Bio:Vijay Vazirani got his undergraduate degree from MIT in 1979 and his PhD from the University of California, Berkeley in 1983. He is currently a Distinguished Professor at the University of California, Irvine.
Vazirani has made fundamental contributions to several areas of the theory of algorithms, including algorithmic matching theory, approximation algorithms and algorithmic game theory, as well as complexity theory. His current work is on algorithms for matching markets; his co-edited book on this topic will be published by Cambridge University Press in March 2023. Here is a flyer: https://www.ics.uci.edu/~vazirani/flyer.pdf
Vazirani is an ACM Fellow, a Guggenheim Fellow and the recipient of the 2022 INFORMS John von Neumann Theory Prize.
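For readers unfamiliar with the setup in the abstract above, the following is a standard statement (our sketch, not taken from the talk) of the assignment game's LP-relaxation and its dual; the Shapley-Shubik result characterizes the core as the set of optimal dual solutions.
```latex
% Assignment game on bipartite vertex sets U, V with edge values w_{uv}.
\[
\begin{array}{ll}
\text{(P)}\quad \max \displaystyle\sum_{(u,v)\in E} w_{uv}\,x_{uv}
  & \text{s.t. } \displaystyle\sum_{v} x_{uv} \le 1 \;\forall u,\quad
    \displaystyle\sum_{u} x_{uv} \le 1 \;\forall v,\quad x \ge 0 \\[8pt]
\text{(D)}\quad \min \displaystyle\sum_{u\in U} p_u + \sum_{v\in V} q_v
  & \text{s.t. } p_u + q_v \ge w_{uv} \;\forall (u,v)\in E,\quad p,\,q \ge 0
\end{array}
\]
% Total unimodularity gives (P) an integral optimal vertex, i.e. a matching;
% Shapley--Shubik (1971): the core of the assignment game is exactly the set of
% optimal dual solutions (p, q).
```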
- Building Secure Systems Bottom Up: Hunting down hardware security vulnerabilities
Abstract:Speaker: Jeyavijayan (JV) Rajendran
Date: 2022-12-12 Time: 16:30:00 (IST) Venue: SIT-001 Hardware is at the heart of computing systems. For decades, software was considered error-prone and vulnerable. However, recent years have seen a rise in attacks exploiting hardware vulnerabilities. Such vulnerabilities are prevalent in hardware for several reasons: First, the existing functional verification and validation approaches do not account for security, motivating the need for new and radical approaches such as hardware fuzzing. Second, existing defense solutions, mostly based on heuristics, do not undergo the rigorous red-teaming exercises that cryptographic algorithms do; I will talk about how emerging artificial intelligence (AI) can rapidly help red-team such techniques. Last and most important, students and practitioners who are typically trained in design, testing, and verification are not rigorously trained in cybersecurity -- for many reasons, including a lack of resources, time, and methodologies; I will talk about how AI can be incorporated into (hardware) cybersecurity education.
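As a rough illustration of the coverage-guided loop that hardware fuzzing borrows from software fuzzing, here is a minimal sketch; the `run_dut` callback, the bit-flip mutation, and the coverage sets are our assumptions, not the speaker's tooling.
```python
# Hedged sketch of a coverage-guided fuzzing loop: mutate inputs, keep those
# that reach new coverage, and flag security-property violations. Toy software
# analogue only; real hardware fuzzers drive RTL simulation and coverage maps.
import random

def fuzz(run_dut, seed_inputs, iterations=1000):
    # run_dut(stimulus) is assumed to return (coverage: set, violated: bool).
    corpus, seen_coverage, bugs = list(seed_inputs), set(), []
    for _ in range(iterations):
        parent = random.choice(corpus)
        child = bytearray(parent)                     # seeds assumed non-empty bytes
        child[random.randrange(len(child))] ^= 1 << random.randrange(8)  # flip one bit
        coverage, violated = run_dut(bytes(child))
        if violated:
            bugs.append(bytes(child))
        if not coverage <= seen_coverage:             # reached new states: keep it
            seen_coverage |= coverage
            corpus.append(bytes(child))
    return bugs
```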
Bio:Jeyavijayan (JV) Rajendran is an Assistant Professor in the Department of Electrical and Computer Engineering at Texas A&M University. He obtained his Ph.D. degree from New York University in August 2015. His research interests include hardware security and computer security. His research has won the NSF CAREER Award in 2017, the ONR Young Investigator Award in 2022, the IEEE CEDA Ernest Kuh Early Career Award in 2021, the ACM SIGDA Outstanding Young Faculty Award in 2019, the Intel Academic Leadership Award, the ACM SIGDA Outstanding Ph.D. Dissertation Award in 2017, and the Alexander Hessel Award for the Best Ph.D. Dissertation in the Electrical and Computer Engineering Department at NYU in 2016, along with several best student paper awards. He organizes and has co-founded Hack@DAC, a student security competition co-located with DAC, and SUSHI.
- Efficient Knowledge Extraction and Visual Analytics of Big Data at Scale
Abstract:Speaker: Dr. Soumya Dutta, LANL
Date: 2022-04-18 Time: 12:00:00 (IST) Venue: Teams With ever-increasing computing power, big data applications nowadays produce data sets that can reach the order of petabytes and beyond. Knowledge extracted from such extreme-scale data promises unprecedented advancements on various scientific fronts, e.g., earth and space sciences, energy applications, chemistry, material sciences, and fluid dynamics, to name a few. However, the complex and extreme nature of these big data sets is currently pushing the very limits of our analytical capabilities. Therefore, finding meaningful and salient information efficiently and compactly in these vast seas of data, and then presenting it effectively and interactively, has emerged as one of the fundamental problems in modern computer science research.
My talk will characterize various aspects of big data using the 5 Vs, namely Volume, Velocity, Variety, Veracity, and Value, and present novel strategies for efficient data analytics and visualization. I will present state-of-the-art data exploration methodologies that encompass the end-to-end exploration pipeline, from the time the data is generated until it is analyzed and visualized interactively to advance scientific discovery. In my talk, statistical and machine learning-based compact data models will be discussed that are significantly smaller than the raw data and can be used as a proxy for the data to answer a broad range of scientific questions efficiently. I will demonstrate successful applications of such model-based visual analytics techniques with examples from various scientific domains. To conclude, I will briefly highlight my broad-scale future research plan and its implications.
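To make the idea of a compact statistical proxy model concrete, here is a minimal sketch (our illustration, not the speaker's method) that replaces a large 3-D field with per-block mean and standard-deviation summaries that downstream analysis or visualization could query instead of the raw data.
```python
# Hedged sketch of a "compact statistical data model": summarize a large 3-D
# field by per-block Gaussian parameters and answer queries from the summary.
import numpy as np

def block_summaries(field, block=16):
    nz, ny, nx = field.shape
    out = {}
    for z in range(0, nz, block):
        for y in range(0, ny, block):
            for x in range(0, nx, block):
                b = field[z:z+block, y:y+block, x:x+block]
                out[(z, y, x)] = (b.mean(), b.std())   # one (mean, std) per block
    return out                                          # far smaller than `field`

field = np.random.rand(64, 64, 64)
print(len(block_summaries(field)))   # 64 block summaries instead of 262144 values
```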
Bio:Soumya Dutta is a full-time Scientist-2 in the Information Sciences group (CCS-3) at Los Alamos National Laboratory (LANL). Before this, Dr. Dutta was a postdoctoral researcher in the Applied Computer Sciences group (CCS-7) at LANL from June 2018 - July 2019. Dr. Dutta obtained his MS and Ph.D. degrees in Computer Science and Engineering from the Ohio State University in May 2017 and May 2018 respectively. Prior to joining Ohio State, Dr. Dutta completed his B. Tech. in Electronics and Communication Engineering from the West Bengal University of Technology in 2009 and then briefly worked in TCS Kolkata from Feb. 2010 - Jul. 2011. His current research interests include Big Data Analytics & Visualization, Statistical Techniques for Big Data, In Situ Analysis, Machine Learning for Visual Computing, and HPC. Dr. Dutta’s research has won Best Paper Award at ISAV 2021 and Best Paper Honorable Mention Award at IEEE Visualization 2016. He was nominated for the Ohio State Presidential Fellowship in 2017 and was also recently selected for the Best Reviewer, Honorary Mention Award for the IEEE TVCG journal for the year 2021. He is a member of IEEE and ACM.
- Continual Learning via Efficient Network Expansion
Abstract:Speaker: Dr. Vinay Kumar Verma
Date: 2022-04-12 Time: 12:00:00 (IST) Venue: Teams As neural networks are increasingly being applied to real-world applications, mechanisms to address distributional shift and sequential task learning without forgetting are critical. Methods incorporating network expansion have shown promise by naturally adding model capacity for learning new tasks while simultaneously avoiding catastrophic forgetting. However, the growth in the number of additional parameters of many of these methods can be computationally expensive at larger scales, at times prohibitively so. In this talk, I will discuss two techniques based on task-specific calibration of the feature maps, with negligible growth in the number of parameters for each new task. I will discuss the application of these techniques to a variety of problem settings in continual learning, such as task-incremental as well as class-incremental learning, and continual learning for deep generative models. This talk is based on joint work with Kevin Liang, Nikhil Mehta, Pravendra, Pratik, Lawrence Carin, and Piyush Rai.
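A minimal sketch of one way such task-specific calibration of feature maps can be realized, assuming a frozen shared backbone and a FiLM-style per-channel scale and shift per task; this illustrates the general idea, not the exact method presented in the talk.
```python
# Hedged sketch: each new task adds only a per-channel scale and shift applied
# to feature maps from a frozen shared backbone, so per-task parameter growth
# is negligible compared to duplicating or expanding the network.
import torch
import torch.nn as nn

class TaskCalibration(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))   # per-channel scale
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))   # per-channel shift

    def forward(self, feat):            # feat: (B, C, H, W) from the frozen backbone
        return self.gamma * feat + self.beta

calibrations = nn.ModuleDict()          # one tiny module per task

def add_task(task_id, channels=256):
    calibrations[str(task_id)] = TaskCalibration(channels)
```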
Bio:Vinay Kumar Verma completed his postdoc at Duke University under the guidance of Prof. Lawrence Carin. Before joining Duke, Vinay completed his PhD in the Department of Computer Science and Engineering at IIT Kanpur, advised by Piyush Rai. Vinay's research interests are in deep learning and computer vision. In particular, he has worked extensively on problems related to deep learning with little/no supervision (zero-shot and few-shot learning) and deep model compression. More recently, during his postdoctoral research, he has been working on continual learning for supervised as well as unsupervised learning problems. Vinay's PhD work also received the outstanding thesis award from IIT Kanpur. He recently joined Amazon, India as an applied scientist and is working on visual insights for Softline trends.
- Machine Unlearning
Abstract:Speaker: Dr. Murari Mandal, Postdoctoral research fellow in the School of Computing, National University of Singapore (NUS)
Date: 2022-04-06 Time: 12:00:00 (IST) Venue: Teams Consider a scenario where it is desired that the information pertaining to the data belonging to a single entity or multiple entities be removed from an already trained machine learning (ML) model. Unlearning the data observed during the training of an ML model is an important task that can play a pivotal role in fortifying the privacy and security of ML-based applications. In this talk, I raise the following questions: (i) can we unlearn a class or classes of data from an ML model? (ii) can we make the process of unlearning fast and scalable to large datasets, and generalize it to different deep networks? and (iii) can we do unlearning without having any kind of access to the training data? I will talk about two novel machine unlearning frameworks. The first method offers an efficient solution through error-maximizing noise generation and impair-repair based weight manipulation. It enables excellent unlearning while substantially retaining the overall model accuracy. The second method introduces unlearning in a zero-shot setting. The newly introduced zero-shot machine unlearning caters to the extreme but practical scenario where zero original data samples are available for use. I will discuss two novel solutions for zero-shot machine unlearning based on (a) error minimizing-maximizing noise and (b) gated knowledge transfer.
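A minimal sketch of the error-maximizing noise idea, assuming a trained PyTorch classifier `model`; the optimizer, step counts, and the impair/repair schedule are our assumptions, and the speaker's actual framework differs in detail.
```python
# Hedged sketch: learn a noise batch whose loss for the forget class is maximized,
# then briefly fine-tune ("impair") on it to damage that class, followed by a
# "repair" pass on retained data (not shown) to restore remaining accuracy.
import torch
import torch.nn.functional as F

def error_maximizing_noise(model, forget_class, shape, steps=100, lr=0.1):
    """Return a noise batch that maximizes classification error for `forget_class`."""
    noise = torch.randn(shape, requires_grad=True)          # e.g. (B, C, H, W)
    opt = torch.optim.Adam([noise], lr=lr)
    target = torch.full((shape[0],), forget_class, dtype=torch.long)
    for _ in range(steps):
        opt.zero_grad()
        loss = -F.cross_entropy(model(noise), target)        # negate => ascend the loss
        loss.backward()
        opt.step()
    return noise.detach()
```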
Bio:Murari is a Postdoctoral research fellow in the School of Computing, National University of Singapore (NUS). His research goal is to develop practical and principled approaches to (i) make ML models privacy-aware and adaptive to evolving data privacy rules and regulations, and (ii) empower individual users with the facility to derive benefits from their own personal data. He is currently working with Prof. Mohan Kankanhalli and Prof. Jussi Keppo. He worked as a lecturer at IIIT Kota prior to starting his Postdoctoral position. His earlier work focused on developing deep learning models for vision applications including moving object detection, video anomaly detection, emotion analysis, and image quality enhancement. A significant part of his current research is focused on machine unlearning, data valuation, and privacy and security in deep learning. He has served as a program committee member for AAAI'22 and a technical committee member for the Earthvision (2021, 2022) CVPR workshops. He is a regular reviewer for top-tier conferences and journals including AAAI, ICML, WACV, BMVC, TIP, TMM, TETCI, TITS, TII, and TCSVT.
Website: https://www.comp.nus.edu.sg/~murari/
- Efficient Methods for Deep Learning
Abstract:Speaker: Dr. Pravendra Singh, CSE, IIT Roorkee
Date: 2022-03-25 Time: 12:00:00 (IST) Venue: Teams While convolutional neural networks (CNNs) have achieved remarkable performance on various supervised and unsupervised learning tasks, they typically consist of a massive number of parameters and massive computations. This results in significant memory requirements as well as a computational burden. Consequently, there is a growing need for efficient methods in deep learning to reduce parameters and computations while maintaining the predictive power of neural networks. In this talk, I present various approaches that aim to make deep learning efficient. This talk is organized in the following parts. (1) Model compression: In the first part, we present works on model compression methods, i.e., methods that reduce computations and parameters of deep CNNs without hurting model accuracy. Filter pruning compresses the deep CNN model by pruning unimportant or redundant convolutional filters in the deep CNN. Filter pruning approaches evaluate the importance of an entire convolutional filter and prune filters based on some criteria, followed by re-training to recover the accuracy drop. We have proposed various approaches to evaluate the importance of a convolutional filter. (2) Design efficiency: In the second part, we present efficient architectures that use separable convolutions to reduce the model size and complexity. These efficient architectures use pointwise, depthwise, and groupwise convolutions to achieve compact models. We propose a new type of convolution operation using heterogeneous kernels. The proposed Heterogeneous Kernel-Based Convolution (HetConv) reduces the computation (FLOPs) and the number of parameters as compared to the standard convolution operation while maintaining representational efficiency. (3) Performance efficiency: In the third part, we present methods that adaptively recalibrate channel-wise feature responses, by explicitly modeling interdependencies between channels, to enhance the performance of deep CNNs. These methods increase deep CNN accuracy without significantly increasing computations/parameters. Recently, researchers have tried to boost the performance of CNNs by re-calibrating the feature maps produced by the convolutional filters, e.g., Squeeze-and-Excitation Networks (SENets). These approaches have achieved better performance by exciting the important channels or feature maps while diminishing the rest. However, in the process, architectural complexity has increased. We propose an architectural block that introduces much lower complexity than the existing methods of CNN performance boosting while performing significantly better than them.
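For concreteness, here is a minimal sketch of a generic filter-pruning criterion (L1-norm importance); the talk proposes more refined importance measures, so treat this only as an illustration of how filters are ranked and selected for pruning.
```python
# Hedged sketch: rank convolutional filters by a simple importance score
# (L1 norm of their weights) and pick the least important ones to prune.
import torch
import torch.nn as nn

def l1_filter_importance(conv: nn.Conv2d):
    # One score per output filter: sum of absolute weights over (in, kH, kW).
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def filters_to_prune(conv: nn.Conv2d, prune_ratio: float = 0.3):
    scores = l1_filter_importance(conv)
    k = int(prune_ratio * conv.out_channels)
    return torch.argsort(scores)[:k]          # indices of the least important filters

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
print(filters_to_prune(conv, 0.25).shape)     # 32 filter indices to remove, then re-train
```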
Bio:Pravendra Singh is currently working as an Assistant Professor in the Department of Computer Science and Engineering at the Indian Institute of Technology Roorkee. He obtained his Ph.D. from the CSE Department, IIT Kanpur. He also received the Outstanding Ph.D. Thesis Award from IIT Kanpur. His research explores techniques that aim to make deep learning more efficient. In particular, his research interests include Model Compression, Few-Shot Learning, Continual Learning, and Deep Learning. His research has been published in various top-tier conferences and journals like CVPR, NeurIPS, AAAI, IJCAI, WACV, IJCV, Pattern Recognition, Neurocomputing, IEEE JSTSP, Knowledge-Based Systems, and others.
- Generalized Points-to Graphs: A Precise and Scalable Abstraction for Points-to Analysis
Abstract:Speaker: Dr. Pritam Gharat, Microsoft Research
Date: 2022-03-25 Time: 16:00:00 (IST) Venue: Teams Computing precise (fully flow- and context-sensitive) and exhaustive (as opposed to demand-driven) points-to information is known to be expensive. Therefore, most approaches to precise points-to analysis begin with a scalable but imprecise method and then seek to increase its precision. In contrast, we begin with a precise method and increase its scalability. We create naive but possibly non-scalable procedure summaries and then use novel optimizations to compact them while retaining their soundness and precision.
We propose a novel abstraction called the generalized points-to graph (GPG), which views points-to relations as memory updates and generalizes them using counts of indirection levels, leaving the unknown pointees implicit. This allows us to construct GPGs as compact representations of bottom-up procedure summaries in terms of memory updates and the control flow between them. Their compactness is ensured by strength reduction (reducing the indirection levels), control flow minimization (eliminating control flow edges while preserving soundness and precision), and call inlining (enhancing the opportunities for these optimizations).
The effectiveness of GPGs lies in the fact that they discard as much control flow as possible without losing precision. As a result, GPGs are very small even for main procedures that contain the effect of the entire program. This allows our implementation to scale to 158 kLoC for C programs. At a more general level, GPGs provide a convenient abstraction to represent and transform memory in the presence of pointers.
The preliminary ideas are published in Static Analysis Symposium (SAS), September 2016 and the full work is published in the ACM Transactions on Programming Languages and Systems (TOPLAS), May 2020.
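A toy sketch of the edge abstraction, assuming the reading that a GPG edge records the indirection levels of the left- and right-hand sides of a memory update; the exact labeling convention is defined in the SAS 2016 / TOPLAS 2020 papers, so the encoding below is only indicative.
```python
# Hedged sketch: record each memory update as an edge (lhs, i, rhs, j), where
# i and j are indirection levels of the two sides (our reading of the GPG edge
# labels; see the cited papers for the precise semantics and optimizations).
edges = set()

def add_edge(lhs, i, rhs, j):
    edges.add((lhs, i, rhs, j))

add_edge("p", 1, "a", 0)   # p = &a
add_edge("q", 1, "p", 1)   # q = p
add_edge("q", 2, "b", 0)   # *q = &b
print(sorted(edges))
```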
Bio:Dr. Pritam Gharat completed her masters and doctoral degrees in Computer Science from IIT Bombay. Her primary area of interest is Program Analysis. Her Ph.D. thesis focused on Pointer Analysis. Post Ph.D., she was a postdoctoral researcher in the Department of Computing, Imperial College, London where she worked on integrating static analysis and dynamic symbolic execution for efficient bug detection in C code. Currently, she is working at Microsoft Research (Bangalore) as a part of Cloud Reliability group.
- Scalable Machine Learning
Abstract:Speaker: Dr. Dinesh Singh, Postdoctoral Researcher, RIKEN, Japan
Date: 2022-03-21 Time: 12:00:00 (IST) Venue: Teams Existing computer vision approaches are computation-intensive and hence struggle to scale up to analysis of large collections of data, especially for real-time inference on resource-constrained devices. Biomedical problems, on the other hand, are very high-dimensional with limited sample sizes. I will talk about scalable machine learning for large-scale and high-dimensional data using approximate, parallel, and distributed computing techniques. Some of the studies include: (i) scaling second-order optimization methods using the Nystrom approximation, (ii) deep neural networks for gene selection, (iii) efficient feature representations of visual data along with some applications, and (iv) distributed training of kernel SVMs.
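As an illustration of item (i), here is a minimal NumPy sketch of the Nystrom approximation of a kernel matrix from a subset of landmark columns; the RBF kernel, landmark count, and random selection are our assumptions.
```python
# Hedged sketch: approximate an n x n kernel matrix K from m << n landmark
# columns, giving a rank-m surrogate that scales to large n.
import numpy as np

def rbf(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(X, m=50, gamma=0.5, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)    # landmark points
    C = rbf(X, X[idx], gamma)                          # n x m block of K
    W = C[idx, :]                                      # m x m landmark kernel
    return C @ np.linalg.pinv(W) @ C.T                 # rank-m approximation of K

X = np.random.default_rng(1).normal(size=(500, 10))
print(nystrom(X).shape)   # (500, 500), built from only 50 kernel columns
```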
Bio:Dinesh is working as a Postdoctoral Researcher in the High-dimensional Statistical Modeling Unit with Prof. Makoto Yamada at the RIKEN Center for Advanced Intelligence Project (AIP) at Kyoto University, Japan. RIKEN (established in 1917) is the largest research organization of Japan, primarily funded by MEXT, Government of Japan. He completed his Ph.D. degree in Computer Science and Engineering from the Indian Institute of Technology, Hyderabad in 2018, under the supervision of Prof. C. Krishna Mohan, on Scalable and Distributed Methods for Large-scale Visual Computing. He received his M.Tech degree in Computer Engineering from the National Institute of Technology, Surat in 2013, where he worked with Prof. Dhiren R. Patel on machine learning approaches for network anomaly and intrusion detection in the domain of cybersecurity and cloud security. He received his B.Tech degree in Information Technology from R. D. Engineering College, Ghaziabad (affiliated with UPTU Lucknow) in 2010.
- Interpretable Models for Knowledge Intensive Tasks
Abstract:Speaker: Dr. Koustav Rudra, CSE, IIT(ISM) Dhanbad
Date: 2022-03-16 Time: 11:00:00 (IST) Venue: Teams My research is about developing efficient and interpretable information systems for knowledge-intensive tasks such as question-answering, crisis management, fake news detection, etc. Artificial intelligence and deep learning architectures have brought revolutionary changes in addressing fundamental problems in different domains such as retrieval, language processing, vision, and speech. However, some practical challenges hold deep learning-based solutions back for societal applications. Deep learning-based approaches improve performance but obscure most of the crucial parts because of the black-box nature of such models. Hence, we do not know if the machine is right for the right reason.
A desirable property of learning systems is to be both effective and interpretable. Towards this goal, recent models follow a posthoc analysis or explain-then-predict approach. Explain-then-predict models first generate an extractive explanation from the input text and then make a prediction on just the explanation. These models primarily consider the task input as a supervision signal in learning an extractive explanation and do not effectively integrate rationale data as an additional inductive bias to improve task performance. In this talk, I will discuss a multi-task learning approach that efficiently manages the trade-off between explanations and task predictions. A fundamental result that we show in this work is that we do not have to trade off effectiveness for interpretability, as was widely believed. Further, we show the applicability of this method to the ad-hoc document retrieval task. We show that our approach, ExPred, reaches near-SOTA performance while being fully interpretable.
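A minimal sketch of the explain-then-predict pattern with a multi-task loss, assuming a toy GRU encoder, a token-level rationale head, and a prediction head that reads only the explanation-weighted representation; ExPred's actual architecture and training objective differ in detail.
```python
# Hedged sketch: shared encoder + explanation head + task head, trained jointly
# so interpretability and task effectiveness are optimized together.
import torch
import torch.nn as nn

class ExplainThenPredict(nn.Module):
    def __init__(self, vocab=10000, dim=128, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.GRU(dim, dim, batch_first=True)
        self.expl_head = nn.Linear(dim, 1)       # per-token "is rationale" score
        self.task_head = nn.Linear(dim, classes)

    def forward(self, tokens):                   # tokens: (B, T) int ids
        h, _ = self.enc(self.emb(tokens))        # (B, T, dim)
        expl = torch.sigmoid(self.expl_head(h)).squeeze(-1)    # (B, T)
        pooled = (h * expl.unsqueeze(-1)).mean(dim=1)          # predict from explanation
        return expl, self.task_head(pooled)

def multitask_loss(expl, expl_gold, logits, label, lam=0.5):
    # expl_gold: float rationale labels in [0, 1]; label: class indices.
    bce = nn.functional.binary_cross_entropy(expl, expl_gold)
    ce = nn.functional.cross_entropy(logits, label)
    return lam * bce + (1 - lam) * ce            # explanation vs. task trade-off
```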
Bio:Koustav Rudra is an Assistant Professor at CSE in IIT(ISM) Dhanbad. His research interests include information retrieval, interpretable data science, social computing, and natural language processing. Before joining IIT Dhanbad, he worked as a postdoctoral researcher at L3S Research Center, Leibniz University Hannover, and Northwestern University. He received his B.E. in Computer Science and Technology from Bengal Engineering and Science University Shibpur in 2011, followed by M.Tech (CSE) in 2013 from the Indian Institute of Technology Kharagpur. He obtained his Ph.D. from the Indian Institute of Technology, Kharagpur, in 2018 under the guidance of Niloy Ganguly. He received the TCS Research Fellowship in 2014.
- Anticipating human actions
Abstract:Speaker: Dr. Debaditya Roy, Scientist at the Institute of High-Performance Computing, A*STAR, Singapore
Date: 2022-03-15 Time: 10:00:00 (IST) Venue: Teams The ability to anticipate future actions of humans is useful in application areas such as automated driving, robot-assisted manufacturing, and smart homes. Anticipating human actions is an inherently uncertain problem, but the uncertainty can be reduced if we have a sense of the goal that the actor is trying to achieve: once we know the goal that guides the entire activity, predicting the next action in the sequence becomes easier. We present an action anticipation model that leverages goal information in an effective manner to reduce the uncertainty in future predictions. As most of these actions involve objects, another way to anticipate actions is to represent human-object interactions and predict which interactions are going to happen next, i.e., the next action. Existing methods that use human-object interactions for anticipation require object affordance labels for every relevant object in the scene that match the ongoing action. We instead represent every pairwise human-object (HO) interaction using only their visual features, and propose an end-to-end trainable multi-modal transformer to predict the next action from these HO pairwise interactions.
We also study the prediction of human driving behavior in traffic, especially at intersections, since a large proportion of road accidents occur at intersections. In India especially, where vehicles often ply very close to each other, it is essential to determine collision-prone vehicle behavior. Existing approaches only analyze driving behavior in lane-based driving. We propose an approach called the Siamese Interaction Long Short-Term Memory network (SILSTM), which learns the collision propensity of a vehicle from its interaction trajectory. Interaction trajectories encapsulate hundreds of interactions for every vehicle at an intersection. The interaction trajectories that match accident characteristics are labeled as unsafe, while the rest are considered safe. We also introduce the first laneless traffic aerial surveillance dataset, called SkyEye, to demonstrate our results.
Bio:Debaditya Roy is currently a Scientist at the Institute of High-Performance Computing, A*STAR, Singapore. His current research involves predicting human actions for enhanced Human-Robotic Engagement and is funded by AI Singapore. Previously, he was a post-doctoral researcher at Nihon University, Japan, under the M2Smart project, developing AI-based models to study traffic behavior in India. He received his Ph.D. from IIT Hyderabad in 2018 for his thesis "Representation Learning for Action Recognition." His research interests involve computer vision and machine learning, with a particular curiosity about how to represent context and environment in order to learn and reason about human actions even with limited examples, something we as humans manage to do effectively.
- Learning a dynamical system and its stability
Abstract:Speaker: Dr. Lakshman Mahto, IIIT Dharwad
Date: 2022-03-04 Time: 15:30:00 (IST) Venue: Teams In this talk, I focus on two problems: 1. learning the vector field of a non-linear dynamical system from time-series data, with error bounds, and 2. computing a Lyapunov function. In the process of learning a dynamical system, we explore various possibilities, such as 1. embedding the state vector into a high-dimensional feature space using a feature map and then estimating the system matrix of the associated linear model, 2. estimating a polynomial vector field via polynomial optimization, and 3. representation learning using a variational auto-encoder. Bounds on the estimation error can be established using various existing algorithms. The second part of this talk is about learning a suitable Lyapunov function for a given or learned dynamical system by solving a suitable semidefinite program.
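For the linear case, the Lyapunov computation mentioned above reduces to a small semidefinite program; a minimal cvxpy sketch follows, assuming a system matrix A estimated from data (nonlinear systems would instead need, e.g., sum-of-squares relaxations).
```python
# Hedged sketch: find a quadratic Lyapunov function V(x) = x^T P x for x' = A x
# by solving the Lyapunov LMI  P > 0,  A^T P + P A < 0  as an SDP.
import numpy as np
import cvxpy as cp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])           # assume A was estimated from time-series data

n = A.shape[0]
P = cp.Variable((n, n), symmetric=True)
eps = 1e-3
constraints = [P >> eps * np.eye(n),                  # P positive definite
               A.T @ P + P @ A << -eps * np.eye(n)]   # dV/dt negative definite
cp.Problem(cp.Minimize(cp.trace(P)), constraints).solve()
print(np.round(P.value, 3))            # any feasible P certifies stability
```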
Bio:He is currently working as an assistant professor (Mathematics) at the Indian Institute of Information Technology Dharwad, where he teaches courses on computational mathematics and statistics. Prior to joining IIIT Dharwad, he worked as a postdoctoral fellow with Dr. S. Kesavan at the Institute of Mathematical Sciences, Chennai, on nonlinear analysis and control. He completed his doctoral thesis, under the guidance of Dr. Syed Abbas at the Indian Institute of Technology Mandi, on dynamical systems with impulsive effects and their applications to control problems and ecological systems.
- Precise and Scalable Program Analysis
Abstract:Speaker: Prof. Uday Khedker, IIT Bombay
Date: 2022-02-28 Time: 15:00:00 (IST) Venue: Teams This talk describes some long-term research commitments in the area of program analysis at IIT Bombay. Although these started off as distinct research efforts, they now seem to converge towards a single agenda of precise and scalable pointer analysis. In hindsight, apart from a common grand goal, a common theme that has emerged is that a quest for precision need not conflict with a quest for efficiency. With careful modelling, it may well be possible to achieve them together.
This talk may be relevant for audiences at multiple levels: at a practical level, it describes some interesting research investigations in program analysis; at a conceptual level, it contradicts the common wisdom that compromising on precision is necessary for scalability; at a philosophical level, it highlights serendipity at work, leading seemingly distinct strands pursued over a prolonged duration to weave themselves into a unified whole.
Bio:Uday P. Khedker finished his B.E. from GEC Jabalpur in 1986, M.Tech. from Pune University in 1989, and Ph.D. from IIT Bombay in 1995. He taught at the Department of Computer Science at Pune University from 1994 to 2001 and since then has been with IIT Bombay, where he is a Professor of Computer Science & Engineering.
His areas of interest are Programming Languages, Compilers and Program Analysis. He specialises in data flow analysis and its applications to code optimization. He also spearheaded the GCC work at IIT Bombay through the (once very active) GCC Resource Center at IIT Bombay. He has advised 10 Ph.D. students, close to 50 B.Tech. and M.Tech. students, and many more interns.
He has published papers in leading journals and conferences, has contributed chapters in the Compiler Design Handbook, and has authored a book titled “Data Flow Analysis: Theory and Practice” (http://www.cse.iitb.ac.in/~uday/dfaBook-web) published by Taylor and Francis (CRC Press). He has also worked very closely with the industry as a consultant as well as a trainer of advanced topics in compilers.
- Energy-efficient Communication Architectures for beyond von-Neumann DNN Accelerators
Abstract:Speaker: Dr. Sumit Mandal
Date: 2022-02-16 Time: 09:00:00 (IST) Venue: Teams Data communication plays a significant role in the overall performance of hardware accelerators for Deep Neural Networks (DNNs). For example, crossbar-based in-memory computing significantly increases on-chip communication volume since the weights and activations are on-chip. State-of-the-art interconnect methodologies for in-memory computing deploy a bus-based network or a mesh-based NoC. Our experiments show that up to 90% of the total inference latency of a DNN hardware accelerator is spent on on-chip communication when a bus-based network is used. To reduce communication latency, we propose a methodology to generate an NoC architecture and a scheduling technique customized for different DNNs. We prove mathematically that the developed NoC architecture and corresponding schedules achieve the minimum possible communication latency for a given DNN. Experimental evaluations on a wide range of DNNs show that the proposed NoC architecture enables a 20%-80% reduction in communication latency with respect to state-of-the-art interconnect solutions. Graph convolutional networks (GCNs) have shown remarkable learning capabilities when processing data in the form of graphs, which are found inherently in many application areas. To take advantage of the relations captured by the underlying graphs, GCNs distribute the outputs of neural networks embedded in each vertex over multiple iterations. Consequently, they incur a significant amount of computation and irregular communication overheads, which calls for GCN-specific hardware accelerators. We propose a communication-aware in-memory computing architecture (COIN) for GCN hardware acceleration. Besides accelerating the computation using custom compute elements (CEs) and in-memory computing, COIN aims at minimizing the intra- and inter-CE communication in GCN operations to optimize performance and energy efficiency. Experimental evaluations with various datasets show up to a 174x improvement in energy-delay product with respect to an Nvidia Quadro RTX 8000 and edge GPUs at the same data precision.
Bio:Dr. Sumit K. Mandal received his dual (B.Tech + M.Tech) degree in Electronics and Electrical Communication Engineering from IIT Kharagpur in 2015. After that, he was a Research & Development Engineer at Synopsys, Bangalore (2015-2017). Currently, he is pursuing a Ph.D. at the University of Wisconsin-Madison and is expected to graduate in June 2022. Details of his research work can be found at https://sumitkmandal.ece.wisc.edu/
- Ethical AI in practise
Abstract:Speaker: Dr. Narayanan Unny, American Express.
Date: 2022-01-21 Time: 11:00:00 (IST) Venue: Teams In recent times, there is increased emphasis and scrutiny on machine learning algorithms being ethical. In this talk, we explore two different aspects of ethical AI – explainability and fairness. This talk introduces a very practical explainability tool called LMTE to solve some of the needs for explaining the blackbox model at a global as well as local level. We then examine a methodology of creating a plug-in ML classifier for producing fair ML models and some theoretical properties around such a classifier. Apart from enjoying some useful properties like consistency, we also demonstrate how we can use this classifier keeping privacy intact.
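As a small illustration of one fairness criterion such a plug-in classifier might target, here is a sketch that measures the demographic parity gap of predictions across groups; the metric choice and the toy data are our assumptions, not details from the talk.
```python
# Hedged sketch: demographic parity asks that positive-prediction rates be equal
# across protected groups; the gap below is one simple way to quantify unfairness.
import numpy as np

def demographic_parity_gap(y_pred, group):
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

print(demographic_parity_gap([1, 0, 1, 1, 0, 0], [0, 0, 0, 1, 1, 1]))  # 0.333...
```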
Bio:Narayanan Unny has been a Machine Learning researcher for more than a decade, starting with a PhD in Bayesian learning from the University of Edinburgh. He started his work in industrial research at Xerox Research Centre, where he contributed to the use of Machine Learning in intelligent transportation systems, including some of the transportation systems in India. The research included the use of machine learning to aid robust and dynamic scheduling of public transport. He currently leads the research team on Machine Learning in AI Labs within American Express and is actively researching different aspects of Ethical AI and Differential Privacy.
- Learning-Based Concurrency Testing
Abstract:Speaker: Dr. Akash Lal, Microsoft Research
Date: 2022-01-18 Time: 15:00:00 (IST) Venue: Teams Concurrency bugs are notoriously hard to detect and reproduce. Controlled concurrency testing (CCT) techniques aim to offer a solution, where a scheduler explores the space of possible interleavings of a concurrent program looking for bugs. Since the set of possible interleavings is typically very large, these schedulers employ heuristics that prioritize the search to “interesting” subspaces. However, current heuristics are typically tuned to specific bug patterns, which limits their effectiveness in practice.
In this talk, I will describe QL, a learning-based CCT framework where the likelihood of an action being selected by the scheduler is influenced by earlier explorations. We leverage the classical Q-learning algorithm to explore the space of possible interleavings, allowing the exploration to adapt to the program under test, unlike previous techniques. We have implemented and evaluated QL on a set of microbenchmarks, complex protocols, as well as production cloud services. In our experiments, we found QL to consistently outperform the state-of-the-art in CCT.
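A toy sketch of the underlying idea, assuming a tabular Q-learning scheduler that picks which enabled thread runs next and is rewarded for reaching new program states; the state abstraction, reward, and hyperparameters here are our assumptions, not QL's actual design.
```python
# Hedged sketch: an epsilon-greedy Q-learning scheduler that biases which enabled
# thread runs next based on earlier explorations of the interleaving space.
import random
from collections import defaultdict

Q = defaultdict(float)               # (state, thread) -> estimated value
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2    # learning rate, discount, exploration rate

def choose(state, enabled_threads):
    if random.random() < EPS:                                   # explore
        return random.choice(enabled_threads)
    return max(enabled_threads, key=lambda t: Q[(state, t)])    # exploit

def update(state, thread, reward, next_state, next_enabled):
    # Classical Q-learning update; reward could be +1 for reaching a new state.
    best_next = max((Q[(next_state, t)] for t in next_enabled), default=0.0)
    Q[(state, thread)] += ALPHA * (reward + GAMMA * best_next - Q[(state, thread)])
```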
Bio:Akash Lal is a Senior Principal Researcher at Microsoft Research, Bangalore. He works broadly in the area of programming languages. His interests are in language design, compiler implementation and program analysis targeted towards helping developers deal with the complexity of modern software. At Microsoft, Akash has worked on the verification engine behind Microsoft's Static Driver Verifier tool that has won multiple best-paper awards at top conferences. More recently, he has been working on project Coyote for building highly-reliable asynchronous systems. Akash completed his PhD from University of Wisconsin-Madison and jointly received the ACM SIGPLAN Outstanding Doctoral Dissertation Award for his thesis. He completed his Bachelor's degree from IIT-Delhi.
- The Compiler as a Database of Code Transformations
Abstract:Speaker: Sorav Bansal
Date: 2022-01-12 Time: 12:00:00 (IST) Venue: Microsoft teams A modern compiler is perhaps the most complex machine developed by humankind. Yet it remains fragile and inadequate for meeting the demands of modern optimization tasks necessitated by Moore’s law saturation. In our research, we explore a different approach to building compiler optimizers that is both completely automatic and more systematic. The basic idea is to use superoptimization techniques to automatically discover replacement rules that resemble peephole optimizations. A key component of this compiler architecture is an equivalence checker that validates the optimization rules automatically. I will describe the overall scheme, and delve into some interesting details of automatic equivalence checking. I will also share our recent work on automatic generation of debugging headers. This talk is based on work published at ASPLOS06, OSDI08, SOSP13, APLAS17, HVC17, SAT18, PLDI20, OOPSLA20, and CGO22.
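A toy sketch of the superoptimization loop described above, using a made-up integer "ISA" and a testing-based stand-in for the equivalence checker; the real system uses formal equivalence checking rather than random testing.
```python
# Hedged sketch: enumerate candidate replacements and keep those an (approximate)
# checker cannot distinguish from the original, yielding peephole-style rules.
import random

def orig(x):            # sequence to optimize: compute x * 2
    return x + x

CANDIDATES = {
    "shl1": lambda x: x << 1,
    "add_self": lambda x: x + x,
    "mul3": lambda x: x * 3,
}

def equivalent(f, g, trials=1000):
    # Stand-in for a real equivalence checker: compare on random test inputs.
    return all(f(v) == g(v) for v in (random.randint(-2**31, 2**31) for _ in range(trials)))

rules = [name for name, cand in CANDIDATES.items() if equivalent(orig, cand)]
print(rules)   # ['shl1', 'add_self'] -- candidate replacement rules for this pattern
```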
Teams link:
https://teams.microsoft.com/l/meetup-join/19%3ac00a05b5843f4486843ed7ca9c863eeb%40thread.tacv2/1641802730821?context=%7b%22Tid%22%3a%22624d5c4b-45c5-4122-8cd0-44f0f84e945d%22%2c%22Oid%22%3a%220ca83376-feac-4ea2-88fe-f0688c4de543%22%7d
2022 talks
- LP-Duality Theory and the Cores of Games
Abstract:Speaker: Vijay V. Vazirani, UC Irvine
Date: 2022-12-21 Time: 12:00:00 (IST) Venue: bharti-501 Vijay Vazirani got his undergraduate degree from MIT in 1979 and his PhD from the University of California, Berkeley in 1983. He is currently a Distinguished Professor at the University of California, Irvine.
Vazirani has made fundamental contributions to several areas of the theory of algorithms, including algorithmic matching theory, approximation algorithms and algorithmic game theory, as well as complexity theory. His current work is on algorithms for matching markets; his co-edited book on this topic will be published by Cambridge University Press in March 2023. Here is a flyer: https://www.ics.uci.edu/~vazirani/flyer.pdf
Vazirani is an ACM Fellow, a Guggenheim Fellow and the recipient of the 2022 INFORMS John von Neumann Theory Prize.
Bio:The core is a quintessential solution concept in cooperative game theory and LP-duality theory has played a central role in its study, right from its early days to the present time. The classic 1971 paper of Shapley and Shubik showed the ``right'' way of exploiting this theory --- in the context of characterizing the core of the assignment game.
The LP-relaxation of this game has the following key property: the polytope defined by its constraints has integral vertices; in this case, they are matchings in the underlying graph. Similar characterizations for several basic combinatorial optimization games followed; throughout, this property was established by showing that the underlying linear system is totally unimodular (TUM).
We will first exploit TUM further via a very general formulation due to Hoffman and Kruskal (1956). The way to take this methodology to its logical next step is to use total dual integrality (TDI). In the process, we address new classes of games which have their origins in two major theories within combinatorial optimization, namely perfect graphs and polymatroids.
Whereas the core of the assignment game is always non-empty, that of the general graph matching game can be empty. We show how to salvage the situation --- again using LP-duality in a fundamental way.Based on:
https://arxiv.org/pdf/2202.00619.pdf
https://arxiv.org/pdf/2209.04903.pdf
https://www.sciencedirect.com/science/article/pii/S0899825622000239?via%3Dihub
- Building Secure Systems Bottom Up: Hunting down hardware security vulnerabilities
Abstract:Speaker: Jeyavijayan (JV) Rajendran
Date: 2022-12-12 Time: 16:30:00 (IST) Venue: SIT-001 Hardware is at the heart of computing systems. For decades, software was considered error-prone and vulnerable. However, recent years have seen a rise in attacks exploiting hardware vulnerabilities and exploits. Such vulnerabilities are prevalent in hardware for several reasons: First, the existing functional verification and validation approaches do not account for security, motivating the need for new and radical approaches such as hardware fuzzing. Second, existing defense solutions, mostly based on heuristics, do not undergo rigorous red-teaming exercises like cryptographic algorithms; I will talk about how emerging artificial intelligence (AI) can rapidly help red-team such techniques. Last and most important, students and practitioners who are typically trained in designing, testing, and verification are not rigorously trained in cybersecurity -- for many reasons, including a lack of resources, time, and methodologies; I will talk about how AI can be incorporated into (hardware) cybersecurity education.
Bio:Jeyavijayan (JV) Rajendran is an Assistant Professor in the Department of Electrical and Computer Engineering at the Texas A&M University. He obtained his Ph.D. degree from New York University in August 2015. His research interests include hardware security and computer security. His research has won the NSF CAREER Award in 2017, ONR Young Investigator Award in 2022, the IEEE CEDA Ernest Kuh Early Career Award in 2021, the ACM SIGDA Outstanding Young Faculty Award in 2019, the Intel Academic Leadership Award, the ACM SIGDA Outstanding Ph.D. Dissertation Award in 2017, and the Alexander Hessel Award for the Best Ph.D. Dissertation in the Electrical and Computer Engineering Department at NYU in 2016, along with several best student paper awards. He organizes and has co‐founded Hack@DAC, a student security competition co-located with DAC, and SUSHI.
- Efficient Knowledge Extraction and Visual Analytics of Big Data at Scale
Abstract:Speaker: Dr. Soumya Dutt at LANL
Date: 2022-04-18 Time: 12:00:00 (IST) Venue: Teams With the ever-increasing computing power, current big data applications, nowadays, produce data sets that can reach the order of petabytes and beyond. Knowledge extracted from such extreme-scale data promises unprecedented advancements in various scientific fronts, e.g., earth and space sciences, energy applications, chemistry, material sciences, fluid dynamics, just to name a few. However, the complex and extreme nature of these big data sets is currently pushing the very limits of our analytical capabilities. Therefore, finding meaningful and salient information efficiently and compactly from these vast seas of data and then presenting them effectively and interactively have emerged as one of the fundamental problems in modern computer science research.
My talk will characterize various aspects of big data using the 5 Vs, namely, Volume, Velocity, Variety, Veracity, and Value and present novel strategies for efficient data analytics and visualization. I will present state-of-the-art data exploration methodologies that encompass the end-to-end exploration pipeline, starting from the data generation time until the data is being analyzed and visualized interactively to advance scientific discovery. In my talk, statistical and machine learning-based compact data models will be discussed that are significantly smaller compared to the raw data and can be used as a proxy for the data to answer a broad range of scientific questions efficiently. I will demonstrate successful applications of such model-based visual analytics techniques by showing examples from various scientific domains. To conclude my talk, I will briefly highlight my broad-scale future research plan and its implications.
Bio:Soumya Dutta is a full-time Scientist-2 in the Information Sciences group (CCS-3) at Los Alamos National Laboratory (LANL). Before this, Dr. Dutta was a postdoctoral researcher in the Applied Computer Sciences group (CCS-7) at LANL from June 2018 - July 2019. Dr. Dutta obtained his MS and Ph.D. degrees in Computer Science and Engineering from the Ohio State University in May 2017 and May 2018 respectively. Prior to joining Ohio State, Dr. Dutta completed his B. Tech. in Electronics and Communication Engineering from the West Bengal University of Technology in 2009 and then briefly worked in TCS Kolkata from Feb. 2010 - Jul. 2011. His current research interests include Big Data Analytics & Visualization, Statistical Techniques for Big Data, In Situ Analysis, Machine Learning for Visual Computing, and HPC. Dr. Dutta’s research has won Best Paper Award at ISAV 2021 and Best Paper Honorable Mention Award at IEEE Visualization 2016. He was nominated for the Ohio State Presidential Fellowship in 2017 and was also recently selected for the Best Reviewer, Honorary Mention Award for the IEEE TVCG journal for the year 2021. He is a member of IEEE and ACM.
- Continual Learning via Efficient Network Expansion
Abstract:Speaker: Dr. Vinay Kumar Verma
Date: 2022-04-12 Time: 12:00:00 (IST) Venue: Teams As neural networks are increasingly being applied to real-world applications, mechanisms to address distributional shift and sequential task learning without forgetting are critical. Methods incorporating network expansion have shown promise by naturally adding model capacity for learning new tasks while simultaneously avoiding catastrophic forgetting. However, the growth in the number of additional parameters of many of these types of methods can be computationally expensive at larger scales, at times prohibitively so. In this talk, I will discuss two techniques based on task-specific calibration of the features maps, with negligible growth in the number of parameters for each new task. I will discuss the application of these techniques to a variety of problem settings in continual learning, such as task-incremental as well as class-incremental learning, and continual learning for deep generative models. This talk is based on joint work with Kevin Liang, Nikhil Mehta, Pravendra, Pratik, Lawrence Carin, and Piyush Rai.
Bio:Vinay Kumar Verma completed his postdoc at Duke University under the guidance of Prof. Lawrence Carin. Before joining Duke, Vinay completed his PhD in the Department of Computer Science and Engineering at IIT Kanpur, advised by Piyush Rai. Vinay's research interests are in deep learning and computer vision. In particular, he has worked extensively on problems related to deep learning with little/no supervision (zero-shot and few-shot learning) and deep model compression. More recently, during his postdoctoral research, he has been working on continual learning for supervised as well as unsupervised learning problems. Vinay's PhD work also received the outstanding thesis award from IIT Kanpur. He recently joined Amazon, India as an applied scientist and working on visual insights of Softline trends.
- Machine Unlearning
Abstract:Speaker: Dr. Murari Mandal, Postdoctoral research fellow in the School of Computing, National University of Singapore (NUS)
Date: 2022-04-06 Time: 12:00:00 (IST) Venue: Teams Consider a scenario where it is desired that the information pertaining to the data belonging to a single entity or multiple entities be removed from the already trained machine learning (ML) model. Unlearning the data observed during the training of an ML model is an important task that can play a pivotal role in fortifying the privacy and security of ML-based applications. In this talk, I raise the following questions: (i) can we unlearn a class/classes of data from an ML model ? (ii) can we make the process of unlearning fast and scalable to large datasets, and generalize it to different deep networks? and iii) can we do unlearning without having any kind of access to the training data? I will talk about two novel machine unlearning frameworks. The first method offers an efficient solution through an error-maximizing noise generation and impair-repair based weight manipulation. It enables excellent unlearning while substantially retaining the overall model accuracy. The second method introduces unlearning in a zero-shot setting. The newly introduced zero-shot machine unlearning caters to the extreme but practical scenario where zero original data samples are available for use. I will discuss the two novel solutions for zero-shot machine unlearning based on (a) error minimizing-maximizing noise and (b) gated knowledge transfer.
Bio:Murari is a Postdoctoral research fellow in the School of Computing, National University of Singapore (NUS). His research goal is to develop practical and principled approaches to make the ML models i) privacy aware and adaptive to the evolving data privacy rules and regulations, ii) empower the individual user with the facility to derive benefits from one’s own personal data. He is currently working with Prof. Mohan Kankanhalli and Prof. Jussi Keppo. He worked as a lecturer in IIIT Kota prior to starting his Postdoctoral position. His earlier works focused on developing deep learning models for vision applications including moving object detection, video anomaly detection, emotion analysis, and image quality enhancement. A significant part of his current research is focused on machine unlearning, data valuation, privacy and security in deep learning. He has served as a program committee member for AAAI'22, and technical committee member for Earthvision (2021,22), CVPR Workshops. He is a regular reviewer for top-tier conferences and journals including AAAI, ICML, WACV, BMVC, TIP, TMM, TETCI, TITS, TII, TCSVT, etc.
Website: https://www.comp.nus.edu.sg/~murari/
- Machine Unlearning
Abstract:Speaker: Dr. Murari Mandal, Postdoctoral research fellow in the School of Computing, National University of Singapore (NUS)
Date: 2022-04-06 Time: 12:00:00 (IST) Venue: Teams Consider a scenario where it is desired that the information pertaining to the data belonging to a single entity or multiple entities be removed from the already trained machine learning (ML) model. Unlearning the data observed during the training of an ML model is an important task that can play a pivotal role in fortifying the privacy and security of ML-based applications. In this talk, I raise the following questions: (i) can we unlearn a class/classes of data from an ML model ? (ii) can we make the process of unlearning fast and scalable to large datasets, and generalize it to different deep networks? and iii) can we do unlearning without having any kind of access to the training data? I will talk about two novel machine unlearning frameworks. The first method offers an efficient solution through an error-maximizing noise generation and impair-repair based weight manipulation. It enables excellent unlearning while substantially retaining the overall model accuracy. The second method introduces unlearning in a zero-shot setting. The newly introduced zero-shot machine unlearning caters to the extreme but practical scenario where zero original data samples are available for use. I will discuss the two novel solutions for zero-shot machine unlearning based on (a) error minimizing-maximizing noise and (b) gated knowledge transfer.
Bio:Murari is a Postdoctoral research fellow in the School of Computing, National University of Singapore (NUS). His research goal is to develop practical and principled approaches to make the ML models i) privacy aware and adaptive to the evolving data privacy rules and regulations, ii) empower the individual user with the facility to derive benefits from one’s own personal data. He is currently working with Prof. Mohan Kankanhalli and Prof. Jussi Keppo. He worked as a lecturer in IIIT Kota prior to starting his Postdoctoral position. His earlier works focused on developing deep learning models for vision applications including moving object detection, video anomaly detection, emotion analysis, and image quality enhancement. A significant part of his current research is focused on machine unlearning, data valuation, privacy and security in deep learning. He has served as a program committee member for AAAI'22, and technical committee member for Earthvision (2021,22), CVPR Workshops. He is a regular reviewer for top-tier conferences and journals including AAAI, ICML, WACV, BMVC, TIP, TMM, TETCI, TITS, TII, TCSVT, etc.
- Efficient Methods for Deep Learning
Abstract:Speaker: Dr. Pravendra Singh, CSE, IIT Roorkee
Date: 2022-03-25 Time: 12:00:00 (IST) Venue: Teams While convolutional neural networks (CNNs) have achieved remarkable performance on various supervised and unsupervised learning tasks, they typically consist of a massive number of parameters and computations. This results in significant memory requirements as well as a computational burden. Consequently, there is a growing need for efficient methods in deep learning that reduce parameters and computations while maintaining the predictive power of neural networks. In this talk, I present various approaches that aim to make deep learning efficient. The talk is organized in the following parts. (1) Model compression: In the first part, we present work on model compression methods, i.e., methods that reduce the computations and parameters of deep CNNs without hurting model accuracy. Filter pruning compresses a deep CNN model by pruning unimportant or redundant convolutional filters. Filter pruning approaches evaluate the importance of an entire convolutional filter, prune filters based on some criterion, and then re-train to recover the accuracy drop. We have proposed various approaches to evaluate the importance of a convolutional filter. (2) Design efficiency: In the second part, we present efficient architectures that use separable convolutions to reduce model size and complexity. These efficient architectures use pointwise, depthwise, and groupwise convolutions to achieve compact models. We propose a new type of convolution operation using heterogeneous kernels. The proposed Heterogeneous Kernel-Based Convolution (HetConv) reduces the computation (FLOPs) and the number of parameters compared to the standard convolution operation while maintaining representational efficiency. (3) Performance efficiency: In the third part, we present methods that adaptively recalibrate channel-wise feature responses, by explicitly modeling interdependencies between channels, to enhance the performance of deep CNNs. These methods increase the accuracy of deep CNNs without significantly increasing computations or parameters. Recently, researchers have tried to boost the performance of CNNs by re-calibrating the feature maps produced by convolutional filters, e.g., Squeeze-and-Excitation Networks (SENets). These approaches achieve better performance by exciting the important channels or feature maps while diminishing the rest, but in the process architectural complexity increases. We propose an architectural block that introduces much lower complexity than existing CNN performance-boosting methods while performing significantly better than them.
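As a concrete illustration of filter pruning, here is a toy PyTorch sketch that scores filters by their L1 norm, one common importance criterion; the talk proposes its own criteria, which are not reproduced here, and in a real network the downstream layer's input channels must be adjusted and the model re-trained.

```python
# A toy illustration of norm-based filter pruning for a single Conv2d layer.
# The L1-norm score is only one possible importance criterion.
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the filters with the largest L1 norms and return a smaller layer."""
    with torch.no_grad():
        importance = conv.weight.abs().sum(dim=(1, 2, 3))     # one score per filter
        n_keep = max(1, int(keep_ratio * conv.out_channels))
        keep = torch.topk(importance, n_keep).indices.sort().values
        pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                           stride=conv.stride, padding=conv.padding,
                           bias=conv.bias is not None)
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned

layer = nn.Conv2d(16, 32, kernel_size=3, padding=1)
smaller = prune_conv_filters(layer, keep_ratio=0.25)   # keeps 8 of 32 filters
print(smaller)
```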
Bio:Pravendra Singh is currently working as an Assistant Professor in the Department of Computer Science and Engineering at the Indian Institute of Technology Roorkee. He obtained his Ph.D. from CSE Department, IIT Kanpur. He also received an outstanding Ph.D. Thesis Award from the IIT Kanpur. His research explores techniques that aim to make deep learning more efficient. In particular, his research interests include Model Compression, Few-Shot Learning, Continual Learning, Deep Learning. His research has been published in various top-tier conferences and journals like CVPR, NeurIPS, AAAI, IJCAI, WACV, IJCV, Pattern Recognition, Neurocomputing, IEEE JSTSP, Knowledge-Based Systems, and others.
- Generalized Points-to Graphs: A Precise and Scalable Abstraction for Points-to Analysis
Abstract:Speaker: Dr. Pritam Gharat, Microsoft Research
Date: 2022-03-25 Time: 16:00:00 (IST) Venue: Teams Computing precise (fully flow- and context-sensitive) and exhaustive (as against demand-driven) points-to information is known to be expensive. Therefore, most approaches to precise points-to analysis begin with a scalable but imprecise method and then seek to increase its precision. In contrast, we begin with a precise method and increase its scalability. We create naive but possibly non-scalable procedure summaries and then use novel optimizations to compact them while retaining their soundness and precision.
We propose a novel abstraction called the generalized points-to graph (GPG), which views points-to relations as memory updates and generalizes them using counts of indirection levels, leaving the unknown pointees implicit. This allows us to construct GPGs as compact representations of bottom-up procedure summaries in terms of memory updates and the control flow between them. Their compactness is ensured by strength reduction (reducing the indirection levels), control flow minimization (eliminating control flow edges while preserving soundness and precision), and call inlining (enhancing the opportunities for these optimizations).
The effectiveness of GPGs lies in the fact that they discard as much control flow as possible without losing precision. As a result GPGs are very small even for main procedures that contain the effect of the entire program. This allows our implementation to scale to 158 kLoC for C programs. At a more general level, GPGs provide a convenient abstraction to represent and transform memory in the presence of pointers.
The preliminary ideas were published at the Static Analysis Symposium (SAS) in September 2016, and the full work was published in ACM Transactions on Programming Languages and Systems (TOPLAS) in May 2020.
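As a rough illustration of the abstraction (not the published implementation), the sketch below encodes a few pointer assignments as generalized points-to edges, where each side of an update carries a count of indirection levels.

```python
# A toy encoding of generalized points-to updates in the spirit of GPGs:
# an edge (lhs, i) -> (rhs, j) means "the location reached from lhs after
# i dereferences is assigned whatever is reached from rhs after j dereferences".
# Purely illustrative; not the paper's data structures.
from dataclasses import dataclass

@dataclass(frozen=True)
class GPGEdge:
    lhs: str
    i: int     # indirection level on the left-hand side
    rhs: str
    j: int     # indirection level on the right-hand side

def edge_for(stmt: str) -> GPGEdge:
    """Map a few simple pointer-assignment shapes to generalized points-to edges."""
    lhs, rhs = [s.strip() for s in stmt.split("=")]
    i = 1 + lhs.count("*")                               # writing into x itself is level 1
    if rhs.startswith("&"):
        return GPGEdge(lhs.lstrip("*"), i, rhs[1:], 0)   # address-of reads at level 0
    return GPGEdge(lhs.lstrip("*"), i, rhs.lstrip("*"), 1 + rhs.count("*"))

for s in ["x = &y", "x = y", "*x = y", "x = *y"]:
    print(s, "->", edge_for(s))
```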
Bio:Dr. Pritam Gharat completed her master's and doctoral degrees in Computer Science from IIT Bombay. Her primary area of interest is Program Analysis, and her Ph.D. thesis focused on Pointer Analysis. After her Ph.D., she was a postdoctoral researcher in the Department of Computing, Imperial College London, where she worked on integrating static analysis and dynamic symbolic execution for efficient bug detection in C code. Currently, she is working at Microsoft Research (Bangalore) as part of the Cloud Reliability group.
- Scalable Machine Learning
Abstract:Speaker: Dr. Dinesh Singh, Postdoctoral Researcher, RIKEN, Japan
Date: 2022-03-21 Time: 12:00:00 (IST) Venue: Teams Existing computer vision approaches are computation-intensive and hence struggle to scale up to analysis of large data collections, especially for real-time inference on resource-constrained devices. On the other hand, biomedical problems are very high-dimensional with limited sample sizes. I will talk about scalable machine learning for large-scale and high-dimensional data using approximate, parallel, and distributed computing techniques. Some of the studies include: (i) scaling second-order optimization methods using the Nyström approximation, (ii) deep neural networks for gene selection, (iii) efficient feature representation of visual data along with some applications, and (iv) distributed training of kernel SVMs.
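As an illustration of the kind of low-rank kernel approximation mentioned in (i), here is a small NumPy sketch of the Nyström method; the landmark count, bandwidth, and data are arbitrary choices made for the example.

```python
# A small sketch of the Nystrom approximation of an RBF kernel matrix:
# sample m landmark points and build a rank-m approximation of the full kernel.
import numpy as np

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))

m = 50                                   # number of landmark points
idx = rng.choice(len(X), m, replace=False)
C = rbf(X, X[idx])                       # n x m cross-kernel
W = rbf(X[idx], X[idx])                  # m x m landmark kernel
K_approx = C @ np.linalg.pinv(W) @ C.T   # rank-m approximation of the full kernel

K_exact = rbf(X, X)
err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"relative Frobenius error: {err:.3f}")
```

The full n x n kernel never has to be materialized during training if downstream computations are expressed through C and W, which is what makes such approximations attractive for scaling.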
Bio:Dinesh is working in the High-Dimensional Statistical Modeling Unit with Prof. Makoto Yamada as a Postdoctoral Researcher at the RIKEN Center for Advanced Intelligence Project (AIP), Kyoto University, Japan. RIKEN (established in 1917) is the largest research organization in Japan, primarily funded by MEXT, Government of Japan. He completed his Ph.D. in Computer Science and Engineering at the Indian Institute of Technology, Hyderabad in 2018 under the supervision of Prof. C. Krishna Mohan, working on Scalable and Distributed Methods for Large-scale Visual Computing. He received his M.Tech degree in Computer Engineering from the National Institute of Technology, Surat in 2013, where he worked with Prof. Dhiren R. Patel on machine learning approaches for network anomaly and intrusion detection in the domain of cybersecurity and cloud security. He received his B.Tech degree in Information Technology from R. D. Engineering College, Ghaziabad (affiliated with UPTU Lucknow) in 2010.
- Interpretable Models for Knowledge Intensive Tasks
Abstract:Speaker: Dr. Koustav Rudra, CSE, IIT(ISM) Dhanbad
Date: 2022-03-16 Time: 11:00:00 (IST) Venue: Teams My research is about developing efficient and interpretable information systems for knowledge-intensive tasks such as question answering, crisis management, and fake news detection. Artificial intelligence and deep learning architectures have brought revolutionary changes to fundamental problems in domains such as retrieval, language processing, vision, and speech. However, some practical challenges hold deep learning-based solutions back from societal applications. Deep learning-based approaches improve performance but obscure most of the crucial parts because of the black-box nature of such models. Hence, we do not know whether the machine is right for the right reason.
A desirable property of learning systems is to be both effective and interpretable. Towards this goal, recent models have been proposed that follow a post-hoc analysis or explain-then-predict approach. Explain-then-predict models first generate an extractive explanation from the input text and then make a prediction based on just that explanation. These models primarily consider the task input as a supervision signal when learning an extractive explanation and do not effectively integrate rationale data as an additional inductive bias to improve task performance. In this talk, I will discuss a multi-task learning approach that efficiently manages the trade-off between explanations and task predictions. A fundamental result that we show in this work is that we do not have to trade off effectiveness for interpretability, as was widely believed. Further, we show the applicability of this method to the ad-hoc document retrieval task. We show that our approach, ExPred, reaches near-SOTA performance while being fully interpretable.
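The sketch below shows a generic multi-task objective in the explain-then-predict spirit: one head extracts a token-level rationale and another predicts the label from the extracted explanation. The encoder, shapes, and loss weighting are placeholders; this is not the ExPred architecture itself.

```python
# A schematic multi-task explain-then-predict objective: rationale supervision
# acts as an additional inductive bias alongside the task loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExplainThenPredict(nn.Module):
    def __init__(self, vocab=5000, dim=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.expl_head = nn.Linear(2 * dim, 1)        # per-token rationale score
        self.task_head = nn.Linear(2 * dim, n_classes)

    def forward(self, tokens):
        h, _ = self.encoder(self.embed(tokens))       # (B, T, 2*dim)
        expl_logits = self.expl_head(h).squeeze(-1)   # (B, T)
        mask = torch.sigmoid(expl_logits).unsqueeze(-1)
        pooled = (mask * h).sum(1) / mask.sum(1).clamp_min(1e-6)
        return expl_logits, self.task_head(pooled)    # predict from the explanation only

def multitask_loss(expl_logits, task_logits, rationale, label, lam=0.5):
    expl_loss = F.binary_cross_entropy_with_logits(expl_logits, rationale.float())
    task_loss = F.cross_entropy(task_logits, label)
    return task_loss + lam * expl_loss                # trade-off managed by lam
```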
Bio:Koustav Rudra is an Assistant Professor in the CSE department at IIT (ISM) Dhanbad. His research interests include information retrieval, interpretable data science, social computing, and natural language processing. Before joining IIT (ISM) Dhanbad, he worked as a postdoctoral researcher at the L3S Research Center, Leibniz University Hannover, and at Northwestern University. He received his B.E. in Computer Science and Technology from Bengal Engineering and Science University, Shibpur in 2011, followed by an M.Tech (CSE) in 2013 from the Indian Institute of Technology Kharagpur. He obtained his Ph.D. from the Indian Institute of Technology Kharagpur in 2018 under the guidance of Niloy Ganguly. He received the TCS Research Fellowship in 2014.
- Anticipating human actions
Abstract:Speaker: Dr. Debaditya Roy, Scientist at the Institute of High-Performance Computing, A*STAR, Singapore
Date: 2022-03-15 Time: 10:00:00 (IST) Venue: Teams The ability to anticipate future actions of humans is useful in application areas such as automated driving, robot-assisted manufacturing, and smart homes. The problem of anticipating human actions is an inherently uncertain one. However, we can reduce this uncertainty if we have a sense of the goal that the actor is trying to achieve: predicting the next action in a sequence becomes easier once we know the goal that guides the entire activity. I will present an action anticipation model that leverages goal information to reduce the uncertainty in future predictions. As most of these actions involve objects, another way to anticipate actions is to represent human-object interactions and predict which interactions are going to happen next, i.e., the next action. Existing methods that use human-object interactions for anticipation require object affordance labels for every relevant object in the scene that match the ongoing action. We propose to represent every pairwise human-object (HO) interaction using only their visual features. We then propose an end-to-end trainable multi-modal transformer to predict the next action using HO pairwise interactions.
The second part of the talk focuses on predicting human driving behavior in traffic, especially at intersections, since a large proportion of road accidents occur there. In India, where vehicles often ply very close to each other, it is essential to determine collision-prone vehicle behavior. Existing approaches only analyze driving behavior in lane-based traffic. We propose an approach called the Siamese Interaction Long Short-Term Memory network (SILSTM), which learns the collision propensity of a vehicle from its interaction trajectory. Interaction trajectories encapsulate hundreds of interactions for every vehicle at an intersection. Interaction trajectories that match accident characteristics are labeled as unsafe, while the rest are considered safe. We also introduce the first laneless traffic aerial surveillance dataset, called SkyEye, to demonstrate our results.
Bio:Debaditya Roy is currently a Scientist at the Institute of High-Performance Computing, A*STAR, Singapore. His current research involves predicting human actions for enhanced Human-Robotic Engagement which is funded by AI Singapore. Previously, he was a post-doctoral researcher working at Nihon University, Japan under the M2Smart project to develop AI-based models to study traffic behavior in India. He received his Ph.D. from IIT Hyderabad in 2018 for his thesis "Representation Learning for Action Recognition." His research interests involve computer vision, machine learning with a particular curiosity on how to represent context and environment to learn/reason about human actions even with limited examples which we as humans manage to do effectively.
- Learning a dynamical system and its stability
Abstract:Speaker: Dr. Lakshman Mahto, IIIT Dharwad
Date: 2022-03-04 Time: 15:30:00 (IST) Venue: Teams In this talk, I focus on two problems: 1. learning the vector field of a non-linear dynamical system from time-series data, together with error bounds, and 2. computing a Lyapunov function. In the process of learning a dynamical system, we explore various possibilities: 1. embedding the state vector into a high-dimensional feature space using a feature map and then estimating the system matrix of the associated linear model, 2. estimating a polynomial vector field via polynomial optimization, and 3. representation learning using a variational auto-encoder. Bounds on the estimation error can be established using various existing algorithms. The second part of the talk is about learning a suitable Lyapunov function for a given or learned dynamical system by solving a suitable semidefinite program.
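A small NumPy sketch of possibility 1, lifting the state with a feature map and fitting the system matrix of the associated linear model by least squares, is given below; the feature map and the toy dynamics are illustrative choices, not the speaker's examples.

```python
# Lift the state with a feature map phi and fit a linear model in feature space
# by least squares from a simulated trajectory of an unknown nonlinear system.
import numpy as np

def lift(x):
    """Polynomial/trigonometric feature map phi(x) for a 2-D state."""
    x1, x2 = x
    return np.array([x1, x2, x1 * x2, x1**2, x2**2, np.sin(x1)])

def step(x, dt=0.05):
    """Nonlinear dynamics used only to generate the training trajectory."""
    x1, x2 = x
    return np.array([x1 + dt * x2, x2 + dt * (-np.sin(x1) - 0.1 * x2)])

rng = np.random.default_rng(1)
X = [rng.uniform(-1, 1, size=2)]
for _ in range(500):
    X.append(step(X[-1]))
Phi = np.array([lift(x) for x in X])

# Solve Phi[t+1] ~= Phi[t] @ A in the least-squares sense.
A, *_ = np.linalg.lstsq(Phi[:-1], Phi[1:], rcond=None)
pred = Phi[:-1] @ A
print("relative one-step error in lifted space:",
      np.linalg.norm(pred - Phi[1:]) / np.linalg.norm(Phi[1:]))
```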
Bio:He is currently working as an Assistant Professor (Mathematics) at the Indian Institute of Information Technology Dharwad, where he teaches courses on computational mathematics and statistics. Prior to joining IIIT Dharwad, he worked as a postdoctoral fellow with Dr. S. Kesavan at the Institute of Mathematical Sciences, Chennai, on nonlinear analysis and control. He completed his doctoral thesis under the guidance of Dr. Syed Abbas at the Indian Institute of Technology Mandi, on dynamical systems with impulsive effects and their applications to control problems and ecological systems.
- Precise and Scalable Program Analysis
Abstract:Speaker: Prof. Uday Khedker, IIT Bombay
Date: 2022-02-28 Time: 15:00:00 (IST) Venue: Teams This talk describes some long-term research commitments in the area of program analysis at IIT Bombay. Although these efforts started off as distinct research efforts, they now seem to converge towards a single agenda of precise and scalable pointer analysis. In hindsight, apart from a common grand goal, a common theme that has emerged is that a quest for precision need not conflict with a quest for efficiency. With careful modelling, it may well be possible to achieve them together.
This talk may be relevant for audiences at multiple levels: at a practical level, it describes some interesting research investigations in program analysis; at a conceptual level, it contradicts the common wisdom that compromising on precision is necessary for scalability; at a philosophical level, it highlights serendipity at work, leading to seemingly distinct strands pursued over a prolonged duration weaving themselves into a unified whole.
Bio:Uday P. Khedker finished his B.E. from GEC Jabalpur in 1986, M.Tech. from Pune University in 1989, and Ph.D. from IIT Bombay in 1995. He taught at the Department of Computer Science at Pune University from 1994 to 2001 and has since been with IIT Bombay, where he is a Professor of Computer Science & Engineering.
His areas of interest are Programming Languages, Compilers and Program Analysis. He specialises in data flow analysis and its applications to code optimization. He also spearheaded the GCC work at IIT Bombay through the (once very active) GCC Resource Center at IIT Bombay. He has advised 10 Ph.D. students, close to 50 B.Tech. and M.Tech. students, and many more interns.
He has published papers in leading journals and conferences, has contributed chapters to the Compiler Design Handbook, and has authored a book titled “Data Flow Analysis: Theory and Practice” (http://www.cse.iitb.ac.in/~uday/dfaBook-web) published by Taylor and Francis (CRC Press). He has also worked very closely with the industry as a consultant as well as a trainer on advanced topics in compilers.
- Energy-efficient Communication Architectures for beyond von-Neumann DNN Accelerators
Abstract:Speaker: Dr. Sumit Mandal
Date: 2022-02-16 Time: 09:00:00 (IST) Venue: Teams Data communication plays a significant role in the overall performance of hardware accelerators for Deep Neural Networks (DNNs). For example, crossbar-based in-memory computing significantly increases on-chip communication volume, since the weights and activations are on-chip. State-of-the-art interconnect methodologies for in-memory computing deploy a bus-based network or a mesh-based NoC. Our experiments show that up to 90% of the total inference latency of DNN hardware is spent on on-chip communication when a bus-based network is used. To reduce communication latency, we propose a methodology to generate an NoC architecture and a scheduling technique customized for different DNNs. We prove mathematically that the developed NoC architecture and corresponding schedules achieve the minimum possible communication latency for a given DNN. Experimental evaluations on a wide range of DNNs show that the proposed NoC architecture enables a 20%-80% reduction in communication latency with respect to state-of-the-art interconnect solutions. Graph convolutional networks (GCNs) have shown remarkable learning capabilities when processing data in the form of graphs, which arise inherently in many application areas. To take advantage of the relations captured by the underlying graphs, GCNs distribute the outputs of neural networks embedded in each vertex over multiple iterations. Consequently, they incur a significant amount of computation and irregular communication overhead, which calls for GCN-specific hardware accelerators. We propose a communication-aware in-memory computing architecture (COIN) for GCN hardware acceleration. Besides accelerating the computation using custom compute elements (CEs) and in-memory computing, COIN aims at minimizing the intra- and inter-CE communication in GCN operations to optimize performance and energy efficiency. Experimental evaluations on various datasets show up to a 174x improvement in energy-delay product with respect to an Nvidia Quadro RTX 8000 and edge GPUs at the same data precision.
Bio:Dr. Sumit K. Mandal received his dual (B.Tech + M.Tech) degree in Electronics and Electrical Communication Engineering from IIT Kharagpur in 2015. After that, he was a Research & Development Engineer at Synopsys, Bangalore (2015-2017). Currently, he is pursuing a Ph.D. at the University of Wisconsin-Madison and is expected to graduate in June 2022. Details of his research work can be found at https://sumitkmandal.ece.wisc.edu/
- Ethical AI in practice
Abstract:Speaker: Dr. Narayanan Unny, American Express.
Date: 2022-01-21 Time: 11:00:00 (IST) Venue: Teams In recent times, there is increased emphasis on, and scrutiny of, machine learning algorithms being ethical. In this talk, we explore two different aspects of ethical AI – explainability and fairness. The talk introduces a practical explainability tool called LMTE that addresses some of the needs for explaining a black-box model at a global as well as a local level. We then examine a methodology for creating a plug-in ML classifier that produces fair ML models, along with some theoretical properties of such a classifier. Apart from enjoying useful properties like consistency, we also demonstrate how this classifier can be used while keeping privacy intact.
Bio:Narayanan Unny has been a Machine Learning researcher for more than a decade, starting with a PhD in Bayesian learning from the University of Edinburgh. He started his work in industrial research at Xerox Research Centre, where he contributed to the use of Machine Learning in intelligent transportation systems, including some of the transportation systems in India. The research included the use of machine learning to aid robust and dynamic scheduling of public transport. He currently leads the research team on Machine Learning in AI Labs within American Express and is actively researching different aspects of Ethical AI and Differential Privacy.
- Learning-Based Concurrency Testing
Abstract:Speaker: Dr. Akash Lal, Microsoft Research
Date: 2022-01-18 Time: 15:00:00 (IST) Venue: Teams Concurrency bugs are notoriously hard to detect and reproduce. Controlled concurrency testing (CCT) techniques aim to offer a solution, where a scheduler explores the space of possible interleavings of a concurrent program looking for bugs. Since the set of possible interleavings is typically very large, these schedulers employ heuristics that prioritize the search to “interesting” subspaces. However, current heuristics are typically tuned to specific bug patterns, which limits their effectiveness in practice.
In this talk, I will describe QL, a learning-based CCT framework where the likelihood of an action being selected by the scheduler is influenced by earlier explorations. We leverage the classical Q-learning algorithm to explore the space of possible interleavings, allowing the exploration to adapt to the program under test, unlike previous techniques. We have implemented and evaluated QL on a set of microbenchmarks, complex protocols, as well as production cloud services. In our experiments, we found QL to consistently outperform the state-of-the-art in CCT.
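The sketch below illustrates the general shape of a Q-learning-guided scheduler for concurrency testing: states abstract the program state, actions pick the next thread to run, and rarely visited states earn a higher reward. The state abstraction, reward, and hyperparameters are placeholders, not QL's actual design.

```python
# A toy Q-learning-guided scheduler for controlled concurrency testing.
import random
from collections import defaultdict

class QScheduler:
    def __init__(self, alpha=0.3, gamma=0.7, eps=0.1):
        self.q = defaultdict(float)      # (state, thread) -> estimated value
        self.visits = defaultdict(int)   # state -> visit count
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def pick(self, state, enabled_threads):
        """Epsilon-greedy choice of which enabled thread to schedule next."""
        if random.random() < self.eps:
            return random.choice(enabled_threads)
        return max(enabled_threads, key=lambda t: self.q[(state, t)])

    def update(self, state, thread, next_state, enabled_next):
        """Standard Q-learning update with a novelty reward for rare states."""
        self.visits[next_state] += 1
        reward = 1.0 / self.visits[next_state]
        best_next = max((self.q[(next_state, t)] for t in enabled_next), default=0.0)
        target = reward + self.gamma * best_next
        self.q[(state, thread)] += self.alpha * (target - self.q[(state, thread)])
```

In a testing loop, the tester would call pick at every scheduling point and update after observing the resulting abstract state, so the exploration adapts to the program under test.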
Bio:Akash Lal is a Senior Principal Researcher at Microsoft Research, Bangalore. He works broadly in the area of programming languages. His interests are in language design, compiler implementation and program analysis targeted towards helping developers deal with the complexity of modern software. At Microsoft, Akash has worked on the verification engine behind Microsoft's Static Driver Verifier tool that has won multiple best-paper awards at top conferences. More recently, he has been working on project Coyote for building highly-reliable asynchronous systems. Akash completed his PhD from University of Wisconsin-Madison and jointly received the ACM SIGPLAN Outstanding Doctoral Dissertation Award for his thesis. He completed his Bachelor's degree from IIT-Delhi.
- The Compiler as a Database of Code Transformations
Abstract:Speaker: Sorav Bansal
Date: 2022-01-12 Time: 12:00:00 (IST) Venue: Microsoft teams A modern compiler is perhaps the most complex machine developed by humankind. Yet it remains fragile and inadequate for meeting the demands of modern optimization tasks necessitated by Moore’s law saturation. In our research, we explore a different approach to building compiler optimizers that is both completely automatic and more systematic. The basic idea is to use superoptimization techniques to automatically discover replacement rules that resemble peephole optimizations. A key component of this compiler architecture is an equivalence checker that validates the optimization rules automatically. I will describe the overall scheme, and delve into some interesting details of automatic equivalence checking. I will also share our recent work on automatic generation of debugging headers. This talk is based on work published at ASPLOS06, OSDI08, SOSP13, APLAS17, HVC17, SAT18, PLDI20, OOPSLA20, and CGO22.
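As a toy illustration of the superoptimization idea, the sketch below enumerates short sequences over a tiny straight-line instruction set and validates candidate replacements by exhaustive testing on 8-bit inputs, standing in for the automatic equivalence checker described in the talk; the instruction set is invented for the example.

```python
# A toy brute-force superoptimizer: search for a shorter instruction sequence
# that is equivalent to a target sequence over all 8-bit inputs.
from itertools import product

OPS = {
    "neg": lambda x: -x & 0xFF,
    "not": lambda x: ~x & 0xFF,
    "inc": lambda x: (x + 1) & 0xFF,
    "dec": lambda x: (x - 1) & 0xFF,
}

def run(seq, x):
    for op in seq:
        x = OPS[op](x)
    return x

def equivalent(a, b):
    # Exhaustive testing over the 8-bit domain stands in for a real
    # equivalence checker (e.g., an SMT-based verifier).
    return all(run(a, x) == run(b, x) for x in range(256))

target = ("not", "inc")          # two's-complement negation written the long way
for length in range(1, len(target)):
    for cand in product(OPS, repeat=length):
        if equivalent(list(target), list(cand)):
            print(f"replacement rule found: {target} -> {cand}")
            break
```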
Teams link:
https://teams.microsoft.com/l/meetup-join/19%3ac00a05b5843f4486843ed7ca9c863eeb%40thread.tacv2/1641802730821?context=%7b%22Tid%22%3a%22624d5c4b-45c5-4122-8cd0-44f0f84e945d%22%2c%22Oid%22%3a%220ca83376-feac-4ea2-88fe-f0688c4de543%22%7d