Fairness and reliability of Machine Learning

Notice

Class schedule: 1530-1700, Mondays and Thursdays, in Bharti 201.
Please email suban@iitd.ac.in with COL865 in the Subject if you are not registered in the course but are interested in attending.
All the students interested in attending the course are requested to email a one page pdf indicating:
1. the reason for your interest in the course
2. any prior reading that you may have done on the topic and
3. your views on the issues to be covered in the course.

General reading

The fair ML book

Classes and assignments

July 22, 25 and 29 (Introduction to fairness and discrimination).
Aug 1 to Aug 29 (Learnability, disparate impact and disparate treatment, impossibility results on fairness).
Minor exam.
September (More on discrimination, observational limitations, causality, fairness through awareness).
Minor exam.
October and November (Privacy; Reliability of RCTs; Reliability of machine learning).
Major exam.

Course description

In the last few years, machine learning techniques have been extremely successful in applications such as image recognition and object detection, autonomous guidance, natural language understanding and translation, radiology and medical diagnosis, targeting and recommendation systems, and credit scoring and risk assessment. They are also beginning to see applications in health analytics, econometrics and policy.

As the scope of applications expands beyond computer science into data driven decisions about humans and policy advocacy, we need a better understanding of the risks and potential negative fallouts - both unintended and intended. This course will be a formal enquiry into the ethics of data driven decision making that affects people.

In the first part of the course we will concentrate on fairness and bias. We will try to formalise the criteria of fairness based on the understanding of discrimination that exists in the social sciences and law. We will see that unless the data lie in narrow well-behaved manifolds, most reasonable characterisations of fairness are impossible to achieve in data driven decision making.
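
As a rough illustration of the kind of impossibility result mentioned above, the sketch below uses one well-known instance, which is not necessarily the formulation covered in class: if a classifier has the same true positive rate and false positive rate in two groups, Bayes' rule forces its positive predictive value to differ whenever the groups' base rates differ. The function name, group names and all numbers are hypothetical and chosen only for illustration.

```python
def ppv(base_rate, tpr, fpr):
    """Positive predictive value implied by Bayes' rule.

    base_rate: fraction of the group that is truly positive
    tpr:       true positive rate of the classifier on the group
    fpr:       false positive rate of the classifier on the group
    """
    true_pos = tpr * base_rate            # P(predicted +, actually +)
    false_pos = fpr * (1.0 - base_rate)   # P(predicted +, actually -)
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier with identical error rates in both groups.
TPR, FPR = 0.8, 0.2

# Hypothetical base rates that differ between the two groups.
base_rates = {"group_a": 0.5, "group_b": 0.2}

ppv_by_group = {g: ppv(p, TPR, FPR) for g, p in base_rates.items()}

for group, value in ppv_by_group.items():
    print(f"{group}: PPV = {value:.2f}")
```

With these numbers a positive prediction is correct 80% of the time in one group and 50% of the time in the other, even though the error rates are equal. Equalising the predictive values instead would require unequal error rates, unless the base rates coincide or the classifier is perfect.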

We will also study the various types of biases that can creep into algorithmic decision making - including data bias, algorithmic focus bias, processing bias, transfer context bias and emergent bias.

We will review the spate of algorithmic interventions that have been proposed to ensure fairness and reduce bias, and discuss the limitations of these approaches. We will have open discussions on how then to promote fairness and contain bias risks in data driven decision making, and discuss the related notions of privacy, accountability, interpretability and transparency. We will also discuss the obvious question that follows - is it possible to achieve fairness by being data blind?

In the latter part of the course we will analyse the shortcomings of the 'prediction only' paradigm. We will try to formalise the notions of predictability vs reliability, and examine to what extent 'causal inference' can enhance reliability. We will try to examine the ethical issues in building 'autonomous systems' (such as self-driving cars or even weapons) based solely on the high prediction rates of modern machine learning systems (deep convolutional neural networks).
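
One way to see why prediction alone may be unreliable for decisions is Simpson's paradox. The minimal sketch below uses illustrative counts, which are not course data, for two hypothetical treatments "A" and "B" and a hypothetical severity variable that influences both the treatment given and the outcome. The pooled data favour B, yet A does better within every severity stratum, so a purely observational comparison and a comparison that adjusts for severity point in opposite directions.

```python
# Illustrative (successes, patients) counts per treatment and severity stratum.
# All names and numbers here are assumptions made for this sketch.
counts = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}


def rate(successes, total):
    """Success rate within a single stratum."""
    return successes / total


def overall_rate(treatment):
    """Success rate after pooling all strata for one treatment."""
    successes = sum(s for s, _ in counts[treatment].values())
    total = sum(n for _, n in counts[treatment].values())
    return successes / total


# Compare the two treatments stratum by stratum, then on the pooled data.
a_better_in_every_stratum = all(
    rate(*counts["A"][s]) > rate(*counts["B"][s]) for s in ("mild", "severe")
)
b_better_overall = overall_rate("B") > overall_rate("A")

print("A better within each stratum:", a_better_in_every_stratum)
print("B better on pooled data:", b_better_overall)
```

Which comparison to trust depends on the assumed causal structure, here that severity affects both treatment choice and outcome. The data alone do not settle it.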

The course will not assume any background in machine learning, though familiarity will be useful. As such, the course will be accessible to all engineering students with 3rd year level maturity. The course will, however, assume familiarity with basic computing, probability and linear algebra. It will also assume some good common sense.

Evaluation

The course will have reading and discussions for the entire semester. In addition, each participant will be required to scribe all topics and do a term-paper. Evaluations will be based on peer and instructor review of class participation, quality of reports and presentations.

Subhashis Banerjee / Dept. of Computer Science and Engineering / IIT Delhi / Hauz Khas / New Delhi 110016

COL865 Special topic on Computer Applications