# COL863: Special Topics in Theoretical Computer Science

Topic: Mathematics of Data Science

# II semester: 2020-21

# Amitabha Bagchi

**Class Timings:** M Th 3:30PM on Teams.

### Course objectives

By the end of the course, the student is expected to have developed a working familiarity with the mathematical foundations of most of the techniques used in data science, machine learning and AI.

*Background required*: Basics of Probability, Graph Theory, and Linear Algebra.

### Topics

Geometry of high-dimensional space, including dimensionality reduction; Singular Value Decomposition and applications; Random walks and Markov Chains; Sketching and sampling; Clustering.

### Texts

- [BHK18] A. Blum, J. Hopcroft, and R. Kannan, *Foundations of Data Science*, 2018. Download here.
- [LPW17] D. A. Levin, Y. Peres, and E. Wilmer, *Markov Chains and Mixing Times*, 2nd ed. Download here (please download the 2nd edition).

### Course calendar

### Refresher texts

### Evaluation

- 30%: Minor 1.
- 30%: Minor 2.
- 30%: Minor 3.
- 10%: Term papers.

**Audit Pass criterion**. At least 8/30 in *each* minor *and* at least 35/100 overall. Plagiarism will lead to an automatic audit fail. If you have missed a minor and still wish to audit the class, you can do so by making up for the missed minor with 5/10 in the term paper.

**Plagiarism**. Copying text from any source (including but not limited to the internet, a video, a book, or another person's paper) constitutes plagiarism. Since this is an elective 800-level course, a very high standard of honesty is expected. Violations will be treated accordingly.
