Title: Foundations of Data Science
Instructor: Ragesh Jaiswal
Office: 403, SIT Building
Teaching Assistants: TBD
Topics: High-dimensional space, singular value decomposition, learning and VC dimension,
Markov chains, clustering, algorithms for massive data sets.
Textbook: Foundations of Data Science by Blum, Hopcroft, and Kannan
Course description: The course will follow the above-mentioned book. The following excerpt from the book captures the main idea:
"Computer science as an academic discipline began in the 1960’s. Emphasis was on programming languages, compilers, operating systems,
and the mathematical theory that supported these areas. Courses in theoretical computer science covered finite automata, regular expressions,
context free languages, and computability. In the 1970’s, the study of algorithms was added as an important component of theory. The emphasis
was on making computers useful. Today, a fundamental change is taking place and the focus is more on applications. There are many reasons for
this change. The merging of computing and communications has played an important role. The enhanced ability to observe, collect and store data
in the natural sciences, in commerce, and in other fields calls for a change in our understanding of data and how to handle it in the modern
setting. The emergence of the web and social networks as central aspects of daily life presents both opportunities and challenges for theory.
While traditional areas of computer science remain highly important, increasingly researchers of the future will be involved with using
computers to understand and extract usable information from massive data arising in applications, not just how to make computers useful on
specific well-defined problems. With this in mind we have written this book to cover the theory likely to be useful in the next 40 years,
just as an understanding of automata theory, algorithms and related topics gave students an advantage in the last 40 years. One of the major
changes is the switch from discrete mathematics to more of an emphasis on probability, statistics, and numerical methods."