COMP SCI 639: Foundations of Data Science (Spring 2024)
Instructor: Jelena Diakonikolas
Email: jelena at cs dot wisc dot edu
Office hours: Thu 12-2pm
Communication policy: I try to respond to all emails, but during the semester my email load may become too high, in which case I may miss responding to some emails. If your question is urgent and I do not respond promptly, please send me a reminder. For all non-urgent class-related questions, please use the class Piazza (accessible from Canvas) and/or one of the office hours slots.
Teaching Assistant: Xufeng Cai
Email: xcai74 at wisc dot edu
Office hours: TBD
This class meets on Monday and Wednesday in Eng Hall 3345, 2.30-3.45pm (75 min). Optional, supplementary discussion sessions are held by the TA as one of the office hours. Location and time: CS 2310, Friday 2.30-3.45pm.
General Course Information
Most of the class is theoretical and assumes mathematical maturity: you need to be comfortable with reading, understanding, and writing proofs. Courses in linear algebra and in probability and statistics, or graduate standing, are required. While not required, it is recommended that you have taken a proof-based course, or at least that you feel comfortable with mathematical proofs.
Some of the homework problems will require coding in Python, and basic knowledge of Python is expected.
There is no required textbook for this course. Lecture notes for most of the material will be provided by the instructor. We will use a few chapters from the following two textbooks for the last ~1/3 of the course:
James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning. New York: Springer; 2013.
Wright SJ, Recht B. Optimization for data analysis. Cambridge University Press; 2022.
Lecture notes and additional readings are provided on the course Canvas page.
Course Outline with References to Material
This class focuses on theoretical foundations of data science. While we will cover some applications and examples in Python, this class is not an applied, coding-based class.
This is a tentative list of topics that will be covered in class. Most of the topics listed here will be covered, and some other topics may be added.
- Probabilistic foundations of learning
- Concentration Inequalities
- Applications of concentration inequalities
- Intro to Hypothesis Testing
- Null Hypothesis Significance Testing
- Additional Topics in Hypothesis Testing
- A/B Testing and Causality
- Maximum Likelihood Estimation
- Confidence Intervals and Bootstrap
- Select topics in statistical learning: linear regression, classification, cross-validation, model selection (Chapters 2-6 in the James, Witten, Hastie, and Tibshirani textbook)
- Select topics in optimization: optimality conditions, gradient descent, stochastic gradient descent (Chapters 2, 3, and 5 in the Wright-Recht textbook)
All grades will be posted on Canvas. The information provided here is tentative and is subject to change.
Homework: There will be 5 homework assignments, accounting for ~50% of the grade. You may discuss problems with other students, but you need to declare it on your homework submission. Any discussion can be verbal only: you are required to work out and write the solutions on your own. Submitting someone else's work as your own constitutes academic misconduct. Academic honesty is taken very seriously in this class, and any breach of it will be treated according to the University Policy.
Homework assignments and solutions will be posted on Canvas.
Quizzes: Date and Time: TBD. Held in person, in class. Accounts for ~20% of the grade.
Project: Done in pairs. Can be one of the following: (i) a lecture on a data science topic that we did not cover in class, (ii) literature review of a specific data science topic, or (iii) a deep dive on a research question in data science. Accounts for ~30% of the grade.
By enrolling in this course, each student assumes the responsibilities of an active participant in UW-Madison’s community of scholars in which everyone’s academic work and behavior are held to the highest academic integrity standards. Academic misconduct compromises the integrity of the university. Cheating, fabrication, plagiarism, unauthorized collaboration, and helping others commit these acts are examples of academic misconduct, which can result in disciplinary action. This includes but is not limited to failure on the assignment/course, disciplinary probation, or suspension. Substantial or repeated cases of misconduct will be forwarded to the Office of Student Conduct & Community Standards for additional review.
The University of Wisconsin-Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute (36.12), and UW-Madison policy (Faculty Document 1071) require that students with disabilities be reasonably accommodated in instruction and campus life. Reasonable accommodations for students with disabilities are a shared faculty and student responsibility. Students are expected to inform me of their need for instructional accommodations by the end of the third week of the semester, or as soon as possible after a disability has been incurred or recognized. I will work either directly with you or in coordination with the McBurney Center to identify and provide reasonable instructional accommodations. Disability information, including instructional accommodations as part of a student's educational record, is confidential and protected under FERPA.
Institutional Statement on Diversity
Diversity is a source of strength, creativity, and innovation for UW-Madison. We value the contributions of each person and respect the profound ways their identity, culture, background, experience, status, abilities, and opinion enrich the university community. We commit ourselves to the pursuit of excellence in teaching, research, outreach, and diversity as inextricably linked goals.
The University of Wisconsin-Madison fulfills its public mission by creating a welcoming and inclusive community for people from every background – people who as students, faculty, and staff serve Wisconsin and the world.