COMP SCI 639: Foundations of Data Science (Spring 2022)
Instructor: Jelena Diakonikolas
Email: jelena at cs dot wisc dot edu
Office hours: Tuesday and Thursday after class (4-5pm), or by appointment. In person, subject to local health guidelines.
Communication policy: I try to respond to all emails, but during the semester my email load may become too high, in which case I may miss responding to some emails. If your question is urgent and I do not respond promptly, please send me a reminder. For all non-urgent class-related questions, please use the class Piazza (accessible from Canvas) and/or one of the office hours slots.
Teaching Assistants: Cheuk Yin (Eric) Lin and Xufeng Cai
Email: clin353 at wisc dot edu, xcai74 at wisc dot edu
Office hours: Monday 9-11am (Xufeng) and Wednesday 3-5pm (Eric)
This class meets on Tuesdays and Thursdays in CS 1221, 2.30-3.45pm (75 min). Optional and supplementary discussion sessions are held by the TAs on Fridays 2.30-3.45pm in CS 1325.
General Course Information
Most of the class is theoretical and assumes mathematical maturity: you need to be comfortable with reading, understanding, and writing proofs. Courses in linear algebra and in probability and statistics, as well as a proof-based theoretical course (or instructor permission), are required.
Some of the homework problems will require coding in Python, and basic knowledge of Python is expected.
There is no required textbook for this course.
Lecture notes for each of the lectures, as well as additional readings, are provided on the course Canvas page.
This class focuses on theoretical foundations of data science. While we will cover some applications and examples in Python, this class is not an applied, coding-based class.
This is a tentative list of topics that will be covered in class. Most of the topics listed here will be covered, and some other topics may be added.
- Probabilistic foundations of learning: Gaussian random variables, Central Limit Theorem, concentration inequalities.
- Applications of concentration inequalities in learning theory: approximating population loss by the empirical loss.
- Select topics in statistical inference: hypothesis testing, p-values, A/B testing, permutation testing, bootstrap, causal inference.
- Select topics in statistical learning: linear regression and classification.
- Select topics in optimization: gradient descent, stochastic gradient descent, variance reduction.
- Select topics in online learning: learning from experts, multiplicative weights update.
All grades will be posted on Canvas. The information provided here is tentative and is subject to change.
Homework: There will be 5-6 homework assignments, accounting for ~50% of the grade. You may discuss problems with other students, but you need to declare it on your homework submission. Any discussion can be verbal only: you are required to work out and write the solutions on your own. Submitting someone else's work as your own constitutes academic misconduct. Academic honesty is taken very seriously in this class, and any breach of it will be treated according to the University Policy.
Homework assignments and solutions will be posted on Canvas.
Midterm: Date and Time: TBD. Held in person, in class. Accounts for ~20% of the grade.
Project: Done in pairs. Can be one of the following: (i) a lecture on a data science topic that we did not cover in class, (ii) literature review of a specific data science topic, or (iii) a deep dive on a research question in data science. Accounts for ~30% of the grade.
By enrolling in this course, each student assumes the responsibilities of an active participant in UW-Madison’s community of scholars in which everyone’s academic work and behavior are held to the highest academic integrity standards. Academic misconduct compromises the integrity of the university. Cheating, fabrication, plagiarism, unauthorized collaboration, and helping others commit these acts are examples of academic misconduct, which can result in disciplinary action. This includes but is not limited to failure on the assignment/course, disciplinary probation, or suspension. Substantial or repeated cases of misconduct will be forwarded to the Office of Student Conduct & Community Standards for additional review. [link]
The University of Wisconsin-Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute (36.12), and UW-Madison policy (Faculty Document 1071) require that students with disabilities be reasonably accommodated in instruction and campus life. Reasonable accommodations for students with disabilities is a shared faculty and student responsibility. Students are expected to inform faculty [me] of their need for instructional accommodations by the end of the third week of the semester, or as soon as possible after a disability has been incurred or recognized. Faculty [I], will work either directly with the student [you] or in coordination with the McBurney Center to identify and provide reasonable instructional accommodations. Disability information, including instructional accommodations as part of a student's educational record, is confidential and protected under FERPA. [link]
Institutional Statement on Diversity
Diversity is a source of strength, creativity, and innovation for UW-Madison. We value the contributions of each person and respect the profound ways their identity, culture, background, experience, status, abilities, and opinion enrich the university community. We commit ourselves to the pursuit of excellence in teaching, research, outreach, and diversity as inextricably linked goals.
The University of Wisconsin-Madison fulfills its public mission by creating a welcoming and inclusive community for people from every background – people who as students, faculty, and staff serve Wisconsin and the world. [link]