COMP SCI 541: Theory & Algorithms for Data Science (Spring 2025)

Instructor: Jelena Diakonikolas

Email: jelena at cs dot wisc dot edu

Office hours: Thu 10am-12pm

Communication policy: I try to respond to all emails, but during the semester my email load may become too high, in which case I may miss responding to some emails. If your question is urgent and I do not respond promptly, please send me a reminder. For all non-urgent class-related questions, please use the class Piazza (accessible from Canvas) and/or one of the office hours slots.

General Course Information

Prerequisites

Most of the class is theoretical and assumes mathematical maturity: you need to be comfortable with reading, understanding, and writing proofs. Courses in linear algebra and in probability and statistics, or graduate standing, are required. While not required, it is recommended that you have taken a proof-based course or that you at least feel comfortable with mathematical proofs.

Some of the homework problems will require coding in Python, and basic knowledge of Python is assumed. (That said, the required level is not advanced, and it should be possible to catch up if you have any coding experience, even if it is not in Python.)

Course Material

There is no required textbook for this course. Lecture notes for all the material are provided by the instructor. We may sometimes use a few chapters from the following textbooks:

Blum A, Hopcroft J, Kannan R. Foundations of data science. Cambridge University Press; 2020.

Kearns MJ, Vazirani U. An introduction to computational learning theory. MIT press; 1994.

Shalev-Shwartz S, Ben-David S. Understanding machine learning: From theory to algorithms. Cambridge university press; 2014.

Wright SJ, Recht B. Optimization for data analysis. Cambridge University Press; 2022.

Lecture notes and additional readings are provided on the course Canvas page. You can also find draft (not fully polished) lecture notes in the Course Outline below. The lectures will follow these notes, but I may edit or update them as we go.

Course Outline with References to Material

This class focuses on theoretical foundations of data science. While we will cover some applications and examples in Python, this class is not an applied, coding-based class.

This is a tentative list of topics (with links to lecture notes) that will be covered in class. All the topics listed here will be covered, and some other topics may be added.

Course Load/Assessment

All grades will be posted on Canvas. The information provided here is tentative and is subject to change.

Homework: There will be 5 homework assignments, accounting for 50% of the grade. You may discuss problems with other students, but you need to declare it on your homework submission. Any discussion can be verbal only: you are required to work out and write the solutions on your own. Submitting someone else's work as your own constitutes academic misconduct. Academic honesty is taken very seriously in this class, and any breach of it will be treated according to the University Policy.

Homework assignments and solutions will be posted on Canvas.

Quizzes: There will be two quizzes, on March 14 (mid-semester) and May 2 (last lecture), held in person, in class. Together they account for 20% of the grade.

Project: Done in pairs. Can be one of the following: (i) a lecture on a data science topic that we did not cover in class, (ii) literature review of a specific data science topic, or (iii) a deep dive on a research question in data science. Accounts for 30% of the grade.

Academic Policies

Academic Integrity

By enrolling in this course, each student assumes the responsibilities of an active participant in UW-Madison’s community of scholars in which everyone’s academic work and behavior are held to the highest academic integrity standards. Academic misconduct compromises the integrity of the university. Cheating, fabrication, plagiarism, unauthorized collaboration, and helping others commit these acts are examples of academic misconduct, which can result in disciplinary action. This includes but is not limited to failure on the assignment/course, disciplinary probation, or suspension. Substantial or repeated cases of misconduct will be forwarded to the Office of Student Conduct & Community Standards for additional review. [link]

Disability Accommodation

The University of Wisconsin-Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute (36.12), and UW-Madison policy (Faculty Document 1071) require that students with disabilities be reasonably accommodated in instruction and campus life. Reasonable accommodations for students with disabilities is a shared faculty and student responsibility. Students are expected to inform faculty [me] of their need for instructional accommodations by the end of the third week of the semester, or as soon as possible after a disability has been incurred or recognized. Faculty [I] will work either directly with the student [you] or in coordination with the McBurney Center to identify and provide reasonable instructional accommodations. Disability information, including instructional accommodations as part of a student's educational record, is confidential and protected under FERPA. [link]

Institutional Statement on Diversity

Diversity is a source of strength, creativity, and innovation for UW-Madison. We value the contributions of each person and respect the profound ways their identity, culture, background, experience, status, abilities, and opinion enrich the university community. We commit ourselves to the pursuit of excellence in teaching, research, outreach, and diversity as inextricably linked goals.

The University of Wisconsin-Madison fulfills its public mission by creating a welcoming and inclusive community for people from every background – people who as students, faculty, and staff serve Wisconsin and the world. [link]