
| Week | Description | Dates | Assignments Released | Assignments Due |
|------|-------------|-------|----------------------|-----------------|
| 1 | The Data Flywheel | Sep 4 | A1 Released: 9/4 | |
| 2 | Data Warehouses, Data Lakes, and Lakehouses | Sep 9, 11 | A2 Released: 9/11 | A1 Due: 9/11 |
| 3 | Batch Processing I | Sep 16, 18 | | |
| 4 | Batch Processing II | Sep 23, 25 | A3 Released: 9/25 | A2 Due: 9/25 |
| 5 | Rubber, Meet Road | Sep 30, Oct 1 | | |
| 6 | Data Infrastructure for Machine Learning | Oct 7, 9 | A4 Released: 10/9 | A3 Due: 10/9 |
| 7 | Reading Week: No Classes! | | | |
| 8 | Midterm Exam | Oct 21, 23 | | |
| 9 | Text Processing I | Oct 28, 30 | | A4 Due: 10/30 |
| 10 | Text Processing II | Nov 4, 6 | A5 Released: 11/4 | |
| 11 | Finding Similar Items | Nov 11, 13 | | |
| 12 | Graph Processing | Nov 18, 20 | A6 Released: 11/18 | A5 Due: 11/18 |
| 13 | Stream Processing | Nov 25, 27 | | |
| 14 | LLMs | Dec 2 | | A6 Due: 12/2 |
| | Final Exam | TBD | | |
The above readings are available for free online through the university's library. The links above point directly to Waterloo proxied content, but if you're having trouble accessing the content (e.g., due to VPN settings), you might have to go through the library's portal (i.e., search for the book title and follow the appropriate link).
The following are optional. They comprise some of the primary sources from which lecture content is drawn, and can enrich your understanding and provide broader context.
Reread the assigned readings from last week (or read them for the first time if you haven't yet). The material will make a lot more sense now that you've seen the lectures. In addition:
The following are optional. They provide a deeper dive into lecture content that can enrich your understanding and provide broader context.
Some of the above readings are available for free online through the university's library. The links above point directly to Waterloo proxied content, but if you're having trouble accessing the content (e.g., due to VPN settings), you might have to go through the library's portal (i.e., search for the book title and follow the appropriate link).
The following are from online textbooks that explain what's covered in lecture. They are required to the extent that they help you understand what's covered in class. If you're confused about anything, consult these sources.
Note that the readings refer to "banding" (i.e., b bands of r rows per band). This is the same idea as in the slides, which refer to k minhash signatures (= rows) repeated n times (= bands).
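To make the banding notation concrete, here is a minimal sketch of the standard LSH banding calculation: with b bands of r rows each, two items whose Jaccard similarity is s become a candidate pair with probability 1 - (1 - s^r)^b. The function name `candidate_probability` and the values r = 5, b = 20 are illustrative assumptions, not taken from the readings or slides. In the slides' notation, r corresponds to k and b corresponds to n.

```python
def candidate_probability(s: float, r: int, b: int) -> float:
    """Probability that two items with Jaccard similarity s agree on all
    r rows of at least one of b bands (hypothetical helper name)."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("similarity s must be in [0, 1]")
    if r < 1 or b < 1:
        raise ValueError("r and b must be positive integers")
    # s ** r: probability that one band matches on all r rows.
    # (1 - s ** r) ** b: probability that none of the b bands match.
    return 1.0 - (1.0 - s ** r) ** b


if __name__ == "__main__":
    # Illustrative parameters only: 20 bands of 5 rows (100 minhash values).
    for s in (0.2, 0.4, 0.6, 0.8):
        p = candidate_probability(s, r=5, b=20)
        print(f"s = {s:.1f}  P(candidate) = {p:.4f}")
```

Increasing r makes each band harder to match (fewer false positives), while increasing b gives more chances to match (fewer false negatives); together they shape the familiar S-curve.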
If you have further interest in the discussion on scale up vs. scale out for graph processing, you might want to read the sources cited in the paper, in particular the Twitter WTF paper, the survey on graph processing systems, and the paper on COST by McSherry et al.
The following are optional, primarily provided to enrich your understanding.