Note that there are separate sets of assignments for CS 451/651 and CS 431/631. Make sure you work on the correct assignments!
This assignment requires you to compute statistics over tokens in the text of Shakespeare's plays, as you did for Assignment 1. However, instead of performing the analysis purely in Python, you will use Spark (with a Python driver program).
For this assignment, you will no longer be able to use the Compute Canada Jupyter hub, since it does not provide access to a Spark installation. Instead, you should use the Waterloo CS Jupyter hub at https://jupyter.student.cs.uwaterloo.ca:8000. To log in to the CS hub, you need to use your password for the CS student computing environment, not your WatIAM password. If you do not know your CS student computing password, you can reset it using this page. Resetting it requires you to authenticate with your WatIAM userid and password to verify your identity. You will then be able to set your CS student computing password, which can be the same password that you use for WatIAM or a different one.
You may wish to create a folder on the CS Jupyter hub to hold all of your CS 431/631 work, as you did on the Compute Canada hub. For this assignment, you will need to upload the following files to your working folder on the hub:
Starting with this assignment, you should use Marmoset, rather than e-mail, to submit your work. Marmoset lets you confirm that your submission was received and update it if necessary.
To submit A2, use the following steps: