First, to expose students to both large-scale and unstructured data, we propose to use Web-scale social network graphs, which naturally motivate students to use cloud platforms. Second, the next project invites students to implement algorithms and data structures for joining graphs with database tables (containing Twitter account information, profiles, and names), to show how Azure SQL can be combined with their service. Lastly, to enable friend recommendation by mining the number of mutual friends, data and computation need to be distributed over multiple machines. The distribution algorithm that students design affects the accuracy and effectiveness of the recommendations, which can motivate students to compete for the best design in class.
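As a rough illustration of the friend-recommendation step, the following minimal sketch counts mutual friends over a graph split across several simulated workers. It assumes an undirected graph stored as a dictionary from integer user id to a set of friend ids. The function names (`partition_users`, `count_pairs_local`, `recommend`) and the modulo partitioning rule are hypothetical choices for this example, not part of the course code skeleton.

```python
from collections import Counter
from itertools import combinations


def partition_users(adjacency, num_workers):
    """Assign each user and its friend set to a simulated worker (user id modulo)."""
    parts = [dict() for _ in range(num_workers)]
    for user, friends in adjacency.items():
        parts[user % num_workers][user] = friends
    return parts


def count_pairs_local(part):
    """Map step: every pair of a user's friends has that user as a mutual friend."""
    counts = Counter()
    for friends in part.values():
        for a, b in combinations(sorted(friends), 2):
            counts[(a, b)] += 1
    return counts


def recommend(adjacency, num_workers=2, top_k=3):
    """Reduce step: merge per-worker counts, then rank non-friends by mutual friends."""
    total = Counter()
    for part in partition_users(adjacency, num_workers):
        total.update(count_pairs_local(part))  # Counter.update adds counts

    candidates = {user: [] for user in adjacency}
    for (a, b), n in total.items():
        if b in adjacency.get(a, set()):
            continue  # already friends, nothing to recommend
        candidates.setdefault(a, []).append((n, b))
        candidates.setdefault(b, []).append((n, a))

    # Highest mutual-friend count first; ties broken by smaller user id.
    return {
        user: [v for _, v in sorted(c, key=lambda t: (-t[0], t[1]))[:top_k]]
        for user, c in candidates.items()
    }
```

In this sketch every worker sees the complete friend set of each user assigned to it, so the merged counts are exact regardless of `num_workers`. A design that splits or samples friend lists across machines would trade some accuracy for lower memory or communication cost, which is the kind of design choice the project leaves open.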
The key strengths of this project are as follows:
- Design is open-ended, such that students can compete with their designs.
- Difficulty can be controlled by adjusting the amount of initial code provided.
- Because the project uses structured data hosted in Azure SQL, textbook material on SQL and relational databases can be taught alongside more recent issues of efficiently supporting large-scale graphs.
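The join between graph data and account tables can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a local stand-in for Azure SQL, and the table names (`account`, `edge`), column names, and the function `named_edges` are hypothetical. A real deployment would connect to Azure SQL through an appropriate database driver instead.

```python
import sqlite3


def named_edges(edges, accounts):
    """Join graph edges with an account table to label each edge with user names.

    edges:    iterable of (src_id, dst_id) pairs
    accounts: iterable of (user_id, name) rows
    """
    conn = sqlite3.connect(":memory:")  # local stand-in for a hosted database
    conn.execute("CREATE TABLE account (user_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE edge (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)

    # Inner joins drop edges whose endpoints have no account row.
    rows = conn.execute(
        "SELECT s.name, d.name FROM edge e "
        "JOIN account s ON s.user_id = e.src "
        "JOIN account d ON d.user_id = e.dst "
        "ORDER BY s.name, d.name"
    ).fetchall()
    conn.close()
    return rows
```

Loading the edges into a relational table is only practical for small samples. At Web scale, students would more likely keep the graph in their own data structure and look up account rows in batches, which is where the choice of join algorithm starts to matter.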
The following course resources are available to the students:
- A code skeleton and test code for students, plus guiding materials and videos showing how to run the code on a local PC (video #1) and then deploy it to Azure (video #2).
- Web-scale data for students: (a) Web-scale social data and (b) tables of structured information hosted on Azure SQL.