Please note: This master’s thesis presentation will take place online.
Kritika Iyer, Master’s candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Ian Goldberg
As the amount of online information accessible to users keeps increasing, we have come to rely more on services such as Netflix, Amazon, and eBay that successfully recommend choices to users. The main goal of such services is to present the user with a more personalized set of choices or recommendations. The growing importance of the recommendation systems behind these services is attested by the efforts the academic community is making to improve their performance. The quality of a recommendation system is primarily determined by the accuracy of the results it provides to users. To achieve high accuracy, these systems rely on finding similarities between different users based on various features, such as the ratings the users provide for items. Recommendation systems use different techniques, often harvesting private user information, to detect these similarities. Therefore, providing better recommendations frequently comes at the cost of user privacy and at the risk of exposing the user's preferences. Owing to growing concerns about this risk, researchers have started to investigate recommendation solutions with better assurances of privacy.
There is a growing body of work on making recommendation systems more protective of user privacy. Existing solutions use a variety of methods, such as randomizing the dataset, anonymizing user identities, aggregating data, obfuscating user data, relying on a trusted third party, and applying cryptographic techniques. However, we do not yet have a solution that not only provides privacy guarantees but is also practical and efficient and gives recommendations with high accuracy.
Our goal in this thesis is to implement a solution that offers strong guarantees of user privacy, is practical and efficient, scales well to large datasets, and provides users with accurate recommendations. A common trend in the solutions mentioned above is to model a system around one or more trusted third parties. All critical operations, such as key generation or user authentication, are delegated to these trusted third parties and combined with a threat model that assumes they do not behave maliciously. We aim to implement a system that is independent of any such trusted third party. We also want a system in which collusion among servers is ineffective unless the number of corrupt servers exceeds a threshold value. Finally, we want all computations to be independent of the availability of the participants, so that users get recommendations even if other participants are offline. For our use case, we consider a scenario in which users would like to get movie recommendations based on the ratings they have provided for other movies. To evaluate our system, we use the real-world, publicly available “MovieLens” dataset. Our system consists of the following entities: a set of users or clients, a distributed set of servers, and a public bulletin board. Our scheme focuses primarily on keeping both the user's preferences and her recommendations private: no one other than the user herself has access to this data.
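The abstract does not say which mechanism the thesis uses to obtain the threshold property. As an illustration only, the sketch below uses Shamir secret sharing, a standard technique with that property: a rating is split across the servers so that any t colluding servers learn nothing, while t + 1 shares reconstruct it. The names `split`, `reconstruct`, and `PRIME`, and all parameter values, are hypothetical and are not taken from the thesis.

```python
import secrets

# Assumption for illustration: a prime field large enough to hold any rating.
PRIME = 2**61 - 1  # a Mersenne prime


def split(secret, n_servers, threshold):
    """Split `secret` into one share per server.

    Any `threshold` shares reveal nothing about the secret;
    any `threshold + 1` shares are enough to reconstruct it.
    """
    # Random polynomial of degree `threshold` whose constant term is the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold)]
    shares = []
    for x in range(1, n_servers + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse of den via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


# A user shares a movie rating of 4 among 5 servers with threshold 2.
rating_shares = split(4, n_servers=5, threshold=2)
recovered = reconstruct(rating_shares[:3])  # any 3 shares suffice

# Servers can add shares locally without seeing the ratings themselves.
a, b = split(3, 5, 2), split(5, 5, 2)
summed = [(x, (ya + yb) % PRIME) for (x, ya), (_, yb) in zip(a, b)]
total = reconstruct(summed[:3])
```

Because the shares are additively homomorphic, servers in a scheme of this kind could aggregate ratings without any single server, or any coalition of at most t servers, seeing an individual rating. Whether the thesis relies on this particular construction is not stated in the abstract.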