2022 Cybersecurity and Privacy Institute Annual Conference Poster Session Presenters


Congratulations to our 2022 Poster Award winner, Hossam ElAtali!
TITLE:
BliMe: Verifiably Secure Outsourced Computation with Hardware-Enforced Taint Tracking


Thank you to all our presenters; it was an impressive display of academic excellence and intriguing ideas.


PRESENTER:
Alex Bie, Vikrant Singhal
Supervisor: Gautam Kamath
Computer Science
TITLE:
Private Estimation with Public Data


PRESENTER:
Rasoul Akhavan Mahdavi
Supervisor: Florian Kerschbaum
Computer Science
TITLE:
Constant-weight PIR: Single-round Keyword PIR via Constant-weight Equality Operators


PRESENTER:
Bailey Kacsmar
Supervisor: Florian Kerschbaum
Computer Science
TITLE:
Caring about Sharing: User Perceptions of Multiparty Data Sharing


PRESENTER:
Asim Waheed
Supervisor: N. Asokan
Computer Science
TITLE:
Are Graph Embeddings Effective Fingerprints for Ownership Verification?


PRESENTER:
Vasisht Duddu
Supervisor: N. Asokan
Computer Science
TITLE:
SHAPr: An Effective and Efficient Membership Privacy Risk Metric for Machine Learning


PRESENTER:
Hossam ElAtali
Supervisor: N. Asokan
Computer Science
TITLE:
BliMe: Verifiably Secure Outsourced Computation with Hardware-Enforced Taint Tracking


PRESENTER:
Parjanya Vyas
Supervisor: N. Asokan and Yousra Aafer
Computer Science
TITLE:
Learning from Apps: Access Control Recommendations for Android Framework APIs


PRESENTER:
Shubhankar Mohapatra
Supervisor: Xi He
Computer Science
TITLE:
Differentially Private Data Generation with Missing Data


PRESENTER:
Shufan Zhang
Supervisor: Xi He
Computer Science
TITLE:
Don't Be a Tattle-Tale: Preventing Leakages through Data Dependencies on Access Control Protected Data


PRESENTER:
Owura Asare
Supervisor: N. Asokan and Mei Nagappan
Computer Science
TITLE:
Is GitHub’s Copilot as Bad as Humans at Introducing Vulnerabilities in Code?


PRESENTER:
Radi Abubaker
Supervisor: Guang Gong
Electrical and Computer Engineering
TITLE:
The Welch-Gong Stream Cipher - Evolution and Breakthroughs


PRESENTER:
Alexi Orchard
Supervisor: Marcel O'Gorman
and Jennifer Boger
English & Media Studies
TITLE:
Enhancing Technologists' Ethical Awareness Through Critical Design


PRESENTER:
Wilson Wu
Kimia Mohammadi
IQC
Supervisor: Thomas Jennewein
Physics & Astronomy
TITLE:
QEYSSat - Quantum Key Distribution With a Satellite


PRESENTER:
Guiwen Luo
Supervisor: Guang Gong
Electrical and Computer Engineering
TITLE:
Optimization of Pippenger's Bucket Method towards Pairing-Based zkSNARKs


PRESENTER:
Mohammadtaghi Badakhshan
Supervisor: Guang Gong
Electrical and Computer Engineering
TITLE:
An Efficient Encoding for Polaris


PRESENTER:
Mahbod Majid
Supervisor: Gautam Kamath
Computer Science
TITLE:
Efficient Mean Estimation with Pure Differential Privacy via a Sum-of-Squares Exponential Mechanism


PRESENTER:
Vikrant Singhal
Supervisor: Gautam Kamath
Computer Science
TITLE:
A Private and Computationally-Efficient Estimator for Unbounded Gaussians


PRESENTER:
Jimmy Di
Supervisor: Gautam Kamath
Computer Science
TITLE:
Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks


PRESENTER:
Siddharth Priya
Supervisor: Arie Gurfinkel
Electrical and Computer Engineering
TITLE:
SEAURCHIN: Bounded Model Checking for Rust


PRESENTER:
Argyris Mouzakis
Supervisor: Gautam Kamath
Computer Science
TITLE:
New Lower Bounds for Private Estimation and a Generalized Fingerprinting Lemma


PRESENTER:
Danielle Thompson
Supervisor: Adam Molnar
Sociology & Legal Studies
TITLE:
BossWare in Canada: Examining Canadian Companies' Use of Employee Monitoring Applications


PRESENTER:
Alex Bie, Vikrant Singhal
Supervisor: Gautam Kamath
Computer Science
TITLE:
Private Estimation with Public Data
ABSTRACT:

We initiate the study of differentially private (DP) estimation with access to a small amount of public data. For private estimation of d-dimensional Gaussians, we assume that the public data comes from a Gaussian that may have vanishing similarity in total variation distance with the underlying Gaussian of the private data. We show that under the constraints of pure or concentrated DP, d+1 public data samples are sufficient to remove any dependence on the range parameters of the private data distribution from the private sample complexity, which is known to be otherwise necessary without public data. For separated Gaussian mixtures, we assume that the underlying public and private distributions are the same, and we consider two settings: (1) when given a dimension-independent amount of public data, the private sample complexity can be improved polynomially in terms of the number of mixture components, and any dependence on the range parameters of the distribution can be removed in the approximate DP case; (2) when given an amount of public data linear in the dimension, the private sample complexity can be made independent of range parameters even under concentrated DP, and additional improvements can be made to the overall sample complexity.


PRESENTER:
Rasoul Akhavan Mahdavi
Supervisor: Florian Kerschbaum
Computer Science
TITLE:
Constant-weight PIR: Single-round Keyword PIR via Constant-weight Equality Operators
ABSTRACT:

Equality operators are an essential building block in tasks over secure computation such as private information retrieval. In private information retrieval (PIR), a user queries a database such that the server does not learn which element is queried. In this work, we propose \emph{equality operators for constant-weight codewords}. A constant-weight code is a collection of binary codewords that share the same Hamming weight. Our proposed constant-weight equality operators have a multiplicative depth that depends only on the Hamming weight of the code, not the bit-length of the elements. In our experiments, we show how these equality operators are up to 10 times faster than existing equality operators. Furthermore, we propose PIR using the constant-weight equality operator or constant-weight PIR, which is a PIR protocol using an approach previously deemed impractical. We show that for private retrieval of large, streaming data, constant-weight PIR has a smaller communication complexity and lower runtime compared to SEALPIR and MulPIR, respectively, which are two state-of-the-art solutions for PIR. Moreover, we show how constant-weight PIR can be extended to keyword PIR. In keyword PIR, the desired element is retrieved by a unique identifier pertaining to the sought item, e.g., the name of a file. Previous solutions to keyword PIR require one or multiple rounds of communication to reduce the problem to normal PIR. We show that constant-weight PIR is the first practical single-round solution to single-server keyword PIR.
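The core trick can be illustrated in plaintext. Below is a minimal Python sketch (the helper names are hypothetical, and the real protocol operates on homomorphically encrypted bits): equality of two weight-k codewords reduces to a product of k selected bits, so the multiplicative depth under FHE scales with the Hamming weight rather than the element bit-length.

```python
from itertools import combinations

def constant_weight_encode(index, n, k):
    # Hypothetical helper: map an integer to the index-th n-bit codeword
    # of Hamming weight k (lexicographic order over weight-k words).
    word = [0] * n
    for p in list(combinations(range(n), k))[index]:
        word[p] = 1
    return word

def cw_equality(x, y):
    # Plaintext analogue of the constant-weight equality operator:
    # multiply the bits of x at the 1-positions of y. Since both words
    # have weight k, the product is 1 iff x == y. Under FHE this is a
    # product of k ciphertext bits, i.e. multiplicative depth ~log2(k),
    # independent of the bit-length of the encoded elements.
    result = 1
    for xi, yi in zip(x, y):
        if yi:
            result *= xi
    return result

a = constant_weight_encode(3, n=6, k=2)
b = constant_weight_encode(3, n=6, k=2)
c = constant_weight_encode(5, n=6, k=2)
assert cw_equality(a, b) == 1 and cw_equality(a, c) == 0
```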


PRESENTER:
Bailey Kacsmar
Supervisor: Florian Kerschbaum
Computer Science
TITLE:
Caring about Sharing: User Perceptions of Multiparty Data Sharing
ABSTRACT:

Data sharing between companies is typically regarded as one-size-fits-all in practice and in research.
For instance, the main source of information available to users about how a company shares their data is privacy policies. Privacy policies use ambiguous terms such as `third parties' and `partners' with regard to who data is shared with. In the real world, data sharing has more nuance than these over-arching terms capture. We investigate whether users perceive different data sharing scenarios differently through an online survey with scenarios that describe specific types of multiparty data sharing practices. We determine users' perceptions when explicitly presented with how their data is shared, who it is shared with, and why. We show that users have preferences and that variations in acceptability exist which depend on the nature of the data sharing collaboration. That users care about sharing necessitates more transparent sharing practices and regulations.


PRESENTER:
Asim Waheed
Supervisor: N. Asokan
Computer Science
TITLE:
Are Graph Embeddings Effective Fingerprints for Ownership Verification?
ABSTRACT:

Graph neural networks (GNNs) have emerged as a state-of-the-art approach to model and draw inferences from large-scale graph-structured data in application settings such as social networks. Prior work has shown that GNNs can be extracted via their inference APIs (just like other models trained on non-graph data). Furthermore, graph datasets can be misused by unauthorized parties. One approach to deter extraction or misuse is ownership verification via fingerprinting, for both models and datasets, which has been explored previously for non-graph domains.
GNNs learn an embedding for each graph node that encodes both the node features and the graph structure around that node which makes embeddings unique to a specific input graph. This begs the question: can embeddings be used as a fingerprinting mechanism for verifying the ownership of either the graph data or GNN models trained using it?
Using six benchmark datasets and three state-of-the-art GNN architectures, we analyze the distinguishability of the embeddings using their corresponding 2D t-SNE projections. We find that two models are distinguishable regardless of whether they use the same architecture or were trained on the same dataset. This indicates that embeddings are potentially not an effective technique for dataset ownership verification, but may still be useful for model ownership verification. We extend this to the model extraction attack setting, where an adversary trains a local surrogate model with similar functionality to a deployed victim model. We find that t-SNE plots cannot distinguish between a surrogate model and the victim model, but do distinguish independent models (trained independently from scratch on the same or a different dataset). We develop a fingerprinting mechanism that thresholds the distance between the embeddings of the victim model and a suspect model on an input verification graph to identify the suspect as stolen or independent. We validate our approach against different model extraction attacks on six benchmark datasets.
Key Words: Graph Machine Learning, Ownership Verification, Fingerprinting, Model Extraction Attacks.
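The thresholding idea in the abstract can be sketched as follows, with synthetic arrays standing in for real GNN embeddings; the threshold value and the use of mean Euclidean distance are illustrative assumptions, not the paper's calibrated procedure.

```python
import numpy as np

def fingerprint_verdict(victim_emb, suspect_emb, threshold):
    # Compare node embeddings produced by two models on the same
    # verification graph; flag the suspect as a surrogate ("stolen")
    # when the mean per-node distance falls below the threshold.
    dist = np.linalg.norm(victim_emb - suspect_emb, axis=1).mean()
    return "stolen (surrogate)" if dist < threshold else "independent"

rng = np.random.default_rng(0)
victim = rng.normal(size=(100, 16))                          # victim's embeddings
surrogate = victim + rng.normal(scale=0.05, size=(100, 16))  # near-copy of victim
independent = rng.normal(size=(100, 16))                     # trained from scratch

print(fingerprint_verdict(victim, surrogate, threshold=1.0))
print(fingerprint_verdict(victim, independent, threshold=1.0))
```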


PRESENTER:
Vasisht Duddu
Supervisor: N. Asokan
Computer Science
TITLE:
SHAPr: An Effective and Efficient Membership Privacy Risk Metric for Machine Learning
ABSTRACT:

Data used to train machine learning (ML) models can be sensitive. Membership inference attacks (MIAs), attempting to determine whether a particular data record was used to train an ML model, risk violating membership privacy. ML model builders need a principled metric that (a) quantifies the membership privacy risk of individual training data records, (b) can be computed independently of specific MIAs, (c) assesses susceptibility to different MIAs, (d) can be used for different applications, and (e) is efficient. No prior membership privacy risk metric simultaneously meets all these requirements.
We present SHAPr, a membership privacy metric based on Shapley values, a leave-one-out (LOO) technique originally intended to measure the contribution of a training data record to model utility. We conjecture that contribution to model utility can act as a proxy for memorization, and hence represent membership privacy risk.
Using ten benchmark datasets, we show that SHAPr is indeed effective in estimating the susceptibility of training data records to MIAs. We also show that, unlike prior work, SHAPr is significantly better at estimating susceptibility to newer, more effective MIAs. SHAPr is versatile: it can be used for estimating the vulnerability of different subgroups to MIAs, and it inherits other applications of Shapley values (e.g., data valuation). We apply SHAPr to evaluate the efficacy of several defences against MIAs: adding noise to a subset of training data records, removing high-risk training data records, and using regularization. We show that SHAPr has an acceptable computational cost (compared to naive LOO), varying from a few minutes for the smallest dataset to ~92 minutes for the largest dataset.
Key Words: Data Privacy, Membership Inference Attack, Privacy Metric, Machine Learning
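As a toy illustration of the contribution-as-proxy conjecture, the sketch below computes leave-one-out contributions for a trivial "model" (the training mean, scored by negative squared error on a test point). SHAPr itself uses Shapley values over a real ML model, so every name and modelling choice here is illustrative.

```python
def loo_scores(train, test_point):
    # Leave-one-out sketch of the intuition behind SHAPr: a record's
    # contribution to model utility serves as a proxy for how much the
    # model depends on (memorizes) it. Shapley values average such
    # contributions over all subsets; plain LOO is the simplest case.
    fit = lambda data: sum(data) / len(data)        # toy "model": the mean
    utility = lambda model: -(model - test_point) ** 2
    base = utility(fit(train))
    return [base - utility(fit(train[:i] + train[i + 1:]))
            for i in range(len(train))]

scores = loo_scores([1.0, 1.1, 0.9, 5.0], test_point=1.0)
print(scores)  # the in-cluster records help utility; the outlier hurts it
```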


PRESENTER:
Hossam ElAtali
Supervisor: N. Asokan
Computer Science
TITLE:
BliMe: Verifiably Secure Outsourced Computation with Hardware-Enforced Taint Tracking
ABSTRACT:

Outsourced computation has become ubiquitous, but introduces privacy and confidentiality concerns because the users’ sensitive data must be sent to the service provider for processing, where it can be mishandled or misused. Even if the service provider is trusted, other clients running on the same server can access sensitive data using run-time attacks and side-channel leakage. These are common concerns for clients of ML inference services that must send their sensitive data to the server for processing. Existing solutions do not solve this problem; fully homomorphic encryption suffers from prohibitive overhead, and trusted execution environments (TEEs) are vulnerable to run-time attacks and side-channel leakage. We develop a hardware-assisted outsourced computation framework that can preserve data confidentiality even in the presence of software vulnerabilities or side channels. We provide a verifiably secure fixed-function hardware security module (HSM) for data import and export, and develop hardware processor extensions that perform taint-tracking of client data to enforce a security policy that prevents its misuse. Any computation that leaks sensitive data to a visible state, through side-channels or otherwise, is forbidden. To ease deployment, we also develop compiler extensions that take existing source code and produce binaries that satisfy this security policy and so can run on this modified hardware.
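The taint-tracking policy can be sketched in software; BliMe enforces it in hardware, so the Python model below is purely illustrative (the class and method names are invented for the sketch).

```python
class Tainted:
    """Toy software model of the taint-tracking policy: client data is
    tainted on import, arithmetic propagates taint, and using a tainted
    value where it becomes visible (e.g., a branch condition or memory
    address) is forbidden."""
    def __init__(self, value, tainted=True):
        self.value, self.tainted = value, tainted

    def __add__(self, other):
        # Taint propagates: the result is tainted if either operand is.
        return Tainted(self.value + other.value, self.tainted or other.tainted)

    def branch_on(self):
        # Control flow on tainted data could leak it through side channels.
        if self.tainted:
            raise RuntimeError("policy violation: branch on tainted data")
        return bool(self.value)

secret = Tainted(42)                  # imported client data
public = Tainted(1, tainted=False)    # server-side constant
total = secret + public               # taint propagates through computation
assert total.tainted
try:
    total.branch_on()                 # forbidden: would leak via control flow
except RuntimeError as err:
    print(err)
```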


PRESENTER:
Parjanya Vyas
Supervisor: N. Asokan and Yousra Aafer
Computer Science
TITLE:
Learning from Apps: Access Control Recommendations for Android Framework APIs
ABSTRACT:

Access controls in Android have been extensively studied to identify sensitive resources and to find inconsistencies among the access controls present at various layers of the Android OS. We observe that the application (app) developers of the pre-built system apps shipped with various Android images protect sensitive operations (service API calls that access or modify sensitive data) using varied methods such as direct permission checks, UI blocks, and user interactions like AlertDialogs and warning messages. The contextual and semantic information available to app developers enables them to gain a clear perspective on the sensitivity level of the operations they protect. In this work, we aim to infer such access control information present within pre-built system applications. We utilize a number of 'hints', such as natural language texts, explicit protection of app components using permissions, and types of UI elements, to infer whether an app, directly or indirectly, imposes access control on a system API invocation. We then compare these learned access controls with the actual access controls present within the Android framework to detect inconsistencies. Through this process, we aim to find inconsistencies in the access controls of Android framework APIs and demonstrate how such inconsistencies can result in exploitable vulnerabilities.


PRESENTER:
Shubhankar Mohapatra
Supervisor: Xi He
Computer Science
TITLE:
Differentially Private Data Generation with Missing Data
ABSTRACT:

Differentially private synthetic data generation is an important and well-studied problem. In this work, we consider the problem when the data also has missing values. Our study covers three types of missingness in the data and evaluates existing works with missing data as input against their no-missing baselines. We consider three strategies an analyst may choose to treat missingness: imputation, complete row, and adaptive recourse. We theoretically show that standard imputation techniques are expensive if done differentially privately and are not a viable option. In our experiments, we evaluate five prior works on two popular tabular datasets to show that the performance of prior algorithms degrades in the complete-case setting. Furthermore, we extend two existing approaches (PrivBayes and Kamino) to handle missing data and show that our enhanced methods are sometimes even capable of matching the no-missing baseline.


PRESENTER:
Shufan Zhang
Supervisor: Xi He
Computer Science
TITLE:
Don't Be a Tattle-Tale: Preventing Leakages through Data Dependencies on Access Control Protected Data
ABSTRACT:

We study the problem of answering queries when (part of) the data may be sensitive and should not be leaked to the querier. Simply restricting the computation to the non-sensitive part of the data may leak sensitive data through inference based on data dependencies. While inference control from data dependencies during query processing has been studied in the literature, existing solutions either detect and deny queries causing leakage, or use a weak security model that only protects against exact reconstruction of the sensitive data. In this paper, we adopt a stronger security model based on full deniability that prevents any information about sensitive data from being inferred from query answers. We identify conditions under which full deniability can be achieved and develop an efficient algorithm that minimally hides non-sensitive cells during query processing to achieve full deniability. We experimentally show that our approach is practical and scales to an increasing proportion of sensitive data as well as to increasing database size.


PRESENTER:
Owura Asare
Supervisor: N. Asokan and Mei Nagappan
Computer Science
TITLE:
Is GitHub’s Copilot as Bad as Humans at Introducing Vulnerabilities in Code?
ABSTRACT:

Several advances in deep learning have been successfully applied to the software development process. Of recent interest is the use of neural language models to build tools that can assist in writing code. There is a growing body of work to evaluate these tools and their underlying language models. We aim to contribute to this line of research via a comparative empirical analysis of these tools and language models from a security perspective. We use CGT (Code Generation Tool) to refer to language models as well as other tools, such as Copilot, that are built with language models.
The aim of this study is to compare the performance of one CGT, Copilot, with the performance of human developers. Specifically, we investigate whether Copilot is just as likely to introduce the same software vulnerabilities as human developers.
We use the Big-Vul dataset proposed by Fan et al., a dataset of vulnerabilities introduced by human developers. For each entry in the dataset, we recreate the scenario before the bug was introduced and allow Copilot to generate a completion. The completions are manually inspected by three independent coders in order to be classified as (1) containing the same vulnerability (introduced by the human), (2) containing a fix for the vulnerability, or (3) other. The "other" category is used as a catch-all for scenarios that are out of scope for this project. We find several scenarios where Copilot is able to generate code free of vulnerabilities previously introduced by human developers, indicating some ability of Copilot to outperform developers (from a security perspective) given the right context.


PRESENTER:
Radi Abubaker
Supervisor: Guang Gong
Electrical and Computer Engineering
TITLE:
The Welch-Gong Stream Cipher - Evolution and Breakthroughs
ABSTRACT:

In this work we present the rich and long history of the Welch-Gong (WG) stream cipher family, which ultimately led to the authenticated encryption scheme WAGE, a Round 2 NIST LWC candidate. The WG stream cipher is a bit-oriented stream cipher which generates a keystream with proven randomness and cryptographic properties. It is a synchronous stream cipher based on applying Welch-Gong (WG) transformations to an m-sequence. Theoretical foundations for the Welch-Gong transformations were laid in the 1990s and early 2000s. A WG stream cipher was first proposed by Nawaz and Gong in 2005, and the profile 2 candidate WG-29 reached phase 2 of the eSTREAM competition. WG-m stream ciphers are composed of a linear feedback shift register (LFSR) of degree n and a d-decimated WG transformation, both defined over the same extension field GF(2^m). The security of a (decimated) WG cipher depends on the length of the LFSR and the cryptographic strength of the WG transformation used in the cipher. In order to achieve the highest security against existing generic passive attacks, such as algebraic attacks, DFT attacks, and distinguishing attacks, the cryptographic and randomness properties for 7 ≤ m ≤ 16 were studied extensively, resulting in criteria for the optimal selection of the parameters m, n, and d. In the past decade, the lightweight variants WG-5, WG-7, and WG-8, suitable for constrained environments such as RFID, were proposed and extensively studied. The bigger instance, WG-16, was designed for use in confidentiality and integrity algorithms in mobile communications, such as 4G-LTE networks. In 2017, we answered the NIST LWC call for submissions and designed a new lightweight authenticated cipher, WAGE. The WAGE permutation is based on a 37-stage Galois NLFSR over GF(2^7) with decimated WG permutations and newly designed 7-bit S-boxes.
We use the WAGE permutation in the unified sponge-duplex mode to achieve authenticated encryption functionality with 128-bit security. Our security analysis shows that WAGE is resistant to diffusion, algebraic, differential, linear, and meet-in-the-middle distinguishers. As a feature, a simple tweak in the control circuit of WAGE enables an additional pseudorandom bit generator (PRBG) with proven randomness properties. In the future we will expand WAGE into a family of authenticated encryption schemes, WAGE-m, suitable for environments with high computing power.
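The filter-generator structure described above (an LFSR feeding a nonlinear transformation) can be sketched with a toy register; the taps and filter function below are stand-ins for illustration, not actual WG parameters.

```python
def lfsr_filter_keystream(state, taps, filter_fn, nbits):
    # Toy filter generator in the spirit of WG ciphers: a binary LFSR
    # supplies the underlying sequence and a nonlinear filter (a
    # stand-in for the WG transformation) produces each keystream bit.
    state = list(state)
    out = []
    for _ in range(nbits):
        out.append(filter_fn(state))
        fb = 0
        for t in taps:          # linear feedback: XOR of tapped stages
            fb ^= state[t]
        state = [fb] + state[:-1]
    return out

# A degree-5 toy register with a simple nonlinear filter function.
bits = lfsr_filter_keystream([1, 0, 0, 1, 0], taps=[4, 1],
                             filter_fn=lambda s: s[0] ^ (s[2] & s[4]),
                             nbits=8)
print(bits)
```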


PRESENTER:
Alexi Orchard
Supervisor: Marcel O'Gorman
and Jennifer Boger
English & Media Studies
TITLE:
Enhancing Technologists' Ethical Awareness Through Critical Design
ABSTRACT:

In light of significant issues in the technology industry, such as algorithms that worsen racial biases, the spread of online misinformation, and the expansion of mass surveillance, it is increasingly important to teach the ethical implications of tech development. As such, it is crucial to examine how educational institutions respond in preparing future graduates to navigate common ethical decisions, assumptions, and pitfalls they may encounter in the workforce. My doctoral research investigates the potential for cross-disciplinary, sociotechnical pedagogical approaches to transform and reinforce the importance of ethics and responsible design in tech development. One way I do this is by designing, facilitating, and surveying the participants of various workshops, modules, and video tutorials related to ethics and responsible design in undergraduate engineering courses at the University of Waterloo. Critical Design, the main methodology of these curricular interventions, is an arts-based research practice that “challenges hegemonies and dominant ideologies in contexts of science and technology, social inequality, and unchallenged disciplinary norms” (Malpass, 2017). This ongoing research hopes to demonstrate how sociotechnical pedagogical approaches can help prepare future technologists to critically analyze and mitigate the complex social, environmental, and economic challenges in the context of their work. 


PRESENTER:
Wilson Wu
Kimia Mohammadi
IQC
Supervisor: Thomas Jennewein
Physics & Astronomy
TITLE:
QEYSSat - Quantum Key Distribution With a Satellite
ABSTRACT:

The development of a universal quantum computer will render insecure the RSA public-key cryptosystem, which currently underpins much of global communication. Quantum key distribution (QKD) provides a security guarantee grounded in the laws of quantum mechanics instead of an assumption of computational hardness. The Quantum Encryption and Science Satellite (QEYSSat) is a low-Earth-orbit satellite, with launch anticipated in 2024, that will implement satellite uplink and downlink QKD. The QEYSSat project is funded by the Canadian Space Agency, with the science of the mission led by the Institute for Quantum Computing (IQC) at the University of Waterloo. We are designing and building a transceiver telescope and two photon sources, a weak coherent pulse source and an entangled pair source, to perform decoy-state BB84 from our quantum optical ground station at IQC. This will allow the generation of secret keys between distant partners across Canada, ensuring secure communication in the post-quantum era and paving the way towards a global quantum internet.
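The basis-sifting step of BB84 can be sketched classically; the simulation below assumes a noiseless channel with no eavesdropper and omits decoy states, error correction, and privacy amplification, so all parameters are illustrative.

```python
import random

def bb84_sift(n, seed=0):
    # Toy BB84 sifting: Alice sends random bits in random bases, Bob
    # measures in random bases, and both keep only the rounds where
    # their bases match. On a noiseless channel the sifted keys agree.
    rng = random.Random(seed)
    alice_bits  = [rng.randint(0, 1) for _ in range(n)]
    alice_bases = [rng.randint(0, 1) for _ in range(n)]
    bob_bases   = [rng.randint(0, 1) for _ in range(n)]
    # Matching basis -> Bob reads the bit; otherwise his result is random.
    bob_bits = [b if ab == bb else rng.randint(0, 1)
                for b, ab, bb in zip(alice_bits, alice_bases, bob_bases)]
    keep = [i for i in range(n) if alice_bases[i] == bob_bases[i]]
    return [alice_bits[i] for i in keep], [bob_bits[i] for i in keep]

ka, kb = bb84_sift(64)
assert ka == kb   # noiseless channel: sifted keys agree
```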


PRESENTER:
Guiwen Luo
Supervisor: Guang Gong
Electrical and Computer Engineering
TITLE:
Optimization of Pippenger's Bucket Method towards Pairing-Based zkSNARKs
ABSTRACT:

Pairing-based trusted-setup zkSNARKs usually follow the norm that the prover generates the proof by computing an $n$-scalar multiplication over fixed points, where all the fixed elliptic curve points are welded into the common reference string, and the verifier checks the proof by computing another multi-scalar multiplication as well as several bilinear pairings. Pippenger's bucket method is the fastest algorithm to compute an $n$-scalar multiplication when $n$ is large. In order to speed up $n$-scalar multiplication over fixed points with the help of precomputation, we propose a new construction of a bucket set that can be utilized in the context of Pippenger's bucket method. The proposed construction shrinks the bucket size down to $q/(2 \ell)$ for prime radix $q$ and small integer $\ell$, thus permitting an algorithm that computes an $n$-scalar multiplication in an elliptic curve group of order $r$ using at most $n h + q/(2 \ell)$ additions, with the help of $\ell n h$ precomputed points, where $h = \lceil \log_q (r) \rceil$.
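For reference, the plain Pippenger bucket method (without the proposed precomputation-based bucket-set construction) can be sketched over the additive group of integers, which stands in here for an elliptic-curve group since only group additions are used.

```python
def pippenger_msm(scalars, points, c=4):
    # Pippenger's bucket method: split scalars into c-bit windows;
    # within each window, sum points into buckets keyed by window
    # value, then combine buckets with the running-sum trick so that
    # bucket j is counted j times using only group additions.
    nwin = (max(scalars).bit_length() + c - 1) // c
    total = 0
    for w in reversed(range(nwin)):
        for _ in range(c):
            total += total                 # "double" c times between windows
        buckets = [0] * (1 << c)
        for s, p in zip(scalars, points):
            j = (s >> (w * c)) & ((1 << c) - 1)
            buckets[j] += p                # one group addition per point
        running, window_sum = 0, 0
        for j in range(len(buckets) - 1, 0, -1):
            running += buckets[j]          # running sum counts bucket j exactly j times
            window_sum += running
        total += window_sum
    return total

scalars, points = [5, 11, 3], [7, 2, 9]
assert pippenger_msm(scalars, points) == sum(s * p for s, p in zip(scalars, points))
```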


PRESENTER:
Mohammadtaghi Badakhshan
Supervisor: Guang Gong
Electrical and Computer Engineering
TITLE:
An Efficient Encoding for Polaris
ABSTRACT:

Polaris (Fu et al. 2022) is a post-quantum secure zkSNARK scheme for R1CS. Polaris exploits the sparseness of R1CS instances for efficient encoding, and employs the GKR protocol to reduce the computation on the verifier's side. In the process of constructing a zkSNARK for R1CS instances, Polaris uses Lagrange interpolation, evaluation, and polynomial multiplication. In this progress report, we use the additive Fast Fourier Transform (FFT) and its inverse (IFFT) to perform polynomial arithmetic. Our results show that employing the FFT can considerably reduce the cost of polynomial computation in Polaris. This approach is also applicable to any other zkSNARK scheme that uses affine spaces as Lagrange interpolation sets.

PRESENTER:
Mahbod Majid
Supervisor: Gautam Kamath
Computer Science
TITLE:
Efficient Mean Estimation with Pure Differential Privacy via a Sum-of-Squares Exponential Mechanism
ABSTRACT:

We give the first polynomial-time algorithm to estimate the mean of a d-variate probability distribution from O(d) independent samples (up to logarithmic factors) subject to pure differential privacy.
Our main technique is a new approach to using the powerful Sum-of-Squares (SoS) method to design differentially private algorithms. 'Proofs to algorithms' is a key theme in numerous recent works in high-dimensional algorithmic statistics: estimators which apparently require exponential running time, but whose analysis can be captured by low-degree Sum-of-Squares proofs, can be automatically turned into polynomial-time algorithms with the same provable guarantees. We demonstrate a similar 'proofs to private algorithms' phenomenon: instances of the workhorse exponential mechanism which apparently require exponential time, but which can be analyzed with low-degree SoS proofs, can be automatically turned into polynomial-time differentially private algorithms. We prove a meta-theorem capturing this phenomenon, which we expect to be of broad use in private algorithm design.


PRESENTER:
Vikrant Singhal
Supervisor: Gautam Kamath
Computer Science
TITLE:
A Private and Computationally-Efficient Estimator for Unbounded Gaussians
ABSTRACT:

We give the first polynomial-time, polynomial-sample, differentially private estimator for the mean and covariance of an arbitrary Gaussian distribution $\mathcal{N}(\mu,\Sigma)$ in $\mathbb{R}^d$. All previous estimators are either nonconstructive, with unbounded running time, or require the user to specify a priori bounds on the parameters $\mu$ and $\Sigma$. The primary new technical tool in our algorithm is a new differentially private preconditioner that takes samples from an arbitrary Gaussian $\mathcal{N}(0,\Sigma)$ and returns a matrix $A$ such that $A \Sigma A^T$ has constant condition number.
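A non-private sketch of what the preconditioner achieves is below; the paper's actual contribution is computing such a matrix A under differential privacy, which this toy version omits entirely.

```python
import numpy as np

def preconditioner(samples):
    # Non-private sketch: from samples of N(0, Sigma), return A with
    # A @ Sigma_hat @ A.T ~ I, i.e. constant condition number.
    # Here A is simply the inverse square root of the empirical covariance.
    sigma_hat = np.cov(samples, rowvar=False)
    vals, vecs = np.linalg.eigh(sigma_hat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T   # Sigma_hat^{-1/2}

rng = np.random.default_rng(1)
sigma = np.diag([100.0, 1.0, 0.01])        # badly conditioned covariance
x = rng.multivariate_normal(np.zeros(3), sigma, size=5000)
A = preconditioner(x)
cond = np.linalg.cond(A @ np.cov(x, rowvar=False) @ A.T)
assert cond < 1.5                           # conditioning restored
```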


PRESENTER:
Jimmy Di
Supervisor: Gautam Kamath
Computer Science
TITLE:
Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks
ABSTRACT:

We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced. An adversary first adds a few points to the training dataset such that the impact on the model's predictions is minimal. The adversary subsequently triggers a request to remove a subset of the introduced points at which point the attack is unleashed and the model's predictions are negatively affected. In particular, we consider clean-label targeted attacks (in which the goal is to cause the model to misclassify a specific test point) on datasets including CIFAR-10,  Imagenette, and Imagewoof. 


PRESENTER:
Siddharth Priya
Supervisor: Arie Gurfinkel
Electrical and Computer Engineering
TITLE:
SEAURCHIN: Bounded Model Checking for Rust
ABSTRACT:

Rust is a systems programming language with similar applications as C and C++. The Rust compiler ensures that programs written in the "safe" subset of Rust possess certain safety properties, e.g., memory safety. At the same time, practical Rust programs contain "unsafe" code for performance or legacy reasons, for example when delegating to existing C libraries. The Rust compiler, rustc, does not type check unsafe code. Thus, Rust programs using unsafe code may contain memory safety issues in addition to the usual functional correctness problems. These errors are not restricted to the syntactically unsafe parts of the program but can manifest even in safe code that calls unsafe code. Bounded model checking is an effective way to enable whole-program analysis of Rust programs. Since Rust compiles to LLVM IR and SEAHORN has a bounded model checker (SEABMC) for LLVM IR, it becomes feasible to use SEAHORN to verify Rust programs.
We build a pre-processing pipeline for SEAHORN called SEAURCHIN and use it to ascertain conditions under which two standard library components, String and RefCell, do not panic at runtime. Interestingly, we find an invariant missing from the RefCell documentation whose absence leads to an overflow bug. Currently, SEAURCHIN does not exploit ownership semantics (borrowed, owned, shared) from Rust to scale verification; some ideas for doing so are discussed. We conclude by comparing our approach to that taken by other state-of-the-art verification tools for Rust.


PRESENTER:
Argyris Mouzakis
Supervisor: Gautam Kamath
Computer Science
TITLE:
New Lower Bounds for Private Estimation and a Generalized Fingerprinting Lemma
ABSTRACT:

We prove new lower bounds for statistical estimation tasks under the constraint of $(\varepsilon, \delta)$-differential privacy. First, we provide tight lower bounds for private covariance estimation of Gaussian distributions. We show that estimating the covariance matrix in Frobenius norm requires $\Omega(d^2)$ samples, and in spectral norm requires $\Omega(d^{\frac{3}{2}})$ samples, both matching upper bounds up to logarithmic factors. We prove these bounds via our main technical contribution, a broad generalization of the fingerprinting method to exponential families. Additionally, using the private Assouad method of Acharya, Sun, and Zhang, we show a tight $\Omega(\frac{d}{\alpha^2 \varepsilon})$ lower bound for estimating the mean of a distribution with bounded covariance to $\alpha$-error in $\ell_2$-distance. Prior known lower bounds for all these problems were either polynomially weaker or held under the stricter condition of $(\varepsilon,0)$-differential privacy.


PRESENTER:
Danielle Thompson
Supervisor: Adam Molnar
Sociology & Legal Studies
TITLE:
BossWare in Canada: Examining Canadian Companies' Use of Employee Monitoring Applications
ABSTRACT:

Employee monitoring apps (i.e., 'bossware') have become increasingly affordable and accessible on the open market. Apps such as Interguard and Teramind provide companies with a powerful degree of surveillance over workers, including keystroke logging, location and browser monitoring, and even webcam usage. However, as homes have become offices, and laptops and smartphones are used for business, school, and entertainment, the increasing surveillance of 'remote work' blurs the boundaries between work and personal spaces.
Drawing from an interdisciplinary study on the proliferation of employee monitoring applications in a nascent era of 'remote work,' this paper presents findings from a survey examining Canadian companies' adoption of employee monitoring applications. The findings identify the economic sectors in which bossware is currently most prevalent, the rationalities that underpin the ongoing use of employee monitoring applications in Canada (such as Covid-19, 'productivity/efficiency,' 'cybersecurity,' and 'health/wellness'), and the features of the bossware apps most sought after by Canadian companies (such as time tracking, website tracking, and keystroke logging). We conclude with a critical analysis of how current patterns of employee monitoring in Canada reflect dominant neoliberal imaginaries about the anticipated benefits of surveillance, remote work, and digital labour.
