Privacy Preservation and Verifiability for Federated Learning

Zhao, Jianxiang
Federated Learning, Secure Machine Learning
Federated learning is a distributed machine learning framework that addresses two bottlenecks of traditional machine learning, data collection and privacy leakage, by training a model on data stored across distributed clients without exposing that data. In federated learning, multiple clients collaboratively train a single global model that is improved iteratively using the clients' local data. Each client receives the global model from an aggregation server, trains it on its private data, and sends the locally trained model back to the server, which integrates the local models into the global model. Training proceeds iteration after iteration until the global model is considered well-trained. Federated learning provides basic privacy protection against outside attackers, but this does not mean that user privacy cannot be leaked. It has been shown that the local models shared with the server can leak the raw data held by users. To preserve user privacy, local models should not be shared in plaintext. However, encrypting the shared local models makes model aggregation, which is important for fast convergence of the global model, challenging for the server. In addition, it is hard to ensure that the server aggregates the local models honestly, especially in cross-silo federated learning. If the server lacks sufficient incentive to coordinate the training, the performance of federated learning cannot be guaranteed. In this thesis, we aim to prevent leakage of user privacy from the shared local models and to guarantee the correctness of the global models output by the server. Specifically, we first propose PPA-AFL, a fully asynchronous secure federated learning protocol that addresses the privacy issue in asynchronous federated aggregation. The disadvantage of PPA-AFL is that it requires two non-colluding servers and cannot provide a correctness guarantee for the global model.
Then, we design PPVA-AFL, a secure and verifiable aggregation protocol for asynchronous federated learning, which simultaneously guarantees the privacy of the local models and the correctness of the global model. In short, this thesis investigates security issues in federated learning and develops two novel schemes that enhance privacy preservation and introduce verifiability for federated learning.
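The basic training loop described in the abstract (server sends the global model, clients train locally, server integrates the local models) can be illustrated with a minimal sketch. This is not PPA-AFL or PPVA-AFL: it is a plain synchronous, plaintext weighted-averaging loop, assumed here only to show the baseline that the thesis sets out to protect. The model (a one-dimensional linear fit), the function names, and the hyperparameters are all hypothetical choices made for illustration.

```python
import random

def make_client_data(n, seed):
    """Hypothetical private dataset: n noise-free points on y = 2x + 1."""
    rng = random.Random(seed)
    return [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1.0, 1.0) for _ in range(n))]

def local_train(model, data, lr=0.1, epochs=5):
    """Client side: refine the received global model (w, b) on private data."""
    w, b = model
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2.0 * ((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(2.0 * ((w * x + b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def aggregate(local_models, sizes):
    """Server side: average local models, weighted by local dataset size.

    In this sketch the server sees every local model in plaintext, which is
    exactly the exposure the abstract identifies as a privacy risk.
    """
    total = sum(sizes)
    w = sum(m[0] * s for m, s in zip(local_models, sizes)) / total
    b = sum(m[1] * s for m, s in zip(local_models, sizes)) / total
    return (w, b)

def federated_training(client_datasets, rounds=200):
    """Repeat: broadcast global model, train locally, aggregate."""
    global_model = (0.0, 0.0)
    sizes = [len(d) for d in client_datasets]
    for _ in range(rounds):
        local_models = [local_train(global_model, d) for d in client_datasets]
        global_model = aggregate(local_models, sizes)
    return global_model

if __name__ == "__main__":
    clients = [make_client_data(n, seed) for seed, n in enumerate((10, 20, 30))]
    print(federated_training(clients))
```

Only model parameters leave each client in this loop, never the raw `(x, y)` pairs. The abstract's point is that those parameters can still reveal the underlying data, and that nothing here lets clients check that `aggregate` was computed honestly.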