Client Clustering Meets Knowledge Sharing: Enhancing Privacy and Robustness in Personalized P2P Learning
This repository contains the source code for the P4 framework and the execution scripts for its experiments. P4 (Personalized Private Peer-to-Peer) is a method that provides each client with a personalized model under differential privacy guarantees and with security against malicious clients. Our approach includes a lightweight algorithm that privately identifies similar clients and groups them in a fully decentralized manner. Once grouped, clients collaboratively train their models using differentially private knowledge distillation, maintaining high accuracy while remaining robust to poisoning attacks. The full version of the paper is available via this link.
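For intuition only, here is a deliberately simplified sketch of those two stages: form groups from per-client feature summaries, then let each group train collaboratively. Everything here (the greedy `group_by_similarity`, the toy summaries) is a hypothetical illustration, not the actual P4 implementation:

```python
# Hypothetical sketch of the two-stage flow described above; illustration only.
import numpy as np

rng = np.random.default_rng(0)

def group_by_similarity(summaries, group_size):
    """Greedily build fixed-size groups of clients with similar summaries."""
    remaining = set(range(len(summaries)))
    groups = []
    while remaining:
        seed = remaining.pop()
        # pick the group_size - 1 remaining clients closest to the seed
        dists = {j: np.linalg.norm(summaries[seed] - summaries[j]) for j in remaining}
        members = sorted(dists, key=dists.get)[: group_size - 1]
        remaining -= set(members)
        groups.append([seed] + members)
    return groups

summaries = rng.normal(size=(16, 3))  # toy per-client feature summaries
print(group_by_similarity(summaries, group_size=4))
```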
The repository is organized as follows:

- `src` contains the source code of P4 with all utils used for experiments.
- `tests` contains tests covering the group formation algorithm.
- `cluster` contains instructions on how to run experiments on the HPC cluster and provides all bash and PBS/SLURM scripts used to execute the experimental pipelines.
- `docs` contains all files related to technical documentation for the experimental study.
- `debug_handcrafted` contains pre-computed handcrafted features (useful for fast debugging).
- `experiments` contains all supplemental code for experiments:
  - `configs` contains config YAMLs for different types of experiments.
  - `visualizations` contains Jupyter notebooks with visualizations.
  - `scripts` contains Python scripts for different stages of the experimental pipeline:
    - `get_epsilon_bound.py` computes the sigma noise based on the target epsilon (Equation 12 from the paper; see the sketch after this list for intuition).
    - `generate_data.py` generates data for each client.
    - `hyper_main*.py` scripts tune hyperparameters for experiments.
    - `run_simulation.py` is the main script to simulate fully decentralized learning.
    - `args_main.py` simulates fully decentralized learning via argparse (an older version of `run_simulation.py`).
    - `epsilon_hyper_main.py` runs an ablation study with different epsilons.
    - `ablation_hyper_main.py` runs an ablation study of the impact of P4's components on the privacy-utility trade-off.
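As a rough intuition for what `get_epsilon_bound.py` does, the classic Gaussian-mechanism bound maps a target epsilon to a noise scale. The sketch below uses that textbook bound only, not necessarily Equation 12 from the paper:

```python
# Illustration only: the classic Gaussian-mechanism bound, not necessarily
# the Equation 12 computation implemented in get_epsilon_bound.py.
import math

def gaussian_sigma(epsilon, delta=1e-5, sensitivity=1.0):
    """Noise std satisfying (epsilon, delta)-DP for the Gaussian mechanism."""
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon

print(gaussian_sigma(epsilon=1.0))  # ~4.84 for delta=1e-5
```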
Create a virtual environment with Python 3.11 and install the following requirements (tested on Apple Silicon M2 and Ubuntu 22.04):
python -m venv venv
source venv/bin/activate
pip3 install --upgrade pip
pip3 install -r requirements.txt

Add MongoDB secrets (optional). You can deploy a free 500-MB MongoDB cluster on Atlas.
# Create experiments/configs/secrets.env file with database variables
DB_NAME=your_mongodb_name
CONNECTION_STRING=your_mongodb_connection_string
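To check that the secrets are picked up, here is a minimal sketch assuming `python-dotenv` and `pymongo` are installed (they may or may not already be in requirements.txt):

```python
# Hypothetical snippet: verify the MongoDB secrets are readable and the
# cluster is reachable. Assumes python-dotenv and pymongo are installed.
import os
from dotenv import load_dotenv
from pymongo import MongoClient

load_dotenv("experiments/configs/secrets.env")
client = MongoClient(os.environ["CONNECTION_STRING"])
db = client[os.environ["DB_NAME"]]
print(db.list_collection_names())  # raises if the cluster is unreachable
```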
Run P4 locally:

python3 -m experiments.scripts.run_simulation --exp_config_yaml_path=./experiments/configs/exp_config.yaml

A description of the arguments follows. Schema validation is available in this module. Configurations of all attacks and defenses are available in this config YAML.
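If you want to load and sanity-check a config programmatically, here is a hypothetical sketch with PyYAML; the repo's schema-validation module remains the authoritative check:

```python
# Hypothetical config loader; the schema-validation module is authoritative.
import yaml

with open("./experiments/configs/exp_config.yaml") as f:
    cfg = yaml.safe_load(f)

# minimal sanity check on the top-level sections shown in the example below
for section in ("common_args", "data_args", "model_args", "train_args"):
    assert section in cfg, f"missing section: {section}"
print(cfg["train_args"]["sigma_gaussian"])
```

An example configuration: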
common_args:
  exp_config_name: "p4_example" # Unique name for your experiment config
  run_nums: [1] # Run IDs to execute. BASE_SEED * run_num defines the random state for one pipeline execution
  debug_handcrafted: false # Whether to reuse handcrafted features from pre-computed files in ./debug_handcrafted/
  evaluation_step: 2 # Interval (in rounds) at which metrics are saved to the database. Useful for storage optimization
  learning: true # Run learning (data required)
  proxy_training: true # Enable proxy model training
  handcrafted: true # Enable handcrafted features
  evaluation: false # Evaluate model without testing
  secrets_path: "~/private_decentralized_learning/experiments/configs/secrets.env"
data_args:
  dataset: "CIFAR_10"
  similarity: 0.25 # Level of similarity between clients' data. E.g., alpha-based: 0.25, 0.5, or 0.75; shard-based: 2, 4, or 8
  nb_samples_per_client: 200 # Number of data points per client
  sample_ratio: 0.8 # Subsampling ratio for data points at each local update
model_args:
  model: "NN2" # Chosen model: NN2 or CNN_CIFAR. If using PCA on data, append '_PCA' to the name
  weight_decay: 0.0 # Regularization term
  local_learning_rate: 0.1 # Multiplicative factor in the learning rate for local client updates
  max_norm: 0.5 # Gradient clipping value
train_args:
  federated_algorithm: "P4" # Name of the algorithm to store in the database. Useful for visualizations
  federated_optimizer: "FedAvg" # Federated optimization method
  client_optimizer: "avg" # Client optimization method
  client_selection_method: "MAE-MW-SAMPLING" # Client clustering strategy. All strategies are available in ./src/servers/p4_server.py (define_groups method)
  dp: "Gaussian" # Differential privacy noise type
  nb_clients: 240 # Total number of clients for federated learning
  group_size: 16 # Size of client groups
  num_glob_iters: 10 # Number of communication rounds
  client_ratio: 0.5 # Subsampling ratio for clients at each communication round
  local_updates: 8 # Number of local updates per selected client
  num_sample_cs: 50 # Number of samples for client selection (based on cosine similarity)
  sigma_gaussian: 4.71 # Gaussian noise standard deviation for DP
  kl_multiplier: 0.1 # Weight of the KL divergence loss term
  cka_multiplier: 0 # Weight of the CKA loss term
  multiplier_optimizer: 0.1 # Multiplier for optimizer adjustments
  mul_private_wd: 1 # Weight decay multiplier for the local model's layers
attack_args:
  enable_attack: false
defense_args:
  enable_defense: false
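For intuition on how `max_norm` and `sigma_gaussian` typically combine in Gaussian-mechanism DP training (clip the update, then add noise), here is a minimal hypothetical sketch; the authoritative implementation lives in `./src/`:

```python
# Minimal sketch of a clipped, noised update (Gaussian mechanism).
# Illustration only; `privatize` is not the repo's actual function.
import torch

def privatize(update: torch.Tensor, max_norm: float = 0.5, sigma: float = 4.71) -> torch.Tensor:
    """Clip the update to max_norm, then add Gaussian noise scaled by sigma."""
    clipped = update * torch.clamp(max_norm / (update.norm() + 1e-12), max=1.0)
    return clipped + torch.randn_like(clipped) * sigma * max_norm

print(privatize(torch.randn(10)))
```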
The table below summarizes the existing PPFL frameworks we compare P4 with, their source papers and repositories, and links to our adapted implementations. Additionally, the source code of FedSecurity and DP-SCAFFOLD was useful for the implementation of our experimental study.

| Name | Source Paper | Source Repo | Adapted Implementation |
|---|---|---|---|
| FedAvg | Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. | GitHub | Code |
| SCAFFOLD | Maxence Noble, Aurélien Bellet, and Aymeric Dieuleveut. 2022. Differentially private federated learning on heterogeneous data. | GitHub | Code |
| ProxyFL | Shivam Kalra, Junfeng Wen, Jesse C Cresswell, Maksims Volkovs, and HR Tizhoosh. 2023. Decentralized federated learning through proxy model sharing. | GitHub | Code |
| DP-DSGT | Jasmine Bayrooti, Zhan Gao, and Amanda Prorok. 2023. Differentially Private Decentralized Deep Learning with Consensus Algorithms. | Shared by authors privately | - |
| Centralized Learning | - | - | Code |
| Local Training | - | - | Code (run with local_training = 1) |
The table below summarizes the details needed to reproduce the experimental results.
| Name | Description | Execution Scripts | Visualizations |
|---|---|---|---|
| Experiment 1: Privacy-Utility Amplification [Section 5.2 and Appendix B.0.1 in the paper]. | Comparison of P4 with existing PPFL and P2P frameworks in terms of the privacy-utility trade-off. | Config YAMLs | Figures |
| Experiment 2, RQ1: Tolerating Poisoning Attacks [Section 5.3 in the paper]. | How does P4 perform without secure aggregation? | Bash and PBS/SLURM scripts | Jupyter Notebook |
| Experiment 2, RQ2: Tolerating Poisoning Attacks [Section 5.3 in the paper]. | How does P4 without secure aggregation compare to existing defense strategies? | Bash and PBS/SLURM scripts | Jupyter Notebook |
| Experiment 2, RQ3: Tolerating Poisoning Attacks [Section 5.3 in the paper]. | What percentage of malicious clients can P4 with secure aggregation tolerate? | Bash and PBS/SLURM scripts | Jupyter Notebooks |
| Experiment 2, RQ4: Tolerating Poisoning Attacks [Section 5.3 and Appendix C.0.2 in the paper]. | How does P4 perform under poisoning attacks compared to FedAvg? | Bash and PBS/SLURM scripts (those that contain "fedavg") | Jupyter Notebook |
| Experiment 3: Ablation Study [Section 5.4 in the paper]. | Understand the effect of client selection, handcrafted features, and proxy model individually on the performance of P4. | Python script | Figures |
| Experiment 3: Scalability [Appendix D in the paper]. | Evaluate the impact of increasing the number of clients, group size, and the number of samples per client on P4's runtime and test accuracy. | Bash and PBS/SLURM scripts | Jupyter Notebooks |
| Appendix B: Additional Results for Privacy-Utility Amplification [Appendix B.0.2 in the paper]. | Comparison between collaborative and local training. | Python script | Figures |
| Appendix C: Data Poisoning Tolerance [Appendix C.0.1 in the paper]. | Impact of P4 components on attack tolerance under 30% malicious clients. | Bash and PBS/SLURM scripts | Jupyter Notebook |
Hyperparameters and settings (similarity = 0.25):
Number of clients = 240, Client ratio = 40%, Number of global iterations = 100 (only one-third of them will actually run), Number of local iterations = 8, Local learning rate = 0.1, Sigma = 4.71, Max norm = 0.5, KL multiplier = 0.1, Multiplier optimizer = 0.1, Group size = 8, Number of samples for client selection = 50, Sample ratio = 20%, Weight decay = 0, Experiment ID = 1 (must match the ID used for data generation)
Data generation:
python experiments/scripts/args_main.py \
--generate 1 \
--dataset CIFAR_10 \
--similarity 0.25 \
--nb_clients 240 \
--client_ratio 0.4 \
--nb_samples 200 \
--sample_ratio 0.2 \
--handcrafted 1 \
--number 1
Run our method:
python experiments/scripts/args_main.py \
--learning 1 \
--dataset CIFAR_10 \
--model NN2 \
--similarity 0.25 \
--nb_clients 240 \
--client_ratio 0.4 \
--times 5 \
--num_glob_iters 100 \
--dp Gaussian \
--nb_samples 200 \
--sample_ratio 0.2 \
--local_updates 8 \
--local_learning_rate 0.1 \
--max_norm 0.5 \
--sigma_gaussian 4.71 \
--proxy_training 1 \
--kl_multiplier 0.1 \
--cka_multiplier 0 \
--algorithm FedAvg \
--multiplier_optimizer 0.1 \
--handcrafted 1 \
--weight_decay 0 \
--client_selection_method MAE-MW-SAMPLING \
--mul_private_wd 1 \
--group_size 8 \
--num_sample_cs 50 \
--number 1 \
--evaluation 0
Hyperparameters and settings (similarity = 0.5):
Number of clients = 240, Client ratio = 50%, Number of global iterations = 100 (only one-third of them will actually run), Number of local iterations = 10, Local learning rate = 0.1, Sigma = 25.51, Max norm = 0.1, KL multiplier = 0.3, Multiplier optimizer = 1, Group size = 8, Number of samples for client selection = 35, Sample ratio = 90%, Weight decay = 0.001, Experiment ID = 2 (must match the ID used for data generation)
Data generation:
python experiments/scripts/args_main.py \
--generate 1 \
--dataset CIFAR_10 \
--similarity 0.5 \
--nb_clients 240 \
--client_ratio 0.5 \
--nb_samples 200 \
--sample_ratio 0.9 \
--handcrafted 1 \
--number 2
Run our method:
python experiments/scripts/args_main.py \
--learning 1 \
--dataset CIFAR_10 \
--model NN2 \
--similarity 0.5 \
--nb_clients 240 \
--client_ratio 0.5 \
--times 5 \
--num_glob_iters 100 \
--dp Gaussian \
--nb_samples 200 \
--sample_ratio 0.9 \
--local_updates 10 \
--local_learning_rate 0.1 \
--max_norm 0.1 \
--sigma_gaussian 25.51 \
--proxy_training 1 \
--kl_multiplier 0.3 \
--cka_multiplier 0 \
--algorithm FedAvg \
--multiplier_optimizer 1 \
--handcrafted 1 \
--weight_decay 0.001 \
--client_selection_method MAE-MW-SAMPLING \
--mul_private_wd 10 \
--group_size 8 \
--num_sample_cs 35 \
--number 2 \
--evaluation 0
Hyperparameters and settings (similarity = 0.75):
Number of clients = 240, Client ratio = 75%, Number of global iterations = 100 (only one-third of them will actually run), Number of local iterations = 10, Local learning rate = 0.1, Sigma = 23.1, Max norm = 0.1, KL multiplier = 0.3, Multiplier optimizer = 1, Group size = 8, Number of samples for client selection = 45, Sample ratio = 90%, Weight decay = 0.01, Experiment ID = 3 (must match the ID used for data generation)
Data generation:
python experiments/scripts/args_main.py \
--generate 1 \
--dataset CIFAR_10 \
--similarity 0.75 \
--nb_clients 240 \
--client_ratio 0.4 \
--nb_samples 200 \
--sample_ratio 0.9 \
--handcrafted 1 \
--number 3
Run our method:
python experiments/scripts/args_main.py \
--learning 1 \
--dataset CIFAR_10 \
--model NN2 \
--similarity 0.75 \
--nb_clients 240 \
--client_ratio 0.4 \
--times 1 \
--num_glob_iters 100 \
--dp Gaussian \
--nb_samples 200 \
--sample_ratio 0.9 \
--local_updates 10 \
--local_learning_rate 0.1 \
--max_norm 0.1 \
--sigma_gaussian 23.1 \
--proxy_training 1 \
--kl_multiplier 0.3 \
--cka_multiplier 0 \
--algorithm FedAvg \
--multiplier_optimizer 1.0 \
--handcrafted 1 \
--weight_decay 0.01 \
--client_selection_method MAE-MW-SAMPLING \
--mul_private_wd 1 \
--group_size 8 \
--num_sample_cs 45 \
--number 3 \
--evaluation 0
