defense.rnp

class rnp(args)

Bases: defense

Reconstructive Neuron Pruning for Backdoor Defense

basic structure:

1. configure args and save_path, and fix the random seed
2. load the backdoor attack data and the backdoor test data
3. load the backdoor model
4. rnp defense:
    a. unlearn the backdoor model and save the unlearned model
    b. recover the unlearned model and record the mask values
    c. prune the backdoor model according to the mask values
5. test the result and report ASR, ACC, and RC
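
The last step of the defense, pruning by mask value, can be illustrated in isolation. The following is a minimal sketch rather than the class's actual implementation: it uses plain Python lists instead of a real model, and the function name `prune_by_threshold`, the layer names, and the mask values are all hypothetical. It also assumes that neurons whose recorded mask value is at or below the threshold are the ones to prune.

```python
from typing import Dict, List


def prune_by_threshold(
    mask_values: Dict[str, List[float]], threshold: float
) -> Dict[str, List[int]]:
    """Return a binary keep-mask per layer: 1 keeps a neuron, 0 prunes it.

    Assumption: a neuron whose recorded mask value is at or below the
    threshold is treated as backdoor-related and pruned.
    """
    return {
        layer: [0 if value <= threshold else 1 for value in values]
        for layer, values in mask_values.items()
    }


# Hypothetical mask values, as might be recorded during the recovering step.
recorded = {
    "layer1": [0.95, 0.10, 0.80],
    "layer2": [0.05, 0.99],
}
keep = prune_by_threshold(recorded, threshold=0.2)
print(keep)
```

With these illustrative values, the second neuron of `layer1` and the first neuron of `layer2` are marked for pruning; all others are kept.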

import argparse
import sys

# Assumes the rnp class has already been imported from the defense module.
parser = argparse.ArgumentParser(description=sys.argv[0])
rnp.add_arguments(parser)
args = parser.parse_args()
rnp_method = rnp(args)
# Fall back to a default name when result_file is missing or None.
if getattr(args, "result_file", None) is None:
    args.result_file = 'one_epochs_debug_badnet_attack'
result = rnp_method.defense(args.result_file)

Note

@article{li2023reconstructive,
  title={Reconstructive Neuron Pruning for Backdoor Defense},
  author={Li, Yige and Lyu, Xixiang and Ma, Xingjun and Koren, Nodens and Lyu, Lingjuan and Li, Bo and Jiang, Yu-Gang},
  journal={arXiv preprint arXiv:2305.14876},
  year={2023}
}

Parameters:

args (basic) – the basic arguments defined in the base class
alpha (float) – the weight of the unlearned model's loss during recovering
clean_threshold (float) – the clean-accuracy threshold for the unlearned model
unlearning_lr (float) – the learning rate used for unlearning
recovering_lr (float) – the learning rate used for recovering
unlearning_epochs (int) – the number of unlearning epochs
recovering_epochs (int) – the number of recovering epochs
mask_file (str) – the file containing the mask values (default: None)
pruning_by (str) – the pruning method (default: threshold)
pruning_max (float) – the maximum pruning value (default: 0.90)
pruning_step (float) – the pruning step size (default: 0.05)
pruning_number (float) – the default pruning value (default: 0.70)
acc_ratio (float) – the tolerance ratio for the clean accuracy (default: 0.95)
ratio (float) – the ratio of clean data used by the clean data loader (default: 0.1)
index (str) – the index of the clean data (default: None)
schedule (list of int) – the learning rate schedule (default: [10, 20])
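
To make the pruning parameters more concrete, here is a small sketch of how `pruning_max`, `pruning_step`, and `acc_ratio` might interact. It rests on assumptions the parameter list does not confirm: that candidate pruning values are swept from 0 up to `pruning_max` in increments of `pruning_step`, and that `acc_ratio` bounds how far the clean accuracy may fall relative to the unpruned model. The helper names `pruning_thresholds` and `within_tolerance` are hypothetical and not part of the class.

```python
from typing import List


def pruning_thresholds(
    pruning_max: float = 0.90, pruning_step: float = 0.05
) -> List[float]:
    """List candidate values 0, step, 2*step, ... that do not exceed pruning_max."""
    # Count steps as an integer; the small epsilon guards against float drift
    # (for example 0.9 / 0.05 evaluating to slightly less than 18).
    n_steps = int(pruning_max / pruning_step + 1e-9)
    return [round(i * pruning_step, 10) for i in range(n_steps + 1)]


def within_tolerance(
    pruned_acc: float, original_acc: float, acc_ratio: float = 0.95
) -> bool:
    """Assumed reading of acc_ratio: the pruned model's clean accuracy
    must be at least acc_ratio times the original clean accuracy."""
    return pruned_acc >= acc_ratio * original_acc


thresholds = pruning_thresholds()
print(len(thresholds), thresholds[0], thresholds[-1])

# Hypothetical accuracies: the unpruned model reaches 0.93 clean accuracy.
print(within_tolerance(0.90, 0.93), within_tolerance(0.85, 0.93))
```

Under the documented defaults, this sweep yields 19 candidate values from 0.0 to 0.9. Counting steps as integers avoids accumulating floating-point error from repeated addition.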