defense.rnp

class rnp(args)

Bases: defense

Reconstructive Neuron Pruning for Backdoor Defense

Basic structure:

  1. configure args and save_path, and fix the random seed

  2. load the backdoor attack data and backdoor test data

  3. load the backdoor model

  4. rnp defense:
    1. unlearn the backdoor model and save the unlearned model

    2. recover the unlearned model and record the mask value

    3. prune the backdoor model by the mask value

  5. test the result and get ASR, ACC, RC
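Step 4.3 of the outline above (pruning by the mask value) can be illustrated with a toy sketch. It assumes that each neuron has one recorded mask value and that neurons whose mask value is at or below a threshold are pruned. The helper names `prune_by_threshold` and `apply_keep_mask` and the sample numbers are hypothetical and not part of the rnp API.

```python
from typing import List


def prune_by_threshold(mask_values: List[float], threshold: float) -> List[int]:
    """Return a 0/1 keep-mask: 0 for neurons whose mask value is <= threshold.

    Assumption: a low mask value marks a neuron as a pruning candidate.
    """
    return [0 if m <= threshold else 1 for m in mask_values]


def apply_keep_mask(activations: List[float], keep: List[int]) -> List[float]:
    """Zero the output of every pruned neuron and leave the others unchanged."""
    return [a if k else 0.0 for a, k in zip(activations, keep)]


# Hypothetical mask values, one per neuron, as recorded in step 4.2.
mask_values = [0.95, 0.10, 0.80, 0.02]
keep = prune_by_threshold(mask_values, threshold=0.2)
pruned = apply_keep_mask([1.5, -2.0, 0.5, 3.0], keep)

print(keep)    # [1, 0, 1, 0]
print(pruned)  # [1.5, 0.0, 0.5, 0.0]
```

In a real network the same keep-mask would be applied to the channels or weights of each layer rather than to a flat list of activations.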

# Example usage; rnp is the class documented on this page
import argparse
import sys

parser = argparse.ArgumentParser(description=sys.argv[0])
rnp.add_arguments(parser)
args = parser.parse_args()
rnp_method = rnp(args)
# Fall back to a default result_file when it is missing or None
if getattr(args, "result_file", None) is None:
    args.result_file = 'one_epochs_debug_badnet_attack'
result = rnp_method.defense(args.result_file)
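The call to `rnp.add_arguments(parser)` above registers the defense options on the parser. The sketch below shows what such a registration might look like for the options whose defaults appear in the parameter list. It is an illustration only: the helper name `add_pruning_arguments` and the exact flag spellings are assumptions, and the options without a documented default are left out.

```python
import argparse


def add_pruning_arguments(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    """Register the options with documented defaults (hypothetical helper)."""
    parser.add_argument("--mask_file", type=str, default=None)
    parser.add_argument("--pruning_by", type=str, default="threshold")
    parser.add_argument("--pruning_max", type=float, default=0.90)
    parser.add_argument("--pruning_step", type=float, default=0.05)
    parser.add_argument("--pruning_number", type=float, default=0.70)
    parser.add_argument("--acc_ratio", type=float, default=0.95)
    parser.add_argument("--ratio", type=float, default=0.1)
    parser.add_argument("--index", type=str, default=None)
    # nargs="+" lets the schedule be given as several integers on the command line.
    parser.add_argument("--schedule", type=int, nargs="+", default=[10, 20])
    return parser


demo_parser = add_pruning_arguments(argparse.ArgumentParser())
demo_args = demo_parser.parse_args(["--pruning_max", "0.8", "--schedule", "5", "15"])

print(demo_args.pruning_max, demo_args.schedule, demo_args.pruning_by)
# 0.8 [5, 15] threshold
```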

Note

@article{li2023reconstructive,
  title={Reconstructive Neuron Pruning for Backdoor Defense},
  author={Li, Yige and Lyu, Xixiang and Ma, Xingjun and Koren, Nodens and Lyu, Lingjuan and Li, Bo and Jiang, Yu-Gang},
  journal={arXiv preprint arXiv:2305.14876},
  year={2023}
}

Parameters:
  • args (basic) – the basic arguments defined in the base class

  • alpha (float) – the weight of the unlearned model's loss during recovery

  • clean_threshold (float) – the clean-accuracy threshold for the unlearned model

  • unlearning_lr (float) – the learning rate for the unlearning stage

  • recovering_lr (float) – the learning rate for the recovering stage

  • unlearning_epochs (int) – the number of epochs for the unlearning stage

  • recovering_epochs (int) – the number of epochs for the recovering stage

  • mask_file (str) – the file of the mask value (default: None)

  • pruning_by (str) – the pruning method (default: threshold)

  • pruning_max (float) – the maximum pruning value (default: 0.90)

  • pruning_step (float) – the step size of the pruning (default: 0.05)

  • pruning_number (float) – the default pruning value (default: 0.70)

  • acc_ratio (float) – the tolerance ratio of the clean accuracy (default: 0.95)

  • ratio (float) – the fraction of clean data loaded for the defense (default: 0.1)

  • index (str) – the index of the clean data (default: None)

  • schedule (list of int) – the epochs at which the learning rate is adjusted (default: [10, 20])
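One plausible reading of how pruning_max, pruning_step and acc_ratio interact is a threshold sweep: candidate thresholds run from 0 to pruning_max in steps of pruning_step, and the sweep stops once clean accuracy drops below acc_ratio times the unpruned accuracy. The sketch below illustrates that reading only. The function names, the stopping rule and the toy accuracy curve are assumptions, not the documented behaviour of rnp.

```python
from typing import Callable, List


def pruning_thresholds(pruning_max: float, pruning_step: float) -> List[float]:
    """Evenly spaced thresholds 0, step, 2*step, ..., pruning_max."""
    steps = int(round(pruning_max / pruning_step))
    # Multiply by an integer index (instead of adding repeatedly) to avoid drift.
    return [round(i * pruning_step, 10) for i in range(steps + 1)]


def select_threshold(
    thresholds: List[float],
    clean_acc_at: Callable[[float], float],
    baseline_acc: float,
    acc_ratio: float,
) -> float:
    """Scan upward and keep the last threshold that stays within the tolerance."""
    chosen = thresholds[0]  # threshold 0 is taken as the starting point
    for t in thresholds:
        if clean_acc_at(t) < acc_ratio * baseline_acc:
            break
        chosen = t
    return chosen


def toy_clean_acc(t: float) -> float:
    """Hypothetical curve: accuracy is flat up to 0.6 and then falls linearly."""
    return 0.92 - 0.5 * max(0.0, t - 0.6)


thresholds = pruning_thresholds(pruning_max=0.90, pruning_step=0.05)
chosen = select_threshold(thresholds, toy_clean_acc, baseline_acc=0.92, acc_ratio=0.95)

print(len(thresholds), thresholds[-1])  # 19 0.9
print(chosen)                           # 0.65
```

With the defaults listed above, the sweep covers 19 candidate thresholds. In this toy curve, 0.65 is the last one whose clean accuracy stays at or above 0.95 × 0.92 = 0.874.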