defense.anp

class anp(parser)

Bases: defense

Adversarial Neuron Pruning Purifies Backdoored Deep Models

basic structure:

  1. configure args and save_path, and fix the random seed

  2. load the backdoor attack data and the backdoor test data

  3. load the backdoor attack model

  4. anp defense:
    1. train the mask of the old (backdoored) model

    2. prune the model according to the mask

  5. test the result and report ASR, ACC, RC
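The three metrics in step 5 can be illustrated with a minimal sketch. It assumes that ACC is the accuracy on clean test data, ASR is the fraction of backdoored test samples classified as the attack target label, and RC is the fraction of backdoored test samples still classified as their original label. The function name and its arguments are hypothetical and are not part of this class.

```python
def backdoor_metrics(clean_preds, clean_labels, bd_preds, bd_orig_labels, target_label):
    """Hypothetical helper: compute ACC, ASR and RC from predicted labels.

    Assumed definitions (not taken from the anp class itself):
      ACC - fraction of clean samples predicted correctly
      ASR - fraction of backdoored samples predicted as target_label
      RC  - fraction of backdoored samples predicted as their original label
    """
    def fraction(flags):
        flags = list(flags)
        return sum(flags) / len(flags) if flags else 0.0

    acc = fraction(p == y for p, y in zip(clean_preds, clean_labels))
    asr = fraction(p == target_label for p in bd_preds)
    rc = fraction(p == y for p, y in zip(bd_preds, bd_orig_labels))
    return {"ACC": acc, "ASR": asr, "RC": rc}
```

A successful defense would be expected to lower ASR while keeping ACC close to its value before pruning.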

import argparse
import sys

# `anp` is the class documented on this page (module defense.anp).
parser = argparse.ArgumentParser(description=sys.argv[0])
anp.add_arguments(parser)
args = parser.parse_args()
anp_method = anp(args)
# Fall back to the debug result folder when result_file is missing or None.
if getattr(args, "result_file", None) is None:
    args.result_file = 'one_epochs_debug_badnet_attack'
result = anp_method.defense(args.result_file)

Note

@article{wu2021adversarial,
  title={Adversarial neuron pruning purifies backdoored deep models},
  author={Wu, Dongxian and Wang, Yisen},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  pages={16913--16925},
  year={2021}
}

Parameters:
  • args (basic) – the basic arguments defined in the base class

  • anp_eps (float) – the epsilon used in the first step of the ANP defense, when training the mask

  • anp_steps (int) – the number of training steps used in the first step of the ANP defense, when training the mask

  • anp_alpha (float) – the alpha used for the loss in the first step of the ANP defense, when training the mask

  • pruning_by (str) – the pruning method, either number or threshold

  • pruning_max (float) – the maximum number/threshold for pruning

  • pruning_step (float) – the step size used when evaluating the pruning

  • pruning_number (float) – the default number/threshold for pruning

  • index (str) – the index of the clean data

  • acc_ratio (float) – the tolerance ratio of the clean accuracy

  • ratio (float) – the ratio of clean data used for the clean data loader

  • print_every (int) – print results every few iterations

  • nb_iter (int) – the number of iterations for training
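To make the pruning parameters more concrete, here is a minimal sketch of how pruning_by, pruning_max and pruning_step could interact. It assumes that each neuron has one learned mask value, that a smaller mask value marks a more suspicious neuron, that threshold pruning removes every neuron whose mask value is at or below the threshold, and that number pruning removes the neurons with the smallest mask values. The helper names are hypothetical and this is not the class's actual implementation.

```python
def select_neurons_to_prune(mask_values, pruning_by, value):
    """Hypothetical helper: return indices of neurons to prune.

    Assumption: smaller mask values mark neurons that should be pruned first.
    """
    # Order neuron indices from the smallest mask value to the largest.
    order = sorted(range(len(mask_values)), key=lambda i: mask_values[i])
    if pruning_by == "threshold":
        return [i for i in order if mask_values[i] <= value]
    if pruning_by == "number":
        return order[: int(value)]
    raise ValueError("pruning_by must be 'number' or 'threshold'")


def sweep_values(pruning_max, pruning_step):
    """Hypothetical helper: values 0, step, 2*step, ... up to pruning_max."""
    n = int(round(pruning_max / pruning_step))
    # Rounding keeps float noise such as 0.30000000000000004 out of the list.
    return [round(k * pruning_step, 10) for k in range(n + 1)]


# Example: evaluate every threshold in the sweep on four toy mask values.
masks = [0.9, 0.1, 0.5, 0.05]
for threshold in sweep_values(0.6, 0.2):
    pruned = select_neurons_to_prune(masks, "threshold", threshold)
    print(threshold, pruned)
```

Under these assumptions, pruning_number would correspond to a single fixed call such as select_neurons_to_prune(masks, "threshold", pruning_number), while the sweep covers the whole range up to pruning_max.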

Update:

The evaluation results for all thresholds are saved as a picture in the save_path folder, and the results of the model pruned with the selected fixed threshold are saved to defense_result.pt.