defense.anp

class anp(parser)

Bases: defense

Adversarial Neuron Pruning Purifies Backdoored Deep Models

basic structure:

1. configure args and save_path, fix the random seed
2. load the backdoor attack data and backdoor test data
3. load the backdoor attack model
4. anp defense:
    a. train the mask of the old model
    b. prune the model according to the mask
5. test the result and get ASR, ACC, RC
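
The pruning step of the structure above can be illustrated with a small sketch. It is not the class's own code: it assumes the mask-training step yields one scalar mask value per neuron, that `threshold` mode prunes every neuron whose mask value is at or below the threshold, and that `number` mode prunes the N neurons with the smallest mask values. The helper name `select_neurons_to_prune` and the neuron names are hypothetical.

```python
def select_neurons_to_prune(mask_values, pruning_by, pruning_number):
    """Return the names of neurons to prune, smallest mask value first.

    mask_values: dict mapping a (hypothetical) neuron name to its learned mask value.
    """
    # Rank neurons from smallest to largest mask value.
    ranked = sorted(mask_values.items(), key=lambda item: item[1])
    if pruning_by == "threshold":
        # Assumption: a neuron is pruned when its mask value is <= the threshold.
        return [name for name, value in ranked if value <= pruning_number]
    if pruning_by == "number":
        # Assumption: prune the N neurons with the smallest mask values.
        return [name for name, _ in ranked[: int(pruning_number)]]
    raise ValueError(f"unknown pruning_by: {pruning_by!r}")


# Illustrative mask values only; real values come from the mask-training step.
mask_values = {"layer1.0": 0.05, "layer1.1": 0.90, "layer2.0": 0.15, "layer2.1": 0.60}
by_threshold = select_neurons_to_prune(mask_values, "threshold", 0.2)
by_number = select_neurons_to_prune(mask_values, "number", 3)
```

In this sketch the same `pruning_number` argument is read as a threshold or as a count depending on `pruning_by`, mirroring the "number/threshold" wording of the parameter descriptions.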

import argparse
import sys

# Build the parser and register the ANP-specific arguments.
parser = argparse.ArgumentParser(description=sys.argv[0])
anp.add_arguments(parser)
args = parser.parse_args()
anp_method = anp(args)
# Fall back to a default result_file when it is missing or None.
if getattr(args, "result_file", None) is None:
    args.result_file = 'one_epochs_debug_badnet_attack'
result = anp_method.defense(args.result_file)

Note

@article{wu2021adversarial,
  title={Adversarial neuron pruning purifies backdoored deep models},
  author={Wu, Dongxian and Wang, Yisen},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  pages={16913--16925},
  year={2021}
}

Parameters:

args (basic) – the basic arguments defined in the base class
anp_eps (float) – the epsilon used in the first step of the ANP defense (training the mask)
anp_steps (int) – the number of training steps used in the first step of the ANP defense (training the mask)
anp_alpha (float) – the alpha used in the loss in the first step of the ANP defense (training the mask)
pruning_by (str) – the pruning method, either number or threshold
pruning_max (float) – the maximum number/threshold for pruning
pruning_step (float) – the step size for evaluating the pruning
pruning_number (float) – the default number/threshold for pruning
index (str) – the index of the clean data
acc_ratio (float) – the tolerance ratio of the clean accuracy
ratio (float) – the ratio of clean data used for the clean data loader
print_every (int) – print results every few iterations
nb_iter (int) – the number of iterations for training
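
To make the roles of pruning_max and pruning_step concrete, the following sketch enumerates the pruning levels that an evaluation sweep could visit. It assumes the sweep starts at 0 and walks up to pruning_max in increments of pruning_step, with integer levels in `number` mode. The helper name `pruning_levels` and the example values are hypothetical and are not taken from the class.

```python
def pruning_levels(pruning_by, pruning_max, pruning_step):
    """List the pruning levels an evaluation sweep could visit (assumed grid)."""
    if pruning_step <= 0:
        raise ValueError("pruning_step must be positive")
    if pruning_by == "number":
        # Counts of pruned neurons: 0, step, 2*step, ... up to pruning_max.
        return list(range(0, int(pruning_max) + 1, int(pruning_step)))
    if pruning_by == "threshold":
        # Build the grid from an integer index to avoid accumulating float error.
        n_steps = int(round(pruning_max / pruning_step))
        return [round(i * pruning_step, 10) for i in range(n_steps + 1)]
    raise ValueError(f"unknown pruning_by: {pruning_by!r}")


# Illustrative values only, not the class defaults.
threshold_grid = pruning_levels("threshold", 0.9, 0.05)
number_grid = pruning_levels("number", 10, 5)
```

Under these assumptions, pruning_number would be the single level at which the final pruned model is produced, while the grid is only used to evaluate how pruning strength affects the results.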

Update: All threshold evaluation results are saved as a picture in the save_path folder, and the results of the model pruned with the selected fixed threshold are saved to defense_result.pt