defense.bnp

class bnp(args)[source]

Bases: defense

Pre-activation Distributions Expose Backdoor Neurons

basic structure:

  1. config args, save_path, fix random seed

  2. load the backdoor attack data and backdoor test data

  3. load the backdoor attack model

  4. bnp defense:
    1. use batch statistics to compute a KL divergence for each neuron

    2. prune neurons with anomalously large KL divergence as backdoor neurons

  5. test the result and get ASR, ACC, RC
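The KL-divergence step above can be sketched as follows. This is a minimal illustration, not the library's implementation: it assumes each neuron's pre-activation is modeled as a Gaussian, comparing BatchNorm-style running statistics against clean-batch statistics, and the function names (`gaussian_kl`, `suspicious_neurons`) are hypothetical:

```python
import numpy as np

def gaussian_kl(mu_p, var_p, mu_q, var_q, eps=1e-8):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ), computed per neuron."""
    return 0.5 * (np.log((var_q + eps) / (var_p + eps))
                  + (var_p + (mu_p - mu_q) ** 2) / (var_q + eps) - 1.0)

def suspicious_neurons(mu_bn, var_bn, mu_clean, var_clean, u=3.0):
    """Flag neurons whose KL divergence lies more than u std above the mean.

    mu_bn / var_bn:      per-neuron running statistics of the suspect model
    mu_clean / var_clean: per-neuron statistics measured on clean batches
    """
    kl = gaussian_kl(mu_clean, var_clean, mu_bn, var_bn)
    thresh = kl.mean() + u * kl.std()
    return np.flatnonzero(kl > thresh)
```

A neuron whose clean-batch distribution deviates sharply from the model's running statistics produces a large KL value and is selected for pruning.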

import argparse
import sys

parser = argparse.ArgumentParser(description=sys.argv[0])
bnp.add_arguments(parser)
args = parser.parse_args()
bnp_method = bnp(args)
if getattr(args, "result_file", None) is None:
    args.result_file = 'one_epochs_debug_badnet_attack'
result = bnp_method.defense(args.result_file)

Note

@article{zheng2022pre,
  title={Pre-activation Distributions Expose Backdoor Neurons},
  author={Zheng, Runkai and Tang, Rongjun and Li, Jianze and Liu, Li},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={18667--18680},
  year={2022}
}

Parameters:
  • args (basic) – basic arguments inherited from the base defense class

  • u (float) – threshold u in the bnp defense: a neuron is regarded as backdoored when its KL divergence lies more than u standard deviations above the mean

  • u_min (float) – the default minimum value of u

  • u_max (float) – the default maximum value of u

  • u_num (float) – the default number of candidate u values to evaluate between u_min and u_max

  • ratio (float) – the ratio of clean data used to build the clean data loader

  • index (str) – index of clean data

Update:

All threshold evaluation results will be saved to the save_path folder as a figure, and the model results for the selected fixed threshold will be saved to defense_result.pt
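Since u_min, u_max, and u_num define a sweep of candidate thresholds, the evaluation loop that produces those per-threshold results can be sketched as below. This is a hypothetical outline, not the library's code; `sweep_thresholds` is an assumed name and `kl` stands for the per-neuron KL divergences computed earlier:

```python
import numpy as np

def sweep_thresholds(kl, u_min=0.0, u_max=5.0, u_num=11):
    """Enumerate candidate u values and the neuron set each would prune.

    Returns a list of (u, pruned_indices) pairs, one per candidate threshold.
    """
    results = []
    for u in np.linspace(u_min, u_max, int(u_num)):
        thresh = kl.mean() + u * kl.std()
        pruned = np.flatnonzero(kl > thresh)
        results.append((float(u), pruned))
    return results
```

Each candidate model would then be evaluated (ASR/ACC) and plotted, and one fixed u is selected for the final saved model.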