defense.nc

class nc

Bases: defense

Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks

basic structure:

  1. config args, save_path, fix random seed

  2. load the backdoor attack data and backdoor test data

  3. load the backdoor model

  4. nc defense:
    1. initialize the model and the trigger

    2. reverse-engineer a candidate trigger for each possible target label (see the first sketch after this list)

    3. determine, via outlier detection over the norms of the reversed triggers, whether any reversed trigger is a real backdoor trigger

       If a real backdoor trigger is found:

    4. select samples as clean samples and unlearning samples, and fine-tune the original model (see the second sketch after this list)

  5. test the result and report ASR (attack success rate), ACC (clean accuracy), and RC
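
For steps 4.2 and 4.3, Neural Cleanse optimizes, for every candidate target label, a small mask m and pattern p such that stamping (1 - m) * x + m * p onto clean inputs flips the model's prediction to that label, with an L1 penalty keeping the mask small; it then runs median-absolute-deviation (MAD) outlier detection over the L1 norms of the reversed masks and flags labels whose masks are anomalously small. The sketch below illustrates this idea in PyTorch; the names (model, clean_loader, img_shape, reg_lambda) and the hyperparameter values are illustrative assumptions, not the exact BackdoorBench implementation.

# Illustrative sketch of Neural Cleanse trigger reverse-engineering
# (not the exact BackdoorBench code). Assumes `model`, `clean_loader`,
# `img_shape` (e.g. (3, 32, 32)), and `device` are defined elsewhere.
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, clean_loader, target, img_shape,
                             device, steps=1000, lr=0.1, reg_lambda=1e-2):
    # Optimize unconstrained tensors; sigmoid keeps mask/pattern in [0, 1].
    mask_raw = torch.zeros(img_shape[1:], device=device, requires_grad=True)
    pattern_raw = torch.zeros(img_shape, device=device, requires_grad=True)
    opt = torch.optim.Adam([mask_raw, pattern_raw], lr=lr)
    data_iter = iter(clean_loader)
    for _ in range(steps):
        try:
            x, _ = next(data_iter)
        except StopIteration:
            data_iter = iter(clean_loader)
            x, _ = next(data_iter)
        x = x.to(device)
        mask, pattern = torch.sigmoid(mask_raw), torch.sigmoid(pattern_raw)
        stamped = (1 - mask) * x + mask * pattern  # apply candidate trigger
        y = torch.full((x.size(0),), target, dtype=torch.long, device=device)
        # Classification loss toward the target label + L1 penalty on the mask.
        loss = F.cross_entropy(model(stamped), y) + reg_lambda * mask.abs().sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(mask_raw).detach(), torch.sigmoid(pattern_raw).detach()

def detect_backdoor_labels(mask_norms, threshold=2.0):
    # MAD outlier detection over the L1 norms of all reversed masks;
    # labels whose masks are anomalously SMALL are flagged as backdoored.
    norms = torch.tensor(mask_norms)
    median = norms.median()
    mad = 1.4826 * (norms - median).abs().median() + 1e-12  # consistency constant
    anomaly_index = (median - norms) / mad
    return [i for i, a in enumerate(anomaly_index.tolist()) if a > threshold]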
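
For step 4.4, the fine-tuning set mixes plain clean samples with "unlearning" samples: clean images stamped with the reversed trigger but kept at their true labels, so the model is trained to ignore the trigger. A minimal sketch, assuming the mask and pattern returned by the function above and a standard (image, label) dataset; the class name and default ratio are hypothetical:

# Illustrative sketch of building the unlearning set for fine-tuning
# (hypothetical helper, not the exact BackdoorBench code).
import random
from torch.utils.data import Dataset

class UnlearningDataset(Dataset):
    def __init__(self, clean_dataset, mask, pattern, unlearning_ratio=0.2):
        self.clean = clean_dataset
        self.mask, self.pattern = mask, pattern
        self.ratio = unlearning_ratio

    def __len__(self):
        return len(self.clean)

    def __getitem__(self, idx):
        x, y = self.clean[idx]
        # Stamp the reversed trigger on a fraction of samples while keeping
        # the true label, so fine-tuning unlearns the trigger-to-label shortcut.
        if random.random() < self.ratio:
            x = (1 - self.mask) * x + self.mask * self.pattern
        return x, y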

import argparse
import sys

from defense.nc import nc

parser = argparse.ArgumentParser(description=sys.argv[0])
nc.add_arguments(parser)
args = parser.parse_args()
nc_method = nc(args)
# Fall back to a default result file when none was given on the command line.
if getattr(args, "result_file", None) is None:
    args.result_file = 'one_epochs_debug_badnet_attack'
result = nc_method.defense(args.result_file)

Note

@inproceedings{wang2019neural,
  title={Neural cleanse: Identifying and mitigating backdoor attacks in neural networks},
  author={Wang, Bolun and Yao, Yuanshun and Shan, Shawn and Li, Huiying and Viswanath, Bimal and Zheng, Haitao and Zhao, Ben Y},
  booktitle={2019 IEEE Symposium on Security and Privacy (SP)},
  pages={707--723},
  year={2019},
  organization={IEEE}
}

Parameters:
  • args (basic) – arguments defined in the base class

  • ratio (float) – the ratio of training data used by the defense

  • index (str) – the index of the clean data

  • cleaning_ratio (float) – the ratio of clean data used for fine-tuning the backdoor model

  • unlearning_ratio (float) – the ratio of unlearning data (clean data stamped with the learned trigger) used for fine-tuning the backdoor model

  • nc_epoch (int) – the number of epochs for Neural Cleanse to train the trigger
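
Because nc.add_arguments registers these parameters on the argparse parser, they can be set on the command line or overridden programmatically after parsing. A minimal sketch of a programmatic override; the attribute names come from the parameter list above, and the values are illustrative assumptions, not recommended settings:

import argparse
import sys

from defense.nc import nc

parser = argparse.ArgumentParser(description=sys.argv[0])
nc.add_arguments(parser)
# Start from the registered defaults (assumes no argument is required),
# then override the hyperparameters listed above with illustrative values.
args = parser.parse_args([])
args.result_file = 'one_epochs_debug_badnet_attack'
args.ratio = 0.05              # ratio of training data used by the defense
args.cleaning_ratio = 0.8      # ratio of clean data for fine-tuning
args.unlearning_ratio = 0.2    # ratio of unlearning data for fine-tuning
args.nc_epoch = 20             # epochs to train the reversed trigger
result = nc(args).defense(args.result_file)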