defense.sau
- class sau(args)[source]
Bases: defense
Shared adversarial unlearning: Backdoor mitigation by unlearning shared adversarial examples
basic structure for defense method:
basic setting: args
attack result (model, train data, test data)
- sau defense:
get some clean data
- SAU:
generate the shared adversarial examples
unlearn the backdoored model with the shared adversarial perturbations
test the result and get ASR, ACC, RC (a minimal sketch of the SAU loop follows the usage example below)
import argparse
import sys

from defense.sau import sau

# Parse the defense arguments registered by the sau class.
parser = argparse.ArgumentParser(description=sys.argv[0])
sau.add_arguments(parser)
args = parser.parse_args()
sau_method = sau(args)
# Fall back to a default attack result file if none is provided.
if "result_file" not in args.__dict__:
    args.result_file = 'defense_test_badnet'
elif args.result_file is None:
    args.result_file = 'defense_test_badnet'
result = sau_method.defense(args.result_file)
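The SAU procedure described above (generate shared adversarial examples, then unlearn on them) can be illustrated with a minimal sketch. This is a simplification under stated assumptions, not the library implementation: pgd_shared_adv and sau_round are hypothetical helpers, the loss weights (lmd_*, beta_*) are omitted, and a fixed L_inf PGD stands in for the configurable perturbation options listed under Parameters.

import torch
import torch.nn.functional as F

def pgd_shared_adv(model, bd_model, x, y, eps=8/255, alpha=2/255, steps=5):
    """L_inf PGD that pushes both models off the true label while keeping their
    predictions aligned, approximating 'shared' adversarial examples."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        out_m = model(x + delta)
        out_b = bd_model(x + delta)
        loss = (-F.cross_entropy(out_m, y)              # increase error of the model being purified
                - F.cross_entropy(out_b, y)             # increase error of the backdoored model
                + F.kl_div(F.log_softmax(out_m, dim=1), # keep the two models' predictions aligned
                           F.softmax(out_b, dim=1), reduction='batchmean'))
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()          # gradient-descent step on the loss above
            delta.clamp_(-eps, eps)                     # project back into the L_inf ball
        delta.grad.zero_()
    return (x + delta).detach()

def sau_round(model, bd_model, clean_loader, optimizer):
    """One outer round: the clean loss preserves accuracy, the loss on shared
    adversarial examples unlearns the backdoor (weights omitted for brevity)."""
    for x, y in clean_loader:
        x_adv = pgd_shared_adv(model, bd_model, x, y)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()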
Note
@inproceedings{wei2023shared,
  title={Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples},
  author={Wei, Shaokui and Zhang, Mingda and Zha, Hongyuan and Wu, Baoyuan},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}
- Parameters:
args (basic) – in the base class
outer_opt (str) – type of outer loop optimizer utilized
n_rounds (int) – the maximum number of unlearning rounds
outer_steps (int) – steps for outer loop, the number of unlearning rounds
lmd_1 (float) – weight of the clean loss, L_cl
lmd_2 (float) – weight of the AT loss. By default, lmd_2 = 0 and AT is not used.
lmd_3 (float) – weight of the shared adversarial risk, L_sar
beta_1 (float) – weight of L_adv
beta_2 (float) – weight of L_share
trigger_norm (float) – threshold for PGD. Larger may not be good.
pgd_init (str) – init type for PGD. zero|random|max|min
norm_type (str) – type of norm used for generating the perturbation. L1|L2|L_inf|Reg
adv_lr (float) – learning rate for PGD
adv_steps (int) – number of steps for PGD
train_mode (bool) – whether the model is run in train mode during unlearning
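These options map onto command-line flags consumed by sau.add_arguments. The snippet below is a hypothetical re-creation for illustration only: add_sau_arguments is not the library function, and the default values are assumptions rather than the shipped defaults; names, types, and choices follow the parameter list above.

import argparse

def add_sau_arguments(parser):
    # Illustrative argument registration; defaults here are assumptions.
    parser.add_argument('--outer_opt', type=str, default='sgd',
                        help='type of outer loop optimizer utilized')
    parser.add_argument('--n_rounds', type=int, default=10,
                        help='maximum number of unlearning rounds')
    parser.add_argument('--outer_steps', type=int, default=1,
                        help='steps for the outer loop per unlearning round')
    parser.add_argument('--lmd_1', type=float, default=1.0, help='weight of the clean loss L_cl')
    parser.add_argument('--lmd_2', type=float, default=0.0,
                        help='weight of the AT loss; 0 disables adversarial training')
    parser.add_argument('--lmd_3', type=float, default=1.0,
                        help='weight of the shared adversarial risk L_sar')
    parser.add_argument('--beta_1', type=float, default=0.01, help='weight of L_adv')
    parser.add_argument('--beta_2', type=float, default=1.0, help='weight of L_share')
    parser.add_argument('--trigger_norm', type=float, default=0.2, help='threshold for PGD')
    parser.add_argument('--pgd_init', type=str, default='max',
                        choices=['zero', 'random', 'max', 'min'], help='init type for PGD')
    parser.add_argument('--norm_type', type=str, default='L_inf',
                        choices=['L1', 'L2', 'L_inf', 'Reg'], help='perturbation norm type')
    parser.add_argument('--adv_lr', type=float, default=0.2, help='learning rate for PGD')
    parser.add_argument('--adv_steps', type=int, default=5, help='number of PGD steps')
    parser.add_argument('--train_mode', action='store_true',
                        help='run the model in train mode during unlearning')
    return parser

if __name__ == '__main__':
    args = add_sau_arguments(argparse.ArgumentParser()).parse_args()
    print(vars(args))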