defense.sau
- class sau(args)[source]
Bases: defense
Shared adversarial unlearning: Backdoor mitigation by unlearning shared adversarial examples
basic structure for defense method:
basic setting: args
attack result (model, train data, test data)
- sau defense:
get some clean data
- SAU:
generate the shared adversarial examples
unlearn the backdoored model with the shared adversarial perturbations
test the result and get ASR, ACC, RC (a minimal sketch of the SAU loop follows the usage example below)
import argparse
import sys

from defense.sau import sau

# Parse the defense arguments registered by the sau class.
parser = argparse.ArgumentParser(description=sys.argv[0])
sau.add_arguments(parser)
args = parser.parse_args()
sau_method = sau(args)
# Fall back to a default attack result file if none is provided.
if "result_file" not in args.__dict__:
    args.result_file = 'defense_test_badnet'
elif args.result_file is None:
    args.result_file = 'defense_test_badnet'
result = sau_method.defense(args.result_file)
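The SAU procedure described above (generate shared adversarial examples, then unlearn on them) can be illustrated with a minimal sketch. This is a simplification under stated assumptions, not the library implementation: pgd_shared_adv and sau_round are hypothetical helpers, the loss weights (lmd_*, beta_*) are omitted, and a fixed L_inf PGD stands in for the configurable perturbation options listed under Parameters.

import torch
import torch.nn.functional as F

def pgd_shared_adv(model, bd_model, x, y, eps=8/255, alpha=2/255, steps=5):
    """L_inf PGD that pushes both models off the true label while keeping their
    predictions aligned, approximating 'shared' adversarial examples."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        out_m = model(x + delta)
        out_b = bd_model(x + delta)
        loss = (-F.cross_entropy(out_m, y)              # increase error of the model being purified
                - F.cross_entropy(out_b, y)             # increase error of the backdoored model
                + F.kl_div(F.log_softmax(out_m, dim=1), # keep the two models' predictions aligned
                           F.softmax(out_b, dim=1), reduction='batchmean'))
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()          # gradient-descent step on the loss above
            delta.clamp_(-eps, eps)                     # project back into the L_inf ball
        delta.grad.zero_()
    return (x + delta).detach()

def sau_round(model, bd_model, clean_loader, optimizer):
    """One outer round: the clean loss preserves accuracy, the loss on shared
    adversarial examples unlearns the backdoor (weights omitted for brevity)."""
    for x, y in clean_loader:
        x_adv = pgd_shared_adv(model, bd_model, x, y)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()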
Note
@inproceedings{wei2023shared,
  title={Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples},
  author={Wei, Shaokui and Zhang, Mingda and Zha, Hongyuan and Wu, Baoyuan},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}
- Parameters:
args (basic) – in the base class
outer_opt (str) – type of outer loop optimizer utilized
n_rounds (int) – the maximum number of unlearning rounds
outer_steps (int) – steps for outer loop, the number of unlearning rounds
lmd_1 (float) – weight of the clean loss, L_cl
lmd_2 (float) – weight of the AT loss. By default, lmd_2 = 0 and AT is not used.
lmd_3 (float) – weight of the shared adversarial risk, L_sar
beta_1 (float) – weight of L_adv
beta_2 (float) – weight of L_share
trigger_norm (float) – threshold for PGD. Larger may not be good.
pgd_init (str) – init type for PGD. zero|random|max|min
norm_type (str) – type of norm used for generating the perturbation. L1|L2|L_inf|Reg
adv_lr (float) – learning rate for PGD
adv_steps (int) – number of steps for PGD
train_mode (bool) – whether the model is run in train mode during unlearning
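These options map onto command-line flags consumed by sau.add_arguments. The snippet below is a hypothetical re-creation for illustration only: add_sau_arguments is not the library function, and the default values are assumptions rather than the shipped defaults; names, types, and choices follow the parameter list above.

import argparse

def add_sau_arguments(parser):
    # Illustrative argument registration; defaults here are assumptions.
    parser.add_argument('--outer_opt', type=str, default='sgd',
                        help='type of outer loop optimizer utilized')
    parser.add_argument('--n_rounds', type=int, default=10,
                        help='maximum number of unlearning rounds')
    parser.add_argument('--outer_steps', type=int, default=1,
                        help='steps for the outer loop per unlearning round')
    parser.add_argument('--lmd_1', type=float, default=1.0, help='weight of the clean loss L_cl')
    parser.add_argument('--lmd_2', type=float, default=0.0,
                        help='weight of the AT loss; 0 disables adversarial training')
    parser.add_argument('--lmd_3', type=float, default=1.0,
                        help='weight of the shared adversarial risk L_sar')
    parser.add_argument('--beta_1', type=float, default=0.01, help='weight of L_adv')
    parser.add_argument('--beta_2', type=float, default=1.0, help='weight of L_share')
    parser.add_argument('--trigger_norm', type=float, default=0.2, help='threshold for PGD')
    parser.add_argument('--pgd_init', type=str, default='max',
                        choices=['zero', 'random', 'max', 'min'], help='init type for PGD')
    parser.add_argument('--norm_type', type=str, default='L_inf',
                        choices=['L1', 'L2', 'L_inf', 'Reg'], help='perturbation norm type')
    parser.add_argument('--adv_lr', type=float, default=0.2, help='learning rate for PGD')
    parser.add_argument('--adv_steps', type=int, default=5, help='number of PGD steps')
    parser.add_argument('--train_mode', action='store_true',
                        help='run the model in train mode during unlearning')
    return parser

if __name__ == '__main__':
    args = add_sau_arguments(argparse.ArgumentParser()).parse_args()
    print(vars(args))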