defense.npd

class npd

Bases: defense

Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features

basic structure:
  1. config args, save_path, fix random seed

  2. load the backdoor attack data and backdoor test data

  3. load the backdoor model

  4. npd defense (train a neural polarizer layer; see the sketch after this list):
    1. warm up with a small learning rate

    2. define optimizer

    3. preparation

    4. for each epoch, train the plug-in layer (generate randomly targeted adversarial examples and update the polarizer)

  5. test the result and get ASR, ACC, RC
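
The polarizer itself is a small trainable layer plugged in after target_layer_name while the rest of the backdoored network stays fixed. The exact structure is selected by model_name; the following PyTorch sketch is only an illustration (the 1x1-conv-plus-batch-norm form and the name PolarizerLayer are assumptions, not the library's actual API):

import torch
import torch.nn as nn

class PolarizerLayer(nn.Module):
    # Hypothetical plug-in layer: a lightweight channel-wise transform
    # (1x1 convolution + batch norm), optionally wrapped in a residual
    # connection, mirroring the use_residual option below.
    def __init__(self, channels: int, use_residual: bool = True):
        super().__init__()
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.use_residual = use_residual

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.transform(x)
        # Residual form: purified features = original features + correction.
        return x + out if self.use_residual else out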

import argparse
import sys

from defense.npd import npd  # assumed import path, per this page's module name

parser = argparse.ArgumentParser(description=sys.argv[0])
npd.add_arguments(parser)  # register npd's command-line options
args = parser.parse_args()
npd_method = npd(args)
result = npd_method.defense(args.result_file)  # run on a saved attack result
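
During the defense, only the polarizer's parameters are updated; the backdoored backbone stays frozen. A minimal sketch of that setup (backbone, the channel count, the Adam choice, and PolarizerLayer from the sketch above are all illustrative assumptions):

import torch

# Freeze the backdoored model so that only the plug-in layer trains.
for p in backbone.parameters():  # backbone: hypothetical loaded model
    p.requires_grad = False

polarizer = PolarizerLayer(channels=512, use_residual=True)  # channels assumed
optimizer = torch.optim.Adam(polarizer.parameters(), lr=args.lr)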

Note

@inproceedings{zhu2023neural,
  title     = {Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features},
  author    = {Mingli Zhu and Shaokui Wei and Hongyuan Zha and Baoyuan Wu},
  booktitle = {Thirty-seventh Conference on Neural Information Processing Systems},
  year      = {2023},
  url       = {https://openreview.net/forum?id=VFhN15Vlkj}
}

Parameters:
  • args (basic) – arguments defined in the base class

  • warm_epochs (int) – number of warm-up epochs for the defense

  • target_layer_name (str) – the layer at which the polarizer is inserted

  • trigger_norm (float) – the norm bound for the perturbation

  • norm_type (str) – the norm type of the bound (choices=["L_inf", "L2", "L1"])

  • inner_steps (int) – the number of inner steps for generating adversarial examples (relatively insensitive; see the sketch after this list)

  • model_name (str) – selects which polarizer structure to use (for ablation studies)

  • lmd1|lmd2|lmd3|lmd4 (str) – hyperparameters weighting the four loss terms

  • lr (float) – learning rate for training the polarizer

  • max_init (bool) – whether to initialize the adversarial perturbation at the maximum of the norm bound

  • use_residual (str) – whether to use a residual connection in the polarizer layer
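
To see how trigger_norm, norm_type, and inner_steps fit together: in each epoch the defense generates randomly targeted adversarial examples whose perturbation is projected back into the chosen norm ball. The sketch below is a generic PGD-style inner loop for the L_inf case only (the function name and step-size rule are assumptions, not npd's exact update):

import torch
import torch.nn.functional as F

def generate_targeted_ae(model, x, target, trigger_norm, inner_steps):
    # PGD-style inner loop: push x toward the (randomly chosen) target
    # class while keeping the perturbation inside the L_inf ball of
    # radius trigger_norm.
    delta = torch.zeros_like(x, requires_grad=True)
    step_size = 2.0 * trigger_norm / inner_steps  # heuristic step size
    for _ in range(inner_steps):
        loss = F.cross_entropy(model(x + delta), target)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= step_size * grad.sign()           # descend toward the target class
            delta.clamp_(-trigger_norm, trigger_norm)  # project into the norm ball
    return (x + delta).detach()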