Source code for defense.npd

'''
This is the official implementation of the paper "Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features".
Paper link: https://openreview.net/forum?id=VFhN15Vlkj
This code provides the implementation of the npd defense.
After training, the neural polarizer layer will be saved separately as "NP_layer.pt".
To evaluate the performance of npd, please use the "evaluate.py" in the "utils/defense_utils/npd" folder.

Notes: the npd defense exposes several important hyper-parameters you can tune.
    --target_layer_name: the selected layer to insert the polarizer
    --trigger_norm: the norm bound for the perturbation
    --norm_type: the norm type of the bound
    --inner_steps: the number of steps for generating adversarial examples (relatively insensitive)
    --model_name: decide which polarizer structure to use (for ablation study)
    --lmd1|lmd2|lmd3|lmd4: hyperparameters of four different losses
    --lr: learning rate for learning the polarizer

'''
from defense.base import defense


class npd(defense):
    r"""Neural polarizer: A lightweight and effective backdoor defense via purifying poisoned features

    basic structure:

    1. config args, save_path, fix random seed
    2. load the backdoor attack data and backdoor test data
    3. load the backdoor model
    4. npd defense (train a neural polarizer layer):
        a. warm up with a small learning rate
        b. define optimizer
        c. preparation
        d. for each epoch of training the plug layer (random targeted AE and training)
    5. test the result and get ASR, ACC, RC

    .. code-block:: python

        parser = argparse.ArgumentParser(description=sys.argv[0])
        npd.add_arguments(parser)
        args = parser.parse_args()
        ft_method = npd(args)
        result = ft_method.defense(args.result_file)

    .. Note::
        @inproceedings{
        zhu2023neural,
        title={Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features},
        author={Mingli Zhu and Shaokui Wei and Hongyuan Zha and Baoyuan Wu},
        booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
        year={2023},
        url={https://openreview.net/forum?id=VFhN15Vlkj}}

    Args:
        basic args: in the base class
        warm_epochs(int): warm up epochs for defense
        target_layer_name(str): the selected layer to insert the polarizer
        trigger_norm(float): the norm bound for the perturbation
        norm_type(str): the norm type of the bound (choices=["L_inf","L2","L1"])
        inner_steps(int): the number of steps for generating adversarial examples (relatively insensitive)
        model_name(str): decide which polarizer structure to use (for ablation study)
        lmd1|lmd2|lmd3|lmd4(str): hyperparameters of four different losses
        lr(float): learning rate for learning the polarizer
        max_init(bool): the norm of the bound
        use_residual(str): use residual for the polarizer layer or not

    """
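
The `trigger_norm` and `norm_type` arguments describe a norm bound on the perturbation. The sketch below illustrates what such a bound can mean in practice: a perturbation is projected back onto the norm ball of radius `trigger_norm`. It is not taken from the npd implementation, which may enforce the bound differently. The function name `project_perturbation` is hypothetical, the perturbation is assumed to be a flat list of floats, and the "L1" choice is left out because its projection needs a more involved algorithm.

```python
import math
from typing import List


def project_perturbation(delta: List[float], trigger_norm: float, norm_type: str) -> List[float]:
    """Project a flat perturbation onto the norm ball of radius trigger_norm.

    Hypothetical helper for illustration only; not part of the npd code.
    """
    if trigger_norm < 0:
        raise ValueError("trigger_norm must be non-negative")

    if norm_type == "L_inf":
        # Clip every coordinate into [-trigger_norm, trigger_norm].
        return [max(-trigger_norm, min(trigger_norm, d)) for d in delta]

    if norm_type == "L2":
        # Rescale only when the vector lies outside the ball.
        length = math.sqrt(sum(d * d for d in delta))
        if length <= trigger_norm or length == 0.0:
            return list(delta)
        scale = trigger_norm / length
        return [d * scale for d in delta]

    # "L1" is intentionally not covered in this sketch.
    raise ValueError(f"unsupported norm_type in this sketch: {norm_type!r}")
```

In an iterative adversarial-example procedure, a projection of this kind would typically be applied after each of the `inner_steps` updates so that the perturbation stays within the bound.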