defense.npd

class npd

Bases: defense

Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features

basic structure:
  1. config args, save_path, fix random seed

  2. load the backdoor attack data and backdoor test data

  3. load the backdoor model

  4. npd defense (train a neural polarizer layer; see the sketch after this list):
    1. warm up with a small learning rate

    2. define optimizer

    3. preparation

    4. for each epoch, train the plug-in layer (generate randomly targeted adversarial examples and update the polarizer)

  5. test the result and get ASR, ACC, RC
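
The polarizer itself is a small trainable layer plugged in after target_layer_name while the rest of the backdoored network stays fixed. The exact structure is selected by model_name; the following PyTorch sketch is only an illustration (the 1x1-conv-plus-batch-norm form and the name PolarizerLayer are assumptions, not the library's actual API):

import torch
import torch.nn as nn

class PolarizerLayer(nn.Module):
    # Hypothetical plug-in layer: a lightweight channel-wise transform
    # (1x1 convolution + batch norm), optionally wrapped in a residual
    # connection, mirroring the use_residual option below.
    def __init__(self, channels: int, use_residual: bool = True):
        super().__init__()
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.use_residual = use_residual

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.transform(x)
        # Residual form: purified features = original features + correction.
        return x + out if self.use_residual else out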

import argparse
import sys

from defense.npd import npd  # assumed import path, per this page's module name

parser = argparse.ArgumentParser(description=sys.argv[0])
npd.add_arguments(parser)  # register npd's command-line options
args = parser.parse_args()
npd_method = npd(args)
result = npd_method.defense(args.result_file)  # run on a saved attack result
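
During the defense, only the polarizer's parameters are updated; the backdoored backbone stays frozen. A minimal sketch of that setup (backbone, the channel count, the Adam choice, and PolarizerLayer from the sketch above are all illustrative assumptions):

import torch

# Freeze the backdoored model so that only the plug-in layer trains.
for p in backbone.parameters():  # backbone: hypothetical loaded model
    p.requires_grad = False

polarizer = PolarizerLayer(channels=512, use_residual=True)  # channels assumed
optimizer = torch.optim.Adam(polarizer.parameters(), lr=args.lr)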

Note

@inproceedings{zhu2023neural,
  title     = {Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features},
  author    = {Mingli Zhu and Shaokui Wei and Hongyuan Zha and Baoyuan Wu},
  booktitle = {Thirty-seventh Conference on Neural Information Processing Systems},
  year      = {2023},
  url       = {https://openreview.net/forum?id=VFhN15Vlkj}
}

Parameters:
  • args (basic) – arguments defined in the base class

  • warm_epochs (int) – number of warm-up epochs for the defense

  • target_layer_name (str) – the layer at which the polarizer is inserted

  • trigger_norm (float) – the norm bound for the perturbation

  • norm_type (str) – the norm type of the bound (choices=["L_inf", "L2", "L1"])

  • inner_steps (int) – the number of inner steps for generating adversarial examples (relatively insensitive; see the sketch after this list)

  • model_name (str) – selects which polarizer structure to use (for ablation studies)

  • lmd1|lmd2|lmd3|lmd4 (str) – hyperparameters weighting the four loss terms

  • lr (float) – learning rate for training the polarizer

  • max_init (bool) – whether to initialize the adversarial perturbation at the maximum of the norm bound

  • use_residual (str) – whether to use a residual connection in the polarizer layer
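
To see how trigger_norm, norm_type, and inner_steps fit together: in each epoch the defense generates randomly targeted adversarial examples whose perturbation is projected back into the chosen norm ball. The sketch below is a generic PGD-style inner loop for the L_inf case only (the function name and step-size rule are assumptions, not npd's exact update):

import torch
import torch.nn.functional as F

def generate_targeted_ae(model, x, target, trigger_norm, inner_steps):
    # PGD-style inner loop: push x toward the (randomly chosen) target
    # class while keeping the perturbation inside the L_inf ball of
    # radius trigger_norm.
    delta = torch.zeros_like(x, requires_grad=True)
    step_size = 2.0 * trigger_norm / inner_steps  # heuristic step size
    for _ in range(inner_steps):
        loss = F.cross_entropy(model(x + delta), target)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= step_size * grad.sign()           # descend toward the target class
            delta.clamp_(-trigger_norm, trigger_norm)  # project into the norm ball
    return (x + delta).detach()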