Build Your Own Backdoor Attack
This is a simple demonstration of how to build a backdoor attack using our framework.
We take the default case of creating a data-poisoning attack as an example: inherit from the BadNet class in ./attack/badnet.py and build your own attack on top of it.
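For orientation, a minimal subclass sketch might look like the following. The class name MyAttack and the extra argument are hypothetical names used only for illustration; the real hooks to override are the ones walked through in the rest of this page (see ./attack/badnet.py for the reference implementation).

```python
# Sketch only: "MyAttack" and "--my_trigger_size" are illustrative names,
# not part of the framework.
import argparse

from attack.badnet import BadNet


class MyAttack(BadNet):
    """A custom data-poisoning attack that reuses BadNet's pipeline."""

    def set_bd_args(self, parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
        # keep BadNet's backdoor-related arguments, then append your own
        parser = super().set_bd_args(parser)
        parser.add_argument("--my_trigger_size", type=int, default=3,
                            help="hypothetical attack-specific hyperparameter")
        return parser
```

The sections below walk through what the inherited pipeline does and where you would plug in your own trigger, label transform, and training settings.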
Hyperparameter setting and basic configuration
First, by inheriting from the NormalCase class, we already get the basic training hyperparameters in args. You can add more arguments to the parser for your specific needs.
```python
parser = argparse.ArgumentParser(description=sys.argv[0])
parser = self.set_args(parser)
```
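For instance, an attack-specific option can be appended right after the defaults (the flag below is purely illustrative, not a built-in argument):

```python
# hypothetical extra option for your own attack; not a built-in flag
parser.add_argument("--my_warmup_epochs", type=int, default=0,
                    help="illustrative attack-specific option")
```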
Then we add the hyperparameters for the backdoor attack
```python
parser = self.set_bd_args(parser)
```
We load the default clean and backdoor hyperparameters into our args
```python
self.add_bd_yaml_to_args(args)
self.add_yaml_to_args(args)
```
After all of the above, we parse the args and initialize the save path and other logging settings.
```python
args = self.process_args(args)
self.prepare(args)
```
Backdoor dataset generation
We first load the clean dataset
```python
train_dataset_without_transform, \
train_img_transform, \
train_label_transform, \
test_dataset_without_transform, \
test_img_transform, \
test_label_transform, \
clean_train_dataset_with_transform, \
clean_train_dataset_targets, \
clean_test_dataset_with_transform, \
clean_test_dataset_targets \
    = self.benign_prepare()
```
Then we get the image trigger-injection functions (one for the train set, one for the test set) and the label manipulation function:
```python
train_bd_img_transform, test_bd_img_transform = bd_attack_img_trans_generate(args)
bd_label_transform = bd_attack_label_trans_generate(args)
```
bd_attack_img_trans_generate() is in backdoorbench/utils/aggregate_block/bd_attack_generate.py; you can add your own trigger injection function there and select it via args.attack.
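As a rough sketch of what such a function can look like (the class name is hypothetical, and the exact call signature the framework expects should be copied from the existing trigger classes in bd_attack_generate.py):

```python
import numpy as np


class AddWhiteCornerPatch:
    """Hypothetical trigger: paste a white square into the bottom-right corner.

    Assumes the image arrives as (or can be converted to) an HxWxC uint8 array;
    mirror the interface of the existing triggers in bd_attack_generate.py.
    """

    def __init__(self, patch_size: int = 3):
        self.patch_size = patch_size

    def __call__(self, img, *args, **kwargs):
        img = np.array(img)  # work on a numpy copy of the input image
        img[-self.patch_size:, -self.patch_size:, ...] = 255
        return img
```

Once defined, extend bd_attack_img_trans_generate() so that your attack name (matched against args.attack) builds and returns the train/test transforms around this trigger.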
bd_attack_label_trans_generate() is in the same file. Basic all-to-one and all-to-all label transforms are already implemented, and you can add your own label manipulation function.
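A custom label manipulation function follows the same pattern. Purely to illustrate the interface (copy the exact call signature from the built-in all-to-one / all-to-all classes), a shifted-label variant could look like this:

```python
class ShiftLabelAttack:
    """Hypothetical label transform: relabel class c as (c + shift) % num_classes.

    Only meant to show the shape of a label manipulation callable; the exact
    arguments it receives should match the built-in label transform classes.
    """

    def __init__(self, num_classes: int, shift: int = 1):
        self.num_classes = num_classes
        self.shift = shift

    def __call__(self, original_label, *args, **kwargs):
        return (int(original_label) + self.shift) % self.num_classes
```

Keep in mind that generate_poison_index_from_label_transform (used in the next step) selects samples based on the label transform, so a new transform type may also require extending that selection logic.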
We use the bd_label_transform to select which samples will be poisoned:
```python
train_poison_index = generate_poison_index_from_label_transform(
    clean_train_dataset_targets,
    label_transform=bd_label_transform,
    train=True,
    pratio=args.pratio if 'pratio' in args.__dict__ else None,
    p_num=args.p_num if 'p_num' in args.__dict__ else None,
)
```
Here, pratio poisons a fraction of the training set (e.g. 0.1 poisons 10% of the samples), while p_num poisons an exact number of samples.
With all the preparation done, we carry out the poisoning by constructing a prepro_cls_DatasetBD_v2 dataset:
```python
bd_train_dataset = prepro_cls_DatasetBD_v2(
    deepcopy(train_dataset_without_transform),
    poison_indicator=train_poison_index,
    bd_image_pre_transform=train_bd_img_transform,
    bd_label_pre_transform=bd_label_transform,
    save_folder_path=f"{args.save_path}/bd_train_dataset",
)
```
The test dataset follows the same procedure, but remember that we do not want target-class samples in bd_test, so one extra step is to take a subset:
```python
bd_test_dataset.subset(
    np.where(test_poison_index == 1)[0]
)
```
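For completeness, the test_poison_index and bd_test_dataset used above can be built by mirroring the training-side code; a sketch, assuming that with train=False the index generator marks every sample eligible for poisoning (e.g. every non-target-class sample for all-to-one):

```python
# mirrors the training-side construction shown earlier
test_poison_index = generate_poison_index_from_label_transform(
    clean_test_dataset_targets,
    label_transform=bd_label_transform,
    train=False,
)
bd_test_dataset = prepro_cls_DatasetBD_v2(
    deepcopy(test_dataset_without_transform),
    poison_indicator=test_poison_index,
    bd_image_pre_transform=test_bd_img_transform,
    bd_label_pre_transform=bd_label_transform,
    save_folder_path=f"{args.save_path}/bd_test_dataset",
)
```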
Notice that we need to wrap the dataset with the training/test transformations by calling dataset_wrapper_with_transform. The transformations were dropped earlier to make the data manipulation more convenient.
```python
bd_test_dataset_with_transform = dataset_wrapper_with_transform(
    bd_test_dataset,
    test_img_transform,
    test_label_transform,
)
```
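The training set is wrapped the same way; assuming the symmetric call, this yields the bd_train_dataset_with_transform used by the trainer and by save_attack_result below:

```python
bd_train_dataset_with_transform = dataset_wrapper_with_transform(
    bd_train_dataset,
    train_img_transform,
    train_label_transform,
)
```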
Backdoor model training
First we need to configure the model and device; we support both single-device and DataParallel (multi-GPU) mode.
```python
self.net = generate_cls_model(
    model_name=args.model,
    num_classes=args.num_classes,
    image_size=args.img_size[0],
)

self.device = torch.device(
    (
        f"cuda:{[int(i) for i in args.device[5:].split(',')][0]}"
        if "," in args.device
        else args.device  # since DataParallel only allow .to("cuda")
    )
    if torch.cuda.is_available()
    else "cpu"
)

if "," in args.device:
    self.net = torch.nn.DataParallel(
        self.net,
        device_ids=[int(i) for i in args.device[5:].split(",")],  # eg. "cuda:2,3,7" -> [2,3,7]
    )
```
Next we set the training criterion, optimizer, and scheduler.
```python
criterion = argparser_criterion(args)
optimizer, scheduler = argparser_opt_scheduler(self.net, args)
```
Finally, we put everything together and pass it to BackdoorModelTrainer. This class handles both training and the output of plots and metric information, which you can find under your save path.
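The trainer itself is built from the network; a minimal sketch, assuming BackdoorModelTrainer is constructed from the model alone (check its definition for the exact constructor arguments):

```python
# assumed constructor; see the BackdoorModelTrainer definition for the exact signature
trainer = BackdoorModelTrainer(self.net)
```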
```python
trainer.train_with_test_each_epoch_on_mix(
    DataLoader(bd_train_dataset_with_transform,
               batch_size=args.batch_size,
               shuffle=True,
               drop_last=True,
               pin_memory=args.pin_memory,
               num_workers=args.num_workers),
    DataLoader(clean_test_dataset_with_transform,
               batch_size=args.batch_size,
               shuffle=False,
               drop_last=False,
               pin_memory=args.pin_memory,
               num_workers=args.num_workers),
    DataLoader(bd_test_dataset_with_transform,
               batch_size=args.batch_size,
               shuffle=False,
               drop_last=False,
               pin_memory=args.pin_memory,
               num_workers=args.num_workers),
    args.epochs,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    device=self.device,
    frequency_save=args.frequency_save,
    save_folder_path=args.save_path,
    save_prefix='attack',
    amp=args.amp,
    prefetch=args.prefetch,
    prefetch_transform_attr_name="ori_image_transform_in_loading",  # since we use the preprocess_bd_dataset
    non_blocking=args.non_blocking,
)
```
Backdoor model saving
This is handled by save_attack_result; you should pass it the basic setting information so the result can be loaded later.
```python
save_attack_result(
    model_name=args.model,
    num_classes=args.num_classes,
    model=trainer.model.cpu().state_dict(),
    data_path=args.dataset_path,
    img_size=args.img_size,
    clean_data=args.dataset,
    bd_train=bd_train_dataset_with_transform,
    bd_test=bd_test_dataset_with_transform,
    save_path=args.save_path,
)
```
Notice that some training-controllable attacks may not have a bd_train, since they poison batch-wise; in that case you can pass None for it. Even if bd_train is None, the saved attack can still be loaded by defenses that do not need bd_train.
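When you later evaluate the attack or feed it to a defense, the saved result is reloaded. A sketch, assuming the load_attack_result helper that lives alongside save_attack_result and an attack_result.pt filename under the save path (verify both against utils/save_load_attack.py and your actual save folder):

```python
# assumed helper and filename; verify against utils/save_load_attack.py
from utils.save_load_attack import load_attack_result

result = load_attack_result(f"{args.save_path}/attack_result.pt")
# result is expected to contain the saved pieces:
# model weights, bd_train (possibly None), bd_test, and the basic settings
```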