detection_pretrain.strip

class strip(args, inspection_set, clean_set, model, strip_alpha: float = 0.5, N: int = 64, defense_fpr: float = 0.05, batch_size=128)[source]

Bases: object

STRIP: A Defence Against Trojan Attacks on Deep Neural Networks

basic structure for defense method:

  1. basic setting: args

  2. attack result (model, train data, test data)

  3. STRIP detection:
    1. mix up clean samples with each other, record the entropy of the predictions, and keep the smallest and largest entropies as thresholds.

    2. mix up the test samples with clean samples, and record the entropy.

    3. flag samples whose entropy falls outside the thresholds as poisoned.

  4. Record TPR and FPR.
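The entropy measurement in steps 3.1 and 3.2 can be sketched as follows. This is a minimal sketch, not the class API: `strip_entropy` and the `model_predict` callable (a batch of images in, softmax probabilities out) are illustrative assumptions.

```python
import numpy as np

def strip_entropy(model_predict, x, clean_samples, alpha=0.5):
    """Average prediction entropy over perturbed copies of x.

    model_predict -- hypothetical callable: batch of images -> softmax probs
    x             -- a single test image
    clean_samples -- array of clean images used for superimposition
    alpha         -- blend weight between the test image and clean samples
    """
    # Superimpose x onto each clean sample (linear blend).
    blended = alpha * x[None, ...] + (1.0 - alpha) * clean_samples
    probs = model_predict(blended)  # shape (N, num_classes)
    # Shannon entropy of each prediction, averaged over the N blends.
    # A backdoored input tends to keep its (target) prediction under
    # perturbation, yielding abnormally low entropy.
    eps = 1e-12
    ent = -np.sum(probs * np.log(probs + eps), axis=1)
    return float(ent.mean())
```

Thresholds are then taken from the entropy distribution of clean samples (e.g. via `defense_fpr`), and test samples outside that range are flagged.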

parser = argparse.ArgumentParser(description=sys.argv[0])
strip.add_arguments(parser)
args = parser.parse_args()
strip_method = strip(args)
if getattr(args, 'result_file', None) is None:
    args.result_file = 'defense_test_badnet'
result = strip_method.detection(args.result_file)

Note

@inproceedings{gao2019strip, title={Strip: A defence against trojan attacks on deep neural networks}, author={Gao, Yansong and Xu, Change and Wang, Derui and Chen, Shiping and Ranasinghe, Damith C and Nepal, Surya}, booktitle={Proceedings of the 35th Annual Computer Security Applications Conference}, pages={113–125}, year={2019}}

Parameters:
  • args (basic) – in the base class

  • clean_sample_num (int) – number of clean sample given

  • target_label (int) – attack target class