IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0266060.html
   My bibliography  Save this article

Transferability of features for neural networks links to adversarial attacks and defences

Author

Listed:
  • Shashank Kotyan
  • Moe Matsuki
  • Danilo Vasconcellos Vargas

Abstract

The reason for the existence of adversarial samples is still barely understood. Here, we explore the transferability of learned features to Out-of-Distribution (OoD) classes. We do this by assessing neural networks’ capability to encode the existing features, revealing an intriguing connection with adversarial attacks and defences. The principal idea is that, “if an algorithm learns rich features, such features should represent Out-of-Distribution classes as a combination of previously learned In-Distribution (ID) classes”. This is because OoD classes usually share several regular features with ID classes, given that the features learned are general enough. We further introduce two metrics to assess the transferred features representing OoD classes. One is based on inter-cluster validation techniques, while the other captures the influence of a class over learned features. Experiments suggest that several adversarial defences decrease the attack accuracy of some attacks and improve the transferability-of-features as measured by our metrics. Experiments also reveal a relationship between the proposed metrics and adversarial attacks (a high Pearson correlation coefficient and low p-value). Further, statistical tests suggest that several adversarial defences, in general, significantly improve transferability. Our tests suggests that models having a higher transferability-of-features have generally higher robustness against adversarial attacks. Thus, the experiments suggest that the objectives of adversarial machine learning might be much closer to domain transfer learning, as previously thought.

Suggested Citation

  • Shashank Kotyan & Moe Matsuki & Danilo Vasconcellos Vargas, 2022. "Transferability of features for neural networks links to adversarial attacks and defences," PLOS ONE, Public Library of Science, vol. 17(4), pages 1-19, April.
  • Handle: RePEc:plo:pone00:0266060
    DOI: 10.1371/journal.pone.0266060
    as

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0266060
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0266060&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0266060?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Qi Zhang & Zhenkai Qin & Yunjie Zhang, 2024. "MCGCL:Adversarial attack on graph contrastive learning based on momentum gradient candidates," PLOS ONE, Public Library of Science, vol. 19(6), pages 1-19, June.

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pone00:0266060. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: plosone (email available below). General contact details of provider: https://journals.plos.org/plosone/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.