A Mutual Information Regularization for Adversarial Training

Modeste Atsague (Iowa State University)*; Olukorede J Fakorede (Iowa State University); Jin Tian (Iowa State University)

PMLR Page

Abstract

Recently, a number of methods have been developed to alleviate the vulnerability of deep neural networks to adversarial examples, among which adversarial training and its variants have proven the most effective empirically. This paper aims to further improve the robustness that adversarial training provides against adversarial examples. We propose a new training method, mutual information and mean absolute error adversarial training (MIMAE-AT), which adds two regularization terms to standard adversarial training: the mutual information between the probabilistic predictions for the natural and the adversarial examples, and the mean absolute error between their logits. Our experiments demonstrate that the proposed MIMAE-AT method improves on the state of the art in adversarial robustness.
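To make the two regularization terms concrete, the following is a minimal sketch in plain Python and should not be read as the authors' implementation. Several choices are assumptions that the abstract does not specify: the mutual information is estimated from a batch-averaged outer product of the two prediction vectors, the mutual information is maximized (so it enters the loss with a negative sign), the base loss is cross-entropy on the adversarial examples, and the function names and the weights `lam_mi` and `lam_mae` are hypothetical. Generating the adversarial examples themselves is omitted.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one logit vector.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mutual_information(p_nat, p_adv, eps=1e-12):
    # ASSUMPTION: the joint distribution over (natural class, adversarial class)
    # is estimated as the batch average of outer products of the predictions.
    n, k = len(p_nat), len(p_nat[0])
    joint = [[sum(p_nat[i][a] * p_adv[i][b] for i in range(n)) / n
              for b in range(k)] for a in range(k)]
    row = [sum(joint[a]) for a in range(k)]                       # marginal, natural
    col = [sum(joint[a][b] for a in range(k)) for b in range(k)]  # marginal, adversarial
    mi = 0.0
    for a in range(k):
        for b in range(k):
            if joint[a][b] > eps:  # skip (near-)zero cells to avoid log(0)
                mi += joint[a][b] * math.log(joint[a][b] / (row[a] * col[b]))
    return mi

def mean_absolute_error(z_nat, z_adv):
    # Mean absolute difference between natural and adversarial logits.
    count = len(z_nat) * len(z_nat[0])
    return sum(abs(a - b) for r_nat, r_adv in zip(z_nat, z_adv)
               for a, b in zip(r_nat, r_adv)) / count

def cross_entropy(probs, labels, eps=1e-12):
    # Average negative log-likelihood of the true labels.
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)

def mimae_style_loss(logits_nat, logits_adv, labels, lam_mi=1.0, lam_mae=1.0):
    # ASSUMPTION: cross-entropy on adversarial examples, minus weighted MI,
    # plus weighted MAE. The weights lam_mi and lam_mae are hypothetical.
    p_nat = [softmax(z) for z in logits_nat]
    p_adv = [softmax(z) for z in logits_adv]
    ce = cross_entropy(p_adv, labels)
    mi = mutual_information(p_nat, p_adv)
    mae = mean_absolute_error(logits_nat, logits_adv)
    return ce - lam_mi * mi + lam_mae * mae
```

Under these assumptions, the mutual information term rewards predictions on adversarial examples that remain informative about the predictions on their natural counterparts, while the mean absolute error term penalizes drift between the two sets of logits directly.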