Adversarial attacks pose significant challenges for vision models in critical fields like healthcare, where reliability is essential. Although adversarial training has been well studied for natural images, its application to biomedical and microscopy data remains limited. Existing self-supervised adversarial training methods overlook the hierarchical structure of histopathology images, in which patient-slide-patch relationships provide valuable discriminative signals. To address this, we propose Hierarchical Self-Supervised Adversarial Training (HSAT), which exploits these properties to craft adversarial examples using multi-level contrastive learning and integrates them into adversarial training for enhanced robustness. We evaluate HSAT on the multiclass histopathology dataset OpenSRH and show that it outperforms existing methods from both the biomedical and natural image domains. HSAT enhances robustness, achieving an average gain of 54.31% in the white-box setting and limiting the performance drop to 3-4% in the black-box setting, compared to 25-30% for the baseline. These results set a new benchmark for adversarial training in this domain, paving the way for more robust models. Our code and pretrained models will be made publicly available.
We observe that vision models trained using our HSAT framework exhibit significantly higher adversarial robustness in white-box settings than baseline methods. Under the PGD attack at ε = 8/255 with a ResNet-50 backbone, HSAT achieves robustness gains of 43.90%, 60.70%, and 58.33% on patch, slide, and patient classification, respectively, over non-adversarial hierarchical training. It also outperforms instance-level adversarial training (HSAT-Patch) by 6.68%, 10.51%, and 10% on the same tasks. Although clean accuracy drops slightly relative to non-adversarial methods, HSAT still improves clean performance over HSAT-Patch by 15.68%, 17.12%, and 20%.
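The white-box evaluation above uses the standard L-infinity PGD attack (Madry et al.). A minimal sketch of that attack follows; the step size, step count, and random start shown here are common defaults, not necessarily the paper's exact configuration.

```python
import torch

def pgd_attack(model, x, y, loss_fn, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD with random start: iteratively ascend the loss
    gradient sign and project back into the eps-ball around x.

    Generic sketch of the standard attack, not the paper's exact setup."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # signed gradient ascent step, then projection onto the eps-ball
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv
```

Robust accuracy is then simply clean accuracy measured on `pgd_attack(model, x, y, loss_fn)` instead of `x`.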
We observe that target vision models trained using our HSAT framework are significantly more robust to adversarial examples transferred from different surrogate models. On average, HSAT-trained target models show a performance drop of only around 3-4%, whereas target models trained with HiDisc drop by around 25-30%.
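The black-box protocol behind these numbers can be sketched as follows: craft perturbations on a surrogate model, then measure the target model's accuracy on them. The single-step FGSM attack used here is an illustrative stand-in; any surrogate-side attack fits the same protocol.

```python
import torch
import torch.nn.functional as F

def fgsm(surrogate, x, y, eps=8 / 255):
    """One-step FGSM crafted on the surrogate model (illustrative choice)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

@torch.no_grad()
def transfer_accuracy(target, x_adv, y):
    """Accuracy of the target model on examples crafted on a surrogate;
    the gap to clean accuracy is the transfer-induced performance drop."""
    return (target(x_adv).argmax(1) == y).float().mean().item()
```

Comparing `transfer_accuracy` against clean accuracy for each surrogate/target pair yields the per-model performance drops summarized above.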
Ablation studies for Hierarchical Self-Supervised Adversarial Training (HSAT) in Tables 3 and 4 show that increasing hierarchical discrimination—progressing from patch-level (HSAT-Patch) to slide-level (HSAT-Slide) and patient-level (HSAT-Patient)—consistently improves adversarial robustness. These results highlight the effectiveness of multi-level adversarial training in aligning robust representations across hierarchical levels.
@article{malik2025hierarchical,
title={Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology},
author={Malik, Hashmat Shadab and Kunhimon, Shahina and Naseer, Muzammal and Khan, Fahad Shahbaz and Khan, Salman},
journal={arXiv preprint arXiv:2503.10629},
year={2025}
}