Adversarial attacks pose significant challenges for vision models in critical fields like healthcare, where reliability is essential. Although adversarial training has been well studied for natural images, its application to biomedical and microscopy data remains limited. Existing self-supervised adversarial training methods overlook the hierarchical structure of histopathology images, in which patient-slide-patch relationships provide valuable discriminative signals. To address this, we propose Hierarchical Self-Supervised Adversarial Training (HSAT), which exploits these properties to craft adversarial examples using multi-level contrastive learning and integrates them into adversarial training for enhanced robustness. We evaluate HSAT on the multiclass histopathology dataset OpenSRH and show that it outperforms existing methods from both the biomedical and natural image domains. HSAT enhances robustness, achieving an average gain of 54.31% in the white-box setting and limiting the performance drop to 3-4% in the black-box setting, compared to 25-30% for the baseline. These results set a new benchmark for adversarial training in this domain, paving the way for more robust models. Our code and pretrained models will be made publicly available.
We observe that vision models trained using our HSAT framework exhibit significantly higher adversarial robustness in white-box settings than baseline methods. Under the PGD attack at ε = 8/255 with a ResNet-50 backbone, HSAT achieves robustness gains of 43.90%, 60.70%, and 58.33% on patch, slide, and patient classification, respectively, over non-adversarial hierarchical training. It also outperforms instance-level adversarial training (HSAT-Patch) by 6.68%, 10.51%, and 10% on the same tasks. Although clean accuracy drops slightly relative to non-adversarial methods, HSAT still improves clean performance over HSAT-Patch by 15.68%, 17.12%, and 20%.
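The white-box evaluation above uses the standard L-infinity PGD attack (Madry et al.). A minimal sketch of that attack follows; the step size, step count, and random start shown here are common defaults, not necessarily the paper's exact configuration.

```python
import torch

def pgd_attack(model, x, y, loss_fn, eps=8 / 255, alpha=2 / 255, steps=10):
    """L-infinity PGD with random start: iteratively ascend the loss
    gradient sign and project back into the eps-ball around x.

    Generic sketch of the standard attack, not the paper's exact setup."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # signed gradient ascent step, then projection onto the eps-ball
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv
```

Robust accuracy is then simply clean accuracy measured on `pgd_attack(model, x, y, loss_fn)` instead of `x`.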
We observe that target vision models trained using our HSAT framework are significantly more robust to adversarial examples transferred from different surrogate models. On average, HSAT-trained target models show a performance drop of only around 3-4%, whereas target models trained with HiDisc drop by around 25-30%.
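The black-box protocol behind these numbers can be sketched as follows: craft perturbations on a surrogate model, then measure the target model's accuracy on them. The single-step FGSM attack used here is an illustrative stand-in; any surrogate-side attack fits the same protocol.

```python
import torch
import torch.nn.functional as F

def fgsm(surrogate, x, y, eps=8 / 255):
    """One-step FGSM crafted on the surrogate model (illustrative choice)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

@torch.no_grad()
def transfer_accuracy(target, x_adv, y):
    """Accuracy of the target model on examples crafted on a surrogate;
    the gap to clean accuracy is the transfer-induced performance drop."""
    return (target(x_adv).argmax(1) == y).float().mean().item()
```

Comparing `transfer_accuracy` against clean accuracy for each surrogate/target pair yields the per-model performance drops summarized above.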
Ablation studies for Hierarchical Self-Supervised Adversarial Training (HSAT) in Tables 3 and 4 show that increasing hierarchical discrimination—progressing from patch-level (HSAT-Patch) to slide-level (HSAT-Slide) and patient-level (HSAT-Patient)—consistently improves adversarial robustness. These results highlight the effectiveness of multi-level adversarial training in aligning robust representations across hierarchical levels.
@article{malik2025hierarchical,
title={Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology},
author={Malik, Hashmat Shadab and Kunhimon, Shahina and Naseer, Muzammal and Khan, Fahad Shahbaz and Khan, Salman},
journal={arXiv preprint arXiv:2503.10629},
year={2025}
}