Multi-Metric Trustworthiness Evaluation of AI-Assisted Medical Imaging Diagnosis: Integrating Confidence Calibration and Distribution Shift Detection

Authors

  • Yanhuan Chen, Master of Engineering, Dartmouth College, NH, USA
  • Jiawen Lai, Computer Engineering, University of California, Riverside, CA, USA

DOI:

https://doi.org/10.66372/JGER.V4I1.8

Keywords:

Confidence Calibration, Distribution Shift Detection, Medical Imaging AI, Trustworthiness Evaluation

Abstract

The rapid proliferation of artificial intelligence in medical imaging diagnosis has created an urgent need for comprehensive trustworthiness evaluation frameworks. This study proposes a multi-metric evaluation methodology that integrates confidence calibration assessment with distribution shift detection to address silent failures in deployed AI diagnostic systems. The proposed framework comprises three components: Expected Calibration Error (ECE) measurement across demographic subgroups, statistical distance-based distribution shift detection in deep feature spaces, and a novel trustworthiness scorecard that synthesizes prediction confidence, distributional similarity, and historical accuracy metrics. Experimental validation across three chest radiograph datasets demonstrates that the integrated evaluation approach identifies 23.7% more potentially unreliable predictions than single-metric methods while maintaining clinical utility. The framework provides actionable insights for FDA postmarket surveillance requirements and supports evidence-based clinical AI governance strategies.
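
To illustrate the calibration component mentioned in the abstract, the following is a minimal sketch of Expected Calibration Error computed per subgroup. It assumes the standard equal-width binning definition of ECE; the abstract does not state the binning scheme or subgroup definitions the authors used, and the function names, the `n_bins` parameter, and the sample data are hypothetical.

```python
from collections import defaultdict


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins: sum over bins of (n_b / N) * |acc_b - conf_b|.

    Assumption: standard equal-width binning; the paper's scheme may differ.
    """
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")
    bins = defaultdict(list)
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins.values():
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def ece_by_subgroup(confidences, correct, groups, n_bins=10):
    """Compute ECE separately for each subgroup label (hypothetical helper)."""
    by_group = defaultdict(lambda: ([], []))
    for conf, hit, group in zip(confidences, correct, groups):
        by_group[group][0].append(conf)
        by_group[group][1].append(hit)
    return {
        group: expected_calibration_error(confs, hits, n_bins)
        for group, (confs, hits) in by_group.items()
    }


# Illustrative data only, not taken from the paper.
confidences = [0.9, 0.9, 0.6, 0.6]
correct = [True, True, True, False]
groups = ["a", "a", "b", "b"]
overall = expected_calibration_error(confidences, correct)
per_group = ece_by_subgroup(confidences, correct, groups)
```

Reporting ECE per subgroup alongside the overall value can surface cases where a model that looks well calibrated in aggregate is miscalibrated for a particular group.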

Published

2026-01-26

How to Cite

Multi-Metric Trustworthiness Evaluation of AI-Assisted Medical Imaging Diagnosis: Integrating Confidence Calibration and Distribution Shift Detection. (2026). Journal of Global Engineering Review, 4(1), 113-126. https://doi.org/10.66372/JGER.V4I1.8