Artificial intelligence in radiology is expanding rapidly, but its real-world deployment in hospitals lags behind. A lecture by radiologist Juan Guerra summarizes why this is happening, what can be done about it, and where AI already delivers value today. The key, he argues, lies in clearly defined clinical needs, solid evidence, and responsible implementation.
Where it stalls: barriers before, during, and after implementation
Before deployment, the main obstacles are a shortage of high-quality, properly labeled data for training algorithms and a lack of independent evidence of effectiveness. Comparing solutions is also difficult: when several tools address the same problem, physicians rarely have the training to benchmark them robustly, and doing so is costly. On top of this come economic questions and the integration of software with hardware, which is particularly complex in radiology.
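To make the benchmarking problem concrete, here is a minimal Python sketch of a head-to-head comparison of two hypothetical tools on the same locally labeled test set. The tool names and toy data are illustrative, not from the lecture; a real benchmark would also require case-mix matching, reader studies, and formal statistical testing.

```python
def confusion_counts(preds, labels):
    """Return (tp, fp, tn, fn) for binary predictions vs. ground truth."""
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    return tp, fp, tn, fn

def report(name, preds, labels):
    tp, fp, tn, fn = confusion_counts(preds, labels)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")

# Toy data: 1 = finding present/flagged, 0 = absent/not flagged.
labels       = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
tool_a_preds = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
tool_b_preds = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

report("Tool A", tool_a_preds, labels)
report("Tool B", tool_b_preds, labels)

# Discordant pairs: cases where exactly one tool is correct. These are
# what a paired significance test (e.g., McNemar's test) would examine.
discordant = sum((a == y) != (b == y)
                 for a, b, y in zip(tool_a_preds, tool_b_preds, labels))
print(f"discordant cases: {discordant}")
```

Even this simple comparison assumes a labeled local test set, which is exactly the costly ingredient most departments lack.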
After deployment comes the need for systematic performance monitoring. AI performance can degrade over time through 'drift': changes in the patient population, a new device, or a software upgrade; in one study, the decline was traced specifically to a system update. Regulatory trends, including the European AI Act, are therefore moving toward mandatory ongoing oversight. Finally, perceptions of the obstacles vary with the level of knowledge: the better we understand AI, the more realistically we can identify its risks, which makes education key.
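The sketch below illustrates what minimal post-deployment monitoring could look like, assuming a hypothetical feed of (month, prediction, ground_truth) records; the baseline, tolerance, and data are illustrative. Real monitoring would track several metrics and log device and software versions so that a drop, like the update-induced one mentioned above, can be localized.

```python
from collections import defaultdict

def monthly_sensitivity(records):
    """Group (month, prediction, truth) records and compute sensitivity per month."""
    by_month = defaultdict(lambda: [0, 0])  # month -> [true positives, positives]
    for month, pred, truth in records:
        if truth == 1:
            by_month[month][1] += 1
            if pred == 1:
                by_month[month][0] += 1
    return {m: tp / pos for m, (tp, pos) in sorted(by_month.items()) if pos}

# Toy feed: sensitivity quietly drops after a (hypothetical) software update.
records = (
    [("2024-01", 1, 1)] * 9 + [("2024-01", 0, 1)] * 1 +
    [("2024-02", 1, 1)] * 9 + [("2024-02", 0, 1)] * 1 +
    [("2024-03", 1, 1)] * 6 + [("2024-03", 0, 1)] * 4   # post-update drop
)

BASELINE, TOLERANCE = 0.90, 0.05  # illustrative acceptance values
for month, sens in monthly_sensitivity(records).items():
    flag = "ALERT: investigate drift" if sens < BASELINE - TOLERANCE else "ok"
    print(f"{month}: sensitivity={sens:.2f} ({flag})")
```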
How to implement responsibly: use cases, governance, and evidence
AI should be built around clinical 'use cases': precisely defined situations in which it is meant to deliver value (e.g., detecting large-vessel occlusion in stroke, or pulmonary nodules). Many early solutions were designed around the data that happened to be available rather than around clinical needs, and adoption suffered. Catalogs of clinical use cases therefore help: by spelling out data requirements, technical specifications, and the context of use, they make it easier for developers to target real problems and for hospitals to choose among tools.
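As a rough sketch of what one catalog entry might capture, the data structure below mirrors the elements named above (data requirements, technical specification, context of use). The fields and example values are illustrative, not a standardized schema.

```python
from dataclasses import dataclass, field

@dataclass
class UseCase:
    name: str                      # clinical problem the tool must solve
    modality: str                  # imaging modality
    population: str                # intended patient population / setting
    data_requirements: list[str] = field(default_factory=list)
    output: str = ""               # what the tool should return
    context_of_use: str = ""       # where in the workflow it sits

lvo = UseCase(
    name="Large-vessel occlusion detection in acute stroke",
    modality="CT angiography",
    population="Adults with suspected acute ischemic stroke",
    data_requirements=["thin-slice CTA of head and neck", "occlusion-level labels"],
    output="Per-study LVO flag with location and confidence",
    context_of_use="Triage notification before the radiologist's read",
)
print(lvo)
```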
Institutions should create interdisciplinary governance teams (clinicians, IT, developers, data management, privacy) to set up data access, integration, tool selection based on clinical value, and safety criteria. The foundations are data security and GDPR compliance, local validation of performance, and monitoring of generalizability: a tool with a stated accuracy of 99 % may achieve only 89 % locally. Published evidence is also scarce: a 2022 analysis found that of 100 CE-marked applications, 64 had no peer-reviewed clinical evidence at all, even though methodologies exist for preclinical, early clinical, and prospective comparative evaluation. Such evaluation pays off: the Swedish MASAI study in mammography screening showed that AI can halve the reading workload without worsening sensitivity or recall and, in the subsequent round, increase cancer detection by 28 %.
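Local validation can start very simply: re-measure the vendor's claimed accuracy on the institution's own labeled cases before go-live. In the sketch below, the 99 %/89 % figures mirror the generalizability gap mentioned above, the local sample counts are hypothetical, and the Wilson score interval is one standard way to express the uncertainty of such a local estimate.

```python
import math

def wilson_interval(correct, n, z=1.96):
    """95 % Wilson score interval for a proportion."""
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - margin, center + margin

CLAIMED = 0.99          # vendor-stated accuracy
correct, n = 445, 500   # hypothetical local validation results (89 %)

local = correct / n
lo, hi = wilson_interval(correct, n)
print(f"claimed accuracy: {CLAIMED:.2%}")
print(f"local accuracy:   {local:.2%} (95 % CI {lo:.2%} to {hi:.2%})")
if hi < CLAIMED:
    print("Claimed performance lies above the local CI: do not deploy as-is.")
```

A governance team would treat a result like this as a trigger for investigation (different case mix, scanner, or protocol), not automatically as proof the tool is unusable.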