Machine learning methods for biomedical image analysis that outperform humans in accuracy and precision on specific tasks have existed for nearly two decades.1 However, they are rarely used in current practice. In fact, visually inspecting and subjectively describing an imaging study has not fundamentally changed since the very first scientific report of findings on an X-ray in 1896.2
Despite the prowess of newer deep learning methods for recognizing patterns in images and other complex forms of data, the same fundamental barriers from the past two decades still stand in the way.
Integration into a hospital system’s PACS and dictation software is a necessary first step. Fortunately, commercial entities understand this, and tools are starting to be deployed centrally through AI marketplaces affiliated with various PACS. However, standardization and integration of these tools are still in early stages.
Clinical deployment of advanced biomedical image analysis research tools is also slowed because image preprocessing pipelines often require lengthy processing times and multiple quality control (QC) steps. While deep learning methods allow for rapid analysis of new data, the development of robust clinical systems requires that preprocessing and QC be automated, leaving few or no manual QC steps.
Data variability and ability to generalize
Research in biomedical image processing has historically focused on clean, homogeneous datasets, such as a 3-D MPRAGE or FLAIR sequence acquired under a research protocol to investigate a particular disease. However, performance on these datasets, which are typically acquired at a single institution, has been shown to overestimate real-world performance.3 This is particularly true when tools are applied to data from other institutions.4
Deep learning methods have the potential to overcome issues of lower-quality clinical data and variability across imaging parameters and institutions, but they require larger sample sizes and more diverse, unbiased training data. Acquiring this data is no small task, especially given privacy concerns and the time required for labeling and annotation.
Disease diversity within and between individuals
To be successful, most image processing tools and commercial radiology AI algorithms have been tailored to tackle a specific disease or abnormality, such as measuring calcium scores, ejection fraction, or multiple sclerosis plaques.5,6,7 This "narrow" perspective of defining the specific tasks that are solvable by AI algorithms is a critical first step, one embraced by the ACR Data Science Institute™. General AI is still a long way off, and there is debate over whether computers will ever achieve it, given the complexity with which the human brain analyzes problems.
While the narrow tasks being solved by AI provide decision support, they are far different from the reality of a radiologist's overall job. To be useful, narrow AI solutions will have to be integrated into more comprehensive solutions, and this must be done for every body part, multiplied by every modality. Ignoring the sheer diversity of human disease leads to what is known as spectrum bias.8
Further complicating narrow AI approaches is the problem that more than one disease or abnormality is often present in the same patient — such as when small vessel ischemic disease and chronic infarcts/insults are adjacent to a brain metastasis. If an AI tool can only address one (or even a few) types of abnormalities within a single patient, it will be of limited use for most imaging studies.
One potential solution to overcoming disease diversity and developing a more comprehensive solution is combining data-driven and domain-expert approaches. Ignoring the body of knowledge that we have already developed about the appearance of different diseases seems foolish, particularly for rare diseases in which adequate novel training examples will be hard to come by.
Even if we overcome the first three barriers by developing an AI tool that is both robust to data variation and fully integrated into clinical workflow, it must fundamentally add value to be adopted into clinical practice and be worth paying for (see sidebar on page 20).
Although we are starting to remove the barriers for integrating biomedical image analysis and machine learning into radiology practice, many hurdles that have prevented clinical translation in the past 20 years still remain. Overcoming them will require significant effort and collaboration among academia, industry partners, and professional societies.