by Mohamed Amgad, M.D., M.Sc., Ph.D. Candidate (Emory University), Predoctoral Fellow (Northwestern University)


TL;DR: Annotation data, delineating histopathologic regions and cells, is critical for developing artificial intelligence models to answer fundamental research questions. Unfortunately, this data is prohibitively difficult to obtain manually at scale due to pathologist time constraints. Passive, effortless data collection during routine diagnosis and teaching will help solve this problem.


Without any effort, the very act of online browsing generates a mind-boggling amount of data. Social media and search engines use this data to build algorithms that model your past behavior and predict your future preferences. These algorithms are then used to show you relevant advertisements and, infamously, to affect your political and social choices. If you’ve been following the news lately, you’re probably already familiar with this practice, and you may even have strong opinions about it. In this blog post, I make the argument for copying that model to the computational pathology research domain.


In 2017, the FDA approved the first slide scanner and viewing system for primary clinical diagnosis. This year, they approved the first AI algorithm for computer-aided diagnosis. Everyone is excited, and I know I am! That said, I think it’s important to recognize that there are opportunities for growth and, more importantly, that the current paradigm for developing computational pathology algorithms is unsustainable.


First, Definitions!

Let’s make sure we’re on the same page regarding definitions. According to a white paper from the Digital Pathology Association, Digital Pathology is “a blanket term that encompasses tools and systems to digitize pathology slides and associated meta-data, their storage, review, analysis, and enabling infrastructure.” Computational Pathology is a sub-field, defined as “the omics or big-data approach to pathology, where multiple sources of patient information including pathology image data and meta-data are combined to extract patterns and analyze features.” Finally, Deep Learning is a computational modeling approach that learns to perform various tasks, like recognizing tumor tissue in a future whole-slide scan, using existing data. I recommend you take a look at Table 1 from the paper for more definitions.


Why the current paradigm is unsustainable

Let’s say you want to train a deep-learning model to recognize metastatic breast cancer foci in lymph nodes. This is an ideal task for computer-aided diagnosis because it is tedious, time-consuming, and error-prone when done visually. The most straightforward way to train that algorithm is to provide it with tens of thousands of annotations. These annotations are painstakingly produced by pathologists, who spend hours manually tracing tissue boundaries in whole slide scans. This is precisely what the CAMELYON dataset curators did, and the resulting algorithm achieved state-of-the-art diagnostic performance — as expected. This method for training deep learning models is called supervised learning. It almost always yields the most accurate results. If it’s not already obvious, the main drawback of this approach is the tremendous investment needed to produce annotation data. Anyone would be bored to death if they had to draw tens of thousands of polygons to delineate various tissue structures. That’s anyone — now imagine asking a busy, highly-trained pathologist to do this task!


Two principal solutions have been proposed to address this issue: either increase the supply of manual annotations or reduce the demand for them. To increase supply, I published a couple of papers showing that non-pathologists with some medical background (like medical students) can reliably annotate visually distinctive tissue regions and cells in breast cancer. This crowdsourcing approach is helpful as long as we only care about simple visual patterns, and it requires a non-trivial investment to offer financial or career-advancement incentives. Even so, research investigators using this approach in a medical school setting need to make sure that: (1) the tasks are well-aligned with the students’ academic training; (2) the students receive the necessary training, and there are adequate quality-control measures; and (3) there are no commercial or other conflicts of interest. Alternatively, some research groups have focused on increasing supply by supplementing training datasets with synthetically generated data.


An alternative paradigm, called weakly supervised multiple-instance learning, is gaining popularity on the demand-reduction side: it does not use manual annotations at all, only the whole-slide label. A key paper used this approach in 2019 and led to the recently FDA-approved prostate cancer algorithm. It seems like a magic solution, doesn’t it? Well, not quite! Because the supervision signal is so weak, far more clinical data is needed to train these algorithms — the weakly supervised approach required over 10,000 slides, compared to 270 in CAMELYON. Additionally, this approach is challenging to apply for dense mapping of tissue regions and cells or for detecting uncommon histopathologic patterns. However, as this paper illustrates, mapping all tissues and cells in a whole-slide scan is critical for developing explainable models to address fundamental biological questions.
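To make the idea concrete, here is a toy sketch of multiple-instance learning with max-pooling, not the published method: a linear per-tile classifier is trained using only slide-level labels, with the gradient flowing through the single highest-scoring tile. The synthetic data, feature count, and learning rate are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def slide_score(tiles, w, b):
    """Slide-level probability = max over per-tile probabilities (max-pooling MIL)."""
    logits = tiles @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))
    i = int(np.argmax(probs))
    return float(probs[i]), i

def train_mil(slides, labels, n_features, lr=0.5, epochs=200):
    """Train a per-tile linear classifier from slide labels alone.
    Each step, the cross-entropy gradient updates only the top-scoring tile."""
    w, b = np.zeros(n_features), 0.0
    for _ in range(epochs):
        for tiles, y in zip(slides, labels):
            p, i = slide_score(tiles, w, b)
            grad = p - y                    # dLoss/dlogit for cross-entropy
            w -= lr * grad * tiles[i]
            b -= lr * grad
    return w, b

def make_slide(positive, n_tiles=20, n_features=5):
    """Toy slide: positive slides contain one strongly 'tumor-like' tile."""
    tiles = rng.normal(0, 1, size=(n_tiles, n_features))
    if positive:
        tiles[rng.integers(n_tiles), 0] += 5.0
    return tiles

slides = [make_slide(p) for p in (1, 0, 1, 0, 1, 0)]
labels = [1, 0, 1, 0, 1, 0]
w, b = train_mil(slides, labels, n_features=5)
```

Even in this toy setting, notice the trade-off the post describes: because only one tile per slide receives a learning signal, many more slides are needed than with dense annotations.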


Passive data collection to the rescue

I hope I have convinced you that none of the above approaches is a magic bullet solution for training reliable computational pathology algorithms. The framework I am proposing is no magic bullet either, but I do believe it is the most practical solution. Let’s imagine how this could work in an academic setting. The pathology attending is sitting in a digital sign-out room with several pathology residents and fellows. In front of her is a computer screen showing a digital scan of a breast cancer slide. She first briefly describes salient clinical parameters, then starts panning and zooming on various areas within the slide. “This area has ductal carcinoma in situ, comedo pattern.” “This area has invasive ductal carcinoma.” “This is a clear example of stromal hyalinization.” A real-time voice recognition system captures everything she says. Everything she does on the screen is tracked and synchronized with the audio transcripts. By the end of the teaching session, we will have collected a sizable amount of weakly labeled training data with no extra effort from the pathologist.
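One way such a capture pipeline could pair utterances with on-screen regions is sketched below. Everything here is hypothetical — the event and utterance schemas and the timing logic are invented for illustration — but it shows how synchronized viewport tracking and speech transcripts could yield weak region labels for free.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ViewportEvent:
    """A pan/zoom event: at time t, the viewer shows this slide region."""
    t: float   # seconds since session start
    x: int     # top-left corner of visible region, slide coordinates
    y: int
    w: int
    h: int

@dataclass
class Utterance:
    """One transcribed phrase from the real-time speech recognizer."""
    t_start: float
    t_end: float
    text: str

def region_at(events: List[ViewportEvent], t: float) -> Optional[ViewportEvent]:
    """Return the last viewport event at or before time t (events sorted by t)."""
    current = None
    for e in events:
        if e.t <= t:
            current = e
        else:
            break
    return current

def weak_labels(events, utterances):
    """Pair each utterance with the region on screen when it was spoken,
    producing (region, label-text) training pairs at zero extra cost."""
    pairs = []
    for u in utterances:
        region = region_at(events, u.t_start)
        if region is not None:
            pairs.append(((region.x, region.y, region.w, region.h), u.text))
    return pairs

# Example session: the attending pans to two areas and names each pattern.
events = [
    ViewportEvent(0.0, 0, 0, 2000, 2000),
    ViewportEvent(12.5, 4000, 1000, 1000, 1000),
]
utterances = [
    Utterance(3.0, 6.0, "ductal carcinoma in situ, comedo pattern"),
    Utterance(14.0, 16.5, "invasive ductal carcinoma"),
]
pairs = weak_labels(events, utterances)
```

A production system would of course need dwell-time filtering, zoom-level awareness, and speech-to-text error handling, but the core alignment step is this simple.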


We can imagine how this would work in a routine diagnostic setting too. Of course, the signal might be weaker because of the lack of synchronized audio transcripts, but the basic principle would still apply. After all, the pathologist produces a diagnostic report, which can be parsed computationally — for example, with natural-language processing — and linked to the screen capture recording.
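As a first pass, parsing the report might amount to spotting known diagnostic terms in the free text. The tiny vocabulary below is purely illustrative; a real system would match against a clinical ontology such as SNOMED CT.

```python
import re

# Toy vocabulary of diagnostic terms (illustrative only).
VOCAB = [
    "invasive ductal carcinoma",
    "ductal carcinoma in situ",
    "stromal hyalinization",
]

def extract_terms(report: str):
    """Return vocabulary terms mentioned in a free-text diagnostic report."""
    found = []
    lowered = report.lower()
    for term in sorted(VOCAB, key=len, reverse=True):  # prefer longest match
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append(term)
    return found

report = ("Sections show invasive ductal carcinoma, grade 2, with adjacent "
          "ductal carcinoma in situ.")
labels = extract_terms(report)
```

The extracted terms would then serve as whole-slide weak labels, linked to the screen-capture recording of the same case.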


Ethical and regulatory considerations

Anyone who has been following the news will know that passive data collection schemes are a double-edged sword, and there need to be clear ethical and regulatory frameworks that govern the collection and usage of such data. For example, does this data belong to the patient or the hospital system? Would pathologists feel comfortable knowing that their every move is being recorded and mined? Does this impact their diagnostic performance, and if so, is the impact positive or negative? Can this data be used by malpractice insurance companies to support lawsuits filed against pathologists? If so, how would this impact the pathologists’ willingness to adopt such systems in their routine work?


Final thoughts

Progress in computational pathology applications is strongly tied to our ability to generate large volumes of annotation data to train and validate accurate models. Passive data collection is a strategy that is tried-and-true in other domains, and we should seriously consider it as an untapped source of high-quality weak annotation data to advance computational pathology research.


Disclaimer: In seeking to foster discourse on a wide array of ideas, the Digital Pathology Association believes that it is important to share a range of prominent industry viewpoints. This article does not necessarily express the viewpoints of the DPA; however, we view it as a valuable perspective with which to facilitate discussion.