Joint Pathology Center
Mark D. Zarella, PhD
Director of Digital Pathology
Johns Hopkins University
Building a digital pathology repository to support AI: lessons learned from joint efforts by the Joint Pathology Center and Johns Hopkins University
Background: Development of AI/ML algorithms benefits significantly from the curation of large multi-institutional data sets that span multiple indications, capture differences in slide preparation, and include clinically relevant annotations. The Joint Pathology Center (JPC) houses the world's largest slide repository, with over 55 million slides collected over more than a century and sourced from laboratories around the globe. In 2020, the JPC, drawing in part on capabilities at Johns Hopkins University (JHU), began digitizing this repository to support AI/ML efforts. This required careful coordination and the development of high-throughput slide scanning and repository management across two physical sites. In this talk, we present the perspectives of both the JPC and JHU, including institutional barriers, technical challenges, and lessons learned.
Methods: In the first phase, JPC pulled ~10,000 slides per week from its repository, barcoded and QA'd the slides using a single ID convention, and delivered them to JHU on a biweekly basis. JHU in turn scanned the slides at 40x in an automated fashion, evaluated slide quality using a semi-automated QA procedure, and delivered the images to a digital repository in the cloud.
Results: In the first three full months of the project following go-live, the total throughput of this procedure was approximately 30,000 whole-slide images per month, with a 98% final scan success rate (84% initial scan success rate before rescans were triggered).
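The distinction between the initial and final scan success rates can be illustrated with a minimal Python sketch. It assumes each scan attempt is logged with a slide ID, an attempt number, and a QA pass/fail flag; the `ScanAttempt` record, the `scan_success_rates` function, and the toy data are hypothetical and are not taken from the JPC/JHU pipeline. The toy data are constructed only so that the rates match the reported 84% and 98%.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Tuple


class ScanAttempt(NamedTuple):
    slide_id: str    # hypothetical barcode ID
    attempt: int     # 1 = initial scan, 2+ = rescans
    passed_qa: bool  # outcome of the QA check for this attempt


def scan_success_rates(attempts: Iterable[ScanAttempt]) -> Tuple[float, float]:
    """Return (initial_rate, final_rate) as fractions of distinct slides."""
    by_slide = defaultdict(list)
    for record in attempts:
        by_slide[record.slide_id].append(record)
    if not by_slide:
        raise ValueError("no scan attempts supplied")

    initial_ok = 0
    final_ok = 0
    for records in by_slide.values():
        # Initial success: the lowest-numbered attempt passed QA.
        first = min(records, key=lambda r: r.attempt)
        initial_ok += int(first.passed_qa)
        # Final success: any attempt, including rescans, passed QA.
        final_ok += int(any(r.passed_qa for r in records))

    total = len(by_slide)
    return initial_ok / total, final_ok / total


# Toy data: 50 slides, 42 pass on the first scan, 7 more pass after a rescan,
# and 1 fails both attempts.
attempts = []
for i in range(50):
    sid = f"S{i:03d}"
    if i < 42:
        attempts.append(ScanAttempt(sid, 1, True))
    else:
        attempts.append(ScanAttempt(sid, 1, False))
        attempts.append(ScanAttempt(sid, 2, i < 49))

initial, final = scan_success_rates(attempts)
print(f"initial={initial:.0%} final={final:.0%}")
```

If both reported rates are computed over the same set of slides, rescans account for the 14-percentage-point difference, which corresponds to recovering 14 of every 16 initially failed scans.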
Conclusion: Slide digitization efforts across multiple institutions can be efficient even when slides are scanned at a different location from the slide repository.
Learning objectives:
- Develop automated high-throughput slide scanning at their home institutions
- Embark on collaborations in which the slide scanning and slide repository may be separate
- Understand the impact of large and diverse data sets on AI/ML development
Dr. Zarella focuses on the deployment of digital pathology in clinical practice and the development and analysis of novel techniques in imaging and computational pathology. His work relies on imaging modalities such as whole-slide imaging and optical coherence tomography (OCT) as well as computational approaches such as deep learning, image processing, explainable AI, and visual and cognitive analytics. He received his undergraduate degree in Physics from the University of Massachusetts and his PhD in Neuroscience from the State University of New York in 2011. Dr. Zarella joined the Johns Hopkins faculty in 2020 as Director of Digital Pathology. Previously, he was the Technical Director of Pathology Imaging & Informatics at Drexel University College of Medicine. He is a member of the DPA Board and of the CAP Digital and Computational Pathology and AI Committees, and has contributed to several white papers on whole-slide imaging and computational pathology.