An update on the European Bigpicture project
Background: Many promising results have been shown for the application of artificial intelligence (AI) in histopathology, and we are seemingly very close to the widespread introduction of these techniques into routine workflows. This is witnessed, for example, by the large increase in the number of companies focusing on developing AI for pathology. Still, several hurdles have so far hindered the adoption of AI. One important factor is that the quality of AI solutions may not always be as good as we would like, which often boils down to limited generalizability: the AI works well on data that are sufficiently similar to the data it was developed with but shows decreased performance on new data. This problem may be mitigated by training and validating AI on very large data sets collected from heterogeneous sources. However, collecting such data sets is challenging for technical, financial, ethical, and legal reasons.
Methods: We established the Bigpicture consortium, which aims to set up an EU-wide repository of whole slide images (WSIs) with associated metadata. The project focuses on setting up the required infrastructure (hardware and software), collecting 3 million WSIs from (pre-)clinical studies, developing guidelines for collecting data and making these available in a legal and ethical manner, and developing generic AI algorithms. The project is funded by the EU under the Innovative Medicines Initiative (IMI) program and started in February 2021.
Results: The Bigpicture consortium consists of 45 parties, including academic centers, large pharmaceutical companies, small and medium-sized enterprises, and regulatory bodies. At this time, a first version of the repository has been developed and is in operation. New features will be added as requested by users. The first pilot data sets have been collected and included in the Bigpicture repository. AI developers in the project have been able to access data sets and train AI models using these data. A draft data sharing agreement was established, which is expected to be finalized and agreed upon in Q4/2023. On the technical side, tools were developed to convert WSIs into a standardized DICOM format and to automatically extract data from the diverse image management systems in use at the partners' medical centers. In addition, generic AI technology was developed for training AI models in a weakly supervised manner.
Conclusion: The Bigpicture project is well on its way to establishing an ecosystem in which value is created by connecting the different stakeholders in the AI pathology domain. It will ensure access to data and AI solutions in a legally and ethically compliant environment, facilitating the creation and adoption of innovative pathology diagnostics.
Objectives:
- Understand the relevance of having access to large, diverse data sets for AI development
- Understand the setup and progress within the Bigpicture project
Presented by:
Jeroen van der Laak, PhD
Professor
Radboud University Medical Center
Jeroen van der Laak is associate professor of computational pathology at the pathology department of Radboud University Medical Center in Nijmegen, the Netherlands, and guest professor at the Center for Medical Image Science and Visualization in Linköping, Sweden. His research focuses on the use of AI for digitized histopathological images. His group was among the first to show the potential of AI for analysis of whole slide images. Further research focused on AI improvements to increase robustness and accuracy, and on AI application for diverse tasks, developing models for analysis of breast and colorectal cancer and for renal transplant biopsies. In 2016 and 2017, he coordinated the CAMELYON grand challenges. He is a member of the board of directors of the DPA, leads the ‘AI in Pathology’ taskforce of the European Society of Pathology, and is coordinator of the Bigpicture project. Dr van der Laak is the 2019 USCAP Nathan Kaufman laureate.