PV24 Speakers

Subject to change.

 

 

Ehsan Ullah, MBBS, MPhil, PhD

Operations Manager, Health New Zealand, Auckland


Ehsan works as Operations Manager for the Anatomical Pathology departments at LabPlus, Auckland City Hospital; the Auckland region's community pathology; and New Zealand's National Perinatal Pathology Service. He is a member of the International Organization for Standardization (ISO) technical committee on medical laboratories and in vitro diagnostic systems. His research interests include the use of AI to improve workflow efficiency and patient outcomes.

 

 

SESSIONS

Extraction of Discrete Information from Pathology Reports Using Local and Private LLMs
   Mon, Nov 4
   04:20PM - 04:40PM ET

Background: Surgical pathology reports provide detailed descriptions of tumor samples and are the primary communication tool between pathologists and other clinical specialists involved in a patient's treatment journey. These reports may include critical information such as cancer site, laterality, tumor stage and grade, histology, behavior, and disease codes. Extracting this information as discrete variables has numerous downstream applications, including the maintenance of cancer registries. Cancer registries are databases that capture essential information about cancer patients, and pathology reports are a vital source for creating and maintaining accurate tumor records. Currently, relevant information is extracted from pathology reports for cancer registries manually, with human experts reviewing the documents and populating the records.

Current Solutions and Their Limitations: Various natural language processing (NLP) methods have been proposed for extracting information from pathology reports. However, these methods often fail to achieve the desired accuracy because of the complexity of pathology reports, which involve cancer typing, sub-typing, and specialized medical terminology. Additionally, pathology reports may be stored in formats such as PDF or RTF, necessitating pre-processing steps like optical character recognition (OCR), which introduce additional artifacts and noise into the data. Recently, techniques based on large language models (LLMs) have been proposed.
However, using LLMs presents several challenges: (1) privacy: most LLMs are accessed via APIs, requiring the transmission of user data over the internet to the LLM server for processing; (2) cost: LLMs are billed per token (a token is approximately three-quarters of a word in English), and the cost of processing pathology reports from both historical data and current clinical records can accumulate to substantial amounts; (3) computational requirements: running LLMs on-premise necessitates significant investment in computational infrastructure.

Methods: We addressed these challenges by running compressed and quantized LLMs locally within Moffitt's firewall to process over 7,000 pathology reports from the TCGA project.

Data: The PDF pathology reports from twelve solid cancers (bladder, brain, cervix, colorectal, head and neck, kidney, lung, liver, ovarian, pancreas, prostate, and uterus) were downloaded and stored locally. We developed a software pipeline to load the PDF files, perform OCR, and then use an LLM to extract six discrete variables, along with an explanation for each output selection. The variables were cancer site, laterality, stage, grade, histology, and behavior. Our prompt strategy involved two calls to the LLM using the LangChain library: (1) we prompted the LLM to extract the six variables and provide an explanation for each extraction, and (2) we passed the results of the first call back to the LLM as input and used Pydantic to force it to output the variables in a JSON dictionary format. The pipeline's output was stored in JSON and CSV formats. We experimented with different LLMs, including Mistral, Llama-2, Llama-3, and Mixtral, and found that the Mixtral 8x7b model (quantized at Q4_0 with 46.7B parameters) provided the best balance between processing time and accuracy.
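The second call in the prompt strategy, which forces the explained extraction into a fixed JSON structure, amounts to validating the model's text against a six-field schema. The following is a minimal sketch of that validation step only. It uses the Python standard library in place of Pydantic and LangChain so that it runs without extra dependencies, and the class name, field names, function name, and sample string are all hypothetical; none of them are taken from the authors' pipeline.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExtractedVariables:
    """The six discrete variables named in the abstract (field names assumed)."""
    cancer_site: str
    laterality: str
    stage: str
    grade: str
    histology: str
    behavior: str


def parse_llm_json(raw: str) -> ExtractedVariables:
    """Validate a JSON string against the six-field schema.

    Raises ValueError if the text is not valid JSON, or if a field is
    missing, unexpected, or not a string.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(ExtractedVariables)}
    missing = expected - set(data)
    extra = set(data) - expected
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} unexpected={sorted(extra)}")
    for name, value in data.items():
        if not isinstance(value, str):
            raise ValueError(f"{name} must be a string")
    return ExtractedVariables(**data)


# Hypothetical second-call output, invented for illustration (not a real report).
sample = json.dumps({
    "cancer_site": "lung",
    "laterality": "right",
    "stage": "pT2a",
    "grade": "G2",
    "histology": "adenocarcinoma",
    "behavior": "malignant",
})
record = parse_llm_json(sample)
print(record.cancer_site, record.histology)
```

Rejecting malformed output at this point, rather than when the CSV is written, lets a pipeline of this kind retry or flag a report instead of storing a partial record.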
Our experiments were conducted on a desktop computer with an NVIDIA RTX A4500 (30GB VRAM) GPU and a data center compute node with an NVIDIA A30 (24GB VRAM) GPU. A pathology expert on our team analyzed the extracted variables. Given the large number of reports (6,944 in total across all cancers), the expert randomly selected between 10 and 30 reports from each cancer type and manually verified the correctness of the LLM-extracted variables. Reports where OCR failed to produce correct text were excluded from the experiment. Any variables not present in the original report or not applicable (as determined by the subject matter expert) were excluded from the analysis.

Preliminary Results: The LLM extracted all six variables with an average accuracy of 99.2% across all variables and cancer sites in 145 reports. The extraction accuracy for each variable was as follows: cancer site 100%, laterality 99.3%, stage 99.3%, grade 98.6%, histological entity 100%, and behavior 97.9%. Accuracy by cancer type was as follows: bladder 100% (n=22), brain 83.3% (n=12), cervix 90.9% (n=11), colorectal 92.8% (n=14), head and neck 95% (n=18), kidney 100% (n=9), lung 100% (n=22), liver 100% (n=11), prostate 100% (n=15). The PDF files, source code, LLM prompt, and configurations have been publicly shared via GitHub.
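The reported average can be checked against the per-variable figures. The sketch below assumes that the 99.2% figure is the unweighted mean of the six per-variable accuracies quoted in the abstract; the abstract does not state how the average was computed, and the variable and function names here are illustrative.

```python
# Per-variable accuracies (percent) as quoted in the abstract.
per_variable_accuracy = {
    "cancer site": 100.0,
    "laterality": 99.3,
    "stage": 99.3,
    "grade": 98.6,
    "histology": 100.0,
    "behavior": 97.9,
}

# Unweighted mean across the six variables (an assumption about the method).
mean_accuracy = sum(per_variable_accuracy.values()) / len(per_variable_accuracy)
print(f"{mean_accuracy:.1f}%")  # prints 99.2%


def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)
```

Under the same assumption, `accuracy_percent(144, 145)` gives 99.3, which matches the laterality and stage figures if all 145 reports counted toward them. Because variables that were absent or not applicable were excluded from the analysis, the true denominators may differ from 145.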

 

Learning Objectives

  1. Explain the current challenges and limitations in extracting discrete information from pathology reports for the cancer registry.      
  2. Describe the innovative approach using compressed, quantized, private, and local LLMs for information extraction.         
  3. Present the preliminary results and performance evaluation of the developed system.

Generative AI in Anatomical Pathology: Today’s Innovations and Tomorrow’s Possibilities
   Mon, Nov 4
   01:35PM - 01:55PM ET

Background: Generative Artificial Intelligence (AI) models such as Large Language Models, Vision Language Models, and Foundation Models have emerged as transformational tools across various disciplines. Anatomic pathologists are already interacting with generative AI chatbots during case sign-out to get assistance with differential diagnosis work-up, to obtain recommendations for special stains, immunohistochemistry, and molecular testing based on current guidelines, and to produce a draft of the pathology report. Generative AI is also starting to reshape the technical and administrative processes within anatomic pathology, and it holds immense potential for a host of new applications that could transform the discipline of anatomic pathology and cancer diagnosis.

Aims: This presentation examines the applications, advantages, and challenges of generative AI in anatomic pathology, emphasizing its influence on technical laboratory processes and pathologist sign-out procedures, and highlights the opportunities for workflow efficiency gains. It also notes the impact of generative AI applications on education and research within anatomic pathology.

Design: This presentation is based on collaborative work by a group of researchers and professionals from the pathology and AI fields who conducted a thorough literature review of recent developments in generative AI applications within anatomic pathology. These applications are categorized as unimodal or multimodal and assessed for their current clinical utility, ethical implications, and future potential.

Results: Generative AI exhibits substantial promise across several domains in anatomic pathology. AI-driven image analysis, virtual staining, and synthetic data generation significantly enhance diagnostic precision. Automation of routine tasks, quality control, and reflex testing demonstrates potential for considerable workflow improvements, leading to quicker turnaround times that support faster treatment and better patient outcomes. AI-generated educational materials, synthetic histology images, and advanced data analysis methods foster enhanced educational and research opportunities. Initial findings suggest that the anatomic pathology workforce is cautiously optimistic about the transformative potential of AI. Pathologists show interest in adopting AI tools for non-diagnostic tasks, and there is a growing spectrum of applications in academic settings. Dependable AI tools will need to go through rigorous testing and evaluation before and after each implementation to ensure quality.

Conclusions: Generative AI holds the potential to revolutionize anatomic pathology by enhancing diagnostic accuracy, improving workflow efficiency, and advancing education and research. However, its successful integration into clinical practice demands ongoing interdisciplinary collaboration, meticulous validation, and strict adherence to ethical standards to ensure that AI's benefits are fully realized while maintaining the highest levels of patient care. This talk will explore the transformative potential of generative AI in anatomic pathology, offering participants valuable insights into its current and future applications and addressing the necessary steps for its successful and ethical implementation in clinical practice.

Keywords: Generative AI, Anatomic Pathology, Diagnostic Accuracy, Workflow Efficiency, Education, Research, Ethical Considerations

 

Learning Objectives

  1. Understand generative AI and how current applications of this technology are reshaping anatomical pathology, with a focus on AI-driven image analysis and virtual staining.
  2. Discover how generative AI could be applied in the near future to improve diagnostic accuracy, streamline workflows, and boost research in anatomical pathology.
  3. Identify the ethical and practical challenges of using generative AI in anatomical pathology and explore methods to ensure reliable and unbiased AI applications.