PV25 Schedule of Events
At birth, placentas contain a wealth of potentially life-saving information. Examining placentas for abnormalities such as decidual vasculopathy (DV) can yield critical information for risk stratification, prediction, and prevention of key adverse obstetric outcomes, such as preeclampsia, in future pregnancies. DV diagnosis is complex and requires highly specialized perinatal pathologists, who are scarce, so most placentas are discarded without microscopic inspection. Employing AI-assisted DV diagnosis with whole-slide imaging for triage can augment pathologists' efforts and greatly increase the number of accurately diagnosed patients. In previous work, our team developed a state-of-the-art hierarchical deep learning approach for this purpose. Although that approach was effective on a limited dataset, we now assess and improve its generalization to external datasets to ensure reliable performance. Accuracy can drop across data sources because of variations in tissue color, sampling technique, scanners, and image quality, and because of inconsistent ground truths based on ambiguous human diagnoses. We collected 454 cases (slides and metadata) from two independent sources. Reliable ground truth labels for DV were established through overlap and consensus between general and expert perinatal pathologists, including adjudication to resolve disagreements. Experts agreed at a significantly greater rate than typical pathologists, who matched only 56.5% of expert diagnoses, underscoring the need for greater consistency. We improved AI-based cross-dataset performance through data heterogeneity, color/format normalization, noise filtering, data augmentation, dataset balancing, and calibration on a validation set kept separate from the held-out test set to avoid overfitting. These methods increased patient-level accuracy to 140% of general pathologists' performance while matching expert diagnoses in an external dataset, indicating robust generalization.
Fine‑tuning on a subset of the new dataset further improved vessel detection and DV‑diagnosis accuracy toward our ultimate goal of greater-than-expert AI performance. The findings underscore the clinical need for effective AI‑assisted DV screening, the impact of dataset shifts, and the value of robust normalization, reliable ground truths, and limited fine‑tuning to generalize this technology to broader and more diverse contexts, ultimately enabling more placentas to be accurately screened to improve preeclampsia prevention and management.
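The calibration step mentioned above (tuning on a validation set that is kept separate from the held-out test set) can be illustrated with a minimal Python sketch. The abstract does not say what is calibrated, so this sketch assumes it is a single decision threshold on a patient-level DV score. The function names, scores, and labels are all hypothetical and are not the authors' method or data.

```python
from typing import Sequence


def accuracy(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Fraction of cases where (score >= threshold) agrees with the label."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def calibrate_threshold(val_scores: Sequence[float], val_labels: Sequence[int]) -> float:
    """Choose the threshold that maximizes accuracy on the validation set only.

    Candidate thresholds are the observed validation scores; ties go to the
    lowest candidate because max() returns the first maximal item.
    """
    candidates = sorted(set(val_scores))
    return max(candidates, key=lambda t: accuracy(val_scores, val_labels, t))


# Hypothetical patient-level DV scores (assumption: higher means more likely DV).
val_scores = [0.10, 0.35, 0.40, 0.62, 0.80, 0.91]
val_labels = [0, 0, 1, 0, 1, 1]

# The held-out test set is never used while choosing the threshold.
test_scores = [0.20, 0.45, 0.70, 0.95]
test_labels = [0, 0, 1, 1]

threshold = calibrate_threshold(val_scores, val_labels)
test_acc = accuracy(test_scores, test_labels, threshold)
print(f"threshold={threshold:.2f}, held-out accuracy={test_acc:.2f}")
```

Because the threshold is chosen without looking at the test cases, the held-out accuracy remains an unbiased estimate of how the calibrated model would perform on new data.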
Learning Objectives: