Chi Kang Pai

Frontier of pathology AI, Faisal Mahmood’s talk

August 5, 2025 · Research, Summary

Link: https://www.youtube.com/watch?v=tbJwdK48hJw&t=922s

Co-authored with Gemini 2.5 Pro

This talk by Faisal Mahmood from the Broad Institute offers a comprehensive overview of the application of advanced AI techniques in computational pathology. The presentation covers the historical context, recent innovations, and future directions of the field, with a strong focus on multimodal, generative, and agentic AI.

Here is a detailed, structured summary of the key points:

1. Introduction to Computational Pathology

Core Concept: The field focuses on analyzing large, digitized glass slide images using computational and machine learning tools [00:33]. These images are hierarchical and information-rich, comparable to satellite imagery.
Primary Goals: To improve early diagnosis, prognosis, prediction of treatment response, and patient stratification by analyzing these images, often in conjunction with other data modalities [02:03].
Historical Context: The talk acknowledges the foundational work of Judith Prewitt in the 1960s and 1970s, whose methods for patch-based image analysis are still relevant today [03:17]. This line of work eventually led to the first FDA-approved algorithm, AutoPap, in the 1990s [03:40].

2. Key Methodologies and Innovations

Weakly Supervised Learning: Due to the lack of pixel-level annotations in routine clinical data, weakly supervised methods like Multiple Instance Learning (MIL) are essential [04:41].
CLAM (Clustering-constrained Attention Multiple Instance Learning): To overcome the data inefficiency of traditional MIL, Mahmood's group developed CLAM. This method improves performance by using attention to weight the patches of a slide and by clustering morphologically similar regions within it [07:07].
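To make the attention idea concrete, here is a minimal sketch of attention-based MIL pooling in plain Python. It is a simplified illustration, not CLAM itself: it leaves out CLAM's gated attention and its instance-level clustering objective, and the weight vector `w` and the toy patch features are hypothetical values standing in for learned parameters and encoder outputs.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(patch_features, w):
    """Pool a bag of patch feature vectors into one slide-level vector.

    Each patch h_i gets a score tanh(w . h_i); the scores are turned into
    attention weights with a softmax, and the slide vector is the
    attention-weighted sum of the patch features.
    """
    scores = [math.tanh(sum(wj * hj for wj, hj in zip(w, h)))
              for h in patch_features]
    weights = softmax(scores)
    dim = len(patch_features[0])
    slide_vector = [sum(a * h[d] for a, h in zip(weights, patch_features))
                    for d in range(dim)]
    return slide_vector, weights

# Toy bag: three patches with 2-D features (hypothetical encoder outputs).
patches = [[0.1, 0.0], [0.0, 0.2], [2.0, 1.5]]
w = [1.0, 1.0]  # hypothetical stand-in for learned attention parameters

slide_vec, attn = attention_pool(patches, w)
print("attention weights:", attn)
print("slide-level vector:", slide_vec)
```

Only the slide-level label is needed to train such a model, and the attention weights double as a rough heatmap of which patches drove the prediction.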
Foundation Models: A significant portion of the talk is dedicated to the development of powerful foundation models for pathology:
- UNI & CONCH: Inspired by the idea that data diversity is more crucial than sheer quantity, these models were trained on a curated, diverse set of cases from Brigham and MGH archives [14:31]. UNI is a general-purpose model trained on 100 million image patches [16:07], while CONCH is a vision-language model that learns by contrasting images with text descriptions from medical literature [19:30].
- TITAN: This is a slide-level foundation model that learns from images, AI-generated morphological descriptions, and pathology reports, enabling powerful applications like rare disease classification and similar-slide search for diagnostics [22:42].
- My lab previously (before I joined) developed another powerful foundation model: CHIEF (Clinical Histopathology Imaging Evaluation Foundation).
  - It is a weakly supervised deep learning framework designed to predict clinical outcomes directly from gigapixel digital pathology slides.
  - By pre-training on about 60,000 whole-slide images, CHIEF learns to encode an entire whole-slide image into a single, effective feature vector. This approach allows it to perform a wide array of clinical tasks without needing detailed, pixel-level annotations. It has been validated on over 19,000 slides from 32 independent slide sets collected internationally, where it achieved state-of-the-art performance in predicting patient survival and molecular profiles. It also proved resilient to domain shifts, maintaining its accuracy across different hospitals, patient populations, and slide preparation techniques.
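Once a slide-level model reduces each slide to a single vector, similar-slide search can be pictured as nearest-neighbor retrieval over those vectors. The sketch below assumes cosine similarity as the ranking metric, which is my assumption and not something stated in the talk; the slide names and 3-D embeddings are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, archive, k=2):
    """Return the k archive slides whose embeddings are closest to the query."""
    scored = [(name, cosine(query, vec)) for name, vec in archive.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical archive of slide-level embeddings.
archive = {
    "slide_A": [0.9, 0.1, 0.0],
    "slide_B": [0.0, 1.0, 0.2],
    "slide_C": [0.8, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]  # hypothetical embedding of a new case

hits = most_similar(query, archive, k=2)
for name, score in hits:
    print(name, round(score, 3))
```

In a diagnostic setting, the retrieved slides would come with their reports, so a pathologist could compare a rare case against its closest archived neighbors.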
Multimodal Integration: The talk emphasizes the power of contrasting different data types to improve image representation. This includes:
- H&E and IHC/Special Stains (MADELEINE) [26:27]
- Histology and Transcriptomics (TANGLE, THREADS) [26:54, 28:00]
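The "contrasting" used for both the vision-language and the multimodal models can be illustrated with a generic symmetric contrastive (CLIP-style InfoNCE) loss. This is a sketch under assumptions, not the exact objective of any model named above: the temperature value and the toy embeddings are hypothetical, and the second modality could be text, another stain, or an expression profile.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    loss = 0.0
    for i, row in enumerate(logits):
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        loss += log_z - row[i]
    return loss / len(logits)

def contrastive_loss(a_embs, b_embs, temperature=0.07):
    """Symmetric InfoNCE loss between two batches of paired embeddings.

    Row i of a_embs and row i of b_embs are treated as a matching pair;
    every other combination in the batch acts as a negative.
    """
    a = [normalize(v) for v in a_embs]
    b = [normalize(v) for v in b_embs]
    logits = [[sum(x * y for x, y in zip(u, v)) / temperature for v in b]
              for u in a]
    logits_t = [list(col) for col in zip(*logits)]  # b-to-a direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))

# Toy batch: two image embeddings and their paired second-modality embeddings.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
paired_embs = [[0.9, 0.1], [0.1, 0.9]]
shuffled_embs = [paired_embs[1], paired_embs[0]]  # deliberately mismatched

aligned_loss = contrastive_loss(image_embs, paired_embs)
misaligned_loss = contrastive_loss(image_embs, shuffled_embs)
print("aligned:", aligned_loss, "misaligned:", misaligned_loss)
```

Minimizing this loss pulls matched pairs together and pushes mismatched pairs apart, which is how the second modality ends up shaping the image representation.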

3. Applications and Future Directions