Long-Chain LLM Reasoning for Diagnostic Knowledge Tracing and Image-Text Fusion
This project proposes a next-generation intelligent diagnostic system for gastrointestinal diseases, leveraging multimodal large language models (MLLMs) enhanced with retrieval-augmented reasoning and visual-text fusion. Current AI systems in medical imaging remain limited to perceptual-level recognition, lacking the capacity for evidence-based inference, knowledge traceability, and clinically aligned report generation. These limitations are particularly critical in early-stage tumor screening, where diagnostic accuracy and interpretability are paramount.
Building on a high-quality endoscopic knowledge base, we aim to develop a system with real-time access to authoritative medical guidelines and literature through vector-based retrieval. A novel chain-of-thought reasoning framework, structured around “evidence discovery, hypothesis generation, and deductive verification”, will be trained via reinforcement learning to emulate clinical cognition. Furthermore, we introduce a visually enhanced report generation mechanism that binds key image frames and lesion localization to structured diagnostic conclusions, enabling closed-loop multimodal reporting.
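As a rough illustration of how vector-based retrieval and image-bound reporting might fit together, the following minimal Python sketch retrieves the knowledge-base passages most similar to a diagnostic conclusion and attaches their identifiers to a finding tied to an image frame and lesion box. Everything in it is an assumption made for illustration: the bag-of-words `embed` function stands in for a real text encoder, the knowledge-base entries are placeholders rather than actual guideline text, and the names `Finding`, `retrieve`, and `build_finding` are hypothetical and not part of the proposed system.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text):
    """Toy bag-of-words vector (token -> count); a stand-in for a real encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages whose vectors are most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p["text"])), reverse=True)
    return ranked[:k]


@dataclass
class Finding:
    """One report entry: an image frame, a lesion box, and traceable evidence."""
    frame_id: str
    bbox: tuple  # (x, y, width, height) in pixels; hypothetical convention
    conclusion: str
    evidence_ids: list = field(default_factory=list)


# Placeholder entries only; a real knowledge base would hold guideline passages.
KNOWLEDGE_BASE = [
    {"id": "kb-1", "text": "placeholder passage about gastric polyp surveillance intervals"},
    {"id": "kb-2", "text": "placeholder passage about colonic polyp resection criteria"},
    {"id": "kb-3", "text": "placeholder passage about esophageal varices grading"},
]


def build_finding(frame_id, bbox, conclusion, k=2):
    """Bind a conclusion to its frame and to the ids of the retrieved evidence."""
    hits = retrieve(conclusion, KNOWLEDGE_BASE, k)
    return Finding(frame_id, bbox, conclusion, [h["id"] for h in hits])


finding = build_finding("frame_0042", (120, 80, 64, 48), "suspected gastric polyp")
print(finding)
```

Keeping the evidence identifiers on the same record as the frame and bounding box is one simple way to make each conclusion traceable to both its visual and textual sources; the reinforcement-learned reasoning chain itself is outside the scope of this sketch.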
The proposed research addresses the societal challenge of improving early cancer detection and reducing diagnostic disparities across healthcare systems. By embedding clinical logic and traceable evidence into AI workflows, the system is designed to enhance trust, transparency, and decision support for clinicians. Expected impacts include improved patient outcomes through earlier and more accurate diagnosis, reduced healthcare costs via optimized resource use, and increased accessibility to expert-level diagnostics in underserved regions. Environmentally, the system is expected to support more efficient clinical workflows, minimizing unnecessary procedures and resource waste. The project aims to advance AI-driven precision medicine, with broad implications for scalable, equitable, and explainable healthcare innovation.