Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation

Arxiv KG 2024

[2408.04187] Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation (arxiv.org)

Contributions

  • First medical graph RAG (MedRAG)
  • Graph Construction & Retrieval (evidence-based & private data)
  • SOTA on QA

Method

image-20240825154233762

| Layer | Inner Relationship | Source |
| --- | --- | --- |
| 0 | Chunk-Chunk | |
| 1 | Entity-Entity | MIMIC-IV |
| 2 | Entity-Entity | MedC-K |
| 3 | Entity-Entity | UMLS |

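In essence, the method builds a graph whose basic unit is the Report Chunk Meta Graph. Hierarchical links are then used to generate a general-language description for each Meta Graph and to attach tags to it.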

Both the hierarchical links and the links between top-layer entities are computed by an LLM.
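
As a rough illustration of this layered linking, the Python sketch below links entities in one layer to entities in the next layer up and then derives tags for a meta-graph from those links. The names `Entity`, `MetaGraph`, `link_score`, `link_layers`, and `tag_meta_graph` are hypothetical, as are the example entities and the 0.5 threshold. The token-overlap score is only a stand-in for the LLM judgment mentioned above, so this is a sketch under those assumptions rather than the authors' implementation.

```python
from dataclasses import dataclass, field

# Sources per layer, as listed in the layer table (layer 0 links chunks to chunks).
LAYER_SOURCES = {1: "MIMIC-IV", 2: "MedC-K", 3: "UMLS"}


@dataclass
class Entity:
    name: str
    layer: int
    links_up: list = field(default_factory=list)  # names of linked entities one layer up


@dataclass
class MetaGraph:
    chunk_id: str
    entities: list
    tags: list = field(default_factory=list)


def link_score(a: str, b: str) -> float:
    """ASSUMPTION: Jaccard token overlap stands in for the LLM-computed relation."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def link_layers(lower, upper, threshold=0.5):
    """Link each lower-layer entity to every upper-layer entity scoring above the threshold."""
    for e in lower:
        e.links_up = [u.name for u in upper if link_score(e.name, u.name) >= threshold]


def tag_meta_graph(graph):
    """Use the names of linked upper-layer entities as tags for the meta-graph."""
    graph.tags = sorted({name for e in graph.entities for name in e.links_up})


# Hypothetical example data.
report_entities = [Entity("beta blocker", layer=1), Entity("chest pain", layer=1)]
reference_entities = [Entity("beta blocker therapy", layer=2), Entity("insulin", layer=2)]

link_layers(report_entities, reference_entities)
graph = MetaGraph(chunk_id="report-chunk-0", entities=report_entities)
tag_meta_graph(graph)
print(graph.tags)
```

In a real pipeline, `link_score` would be replaced by a call to an LLM that decides whether two entities are related. The rest of the structure only shows how such decisions could be stored and rolled up into tags.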

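At retrieval time, the question's query is used to retrieve the relevant Meta Graph, and the corresponding entities are then retrieved from within it.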
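
The two-step retrieval can be sketched as follows: first pick the meta-graph that best matches the query, then rank the entities inside it. The function names `tokens` and `retrieve`, the dictionary layout, and the example data are all hypothetical. Simple token overlap is used here as an assumed scoring function; the paper's actual matching procedure may differ.

```python
import re


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, meta_graphs: list, top_k_entities: int = 2):
    """Return the id of the best-matching meta-graph and its top-ranked entities.

    ASSUMPTION: relevance is plain token overlap, a stand-in for the real scoring.
    """
    q = tokens(query)
    # Step 1: choose the meta-graph whose tags overlap most with the query.
    best = max(meta_graphs, key=lambda g: len(q & tokens(" ".join(g["tags"]))))
    # Step 2: rank the entities of that meta-graph by overlap with the query.
    ranked = sorted(best["entities"], key=lambda e: len(q & tokens(e)), reverse=True)
    return best["id"], ranked[:top_k_entities]


# Hypothetical meta-graphs with tags and entities.
graphs = [
    {"id": "report-1", "tags": ["diabetes", "insulin"],
     "entities": ["insulin therapy", "type 2 diabetes", "metformin"]},
    {"id": "report-2", "tags": ["hypertension", "cardiology"],
     "entities": ["beta blocker", "blood pressure"]},
]

graph_id, entities = retrieve("What insulin dose is used in diabetes?", graphs)
print(graph_id, entities)
```

Searching tags first keeps the entity-level search confined to one meta-graph, which is the main point of the coarse-to-fine design described above.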

Experiments

| QA Dataset | Source | Answer Format | Distribution |
| --- | --- | --- | --- |
| PubMedQA | PubMed abstracts | yes, no, or maybe | PQA-L: 1,000 manually labeled pairs, used for testing; PQA-U: 61.2k unlabeled pairs, not used; PQA-A: 211.3k artificially generated pairs |
| MedMCQA | Indian medical school entrance tests | 4-choice | Training set with 182,822 questions; testing set with 4,183 questions |
| USMLE | United States Medical Licensing Exams | 4-choice | Multilingual; only the English portion is considered, which includes 10,178 + 1,273 + 1,273 examples |

image-20240825171322226

image-20240825171407685