KiUT: Knowledge-injected U-Transformer for Radiology Report Generation

CVPR 2023

  • Signals
    • Visual Knowledge signal
      • ResNet101 -> split into patches -> RRSA
      • RRSA (Region Relationship Self Attention): considers not only the similarity between patch features but also the relative position of the two patches
    • Clinical Knowledge Signal
      • symptom node embedding (from MBERT) × symptom probability (from classifier)
      • Graph Attention: based on symptom relationship graph
    • Contextual Knowledge Signal
      • Previous output -> MBERT -> Masked MHA
  • U connection
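    • Autoregressive decoder that takes into account both the previous output and the encoder visual signals
    • In cross attention, Q is the previous output, but K and V are no longer the output of the last encoder layer; for decoder layer i they are the output of the (N-i+1)-th encoder layer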
  • Injected Knowledge Distiller
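    • The output of the last decoder layer integrates the clinical knowledge signal and the contextual knowledge signal through attention; the results are concatenated along the sequence dimension and used to predict the next word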
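
A minimal sketch of the U connection pairing described above, in plain Python. It only illustrates the layer indexing (decoder layer i reads from encoder layer N-i+1) and is not the paper's implementation: the function names (`paired_encoder_layer`, `cross_attention`, `u_decode`) are hypothetical, the attention is single-head with no learned projections, and masked self-attention, feed-forward sublayers, and layer normalization are left out.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention without learned projections.

    queries: list of query vectors; keys/values: lists of vectors of equal length.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return out


def paired_encoder_layer(i, n_layers):
    """Map 1-indexed decoder layer i to encoder layer N - i + 1."""
    if not 1 <= i <= n_layers:
        raise ValueError("decoder layer index out of range")
    return n_layers - i + 1


def u_decode(prev_output, encoder_outputs):
    """Run N simplified decoder layers with U-shaped cross attention.

    prev_output: embeddings of the previously generated tokens (Q).
    encoder_outputs: per-layer encoder outputs, index 0 = first encoder layer.
    Assumption: each decoder layer is reduced to cross attention plus a residual.
    """
    n = len(encoder_outputs)
    h = prev_output
    for i in range(1, n + 1):
        # K and V come from encoder layer N - i + 1, not from the last layer only.
        memory = encoder_outputs[paired_encoder_layer(i, n) - 1]
        attended = cross_attention(h, memory, memory)
        h = [[a + b for a, b in zip(row, att)] for row, att in zip(h, attended)]
    return h


pairs = [(i, paired_encoder_layer(i, 3)) for i in range(1, 4)]
print(pairs)
```

With N = 3 the first decoder layer reads the deepest encoder layer and the last decoder layer reads the shallowest one, which is what gives the encoder-decoder stack its U shape.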