
Hybrid transformer with multi-level fusion for multimodal knowledge graph completion

SIGIR 2022

A single unified framework handles three knowledge graph (KG) tasks:

  • Multimodal Link Prediction
  • Multimodal Relation Extraction (MRE)
  • Multimodal Named Entity Recognition (MNER): extracting named entities from text sequences and their corresponding images

Framework

[Figure: framework overview]

RE: predict from the [CLS] token.

NER: predict a label for each token's representation.

Link prediction: predict the [MASK] token.
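The three prediction targets differ only in which encoder positions are fed to the output layer. Below is a minimal sketch of that selection in plain Python, not the paper's implementation: the `predict` function, the task names, and the bias-free linear heads with random weights are all hypothetical and only illustrate the idea.

```python
import random

def linear(vec, weight):
    """Bias-free linear layer: one score per row of `weight` (hypothetical head)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def argmax(scores):
    """Index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)

def predict(task, tokens, hidden, weight):
    """Classify the positions that the given task reads from.

    tokens: list of token strings, hidden: one vector per token,
    weight: num_labels x dim matrix of the (assumed) output head.
    """
    if task == "re":                      # relation extraction: [CLS] only
        positions = [tokens.index("[CLS]")]
    elif task == "ner":                   # NER: every token
        positions = list(range(len(tokens)))
    elif task == "link_prediction":       # link prediction: [MASK] only
        positions = [tokens.index("[MASK]")]
    else:
        raise ValueError(f"unknown task: {task}")
    return [argmax(linear(hidden[i], weight)) for i in positions]

def random_head(rng, num_labels, dim):
    """Random weights standing in for a trained head."""
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_labels)]

rng = random.Random(0)
tokens = ["[CLS]", "Paris", "capital_of", "[MASK]", "[SEP]"]
# Random vectors standing in for the encoder output.
hidden = [[rng.gauss(0, 1) for _ in range(8)] for _ in tokens]

re_pred = predict("re", tokens, hidden, random_head(rng, 4, 8))
ner_pred = predict("ner", tokens, hidden, random_head(rng, 3, 8))
lp_pred = predict("link_prediction", tokens, hidden, random_head(rng, 10, 8))
```

RE and link prediction each return a single label index, while NER returns one label index per token.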

Modules

[Figure: module structure]

Vision: hybrid attention is applied directly.

Text: self-attention first, then cross-attention.
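The difference between the two branches can be sketched with plain scaled dot-product attention. This is a rough illustration under strong assumptions, not the paper's module: "hybrid attention" is read here as visual queries attending over the concatenated text and visual tokens in one step, and learned projections, multiple heads, residual connections, and layer normalization are all omitted. The names `visual_block` and `text_block` are hypothetical.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(dim)])
    return out

def visual_block(visual, text):
    """Assumed hybrid attention: visual queries over text + visual tokens."""
    mixed = text + visual  # concatenate along the sequence axis
    return attention(visual, mixed, mixed)

def text_block(text, visual):
    """Self-attention over text, then cross-attention onto visual tokens."""
    attended = attention(text, text, text)
    return attention(attended, visual, visual)

rng = random.Random(0)
text = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]    # 4 text tokens
visual = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(3)]  # 3 image patches

new_visual = visual_block(visual, text)
new_text = text_block(text, visual)
```

Both branches keep their own sequence length: the visual branch mixes modalities in a single attention call, while the text branch needs two calls in sequence.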

Results

[Figure: results]