SciMKG: A Multimodal Knowledge Graph for Science Education with Text, Image, Video and Audio

Jan 15, 2024 · Tong Lu

🚀 Introduction

SciMKG is a large-scale multimodal educational knowledge graph (MEKG) covering text, images, videos, and audio for K-12 science education. It is automatically constructed using a novel LLM-powered pipeline for concept extraction and multimodal alignment.

  • Four modalities covered: text, image, video, audio
  • 1,356 knowledge points
  • 34,630 multimodal concepts
  • 403,400 triples
  • 10,527 images · 10,425 videos · 34,630 audio clips

🔥 Framework

[Figure: SciMKG framework]

SciMKG is built with an Extraction–Verification–Integration–Augmentation (EVIA) pipeline, followed by a multimodal alignment step:
  • Extraction: Use multiple LLMs to extract K–12 science concepts from MOOC subtitles.
  • Verification: Apply self-feedback (SELF-REFINE) to prune ambiguous or irrelevant concepts.
  • Integration: Use self-consistency voting to merge the outputs of the different LLMs.
  • Augmentation: Expand concepts through ConceptNet and Wikipedia; generate rewritten text and audio.
  • Multimodal Alignment: Align images, videos, and audio to concepts using multimodal LLMs (e.g., GPT-4o, Gemini).

The pipeline is designed to keep the graph robust, precise, and semantically consistent across modalities.
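
To make the Integration step more concrete, here is a minimal sketch of self-consistency voting over concept lists. It is an illustration under stated assumptions, not the project's code: the function names `normalize` and `vote_concepts`, the `min_votes` threshold, and the sample extractor outputs are all hypothetical.

```python
from collections import Counter


def normalize(concept: str) -> str:
    """Lowercase and collapse whitespace so surface variants vote together."""
    return " ".join(concept.lower().split())


def vote_concepts(runs: list[list[str]], min_votes: int) -> list[str]:
    """Keep concepts proposed by at least `min_votes` independent extraction runs."""
    counts = Counter()
    for run in runs:
        # Deduplicate within a run so each run casts at most one vote per concept.
        counts.update({normalize(c) for c in run})
    return sorted(c for c, n in counts.items() if n >= min_votes)


# Hypothetical outputs from three extractor runs over the same subtitle segment.
runs = [
    ["Photosynthesis", "chloroplast", "sunlight"],
    ["photosynthesis", "Chloroplast", "glucose"],
    ["photosynthesis ", "chloroplast", "teacher"],
]
kept = vote_concepts(runs, min_votes=2)
print(kept)
```

Concepts that only one run proposes are dropped, which is the intuition behind using agreement across LLMs as a filter. A real system would likely need stronger normalization (synonyms, plurals) than the whitespace and case folding assumed here.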

📦 Installation & Usage

Installation

pip install scimkg

Usage

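import scimkg

# Point the builder at the source material; the string holds placeholder video and PDF paths.
kg = scimkg("video_path,pdf_path")
# Build the triples for the given subject.
triples = kg.build("subject")
# Serialize the triples as RDF.
rdf = triples.rdf()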
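
For readers unfamiliar with RDF, the sketch below shows one way (head, relation, tail) triples can be written out as N-Triples using only the Python standard library. It does not use or describe the `scimkg` package: the namespace `http://example.org/scimkg/`, the helper names `iri` and `to_ntriples`, and the sample triples and relation names are all assumptions made for illustration.

```python
BASE = "http://example.org/scimkg/"  # hypothetical namespace, not from the project


def iri(name: str) -> str:
    """Turn a label into an IRI under BASE, replacing spaces with underscores."""
    return f"<{BASE}{name.strip().replace(' ', '_')}>"


def to_ntriples(triples: list[tuple[str, str, str]]) -> str:
    """Serialize (head, relation, tail) triples as N-Triples, one statement per line."""
    return "\n".join(f"{iri(h)} {iri(r)} {iri(t)} ." for h, r, t in triples)


# Hypothetical triples; the relation names are illustrative only.
triples = [
    ("photosynthesis", "related_to", "chloroplast"),
    ("photosynthesis", "has_image", "img_0001"),
]
nt = to_ntriples(triples)
print(nt)
```

This sketch only handles labels made of plain ASCII letters, digits, and spaces; labels containing other characters would need proper IRI escaping before they could be used this way.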

📊 Dataset Statistics

Discipline   Knowledge Points   Concepts   Exercises   Triples
Biology      526                16,839     255         191,928
Physics      521                11,015     288         145,666
Chemistry    309                6,776      220         65,806

Modality   Items    Concept Coverage
Image      10,527   39%
Video      10,425   80%
Audio      34,630   100%
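
As a quick sanity check, the per-discipline figures can be summed and compared with the headline totals in the introduction (1,356 knowledge points, 34,630 concepts, 403,400 triples). The short script below does that and also derives the average number of triples per concept; the dictionary layout and variable names are just one convenient way to hold the numbers.

```python
# Per-discipline counts copied from the statistics table above.
stats = {
    "Biology":   {"knowledge_points": 526, "concepts": 16_839, "triples": 191_928},
    "Physics":   {"knowledge_points": 521, "concepts": 11_015, "triples": 145_666},
    "Chemistry": {"knowledge_points": 309, "concepts": 6_776,  "triples": 65_806},
}

# Sum each column across the three disciplines.
totals = {
    key: sum(row[key] for row in stats.values())
    for key in ("knowledge_points", "concepts", "triples")
}
print(totals)

# Average number of triples per concept in each discipline, rounded to one decimal.
density = {
    name: round(row["triples"] / row["concepts"], 1)
    for name, row in stats.items()
}
print(density)
```

The column sums match the headline totals, and the derived density works out to roughly 10 to 13 triples per concept depending on the discipline.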

🧠 Applications

SciMKG enables:

  • Multimodal educational question answering
  • Multimodal question generation
  • Cross-modal knowledge retrieval
  • Intelligent tutoring systems
  • Science education agents
  • Curriculum-level analytics

📄 Citation

If you use SciMKG or our construction framework, please cite:

@article{SciMKG2026,
  title={SciMKG: A Multimodal Knowledge Graph for Science Education with Text, Image, Video and Audio},
  author={Tong Lu and Zhichun Wang and Yaoyu Zhou and Yiming Guan and Zhiyong Bai and Junsheng Du},
  year={2026},
  journal={AAAI}
}
Authors
Tong Lu, PhD candidate
I am a second-year Ph.D. candidate in the School of Artificial Intelligence at Beijing Normal University, Beijing, China. I hold a Bachelor of Science degree from Hebei GEO University and a Master of Engineering degree from Yunnan University. My current research is in natural language processing.