Academic Foundation
ExplainGrade is built on two published research works and benchmarked against the Mohler ASAG Dataset.
🎓 Project Overview
ExplainGrade addresses the "length-bias noise" (longer answers getting artificially inflated scores) and "black-box scoring" (students receive no actionable feedback) present in most ASAG systems.
ExplainGrade anchors every grade to specific, measurable NLP comparisons and provides sentence-level attributions, so every point is traceable.
📄 Key References
- 01. Kulkarni et al. (2015). Statistically Significant Detection of Linguistic Change. Foundation for temporal semantic drift tracking and concept evolution detection.
- 02. Hamilton et al. (2016). Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change. Core methodology for temporal semantic drift measurement.
- 03. Bamler & Mandt (2017). Dynamic Word Embeddings. Framework for tracking semantic changes over time sequences.
- 04. Gama et al. (2014). A Survey on Concept Drift Adaptation. Classification of concept drift detection techniques, applied to student understanding tracking.
- 05. Ahmad Ayaan (2024). PMC12171532. Automated grading using NLP and semantic analysis.
- 06. Filighera et al. (2023). Our System for Short Answer Grading using Generative Models. BEA Workshop, ACL 2023. Sentence-level attribution framework.
- 07. Mohler et al. (2011). Learning to Grade Short Answer Questions Using Semantic Similarity Measures and Dependency Graph Alignments. ACL. Foundational ASAG dataset and methodology.
📈 Temporal Semantic Drift Analysis
Track how student understanding evolves over multiple submissions.
🔬 What It Measures
- 📊 Improvement Score (-1 to +1): Tracks whether student answers are getting better or worse across submissions.
- ⚡ Consistency Score (0 to 1): Measures stability of understanding; high means consistent understanding, low means volatile responses.
- 🎯 Learning Trend: Classifies the learning direction over time as Improving, Degrading, or Stable.
- 🌊 Volatility: Measures unpredictability; identifies erratic or inconsistent response patterns.
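As a rough sketch of how these four metrics could be computed, assume each submission has been reduced to an embedding vector and compared against a reference-answer embedding. The function and threshold values below are illustrative assumptions, not ExplainGrade's actual implementation:

```python
import numpy as np

def _cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def temporal_metrics(answer_vecs, reference_vec):
    """Illustrative computation of the four metrics above.

    answer_vecs: chronological list of answer embeddings (one per submission).
    reference_vec: embedding of the reference answer.
    """
    sims = np.array([_cosine(v, reference_vec) for v in answer_vecs])
    deltas = np.diff(sims)                 # change between consecutive submissions
    # Improvement Score in [-1, +1]: mean signed change toward the reference.
    improvement = float(np.clip(deltas.mean(), -1.0, 1.0)) if deltas.size else 0.0
    # Volatility: how erratic the changes are (std of the deltas).
    volatility = float(deltas.std()) if deltas.size else 0.0
    # Consistency Score in (0, 1]: high when similarity stays stable over time.
    consistency = float(1.0 / (1.0 + sims.std()))
    # Learning Trend: sign of the improvement score, with a small dead zone.
    if improvement > 0.05:
        trend = "Improving"
    elif improvement < -0.05:
        trend = "Degrading"
    else:
        trend = "Stable"
    return {"improvement": improvement, "consistency": consistency,
            "volatility": volatility, "trend": trend}
```

The dead-zone threshold keeps tiny fluctuations from flipping the trend label between Improving and Degrading.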
🎓 How It Works
When you submit multiple answer attempts in the demo:
- Submission 1 — Answer is graded and stored
- Submission 2+ — System automatically computes trajectory
- Analysis — Temporal metrics are calculated and visualized
- Feedback — You see improvement/consistency scores and learning trend
Try it: Go to the Live Demo, grade an answer, then modify it and grade again. The temporal analysis will appear automatically.
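The submit-grade-track loop above can be sketched as a small wrapper; `grade_fn` and `metrics_fn` are hypothetical stand-ins for the system's grader and temporal analysis, not ExplainGrade's actual API:

```python
class AnswerHistory:
    """Minimal sketch of the four-step loop: grade, store, analyze, report."""

    def __init__(self, grade_fn, metrics_fn):
        self.grade_fn = grade_fn        # grades a single answer (assumed)
        self.metrics_fn = metrics_fn    # computes trajectory metrics (assumed)
        self.submissions = []

    def submit(self, answer):
        grade = self.grade_fn(answer)
        self.submissions.append((answer, grade))   # Submission 1: grade and store
        if len(self.submissions) >= 2:             # Submission 2+: compute trajectory
            grades = [g for _, g in self.submissions]
            return grade, self.metrics_fn(grades)  # Analysis: metrics for feedback
        return grade, None                         # no trajectory on first attempt
```

Matching the demo's behavior, the trajectory result stays `None` until a second submission arrives.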
📚 Research Foundation
Temporal Semantic Drift Analysis is built directly on four peer-reviewed research papers (references 01-04):
- Kulkarni et al. (2015): Provides the statistical methodology for detecting significant linguistic changes across time periods; directly implemented for the improvement_score computation.
- Hamilton et al. (2016): Word-embedding drift measurement framework used to compute semantic shift magnitude and direction across submissions.
- Bamler & Mandt (2017): Dynamic word embeddings framework that validates the temporal similarity tracking and consistency-scoring methodology.
- Gama et al. (2014): Concept drift detection taxonomy for identifying sudden changes in understanding, critical for education applications.
💡 Why This Matters for Education
- 📉 Detect confusion early: A degrading trend signals that the student may be struggling rather than simply trying different approaches.
- ✅ Recognize improvement patterns: Chart actual learning trajectories—improving score + improving consistency = true learning.
- 🎯 Personalized feedback: System can suggest whether to clarify concepts vs. encourage refinement.
- 📊 Research-backed insights: Directly implements four peer-reviewed papers on temporal linguistics and concept drift.
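The "clarify vs. encourage" suggestion above can be illustrated as a simple decision rule combining the two scores; the function name and thresholds are assumptions for the sketch, not the project's actual configuration:

```python
def feedback_suggestion(improvement, consistency):
    """Illustrative rule: map (improvement, consistency) to a feedback action."""
    if improvement > 0.05 and consistency > 0.7:
        return "encourage refinement"   # improving and stable: true learning
    if improvement < -0.05:
        return "clarify concepts"       # degrading trend signals confusion
    return "monitor"                    # stable or mixed signals
```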