Academic Foundation
ExplainGrade is built on two published research works and benchmarked against the Mohler ASAG Dataset.
🎓 Project Overview
ExplainGrade addresses the "length-bias noise" (longer answers getting artificially inflated scores) and "black-box scoring" (students receive no actionable feedback) present in most ASAG systems.
ExplainGrade anchors every grade to specific, measurable NLP comparisons and provides sentence-level attributions, so every point is traceable.
📄 Key References
- 01. Kulkarni et al. (2015). Statistically Significant Detection of Linguistic Change. Foundation for temporal semantic drift tracking and concept evolution detection.
- 02. Hamilton et al. (2016). Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change. Core methodology for temporal semantic drift measurement.
- 03. Bamler & Mandt (2017). Dynamic Word Embeddings. Framework for tracking semantic changes over time sequences.
- 04. Gama et al. (2014). A Survey on Concept Drift Adaptation. Classification of concept drift detection techniques, applied to student understanding tracking.
- 05. Ahmad Ayaan (2024). PMC12171532. Automated grading using NLP and semantic analysis.
- 06. Filighera et al. (2023). Our System for Short Answer Grading using Generative Models. BEA Workshop, ACL 2023. Sentence-level attribution framework.
- 07. Mohler et al. (2011). Learning to Grade Short Answer Questions Using Semantic Similarity Measures and Dependency Graph Alignments. ACL. Foundational ASAG dataset and methodology.
📈 Temporal Semantic Drift Analysis
Track how student understanding evolves over multiple submissions.
🔬 What It Measures
- 📊 Improvement Score (-1 to +1): Tracks whether student answers are getting better or worse across submissions.
- ⚡ Consistency Score (0 to 1): Measures stability of understanding; high means consistent understanding, low means volatile responses.
- 🎯 Learning Trend: Classifies the learning direction over time as Improving, Degrading, or Stable.
- 🌊 Volatility: Measures unpredictability; identifies erratic or inconsistent response patterns.
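As a rough sketch of how these four metrics could be computed, assume each submission has been reduced to an embedding vector and compared against a reference-answer embedding. The function and threshold values below are illustrative assumptions, not ExplainGrade's actual implementation:

```python
import numpy as np

def _cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def temporal_metrics(answer_vecs, reference_vec):
    """Illustrative computation of the four metrics above.

    answer_vecs: chronological list of answer embeddings (one per submission).
    reference_vec: embedding of the reference answer.
    """
    sims = np.array([_cosine(v, reference_vec) for v in answer_vecs])
    deltas = np.diff(sims)                 # change between consecutive submissions
    # Improvement Score in [-1, +1]: mean signed change toward the reference.
    improvement = float(np.clip(deltas.mean(), -1.0, 1.0)) if deltas.size else 0.0
    # Volatility: how erratic the changes are (std of the deltas).
    volatility = float(deltas.std()) if deltas.size else 0.0
    # Consistency Score in (0, 1]: high when similarity stays stable over time.
    consistency = float(1.0 / (1.0 + sims.std()))
    # Learning Trend: sign of the improvement score, with a small dead zone.
    if improvement > 0.05:
        trend = "Improving"
    elif improvement < -0.05:
        trend = "Degrading"
    else:
        trend = "Stable"
    return {"improvement": improvement, "consistency": consistency,
            "volatility": volatility, "trend": trend}
```

The dead-zone threshold keeps tiny fluctuations from flipping the trend label between Improving and Degrading.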
🎓 How It Works
When you submit multiple answer attempts in the demo:
- Submission 1 — Answer is graded and stored
- Submission 2+ — System automatically computes trajectory
- Analysis — Temporal metrics are calculated and visualized
- Feedback — You see improvement/consistency scores and learning trend
Try it: Go to the Live Demo, grade an answer, then modify it and grade again. The temporal analysis will appear automatically.
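The submit-grade-track loop above can be sketched as a small wrapper; `grade_fn` and `metrics_fn` are hypothetical stand-ins for the system's grader and temporal analysis, not ExplainGrade's actual API:

```python
class AnswerHistory:
    """Minimal sketch of the four-step loop: grade, store, analyze, report."""

    def __init__(self, grade_fn, metrics_fn):
        self.grade_fn = grade_fn        # grades a single answer (assumed)
        self.metrics_fn = metrics_fn    # computes trajectory metrics (assumed)
        self.submissions = []

    def submit(self, answer):
        grade = self.grade_fn(answer)
        self.submissions.append((answer, grade))   # Submission 1: grade and store
        if len(self.submissions) >= 2:             # Submission 2+: compute trajectory
            grades = [g for _, g in self.submissions]
            return grade, self.metrics_fn(grades)  # Analysis: metrics for feedback
        return grade, None                         # no trajectory on first attempt
```

Matching the demo's behavior, the trajectory result stays `None` until a second submission arrives.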
📚 Research Foundation
Temporal Semantic Drift Analysis is built directly on four peer-reviewed research papers (references 01-04):
- Kulkarni et al. (2015): Provides the statistical methodology for detecting significant linguistic changes across time periods; directly implemented for the improvement_score computation.
- Hamilton et al. (2016): Word-embedding drift measurement framework used to compute semantic shift magnitude and direction across submissions.
- Bamler & Mandt (2017): Dynamic word embeddings framework that validates the temporal similarity tracking and consistency-scoring methodology.
- Gama et al. (2014): Concept drift detection taxonomy for identifying sudden changes in understanding, critical for education applications.
💡 Why This Matters for Education
- 📉 Detect confusion early: A degrading trend signals that the student may be struggling rather than simply trying different approaches.
- ✅ Recognize improvement patterns: Chart actual learning trajectories—improving score + improving consistency = true learning.
- 🎯 Personalized feedback: System can suggest whether to clarify concepts vs. encourage refinement.
- 📊 Research-backed insights: Directly implements four peer-reviewed papers on temporal linguistics and concept drift.
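The "clarify vs. encourage" suggestion above can be illustrated as a simple decision rule combining the two scores; the function name and thresholds are assumptions for the sketch, not the project's actual configuration:

```python
def feedback_suggestion(improvement, consistency):
    """Illustrative rule: map (improvement, consistency) to a feedback action."""
    if improvement > 0.05 and consistency > 0.7:
        return "encourage refinement"   # improving and stable: true learning
    if improvement < -0.05:
        return "clarify concepts"       # degrading trend signals confusion
    return "monitor"                    # stable or mixed signals
```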