Automated Estimation of Difficulty Levels in Math Word Problems using Linguistic, Mathematical, and Semantic Features

2026 · 0 citations · Journal Article · Gold Open Access

Authors

Sakshi Kadam · Birla Institute of Technology and Science - Hyderabad Campus

Jabez Christopher · Birla Institute of Technology and Science - Hyderabad Campus

PTV Praveen Kumar · Birla Institute of Technology and Science - Hyderabad Campus

Dipak Kumar Satpathi · Birla Institute of Technology and Science - Hyderabad Campus

Abstract

Math Word Problems (MWPs) remain challenging for learners due to linguistic complexity, mathematical reasoning demands, and contextual variability. Accurately estimating item difficulty is essential for adaptive learning and automated assessment, yet many existing approaches rely on expert annotation or Item Response Theory (IRT), which are resource-intensive and difficult to scale to new items. This paper proposes IDEA, an integrated data-driven framework that extracts linguistic, mathematical, and semantic embedding features to predict MWP difficulty on a five-level scale. Using 4,244 algebra problems from the MATH dataset, we evaluate multiple feature sets and models, showing that embedding-based representations outperform handcrafted features on a held-out test set (Macro-F1 = 0.40 vs. 0.29). Since difficulty levels are ordinal, we additionally report ordinal-aware evaluation: an ordinal regression model achieves MAE = 1.08, quadratic weighted kappa = 0.37, and within-one-level accuracy of 0.71, indicating that most predictions are close even when exact matching is difficult. Model-interpretability analysis using SHAP highlights readability and sentence-structure features as dominant contributors to predicted difficulty; exploratory SEM analysis is included to examine relationships among feature groups but is interpreted cautiously due to limited global fit. Finally, external validation using seven expert ratings and IRT estimates from 61 students suggests variability in human judgment while supporting the practical utility of automated calibration. Overall, IDEA provides a scalable approach to item calibration, helps mitigate cold-start challenges in adaptive learning settings, and can contribute to fine-tuning large language models.
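
The three ordinal-aware metrics named in the abstract (MAE, quadratic weighted kappa, and within-one-level accuracy) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the toy labels below are hypothetical and chosen only for illustration, and a five-level scale coded 1 to 5 is assumed.

```python
# Illustrative sketch only: hypothetical helper names and toy labels,
# not the evaluation code or data used in the paper.

def mean_absolute_error(y_true, y_pred):
    """Average distance, in difficulty levels, between prediction and truth."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def within_one_accuracy(y_true, y_pred):
    """Share of predictions that land at most one level from the truth."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, levels=(1, 2, 3, 4, 5)):
    """Chance-corrected agreement; disagreements are penalised by squared distance.

    Undefined (division by zero) when all labels and predictions fall in a
    single level, because the expected disagreement is then zero.
    """
    n = len(y_true)
    k = len(levels)
    index = {level: i for i, level in enumerate(levels)}

    # Observed confusion counts: rows are true levels, columns are predictions.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if true and predicted levels were independent.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Toy labels (hypothetical), difficulty levels 1..5.
    toy_true = [1, 2, 3, 4, 5, 3]
    toy_pred = [1, 3, 3, 2, 5, 4]
    print("MAE:", mean_absolute_error(toy_true, toy_pred))
    print("Within-one accuracy:", within_one_accuracy(toy_true, toy_pred))
    print("QWK:", quadratic_weighted_kappa(toy_true, toy_pred))
```

Unlike exact-match metrics such as Macro-F1, all three of these give partial credit to a prediction that misses by one level, which is why they suit an ordinal difficulty scale.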

Topics & Keywords

Intelligent Tutoring Systems and Adaptive Learning · Psychometric Methodologies and Testing · Text Readability and Simplification

UN Sustainable Development Goals

Quality Education

Publication Details

Published in: International Journal of Mathematical Engineering and Management Sciences

Volume 11, Issue 2, pp. 679-679

DOI: 10.33889/ijmems.2026.11.2.028

Field-Weighted Citation Impact: 0.00