A Position-Aware Multi-Head Self-Attention Model for Student Performance Prediction

Abstract

Student performance prediction is a central problem in educational data mining and learning analytics: the goal is to build generalizable, interpretable models from students' historical learning-process data to support personalized instruction and early academic warning. However, educational data are often high-dimensional and strongly temporal, with complex feature interactions, making it difficult for conventional regression approaches to jointly capture temporal regularities and nonlinear dependencies. To address this, we propose PAM-MLP, a student performance prediction model that integrates Position-Aware Attention (PAA) and Multi-head Self-Attention (MSA). The PAA module incorporates learnable positional encodings to capture stage-wise and periodic patterns in learning trajectories, and uses gated scaled dot-product attention to dynamically adjust the importance of different time steps. The MSA module models feature dependencies from multiple perspectives, enhanced by adaptive head weighting and a non-uniform attention distribution strategy to better characterize heterogeneous learning behaviors. On top of these attention-based representations, a multi-layer perceptron captures higher-order nonlinear interactions and improves regression fitting. Experimental results show that PAM-MLP consistently outperforms competitive regression baselines, reducing MAE and RMSE by 9% and 11%, respectively, and improving R² by 10%, demonstrating its effectiveness and robustness for student performance prediction in educational settings.
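The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of how the described components could fit together: a PAA block (learnable positional embeddings plus a gated scaled dot-product attention), a multi-head self-attention block with a learnable per-head weight (one possible reading of "adaptive head weighting"), and an MLP regression head. All class names, the gating and head-weighting formulations, the pooling step, and the dimensions are assumptions for illustration, not the authors' implementation; the non-uniform attention distribution strategy is omitted.

```python
# Illustrative sketch only; the preprint's actual PAM-MLP design may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionAwareAttention(nn.Module):
    """PAA: learnable positional encodings plus gated scaled dot-product attention."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        self.pos_emb = nn.Parameter(torch.zeros(max_len, d_model))  # learnable positions
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(d_model, 1)  # per-time-step gate (assumed form)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model), one row per time step of the learning trajectory
        h = x + self.pos_emb[: x.size(1)]
        scores = self.q(h) @ self.k(h).transpose(-2, -1) / h.size(-1) ** 0.5
        ctx = F.softmax(scores, dim=-1) @ self.v(h)
        g = torch.sigmoid(self.gate(h))    # in (0, 1): how strongly each step is attended
        return g * ctx + (1.0 - g) * h     # gated mix of attended and raw features


class WeightedSelfAttention(nn.Module):
    """MSA with a learnable weight per head (one reading of 'adaptive head weighting')."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        self.head_weights = nn.Parameter(torch.ones(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).reshape(b, t, 3, self.n_heads, self.d_head).unbind(dim=2)
        q, k, v = (z.transpose(1, 2) for z in (q, k, v))  # (b, heads, t, d_head)
        attn = F.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        ctx = attn @ v                                     # (b, heads, t, d_head)
        w = F.softmax(self.head_weights, dim=0).view(1, -1, 1, 1)  # rescale each head
        return self.out((w * ctx).transpose(1, 2).reshape(b, t, d))


class PAMMLP(nn.Module):
    """PAA -> weighted multi-head self-attention -> mean pooling -> MLP regressor."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.paa = PositionAwareAttention(d_model)
        self.msa = WeightedSelfAttention(d_model, n_heads)
        self.mlp = nn.Sequential(nn.Linear(d_model, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.msa(self.paa(x))
        return self.mlp(h.mean(dim=1)).squeeze(-1)  # one predicted score per student


# Usage: 8 students, 20 time steps, 64 features per step -> 8 predicted scores.
model = PAMMLP()
print(model(torch.randn(8, 20, 64)).shape)  # torch.Size([8])
```

The softmax over head weights makes the heads compete for influence, which is one simple way to realize adaptive head weighting; a per-head sigmoid or an input-conditioned gate would be equally plausible readings of the abstract.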
