Quantile-based e-Learning Student Engagement Classification

Aditya Galih Sulaksono; Syaad Patmanthara; Harits Ar Rosyid

doi:10.34190/ejel.24.3.4678

Authors

Aditya Galih Sulaksono State University of Malang, Indonesia/Universitas Merdeka Malang, Indonesia https://orcid.org/0000-0003-3748-4902
Syaad Patmanthara State University of Malang, Indonesia
Harits Ar Rosyid State University of Malang, Indonesia

DOI:

https://doi.org/10.34190/ejel.24.3.4678

Keywords:

Learning analytics, Student engagement, Quantile classification, Cross-dataset validation, Random forest, Educational data mining

Abstract

Classifying student engagement accurately is critical for timely academic intervention; however, most existing approaches rely on arbitrarily defined thresholds that lack statistical grounding and are difficult to transfer across institutional contexts. This limitation reduces the practical applicability of engagement analytics in diverse educational settings. This study evaluates a quantile-based engagement classification framework across two contrasting datasets to assess its validity, transferability, and consistency of predictive features. Unlike threshold-based approaches, the proposed framework derives engagement categories directly from dataset-specific interaction distributions. The Open University Learning Analytics Dataset (OULAD) represents large-scale fully online learning, while the Unistudium dataset reflects a smaller blended learning context. The two datasets differ substantially in size and delivery mode, with a student ratio of approximately 17.4 to 1. This contrast provides a rigorous basis for assessing method transferability. Engagement categories (passive, moderate, and active) are derived using dataset-specific quartile thresholds (Q1 and Q3). This strategy adapts automatically to local interaction distributions and avoids manual parameter tuning. Five temporal behavioural features were extracted, including active days, unique actions, and learning consistency. Random Forest was employed as the proposed model, while a Decision Tree classifier was included as a baseline for comparative evaluation. The results indicate that the proposed framework remains effective across different educational contexts. In the OULAD dataset, the model achieved an accuracy of 92.04% with a Cohen’s κ of 0.87. In the Unistudium dataset, accuracy reached 72.50% with a Cohen’s κ of 0.59. Although performance differed between datasets, variance remained low. 
Feature importance analysis further revealed strong consistency across contexts, with a Spearman correlation of 0.90. Active days and unique actions were the most influential predictors in both cases. The comparison also confirmed that Random Forest outperformed the Decision Tree baseline on both datasets. These findings support e-learning practice by offering institutions a statistically grounded and automated method for engagement classification. The approach removes the need for arbitrary thresholds and reduces operational overhead in analytics deployment. From a research perspective, the study establishes realistic performance benchmarks for engagement analytics at different institutional scales, demonstrates the applicability of quantile-based engagement classification across heterogeneous datasets, and confirms that key behavioural engagement indicators transfer reliably across online and blended learning environments.
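
The quartile-based labelling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the sample interaction counts, the inclusive quartile method, and the choice to assign values that fall exactly on Q1 or Q3 to the moderate class are all assumptions made for illustration; the abstract states only that the passive, moderate, and active categories are derived from dataset-specific Q1 and Q3 thresholds.

```python
from statistics import quantiles


def quartile_thresholds(values):
    """Return (Q1, Q3) computed from one dataset's own interaction counts."""
    # n=4 yields the three quartile cut points; the median is not needed here.
    # The "inclusive" method is an assumption; the abstract does not specify one.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def label_engagement(value, q1, q3):
    """Map one interaction count to an engagement category."""
    # Boundary handling is an assumption: values equal to Q1 or Q3
    # are treated as moderate.
    if value < q1:
        return "passive"
    if value > q3:
        return "active"
    return "moderate"


def classify(values):
    """Label every student using thresholds derived from the same data."""
    q1, q3 = quartile_thresholds(values)
    return [label_engagement(v, q1, q3) for v in values]


# Hypothetical per-student interaction totals (not taken from OULAD or Unistudium).
clicks = [3, 12, 25, 40, 41, 58, 77, 120, 300]
labels = classify(clicks)

for count, label in zip(clicks, labels):
    print(count, label)
```

Because the thresholds are recomputed from each dataset's own distribution, the same function can be applied to a large online cohort and a small blended cohort without any manually tuned cut-off values.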
