Abstract
<jats:p xml:lang="en">Machine learning has become an important tool for predicting student performance. This paper aims to create a dataset recording the participation of some students in synchronous and asynchronous course activities over 14 weeks in the Turkish Language, Atatürk’s Principles and History of Revolution, and English joint courses delivered via distance education at Istanbul Arel University, and to predict the students’ success by applying the fuzzy parameterized fuzzy soft k-nearest neighbor (FPFS-kNN) classifier to this dataset. First, anonymized participation data from a 14-week lecture period are collected. These data are then processed for use in machine learning. Two datasets are derived from each raw dataset: one with binary class labels (pass/fail) and one with multi-class labels (letter grades). Then, FPFS-kNN and well-known/state-of-the-art machine learning algorithms are applied to the datasets. The performance results are compared using the accuracy (Acc), precision (Pre), recall (Rec), macro F1-score (MacF1), and micro F1-score (MicF1) metrics. The results show that FPFS-kNN outperforms the other algorithms in binary pass/fail classification, achieving the highest accuracy on the ING1, ATA1, and TDE1 datasets while maintaining competitive F1-scores, notably on TDE1. On the letter-grade datasets, performance decreases overall; Boosted Tree reaches the best MicF1 on TDE2, yet FPFS-kNN still produces strong and stable results on TDE2 and ATA2. These findings indicate that FPFS-kNN is highly effective in binary classification and competitive in multi-class problems. Finally, a discussion of the performance results and of the use of machine learning in predicting student achievement is provided.</jats:p>
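<jats:p xml:lang="en">The difference between the macro and micro F1-scores named above can be illustrated with a minimal sketch. It assumes single-label classification and the common definition of MacF1 as the unweighted mean of per-class F1-scores; the paper's exact formulas may differ. The function name `f1_scores` and the letter-grade labels are hypothetical and are not taken from the study's data.</jats:p>

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (accuracy, macro_f1, micro_f1) for single-label classification.

    Assumption: macro F1 is the unweighted mean of per-class F1-scores,
    and micro F1 is computed from counts pooled over all classes.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # predicted class p, but it was wrong
            fn[t] += 1      # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, macro, micro


# Hypothetical letter grades, for illustration only.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
acc, macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(f"Acc={acc:.3f}  MacF1={macro_f1:.3f}  MicF1={micro_f1:.3f}")
```

<jats:p xml:lang="en">Under these assumptions, MicF1 coincides with accuracy whenever every sample receives exactly one label, whereas MacF1 weights each class equally and is therefore more sensitive to rarely occurring letter grades.</jats:p>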