Machine Learning Techniques for Assessing Students' Environments' Impact Factors on Their Academic Performance

Mohammed Hussein Jabardi


Performance factors analysis has recently gained popularity as a method for assessing how students' environments affect their academic performance. However, most progress has been made in analyzing student behaviour during the learning process. Machine learning provides many powerful methods that could improve student performance prediction. Our aim is to examine all features of students' environmental life using the machine learning paradigm to assess how students' environments affect their grades. These features are divided into three categories (personality, family, and education), and their impact factors are calculated. To improve predictive accuracy, different models (Random Forest, AdaBoost, Decision Tree, Naive Bayes, and Multi-Layer Perceptron) are used to score the features in each group according to their contribution to the solution. Results show that personality features have a minor effect on students' academic performance, with an average impact of 53%. Concerning the educational factors, results indicate an average impact of 60%. Regarding family factors, results indicate that students' family life significantly affects academic achievement, with an average impact of 64%.
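The group-level scoring described above can be sketched in Python with scikit-learn. This is a minimal illustration, not the authors' exact pipeline: the dataset is synthetic, the feature groups are hypothetical stand-ins for the paper's personality/family/education categories, and permutation importance is assumed as the contribution measure (the Multi-Layer Perceptron is omitted for brevity).

```python
# Sketch: score features with several classifiers, then average each
# feature's permutation importance across models and aggregate per group.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.inspection import permutation_importance

# Synthetic stand-in for the student dataset: 6 features split into
# three hypothetical groups mirroring the paper's categories.
X, y = make_classification(n_samples=300, n_features=6,
                           n_informative=4, random_state=0)
groups = {"personality": [0, 1], "family": [2, 3], "education": [4, 5]}

models = [RandomForestClassifier(random_state=0),
          AdaBoostClassifier(random_state=0),
          DecisionTreeClassifier(random_state=0),
          GaussianNB()]

# Average each feature's permutation importance over all models.
scores = np.zeros(X.shape[1])
for model in models:
    model.fit(X, y)
    result = permutation_importance(model, X, y,
                                    n_repeats=5, random_state=0)
    scores += result.importances_mean
scores /= len(models)

# Group-level impact factor = mean importance of the group's features.
group_impact = {name: scores[idx].mean() for name, idx in groups.items()}
print(group_impact)
```

With real data, the group means could then be normalized or reported as percentages, as in the impact figures quoted in the abstract.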


Machine Learning; Performance; academic performance; students' environment; assessing; educational factors








Copyright (c) 2022 International Journal of Advanced Research in Computer Science