Abstract:
Machine learning methods for predictive analytics have great potential for uncovering trends in educational data. However, simple linear models still appear to be the most widely used, in part because of their interpretability. This study addresses the interpretability of complex machine learning classifiers by performing feature extraction with neighborhood components analysis (NCA). Our dataset comprises 287 features drawn from process data indicators (i.e., derived from log data of an online statistics learning platform) and self-report data from high school students enrolled in Advanced Placement (AP) Statistics (N=733). As the prediction label, we use students’ scores on the AP Statistics exam. We evaluated the performance of machine learning classifiers for a given feature extraction method using criteria including F1 scores, the area under the receiver operating characteristic curve (AUC), and Cohen’s kappa. We find that NCA effectively reduces the dimensionality of training datasets, stabilizes machine learning predictions, and produces interpretable scores. However, interpreting the NCA feature weights, while feasible, is less straightforward than interpreting linear regression coefficients. Future research should consider developing guidelines for interpreting NCA weights.
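For readers unfamiliar with the three evaluation criteria named in the abstract, the following is a minimal sketch of how F1, AUC, and Cohen’s kappa can be computed for a binary label. It is not the authors' code: the binary framing, the function names, and the toy labels and scores are all assumptions made purely for illustration, and none of the numbers come from the study.

```python
from collections import Counter

def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def cohens_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: product of marginal class frequencies, summed over classes.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present on both sides
    return (observed - expected) / (1 - expected)

def auc(y_true, scores, positive=1):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = pass, 0 = fail (not from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]

toy_f1 = f1_score(y_true, y_pred)
toy_kappa = cohens_kappa(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(toy_f1, toy_kappa, toy_auc)
```

F1 and kappa are computed from hard predictions, whereas AUC needs a continuous score per student, so the three criteria can disagree about which classifier is best.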
Date of Conference: 01-04 July 2022
Date Added to IEEE Xplore: 17 August 2022
- IEEE Keywords
- Index Terms
- Data Processing
- Machine Learning
- Student Learning
- Neighborhood Component Analysis
- Linear Model
- Cohen’s Kappa
- Self-reported Data
- F1 Score
- Machine Learning Classifiers
- Feature Extraction Methods
- Learning Platform
- Advanced Placement
- Prediction Model
- Academic Performance
- Support Vector Machine
- Random Forest
- Linear Discriminant Analysis
- Academic Year
- Multilayer Perceptron
- Online Learning
- Model Interpretation
- Learning Management System
- Item Response Theory
- Student Model
- Metric Learning Methods
- Metric Learning
- Score Assignment
- Statistics Course
- AUC Score
- Kinds Of Features