ERIC Number: ED661382
Record Type: Non-Journal
Publication Date: 2022-Aug-17
Pages: 6
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Application of Neighborhood Components Analysis to Process and Survey Data to Predict Student Learning of Statistics
Grantee Submission, Paper presented at the International Conference on Advanced Learning Technologies (ICALT) (Hybrid, Jul 1-4, 2022)
Machine learning methods for predictive analytics have great potential for uncovering trends in educational data. However, simple linear models still appear to be the most widely used, in part because of their interpretability. This study aims to address the interpretability of complex machine learning classifiers by performing feature extraction with neighborhood components analysis (NCA). Our dataset comprises 287 features drawn from process data indicators (i.e., derived from log data of an online statistics learning platform) and self-report data from high school students enrolled in Advanced Placement (AP) Statistics (N=733). As the label for prediction, we use students' scores on the AP Statistics exam. We evaluate the performance of machine learning classifiers paired with a given feature extraction method using criteria including F1 score, the area under the receiver operating characteristic curve (AUC), and Cohen's kappa. We find that NCA effectively reduces the dimensionality of training datasets, stabilizes machine learning predictions, and produces interpretable scores. However, interpreting the NCA weights of features, while feasible, is less straightforward than interpreting linear regression coefficients. Future research should consider developing guidelines for interpreting NCA weights.
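To illustrate what an NCA feature weight expresses, the following is a minimal sketch of the NCA leave-one-out objective, restricted (as an assumption made here for readability) to a diagonal transformation so that each feature receives a single weight. It is not the authors' implementation: the function name `nca_loo_accuracy`, the toy data, and the weight vectors are all hypothetical, and the abstract does not state which NCA variant or software was used. In practice the weights would be found by maximizing this quantity rather than supplied by hand.

```python
import math


def nca_loo_accuracy(X, y, w):
    """Expected leave-one-out accuracy under NCA's stochastic neighbor rule.

    X: list of feature vectors, y: list of class labels,
    w: one weight per feature (hypothetical diagonal transformation).
    """
    n = len(X)
    total = 0.0
    for i in range(n):
        same_class = 0.0
        all_points = 0.0
        for j in range(n):
            if j == i:
                continue  # a point never selects itself as its neighbor
            # Squared Euclidean distance after scaling feature r by w[r].
            d = sum((wr * (a - b)) ** 2 for wr, a, b in zip(w, X[i], X[j]))
            k = math.exp(-d)  # closer points are more likely neighbors
            all_points += k
            if y[j] == y[i]:
                same_class += k
        # Probability that point i picks a neighbor with its own label.
        total += same_class / all_points
    return total / n


# Toy data (illustrative only): feature 0 separates the classes,
# feature 1 is unrelated to the label.
X_toy = [[0.0, 0.0], [0.0, 3.0], [3.0, 0.0], [3.0, 3.0]]
y_toy = [0, 0, 1, 1]

for weights in ([1.0, 0.0], [1.0, 1.0], [0.0, 1.0]):
    print(weights, round(nca_loo_accuracy(X_toy, y_toy, weights), 4))
```

In this toy setting, putting weight only on the informative feature yields a much higher objective than weighting both features equally or weighting only the uninformative one. A large learned weight therefore signals that a feature helps same-label examples end up close together, which is a different reading from a regression coefficient and is one reason interpretation takes more care.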
Descriptors: Prediction, Statistics Education, Data Analysis, Learning Analytics, Trend Analysis, Educational Trends, Learning Management Systems, High School Students, Advanced Placement, Scores, Mathematics Tests, Evaluation Criteria, Guidelines, Data Interpretation, Online Courses, Learning Problems
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: High Schools; Secondary Education
Audience: N/A
Language: English
Sponsor: Institute of Education Sciences (ED); National Science Foundation (NSF), Division of Research on Learning in Formal and Informal Settings (DRL)
Authoring Institution: N/A
IES Funded: Yes
Grant or Contract Numbers: R305A180269; 1350787