Peer reviewed
ERIC Number: ED608066
Record Type: Non-Journal
Publication Date: 2020-Jul
Pages: 10
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Towards Accurate and Fair Prediction of College Success: Evaluating Different Sources of Student Data
Yu, Renzhe; Li, Qiujie; Fischer, Christian; Doroudi, Shayan; Xu, Di
International Educational Data Mining Society, Paper presented at the International Conference on Educational Data Mining (EDM) (13th, Online, Jul 10-13, 2020)
In higher education, predictive analytics can provide actionable insights to diverse stakeholders such as administrators, instructors, and students. Separate feature sets are typically used for different prediction tasks, e.g., student activity logs for predicting in-course performance and registrar data for predicting long-term college success. However, little is known about the overall utility of different data sources across prediction tasks and the fairness of their predictions with respect to different subpopulations. Using data from over 2,000 college students at a large public university, we examined the utility of institutional data, learning management system (LMS) data, and survey data for accurately and fairly predicting short-term and long-term student success. We found that institutional data and LMS data both have decent predictive power, but survey data shows very little predictive utility. Combining institutional data with LMS data leads to even higher accuracy than using either alone. In terms of fairness, using institutional data consistently underestimates historically disadvantaged student subpopulations more than their peers, whereas LMS data tend to overestimate some of these groups more often. Combining the two data sources does not fully neutralize the biases and still leads to high rates of underestimation among disadvantaged groups. Moreover, algorithmic biases affect not only demographic minorities but also students with acquired disadvantages. These analyses serve to inform more cost-effective and equitable use of student data for predictive analytics applications in higher education. [For the full proceedings, see ED607784.]
International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: http://www.educationaldatamining.org
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: National Science Foundation (NSF)
Authoring Institution: N/A
Grant or Contract Numbers: 1535300