Peer reviewed
ERIC Number: ED606760
Record Type: Non-Journal
Publication Date: 2020
Pages: 11
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
"Hello, [REDACTED]": Protecting Student Privacy in Analyses of Online Discussion Forums
Bosch, Nigel; Crues, R. Wes; Shaik, Najmuddin; Paquette, Luc
Grantee Submission, Paper presented at the International Conference on Educational Data Mining (EDM) (13th, 2020)
Online courses often include discussion forums, which provide a rich source of data to better understand and improve students' learning experiences. However, forum messages frequently contain private information that prevents researchers from analyzing these data. We present a method for discovering and redacting private information including names, nicknames, employers, hometowns, and contact information. The method utilizes set operations to restrict the list of words that might be private information, which are then confirmed as private or not private via manual annotation or machine learning. To test the method, two raters manually annotated a corpus of words from an online course's discussion forum. We then trained an ensemble machine learning model to automate the annotation task, achieving 95.4% recall and 0.979 AUC (area under the receiver operating characteristic curve) on a held-out dataset obtained from the same course offered 2 years later, and 97.0% recall and 0.956 AUC on a held-out dataset from a different online course. This work was motivated by research questions about students' interactions with online courses that proved unanswerable without access to anonymized forum data, which we discuss. Finally, we queried two online course instructors about their perspectives on this work, and provide their perspectives on additional potential applications. [This paper was published in: Anna N. Rafferty, Jacob Whitehill, Violetta Cavalli-Sforza, and Cristobal Romero (Eds.), "Proceedings of The 13th International Conference on Educational Data Mining (EDM 2020)" (pp. 39-49).]
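The set-operation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the idea of subtracting a "safe" vocabulary, and the names `candidate_words`, `redact`, and `safe_vocabulary` are all assumptions made for illustration. In the paper, the resulting candidates are then confirmed as private or not by manual annotation or a trained model; here that confirmed set is simply passed in.

```python
import re

# Hypothetical tokenizer: runs of letters and apostrophes (an assumption,
# not the tokenization used in the paper).
TOKEN = re.compile(r"[A-Za-z']+")


def candidate_words(posts, safe_vocabulary):
    """Return forum words that are NOT in a known-safe vocabulary.

    The set difference shrinks the list of words that a human rater
    or classifier would still need to label as private or not private.
    """
    forum_vocabulary = {w.lower() for post in posts for w in TOKEN.findall(post)}
    return forum_vocabulary - {w.lower() for w in safe_vocabulary}


def redact(post, private_words, placeholder="[REDACTED]"):
    """Replace every token confirmed as private with a placeholder."""
    def swap(match):
        word = match.group(0)
        return placeholder if word.lower() in private_words else word
    return TOKEN.sub(swap, post)


# Toy data, invented for this example only.
posts = ["Hello, Nigel! I work at Acme."]
safe_vocabulary = {"hello", "i", "work", "at"}

candidates = candidate_words(posts, safe_vocabulary)
redacted = redact(posts[0], private_words={"nigel"})
```

In this toy run, only the words left over after the set difference would need annotation, which is the point of restricting the candidate list before any manual or automated labeling.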
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: Institute of Education Sciences (ED)
Authoring Institution: N/A
IES Funded: Yes
Grant or Contract Numbers: R305A180211