ERIC Number: ED608010
Record Type: Non-Journal
Publication Date: 2020-Jul
Pages: 7
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
More Data and Better Keywords Imply Better Educational Transcript Classification?
Danciulescu, Theodora Ioana; Mihaescu, Marian Cristian; Heras, Stella; Palanca, Javier; Julian, Vicente
International Educational Data Mining Society, Paper presented at the International Conference on Educational Data Mining (EDM) (13th, Online, Jul 10-13, 2020)
Building, and especially improving, a classification kernel is a challenging task. The work presented in this paper continues a previously developed semi-supervised classification approach for labelling transcripts of educational videos. We asked whether the size of the ground-truth data-set (Wikipedia articles) or the quality of the keywords used in the semi-supervised labelling has a significant impact on the accuracy metrics of the final data model. The experiments considered three Wikipedia data-sets of "Small," "Medium" and "Large" sizes. For each data-set, three sets of keywords were used: those offered by the video authors, those determined by "rake-nltk" on the available transcripts, and those determined by "rake-nltk" on the Wikipedia articles that serve as training and testing data for the LDA [latent Dirichlet allocation] model that determines keywords on the transcripts. The experiments show that the size of the data-set has little importance, while the quality of the keywords has a more significant impact. An improved version of the previously developed classifier has therefore been obtained by improving the quality of the keywords involved in semi-supervised training. This result paves the way towards further improvements that may eventually be deployed within a recommender system for educational videos at the Universitat Politècnica de València. [For the full proceedings, see ED607784.]
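Illustrative note (not part of the original record): the abstract names "rake-nltk" as the keyword extractor but does not describe its configuration. The sketch below is a simplified, standard-library-only rendering of the general RAKE idea (split text at stopwords and punctuation, score each word by degree/frequency, sum the word scores per phrase). It is not the rake-nltk API and not the authors' pipeline; the stopword list, the function names `candidate_phrases` and `rake_keywords`, and the sample text are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; a real run would use a
# full list such as the one shipped with NLTK).
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "with",
}


def candidate_phrases(text):
    """Return runs of consecutive non-stopword tokens, broken at punctuation."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_keywords(text, top_n=3):
    """Rank candidate phrases by the sum of degree/frequency word scores."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurring words in the phrase, the word included.
            degree[word] += len(phrase)
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(phrases)
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    sample = (
        "Latent Dirichlet allocation is a topic model. "
        "The topic model labels transcripts of educational videos."
    )
    for phrase, score in rake_keywords(sample):
        print(f"{score:5.1f}  {phrase}")
```

In a setting like the one the abstract describes, keyword lists of this kind would be produced once from the transcripts and once from the Wikipedia articles, then compared against the author-supplied keywords; how those lists feed the LDA-based labelling is not specified in this record.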
Descriptors: Data Analysis, Classification, Information Retrieval, Video Technology, Teaching Methods, Web Sites, Collaborative Writing, Accuracy, Comparative Analysis, Authors, Models, Foreign Countries, Universities, Information Systems
International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: http://www.educationaldatamining.org
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: Spain
Grant or Contract Numbers: N/A