ERIC Number: ED615542
Record Type: Non-Journal
Publication Date: 2021
Pages: 13
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
Investigating the Validity of Methods Used to Adjust for Multiple Comparisons in Educational Data Mining
Matayoshi, Jeffrey; Karumbaiah, Shamya
International Educational Data Mining Society, Paper presented at the International Conference on Educational Data Mining (EDM) (14th, Online, Jun 29-Jul 2, 2021)
Research studies in Educational Data Mining (EDM) often involve several variables related to student learning activities. As such, it may be necessary to run multiple statistical tests simultaneously, which leads to the problem of multiple comparisons. The Benjamini-Hochberg (BH) procedure is commonly used in EDM research to address this issue, and it has proven to be a useful method. However, the main limitation of the procedure is that it requires the statistical tests to either be independent or satisfy certain dependency conditions. The Benjamini-Yekutieli (BY) procedure is an alternative that can be applied under arbitrary dependence, but this extra flexibility comes with a loss of statistical power; hence, the BH procedure is preferred whenever it can be properly applied. Based on these considerations, in this work we employ simulation studies to assess the validity of the BH procedure in two scenarios common to EDM research. The first scenario considers the evaluation and comparison of different classification models; such an analysis might occur, for instance, during the model tuning and validation stage of a study. In the second scenario, we look at experiments involving the study of state transitions in sequential data, examples of which occur in affect dynamics research. We find that the BH procedure performs as expected when used with simulated classification model predictions; however, when applied to simulated sequential data, it does not perform at the expected level. Based on these results, as well as previous studies evaluating the BH and BY methods, we discuss the appropriate usage of these procedures for the scenarios under examination. [For the full proceedings, see ED615472.]
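Illustrative note: the difference in power between the BH and BY procedures mentioned in the abstract can be seen in a minimal Python sketch of the standard step-up rule. This is not the authors' simulation code. The function name step_up_rejections, its parameters, and the p-values are hypothetical and chosen only for illustration. BH rejects the hypotheses with the k smallest p-values, where k is the largest rank satisfying p_(k) <= k * alpha / m; BY applies the same rule after dividing alpha by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m.

```python
def step_up_rejections(p_values, alpha=0.05, arbitrary_dependence=False):
    """Return a list of booleans marking which hypotheses are rejected.

    arbitrary_dependence=False applies the Benjamini-Hochberg rule;
    arbitrary_dependence=True applies the Benjamini-Yekutieli rule.
    (Hypothetical helper written for illustration only.)
    """
    m = len(p_values)
    if m == 0:
        return []

    # BY shrinks alpha by the harmonic sum c(m); BH leaves it unchanged.
    correction = 1.0
    if arbitrary_dependence:
        correction = sum(1.0 / i for i in range(1, m + 1))

    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the LARGEST rank k with p_(k) <= k * alpha / (m * c).
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * correction):
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    reject = [False] * m
    for idx in order[:cutoff_rank]:
        reject[idx] = True
    return reject


# Made-up p-values, not taken from the paper.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
bh = step_up_rejections(example_p, alpha=0.05)
by = step_up_rejections(example_p, alpha=0.05, arbitrary_dependence=True)
print("BH rejections:", sum(bh))
print("BY rejections:", sum(by))
```

On these made-up values, BH rejects two hypotheses and BY rejects one, which reflects the loss of power that the abstract attributes to BY. The sketch says nothing about whether the dependence conditions required by BH hold for a given data set; that is the question the paper examines through simulation.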
Descriptors: Statistical Analysis, Validity, Classification, Hypothesis Testing, Methods, Data Analysis, Educational Research
International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: https://educationaldatamining.org/conferences/
Publication Type: Speeches/Meeting Papers; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A