ERIC Number: ED657408
Record Type: Non-Journal
Publication Date: 2024
Pages: 146
Abstractor: As Provided
ISBN: 979-8-3827-7752-8
ISSN: N/A
EISSN: N/A
Essays on Statistics and Data Science Education
Emma Mary Klugman
ProQuest LLC, Ph.D. Dissertation, Harvard University
Statistics & data science are growing, rapidly evolving, and increasingly important for an informed citizenry in a data-saturated world. In this dissertation, I address two central questions: (1) who is taking statistics? and (2) what are statistics courses teaching? I estimate that 920,000 US students take statistics in high school each year, but this population has not yet been well studied. Using a rich set of survey responses describing 15,727 students' demographics, career interests and values, STEM identity, grades, and test scores, my first study compares four groups of high-school course-takers: those who take statistics, calculus, both, and neither. I then employ latent profile analysis to shed light on who these students are, showing that students with different profiles take statistics at surprisingly similar rates: statistics is an important part of the academic pathway for a wide range of students and serves a demographically diverse population. In my second study, I build upon tools from natural language processing and psychometric measurement to develop a human-in-the-loop methodology for measuring latent constructs in large text corpora, and present a framework for doing so. I construct a lexicon-based instrument to measure the extent to which syllabi from college statistics and data science courses align with a vision for modernizing instruction set forth in the Guidelines for Assessment and Instruction in Statistics Education (GAISE) project and across 145 journal articles spanning almost a century. In so doing, I illustrate an approach that researchers can take in bringing measurement questions to text data, a method that I believe strikes a useful balance between interpretability, communicability, validity, and scalability. My final study applies these instruments to 32,483 syllabi from US statistics and data science courses taught between 2010 and 2018. I find a modest overall increase in modern approaches over this decade.
Finally, I explore differences between institution types using multilevel models, finding that private and four-year institutions, as well as those with higher admissions rates and Pell-recipient populations, have more modern syllabi, though two-year institutions and schools serving fewer Pell recipients seem to be gaining ground. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://bibliotheek.ehb.be:2222/en-US/products/dissertations/individuals.shtml.]
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://bibliotheek.ehb.be:2222/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: High Schools; Secondary Education; Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A