Kern, Holger L.; Stuart, Elizabeth A.; Hill, Jennifer; Green, Donald P. – Journal of Research on Educational Effectiveness, 2016
Randomized experiments are considered the gold standard for causal inference because they can provide unbiased estimates of treatment effects for the experimental participants. However, researchers and policymakers are often interested in using a specific experiment to inform decisions about other target populations. In education research,…
Descriptors: Educational Research, Generalization, Sampling, Participant Characteristics
Tarricone, Pina; Newhouse, C. Paul – Educational Assessment, 2017
In this article we describe a three-year study that was conducted in three phases to evaluate the feasibility of assessing digitized portfolios of student creative work for high-stakes purposes. The first two phases suggested that creative work could be digitized with adequate fidelity, and that students could submit their own work from schools to…
Descriptors: Scoring, Reliability, Comparative Analysis, Portfolios (Background Materials)
Elicited Imitation as a Measure of Second Language Proficiency: A Narrative Review and Meta-Analysis
Yan, Xun; Maeda, Yukiko; Lv, Jing; Ginther, April – Language Testing, 2016
Elicited imitation (EI) has been widely used to examine second language (L2) proficiency and development and was an especially popular method in the 1970s and early 1980s. However, as the field embraced more communicative approaches to both instruction and assessment, the use of EI diminished, and the construct-related validity of EI scores as a…
Descriptors: Second Language Learning, Language Proficiency, Meta Analysis, Effect Size
Hansen, Ben B.; Fredrickson, Mark M. – Society for Research on Educational Effectiveness, 2014
The goal of this research is to make sensitivity analysis accessible not only to empirical researchers but also to the various stakeholders for whom educational evaluations are conducted. To do this it derives anchors for the omitted variable (OV)-program participation association intrinsically, using the Love plot to present a wide range of…
Descriptors: Research Methodology, Quasiexperimental Design, Evaluation Methods, Comparative Analysis
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
Zeng, Songtian – ProQuest LLC, 2017
Over 30 states have adopted the Early Childhood Environmental Rating Scale-Revised (ECERS-R) as a component of their program quality assessment systems, but the use of ECERS-R on such a large scale has raised important questions about implementation. One of the most pressing questions centers upon decisions users must make between two scoring…
Descriptors: Rating Scales, Scoring, Validity, Comparative Analysis
Bainter, Sierra A.; Curran, Patrick J. – Journal of Cognition and Development, 2015
Amid recent progress in cognitive development research, high-quality data resources are accumulating, and data sharing and secondary data analysis are becoming increasingly valuable tools. Integrative data analysis (IDA) is an exciting analytical framework that can enhance secondary data analysis in powerful ways. IDA pools item-level data across…
Descriptors: Data Analysis, Integrated Activities, Inferences, Statistical Analysis
Cornish, Disa Lubker; Losch, Mary E.; Avery, Mitchell – American Journal of Sexuality Education, 2016
Monitoring fidelity of implementation is a critical task when initiating evidence-based programs. This pilot study sought to identify best practices in a fidelity monitoring process and determine the feasibility of continuing a fidelity monitoring process with a multisite, multiprogram initiative. A fidelity log was created for each of 11…
Descriptors: Evidence Based Practice, Pilot Projects, Best Practices, Fidelity
Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012
A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…
Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring
Greene, Barbara A.; Lubin, Ian A.; Slater, Janis L.; Walden, Susan E. – Journal of Science Education and Technology, 2013
Two studies were conducted to examine content knowledge changes following 2 weeks of professional development that included scientific research with university scientists. Engaging teachers in scientific research is considered to be an effective way of encouraging knowledge of both inquiry pedagogy and content knowledge. We used concept maps with…
Descriptors: Scoring, Science Teachers, Concept Mapping, Replication (Evaluation)
List, Alexandra; Alexander, Patricia A.; Stephens, Lori A. – Discourse Processes: A multidisciplinary journal, 2017
Three indicators of undergraduate students' (n = 197) source evaluation were investigated as students completed an academic task requiring the use of multiple texts. The source evaluation metrics examined were students' (1) accessing of document information, (2) trustworthiness ratings, and (3) citation in written responses. All three indicators…
Descriptors: Undergraduate Students, Evaluation Methods, Information Sources, Credibility
Newhouse, C. Paul; Tarricone, Pina – Canadian Journal of Learning and Technology, 2014
High-stakes external assessment for practical courses is fraught with problems impacting on the manageability, validity and reliability of scoring. Alternative approaches to assessment using digital technologies have the potential to address these problems. This paper describes a study that investigated the use of these technologies to create and…
Descriptors: High Stakes Tests, Student Evaluation, Evaluation Methods, Scoring
Han, Chao; Riazi, Mehdi – Assessment & Evaluation in Higher Education, 2018
The accuracy of self-assessment has long been examined empirically in higher education research, producing a substantial body of literature that casts light on numerous potential moderators. However, despite the growing popularity of self-assessment in interpreter training and education, very limited evidence-based research has been initiated to…
Descriptors: Accuracy, Educational Research, Higher Education, Self Evaluation (Individuals)
Han, Turgay; Huang, Jinyan – PASAA: Journal of Language Teaching and Learning in Thailand, 2017
Using generalizability (G-) theory and rater interviews as both quantitative and qualitative approaches, this study examined the impact of scoring methods (i.e., holistic versus analytic scoring) on the scoring variability and reliability of an EFL institutional writing assessment at a Turkish university. Ten raters were invited to rate 36…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring
Kobayashi, Yuichiro; Abe, Mariko – Journal of Pan-Pacific Association of Applied Linguistics, 2016
The purpose of the present study is to assess second language (L2) spoken English using automated scoring techniques. Automated scoring aims to classify a large set of learners' oral performance data into a small number of discrete oral proficiency levels. In automated scoring, objectively measurable features such as the frequencies of lexical and…
Descriptors: Second Language Learning, Computer Assisted Testing, Scoring, Automation