Shah, Aditya; Devmane, Ajay; Ranka, Mehul; Churi, Prathamesh – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models
Johnson, Matthew S.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2020
One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three…
Descriptors: Reliability, Probability, Skill Development, Classification
Jafri, Mairaj – Waikato Journal of Education, 2022
This paper reports how I addressed the issue of extensive missing values in my PhD study, "Digital Competencies of High School Mathematics Teachers". I collected data using an online survey. Several methods exist to address the issue of missing values. I utilised multiple imputation (MI) as it provides more accurate results. The mean…
Descriptors: Data Collection, Research Problems, Doctoral Dissertations, Online Surveys
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2019
This note discusses the merits of coefficient alpha and their conditions in light of recent critical publications that miss out on significant research findings over the past several decades. That earlier research has demonstrated the empirical relevance and utility of coefficient alpha under certain empirical circumstances. The article highlights…
Descriptors: Test Validity, Test Reliability, Test Items, Correlation
Rigney, Alexander M. – Journal of Psychoeducational Assessment, 2019
The "Detroit Tests of Learning Aptitude" has been in use for more than three quarters of a century (Baker & Leland, 1935). Its longevity in the field speaks to its popularity as a broad measure of cognitive abilities. Its most recent iteration, in the form of the "Detroit Tests of Learning Abilities--Fifth Edition" (DTLA-5;…
Descriptors: Aptitude Tests, Cognitive Ability, Test Construction, Test Items
Ravand, Hamdollah; Baghaei, Purya – International Journal of Testing, 2020
More than three decades after their introduction, diagnostic classification models (DCM) do not seem to have been implemented in educational systems for the purposes they were devised. Most DCM research is either methodological for model development and refinement or retrofitting to existing nondiagnostic tests and, in the latter case, basically…
Descriptors: Classification, Models, Diagnostic Tests, Test Construction
Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019
This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…
Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring
Gotch, Chad M.; French, Brian F. – Educational Assessment, 2020
The State of Washington requires school districts to file court petitions on students with excessive unexcused absences. The "Washington Assessment of Risks and Needs of Students" (WARNS), a self-report screening instrument developed for use by high school and juvenile court personnel in such situations, purports to measure six facets of…
Descriptors: Risk Assessment, Needs Assessment, Truancy, Measurement Techniques
Babcock, Sarah E.; Wilson, Claire A.; Lau, Chloe – Canadian Journal of School Psychology, 2018
This article describes and reviews The School Motivation and Learning Strategies Inventory (SMALSI™; Stroud & Reynolds, 2006), published by Western Psychological Services, a self-report inventory designed to assess academic motivation, as well as learning and study strategies. The test identifies 10 primary constructs, referred to broadly as…
Descriptors: Motivation, Measures (Individuals), Test Anxiety, Test Wiseness
Adedokun, Omolola A. – Journal of Extension, 2018
This article provides an illustrative description of the pre-post difference index (PPDI), a simple, nontechnical yet robust tool for examining the instructional sensitivity of assessment items. Extension educators often design pretest-posttest instruments to assess the impact of their curricula on participants' knowledge and understanding of the…
Descriptors: Extension Education, Extension Agents, Pretests Posttests, Curriculum Evaluation
Kane, Michael T. – Assessment in Education: Principles, Policy & Practice, 2017
In response to an argument by Baird, Andrich, Hopfenbeck and Stobart (2017), Michael Kane states that there needs to be a better fit between educational assessment and learning theory. In line with this goal, Kane will examine how psychometric constraints might be loosened by relaxing some psychometric "rules" in some assessment…
Descriptors: Educational Assessment, Psychometrics, Standards, Test Reliability
Quaid, Ethan Douglas – Language Testing in Asia, 2018
This paper reviews the International English Language Testing System's speaking sub-test in the East Asia region with reference to theoretical and practice-based perspectives and identifies future research opportunities to enhance the measures of test qualities found. The test's construct validity was seen to accurately measure the abilities…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Speech Tests
Nebraska Department of Education, 2021
This technical report documents the processes and procedures implemented to support the Spring 2021 Nebraska Student-Centered Assessment System (NSCAS) Phase I Pilot in English Language Arts (ELA), Mathematics, and Science assessments by NWEA® under the supervision of the Nebraska Department of Education (NDE). The technical report shows how the…
Descriptors: Psychometrics, Standard Setting, English, Language Arts
Giraldo, Frank – HOW, 2019
The purpose of this article of reflection is to raise awareness of how poor design of language assessments may have detrimental effects, if crucial qualities and technicalities of test design are not met. The article first discusses these central qualities for useful language assessments. Then, guidelines for creating listening assessments, as an…
Descriptors: Test Construction, Consciousness Raising, Language Tests, Second Language Learning