Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 13 |
Since 2006 (last 20 years) | 35 |
Descriptor
Test Theory | 37 |
Item Response Theory | 23 |
Test Items | 16 |
Foreign Countries | 14 |
Difficulty Level | 13 |
Elementary School Students | 12 |
Grade 8 | 12 |
Correlation | 10 |
Mathematics Tests | 10 |
Grade 4 | 9 |
Grade 6 | 8 |
More ▼ |
Source
Author
Tindal, Gerald | 4 |
Liu, Kimy | 3 |
Ketterlin-Geller, Leanne R. | 2 |
Lee, Young-Sun | 2 |
Abedalaziz, Nabeel | 1 |
Almehrizi, Rashid S. | 1 |
Alonzo, Julie | 1 |
Anderson, Daniel | 1 |
Andrich, David | 1 |
Assouline, Susan G. | 1 |
Atar, Hakan Yavuz | 1 |
More ▼ |
Publication Type
Education Level
Elementary Education | 37 |
Middle Schools | 18 |
Secondary Education | 15 |
Junior High Schools | 14 |
Grade 8 | 13 |
Grade 4 | 10 |
Grade 6 | 9 |
Intermediate Grades | 9 |
Grade 7 | 8 |
Grade 5 | 7 |
Grade 3 | 6 |
More ▼ |
Audience
Location
Florida | 3 |
Texas | 3 |
Turkey | 3 |
Australia | 2 |
Colorado | 2 |
New York | 2 |
Tennessee | 2 |
United States | 2 |
California | 1 |
Cyprus | 1 |
France | 1 |
More ▼ |
Laws, Policies, & Programs
Individuals with Disabilities… | 2 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Huebner, Alan; Skar, Gustaf B. – Practical Assessment, Research & Evaluation, 2021
Writing assessments often consist of students responding to multiple prompts, which are judged by more than one rater. To establish the reliability of these assessments, there exist different methods to disentangle variation due to prompts and raters, including classical test theory, Many Facet Rasch Measurement (MFRM), and Generalizability Theory…
Descriptors: Error of Measurement, Test Theory, Generalizability Theory, Item Response Theory
Kaya Uyanik, Gulden; Demirtas Tolaman, Tugba; Gur Erdogan, Duygu – International Journal of Assessment Tools in Education, 2021
This paper aims to examine and assess the questions included in the "Turkish Common Exam" for sixth graders held in the first semester of 2018 which is one of the common exams carried out by The Measurement and Evaluation Centers, in terms of question structure, quality and taxonomic value. To this end, the test questions were examined…
Descriptors: Foreign Countries, Grade 6, Standardized Tests, Test Items
LeBeau, Brandon; Assouline, Susan G.; Mahatmya, Duhita; Lupkowski-Shoplik, Ann – Gifted Child Quarterly, 2020
This study investigated the application of item response theory (IRT) to expand the range of ability estimates for gifted (hereinafter referred to as high-achieving) students' performance on an above-level test. Using a sample of fourth- to sixth-grade high-achieving students (N = 1,893), we conducted a study to compare estimates from two…
Descriptors: Item Response Theory, Test Theory, Academically Gifted, High Achievement
Different Analyses, Different Conclusions? Validity Evidence from the EGMA Spatial Reasoning Subtask
Perry, Lindsey – Global Education Review, 2018
As the global development community shifts its focus from improving access to education to improving learning and instruction, the need for instruments that accurately measure student achievement in mathematics and meet technical standards is increasing. This paper explores the importance of collecting high-quality validity evidence that aligns…
Descriptors: Mathematics Tests, Test Validity, Spatial Ability, Foreign Countries
Ayva Yörü, Fatma Gökçen; Atar, Hakan Yavuz – Journal of Pedagogical Research, 2019
The aim of this study is to examine whether the items in the mathematics subtest of the Centralized High School Entrance Placement Test [HSEPT] administered in 2012 by the Ministry of National Education in Turkey show DIF according to gender and type of school. For this purpose, SIBTEST, Breslow-Day, Lord's [chi-squared] and Raju's area…
Descriptors: Test Bias, Mathematics Tests, Test Items, Gender Differences
Longabach, Tanya; Peyton, Vicki – Language Testing, 2018
K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…
Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency
Güler, Nese; Ilhan, Mustafa; Güneyli, Ahmet; Demir, Süleyman – Educational Sciences: Theory and Practice, 2017
This study evaluates the psychometric properties of three different forms of the Writing Apprehension Test (WAT; Daly & Miller, 1975) through Rasch analysis. For this purpose, the fit statistics and correlation coefficients, and the reliability, separation ratio, and chi-square values for the facets of item and person calculated for the…
Descriptors: Writing Apprehension, Psychometrics, Item Response Theory, Tests
Schoen, Robert C.; Liu, Sicong; Yang, Xiaotong; Paek, Insu – Grantee Submission, 2017
The Early Fractions Test is a paper-pencil test designed to measure mathematics achievement of third- and fourth-grade students in the domain of fractions. The purpose, or intended use, of the Early Fractions Test is to serve as a student pretest covariate and a test of baseline equivalence in the larger study. In this report, we discuss our…
Descriptors: Mathematics Achievement, Fractions, Mathematics Tests, Grade 3
Çokluk, Ömay; Gül, Emrah; Dogan-Gül, Çilem – Educational Sciences: Theory and Practice, 2016
The study aims to examine whether differential item function is displayed in three different test forms that have item orders of random and sequential versions (easy-to-hard and hard-to-easy), based on Classical Test Theory (CTT) and Item Response Theory (IRT) methods and bearing item difficulty levels in mind. In the correlational research, the…
Descriptors: Test Bias, Test Items, Difficulty Level, Test Theory
Stugart, Melissa – ProQuest LLC, 2016
Our nation is in the midst of one of the largest education reforms in decades centered on the adoption of the Common Core State Standards (CCSS) and aligned assessments. In an era of rising accountability measures and declining literacy proficiency, it is vital to ensure that educational resources, such as benchmark assessments, are appropriately…
Descriptors: Common Core State Standards, Benchmarking, Educational Assessment, Test Items
Shanmugam, S. Kanageswari Suppiah; Wong, Vincent; Rajoo, Murugan – Malaysian Journal of Learning and Instruction, 2020
Purpose: This study examined the quality of English test items using psychometric and linguistic characteristics among Grade Six pupils. Method: Contrary to the conventional approach of relying only on statistics when investigating item quality, this study adopted a mixed-method approach by employing psychometric analysis and cognitive interviews.…
Descriptors: English (Second Language), Second Language Instruction, Language Tests, Psychometrics
Relkin, Emily; de Ruiter, Laura; Bers, Marina Umaschi – Journal of Science Education and Technology, 2020
There is a need for developmentally appropriate Computational Thinking (CT) assessments that can be implemented in early childhood classrooms. We developed a new instrument called "TechCheck" for assessing CT skills in young children that does not require prior knowledge of computer programming. "TechCheck" is based on…
Descriptors: Developmentally Appropriate Practices, Computation, Thinking Skills, Early Childhood Education
Murphy, Daniel L.; Beretvas, S. Natasha – Applied Measurement in Education, 2015
This study examines the use of cross-classified random effects models (CCrem) and cross-classified multiple membership random effects models (CCMMrem) to model rater bias and estimate teacher effectiveness. Effect estimates are compared using CTT versus item response theory (IRT) scaling methods and three models (i.e., conventional multilevel…
Descriptors: Teacher Effectiveness, Comparative Analysis, Hierarchical Linear Modeling, Test Theory
Abedalaziz, Nabeel; Leng, Chin Hai – Malaysian Online Journal of Educational Sciences, 2013
Most of the tests and inventories used by counseling psychologists have been developed using CTT; IRT derives from what is called latent trait theory. A number of important differences exist between CTT- versus IRT-based approaches to both test development and evaluation, as well as the process of scoring the response profiles of individual…
Descriptors: Test Theory, Item Response Theory, Difficulty Level, Models
Kingsley, Laurie; Romine, William – European Journal of Educational Research, 2014
Schools and teacher induction programs around the world routinely assess teaching best practice to inform accreditation, tenure/promotion, and professional development decisions. Routine assessment is also necessary to ensure that teachers entering the profession get the assistance they need to develop and succeed. We introduce the Item-Level…
Descriptors: Test Construction, Test Validity, Beginning Teacher Induction, Best Practices