Carlson, James E.; Jirele, Tom – 1992
Some results are presented relating to the dimensionality of the 1990 National Assessment of Educational Progress (NAEP) mathematics item-response data. Based on theoretical considerations, practical limitations, and previous research, two procedures were selected for study: full information factor analysis as implemented in the TESTFACT computer…
Descriptors: Comparative Testing, Computer Software Evaluation, Factor Analysis, Grade 4
Adema, Jos J. – 1990
Methods are proposed for the construction of weakly parallel tests, that is, tests with the same test information function. A mathematical programming model for constructing tests with a prespecified test information function and a heuristic for assigning items to tests such that their information functions are equal play an important role in the…
Descriptors: College Entrance Examinations, Computer Assisted Testing, Equations (Mathematics), Foreign Countries
Yang, Wen-Ling; Chen, Wen-Hung – 2000
This paper provides a historical review of the changes and improvements made in estimating numerical cutpoints for the National Assessment of Educational Progress (NAEP). While reviewing the various methodologies used for collecting judgment data, the paper discusses: (1) the incorporation of Item Response Theory for setting standards; (2)…
Descriptors: Academic Achievement, Academic Standards, Constructed Response, Cutting Scores
Liu, Jinghua; Feigenbaum, Miriam; Cook, Linda – College Entrance Examination Board, 2004
This study explored possible configurations of the new SAT® critical reading section without analogy items. The item pool contained items from SAT verbal (SAT-V) sections of 14 previously administered SAT tests, calibrated using the three-parameter logistic IRT model. Multiple versions of several prototypes that do not contain analogy items were…
Descriptors: College Entrance Examinations, Critical Reading, Logical Thinking, Difficulty Level
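For readers unfamiliar with the three-parameter logistic (3PL) model mentioned in the abstract above, the following is a minimal sketch of its item response function in its standard textbook form. It is not taken from the study itself. The function name `p_correct_3pl` and the optional scaling constant `D` are illustrative assumptions; some calibrations set `D = 1.7` to approximate the normal ogive, and the abstract does not say which convention was used.

```python
import math


def p_correct_3pl(theta, a, b, c, D=1.0):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination
    b: item difficulty
    c: lower asymptote (pseudo-guessing parameter)
    D: scaling constant (assumption: 1.0 here; 1.7 is also common)
    """
    # The logistic part rises from 0 to 1; c lifts the floor so that even
    # very low-ability examinees answer correctly with probability c.
    logistic = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return c + (1.0 - c) * logistic


# At theta == b the logistic part is 0.5, so the probability is c + (1 - c) / 2.
print(p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2))
```

When ability equals the item difficulty, the probability sits halfway between the guessing floor `c` and 1, which is why the 3PL difficulty parameter is not the 50 percent point unless `c` is zero.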

Dorans, Neil J.; And Others – Journal of Educational Measurement, 1992
The standardization approach to comprehensive differential item functioning is described and contrasted with the log-linear approach to differential distractor functioning and the item-response-theory-based approach to differential alternative functioning. Data from an edition of the Scholastic Aptitude Test illustrate application of the approach…
Descriptors: Black Students, College Entrance Examinations, Comparative Testing, Distractors (Tests)

Samejima, Fumiko – Applied Psychological Measurement, 1994
The Level-11 vocabulary subtest of the Iowa Tests of Basic Skills was analyzed using a two-stage latent trait approach and a data set of 2,356 examinees, approximately 11 years of age. It is concluded that the nonparametric approach leads to efficient estimation of the latent trait. (SLD)
Descriptors: Achievement Tests, Distractors (Tests), Elementary Education, Elementary School Students

Bejar, Isaac I.; Yocom, Peter – Applied Psychological Measurement, 1991
An approach to test modeling is illustrated that encompasses both response consistency and response difficulty. This generative approach makes validation an ongoing process. An analysis of hidden figure items with 60 high school students supports the feasibility of the method. (SLD)
Descriptors: Construct Validity, Difficulty Level, Evaluation Methods, High School Students

Skaggs, Gary; Lissitz, Robert W. – Journal of Educational Measurement, 1992
The consistency of several item bias detection methods was studied across different test administrations of the same items, using data from a mathematics test given to a total of approximately 6,600 eighth-grade students. The Mantel-Haenszel and item-response-theory-based sum-of-squares methods were the most consistent. (SLD)
Descriptors: Comparative Testing, Grade 8, Item Bias, Item Response Theory
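As background for the Mantel-Haenszel method named in the abstract above, here is a minimal sketch of the common odds ratio that underlies that family of item bias statistics. It illustrates the general textbook estimator only, not the procedure or data used in the study. The function name `mantel_haenszel_odds_ratio` and the tuple layout of each stratum are hypothetical choices made for this example.

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across matched score strata.

    strata: iterable of (ref_right, ref_wrong, foc_right, foc_wrong) counts,
    one tuple per total-score level (this layout is an assumption of the
    sketch). A value near 1.0 suggests the item behaves similarly for the
    reference and focal groups once examinees are matched on total score.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in strata:
        n = ref_right + ref_wrong + foc_right + foc_wrong
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


# One stratum: reference group 30 right / 10 wrong, focal group 20 right / 20 wrong.
print(mantel_haenszel_odds_ratio([(30, 10, 20, 20)]))
```

Values above 1.0 indicate that, within matched score levels, the reference group has higher odds of answering the item correctly than the focal group.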

Camilli, Gregory – Applied Psychological Measurement, 1992
A mathematical model is proposed to describe how group differences in distributions of abilities, which are distinct from the target ability, influence the probability of a correct item response. In the multidimensional approach, differential item functioning is considered a function of the educational histories of the examinees. (SLD)
Descriptors: Ability, Comparative Analysis, Equations (Mathematics), Factor Analysis

Gumpel, Thomas; Wilson, Mark; Shalev, Ruth – Journal of Learning Disabilities, 1998
A study involving 453 pairs of Israeli parents and teachers used an item response theory (IRT) measurement model to examine the 28-item Conners Teacher's Rating Scale (CTRS) for diagnosing attention deficit hyperactivity disorder. IRT analyses revealed structural inadequacies involving the inappropriateness of the 4-point Likert-type scale used by…
Descriptors: Attention Deficit Disorders, Behavior Rating Scales, Clinical Diagnosis, Disability Identification
Pechenizkiy, Mykola; Calders, Toon; Conati, Cristina; Ventura, Sebastian; Romero, Cristobal; Stamper, John – International Working Group on Educational Data Mining, 2011
The 4th International Conference on Educational Data Mining (EDM 2011) brings together researchers from computer science, education, psychology, psychometrics, and statistics to analyze large datasets to answer educational research questions. The conference, held in Eindhoven, The Netherlands, July 6-9, 2011, follows the three previous editions…
Descriptors: Academic Achievement, Logical Thinking, Profiles, Tutoring
Bennett, Randy Elliot; And Others – 1991
This exploratory study applied two new cognitively sensitive measurement models to constructed-response quantitative data. The models, intended to produce qualitative characteristics of examinee performance, were fitted to algebra word problem solutions produced by 285 examinees taking the Graduate Record Examinations (GRE) General Test. The two…
Descriptors: Algebra, College Entrance Examinations, College Students, Constructed Response
Lukhele, Robert; And Others – 1993
Analyses based on fitting item response models to data from the College Board's Advanced Placement exams in Chemistry and United States History indicated that the constructed-response portion of the tests yielded little information over and above that provided by the multiple-choice sections. These tests also allow examinees to select subsets of…
Descriptors: Achievement Tests, Advanced Placement, Chemistry, Constructed Response

Hennings, Sara S.; And Others – 1996
Three methods of equating performance assessments are compared. Equipercentile equating, item response theory (IRT) partial credit model, and IRT graded response model methods were applied to the same data using an anchor test design. Comparisons are based primarily on the resulting raw score-to-raw score tables and the effects these tables have…
Descriptors: Classification, Comparative Analysis, Cutting Scores, Elementary School Students
Lam, Peter; Foong, Yoke-Yeen – 1996
This study attempts to estimate Structure of Learning Outcome (SOLO) levels in mathematics using the Partial Credit and Rating Scale models. A 30-item test comprising 10 testlets of 3 items each was designed and administered to 674 lower secondary school students. The items were arranged in a hierarchical manner, each testing SOLO levels in this…
Descriptors: Classification, Computer Assisted Testing, Foreign Countries, Goodness of Fit