Showing all 12 results
Peer reviewed
McNamara, Tim; Knoch, Ute – Language Testing, 2012
This paper examines the uptake of Rasch measurement in language testing through a consideration of research published in language testing research journals in the period 1984 to 2009. Following the publication of the first papers on this topic, exploring the potential of the simple Rasch model for the analysis of dichotomous language test data, a…
Descriptors: Language Tests, Testing, English (Second Language), Item Response Theory
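The simple Rasch model for dichotomous language test data mentioned in this abstract can be sketched as follows; this is an illustrative implementation for readers unfamiliar with the model, not code from the paper:

```python
import math

def rasch_probability(theta, b):
    """Probability of a correct response to a dichotomous item under the
    simple Rasch model: it depends only on the difference between person
    ability (theta) and item difficulty (b), both on the logit scale."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals difficulty, the probability of success is 0.5;
# higher ability relative to difficulty raises it toward 1.
print(rasch_probability(0.0, 0.0))  # 0.5
print(rasch_probability(2.0, 0.0) > rasch_probability(0.0, 0.0))  # True
```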
Kaplan, Randy M.; And Others – 1995
The increased use of constructed-response items, such as essays, creates a need for tools that can score these responses automatically, in part or as a whole. This study explores one approach to analyzing essay-length, natural-language constructed responses. A decision model for scoring essays was developed and evaluated. The decision model uses…
Descriptors: Computer Software, Constructed Response, Essay Tests, Grammar
Peer reviewed
Lee, Yong-Won – Language Testing, 2006
A multitask speaking measure consisting of both integrated and independent tasks is expected to be an important component of a new version of the TOEFL test. This study considered two critical issues concerning score dependability of the new speaking measure: How much would the score dependability be impacted by (1) combining scores on different…
Descriptors: Language Tests, Second Language Learning, English (Second Language), Generalizability Theory
McKinley, Robert L.; Way, Walter D. – 1992
An analysis of the skills necessary for performance on the Test of English as a Foreign Language (TOEFL) tends to support the view that there are important, although subtle, secondary dimensions present in the test. This research explored the feasibility of an item response theory (IRT) based method of modeling examinee performance on these…
Descriptors: Ability, Goodness of Fit, Identification, Item Response Theory
Yamamoto, Kentaro – 1995
The traditional indicator of test speededness, missing responses, clearly signals a lack of time to respond, but it is inadequate for evaluating speededness in a multiple-choice test scored as number correct, and it underestimates test speededness. Conventional item response theory (IRT) parameter…
Descriptors: Ability, Estimation (Mathematics), Item Response Theory, Multiple Choice Tests
Peer reviewed
Wainer, Howard; Lukhele, Robert – Educational and Psychological Measurement, 1997
The reliability of scores from four forms of the Test of English as a Foreign Language (TOEFL) was estimated using a hybrid item response theory model. It was found that there was very little difference between overall reliability when the testlet items were assumed to be independent and when their dependence was modeled. (Author/SLD)
Descriptors: English (Second Language), Item Response Theory, Scores, Second Language Learning
Hicks, Marilyn M. – 1988
Several exploratory analyses of the fifths data generated by Test of English as a Foreign Language (TOEFL) item analyses were developed in order to evaluate the effects of options on the discriminability of difficult items and to identify difficult items with low, unreliable biserials that had been rejected by test developers, but for which…
Descriptors: Difficulty Level, Estimation (Mathematics), Identification, Item Analysis
Way, Walter D.; Reese, Clyde M. – 1991
The use of two alternative item response theory (IRT) estimation models in the scaling and equating of the Test of English as a Foreign Language (TOEFL) was explored; and item scaling and test equating results based on these models were compared with results based on the three-parameter (3PL) model currently being used with the TOEFL. Models were…
Descriptors: Correlation, Equated Scores, Estimation (Mathematics), Goodness of Fit
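The three-parameter logistic (3PL) model referenced in this and several other TOEFL studies can be sketched as follows; this is an illustrative implementation of the standard 3PL item response function, not code from any of the studies listed:

```python
import math

def three_pl(theta, a, b, c):
    """Three-parameter logistic (3PL) IRT model: probability of a correct
    response given ability theta, with item discrimination a, difficulty b,
    and pseudo-guessing lower asymptote c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# With c = 0.2, even a very low-ability examinee retains roughly a 0.2
# chance of answering correctly (the guessing floor); at theta == b the
# probability is midway between c and 1, i.e. 0.6.
print(three_pl(0.0, 1.0, 0.0, 0.2))   # 0.6
print(three_pl(-6.0, 1.0, 0.0, 0.2))  # close to 0.2
```

Setting a = 1 and c = 0 reduces this function to the simple Rasch/one-parameter form, which is what makes model comparisons like the ones in these studies possible within a single framework.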
Tang, K. Linda; And Others – 1993
This study compared the performance of the LOGIST and BILOG computer programs on item response theory (IRT) based scaling and equating for the Test of English as a Foreign Language (TOEFL) using real and simulated data and two calibration structures. Applications of IRT for the TOEFL program are based on the three-parameter logistic (3PL) model.…
Descriptors: Comparative Analysis, Computer Simulation, Equated Scores, Estimation (Mathematics)
Boldt, R. F. – 1994
The comparison of item response theory models for the Test of English as a Foreign Language (TOEFL) was extended to an equating context as simulation trials were used to "equate the test to itself." Equating sample data were generated from administration of identical item sets. Equatings that used procedures based on each model (simple…
Descriptors: Comparative Analysis, Cutting Scores, English (Second Language), Equated Scores
Way, Walter D.; And Others – 1992
This study provided an exploratory investigation of item features that might contribute to a lack of invariance of item parameters for the Test of English as a Foreign Language (TOEFL). Data came from seven forms of the TOEFL administered in 1989. Subjective and quantitative measures developed for the study provided consistent information related…
Descriptors: Ability, English (Second Language), Goodness of Fit, Item Response Theory
Hicks, Marilyn M. – 1989
Methods of computerized adaptive testing using conventional scoring methods in order to develop a computerized placement test for the Test of English as a Foreign Language (TOEFL) were studied. As a consequence of simulation studies during the first phase of the study, the multilevel testing paradigm was adopted to produce three test levels…
Descriptors: Adaptive Testing, Adults, Algorithms, Computer Assisted Testing