Do-Hong Kim; Chuang Wang; Thi Nhu Ngoc Truong – Language Teaching Research, 2024
Researchers and practitioners in the field of second language acquisition have come to realize the importance of non-cognitive skills such as self-efficacy and self-regulation in students' learning of a second language. However, there has been limited systematic research on such measures in the second language context and the validity and…
Descriptors: Psychometrics, Test Content, Self Efficacy, English Language Learners
Luo, Xiao; Wang, Xinrui – International Journal of Testing, 2019
This study introduced dynamic multistage testing (dy-MST) as an improvement to existing adaptive testing methods. dy-MST combines the advantages of computerized adaptive testing (CAT) and computerized adaptive multistage testing (ca-MST) to create a highly efficient and regulated adaptive testing method. In the test construction phase, multistage…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Construction, Psychometrics
New Meridian Corporation, 2020
New Meridian Corporation has developed the "Quality Testing Standards and Criteria for Comparability Claims" (QTS). The goal of the QTS is to provide guidance to states that are interested in including content from the New Meridian item bank and intend to make comparability claims with "other assessments" that include New…
Descriptors: Testing, Standards, Comparative Analysis, Guidelines
New Meridian Corporation, 2020
New Meridian Corporation has developed the "Quality Testing Standards and Criteria for Comparability Claims" (QTS) to provide guidance to states that are interested in including New Meridian content and would like to either keep reporting scores on the New Meridian Scale or use the New Meridian performance levels; that is, the state…
Descriptors: Testing, Standards, Comparative Analysis, Test Content
Traxler, Adrienne; Henderson, Rachel; Stewart, John; Stewart, Gay; Papak, Alexis; Lindell, Rebecca – Physical Review Physics Education Research, 2018
Research on the test structure of the Force Concept Inventory (FCI) has largely ignored gender, and research on FCI gender effects (often reported as "gender gaps") has seldom interrogated the structure of the test. These rarely crossed streams of research leave open the possibility that the FCI may not be structurally valid across…
Descriptors: Physics, Science Instruction, Sex Fairness, Gender Differences
Gierl, Mark J.; Lai, Hollis – Educational Measurement: Issues and Practice, 2016
Testing organizations need large numbers of high-quality items due to the proliferation of alternative test administration methods and modern test designs. But the current demand for items far exceeds the supply. Test items, as they are currently written, evoke a process that is both time-consuming and expensive because each item is written,…
Descriptors: Test Items, Test Construction, Psychometrics, Models
Ackerman, Debra J. – ETS Research Report Series, 2018
Kindergarten entry assessments (KEAs) have increasingly been incorporated into state education policies over the past 5 years, with much of this interest stemming from Race to the Top--Early Learning Challenge (RTT-ELC) awards, Enhanced Assessment Grants, and nationwide efforts to develop common K-12 state learning standards. Drawing on…
Descriptors: Screening Tests, Kindergarten, Test Validity, Test Reliability
Towns, Marcy H. – Journal of Chemical Education, 2014
Chemistry faculty members are highly skilled in obtaining, analyzing, and interpreting physical measurements, but often they are less skilled in measuring student learning. This work provides guidance for chemistry faculty from the research literature on multiple-choice item development in chemistry. Areas covered include content, stem, and…
Descriptors: Multiple Choice Tests, Test Construction, Psychometrics, Test Items
Lin, Chuan-Ju – Journal of Technology, Learning, and Assessment, 2010
Assembling equivalent test forms with minimal test overlap across forms is important in ensuring test security. Chen and Lei (2009) suggested an exposure control technique to control test overlap-ordered item pooling on the fly based on the essence that test overlap rate--ordered item pooling for the first t examinees is a function of test overlap…
Descriptors: Test Length, Test Format, Evaluation Criteria, Psychometrics
Jacobsen, Jared; Ackermann, Richard; Eguez, Jane; Ganguli, Debalina; Rickard, Patricia; Taylor, Linda – Journal of Applied Testing Technology, 2011
A computer adaptive test (CAT) is a delivery methodology that serves the larger goals of the assessment system in which it is embedded. A thorough analysis of the assessment system for which a CAT is being designed is critical to ensure that the delivery platform is appropriate and addresses all relevant complexities. As such, a CAT engine must be…
Descriptors: Delivery Systems, Testing Programs, Computer Assisted Testing, Foreign Countries
Hendrickson, Amy; Patterson, Brian; Ewing, Maureen – College Board, 2010
The psychometric considerations and challenges associated with including constructed response items on tests are discussed along with how these issues affect the form assembly specifications for mixed-format exams. Reliability and validity, security and fairness, pretesting, content and skills coverage, test length and timing, weights, statistical…
Descriptors: Multiple Choice Tests, Test Format, Test Construction, Test Validity
ACT, Inc., 2013
This manual contains information about the American College Test (ACT) Plan® program. The principal focus of this manual is to document the Plan program's technical adequacy in light of its intended purposes. This manual supersedes the 2011 edition. The content of this manual responds to requirements of the testing industry as established in the…
Descriptors: College Entrance Examinations, Formative Evaluation, Evaluation Research, Test Bias
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation
Sawaki, Yasuyo; Kim, Hae-Jin; Gentile, Claudia – Language Assessment Quarterly, 2009
In cognitive diagnosis, a Q-matrix (Tatsuoka, 1983, 1990), which is an incidence matrix that defines the relationships between test items and constructs of interest, has a great impact on the nature of performance feedback that can be provided to score users. The purpose of the present study was to identify meaningful skill coding categories that…
Descriptors: Feedback (Response), Test Items, Test Content, Identification