Pan, Yiqin; Wollack, James A. – Educational Measurement: Issues and Practice, 2023
Pan and Wollack (PW) proposed a machine learning method to detect compromised items. We extend the work of PW to an approach detecting compromised items and examinees with item preknowledge simultaneously and draw on ideas in ensemble learning to relax several limitations in the work of PW. The suggested approach also provides a confidence score,…
Descriptors: Artificial Intelligence, Prior Learning, Item Analysis, Test Content
Ondrej Klíma; Martin Lakomý; Ekaterina Volevach – International Journal of Social Research Methodology, 2024
We tested the impacts of Hofstede's cultural factors and mode of administration on item nonresponse (INR) for political questions in the European Values Study (EVS). We worked with the integrated European Values Study dataset, using descriptive analysis and multilevel binary logistic regression models. We concluded that (1) modes of administration…
Descriptors: Cultural Influences, Testing, Test Items, Responses
Marjolein Muskens; Willem E. Frankenhuis; Lex Borghans – npj Science of Learning, 2024
In many countries, standardized math tests are important for achieving academic success. Here, we examine whether content of items, the story that explains a mathematical question, biases performance of low-SES students. In a large-scale cohort study of Trends in International Mathematics and Science Study (TIMSS)--including data from 58…
Descriptors: Mathematics Tests, Standardized Tests, Test Items, Low Income Students
Britt Hadar; Maayan Katzir; Sephi Pumpian; Tzur Karelitz; Nira Liberman – npj Science of Learning, 2023
Performance on standardized academic aptitude tests (AAT) can determine important life outcomes. However, it is not clear whether and which aspects of the content of test questions affect performance. We examined the effect of psychological distance embedded in test questions. In Study 1 (N = 41,209), we classified the content of existing AAT…
Descriptors: Academic Aptitude, Thinking Skills, Aptitude Tests, Standardized Tests
Sarah K. Cowan; Michael Hout; Stuart Perrett – Sociological Methods & Research, 2024
Long-running surveys need a systematic way to reflect social change and to keep items relevant to respondents, especially when they ask about controversial subjects, or they threaten the items' validity. We propose a protocol for updating measures that preserves content and construct validity. First, substantive experts articulate the current and…
Descriptors: Surveys, Public Opinion, Social Attitudes, Pregnancy
Wise, Steven L. – Applied Measurement in Education, 2020
In achievement testing there is typically a practical requirement that the set of items administered should be representative of some target content domain. This is accomplished by establishing test blueprints specifying the content constraints to be followed when selecting the items for a test. Sometimes, however, students give disengaged…
Descriptors: Test Items, Test Content, Achievement Tests, Guessing (Tests)
Do-Hong Kim; Chuang Wang; Thi Nhu Ngoc Truong – Language Teaching Research, 2024
Researchers and practitioners in the field of second language acquisition have come to realize the importance of non-cognitive skills such as self-efficacy and self-regulation in students' learning of a second language. However, there has been limited systematic research on such measures in the second language context and the validity and…
Descriptors: Psychometrics, Test Content, Self Efficacy, English Language Learners
Agus Santoso; Heri Retnawati; Timbul Pardede; Ibnu Rafi; Munaya Nikma Rosyada; Gulzhaina K. Kassymova; Xu Wenxin – Practical Assessment, Research & Evaluation, 2024
The test blueprint is important in test development, where it guides the test item writer in creating test items according to the desired objectives and specifications or characteristics (the so-called a priori item characteristics), such as the level of item difficulty in the category and the distribution of items based on their difficulty level.…
Descriptors: Foreign Countries, Undergraduate Students, Business English, Test Construction
Russell, Michael; Moncaleano, Sebastian – Practical Assessment, Research & Evaluation, 2020
Although both content alignment and standard-setting procedures rely on content-expert panel judgements, only the latter employs discussion among panel members. This study employed a modified form of the Webb methodology to examine content alignment for twelve tests administered as part of the Massachusetts Comprehensive Assessment System (MCAS).…
Descriptors: Test Content, Test Items, Discussion, Test Validity
Wu, Haiyan; Liang, Xinya; Yürekli, Hülya; Becker, Betsy Jane; Paek, Insu; Binici, Salih – Journal of Psychoeducational Assessment, 2020
The demand for diagnostic feedback has triggered extensive research on cognitive diagnostic models (CDMs), such as the deterministic input, noisy output "and" gate (DINA) model. This study explored two Q-matrix specifications with the DINA model in a statewide large-scale mathematics assessment. The first Q-matrix was developed based on…
Descriptors: Mathematics Tests, Cognitive Measurement, Models, Test Items
Luo, Xiao; Wang, Xinrui – International Journal of Testing, 2019
This study introduced dynamic multistage testing (dy-MST) as an improvement to existing adaptive testing methods. dy-MST combines the advantages of computerized adaptive testing (CAT) and computerized adaptive multistage testing (ca-MST) to create a highly efficient and regulated adaptive testing method. In the test construction phase, multistage…
Descriptors: Adaptive Testing, Computer Assisted Testing, Test Construction, Psychometrics
Sivakorn Tangsakul; Kornwipa Poonpon – rEFLections, 2024
Given the significant global influence of the Common European Framework of Reference for Languages: Teaching, Learning, and Assessment (CEFR) on English language education, this study deals with aligning a university's academic reading tests to the CEFR. It aimed at validating the test construct of the academic reading tests in relation to the…
Descriptors: Alignment (Education), Reading Tests, Second Language Learning, Language Proficiency
Parry, James R. – Online Submission, 2020
This paper presents research and provides a method to ensure that parallel assessments generated from a large test-item database maintain equitable difficulty and content coverage each time the assessment is presented. To maintain fairness and validity, it is important that all instances of an assessment that is intended to test the…
Descriptors: Culture Fair Tests, Difficulty Level, Test Items, Test Validity
Atalmis, Erkan Hasan; Kingston, Neal Martin – SAGE Open, 2018
This study explored the impact of homogeneity of answer choices on item difficulty and discrimination. Twenty-two matched pairs of elementary and secondary mathematics items were administered to randomly equivalent samples of students. Each item pair comparison was treated as a separate study with the set of effect sizes analyzed using…
Descriptors: Test Items, Difficulty Level, Multiple Choice Tests, Mathematics Tests
Wallace, Matthew P.; Ke, Haijiao – TEFLIN Journal: A publication on the teaching and learning of English, 2023
This study examined the content alignment between an English as a foreign language skills curriculum and a provincial language test in China. When there is misalignment in the content between the standards of a curriculum and a test, conclusions about student abilities and teaching effectiveness can be questioned. To examine this, three categories…
Descriptors: Language Tests, Alignment (Education), Second Language Learning, Second Language Instruction