Showing all 10 results
Peer reviewed
Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020
Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…
Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
Peer reviewed
Denovan, Andrew; Dagnall, Neil; Drinkwater, Ken – Journal of Psychoeducational Assessment, 2022
This study examined the psychometric properties of the Ego Resiliency Scale-Revised (ER89-R). Though support exists for a multidimensional conceptualisation using classical test theory approaches (i.e., a higher-order model comprising Openness to Life Experiences and Optimal Regulation factors), this measure has not been subjected to Rasch…
Descriptors: Likert Scales, Self Concept, Resilience (Psychology), Factor Analysis
Peer reviewed
Sasao, Yosuke; Webb, Stuart – Language Teaching Research, 2017
Knowledge of English affixes plays a significant role in increasing knowledge of words. However, few attempts have been made to create a valid and reliable measure of affix knowledge. The Word Part Levels Test (WPLT) was developed to measure three aspects of affix knowledge: form (recognition of written affix forms), meaning (knowledge of affix…
Descriptors: English (Second Language), Second Language Learning, Language Tests, Morphemes
Peer reviewed
Roche, Thomas; Harrington, Michael – Journal of Further and Higher Education, 2018
English language programmes provide established pathways for international students seeking university admission in countries such as Australia and the United Kingdom. In order to refer international applicants to appropriate levels and durations of English language support prior to matriculation into their main course of study, pathway providers…
Descriptors: Student Placement, College Admission, College Students, Foreign Students
Peer reviewed
Turner, Mark; Davila-Ross, Marina – Psychology Teaching Review, 2015
The ability to reason scientifically and communicate research appropriately is central to psychological literacy. Scientific research has little value unless scientists are able to convey results and their consequences clearly to others. In this study, we outline a method of assessing the development of psychological literacy in undergraduate…
Descriptors: Interviews, Research Projects, Psychological Studies, Verbal Communication
Peer reviewed
Black, Beth; Suto, Irenka; Bramley, Tom – Assessment in Education: Principles, Policy & Practice, 2011
In this paper we develop an evidence-based framework for considering many of the factors affecting marker agreement in GCSEs and A levels. A logical analysis of the demands of the marking task suggests a core grouping comprising: (i) question features; (ii) mark scheme features; and (iii) examinee response features. The framework synthesises…
Descriptors: Interrater Reliability, Grading, Scoring, High Stakes Tests
OECD Publishing, 2013
The Programme for the International Assessment of Adult Competencies (PIAAC) has been planned as an ongoing program of assessment. The first cycle of the assessment has involved two "rounds." The first round, which is covered by this report, took place over the period of January 2008-October 2013. The main features of the first cycle of…
Descriptors: International Assessment, Adults, Skills, Test Construction
Peer reviewed
Bramley, Tom – Evaluation & Research in Education, 2001
Analyzed data from a session of the General Certificate of Secondary Education (GCSE) mathematics examination to identify items displaying a bi-modal expected score distribution, try to explain the bi-modality, rescore the items to remove under-used middle categories, and determine the effect on test reliability of rescoring the data. Discusses…
Descriptors: Foreign Countries, Mathematics Tests, Reliability, Scores
Peer reviewed
Jones, Allan – Journal of Geography in Higher Education, 1997
Examines the increase in popularity of objective testing in the United Kingdom and addresses some of the accompanying academic issues. Reports on a case study of test production and implementation to illustrate issues of time costs and benefits. Discusses question styles, marking schemes, and the problem of guesswork. (MJP)
Descriptors: Comparative Testing, Educational Practices, Educational Trends, Foreign Countries