John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring of constructed-response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Arielle Boguslav; Julie Cohen – Journal of Teacher Education, 2024
Teacher preparation programs are increasingly expected to use data on preservice teacher (PST) skills to drive program improvement and provide targeted supports. Observational ratings are especially vital, but also prone to measurement issues. Scores may be influenced by factors unrelated to PSTs' instructional skills, including rater standards.…
Descriptors: Preservice Teachers, Measures (Individuals), Evaluation Problems, Teaching Skills
Kristen Bottema-Beutel; Shannon Crowley LaPoint; So Yoon Kim; Sarah Mohiuddin; Qun Yu; Rachael McKinnon – Exceptional Children, 2024
In this secondary analysis of a previously conducted systematic review, we analyze social validity assessments in intervention research for transition-age autistic youth. Social validity is concerned with the acceptability of the intervention goals, the acceptability and feasibility of the intervention procedures, and the perceived importance of…
Descriptors: Autism Spectrum Disorders, Intervention, Validity, Psychometrics
Ian Jones; Ben Davies – International Journal of Research & Method in Education, 2024
Educational researchers often need to construct precise and reliable measurement scales of complex and varied representations such as participants' written work, videoed lesson segments and policy documents. Developing such scales can be resource-intensive and time-consuming, and the outcomes are not always reliable. Here we present…
Descriptors: Educational Research, Comparative Analysis, Educational Researchers, Measurement
Brogan L. Barr; Virginia V. W. McIntosh; Eileen F. Britt; Jennifer Jordan; Janet D. Carter – Measurement: Interdisciplinary Research and Perspectives, 2024
Even when raters demonstrate agreement in the use of a measure, limited score variability or violation of often-ignored statistical assumptions can result in lower reliability estimates than intuitively expected. This article uses data drawn from two randomized controlled trials of schema therapy and cognitive behavioral therapy for the treatment…
Descriptors: Evaluators, Interrater Reliability, Reliability, Measurement Techniques
Richard S. Balkin; Quentin Hunter; Bradley T. Erford – Measurement and Evaluation in Counseling and Development, 2024
We describe best practices in reporting reliability estimates in counseling research with consideration to precision, generalization, and diverse populations. We provide a historical context to reporting reliability estimates, the limitations of past practices, and new methods to address reliability generalization. We highlight best practices…
Descriptors: Best Practices, Reliability, Counseling, Research
Sojeong Nam; Byeolbee Um; Jeongwoon Jeong; Monique Rodriguez; David Lardier – Measurement and Evaluation in Counseling and Development, 2024
This study aimed to provide meta-analytic reliability information for the Columbia-Suicide Severity Rating Scale (C-SSRS). We used systematic search procedures to identify 35 eligible studies (N = 23,247; mean age = 26.74 years) that reported reliability estimates. The synthesized average values of Cronbach's alpha were 0.88 (95% CI [0.85, 0.92]) for the…
Descriptors: Scores, Test Reliability, Rating Scales, Suicide
M. Arda Atakaya; Ugur Sak; M. Bahadir Ayas – Creativity Research Journal, 2024
Scoring in creativity research has been a central problem since creativity became an important issue in psychology and education in the 1950s. The current study examined the psychometric properties of 27 creativity indices derived from summed and averaged scores using 15 scoring methods. Participants included 2802 middle-school students. Data…
Descriptors: Psychometrics, Creativity, Creativity Tests, Scoring
Ehri Ryu – Society for Research on Educational Effectiveness, 2024
Background/Context: The confirmatory factor analysis (CFA) model is a commonly adopted framework for estimating and testing a measurement model. Once a well-fitting final CFA model is selected, the selected model may be used to test structural relationships of the latent constructs with other variables, to construct a test with desired reliability and…
Descriptors: Research Problems, Factor Analysis, Scores, Computation
Aberdine R. Dwight; Amy M. Briesch; Jessica A. Hoffman; Christopher Rutt – Child & Youth Care Forum, 2024
Background: Although the Depression Anxiety Stress Scales, Short Form (DASS-21) was developed for adults, its authors noted no compelling reasons to not use the measure with youth as young as 12 years. Despite increasingly widespread use with youth, psychometric evidence in support of its use with this population needs to be investigated to fully…
Descriptors: Depression (Psychology), Measures (Individuals), Anxiety, Stress Variables
Lucy Chambers; Emma Walland; Jo Ireland – Research Matters, 2024
Comparative Judgement (CJ) is traditionally and primarily used to compare written texts. In this study we explored whether we could extend its use to comparing audio files. We used GCSE Music portfolios which contained a mix of audio recordings, musical scores and text documents. Fifteen judges completed two exercises: one comparing musical…
Descriptors: Evaluative Thinking, Judges, Comparative Analysis, Reliability
Suzanna Dooley; Tammy Hopper; Rachael Doyle; Orla Gilheaney; Margaret Walshe – International Journal of Language & Communication Disorders, 2025
Background: Individuals with dementia have communication limitations resulting from cognitive impairments that define the syndrome. Whereas there are numerous cognitive assessments for individuals with dementia, there are far fewer communication assessments. The Profiling Communication Ability in Dementia (P-CAD) was developed to address this gap.…
Descriptors: Communication Skills, Communication Problems, Dementia, Intellectual Disability
Mustafa Ilhan; Nese Güler; Gülsen Tasdelen Teker; Ömer Ergenekon – International Journal of Assessment Tools in Education, 2024
This study aimed to examine the effects of reverse items created with different strategies on psychometric properties and respondents' scale scores. To this end, three versions of a 10-item scale in the research were developed: 10 positive items were integrated in the first form (Form-P) and five positive and five reverse items in the other two…
Descriptors: Test Items, Psychometrics, Scores, Measures (Individuals)
Abdulkadir Haktanir; M. Furkan Kurnaz; Zeynep Simsir Gökalp – Measurement and Evaluation in Counseling and Development, 2024
Objective: The Brief Self-Control Scale (BSCS) is the most widely used instrument to assess self-control. The purpose of this reliability generalization meta-analysis was to examine the degree to which consistency reliability coefficients for scores on the BSCS generalize across age groups and languages. Method: We included studies using the BSCS and…
Descriptors: Self Control, Measures (Individuals), Meta Analysis, Test Reliability
Muhammed Tayyib Kadak; Nihal Serdengeçti; Meryem Seçen Yazici; Tuncay Sandikçi; Aybike Aydin; Zehra Koyuncu; Yavuz Meral; Abas Hasimoglu; Yasin Çaliskan; Gizem Bayraktar; Elif Can Öztürk; Mehmet Enes Gökler; Roula Choueiri; Mahmut Cem Tarakçioglu – Autism: The International Journal of Research and Practice, 2024
This study aims to investigate the validation of the Rapid Interactive Screening Test for Autism in Toddlers (RITA-T) in Turkish toddlers between 18 and 36 months of age. Children aged 18-36 months were referred to the department of child psychiatry for concerns of autism spectrum disorder, language disorder, developmental delay, and typically…
Descriptors: Foreign Countries, Turkish, Screening Tests, Autism Spectrum Disorders