Publication Date
In 2025 | 0 |
Since 2024 | 6 |
Since 2021 (last 5 years) | 25 |
Since 2016 (last 10 years) | 860 |
Since 2006 (last 20 years) | 1810 |
Descriptor
Statistical Analysis | 2527 |
Reliability | 1276 |
Test Reliability | 1071 |
Foreign Countries | 940 |
Correlation | 633 |
Test Validity | 628 |
Factor Analysis | 559 |
Validity | 507 |
Questionnaires | 479 |
Measures (Individuals) | 411 |
Test Construction | 338 |
Author
Alonzo, Julie | 12 |
Price, Gary G. | 12 |
Tindal, Gerald | 10 |
Lai, Cheng-Fei | 9 |
Brennan, Robert L. | 8 |
Raykov, Tenko | 8 |
Feldt, Leonard S. | 7 |
Livingston, Samuel A. | 7 |
Park, Bitnara Jasmine | 7 |
Irvin, P. Shawn | 6 |
Anderson, Daniel | 5 |
Audience
Researchers | 33 |
Practitioners | 20 |
Teachers | 10 |
Students | 8 |
Administrators | 5 |
Counselors | 2 |
Parents | 1 |
Policymakers | 1 |
Location
Turkey | 204 |
Nigeria | 57 |
Jordan | 38 |
Australia | 35 |
Iran | 35 |
Taiwan | 35 |
Canada | 31 |
China | 30 |
Germany | 29 |
California | 28 |
United Kingdom | 25 |
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 1 |
Meets WWC Standards with or without Reservations | 1 |
Does not meet standards | 1 |
Wanzel, Stella K.; Schultze, Thomas; Schulz-Hardt, Stefan – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2017
When advice comes from interdependent sources (e.g., from advisors who use the same database), less information should be gained as compared to independent advice. On the other hand, since individuals strive for consistency, they should be more confident in consistent compared to conflicting advice, and interdependent advice should be more…
Descriptors: Counselors, Judges, Accuracy, Reliability
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2015
Existing tests of interrater agreements have high statistical power; however, they lack specificity. If the ratings of the two raters do not show agreement but are not random, the current tests, some of which are based on Cohen's kappa, will often reject the null hypothesis, leading to the wrong conclusion that agreement is present. A new test of…
Descriptors: Interrater Reliability, Monte Carlo Methods, Measurement Techniques, Accuracy
Kara, Yilmaz; Bakirci, Hasan – Journal of Education and Training Studies, 2018
The purpose of the study was to develop an assessment scale for science activity videos that can be used to identify qualified videos that fulfill the objectives of activity-based science education, and to help teachers evaluate any science activity video and decide whether to include it in the science learning process. The subjects…
Descriptors: Science Activities, Video Technology, Instructional Material Evaluation, Science Instruction
Christ, Theodore J.; Desjardins, Christopher David – Journal of Psychoeducational Assessment, 2018
Curriculum-Based Measurement of Oral Reading (CBM-R) is often used to monitor student progress and guide educational decisions. Ordinary least squares regression (OLSR) is the most widely used method to estimate the slope, or rate of improvement (ROI), even though published research demonstrates OLSR's lack of validity and reliability, and…
Descriptors: Bayesian Statistics, Curriculum Based Assessment, Oral Reading, Least Squares Statistics
Ford, Jeremy W.; Conoyer, Sarah J.; Lembke, Erica S.; Smith, R. Alex; Hosp, John L. – Assessment for Effective Intervention, 2018
In the present study, two types of curriculum-based measurement (CBM) tools in science, Vocabulary Matching (VM) and Statement Verification for Science (SV-S), a modified Sentence Verification Technique, were compared. Specifically, this study aimed to determine whether the format of information presented (i.e., SV-S vs. VM) produces differences…
Descriptors: Curriculum Based Assessment, Evaluation Methods, Measurement Techniques, Comparative Analysis
McRae, Lamerial; Gonzalez, Jennifer E.; Dominguez, Vanessa; Daire, Andrew Patrick; Liu, Xun – Measurement and Evaluation in Counseling and Development, 2018
We examined the construction for a modified Acceptance of Couple Violence (ACV) scale administered to lesbian, gay, bisexual, transgender, and queer college students (N = 266) measuring intimate partner violence. We ran an exploratory and confirmatory factor analysis; results identified 1 factor for the instrument explaining 76% of the variance in…
Descriptors: Factor Analysis, Test Construction, Questionnaires, Measures (Individuals)
Tock, Jamie L.; Moxley, Jerad H. – Journal of Psychoeducational Assessment, 2018
The Metacognitive Self-Regulation scale (MSR) was recently improved after subjecting the scale to a comprehensive reanalysis and replacing it with the Metacognitive Self-Regulation Revised scale (MSR-R). However, up to this point, researchers have made no attempts to determine if the MSR or MSR-R performs equivalently for males and females. The…
Descriptors: Metacognition, Gender Differences, Self Management, Test Reliability
Yalçin, S. Barbaros – Universal Journal of Educational Research, 2018
The purpose of this research is to determine whether prospective teachers' spiritual expressions predict their mindfulness. The research was conducted using the relational screening model. The study group consisted of 411 (81.2%) female and 94 (18.6%) male students, 505 undergraduate students in total, who are studying in the last year and who…
Descriptors: Preservice Teachers, Foreign Countries, Undergraduate Students, Prediction
Morris, Darrell; Pennell, Ashley M.; Perney, Jan; Trathen, Woodrow – Reading Psychology, 2018
This study compared reading rate to reading fluency (as measured by a rating scale). After listening to first graders read short passages, we assigned an overall fluency rating (low, average, or high) to each reading. We then used predictive discriminant analyses to determine which of five measures--accuracy, rate (objective); accuracy, phrasing,…
Descriptors: Reading Fluency, Prediction, Grade 1, Elementary School Students
Lambert, Matthew C.; January, Stacy-Ann A.; Pierce, Corey D. – Journal of Psychoeducational Assessment, 2018
The Emotional and Behavioral Screener (EBS) is a recently developed teacher-reported brief screening instrument for identifying students who are at-risk of an emotional or behavioral disorder (EBD). Although prior research supports the technical adequacy of scores from the EBS, there is a gap in the literature regarding strong evidence of the…
Descriptors: Screening Tests, Scores, Emotional Disturbances, Behavior Disorders
Aytaç, Kürsat Yusuf – Journal of Education and Training Studies, 2018
This research was conducted to investigate the effect of internet dependency on student-teachers' loneliness at Adiyaman University. The study also examined the differences in internet dependency and loneliness among students and teachers of Adiyaman University of Turkey. The standard questionnaire of Jung (1996) was used to measure the internet…
Descriptors: Internet, Psychological Patterns, Student Teachers, Social Isolation
Ebuoh, Casmir N. – World Journal of Education, 2018
The literature reveals that the patterns/methods of scoring essay tests have been criticized as unreliable, and this unreliability is likely to be greater in internal examinations than in external examinations. The purpose of this study is to find out the effects of analytical and holistic scoring patterns on scorer reliability in…
Descriptors: Holistic Approach, Scoring, Essay Tests, Biology
Isik, Utku; Demirel, Mehmet – Online Submission, 2018
This study aimed to adapt into Turkish the work-leisure conflict measure developed by Tsaur et al. (2012), to present the causes and dimensions of the conflict, to develop a new study-leisure conflict scale for university students based on the items of this scale, and to undertake reliability and validity…
Descriptors: Foreign Countries, Test Reliability, Test Validity, Conflict of Interest
Matsueda, Ross L.; Drakulich, Kevin M. – Sociological Methods & Research, 2016
This article specifies a multilevel measurement model for survey response when data are nested. The model includes a test-retest model of reliability, a confirmatory factor model of inter-item reliability with item-specific bias effects, an individual-level model of the biasing effects due to respondent characteristics, and a neighborhood-level…
Descriptors: Hierarchical Linear Modeling, Measurement, Surveys, Reliability