Meijer, Rob R.; And Others – Applied Psychological Measurement, 1995
Three methods based on the nonparametric item response theory (IRT) of R. J. Mokken for the estimation of the reliability of single dichotomous test items are discussed. Analytical and Monte Carlo studies show that one method, designated "MS," is superior because of smaller bias and smaller sampling variance. (SLD)
Descriptors: Estimation (Mathematics), Item Response Theory, Monte Carlo Methods, Nonparametric Statistics

Schulz, E. Matthew; Kolen, Michael J.; Nicewander, W. Alan – Applied Psychological Measurement, 1999
Developed a procedure for defining achievement levels on continuous scales using aspects of Guttman scaling (L. Guttman, 1950) and Item Response Theory. Using data from high school mathematics tests for about 6,000 students, found the new procedure to have higher reliability, higher classification consistency, and lower classification error than…
Descriptors: Academic Achievement, Classification, Estimation (Mathematics), High School Students

Davison, Mark L.; Robbins, Stephen – Applied Psychological Measurement, 1978
Empirically weighted scores for Rest's Defining Issues Test were found to be more reliable than the simple sum of scores, the theoretically weighted sum, or Rest's p scores. They also had slightly higher correlations with Kohlberg's interview scores. Empirically weighted scores also showed more significant change in two longitudinal studies. (CTM)
Descriptors: Higher Education, Longitudinal Studies, Moral Development, Moral Values
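The empirical-weighting idea above can be sketched as ordinary least-squares item weights regressed on a criterion. This is a minimal illustration with hypothetical toy data, not the actual Defining Issues Test items or the authors' exact procedure; the two-item closed-form solve stands in for a general regression.

```python
def empirical_weights_2(x1, x2, y):
    """OLS weights for two (centered) item scores predicting a criterion,
    via the explicit 2x2 normal-equations solution."""
    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))
    def center(v):
        m = sum(v) / len(v)
        return [p - m for p in v]
    x1, x2, y = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return ((s22 * s1y - s12 * s2y) / det,
            (s11 * s2y - s12 * s1y) / det)

# Toy data generated as y = 2*x1 + 0.5*x2, so OLS recovers those weights
# exactly; a unit-weighted sum (w1 = w2 = 1) would fit the criterion worse.
w1, w2 = empirical_weights_2([1, 2, 3, 4], [2, 1, 4, 3], [3.0, 4.5, 8.0, 9.5])
print(round(w1, 3), round(w2, 3))  # 2.0 0.5
```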

Poizner, Sharon B.; And Others – Applied Psychological Measurement, 1978
Binary, probability, and ordinal scoring procedures for multiple-choice items were examined. In two situations, it was found that both the probability and ordinal scoring systems were more reliable than the binary scoring method. (Author/CTM)
Descriptors: Confidence Testing, Guessing (Tests), Higher Education, Multiple Choice Tests

Lunneborg, Clifford E. – Applied Psychological Measurement, 1977
Three studies are described in which choice reaction time (RT) was related to such psychometric ability measures as verbal comprehension, numerical reasoning, hidden figures, and progressive matrices tests. Fairly consistent negative correlations were found between these tests and choice RT when high school samples were used. (Author/CTM)
Descriptors: Cognitive Ability, Cognitive Processes, High Schools, Higher Education

Hambleton, Ronald K., Ed. – Applied Psychological Measurement, 1980
This special issue covers recent technical developments in the field of criterion-referenced testing. An introduction, six papers, and two commentaries dealing with test development, test score uses, and evaluation of scores review relevant literature, offer new models and/or results, and suggest directions for additional research. (SLD)
Descriptors: Criterion Referenced Tests, Mastery Tests, Measurement Techniques, Standard Setting (Scoring)

Frederiksen, Norman; Ward, William C. – Applied Psychological Measurement, 1978
A set of Tests of Scientific Thinking was developed for possible use as criterion measures in research on creativity. Scores on the tests describe both quality and quantity of ideas produced in formulating hypotheses, evaluating proposals, solving methodological problems, and devising methods for measuring constructs. (Author/CTM)
Descriptors: Creativity Tests, Higher Education, Item Sampling, Predictive Validity

Schmeck, Ronald Ray; And Others – Applied Psychological Measurement, 1977
Five studies are presented describing the development of a self-report inventory for measuring individual differences in learning processes. Factor analysis of items yielded four scales: Synthesis-Analysis, Study Methods, Fact Retention, and Elaborative Processing. There were no sex differences, and the scales demonstrated acceptable reliabilities…
Descriptors: Factor Analysis, Higher Education, Learning Processes, Retention (Psychology)

Hsu, Louis M. – Applied Psychological Measurement, 1979
A comparison of the relative ordering power of separate and grouped-items true-false tests indicated that neither type of test was uniformly superior to the other across all levels of knowledge of examinees. Grouped-item tests were found superior for examinees with low levels of knowledge. (Author/CTM)
Descriptors: Academic Ability, Knowledge Level, Multiple Choice Tests, Scores

Woodruff, David J.; Sawyer, Richard L. – Applied Psychological Measurement, 1989
Two methods--non-distributional and normal--are derived for estimating measures of pass-fail reliability. Both are based on the Spearman-Brown formula and require only a single test administration. Results from a simulation (n=20,000 examinees) and a licensure examination (n=4,828 examinees) illustrate these methods. (SLD)
Descriptors: Equations (Mathematics), Estimation (Mathematics), Licensing Examinations (Professions), Measures (Individuals)
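The Spearman-Brown formula the abstract builds on can be sketched directly; this shows only the classical step-up from a split-half correlation, not the authors' non-distributional or normal pass-fail methods, whose details are not given here.

```python
def spearman_brown(rho: float, k: float) -> float:
    """Predicted reliability when test length is multiplied by k,
    given current reliability rho: k*rho / (1 + (k - 1)*rho)."""
    return k * rho / (1 + (k - 1) * rho)

# Split-half use: correlate two half-tests from a single administration,
# then step up to full length (k = 2).
half_test_correlation = 0.60
full_test_reliability = spearman_brown(half_test_correlation, k=2)
print(round(full_test_reliability, 3))  # 0.75
```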

Luecht, Richard M.; Hirsch, Thomas M. – Applied Psychological Measurement, 1992
Derivations of several item selection algorithms for use in fitting test items to target information functions (IFs) are described. These algorithms, which use an average growth approximation of target IFs, were tested by generating six test forms and were found to provide reliable fit. (SLD)
Descriptors: Algorithms, Computer Assisted Testing, Equations (Mathematics), Goodness of Fit
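Fitting items to a target information function can be sketched as greedy selection over 2PL item information. This is a generic illustration with made-up item parameters, not the average-growth-approximation algorithms the article derives; only the 2PL information formula, a^2 * P * (1 - P), is standard.

```python
import math

def info_2pl(a, b, theta):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def greedy_select(items, target, thetas, n_items):
    """Greedily add the item that most reduces the squared shortfall
    from the target information function at the given theta points."""
    selected, current = [], [0.0] * len(thetas)
    remaining = list(items)
    for _ in range(n_items):
        def shortfall(it):
            return sum(max(t - (c + info_2pl(it[0], it[1], th)), 0.0) ** 2
                       for t, c, th in zip(target, current, thetas))
        best = min(remaining, key=shortfall)
        remaining.remove(best)
        selected.append(best)
        current = [c + info_2pl(best[0], best[1], th)
                   for c, th in zip(current, thetas)]
    return selected

# Toy pool of (a, b) item parameters and a flat target of 1.0 at three thetas.
pool = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.2), (0.6, -0.5)]
picked = greedy_select(pool, target=[1.0, 1.0, 1.0],
                       thetas=[-1.0, 0.0, 1.0], n_items=3)
print(len(picked))  # 3
```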

Levin, Joel R.; Subkoviak, Michael J. – Applied Psychological Measurement, 1977
Textbook calculations of statistical power and sample size follow from formulas that assume the variables under consideration are measured without error. However, in the real world of behavioral research, errors of measurement cannot be neglected. The determination of sample size is discussed, and an example illustrates a blocking strategy.…
Descriptors: Analysis of Covariance, Analysis of Variance, Error of Measurement, Hypothesis Testing
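The point above can be made numerically with the standard attenuation argument: error in the outcome measure shrinks the observed standardized effect by the square root of the reliability, inflating the required sample size. This sketch uses the textbook two-sample z approximation, not the article's worked example or its blocking strategy.

```python
def attenuated_d(d_true: float, reliability: float) -> float:
    """Observed standardized mean difference when the outcome is measured
    with the given reliability (measurement error inflates the observed SD)."""
    return d_true * reliability ** 0.5

def n_per_group(d: float, z_alpha: float = 1.96, z_beta: float = 0.84) -> float:
    """Approximate n per group for a two-sided two-sample z test
    at alpha = .05 with power = .80."""
    return 2 * ((z_alpha + z_beta) / d) ** 2

# Required n grows as reliability falls, even though the true effect is fixed.
d_true = 0.5
for rho in (1.0, 0.8, 0.6):
    print(rho, round(n_per_group(attenuated_d(d_true, rho))))
```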

Rozeboom, William W. – Applied Psychological Measurement, 1989
Formulas are provided for estimating the reliability of a linear composite of non-equivalent subtests given the reliabilities of component subtests. The reliability of the composite is compared to that of its components. An empirical example uses data from 170 children aged 4 through 8 years performing 34 Piagetian tasks. (SLD)
Descriptors: Elementary School Students, Equations (Mathematics), Estimation (Mathematics), Mathematical Models
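One standard formula for the reliability of a unit-weighted composite from component reliabilities can be sketched as follows; it is offered as an illustration of the general idea, not necessarily the exact formulas Rozeboom derives, and the two-subtest numbers are invented.

```python
def composite_reliability(cov, reliabilities):
    """Reliability of the unit-weighted sum of subtests, given the subtest
    covariance matrix and each subtest's reliability:
    rho_X = 1 - sum(var_i * (1 - rho_i)) / var(total)."""
    variances = [cov[i][i] for i in range(len(cov))]
    error_var = sum(v * (1 - r) for v, r in zip(variances, reliabilities))
    total_var = sum(sum(row) for row in cov)
    return 1 - error_var / total_var

# Toy example: two subtests with variances 4 and 9, covariance 3,
# and reliabilities .70 and .80. The composite exceeds both components.
cov = [[4.0, 3.0], [3.0, 9.0]]
rel = composite_reliability(cov, [0.7, 0.8])
print(round(rel, 3))  # 0.842
```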

Chang, Lei – Applied Psychological Measurement, 1994
Reliability and validity of 4-point and 6-point scales were assessed using a new model-based approach to fit empirical data from 165 graduate students completing an attitude measure. Results suggest that the issue of four- versus six-point scales may depend on the empirical setting. (SLD)
Descriptors: Attitude Measures, Goodness of Fit, Graduate Students, Graduate Study

Eiting, Mindert H. – Applied Psychological Measurement, 1991
A method is proposed for sequential evaluation of the reliability of psychometric instruments. Sample size is not fixed in advance; a test statistic is computed after each person is sampled, and a decision is made at each stage of the sampling process. Results from a series of Monte Carlo experiments establish the method's efficiency. (SLD)
Descriptors: Computer Simulation, Equations (Mathematics), Estimation (Mathematics), Mathematical Models
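The sequential idea above can be sketched generically: sample one person at a time, recompute a reliability statistic, and stop as soon as it crosses a decision bound. The abstract does not specify Eiting's test statistic or bounds, so this sketch substitutes Cronbach's alpha with simple accept/reject thresholds and simulated data; it illustrates the sampling scheme, not the proposed method itself.

```python
import random

def cronbach_alpha(data):
    """Cronbach's alpha for rows = persons, columns = items."""
    n_items = len(data[0])
    cols = list(zip(*data))
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_var = sum(var(c) for c in cols)
    total_var = var([sum(row) for row in data])
    return n_items / (n_items - 1) * (1 - item_var / total_var)

def sequential_check(stream, accept=0.8, reject=0.5, min_n=30, max_n=500):
    """Sample one person at a time; after each, recompute alpha and stop
    as soon as it clears the accept bound or falls below the reject bound."""
    sample = []
    for person in stream:
        sample.append(person)
        if len(sample) < min_n:
            continue
        alpha = cronbach_alpha(sample)
        if alpha >= accept:
            return "accept", len(sample), alpha
        if alpha < reject:
            return "reject", len(sample), alpha
        if len(sample) >= max_n:
            return "undecided", len(sample), alpha
    return "undecided", len(sample), cronbach_alpha(sample)

# Toy stream: five items driven by a common trait plus noise.
random.seed(1)
def person():
    t = random.gauss(0, 1)
    return [t + random.gauss(0, 0.7) for _ in range(5)]

decision, n, alpha = sequential_check(person() for _ in range(500))
print(decision, n)
```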