Comparison and Properties of Correlational and Agreement Methods for Determining Whether or Not to Report Subtest Scores

Oksana Babenko, W. Todd Rogers

Abstract


Large-scale testing agencies often report subtest scores in addition to the total test score. But is there evidence that subtests reveal differences in student performance? Three methods for determining whether subscore reporting is warranted were evaluated using large-scale data, as well as samples of various sizes, from Reading and Mathematics assessments. Results showed that the subtests did not differ from one another and added no value over the total test score. The statistics associated with each method were found to be accurate and precise estimators of their population parameters. Implications for subscore reporting are discussed.


Keywords


subscore reporting; accuracy; precision; large-scale assessment

e-ISSN: 1694-2116

p-ISSN: 1694-2493