Application of Conditional Means for Diagnostic Scoring

Hollis Lai, Mark J. Gierl, Oksana Babenko

Abstract


In educational assessment, the demand for diagnostic information from test results has prompted the development of model-based diagnostic assessments. To determine student mastery of specific skills, a number of scoring approaches, including subscore reporting and probabilistic scoring solutions, have been developed for diagnostic assessments. Although each approach has its own limitations, these approaches are nevertheless widely used in diagnostic scoring, whereas an alternative approach, Complex Sum Scores (CSS), has received comparatively little attention. As the process of developing model-based diagnostic assessments becomes increasingly complex, we revisit CSS and demonstrate two applications in the development of diagnostic assessments: (a) illustrating and validating skills within the model, and (b) partial mastery scoring using model-based distractors. Through these two applications, we show how model-based diagnostic assessments can be developed and scored using the CSS approach, yielding results that teachers can use to inform teaching and learning.
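Because the abstract does not spell out how CSS are computed, the following is a minimal sketch only, assuming that skill-level sum scores are obtained by summing scored item responses over the items linked to each skill in a Q-matrix; the function name, example Q-matrix, and response data are all hypothetical illustrations rather than the authors' implementation.

```python
# Minimal sketch (hypothetical, not the authors' code): attribute-level sum
# scoring, where each skill's score is the sum of scored responses to the
# items that measure it, as defined by an illustrative Q-matrix.
import numpy as np

def skill_sum_scores(responses, q_matrix):
    """Return one sum score per skill for each student.

    responses : (n_students, n_items) array of scored responses
                (0/1 for dichotomous items, or partial-credit values).
    q_matrix  : (n_items, n_skills) binary array; q_matrix[i, k] = 1 if
                item i measures skill k.
    """
    responses = np.asarray(responses, dtype=float)
    q_matrix = np.asarray(q_matrix, dtype=float)
    return responses @ q_matrix  # sum of item scores within each skill

# Illustrative data: 3 students, 4 items, 2 skills (values are made up).
q = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [1, 1]])
x = np.array([[1, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 1]])

print(skill_sum_scores(x, q))
# [[2. 2.]
#  [0. 1.]
#  [3. 1.]]
```

Under this sketch, partial mastery scoring with model-based distractors would amount to replacing the 0/1 responses with partial-credit values keyed to the distractor a student selected, leaving the skill-level summation unchanged.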

Keywords


subscore reporting; diagnostic scoring; complex sum scores
