


Bayesian Networks in Educational Assessment

Cognition and Assessment





"Evidence-centered design" (ECD) describes a program of research and application initiated at Educational Testing Service in 1997 by Robert J. Mislevy, Linda S. Steinberg, and Russell G. Almond. The work proposes a principled framework for designing, producing, and delivering educational assessments. The overarching conception is that all of the many and varied activities and elements of an assessment derive their meaning in the instantiation of an assessment argument.

The ECD framework can be used to analyze and design assessments of all kinds. Special attention is accorded to assessments that incorporate features (such as complex student models and interactive simulations) that lie beyond the rules-of-thumb and analytic procedures that have evolved over the years to support familiar kinds of assessments.

The contribution of ECD is not so much a particular advance in statistics, psychology, or forms of assessment as it is in laying out a coherent framework to harness recent developments of these various types toward a common purpose.

This article provides a brief history of ECD. The following sections address in turn the rationale of ECD, historical antecedents, the Portal project at ETS, and subsequent applications and development projects.

Rationale


Advances in cognitive and instructional sciences stretch our expectations about the kinds of knowledge and skills we want to develop in students, and the kinds of observations we need to evidence them (Glaser et al., 1987). Off-the-shelf assessments and standardized tests are increasingly unsatisfactory for guiding learning and evaluating students' progress.

Advances in technology make it possible to evoke evidence of knowledge more broadly conceived, and to capture more complex performances. Making sense of the complex data that result is the bottleneck.

Advances in evidentiary reasoning (e.g., Schum, 1994) and in statistical modeling (e.g., Gelman et al., 1995) allow us to bring probability-based reasoning to bear on the problems of modeling and uncertainty that arise naturally in all assessments. These advances extend the principles that underlie familiar test theory to more varied and complex inferences from more complex data (Mislevy, 1994).
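
As a rough illustration of what probability-based reasoning means in this setting, the sketch below applies Bayes' rule to update belief about a student's proficiency after one scored response. It is a minimal example under stated assumptions, not a method taken from the cited papers: the function name `update_proficiency`, the two proficiency levels, and all probabilities are hypothetical values chosen for illustration.

```python
from typing import Dict


def update_proficiency(prior: Dict[str, float],
                       p_correct: Dict[str, float],
                       correct: bool) -> Dict[str, float]:
    """Return the posterior over proficiency levels after one scored response.

    prior      -- probability of each proficiency level before the response
    p_correct  -- assumed probability of a correct response at each level
    correct    -- whether the observed response was correct
    """
    unnormalized = {}
    for level, p_level in prior.items():
        # Likelihood of the observed outcome given this proficiency level.
        likelihood = p_correct[level] if correct else 1.0 - p_correct[level]
        unnormalized[level] = p_level * likelihood

    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")

    # Bayes' rule: normalize so the posterior sums to one.
    return {level: value / total for level, value in unnormalized.items()}


# Hypothetical numbers: an even prior and a task that separates the levels.
prior = {"low": 0.5, "high": 0.5}
p_correct = {"low": 0.2, "high": 0.8}

# A correct response shifts belief toward "high" ...
posterior = update_proficiency(prior, p_correct, correct=True)

# ... and a later incorrect response on a similar task shifts it back.
posterior_after_miss = update_proficiency(posterior, p_correct, correct=False)
```

The same update rule applies whether the observation comes from a multiple-choice item or a feature extracted from a complex performance; what changes is how the conditional probabilities are specified and how many variables are involved.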

Advanced technologies and statistical methods aren't sufficient. One must design a complex assessment from the very start around the inferences one wants to make, the observations one needs to ground them, the situations that will evoke those observations, and the chain of reasoning that connects them (Messick, 1994).

A conceptual design framework for the elements of a coherent assessment can be laid out at a level of generality that supports a broad range of assessment types, from familiar standardized tests and classroom quizzes, to coached practice systems and simulation-based assessments, to portfolios and student-tutor interaction. Our design framework is based on the principles of evidentiary reasoning and the exigencies of assessment production and delivery. Designing assessment products in such a framework ensures that the way in which evidence is gathered and interpreted bears on the underlying knowledge and purposes the assessment is intended to address. The common design architecture further ensures coordination among the work of different specialists, such as statisticians, task authors, delivery vendors, and interface designers.

Historical Antecedents

[under construction]

The Portal project

The Portal research program integrated and extended earlier work by the three principals: Mislevy (e.g., 1994) on evidentiary reasoning in assessment, Steinberg (e.g., Steinberg & Gitomer, 1996) on cognitively-based intelligent tutoring systems, and Almond (e.g., 1995) on building blocks for graphical probability models.

The work during 1997-1999 included developing the assessment design framework and applying the ideas. Developing the design framework was ETS's "Portal" project. It comprised three distinguishable aspects:

  1. A conceptual framework for assessment design. The 'evidence-centered' Portal assessment design framework explicates the relationships among the inferences the assessor wants to make about the student, what needs to be observed to provide evidence for those inferences, and what features of situations evoke that evidence.
  2. An object model for creating specifications for particular assessment products. The Portal object model embodies the pieces and relationships that any particular assessment needs. Creating instances of the objects as needed for a particular assessment ensures that the assessment will have the functionality it needs and that its components will work together. This framework promotes re-usability of objects and processes.
  3. Software tools for creating and managing the design objects. The first generation of software tools to facilitate the design process has been developed. They can be used to create, manipulate, and coordinate a structured database containing the elements of an assessment design, which then serves as a blueprint for developing the application.
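
The relationships named in the first item (inferences about the student, observations that provide evidence, and situation features that evoke those observations) can be made concrete with a small data-structure sketch. The Portal object model itself is proprietary, so what follows is not that model: every class, field, and method name here (`StudentVariable`, `Observable`, `TaskFeature`, `AssessmentDesign`, `broken_links`) is a hypothetical stand-in, and the example content is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StudentVariable:
    """Something the assessor wants to make an inference about."""
    name: str


@dataclass
class Observable:
    """Something observed in a performance, and the variables it informs."""
    name: str
    informs: List[str]  # names of StudentVariable objects


@dataclass
class TaskFeature:
    """A feature of a situation, and the observables it is meant to evoke."""
    name: str
    evokes: List[str]  # names of Observable objects


@dataclass
class AssessmentDesign:
    """A design blueprint whose pieces must link up into one chain of reasoning."""
    variables: List[StudentVariable] = field(default_factory=list)
    observables: List[Observable] = field(default_factory=list)
    task_features: List[TaskFeature] = field(default_factory=list)

    def broken_links(self) -> List[str]:
        """List places where the chain from situation to inference is incomplete."""
        problems = []
        variable_names = {v.name for v in self.variables}
        observable_names = {o.name for o in self.observables}
        evoked = {name for t in self.task_features for name in t.evokes}

        for obs in self.observables:
            for target in obs.informs:
                if target not in variable_names:
                    problems.append(f"{obs.name} informs unknown variable {target}")
            if obs.name not in evoked:
                problems.append(f"{obs.name} is not evoked by any task feature")
        for feat in self.task_features:
            for target in feat.evokes:
                if target not in observable_names:
                    problems.append(f"{feat.name} evokes unknown observable {target}")
        return problems


# Invented example content: one variable, one observable, one task feature.
design = AssessmentDesign(
    variables=[StudentVariable("problem_solving")],
    observables=[Observable("orders_relevant_tests", informs=["problem_solving"])],
    task_features=[TaskFeature("ambiguous_case_history", evokes=["orders_relevant_tests"])],
)
complete_design_problems = design.broken_links()

# Adding an observable that no task feature evokes exposes a gap in the design.
design.observables.append(Observable("explains_rationale", informs=["problem_solving"]))
incomplete_design_problems = design.broken_links()
```

A check of this kind is one way a structured design database could flag an observation that no task is built to elicit, or a task feature that feeds no inference, before any application is developed.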

Several publications describe the conceptual model and its implications, although details of ETS's specific implementation of the approach in terms of the Portal object model and software tools are proprietary. Various papers highlight different (interconnecting) facets of designing and implementing complex assessments, including cognitive psychology (Steinberg & Gitomer, 1996); probability-based reasoning (Almond et al., 1999); task design (Almond & Mislevy, 1999; Mislevy, Steinberg, & Almond, in press); and computer-based simulation (Mislevy et al., 1999a). Mislevy, Steinberg, Breyer, Almond, & Johnson (1999a, 1999b) apply this machinery to design a simulation-based assessment of problem-solving in dental hygiene. Almond, Steinberg, and Mislevy (1999) apply it to the challenge of defining standards for the inter-operability of assessment/instructional materials and processes. [links to online versions of many of these papers are forthcoming]

References


Almond, R.G. (1995). Graphical belief modelling. London: Chapman and Hall.

Almond, R.G., Herskovits, E., Mislevy, R.J., and Steinberg, L.S. (1999). Transfer of information between system and evidence models. In D. Heckerman & J. Whittaker (Eds.), Artificial Intelligence and Statistics 99 (pp. 181-186). San Francisco: Morgan Kaufmann.

Almond, R.G., & Mislevy, R.J. (1999). Graphical models and computerized adaptive testing. Applied Psychological Measurement, 23, 223-237.

Almond, R.G., Steinberg, L.S., & Mislevy, R.J. (1999). A sample assessment using the four process framework. White paper prepared for the IMS Inter-Operability Standards Working Group. Princeton, NJ: Educational Testing Service.

Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (1995). Bayesian data analysis. London: Chapman & Hall.

Glaser, R., Lesgold, A., & Lajoie, S. (1987). Toward a cognitive theory for the measurement of achievement. In R. Ronning, J. Glover, J.C. Conoley, & J. Witt (Eds.), The influence of cognitive psychology on testing and measurement: The Buros-Nebraska Symposium on measurement and testing (Vol. 3) (pp. 41-85). Hillsdale, NJ: Erlbaum.

Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13-23.

Mislevy, R.J. (1994). Evidence and inference in educational assessment. Psychometrika, 59, 439-483.

Mislevy, R.J., Almond, R.G., Yan, D., & Steinberg, L.S. (1999). Bayes nets in educational assessment: Where do the numbers come from? In K. B. Laskey & H. Prade (Eds.), Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (pp. 437-446). San Francisco: Morgan Kaufmann Publishers, Inc.

Mislevy, R.J., Steinberg, L.S., & Almond, R.G. (in press). On the roles of task model variables in assessment design. In S. Irvine & P. Kyllonen (Eds.), Generating items for cognitive tests: Theory and practice. Hillsdale, NJ: Erlbaum.

Mislevy, R.J., Steinberg, L.S., Breyer, F.J., Almond, R.G., & Johnson, L. (1999a). A cognitive task analysis, with implications for designing a simulation-based assessment system. Computers in Human Behavior, 15, 335-374.

Mislevy, R.J., Steinberg, L.S., Breyer, F.J., Almond, R.G., & Johnson, L. (1999b). Making sense of data from complex assessments. Paper presented at the 1999 CRESST Conference, Los Angeles, CA, September 16-17, 1999.

Steinberg, L.S., & Gitomer, D.G. (1996). Intelligent tutoring and assessment built on an understanding of a technical problem-solving task. Instructional Science, 24, 223-258.

Schum, D.A. (1994). The evidential foundations of probabilistic reasoning. New York: Wiley.

Spiegelhalter, D.J., Dawid, A.P., Lauritzen, S.L., & Cowell, R.G. (1993). Bayesian analysis in expert systems. Statistical Science, 8, 219-283.

Page last modified on March 02, 2007, at 07:39 AM