These are five questions that I posed during a lecture, "Using Evidence-Centered Design to Think about Assessments," presented at Florida State University. During the lecture I asked the audience to discuss the answers with their neighbors at their tables. I am now revising the talk for a book chapter, and I intend this Wiki page to serve as an online "table" for the readers of that chapter (and anybody else who cares to join).

This page lists all of the questions, and links to the individual questions.

Exercise 1 Tutoring Decision

Consider a school district making a decision about whether or not to place a child into a tutoring program. The school counselor has the following options available:

  • Place the child in the mainstream classroom at the baseline cost.
  • Place the child in a one-on-one tutoring program that costs ten times the baseline cost.

Before making the decision, the school counselor can administer one of the following tests:

  • Ask the teacher about their impressions of the student. There is a considerable amount of variability in the way the different teachers at the school evaluate the children; however, this test is relatively inexpensive.
  • Use a detailed diagnostic assessment that costs one hundred dollars to administer and requires the student to miss 3 hours of class time.

What is the best "policy" (rule for selecting a test and a placement option based on test results)?
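
One way to frame the comparison is as an expected-cost calculation over every combination of test and placement rule. The sketch below is only an illustration of that framing, not a solution to the exercise: the prior probability that the child needs tutoring, the penalty for leaving such a child in the mainstream classroom, the test accuracies, and the test costs (expressed in units of the baseline cost) are all made-up assumptions. Only the ten-to-one cost ratio between tutoring and the mainstream classroom comes from the exercise.

```python
from itertools import product

# Every number here except the 10:1 placement cost ratio is a hypothetical assumption.
P_NEEDS = 0.3                                  # assumed prior: child needs tutoring
COST = {"mainstream": 1.0, "tutoring": 10.0}   # in units of the baseline cost
MISS_PENALTY = 40.0                            # assumed loss if a child who needs
                                               # tutoring stays in the mainstream

# "none" is modeled as a test that always reads "pos", so only the
# placement chosen for "pos" matters when no test is given.
TESTS = {
    "none":       {"cost": 0.0,  "sens": 1.0,  "spec": 0.0},
    "teacher":    {"cost": 0.05, "sens": 0.7,  "spec": 0.7},
    "diagnostic": {"cost": 0.5,  "sens": 0.95, "spec": 0.9},
}


def expected_cost(test, rule):
    """Expected total cost of giving `test` and placing by `rule`.

    `rule` maps a test result ("pos" or "neg") to a placement name.
    """
    t = TESTS[test]
    total = t["cost"]
    for needs, p_state in ((True, P_NEEDS), (False, 1.0 - P_NEEDS)):
        # Probability of a positive result given the child's true state.
        p_pos = t["sens"] if needs else 1.0 - t["spec"]
        for result, p_result in (("pos", p_pos), ("neg", 1.0 - p_pos)):
            placement = rule[result]
            cost = COST[placement]
            if needs and placement == "mainstream":
                cost += MISS_PENALTY
            total += p_state * p_result * cost
    return total


def best_policy():
    """Return (expected cost, test name, rule) with the lowest expected cost."""
    candidates = []
    for test in TESTS:
        for on_pos, on_neg in product(COST, repeat=2):
            rule = {"pos": on_pos, "neg": on_neg}
            candidates.append((expected_cost(test, rule), test, rule))
    return min(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    cost, test, rule = best_policy()
    print(f"best test: {test}, rule: {rule}, expected cost: {cost:.3f}")
```

Changing the assumed penalty, prior, or test accuracies can change which policy comes out ahead, which is the point of the exercise: the "best" policy depends on how the costs of each kind of error are valued.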

Exercise 2 Design a Driver's Test

Sketch the design of an automobile driver's license examination for your state or country. Think about the following questions:

  • What are the claims you want to make about newly licensed drivers?
  • What are the relevant proficiencies? How are they related to each other and the claims?
  • What would constitute evidence of those proficiencies? How are the observable variables defined? How are they related to the proficiencies?
  • How can we structure situations to elicit that evidence?
  • What kinds of presentation environments are necessary to support that data collection?
  • What other issues do we need to worry about?

Exercise 3 Early Demotion

Assume that a school district implements an electronic grading system that uses a forecasting model to predict whether or not a student will receive a proficient rating on the annual state accountability test. About a month before the test, the school uses the forecasting model to find students for whom the probability of a proficient score is less than a given threshold (say 5%). The policy is then that those students not meeting the threshold are demoted to the previous grade and will be forced to repeat their current grade in the following academic year.

Unsurprisingly, this district's scores on the state accountability test go up. By using this policy, are they cheating on the intent of the accountability requirements?
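
The arithmetic behind the rising scores can be made concrete with a small sketch. The cohort below is entirely hypothetical (ten invented students, each with an invented forecast probability and an invented test outcome), and the function names are illustrative only. It shows that removing the students with forecasts below the threshold raises the proportion proficient among those tested, even though no student's outcome has changed.

```python
# Hypothetical cohort: (forecast probability of proficiency, actual outcome).
# All values are invented for illustration.
COHORT = [
    {"forecast": 0.02, "proficient": False},
    {"forecast": 0.03, "proficient": False},
    {"forecast": 0.04, "proficient": False},
    {"forecast": 0.30, "proficient": False},
    {"forecast": 0.50, "proficient": True},
    {"forecast": 0.60, "proficient": False},
    {"forecast": 0.70, "proficient": True},
    {"forecast": 0.80, "proficient": True},
    {"forecast": 0.90, "proficient": True},
    {"forecast": 0.95, "proficient": True},
]


def proficiency_rate(students):
    """Proportion of the given students who score proficient."""
    return sum(s["proficient"] for s in students) / len(students)


def apply_demotion(students, threshold=0.05):
    """Keep only students whose forecast meets the threshold.

    Students below the threshold are demoted and do not sit the test.
    """
    return [s for s in students if s["forecast"] >= threshold]


if __name__ == "__main__":
    tested = apply_demotion(COHORT)
    print(f"rate without policy: {proficiency_rate(COHORT):.2f}")
    print(f"rate with policy:    {proficiency_rate(tested):.2f}")
    print(f"students tested:     {len(tested)} of {len(COHORT)}")
```

The number of proficient students is the same in both calculations; only the denominator shrinks.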

Exercise 4 Assessing Knowledge of the Scientific Method

Consider a science assessment that attempts to measure a student's ability to apply the scientific method. Assessing the application of the scientific method requires a domain of application, and it is difficult to envision a task that would not require both domain knowledge and knowledge of the scientific method.

What other kinds of tasks need to be on this assessment to be able to draw inferences about the examinee's ability to apply the scientific method?

Exercise 5 Portfolio Assessment

Consider a task in which a studio artist submits a portfolio of work that attempts to meet a goal specified in the problem statement. The submitted portfolios will each be rated by several raters. Making whatever assumptions you like about the proficiency model, design the evidence model for this assessment. Address the following issues:

  • What are the relevant proficiencies?
  • What are the possible observables?
  • How do they relate?
  • What about rater effects?

Page last modified on May 12, 2009, at 09:33 AM