Authentic Assessment Toolbox
created by Jon Mueller

What is Authentic Assessment? Why Do It? How Do You Do It?

What should I assess on the test?

What do you want your students to learn? You have identified the important knowledge and skills in your goals, standards, and objectives. Always return to those statements before you consider what to teach or assess. That applies to quizzes or tests covering sections, chapters, units, quarters, or semesters. The content named in your subject-area standards and the skills identified in your process standards define the domain that is to be taught, learned, and tested.

  Should I be Assessing Standards or Objectives?

That depends on how broad or narrow your assessment is. If you are testing only to see whether students have mastered the material in one section or a couple of days of class, then you probably want to know whether they mastered certain objectives. Normally, however, your assessment should focus on standards. A central principle of the standards-based reform movement has been that we as educators have focused too much on the minutiae of the curriculum at the expense of broader, more substantial goals. By teaching to and assessing the broader and more complex competencies described in standards, we will emphasize and develop deeper learning (Newmann & Wehlage, 1993; Wiggins, 1998). If, by the end of a unit, quarter, or semester, your students have mastered the content and skills described in your standards, you are unlikely to be too concerned that they have not mastered every specific objective. Thus, most assessments, particularly those covering more material, should focus on measuring student progress towards the standards.

  Representative Sample of Items (Questions) 

Even if you are not trying to assess every concept taught, covering all the substantial learning from a unit, quarter, or semester can take a prohibitive amount of time. Thus, most tests assess a representative sample of the content domain. The teachers who construct the tests are normally responsible for determining what counts as a representative sample. To make sure a sample of test questions is sufficient and representative, teachers sometimes create a matrix of standards (or objectives) against the level or type of skill required. This matrix is often called a Table of Specifications. For example, here is a Table of Specifications for a section in a statistics course using state/department standards.


Table of Specifications for Statistics Test
(M-C = multiple-choice items; C-R = constructed-response items)

                                                                 Level of Skill Required
Standard                                                         Definitions  Comprehension  Application  Analysis  Problem-solving  Total
Read and interpret tables, graphs, and charts                    -            2 M-C          2 M-C        8 M-C     -                12
Represent and organize data by creating lists, charts…           -            2 M-C          6 M-C        -         2 C-R            10
Analyze data using mean, median…                                 2 M-C        2 M-C          2 M-C        8 M-C     1 C-R            15
Predict and test reasonableness from data using interpolation…   2 M-C        6 M-C          2 M-C        2 M-C     1 C-R            13
Total                                                            4            12             12           18        4                50 items


In the above example, the teacher would check to see if her test adequately covered her standards by asking questions such as:

- Which standards/objectives do I want to assess with this test? (Every standard should be assessed in some manner, but an occasional objective may be taught without being assessed; some of the standards/objectives within this domain may be assessed through other means.)

- Have I included questions for all the standards being tested?

- Have I included questions that assess the most critical elements of the standards?

- Does the distribution of items across the standards reflect the importance I attached to the different standards and that I communicated to my students?

- Do I have a sufficient number of items for each standard?
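Some of these checks are simple counting, so they can be sketched in a few lines of Python. The sketch below is only an illustration, not part of the original toolbox: it stores the statistics Table of Specifications as item counts per standard and skill level, then totals the rows and columns. The names TABLE, row_totals, column_totals, and under_sampled are hypothetical, the standard names are shortened, empty cells are assumed to be 0, and the cell positions for the first two standards are inferred from the table's row and column totals. The minimum of 12 items in the last line is an arbitrary value chosen for demonstration.

```python
# Skill levels (columns) of the Table of Specifications.
LEVELS = ["Definitions", "Comprehension", "Application", "Analysis", "Problem-solving"]

# Items per standard, one count per level in LEVELS order.
# Assumption: empty cells are 0, and the cell positions for the first two
# standards are inferred from the table's row and column totals.
TABLE = {
    "Read and interpret tables, graphs, and charts": [0, 2, 2, 8, 0],
    "Represent and organize data": [0, 2, 6, 0, 2],
    "Analyze data using mean, median": [2, 2, 2, 8, 1],
    "Predict and test reasonableness from data": [2, 6, 2, 2, 1],
}


def row_totals(table):
    """Number of items assessing each standard."""
    return {standard: sum(counts) for standard, counts in table.items()}


def column_totals(table):
    """Number of items at each skill level, in LEVELS order."""
    return [sum(column) for column in zip(*table.values())]


def under_sampled(table, minimum_items):
    """Standards with fewer items than a teacher-chosen minimum."""
    return [s for s, total in row_totals(table).items() if total < minimum_items]


if __name__ == "__main__":
    for standard, total in row_totals(TABLE).items():
        print(f"{total:3d}  {standard}")
    for level, total in zip(LEVELS, column_totals(TABLE)):
        print(f"{total:3d}  {level}")
    # The minimum of 12 is a hypothetical threshold, not a recommendation.
    print("Below minimum of 12:", under_sampled(TABLE, 12))
```

A count like this answers only the questions about coverage and distribution. Whether the items assess the most critical elements of each standard still requires the teacher's judgment.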


What is a Sufficient Number of Items per Standard?

Because selected-response test items (e.g., multiple-choice) leave considerable room for guessing, quite a few questions are needed to address each standard. How many items are needed depends on the breadth of the standard, the type of item, and how critical that standard is to determining whether students have mastered that section's, chapter's, or semester's content. At least ten to fifteen multiple-choice items are likely needed to provide an adequate representative sample of the domain of a standard. Even then, multiple and varied assessments will give you a more accurate picture of how well students have met the standard (Wiggins & McTighe, 1998). So, a selected-response test will probably be just one source of evidence.
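To see why guessing matters more when a standard is covered by only a few items, here is a rough worked calculation in Python. It is a sketch under stated assumptions, none of which come from the text above: every item has four options, a student guesses blindly and independently on every item, and "passing" a standard means answering at least 70% of its items correctly. The function names are hypothetical.

```python
from math import comb


def items_needed(n_items, cutoff_percent):
    """Smallest number of correct answers that meets the cutoff (integer ceiling)."""
    return (n_items * cutoff_percent + 99) // 100


def chance_of_passing_by_guessing(n_items, n_options=4, cutoff_percent=70):
    """Binomial probability of reaching the cutoff when every answer is a blind guess.

    Assumptions (illustrative only): equal-probability guessing among
    n_options choices, independent items, and a percent-correct cutoff.
    """
    p = 1 / n_options
    needed = items_needed(n_items, cutoff_percent)
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(needed, n_items + 1)
    )


if __name__ == "__main__":
    for n in (2, 5, 10, 15):
        print(n, chance_of_passing_by_guessing(n))
```

Under these assumptions, a student who guesses on a two-item standard gets both right about 6% of the time, and reaches four of five on a five-item standard about 1.6% of the time. The chance keeps shrinking as items are added. Real students usually guess with partial knowledge, so these figures understate how much a short test can flatter them.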

Does the Level of Understanding/Application Asked for in the Test Questions Match the Level Stated in the Standards?

To answer this question, look at the verb phrase in the relevant standards. If, for example, you have asked students to define, state, identify or recognize, then you are asking them to develop knowledge about the subject matter and not much else (Bloom et al., 1956; see Anderson & Krathwohl, 2001, for a revision of Bloom's taxonomy of cognitive objectives). Consequently, your test questions should try to determine whether students have acquired definitions, can recognize that certain things go together (without necessarily understanding why), and can list, recognize or recall certain facts.

If, instead, you have asked students in your standards to be able to explain, apply, analyze, interpret, or compare and contrast (comprehend, apply and analyze in Bloom's Taxonomy), then you are expecting more than the acquisition of knowledge. Therefore, you need to write test questions that require these higher-order uses of the concepts. (The remaining two categories of objectives in Bloom's Taxonomy, synthesis and evaluation, are extremely difficult to capture through selected-response items, and, thus, are best left for other types of assessments.) For example, if your standard states that "students will explain the causes and consequences of the Civil War," it is not sufficient for the students to recall or recognize names, dates and facts about the War on a test to assess that standard. Furthermore, it is not sufficient for students to be able to pick out a cause from among alternatives on a multiple-choice test. In such a question the students have not demonstrated that they can explain the causes and consequences, which requires a more substantial understanding of the subject matter.

In other words, it is not enough to say that you taught concepts X, Y and Z and your test covers concepts X, Y and Z. You need to look back at your standards to see what you expect your students to know and be able to do with those concepts, and develop a test that addresses those competencies.
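The verb check described above can be roughed out in code as a first pass over a list of standards. This is a minimal sketch, not an established tool: the two verb sets contain only the verbs named in this section, the function name expected_level is hypothetical, and simple word matching will misfire when a word such as "state" or "list" appears as a noun.

```python
import re

# Verbs taken from the discussion above; extend the sets for your own standards.
KNOWLEDGE_VERBS = {"define", "state", "identify", "recognize", "list", "recall"}
HIGHER_ORDER_VERBS = {"explain", "apply", "analyze", "interpret", "compare", "contrast"}


def expected_level(standard):
    """Classify a standard by the verbs it contains (crude word matching)."""
    words = set(re.findall(r"[a-z]+", standard.lower()))
    if words & HIGHER_ORDER_VERBS:
        return "higher-order"
    if words & KNOWLEDGE_VERBS:
        return "knowledge"
    return "unclassified"


if __name__ == "__main__":
    print(expected_level(
        "Students will explain the causes and consequences of the Civil War"
    ))
```

A standard flagged as higher-order is a reminder that recall or recognition items alone will not assess it. The classification itself says nothing about whether a particular test question actually demands that level of thinking.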

In the section on assessing more than factual knowledge, I describe some ways to assess more substantial understanding of concepts through multiple-choice items and give examples.


Copyright 2018, Jon Mueller. Professor of Psychology, North Central College, Naperville, IL. Comments, questions or suggestions about this website should be sent to the author, Jon Mueller, at