Classroom Assessments

One of the most troublesome challenges of classroom instruction is the classroom test. Some teachers create their own tests and others use “book tests.” But the concern among principals -- and complaints among parents and students -- remains: Are we testing what we’re teaching, and are we using valid tests?

CLASSROOM TESTS COMPARED TO OTHER TESTS
The purpose of the classroom test is to measure student mastery of specific standards. This is the level of test for which the classroom teacher is most responsible, and its results are the most immediate and helpful to the teacher. These tests reflect not just the content of the standards but also their cognitive demand. Classroom tests are specific to what has just been taught, and they are diagnostic as to individual student needs. But these tests should not be about "points earned" toward a passing grade. They reflect "standards mastered" toward a composite picture of performance learning. Publishers’ tests, unfortunately, refer to the content of standards but not necessarily the cognitive demand. EdFOCUS consultants show teachers how to create classroom test items that not only reflect the content and cognitive demand of the standards but also -- and this is most important -- use formats and item designs that are parallel to the state’s high-stakes tests.
But classroom tests are only one side of the assessment coin. The other side is the array of commercial tests that are external to the classroom. A few of these are listed below.
- High-Stakes Tests. The purpose of high-stakes tests is to measure the levels of academic performance on state-level content standards. The scores compare students to their age-mates within a district, a state, or even the country. Examples include the MAP Test (assessing Math and Reading); the AIR Test (from the American Institutes for Research); the PARCC Test (from the Partnership for Assessment of Readiness for College and Careers); and various State Proficiency Tests. The states use these tests to rate districts in terms of the number of students who are proficient. Additionally, these test scores reflect each student’s “AYP” (or Adequate Yearly Progress) to determine the direction of his or her individual growth.
- Commercial Diagnostic Tests. These tests identify individual student strengths and weaknesses within a set of specific academic skills. The scores inform teachers where and how to intervene with each student to improve proficiency. Examples include the Stanford Diagnostic Reading Test, the Key Math test, and DIBELS (Dynamic Indicators of Basic Early Literacy Skills).
- Benchmark Tests. Many districts also use semester or quarterly Benchmark tests. These tests determine student mastery of standards taught in a particular quarter. Collectively, they represent a year’s worth of mastery.
- Standardized Achievement Tests. These tests determine student mastery of a select array of skills in comparison to other students their age around the state or the country. Examples include the ACT, the SAT, the TerraNova, and the Iowa Tests of Basic Skills.
- The National Assessment of Educational Progress Test -- NAEP. This test is developed and normed by the National Center for Education Statistics (NCES), a division of the Institute of Education Sciences (IES) in the U.S. Department of Education. It is administered in each state every two years and measures student proficiency in Math and Reading at grades 4 and 8.
THE STRUCTURE OF TRADITIONAL CLASSROOM PAPER-PENCIL TESTS
If districts are actually data-driven -- and do not just SAY they are -- their classroom assessments fall at two levels. The first level is FORMATIVE (interim or short-cycle tests) to determine immediately whether re-teaching is needed and, if so, what must be re-taught. The second level is SUMMATIVE (or end-of-unit assessments) to determine student mastery at the end of instruction. Teachers may call their summative tests Chapter Tests or Unit Tests.
In some cases, teachers use the tests published by their textbook companies. In other cases, they create their own. Either way, there are several important considerations in the selection or creation of classroom tests:
1. The test items must be validly constructed to measure the Unit standards -- not just the topic but the level of rigor required too.
2. The test items must actually reflect what has been taught -- again at the level of rigor required in the standards.
3. The tests must parallel the formats students will see on their high-stakes tests. These are (1) Multiple Choice, some of which have more than one correct answer; and (2) Constructed Response items that require students to show they can extend the concept beyond a classroom situation. These high-stakes test formats may involve students...
-Conducting error analysis
-Citing text detail from a document
-Showing their work
-Writing an explanation
-Drawing a diagram or making a graphic
-Interpreting data
-Comparing two or more documents
In our experience, very few teachers have had training in how to construct or select effective tests. And now with the current emphasis on rigorous academic standards, the challenge is even greater. EdFOCUS helps teachers construct valid test items that reflect the content standards. The training includes:
- How to decide which type of test item is the most valid for the standard being measured -- multiple choice, constructed response, or both.
- How to unpack each standard in terms of its content and its cognitive demand.
- How many test items are needed to determine mastery of a standard.
- Whether multiple standards can be assessed with the same test items.
- How to build Multiple Choice items including how to construct valid stems and how to devise diagnostic distractors so they reveal specific “misunderstandings” that help pinpoint the exact need for intervention.
- How to design Constructed Response items that require students to construct meaning for themselves. These questions must reflect the level of rigor required by the standards assessed. It is important that teachers pre-write expected answers that indicate what needs to be in the response as well as how the student's answer reveals any "misunderstandings."
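The idea of diagnostic distractors can be illustrated with a minimal sketch. Everything in it is hypothetical and offered only as an illustration: the sample item (47 + 38), the answer choices, the misunderstanding labels, the student names, and the function name diagnose are assumptions, not EdFOCUS materials. It simply shows how tagging each wrong answer with the misunderstanding it is meant to reveal lets a teacher tally which interventions a class needs.

```python
from collections import Counter

# Hypothetical multiple-choice item: 47 + 38 = ?
# Choices: A) 75   B) 85   C) 715   D) 9
ANSWER_KEY = "B"

# Each distractor is tagged with the misunderstanding it is designed to reveal.
# These labels are illustrative assumptions, not taken from any published test.
DISTRACTOR_DIAGNOSES = {
    "A": "added the columns but dropped the regrouped ten",
    "C": "wrote each column sum without regrouping",
    "D": "subtracted instead of adding",
}


def diagnose(responses):
    """Tally the misunderstandings revealed by students' wrong answers."""
    tally = Counter()
    for choice in responses.values():
        if choice != ANSWER_KEY:
            tally[DISTRACTOR_DIAGNOSES.get(choice, "unrecognized response")] += 1
    return tally


# Hypothetical class responses (student name -> chosen letter).
responses = {"Ana": "B", "Ben": "A", "Cal": "A", "Dee": "C", "Eli": "B"}

for diagnosis, count in diagnose(responses).most_common():
    print(f"{count} student(s): {diagnosis}")
```

In this made-up class, the most common wrong answer points to a regrouping problem rather than a general weakness in addition, which is the kind of pinpointing a well-built distractor is meant to provide.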
THE NECESSITY OF PERFORMANCE OR AUTHENTIC CLASSROOM ASSESSMENTS
In contrast to individual steps in learner outcomes (such as “add 2-digit numbers with regrouping”), the current, more rigorous content standards are holistic and performance-based (such as “solve real-world math problems involving addition of 2-digit numbers with regrouping”). That is, students are expected to actually apply what they have learned in the classroom to scenarios from daily living. The mastery of performance standards cannot be determined solely by traditional tests. To verify independent and enduring mastery, the assessments must be authentic, life-based, and parallel to the standards themselves.

The idea of performance assessments is not new. It’s been the mainstay of Career and Technical schools for three decades, and the professions of law, medicine, and even plumbing have always used performance to determine competence. To “grade” or evaluate the quality of performance, teachers should use a Rubric or checklist of criteria taken from the standards being measured. EdFOCUS has seen firsthand the negative results of districts purchasing books of scoring Rubrics that do not match the standards.

EdFOCUS consultants provide teachers with the rationale behind authentic assessments and offer several sample formats. These include a variety of original written products, original math problems, error analyses, and the deep-level on-demand analyses of unfamiliar texts and documents. Rather than starting from scratch, teachers are also provided actual performance assessments they can adapt. EdFOCUS is proud to have samples at all grade levels and subject areas. A few are listed below:
- Math: Given the purchase of a used boat for $20,000 and a depreciation factor of 15% per year, the student writes an exponential “depreciation” model to represent the value of the boat after 3.5 years.
- English/Language Arts: The student develops the pros and cons of a position (e.g., school uniforms) and presents them as a “talking head” in a 3-minute newscast to air on FOX or CNN.
- Science: The student describes the activities of the cell by creating a graphic organizer to show how each activity in the cell process works together for the benefit of the individual cell as well as the overall organism.
- Social Studies: The student devises a lesson plan to teach younger children about the 1930s that includes (a) the Great Depression; (b) the Dust Bowl; (c) the New Deal; and (d) an impact or a lesson learned for our own times. The lesson plan includes a presentation “script,” visuals, and at least one student handout.
For each authentic or performance assessment they add to their Units, teachers are provided help in creating a scoring Rubric -- drawn from the standards reflected by the project.
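As an illustration of a pre-written expected answer, the Math sample above (a used boat bought for $20,000 that depreciates 15% per year) might be worked as the sketch below. It assumes the standard exponential model V(t) = P(1 - r)^t with the 15% rate applied to fractional years as well; the function name boat_value and its parameter names are hypothetical.

```python
def boat_value(years, purchase_price=20_000, annual_rate=0.15):
    """Exponential depreciation model: V(t) = P * (1 - r) ** t.

    Assumes the annual rate compounds smoothly, so fractional
    years (such as 3.5) are allowed in the exponent.
    """
    return purchase_price * (1 - annual_rate) ** years


# Value of the boat after 3.5 years under the assumptions above.
print(f"Value after 3.5 years: ${boat_value(3.5):,.2f}")
```

Under these assumptions the boat is worth roughly $11,300 after 3.5 years. A rubric for this task would look not only at the final figure but also at whether the student wrote the model itself correctly.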
