Voorheesville Central School District
School District News

 
 

What's behind a scaled score

Superintendent Dr. Teresa Thayer Snyder


(August 18, 2010) I admit that I am a long-time skeptic as to the value of some of the New York State testing regimen, especially at the elementary level. I often tell people about my experiences as an elementary school principal when the tests were first initiated for fourth and eighth graders. The nine-year-olds took ELA exams, Math exams, and Science exams. These exams were designed to give us hard data about student performance. I observed levels of stress among the students and collected my own hard data—how many fourth graders visited the school nurse during testing weeks compared to the rest of the year. You can probably guess that visits spiked remarkably. I decided at the end of one testing cycle to surprise my fourth graders with an ice cream party. When I announced that all fourth graders should come to the cafeteria, one little boy asked his teacher if he should bring his number 2 pencil!

A few years later, the grades 3-8 tests were initiated. I admit my skepticism remained intact, but when it was announced that all children would be proficient by 2014, I actually became prophetic. “Well, what that means is the scaled scores will increase gradually to give the illusion that there is growth towards attaining that lofty goal of 100% proficiency, despite the fact that the lofty goal makes no sense at all.” You do not need much of a background in educational statistics to realize that if you give such a test and everybody passes, it is a bad test, just as it is a bad test if everybody fails. You may recall this occurred a few years back on the Regents Math exam, which was re-scaled because so many students failed it. To say all will be proficient means the results have to be statistically tweaked to accomplish this. It is rather like the Lake Wobegon effect, where all the children are above average.

Now we have a new dilemma. In order to prove that New York State tests have been soft, this year’s versions were bumped up and moved to the end of the year. We educators were told that these assessments would be testing a good deal more than previous ones. Once they were completed and sent off to the State, the State announced that a new cut point had been implemented for determining levels of performance. I am sure you have read the local press regarding the shock that school people and parents will be feeling when they see that a child’s performance level on the State exams has been seriously affected by the new cut point. But again, that old skepticism of mine inched forward. One of my administrators pointed out an irregularity. On a third grade math test, a child who answered 37 out of 39 questions correctly (that is, about 95% correct) was deemed to be performing at level 3. A child who answered 85% of the questions correctly was deemed to be performing at level 2. Some children who were performing at level 1 actually had scaled scores that would have put them at level 3 last year. I scratched my head over this and wondered if this was an aberration on the grade 3 math assessment. As I began to dig through the data, I found that it was not an aberration, but was true across all tests, in both ELA and math. I contacted a couple of colleagues, who advised me to remember that you can’t compute percentages and compare them to scaled scores because test items are weighted by difficulty. So I ran an error analysis, and I learned that the errors the students made were random (especially for higher-scoring students), which suggests item difficulty did not impact their outcomes.
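The percent-correct figure quoted above is simple arithmetic, and a minimal sketch of it follows. The function name `percent_correct` is hypothetical and is used only for illustration; the only data taken from the text is the example of 37 correct answers out of 39 questions. It says nothing about how the State converts raw scores to scaled scores.

```python
def percent_correct(num_correct, num_items):
    """Return the share of test items answered correctly, as a percentage.

    Hypothetical helper for illustration; this is not the State's scoring method.
    """
    if num_items <= 0:
        raise ValueError("num_items must be positive")
    if not 0 <= num_correct <= num_items:
        raise ValueError("num_correct must be between 0 and num_items")
    return 100.0 * num_correct / num_items


# The grade 3 math example from the text: 37 of 39 questions correct.
print(round(percent_correct(37, 39), 1))  # 94.9, which rounds to the 95% cited
```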

Since psychometrics is not my passion, I decided to consult a guru of psychometrics whom I met several years ago. Dr. W. James Popham is professor emeritus of statistics and educational measurement at UCLA. He is an authoritative author and a renowned scholar in this field. When I emailed him my dilemma and asked if I was on track in looking at these data sets differently, he generously responded: “One of the problems with scaled scores is that, although their potentials for analysis are considerable, it is really impossible to make any sense out of them. Thus, when you use a ‘percent-correct’ prism in an attempt to interpret the meaning of your school’s scaled scores, this is a really sensible thing to do. I wish more educators would be sensible!”

I have continued to correspond with him as I have dug deeper, and his response has been consistent: “Your analysis is one way of trying to figure out on your own what’s meant by these mystical numbers.” Dr. Popham has written a book I recommend for parents called Testing! Testing! What Every Parent Should Know About School Tests.

As I have been analyzing the data sets through this new prism, I am finding that the cut points were adjusted to force more students into levels 3 and 2, even though those students answered large percentages of the questions correctly. The handful of children who scored less well are children we are already working with to help them achieve. But I have to be honest: I simply cannot look a parent in the eye and say your child is performing at level 3, despite having answered 95% of the test questions correctly. On many of the tests, a student can make no more than one error and still achieve level 4. I have decided that, given this different view of the data sets, when we receive the parent report to mail home, I will be attaching a label which will tell parents the percentage of items their child answered correctly. I hope this will make the testing agenda more transparent and will ease undue anxiety about student learning. I am all for rigor and I am surely in favor of educational reform, but I don’t want it carried on the backs of school children. From my point of view, you don’t make a poor assessment stronger by making it harder to pass; you reform the assessment.
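For illustration only, the label described above could be produced along these lines. The function name `parent_label` and the wording of the label are assumptions; the text specifies only that the label reports the percentage of items a child answered correctly.

```python
def parent_label(num_correct, num_items):
    """Build a one-line label stating the percentage of items answered correctly.

    Hypothetical helper; the label wording is an assumption, not the district's text.
    """
    if num_items <= 0 or not 0 <= num_correct <= num_items:
        raise ValueError("counts are out of range")
    pct = 100.0 * num_correct / num_items
    # Round to a whole percent so the label stays easy to read.
    return (
        f"Your child answered {num_correct} of {num_items} "
        f"questions correctly ({pct:.0f}%)."
    )


# Using the 37-of-39 example mentioned earlier in the article.
print(parent_label(37, 39))
# Your child answered 37 of 39 questions correctly (95%).
```

Reporting the raw counts alongside the percentage lets a parent see both the number of questions and the share answered correctly, without reference to the scaled score.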

Respectfully submitted,
Dr. Teresa Thayer Snyder, Superintendent


This page is maintained by the District Webmaster according to the Web publishing guidelines of Voorheesville Central School District. Copyright © 2006-07. All rights reserved. Produced and maintained in cooperation with the Capital Region BOCES Communications Service.