Smart Guys Can Get It Wrong

You know that feeling when one of your friends says something stupid that makes a different friend angry?  And you realize that you agree with the angry friend?  Yeah, so that happened to me a few weeks ago.  Let me explain.

I am a big fan of the FiveThirtyEight blog that Nate Silver runs with his team of statisticians/economists.  Nate has a knack for explaining technical mathematical stuff using everyday examples.  He started in sports and moved to politics (correctly predicting most of the 2014 races), and then ESPN brought his blog back over to their site where it lives now.  When he sticks to those two topics–sports/politics–he is a bastion of logic in a world of opinions.

Lately, though, Silver has been dipping his toes into the realm of educational policy and the questionable data that supports some of the recent “reforms”.  In the recent article “The Science of Grading Teachers Gets High Marks”, Silver’s ed dude Andrew Flowers analyzes some of the discussion around the Vergara case.  He discusses the back and forth between statisticians at Harvard, Brown, and Columbia and Jesse Rothstein of UC Berkeley.

While I agree with Flowers that the arguments over methods for analyzing teacher impact are a positive sign that science is working as it should, I side with Valerie Strauss when she writes for the Washington Post that,

“The quality of the underlying standardized  assessment is assumed to be at least adequate — or why use the student scores to evaluate their teachers? — when, in fact, many of them are less than adequate to provide a well-rounded, authentic look at what students have learned and are able to do.”

Flowers provided this throw-away phrase that was guaranteed to make educators angry,

“In order to perfectly isolate the effect of a teacher on a student’s test scores — setting aside whether higher test scores is the right goal — students would need to be assigned to teachers randomly.”  [emphasis mine]

What?!?  How can you “set aside” the source of all of the data that you are analyzing (or, more accurately, discussing the analysis of)?  That’s like saying, “Setting aside the fact that koalas are not actually bears, observing them is a great way to learn about bear behavior.”  We MUST stop pretending that mathematical analysis can make up for crappy assessments.

Opinions?  You know what to do.


Image: “Friendly Female Koala” by Quartl. Own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons.


Wait, Isn’t Norm-Referencing Bad?


I was lucky enough to be in a day-long session last month about assessment and one of the statistical models that is used to measure the value added by teachers.  Before I start to vent, I want to admit that my statistical background is sketchy at best.  I am also in a particularly frustrated place due to the role that these models play in the determination of teacher “effectiveness”.

During the presentation, one of the most fundamental aspects of state-wide standardized test analysis was explained to my group and I was floored.  The metric that is used to determine if my students have shown growth (or academic improvement) is not their absolute score on the test.  It is their percentile rank that is expected to improve.  Students who improve at the same rate as the average in the state will have a growth index of zero.

This is akin to lining up all of the students in my state according to their scores.  If every student in this line moves five steps (or fifty steps!) forward, they have all made improvement.  By the simplest definition, they have all learned.

But, in the world of high-stakes testing, since none of the students moved “up in line” or improved relative to others, they have not improved.  As a teacher, I have failed them.  Even if they made five hundred steps of forward progress, their growth index is zero.
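To make the line-of-students picture concrete, here is a toy sketch in Python. It is not the formula my state actually uses (I don't know those details); it assumes a simple percentile rank (the percent of students scoring strictly lower) and treats the "growth index" as the change in that rank from one year to the next. The scores and function names are made up for illustration.

```python
def percentile_rank(scores, score):
    """Percent of students scoring strictly below `score` (an assumed, simple definition)."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)


def growth_index(before, after):
    """Assumed toy metric: change in each student's percentile rank between two years."""
    return [
        percentile_rank(after, a) - percentile_rank(before, b)
        for b, a in zip(before, after)
    ]


# Hypothetical scores for five students, in the same order both years.
last_year = [40, 55, 60, 72, 85]

# Case 1: everyone moves five steps forward.
this_year = [s + 5 for s in last_year]
raw_gains = [a - b for b, a in zip(last_year, this_year)]
rank_changes = growth_index(last_year, this_year)

# Case 2: everyone improves, but the third student gains 10 while the rest gain 20.
uneven_year = [60, 75, 70, 92, 105]
uneven_changes = growth_index(last_year, uneven_year)

print(raw_gains)       # [5, 5, 5, 5, 5]
print(rank_changes)    # [0.0, 0.0, 0.0, 0.0, 0.0]
print(uneven_changes)  # [0.0, 20.0, -20.0, 0.0, 0.0]
```

In the first case every student learned something, yet every rank change is zero.  In the second, the third student gained ten points and still dropped twenty percentile points, simply because the neighbors gained more.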

Perhaps the biggest reason that this frustrates me is that classroom teachers have been taught for decades that norm-referenced assessments have many weaknesses.  These assessments present students and teachers with a “moving target” since success depends on the performance of others.  Grading on the curve is widely regarded as an unfair assessment practice.

Yet, this is exactly what we are doing at the level of standardized tests and teacher effectiveness.  In the interest of continuous improvement, the bar moves from year to year. But, the result is shifting sands beneath our feet that rob educators of the ability to anchor their instructional goals to something concrete.  We are told, “Just keep teaching them the best you can and you’ll do fine.”

But if large groups of students elsewhere do better, my evaluation will be negatively affected.  In the world of norm-referenced standardized testing, a rising tide can actually sink many ships.  If we were pessimistic about the anti-collaborative effects of standardized testing before this realization, we should be downright fearful now.  As more educators realize how our effectiveness is calculated, corruption will become much more common and teachers will begin to make choices that are not in the best interest of students.

Am I overdoing it on the gloom and doom?  Let me know in the comments.


The Absence of Value

The always awesome Seth Godin posted a quick entry on his blog in January entitled “The cost of neutral”. He discusses an example from the business world, but it applies equally to education. Godin writes,

“Not adding value is the same as taking it away.”

What a powerful statement, particularly in the context of our current furor over “value-added” measures for teachers. It rings true for me in so many ways:

  1. As a teacher, if my students finish my class no better off than they would have been without me, then I have taken something from them… time and motivation.
  2. As a colleague, if I don’t actively participate in collaborative meetings and activities, my team is worse off than if I never attended.
  3. As a teacher leader, if I don’t provide professional development that is meaningful and useful to my staff, they would have been better off anywhere else.

These ring true, don’t they?