Amanda Ripley Author of The Unthinkable

In today’s New York Times, Diane Ravitch responds to David Brooks and other critics by hoisting well-worn foreign flags.

“No high-performing nation tests its students every year or uses student test scores to evaluate teacher quality.”

This is a point Ravitch makes again and again. I usually just glide right by it, since it comes wedged between so many other questionable claims and also some valid points. But since I just got back from visiting these high-performing nations, I must note that Ravitch’s version of reality does not match what I saw.

Everywhere I went, testing was absolutely embedded in the system. It took different forms, and in some places it was done more intelligently and more subtly than we do it, but it was always there. In South Korea, kids are tested in elementary, middle and high school. How do I know? Teachers, principals, students and the Education Minister told me so. It was not a secret.

Just to be clear: Korean kids, who score at the top of the world in international tests, take standardized tests administered by the Korean government to measure what students know—and identify which students and schools need more help. Yes, they do!

And guess what? The results of these tests are used to evaluate principal and school quality. Yes, they are!

What about teachers? Teachers are evaluated, too, using criteria that do not currently include test data but do include surveys of students, parents and other teachers about the effectiveness of the teacher. (And by the way, everywhere I went, I could find teachers and principals who complained about these evaluations, calling them unfair, just like teachers do here. It’s a small world after all.)

Now bear with me for a second: Ravitch is careful to use the caveat “every year.” And it’s true that Korean kids do not take standardized tests every year. Neither do American kids! Under federal law, our kids must be tested in grades 3-8 and at least once between 10th and 12th grade. That’s seven years out of 13. Is that too much? Probably. Should our tests be smarter? Definitely.

But to imply that tests are irrelevant in high-performing countries is misleading.

Even in Finland, which has the best schools in the world by multiple measures, tests are part of life. Are they annual, standardized tests, the results of which are made public? No, they are not, and teachers in Finland thank God for that. But make no mistake: the Finnish national government routinely and systematically tests samples of students around Finland to make sure that schools are meeting high standards.

And Finnish teachers told me that of course they test their students regularly, and they compare their students’ results with the results of their colleagues’ students to see what they need to work on. Of course they do. Why wouldn’t they? You don’t get to be high-performing without actually performing.

In reality, Korean and Finnish high schoolers obsess over one test in particular far more than most American kids ever will. In both countries, kids graduating from upper secondary schools must take an all-important, standardized, end-of-the-year test before they graduate. So tests are not only present; they are truly high-stakes in a way that they are not in most U.S. schools (where most tests are only high-stakes for the people who work there).

I believe in learning from high-performing nations. That’s why I am writing a book about it. In fact, I am convinced that these comparisons are a matter of economic and even moral urgency. And that’s why we have to do this work with great care and humility—as if we want our schools to be better more than we want to be right.

1

Justin Snider said on July 11, 2011 at 4:14 pm

I don’t get it. First you say that Ravitch is wrong, that what you found abroad contradicts what she says about annual standardized testing. Then you concede that neither South Korean nor Finnish kids are tested each year. Nor are they tested nearly as often as American students are. Nor are the tests as meaningless and easily manipulated as those we give our students day after day, year after year. You note that the Finnish government conducts tests of students around the country, but you fail to note how revolutionary this idea is and how different it is from what we do in the U.S. First, we typically require every school to test every student almost every year, rather than relying on representative sampling, which is far smarter. Second, the government is doing the testing in Finland, an important point because it can help ensure the validity of test results. You need look no further than D.C., New York City or Atlanta to see what happens when schools and teachers are responsible for testing their own students, and when district leaders are putting enormous pressure on teachers to raise test scores at all costs.

What so many policymakers and politicians in the U.S. don’t get is that the vast majority of our standardized tests are junk: the results tell us virtually nothing about what students know and can do. Other countries test their students in much smarter ways. The U.S. needs to move away from its over-reliance on multiple-choice tests, which are cheap and easy to grade but which obscure more than they reveal about students’ knowledge and skills.

Finally, your assertion that in the U.S. “most tests are only high-stakes for the people who work” in schools—not for students—isn’t accurate. Test scores are still not factored into most U.S. teachers’ evaluations (or tenure decisions), though such a future is probably not far off. And in many states, U.S. high school students DO face end-of-year tests that, if failed, can prevent them from earning diplomas.

Contrast what you appear to have found in Finland with what the Finnish Minister of Education told me in March: “Our educational society is based on trust and cooperation, so when we are doing some testing and evaluations, we don’t use it for controlling [teachers] but for development. We trust the teachers.”

http://www.huffingtonpost.com/justin-snider/keys-to-finnish-education_b_836802.html

Such trust is nowhere to be found in American education.

2

Amanda said on July 11, 2011 at 5:05 pm

Hi Justin,

I do think Ravitch is wrong. Kids are not tested every year here. She is suggesting that they are, which is clearly false. They are tested about 7 out of 13 years from K-12.

She is also suggesting that in high-performing nations, tests are not central to the education system. That is clearly not the case. I’ve never seen a school anywhere in America in which tests are as important as they are in South Korean schools. Kids in Finnish upper secondary schools are tested every 5 weeks by their classroom teachers.

I agree totally that our tests need to be smarter. In any high-performing organization, you need intelligent ways to measure how you are doing.

And the sampling that the Finnish national government does is not so revolutionary. (We have been doing national samples for many years through NAEP.)

And you are right, of course, that trust is like gold. Finland has it, and we need it. But I would argue that demagogues like Diane Ravitch (or Michelle Rhee for that matter) seek and destroy trust wherever they go.

3

Justin Snider said on July 11, 2011 at 5:29 pm

Let’s say that American kids are tested “about 7 out of 13 years from K-12.”

Fine. But let’s look at those years more carefully. Testing begins in third grade. Why? Because most psychometricians—not to mention psychiatrists, educators and parents—don’t believe that very young kids should be subjected to standardized testing, or that their results would be meaningful. So we start in third grade, but only because we can’t start in kindergarten. And then we test every year—typically in multiple subjects—for six years in a row. Show me another system that does that, please.

Arguably, we “only” test once in high school because most students are already deluged with other standardized tests (SAT I, SAT II, ACT, CLEP, AP—the list is long).

That we test so much is evidence, to me, that our tests are terrible. You don’t have to test nearly as often if the tests yield meaningful information about performance.

NAEP samples are great. In fact, given how unreliable state testing systems have proven themselves to be, it’s a wonder we bother with state tests at all. Why not just turn over all accountability testing to the feds, which would make it virtually impossible for states to game the system and mislead the public on proficiency rates?

If the point of testing is to know how the system as a whole is doing and to pinpoint where it’s falling short, there’s absolutely NO NEED to test every single student. This is the premise on which the Finnish system is built. A representative sample will suffice.

But the reason we won’t be moving to a system that samples representative student populations anytime soon is that we’re uniquely obsessed with using test results to make personnel decisions at the individual classroom level, despite mounds of research showing the unreliability of our test results.

Enter Campbell’s law: “the more any quantitative social indicator is used for social decision making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”

(see http://www.nytimes.com/2010/06/19/opinion/l19teach.html; I wrote about this in June 2010)

A final point that is too rarely made—but whose importance anyone with significant statistical training knows well: it’s very dangerous to use the results of a test designed for one purpose (say, to measure student achievement or learning) in ways for which it was not originally designed (say, to measure teacher performance or to dole out bonuses).

4

Stuart Buck said on July 12, 2011 at 10:32 am

Justin—

“If the point of testing is to know how the system as a whole is doing and to pinpoint where it’s falling short, there’s absolutely NO NEED to test every single student. This is the premise on which the Finnish system is built. A representative sample will suffice.”

The point of NCLB testing isn’t just to know how the system as a whole is doing, but to be able to identify schools (for example) that may look fine on average but in which minority groups are significantly behind, and to encourage such schools to get those groups up to a higher level. You can’t do that with NAEP-style sampling.

The point you make in the final paragraph seems to assume that student achievement has nothing to do with teacher performance, which implies that teachers are superfluous. What sort of test would be designed to measure “teacher performance” without looking at student achievement? Do you have in mind a test administered to the teachers themselves?

5

Karl Wheatley said on July 13, 2011 at 10:37 am

Average test scores are poor indicators of which educational “systems” are performing well for multiple reasons:

1) Some of what is on the tests does not really matter much, or matter at all. In general, relying on standardized, machine-scored tests means aiming low in education.

2) Some of what is on the tests does not itself matter but is an easy-to-test proxy for what really matters, and under conditions of high-stakes testing, the correlation between the proxy (decoding nonsense syllables) and the thing we really care about (reading comprehension) weakens. Put differently, high stakes directly lower the validity of test scores.

3) Much of what matters most is not on the tests.

4) Much of what matters most is directly undermined by teaching to tests; thus, higher scores often come at the expense of causing harm in other areas of learning and development.

5) Related to #4, some high-scoring countries, including the East Asian countries, have been very candid about the harm that pursuing test scores has caused to their systems, and China for one has been trying to escape the grip of testing.

6) Most subjects that matter are not being assessed.

7) Tests do not “measure” learning in the way we measure coffee or lumber. There is no real measurement in education, and teacher assessments are routinely as good as or better than standardized tests.

8) Faster gains simply do not signal better overall learning, just as faster short-term gains in the profitability of a company often come by doing things that cause more harm than good in the long run. Faster is often worse, not better, so the whole VAM enterprise is built on faulty assumptions.


9) Poverty/SES is by far the biggest factor in test scores, and when you control for poverty and make apples to apples comparisons, the U.S. is either number one or in the top five on numerous international tests, including the latest PISA. As with soldiers, teachers must be judged by what they accomplish with the mission they are given, and at least with respect to the narrow goal of test scores, the U.S. is already one of the high-performing systems.

By parsing Ravitch’s words you are missing the broader truth: our educational system is testing-obsessed. As someone who teaches teachers for grades PK-3, I can tell you that this obsession has trickled all the way down to preschool, so focusing on the literal mandate is a silly way to look at the dreadful PK-12 reality that has been created here.

Learning/teaching for the purposes of passing tests or getting rewards or avoiding sanctions fundamentally corrupts the entire educational enterprise. The book Collateral Damage provides a nice overview of the damage caused by test-driven schooling, while the book Making the Grades (from a long-time testing industry insider) provides troubling insights into just how unscientific the whole enterprise is.

For those not sure they wish to spend the money, here are two excerpts from the latter:

“However, if you knew what I knew about the [testing] industry, you would be aghast at the idea of a standardized test as the deciding factor in the future of even one student, teacher, district, or state. I, personally, am utterly dumbstruck by the possibility. The idea that education policy makers want to ignore the assessments of the classroom teachers who spend every day with this country’s students to instead hear the opinion of some testing company (often “for-profit”) enterprises in a distant state is, in my opinion, asinine. It is ludicrous.” (Farley, 2009, p. ix-x).

“Reliability,” Greg scoffed. “Don’t worry about the numbers. I can make statistics dance.” (Farley, 2009, p. 24).

Farley, T. (2009). Making the grades: My misadventures in the standardized testing industry. Sausalito, CA: PoliPoint Press.

Me: People who lack a systems understanding of education and who confuse education with manufacturing have taken over our schools, and the result has been a disaster.

6

zinat al hayat said on July 16, 2011 at 5:26 pm

I liked this a lot. The thread is superb.

7

Mirna Jope said on August 02, 2011 at 9:18 am

I’ve been a credentialed teacher in California since 1983. In the mid-’90s, I was required to give standardized tests to my first-grade students (a requirement that, thankfully, was discontinued after a few years). In all the years that I have taught, students in grades 3-11 have been required to take standardized tests that disrupt the usual classroom routines for at least two weeks out of the year. So it’s a little disingenuous to use your statistics the way you do.
As others wrote, there are many ways to test and assess, and the current focus is not the best way to measure what students are capable of.
