Friday, November 9, 2007

Is this a wake-up call for the people who work there? You betcha.

Earlier this week, Mayor Michael Bloomberg flexed his muscles by threatening to close F schools as early as June. He quipped, "Is this a wake-up call for the people who work there? You betcha."

After analyzing these data, I've concluded that the people in need of a wake-up call work not at F schools, but at the NYC Department of Education. Undoubtedly, data can and should be used for organizational learning and school improvement. But if we're going to rank and sort schools - an action with serious consequences for the kids, educators, and parents affected - the Department of Ed's methods should meet the standards to which statisticians and quantitative social scientists hold themselves. Needless to say, NYC's report cards do not.

There are five reasons the report cards might kindly be called statistical malpractice:

1) Ignoring measurement error

Measurement error isn't sexy and won't attract the attention of journalists and commentators. But it may be the central downfall of the NYC report card system. For example, elementary school PS 179 (score=30.9) got an F, while PS 277 (score=31.06) got a D. Similarly, Queens Gateway to Health Sciences Secondary School (score=65.21) got a B, while IS 229 (score=65.22) got an A.

If we acknowledge that these overall scores are measured with error, a school scoring a 65.21 is not statistically distinguishable from one scoring a 65.22 (a difference of .0007 standard deviations). And Mayor Bloomberg is threatening to close PS 179 this year and keep PS 277 open over a difference of .16? (See the grade brackets in the table below to see how close your school was to earning a higher or lower grade.)
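To make the point concrete, here's a minimal sketch of the comparison. The scores are the real ones above, but the standard error is an assumed placeholder - the Dept of Ed publishes no standard errors for its overall scores, which is part of the problem:

```python
import math

def distinguishable(score_a, score_b, se=2.0, z=1.96):
    """Two-sample z-test on the difference between two school scores.

    `se` is a hypothetical standard error for each school's overall
    score; the Dept of Ed publishes no such figure, so this value is
    an assumption made for illustration.
    """
    se_diff = math.sqrt(se**2 + se**2)  # SE of a difference of two scores
    return abs(score_a - score_b) / se_diff > z

# Queens Gateway (B, 65.21) vs. IS 229 (A, 65.22)
print(distinguishable(65.21, 65.22))  # False -- not distinguishable
# PS 179 (F, 30.9) vs. PS 277 (D, 31.06)
print(distinguishable(30.9, 31.06))   # False -- not distinguishable either
```

Note that the conclusion is not sensitive to the assumed standard error: any plausible value leaves these pairs of scores statistically indistinguishable.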

2) Arbitrary grade distributions and cutoffs

Initially, the Dept announced a curve on which schools would be graded, but it has since curiously changed the distribution of grades and created different distributions for elementary schools, middle schools, high schools, and K-8s. Why should 25.26% of middle schools get As while only 21.72% of elementary schools do? By the same token, why should 5.12% of high schools get Ds while 9.69% of middle schools do? It's not that the Dept has set criterion-referenced score cutoffs for attaining these grades, as the table above demonstrates - so what's going on here? The Dept of Ed needs to release more information about why this distribution of grades was chosen, and why it differs by school level.

For example, there are more A/B middle schools than A/B elementary schools - does this mean that NYC's middle schools are "better"? The table below shows the percentage of schools receiving each grade for each school level, as well as the number of schools receiving that grade.

* For high schools, the denominator does not include schools with grades "under review."

3) 6-12 schools grade discrepancies

Schools serving grades 6-12 got two grades - one for grades 6-8, and one for grades 9-12. Same school, same principal, same roof. Yet of the 33 schools with both a middle school and a high school grade available, 22 received different grades - and sometimes the differences are substantial.

Consider the Academy of Environmental Science - its high school got a C, but its middle school got an F. At Hostos Lincoln Academy of Science, the middle school got a D, but the high school got a B. At the Bronx School for Law, Government, and Justice, the middle school got an F, but the high school got a C.

4) Poorly constructed comparison groups

As I've written here, the Dept of Ed flubbed the comparison groups by treating the percent African-American and percent Hispanic as interchangeable (i.e. a school with 59% Hispanic and 1% African-American is a perfect match for a school with 59% African-American and 1% Hispanic.) In addition, the Dept did not consider the proportion of Asian students when creating comparison groups; schools with higher proportions of Asian kids were more likely to get As and Bs, and there's no reason to believe that Asian kids in NYC have access to much higher quality schools. It's more likely that Asian kids grow academically at a faster rate because of things that happen outside of school.

Until the Dept releases the comparison groups, it is difficult to know how bad these comparisons are - so stay tuned.

5) Problems with growth models: Interval scaling and ceiling effects

I'm all for growth models, but you can't treat 1 unit of growth at the bottom of the distribution (i.e. moving from 13 to 14 on a 100 point scale) the same as 1 unit of growth at the top of the distribution (i.e. moving from an 89 to a 90). Put formally, the Department of Ed's model assumes that tests are "interval scaled," but they are not. Similarly, if a student is scoring near the top possible score of a test (the ceiling), there is very little room left to grow. One can address this problem by weighting growth at different parts of the distribution differently, but the Dept chose not to do this.
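Here's one way the weighting fix could look. The cutoff and weights below are invented for illustration - the Dept's actual model applies no such weights, which is the complaint:

```python
def score_weight(pre_score):
    """Hypothetical weight per raw point of growth, by starting score.

    The cutoff (75) and the weights (2.0 vs. 1.0) are made up; the point
    is only that a point of growth near the ceiling, where headroom is
    scarce, can be credited more heavily than a point in the middle.
    """
    return 2.0 if pre_score >= 75 else 1.0

def weighted_growth(pre_score, post_score):
    return (post_score - pre_score) * score_weight(pre_score)

# Under the Dept's raw model, 13 -> 14 and 89 -> 90 count identically
# (one point each); under this weighted sketch, the near-ceiling gain
# counts for more.
print(weighted_growth(13, 14))  # 1.0
print(weighted_growth(89, 90))  # 2.0
```

A real implementation would derive the weights empirically (e.g. from the observed distribution of gains at each starting score) rather than asserting them, but the mechanics are this simple.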

Hopefully, folks who care about public education in NYC will issue a wake-up call to the Dept of Ed and demand that these problems be fixed before vital decisions are made about schools based on highly questionable methods.

Thursday, November 8, 2007

Cool teachers you should know: Katie Hillstrom, Winchester High School

Katie Hillstrom is a 7th year English teacher at Winchester High School in Winchester, Massachusetts. According to her nominator, she is well known in her school for the infectious energy she brings to everything she does, whether teaching a unit on Walden (her lesson plans on Walden are available here and have some fun ideas and activities) or engaging her students in service-learning projects.

Of particular note is an exchange Ms. Hillstrom organized between Winchester High School (in suburban Boston) and Boston's English High School. Through this project, two ninth grade honors English classes at Winchester High School (WHS) collaborated with two classes at English High School (EHS) in an exchange designed to dispel stereotypes, build relationships across the socio-economic divide, explore urban/suburban educational equity, and brainstorm action steps to address these issues.

Students from WHS were paired with "e-mail pals" from the EHS classes, so they were emailing and getting to know each other before the Exchange even occurred. Then, students from WHS spent one day at EHS, participating in relationship-building activities, touring the school, and listening to EHS students discuss the day-to-day challenges of attending an urban school. The project was designed to expose the students involved to a world outside their own very different communities. Here is the Massachusetts Department of Education's description of her project, from which I've quoted liberally.

How did this project affect students? Here is an excerpt from one of her student's evaluations of the project:

“True or false: Every child under age 16 in the United States receives education of an equal quality, regardless of race or wealth. Last year, I would have answered “true” to that statement. However, in March, my English class took an eye-opening field trip to English High School in Boston. There, we learned firsthand about the inconsistency in the quality of education between schools just ten miles apart from each other.”

Keep up the good work, Ms. Hillstrom!

You can nominate a cool teacher for this series by emailing me at eduwonkette (at) gmail (dot) com.

What Does It Mean for a School to Be Good? More on NYC Report Cards

I promised a post on the theory behind assigning schools a single letter grade, so here we are.

The idealized theory is that in the vast supermarket of educational goods, grades serve as a strong signaling mechanism that tells both educators and parents where their schools stand. They are intended to provide a stronger signal - and thus clearer information to guide action - than regular test scores. Proponents argue that by placing schools in a distribution where everyone knows the pecking order, grades create strong incentives for improvement. They are also supposed to serve as shortcuts for parents, allowing them to vote with their feet and create market pressure for schools to improve.

In the locales where grades have been implemented, they appear to have real consequences for behavior. For example, economists David Figlio and Maurice Lucas found that trivial differences in grades in Florida affect housing prices. Similarly, a new experimental study by economist Justine Hastings and colleagues found that providing simplified information on test scores "improves" parents' choices (i.e. they pick higher-scoring schools).

All of this makes sense if school quality is unidimensional - but as Diane Ravitch has pointed out, there are many dimensions of schooling. Does it make sense to give an overall grade, when schools may excel at some dimensions and not others? Parents look for different things in schools, and while some may prioritize a positive school environment, others may care more about value-added. Still others may care about overall proficiency rates. I took a look at the school environment and Quality Review scores in the report cards, which provide insight about the general environment of the school. This is not to say these measures are great, but they're something to look at.

What I found is that an A or B school is not necessarily a school with a positive school environment or a well-developed rating on the Quality Review, and vice versa:

* 67 schools that received school environment scores in the lowest 20% of all city schools (i.e. if this was a separate grade, they would have received an F) received As and Bs.

* On the other hand, 8 schools that were in the highest 20% of environmental category scores received Ds or Fs.

* 35 schools that were rated as "Underdeveloped" on their Quality Reviews received As or Bs, while 22 schools rated as "Well-developed" received Ds or Fs.

New Yorkers may be wondering about which schools fell into these mismatch categories:

Schools that got As but were in the bottom 20% of school environment category scores:


Schools that got Ds or Fs but were in the top 20% of the school environment category scores:

4) PS 291
7) PS 179
8) PS 35 The Clove Valley School

Schools rated as "underdeveloped" (the lowest category) on Quality Reviews that got As and Bs:



12) PS 217/IS 217 ROOSEVELT IS.
14) PS/IS 54
17) PS 107
24) MS 391
30) PS 184 NEWPORT

Schools graded as well-developed (the highest category) on Quality Reviews that got Ds and Fs:


7) I. S. 381
14) PS 041 NEW DORP

19) PS 179
20) PS 182
21) PS 238 Anne Sullivan
22) PS 35 The Clove Valley School

So what does it mean for a school to be good? As savvy parents and teachers know, it depends on what "good" means. The report card grades should be interpreted with that in mind.

Wednesday, November 7, 2007

Who's Afraid of Educational Triage?

Last week, the blogs were abuzz with word that the bubble kids are dead! See Swift and Changeable and Eduwonk for a taste. At the same time, NYC has explicitly designed an accountability system that is intended to eliminate triage (at least on the lowest performing kids) by focusing on growth and giving more weight to the lowest scoring kids' progress. What should we make of these competing claims?

Here's the basic idea behind triage: proficiency-based systems create short-run incentives to get some kids to pass this year, so the educational triage hypothesis is that schools focus on "bubble kids" at the expense of both high and low performers.

But there are two general scenarios in which we would not expect to see triage - 1) when resources are flush, or 2) when few kids are failing - so I thought the Education Next study by Matthew Springer of Peabody might have stumbled on these two conditions. I've never believed universal triage claims; we should expect to see triage in some schools and not others. The problem, of course, is that the schools NCLB is most concerned about - those with the most disadvantaged kids - are likely to have large numbers of students scoring well below the passing threshold if the standards are high enough.

There are two take-home lessons for educational research that apply beyond this EdNext article:

1) Education Next is a magazine published by the Hoover Institution, not a scholarly journal.

This comment is not intended to sound cranky, but Education Next is not a scholarly journal, because scholarly journals a) provide enough information about a study to enable one to fully evaluate the author's methods, and b) pick their editorial boards based on scholarly expertise, not adherence to a particular reform agenda. To be clear, a number of exceptional scholars are involved as editors and editorial board members of EdNext - but those who don't fully adhere to EdNext's particular angle are glaringly absent. To name a few who study topics covered in EdNext: Alan Krueger, James Heckman, David Card, Charlie Clotfelter, Sunny Ladd, Ceci Rouse, Brian Jacob, Julie Cullen, Jesse Rothstein, Derek Neal, Bill Evans, Susanna Loeb, etc. To say "Education Next partakes of no program, campaign, or ideology. It goes where the evidence points" is just not true, and is no different from saying Fox News is fair and balanced. That said, I'm glad they're around for debate's sake, and I occasionally give their articles to my students.

2) This is a "duh" point, but... the fact that research has been published does not mean that it's true.

From reading the Economics of Education Review paper, I concluded that the study lacks a design that would enable one to identify the presence or absence of educational triage. Why?

1) The paper lacks a pre-NCLB measure: Derek Neal, who studied this issue in Chicago, identified a focus on marginal kids by examining the distribution of achievement before and after an accountability system was put into place; it is through this strategy that one learns that the distribution of achievement changes in response to incentives. See his paper here.

2) The paper doesn't identify meaningful variation in incentives to act strategically: If one doesn't have a pre-NCLB measure, some fancy statistical gymnastics are required. An exemplar is Randy Reback's elegant paper using data from Texas. Rather than simply examining the "didn't make AYP last year/did make AYP" dichotomy that Springer uses, he explicitly calculates schools' short-run incentives to improve the performance of various students at the school. What does he find?

  • Schools respond to math performance incentives both by targeting math resources towards specific students and by making broad changes which also help very low achieving students. These responses tend to sacrifice the targeted students’ reading performance and to sacrifice relatively high achieving students’ performance in both math and reading.

  • Schools respond to reading performance incentives by targeting resources towards the reading performance of particular students, sacrificing these students’ math performance and sacrificing all other students’ performance in reading.
  • Finally, schools devote fewer resources towards students in the terminal grades during years when short-run incentives are low than during years when incentives are high.

In contrast to Springer, Reback concluded:

If one of the primary goals is to create a sort of educational triage, in which students below minimum grade-level skills are pushed up, then the No Child Left Behind type of accountability system appears to be fairly effective. If accountability systems are not intended to induce schools to shift resources disproportionately towards certain types of students, then these systems should use test results to formulate school ratings that do not simply reflect the fraction of students achieving minimum competency.

Similarly, Derek Neal concluded:
Based on our results, it is reasonable to conjecture that hundreds of thousands of academically disadvantaged students in large cities are currently being left behind because the use of proficiency counts in NCLB does not provide strong incentives for schools to direct more attention toward them.
Since we have two large-scale studies in hand that find educational triage in full swing, at least in some schools, it's something we need to take seriously. For all of the NYC report cards' flaws (and there are many), the Department of Education deserves credit for focusing on growth rather than proficiency, and for giving additional weight to the lowest-performing kids' progress.

Cool people you should know: Randy Reback

Randy Reback is an economist at Barnard College. He studies loads of stuff, including school accountability, school finance, school choice, and teacher labor markets. You can find his papers here. Some of his cool findings are reported in this post on educational triage, but here are some more:
  • The presence of teacher certification programs in colleges affects who goes into teaching: The addition of teacher certification programs that could be completed within four undergraduate years could increase rates of entry into public school teaching by at least 50% among recent graduates of certain selective colleges.

  • School choice programs affect housing prices: Reback found that residential properties appreciate significantly in school districts where students are able to transfer to preferred school districts, whereas residential property values decline in districts that accept transfer students.

NYC School Report Cards: How Did the Community School Districts Fare?

Here are a few figures on how the Community School Districts fared in the report cards; see the table below. More commentary and interpretation on this later.

NYC School Report Cards II: A Closer Look

One of the most striking patterns from my post yesterday was that schools with higher report card grades, on average, had higher proportions of Asian kids than those with lower grades. This raised serious questions for me about the validity of the comparison groups, which were generated using only four school characteristics: the combined African-American and Hispanic population, percent free/reduced lunch, percent special ed, and percent ELL.

From national datasets like the Early Childhood Longitudinal Study, it's clear that Asian kids have different growth trajectories throughout elementary school. It's also clear that Hispanic and African-American kids have very different growth trajectories, with African-American kids falling behind at a much faster rate. If a large proportion of the report card is based on growth, but the growth measures don't account for varying trajectories that are, in part, the result of non-school factors, schools serving higher proportions of Asian kids will look like they're producing more growth. Similarly, schools serving higher proportions of Hispanic kids will look like they are producing more growth if they are compared to schools serving similar proportions of African-American kids. [We can debate how much of these differences in trajectories are explained by school versus outside-of-school factors.]

Here's what I found (I'll interpret a whole line of the table so I'm clear on what these measures are):

  • At the elementary school level, A schools have an average of 20.99% Asian students. By contrast, the median A school has 9.55% Asian students. This tells us that the average is pulled up by schools with very high proportions of Asian kids. The standard deviation is there for stats junkies, but most readers will prefer the interquartile range (the 25% and 75% columns) for a sense of how much variation there is. Reading across the A row, the 25% column tells us that 25% of A schools have 2.3% Asian students or fewer. Similarly, the 75% column tells us that 25% of A schools have 36.8% Asian students or more. The range column reports the lowest and highest values for a given grade; A schools have between .2% and 92.6% Asian students.

  • If we compare medians instead of averages, we still see that A elementary schools have more than 3 times as many Asian students as F schools (9.55% versus 2.90%). The 25% of A schools with the highest Asian populations have between 36.8% and 92.6% Asian students, while the 25% of F schools with the highest Asian populations have between 5.3% and 28.5% Asian students.

  • The general pattern is similar across all levels of schooling; A schools have substantially higher proportions of Asian students.
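The mean/median gap in the first bullet above is easy to see with toy numbers. The figures below are invented (only the 20.99% mean and 9.55% median cited earlier come from the actual data); the point is just that a few schools with very high Asian enrollment drag the mean well above the median:

```python
from statistics import mean, median

# Hypothetical percent-Asian figures for eight "A" schools: most are
# modest, but two schools enroll very high proportions of Asian students.
pct_asian = [1.0, 2.3, 4.0, 9.5, 12.0, 36.8, 60.0, 92.6]

print(round(mean(pct_asian), 1))  # 27.3 -- pulled up by the top schools
print(median(pct_asian))          # 10.75 -- the "typical" school
```

This is why the post compares medians as well as means: the median is robust to a handful of extreme schools, while the mean is not.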

The tables below show the distribution of Asian students by school grade and school level (i.e. elementary, middle, K-8, and HS):

* 3 elementary schools had missing data in 2005 and are thus not represented in the table above.

* Because many 6-12 schools opened recently and thus do not have 2005 data available, the middle school data are missing 32 schools - including 10 A schools, 14 B schools, 5 C schools, 1 D school, and 2 F schools - and should be interpreted with this in mind.

* 1 K-8 school had missing data in 2005 and is not represented in the table above.
* 1 high school had missing data in 2005 and is not represented in the table above.

Unless we believe that schools with high concentrations of Asian students are higher quality schools, these results raise a lot of questions about the validity of these comparison groups. In the multivariate analyses that I describe below, I also find that schools with higher proportions of Hispanic students are more likely to receive A or B grades - which raises the question of the accuracy of using an aggregate black/Hispanic number to compare schools.

Overall, these results suggest that the Dept of Ed's method of establishing comparison groups ultimately results in apples-to-oranges comparisons.

For geekier analyses, read on:

For those who are interested, I also ran a series of descriptive logistic regressions to examine the association between school racial composition and a school's odds of earning an A or B grade, net of many other factors that could explain this association. In these models, I controlled for percent free lunch, percent female, percent immigrant, percent stability (i.e. the opposite of mobility), percent full- and part-time special education, percent ELL, school size, percent capacity (how crowded the school is), and teacher characteristics (percent with more than 5 years teaching and percent with a master's degree). I didn't impute missing values, so these analyses include 970 of the 1187 schools that had data available in 2005. Remember, regressions like these are just descriptive, not causal [that is, they describe patterns observed in the data and do not necessarily explain *why* a school received the grade it did]. Nonetheless, unless we believe that schools with the highest concentrations of Hispanic and Asian students are much higher quality than those with lower concentrations of these students, these results suggest that the peer comparison groups are not entirely fair:

  • First, I divided schools into four equal groups - quartiles - based on their percent Asian. (This is a typical approach to modeling non-linearities.) The first quartile included the 25% of schools that have the lowest proportions of Asian students, and the fourth quartile included the 25% of schools that have the highest proportions of Asian students - in these analyses, quartile 4 includes >15.3% Asian. I did the same thing with the African-American and Hispanic populations, i.e. divided schools into four quartiles.
  • When we just examine the association between racial composition and a school's odds of getting an A or a B, quartile 4 schools (those with the most Asian students) are almost 2.5 times as likely to get an A or B (odds ratio=2.44, p=.001) as those with the lowest proportions of Asian students (those with 1.5% Asian or less). However, schools in the 2nd and 3rd quartiles of the Asian population have no advantage over those in the first. When we just look at the relationship between getting an A/B and African-American and Hispanic composition, we see no statistically significant relationship.
  • Once we control for all the variables listed above, quartile 4 Asian schools have a smaller advantage - they are slightly less than twice as likely to get an A or B (odds ratio=1.87, p=.041). Again, there is no advantage to being in the 2nd or 3rd quartile of the Asian population over the first. But schools in quartile 4 of the Hispanic population have an even larger advantage - they are slightly more than twice as likely to get an A or B (odds ratio=2.12, p=.036).
  • Nonetheless, this full set of predictors only explains a tiny proportion of the variance (pseudo R2=.06).
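For readers following along at home, here's a sketch of the two mechanical pieces of the analysis above: assigning schools to quartiles, and converting a logit coefficient into the odds ratios quoted. The school figures are invented; the actual models were fit on the full 970-school sample with a statistics package, which isn't shown here.

```python
import math

def quartile(value, values):
    """Assign a quartile (1-4) based on rank within the distribution."""
    ordered = sorted(values)
    rank = sum(v <= value for v in ordered)  # rank of `value`, 1..n
    return min(4, 1 + (rank - 1) * 4 // len(ordered))

# Hypothetical percent-Asian figures for eight schools
pct_asian = [0.5, 1.5, 3.0, 6.0, 9.0, 15.3, 40.0, 80.0]
print(quartile(80.0, pct_asian))  # 4 -- the ">15.3% Asian" group
print(quartile(0.5, pct_asian))   # 1 -- bottom quartile

# A logistic regression coefficient is a log odds ratio, so
# exponentiating it recovers the odds ratios reported above,
# e.g. beta = 0.892 -> OR = 2.44.
print(round(math.exp(0.892), 2))  # 2.44
```

The quartile dummies are what let the model pick up the non-linearity: only the top quartile shows an advantage, which a single linear percent-Asian term would have smeared across the whole range.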

If anyone is interested in checking out the full results, email me and I'll send you the output.

Tuesday, November 6, 2007

The NYC School Report Card: A First Cut at the Data

The report cards are out - you can check them out here. There's a lot to talk about, but I thought it would be helpful to first create a profile of the schools that received various grades. Using the progress report data and data from the 2005 release of the NYC School Report Cards, I looked at the average characteristics of schools receiving a given grade (i.e. means). For example, A schools average 15.92% Asian students, while schools that received Fs have an average Asian population of 4.53%.

The bright line findings from the table below are:
  • Schools with higher grades have higher proportions of Asian students; A schools have an average of 16% Asian, while F schools have 4.5%.
  • Schools with higher grades have much lower proportions of African-American students; A schools have an average of 26%, while F schools have an average of 47%.
  • Schools with higher grades have lower proportions of students qualifying for free and reduced lunch (A schools=65%, F schools=76%). Interestingly, schools "under review" - those contesting their grades - have much lower percentages of free and reduced lunch kids (56.5%).
  • Schools with higher grades have higher proportions of Hispanic and immigrant students.
I'll write in more depth about the comparison group issue later, but for now, these results - particularly the Asian finding - suggest that the comparison groups may not adequately control for the fact that kids of different backgrounds may be on different growth trajectories for reasons that have little to do with the schools they attend - read more about this under point #2 here.

Also interesting are the dimensions where there are very few differences between schools receiving different grades, as you'll see in some of the variables below.

* Notes on the analysis: The most recent data to which I had access were 2005 data; because these variables are highly correlated across years, the general trends should look the same. (If someone has clean 2006 data, send it along and I can rerun these descriptives.) Also note that these are means; if you are interested in standard deviations and ranges, email me.

Monday, November 5, 2007

Introducing Cool Teachers You Should Know - Nominations Open!

We hear so much about "bad" teachers that it's easy to forget about the many superstars we have teaching in American public schools.

Since I started writing this blog, I've been profiling "cool people you should know" - cool people who do research on education. Beginning this week, I'll profile a "cool teacher you should know" every week - someone whom colleagues, parents, or students recognize as a master teacher, and who adds something special to their school. (If I get lots of nominations, I'll profile teachers more frequently.)

So please circulate this post far and wide and help me identify some exceptionally cool teachers from around the country. And if someone young enough to have a myspace or facebook page could help me get this out to high school and college students who are still close to their educational experiences, I'd appreciate it.

Email me their name, grade level or subject, school, and location, as well as a few tidbits about them - something about how they teach, what they're like as a colleague, especially great lessons, how they affected you, etc. Send me an email address at which I can contact them (you can remain anonymous as the nominator if you wish). If you have a picture, send it along, too. If the teacher has a DonorsChoose profile, I'll link to that as well.

Send nominations to eduwonkette (at) gmail (dot) com.

PS - Also, check out teacher ken's post on the impacts of society's lack of respect for educators. He reminds us that saying thanks makes a difference:
All of us have had teachers. And even if we were too shy, or too stubborn, to express our thanks at the time, we can always drop a note or make a call, or if possible stop by and say hello, and thank those who made a difference for us. Sometimes we worry about the students who pass through our care, that we did not do enough, care enough, and it can help a teacher who is wondering whether to continue the struggle to hear of the differences s/he made. Sometimes that can be the one thing that keeps a teacher going for one more year.

The Report Card Management Strategy

Five years ago, Malcolm Gladwell wrote an article called "The Talent Myth" questioning the management zeitgeist that the NYC Department of Education has swallowed wholesale:
At the heart of the McKinsey vision is a process that the War for Talent advocates refer to as "differentiation and affirmation." Employers, they argue, need to sit down once or twice a year and hold a "candid, probing, no-holds-barred debate about each individual," sorting employees into A, B, and C groups. The A's must be challenged and disproportionately rewarded. The B's need to be encouraged and affirmed. The C's need to shape up or be shipped out. [One company] followed this advice almost to the letter, setting up internal Performance Review Committees. The members got together twice a year, and graded each person in their section on ten separate criteria, using a scale of one to five. The process was called "rank and yank." Those graded at the top of their unit received bonuses two-thirds higher than those in the next thirty per cent; those who ranked at the bottom received no bonuses and no extra stock options--and in some cases were pushed out.
Gladwell writes at length about the management strategy of a dazzlingly successful company, which:
took more credit for success than was legitimate, that did not acknowledge responsibility for its failures, that shrewdly sold the rest of us on its genius, and that substituted self-nomination for disciplined management.
How did this work out?
The broader failing of McKinsey and its acolytes is their assumption that an organization's intelligence is simply a function of the intelligence of its employees. They believe in stars, because they don't believe in systems. In a way, that's understandable, because our lives are so obviously enriched by individual brilliance. Groups don't write great novels, and a committee didn't come up with the theory of relativity. But companies work by different rules. They don't just create; they execute and compete and coordinate the efforts of many different people, and the organizations that are most successful at that task are the ones where the system is the star.
Gladwell concluded:
They were there looking for people who had the talent to think outside the box. It never occurred to them that, if everyone had to think outside the box, maybe it was the box that needed fixing.
What company was Gladwell writing about? Enron.

I hesitate to invoke the comparison here, if only because Enron has become a cognitive shortcut for too many things. Here's the parallel: Enron's organizational failure was not just the result of a handful of swindlers, i.e. Lay and Skilling. Enron created an organizational structure and incentive system that virtually ensured that gaming and malfeasance would occur.

As we wait for the grades, check out the entire Gladwell article.

Sunday, November 4, 2007

Reader comments roundup: NYC's Wylde Wylde West

Dear anonymous 1:55AM, Norm, Sol, Robert, more anonymous people, holden, and world,

My apologies for the delay in responding to comments. You left a number of thoughts about the NYC Department of Education’s attack on Diane Ravitch via Kathy Wylde – here’s the post that spurred them. Anonymous 1:55AM wrote:
You say the NYC administration isn't concerned with "figuring out what works for New York City kids." I'm assuming you just don't agree with their approach, because you can't seriously suggest there's a shortage of ideas being tried up there.
It’s not that I unequivocally disagree with their approach, if by approach you mean ideas. If academics do nothing else well, we see the world in shades of grey. When many of their reforms were first presented – i.e. a standardized curriculum featuring the reading/writing workshop approach or supply-side reform via small schools – I was encouraged.

When I was confronted with classrooms full of high school kids who couldn’t read or write, I turned to Lucy Calkins’ and Nancie Atwell's books. And I found this model to be successful in moving struggling readers and writers forward. But I also know that the implementation of this curriculum in NYC stressed compliance rather than instructional support and capacity building. At the end of the day, teachers haggled with supervisors measuring their rug sizes and the dimensions of their word walls. This certainly didn't support their professional growth as educators, and it goes against the spirit of these approaches.

By the same token, as someone who began studying education policy during the Annenberg Challenge, I thought small schools had enormous potential. And hopefully they still do. But as many first wave small school movement participants have pointed out, the essential ethos of that movement – a focus on serving the most disadvantaged kids and seeing them as whole people, on nurturing teacher collaboration, and on building school communities – has given way to an obsession with producing good stats. So anonymous 1:55AM, many of these problems are not “idea problems” as much as they are "approach/implementation" problems.

But you’re correct that I disagree with NYC’s approach as exemplified by the Wylde affair last week - and it is this approach that precipitated the downfall of some good ideas. Figuring out what works requires a willingness to seriously evaluate NYC’s reforms, to acknowledge failures when they occur, and to change course accordingly. It would also require a willingness to listen and to engage critics. Instead, what we’ve seen are incredible displays of hubris and rhetoric that are reminiscent of the “coalition of the willing” charades that preceded the Iraq War. When faced with dissent or tepid results, this administration has fired its critics (see the Monday Night Massacre response to their social promotion plan), personally discredited them, or launched advertising campaigns (see the Evander Childs affair here).

Anonymous 1:55AM also critiqued the critics’ lack of an alternate reform vision. Norm Scott took this comment head on, responding that critics do have a reform vision that includes attention to the context of teaching (i.e. class size reduction), attention to the non-academic needs of kids, and a commitment to building collaborative school communities where teachers have a real say. Sol added to Norm’s post, writing that some of the best ideas come from the master teacher in the next room over, and effective reforms should leverage that expertise. I agree with Anonymous 1:55AM’s sentiment that critics should offer alternatives, but I don’t see an absence of alternatives being offered by the NYC Department’s critics. Nonetheless, my one-woman “What Works Clearinghouse” will take Anonymous 1:55AM’s comments to heart in the coming weeks and offer some alternatives.

Robert Pondiscio provided sage advice about the need for reform movements to listen to critics, writing:
Have we come to the point in the education revolution when we will devour our own? Let’s hope not. The battle is nowhere near over. We need all the ideas and constructive criticism we can muster.
Let’s hope that the folks in charge – from Bloomberg/Klein to Margaret Spellings – take that advice seriously.

Finally, Holden pointed out that in the vast sea of poor evaluations, the Policy Studies evaluation of NYC small schools may be among the best:
I've been looking for an NYC education charity that can compellingly demonstrate that it makes a difference. Aside from the Mathematica study of Teach for America and one randomized study by LEAP, I've seen nothing whose methodology tops this New Visions report (so, nothing better than a B-). For a complex issue like education, it's tough for me to have confidence in a charity under these circumstances. Can you point me to anything I'm missing?
I'm having a hard time coming up with any. As long as foundations don't require (or fund) such evaluations, we're not going to see them. If someone can identify an educational organization in NYC that rigorously evaluates its work, please post it here.

Thanks for these comments, everyone. Keep them coming.

Reporting on NYC's Report Cards

eduwonkette was not intended to be a blog about NYC school reform, but the folks up in NYC provide lots to write about. This week, the NYC Department of Education will release report cards grading its schools - see the NY Times article here. While 15% of schools will get As, 5% will get Fs. Here's this week's outline:

Monday - The Report Card Management Strategy in Other Sectors

Tuesday - NYC School Report Cards I: What kinds of schools received As, Bs, etc?

Wednesday - Did the Department of Education Flub the Peer Comparison Groups? A Closer Look at the NYC Report Card Data

Thursday - What Does It Mean for a School to Be Good? In this post, I explain the theory behind NYC's school grades, and look at schools that did well in their environmental category score but poorly overall, and vice versa.

Friday - Problems with the NYC Report Cards