To paraphrase Karl Marx, a specter is haunting the nonprofit world, and it is the specter of measurement.
Everywhere we turn, foundation and government officials tell us that they’re going to get serious about measurable outcomes for their investments, and that underperforming nonprofits had best beware.
Lost in today’s measurement mania, however, is one disturbing fact: This devotion to measurable outcomes is hardly new. Indeed, it is at least a century old.
More to the point, it has itself apparently had so little measurable impact on the way we do business that a full century later, we’re talking about measuring impact as if we’ve just discovered the concept.
How can I make such an outrageous argument? Recall that modern philanthropy at the turn of the 20th century understood itself to be in its very essence a scientific enterprise. It was engaged in the search for the root causes of problems, so that experts could design targeted solutions and measure the results with unerring accuracy.
The hallmark of the new Russell Sage Foundation was the famous, data-laden Pittsburgh Survey. The Rockefeller philanthropies early on invested heavily in the development and application of the social sciences at research universities and think tanks.
When the Ford Foundation came on the scene in the early 1950s, it committed millions to the development of what it called “The Science of Man,” which would unlock the secrets of human behavior and allow us to direct it at will.
Since then, of course, the science of evaluation has become an enormous industry in philanthropy, with literally hundreds of different measurement frameworks on offer, and thousands of consultants willing and eager to apply them.
Social-science measurement entered government service in a big way during the Great Society, when many millions of dollars were spent on program research and evaluation—spending that has only grown since then. Indeed, every presidential administration for the past 50 years has pledged to hold its programs and nonprofit grantees to ever more rigorous standards of performance and accountability.
President Johnson had his PPBS, or Planning, Programming, and Budgeting System; Nixon followed with Management by Objectives; Carter embraced zero-based budgeting and sunsetting.
GPRA, or the Government Performance and Results Act, appeared during the Clinton years, along with the “Reinventing Government” effort; President Bush did his PART with the Program Assessment Rating Tool. President Obama vowed to replace that with a more effective measurement tool in his own management-improvement agenda unveiled in the fiscal 2010 budget document.
So the nonprofit world and government have been saturated for more than 100 years with ever more elaborate schemes for ensuring measurable outcomes.
But to repeat, it has itself had so little measurable impact that each new generation of grant makers and White House executives arrives on the management scene, quickly scans the environment, and with wonderment exclaims: “Hey, how come no one ever tried to measure impact?”
Now, there’s nothing wrong, I suppose, with yet another new surge of enthusiasm today for measurement. After all, as any evaluation expert will readily tell you, “of course all those earlier efforts failed, because they didn’t do it my way!”
But the fact of the matter is, all these past efforts, in both their number and complexity, have created a substantial and growing burden of measurement for the nonprofit world, which too often goes unremarked.
As any nonprofit leader can tell you, she may have to report outcomes in as many different ways as the number of grants she receives from foundations and government, for there is no generally accepted, uniform way for such outcomes to be reported.
And as presidents change, both in Washington and at foundation headquarters, the previous ways to measure—once deemed state-of-the-art science—suddenly appear to be so “yesterday” to the new executives, who are eager to make their own mark on programs.
So nonprofits must not only master many different ways to measure. They must also be prepared for them to change constantly over time, and learn to ride ephemeral metric fads and fancies, even though they may be reporting on the same programs.
The great variety and transience of measurement protocols of course becomes a severe drag on a group’s work. More and more staff time must be deployed simply to master and fill out the various reporting forms.
As a result, small, grass-roots nonprofits—which are so often a key source of innovation—are automatically frozen out of money by the burden of measurement.
Ironically, the same economic hardships that feed the demands for greater measurable efficiency also reduce the money available for this increased managerial burden.
Now, the burden of measurement might be endurable were we confident that all those numbers we were collecting were somehow adding up to a coherent science of grant making.
But no such thing is happening. We are not able to aggregate reported results across large numbers of similar projects. All those numbers are being gathered without any way to make meaningful comparisons among them. That is, they aren’t collected in such a way that they add up to a useable body of knowledge.
A bare handful of programs—again, this is a century after we began counting—can claim to have been scientifically validated according to the “gold standard” of measurement, using randomized control groups.
But for most social interventions, rigorous experimental testing is ethically problematic, extremely expensive, and yields results long after they would be useful for decision-making.
Even then, such results are subject to the evaluator Peter Rossi’s “iron law of evaluation,” namely, that “the expected value of any net impact assessment of any large-scale social program is zero.” As Mr. Rossi famously argued, very few assessments of large-scale social programs have found that the programs in question made any difference at all.
Perhaps that’s why recent surveys suggest that most donors and foundations are not in fact clamoring for more numbers in spite of all the hype.
For all the talk about strategic, data-driven grant making, the Center for Effective Philanthropy reports that very few foundations actually practice it. Even when measurements have been duly gathered, research shows that they have little impact on actual grant making, not affecting the amount of money spent on a program, no matter how clear the numbers seem to be.
When foundation program officers are being honest, they will in fact tell you that in most cases, after the data is collected, the reports sit in a corner somewhere in foundation headquarters gathering dust.
After all, who would be interested in them?
A program officer whose entire focus must be on preparing the next docket of grants, rather than on reviewing past grants?
A board member, whose preferences are determined far more concretely by whom he knows rather than what impersonal numbers might reveal?
Some researcher at another foundation, who gathers her data in a completely different format for a noncomparable program in a dissimilar locale?
As for government financing being determined by outcomes, simply consider the resounding silence and inaction that greeted the federal government’s latest gloomy data on the long-term effects of Head Start.
Government grant makers aren’t any more likely than private donors to take their own research seriously. Here too decisions are swayed far more often by inertia, interest-group pressure, and political connections than by abstract numerical rankings.
Now, I happen to believe that measurement is ultimately a futile way to approach grant making, so the lack of enthusiasm for its actual employment is neither particularly surprising nor disturbing to me.
But my own relaxed view is not available to those hard-pressed nonprofits, struggling every day to meet mounting needs with shrinking resources, who are now enduring yet another series of lectures on how foundations and governments are going to crack the metric whip—and this time we really mean it!
If only those nonprofits had the luxury to speak the truth to those upon whom they rely for money, they might say something like this:
“The last thing we need right now is to devote yet more time to gathering data that won’t affect your decision one way or another—even if you bothered to look at it—and that cannot be used anyway to build up a coherent or useful body of research for grant making.”
Those nonprofits might continue: “We’re working like crazy to respond to the people who need us. If gathering numbers is what you care about, then end the futile gestures that have governed one hundred years of grant making.
“Let’s decide jointly on a simple, coherent, user-friendly system to which we can both pay attention, which will prevail over bureaucratic inertia and political connections, and which will feed into a serious body of knowledge. But until then, stop pretending that the problem is our lack of acceptable performance, rather than your lack of serious purpose.”
My fear is that those words will never be spoken, and that the burden of futile and pointless measurement will only continue to grow.