Stanford Social Innovation Review: Informing and inspiring leaders of social change


Measuring Social Impact

Get Out of the Office

You’ll never get what’s going on out there by sitting in here.

I thoroughly enjoyed the intellectual bar fight between Charity Navigator and fans of GiveWell over a recent SSIR blog post. A few years ago, GiveWell posted a no-holds-barred critique of Charity Navigator, and after suitable time to sharpen the knives, Charity Navigator roared back, slamming GiveWell’s approach to philanthropy as “defective altruism.” A vociferous debate ensued, punctuated by calls for civility, chiding over name-calling, and some smart comments on both sides.

This stuff is both healthy and entertaining, which is a pretty great combo. And you’ve got to admire the Charity Navigator guys’ chutzpah: The last time I was paying much attention, they’d handed Greg Mortenson’s Central Asia Institute a four-star rating right before the Three Cups of Tea scandal broke. They’ve yet to deliver on a 2008 pledge to report on impact, and while there is progress, their utility is limited until impact is central to their ratings. I don’t think they’ve seized quite enough moral high ground to call others’ work “defective.”

Anyway, after reading Charity Navigator’s post, I went back to read GiveWell’s original critique, and then ended up noodling around on their website for a couple of hours (which I highly recommend—there’s a ton of interesting and useful material there). When I got to the “Top Charities” page—the centerpiece of the whole site—I almost sputtered coffee all over my keyboard. The page features a grand total of three organizations: Deworm the World, the Schistosomiasis Control Initiative, and GiveDirectly. After all that research and verbiage—three? Really? And two that do essentially the same thing? We like mass-deworming too, but come on.

The GiveWell website says, “We see ourselves as a ‘finder of great giving opportunities’ rather than a ‘charity evaluator,’” but the truth is that they set up a bar and only three organizations got over it. Given that, it seems to me that either: a) the international poverty sector sucks and is not worth their time, or b) they need to get out more. Given my own experience in the sector, I’d have to go with “b.”

GiveWell does its research in the office. GiveWell staffers—none of whom have a background in international poverty work—have visited a total of two programs in the past two years. The approach appears to be something like this: Find an intervention already supported by a bunch of expensive randomized controlled trials (RCTs). Identify an efficient implementer of that intervention. Recommend. Repeat twice. Done. I don’t know why it took all those smart people so long to come up with three recommendations.

I’d like to see the social sector do a lot more RCTs, and Mulago enthusiastically funds three great members of the RCT mafia: JPAL, IPA, and IDInsight. However, RCTs aren’t always appropriate or doable, and there are a lot of other ways to reach a reasonably confident understanding of impact (or lack thereof). Overall, I’m more interested in ongoing internal impact evaluations that feed quickly back into design and operations than ponderous episodic RCTs, but to trust an organization and its methods, you have to get out there and get to know them well.

A case in point: GiveWell said that Root Capital, which provides technical assistance and loans to businesses that buy from smallholder farmers, didn’t have sufficient evidence of impact for GiveWell to even consider it for top charity status. I went to Uganda a couple of years ago as part of our own due diligence. I saw a cotton ginnery that Root Capital had financed and rebuilt in the area laid waste by the Lord’s Resistance Army. All the producers are smallholders; we know how much cotton they are selling now, and we know how much cotton they were selling before, which was zero.

Some version of this happens in most Root Capital sites, with good-quality numbers indicating that a lot of farmers stabilize and/or increase their incomes. The evidence is strong, but given a heterogeneous international portfolio, it’s hard to package it up neatly. No matter how much impact Root Capital generates, its work doesn’t lend itself to an RCT, and so an important solution to the vexing problem of rural poverty will never make the GiveWell grade.

Another example is One Acre Fund, working with 150,000 very poor farming families in Kenya, Burundi, Rwanda, and Tanzania. One Acre provides farmers with the fertilizer, seeds, training, support, and access to markets they need to make a decent yield from their tiny plots. The farmers, in turn, repay costs from the proceeds of harvests. In most cases, farmers triple their yields and, after repayment, double their farm incomes. One Acre measures this by comparing—literally, weighing—the yield from a random sample of present One Acre farmers with the yield of a random sample of would-be One Acre farmers in a similar area where the organization plans to go next. Is it a perfect way to measure? Nope. But if you are in the field, and if you have experience with smallholder farmers, it is utterly clear that something profound has happened. What’s more, it’s a doable, affordable method that the people at One Acre can repeat again and again in the various settings where they work. For funders, it provides a cheap, high-confidence “movie” of what’s going on in four different settings, rather than the expensive one-time, one-setting snapshot afforded by an RCT.
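The sampling comparison described above can be sketched in a few lines of code. This is only an illustration of the method, not One Acre’s actual procedure or data: all harvest weights below are simulated, and the sample sizes and yield figures are invented.

```python
import random
import statistics

# Simulate the comparison One Acre-style measurement relies on: weigh
# yields from a random sample of current program farmers and from a
# random sample of comparable would-be farmers nearby, then compare.
# All numbers here are invented for illustration.

random.seed(0)

# Simulated harvest weights (kg per plot) for each sampled farmer.
program_farmers = [random.gauss(300, 40) for _ in range(50)]     # enrolled
comparison_farmers = [random.gauss(110, 30) for _ in range(50)]  # not yet enrolled

def mean_ratio(treated, control):
    """Ratio of mean yields: a rough multiplier on the harvest."""
    return statistics.mean(treated) / statistics.mean(control)

ratio = mean_ratio(program_farmers, comparison_farmers)
print(f"Program farmers harvest about {ratio:.1f}x the comparison sample")
```

The point of the design is repeatability: because the comparison needs only a scale and two random samples, field staff can rerun it every season in every region, which is what produces the “movie” rather than a one-time snapshot.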

I spend a lot of my own time going on about the failure of the social sector to measure and invest on the basis of impact, so it’s a little weird to find myself criticizing GiveWell. I admire much of GiveWell’s work, and the organization’s insistence on evidence of impact is a service to all of us. It’s just that we—Mulago’s staff, our fellows, and everyone in our portfolio—spend a lot of time and effort neck-deep in the messy, humbling business of measuring real impact in the real world, and GiveWell’s desk-based proclamations can be, well, irritating. Following the RCT trail to find stuff like deworming and cash transfers is easy; to find the impact jackpot, you need to immerse yourself deeply enough in context and methods to make a reasoned judgment. You also have to be a little flexible: Real-world measurement often requires a certain amount of creativity. You can’t just set an impossibly high bar and wait for stuff to show up on your desk. Precision in this business is a mirage, and often a distraction. We want real numbers and real attribution, but we’re happy to take a small hit on accuracy if we get a convincing picture of real impact that we couldn’t have gotten otherwise.

We all need GiveWell: We need their obsession with impact, big brains, and unwavering honesty and candor. But whether you’re Charity Navigator, GiveWell, or Mulago, you’ve gotta get out there or at least follow the lead of someone who does. When I visited the Central Asia Institute in Pakistan, it was obvious within hours that the operation was a shambles; when I went to see Root Capital in Uganda, reams of data took on new meaning. This isn’t an office-desk business. I’ve made some of my dumbest mistakes because I didn’t go to the field first, and some of our best stuff came to our attention only because we were poking around off the grid and far from home.

So: Shut down your computer, turn off the lights, and go.

Read more stories by Kevin Starr.


COMMENTS

  • BY Kristin Gilliss

    ON January 30, 2014 06:02 PM

    Now I know why I don’t have a desk.

  • BY steve wright

    ON January 30, 2014 10:02 PM

    “We want real numbers and real attribution, but we’re happy to take a small hit on accuracy if we get a convincing picture of real impact that we couldn’t have gotten otherwise.” Kevin, thank you for this. I’m curious about your take on performance management vs. evidence of impact. I ask in the context of this blog because I think it gets at the issue of cost. For those of us executing on programs, the central issue is to balance cost with efficacy. I am interested in impact measurement to the extent that it can answer the question “Am I any good?” The best case scenario is that answering this question demonstrably leads to evidence of impact. And, the trick is to get this data to fall out of the work. It is my opinion that evidence of efficacy is not for funders/investors. It’s for business intelligence. Funders need to learn to find well-run businesses.

  • BY Jacob Trefethen

    ON January 31, 2014 05:08 AM

    This seems like a well argued, fair critique, which I mostly agree with. I think there are two questions worth separating that the author addresses:

    (1) Is GiveWell’s bar for evidence of impact too high?
    (2) Does GiveWell use the correct measure to determine (expected) impact?

    I think the answer to (1) may well be Yes. It has long been discussed, on the GiveWell blog and elsewhere, how much certainty you should trade off for a high (explicit) expected value of impact. Many have argued, I think convincingly, that GiveWell have the balance too far in the certainty direction of this trade-off, hence their conservative recommendation of GiveDirectly.

    Yet the answer to (2) might also be Yes, contra Starr’s argument. Starr suggests that rejecting a charity without first (e.g.) visiting the sites on which it operates could lead to missing out on great opportunities. This is true, but GiveWell has limited time and resources. When faced with thousands of poverty-focused charities from which to choose, it strikes me as a very sensible approach to investigate further those charities that look promising on impact from the start. A great way to do this is by sitting behind a desk and looking at RCT data.

  • BY Jenny Stefanotti

    ON January 31, 2014 07:38 AM

    To be fair though, GiveWell is about maximizing impact from a donor’s perspective—i.e. I don’t care about just impact, I care about maximizing the incremental social welfare of a dollar given.  So the notion of getting out more and having a sense of impact even if there isn’t an RCT, or recommending a long list of charities isn’t really aligned with their stated goals.

    I think there’s a separate argument to be made about why that might not be the best way for donors to think, for a long list of reasons—false sense of precision, misinterpretation of RCTs, ignoring motivations for giving, hindering funding innovation, and so on (blog post on that soon maybe?).

    All of which points to the need for another intermediary whom donors can trust to evaluate giving opportunities, but with a slightly different POV.

  • The difference between how Mulago is described here and how I understand GiveWell is that GiveWell has to optimize for both impact and transparency, whereas Mulago can focus on impact alone. That is, GiveWell should only recommend nonprofits that it can clearly prove are high impact to an uninformed third party. So, unfortunately, while on-the-ground stories like the ones told here can be good sources of evidence for an individual, they don’t translate well to others. So, “getting out of the office” would not be a useful strategy for GiveWell even if it is very useful for Mulago.

    One area where GiveWell could probably improve is their rhetoric surrounding charities that they don’t recommend. It sometimes seems as if GiveWell is implying that insufficient evidence of impact means the nonprofit does not have a large impact. But, this is clearly not the case.

  • The public radio show This American Life did a profile of GiveDirectly last year, comparing it with more traditional charities such as Heifer International. The reporters went to Kenya and visited households that received funds from GiveDirectly. They also visited families that had gotten cows from Heifer International. If they had only relied on those field visits, they might have concluded that Heifer International was the more effective charity, because the Heifer International cows were so much bigger and healthier than cows bought by people who’d received GiveDirectly funds.

    But that would have missed the fact that vastly more donor money went into each one of those Heifer International cows. To get at the impact per dollar donated, you need desk research by organizations like GiveWell. And you need impartial analysis that’s not colored by field visits where you might get plenty of anecdotal and visual evidence, but you might not get the full story or see the full picture.

    Clearly we need both: we need organizations that do desk research and we need organizations that do site visits. And I’m not sure the desk-research organizations really do need to get out of the office, for the reasons stated above by other commenters.

  • BY Michael Keenan

    ON January 31, 2014 12:42 PM

    GiveWell’s process isn’t “Identify an efficient implementer of that intervention. Recommend.”

    It’s “Identify ***the most*** efficient implementer of that intervention. Recommend.”

    That’s why they recommend only a few charities.

  • BY William MacAskill

    ON January 31, 2014 04:16 PM

    “When I got to the “Top Charities” page—the centerpiece of the whole site—I almost sputtered coffee all over my keyboard. The page features a grand total of three organizations: Deworm the World, the Schistosomiasis Control Initiative, and GiveDirectly. After all that research and verbiage—three? Really? And two that do essentially the same thing? We like mass-deworming too, but come on.”

    If a website had the goal of finding the best place to buy gold from, it would only need to recommend one place. If a website were trying to find the best company to invest in, it would only need to recommend one company. The same is true for a website that advises people on where giving their money will do the most good.

    The question GiveWell is addressing is not “Which charities are worthy?” but rather “Where is the best place for me to give my money?” If you’re trying to do the most good with your dollar, then the latter question is what’s relevant; the former isn’t.

  • BY Kevin Starr

    ON January 31, 2014 06:00 PM

    This is great stuff. Thank you. These are such good points that it’s worth addressing each:

    Kristin: We’ll get you a desk. I promise.

    Steve: Everyone has to manage/measure performance; it’s just part of responsibly running a firm, for-profit or not. Everyone has to measure impact, because if you don’t, you can’t iterate toward creating more of it more efficiently. Understanding your cost per unit of impact is probably the best way to ultimately capture both.

    Jacob: Your point is an excellent one, and I hate to say it, but whether you’re GiveWell, Mulago, or Charity Navigator, if you don’t have the minimum resources necessary to do it right, you perhaps shouldn’t do it at all.

    Jennifer: You’re right on both counts; it’s just that setting one’s organization up as an arbiter of “Top Charities,” only accepting RCTs, then using that to justify naming only three is a little weird.

    Kerry: Numbers are primary; it’s just that an accurate story helps put them in context. I agree about the rhetoric: the labeling of organizations as “contacted - declined” is bullshit.

    Brad: You might want to read the piece again; I was very clear that high-quality numbers come first, but that field visits add critically important context and an understanding of measurement methods and constraints. Desk vs. field is a false dichotomy: you need to do both. I’m allergic to heart-warming stories without evidence; I imagine you are as well.

    Michael: Since there are only two organizations specifically focused on mass-deworming and exactly one doing only unconditional cash transfers, “most efficient” vs. merely “efficient” is a distinction without a difference.

    William: Where is the best place to give my money to save kids’ lives? To help one-acre farmers make a decent living over time? To create rural food security? To assure literacy? The notion that two de-worming organizations and one organization that gives out cash are somehow the best bang for your buck of all the organizations working on all the crucial problems out there is, well, absurd. I don’t think the GiveWell guys would back you on that one.

  • BY Michael Keenan

    ON January 31, 2014 06:50 PM

    “The notion that two de-worming organizations and one organization that gives out cash are somehow the best bang for your buck of all the organizations working on all the crucial problems out there is, well, absurd.  I don’t think the GiveWell guys would back you one that one.”

    The GiveWell FAQ contradicts this:

    “We recommend few charities by design, because we see ourselves as a “finder of great giving opportunities” rather than a “charity evaluator.” In other words, we’re not seeking to classify large numbers of charities as “good” or “bad”; our mission is solely to identify, and thoroughly investigate, the best.”

    http://www.givewell.org/about/FAQ#WhydoesGiveWellrecommendsofewcharities

    They examine this in greater detail here: http://blog.givewell.org/2013/03/29/why-we-recommend-so-few-charities/

  • BY steve wright

    ON January 31, 2014 10:31 PM

    “Everyone has to measure impact, because if you don’t, you cant iterate toward creating more of it more efficiently.  Understanding your cost per unit of impact is probably the best way to ultimately capture both.”

    Kevin, I passionately agree, and I think the cost of “proving” impact, the audience for the “proof,” and the definition of proof vs. insight are critical practical considerations. Ultimately what we are doing is solving intractable global problems by building individual and ostensibly sustainable “businesses.” What a business needs to know about its own efficacy is a very different burden than what funders/investors need to know about their efficacy.

    At Grameen Foundation we have built tools (Progress out of Poverty Index and TaroWorks and MoTech) that enable organizations to manage performance and gain insight. The first-order goal is to improve the efficacy of individual organizations. Additionally, the design of the tools enables aggregation for insight across many organizations.

    I agree that RCTs are extremely valuable and we need pre-RCT environments that are well managed AND data rich.  Thanks again for this excellent post.

  • BY Happy to Comment

    ON February 6, 2014 10:14 PM

    Nice post.
    What none of these groups can measure is the compelling work of human rights funders supporting grassroots movements and community attempts at policy engagement on their own terms. CN and GiveWell, with their blanket claim to somehow cover “charities,” promote the idea that the only causes worth giving to are the ones amenable to scientific testing by US economics students parachuting into the South. They have no idea about the sociopolitical histories of these places. Hence the battle of the dewormers. Not too compelling for most millennial donors. Let the armchair statisticians battle among themselves while the rest of us learn about actual people and places.
