Timing: Evaluation’s Dirty Little Secret
Apologies in advance for the slightly sensational headline. I’m trying not to let these blog posts get mired too deeply in the earnest and the technical.
What I’m about to tell you isn’t exactly news, and it is far from salacious. Yet it is hugely important for thinking about pretty much any change you ever hear or read about: changes in a patient’s health after starting a new medication; the sprouting and blossoming of an iris in your garden in the spring; changes in a baseball player’s batting average after he adjusts his swing; changes in someone’s earnings after they get a college degree. It applies to virtually anything that involves cause and effect. Open a newspaper, listen to the radio, pay attention to what people talk about at work or over dinner. Most of the time, it comes down to a story of “this happened” or “so-and-so did this,” and as a result, x, y, and z occurred.
So what is the dirty secret with timing? It is this: your result will be heavily influenced by the point in time at which your study is conducted. Bear with me for a minute while I fetch my watercolors and brush, and paint you a fairly typical scenario from the development aid world.
Let’s imagine that, several years ago, a big project to help improve people’s lives was implemented. A new agricultural technique to grow vegetables was introduced in the country of Ruritania. Now the government that funded this project wants to know – how much bigger was the yield as a result? Was the money well-spent? Are people better off?
Now imagine that, two years later, a world-class team of researchers has been assembled. They employ the most rigorous research design and methodological skills ever developed. Money is not an issue (this is my absolute best-case scenario). They design and implement an evaluation that asks the right questions, surveys the right people, and is statistically rigorous in every respect. It includes qualitative research to understand why and how those changes are or are not occurring. Supervision of the field research and quality control are outstanding. The results are checked and rechecked for accuracy. It is, in short, a legendary evaluation! The outcome – drum roll, please – is that average yields have increased by 43.5% after two years.
Is that not a terrific result? Yes, most certainly! Our donor is thrilled. But the truth, folks, is that 43.5% is a totally arbitrary figure. Why? Because of the timing. That’s our secret. That lovely figure is culled from just one point in time. Come back in two years, apply the same methods, and you will get a different answer. I guarantee it. Come back in 20 years, and you may find that average yields have settled in at between 20 and 30% higher than they were before. Or maybe they’re back to square one, perhaps because of soil depletion (we won’t even get into the 500 external factors that influence yields).
You see, every development follows its own trajectory, and the point at which you take the measurement matters hugely. I owe this insight to Michael Woolcock of the World Bank, who emphasized its importance in a seminar I attended a few years ago on qualitative methods and the limits of quantifying change.
Change trajectories follow their own patterns. Some start slowly and then accelerate. Think of the college graduate and her earnings. After graduation, she works for months in a fast-food restaurant, quits out of frustration, and is jobless – but then, a year later, bingo! She lands a high-paying job in Silicon Valley. (If you had measured the impact of her college degree after six months, it would have been disappointing, but if you had measured it after a year, it would have been very positive!) Other developments start rapidly but eventually fade away. Think of the flower that blooms so prettily in spring but eventually wilts, as she must. Still other changes show modest, steady improvement and then level off – think of the patient who, after starting the pills, feels a bit better each day, and after a week is back to normal, fully recovered.
This subject is one of several touched on in a fascinating episode of Freakonomics Radio describing research by Raj Chetty and colleagues on a 1990s program, “Moving to Opportunity,” in which poor families in poor neighborhoods were given the chance to move elsewhere. The initial results, in terms of changes in earnings after ten years, were disappointing. But when the researchers repeated the study ten years later, they found that the very young children in those families, who were by then of working age, were earning significantly more. It was a matter of timing (and of realizing that the main beneficiaries, at least in monetary terms, were the toddlers).
What I’ve described above is not an argument against doing evaluations, or an argument against trying to measure impacts as accurately as you can. It is a call to take into account the nature of our world. It is extremely rare for anything to change in a constant, upward direction.
So, what does this mean for evaluations? For users of evaluation research, it means: don’t treat those scientifically precise results as immutable. They tell you about the magnitude of a change at a specific point in time. If sustainability is an important issue (and it usually is), you should do follow-up evaluations at regular intervals. If that sounds expensive, then perhaps don’t spend all your funds on one evaluation that will give you precise but chronologically arbitrary results.
For us evaluators, it means that we need to talk about the results we obtain in a way that values accuracy and rigor yet doesn’t fetishize precision. We need to acknowledge the limitations while probing more deeply into how those changes are occurring. When conducting research, we must consider the trajectory of a change. We should take a deeper interest in the chronological context by asking how the changes have come about: how quickly or slowly? What do people experiencing or observing the changes expect in the future? This may not be a scientific method, but I would argue it embodies the scientific spirit by asking important questions about how a change transpired and what path it took. It means recognizing the ephemeral. It means accepting that, in most cases, we grasp at answers at a moment in time, and that tomorrow those answers might be completely different.