Not all Norwegians are blond, or “why we’re so diverse, but you’re all alike”

Out-group homogeneity effect

There is a concept that social psychologists refer to as out-group homogeneity effect.

We perceive members of our own group to be relatively heterogeneous, i.e. we see variation. Everyone else, so-called “out-group members,” however, seems relatively homogeneous.

In other words, we tend to think of our group as a mosaic and people from other groups as monotone.

People really do see more variation in personality among in-group members, a perception confirmed by a number of studies.

It’s an intuitive concept. We know the people we spend more time with (our in-group members) better. Whatever we have in common, whether it’s ethnicity, a workplace, or family ties, we are aware of their individual personalities and idiosyncrasies. Because they’re around us all the time, we have to distinguish them from one another.

We have less information about people from groups we don’t have as much contact with, Norwegians, for example. (Full disclosure – I’m a quarter Norwegian.) We’ve heard they eat a lot of fish, win lots of medals at the Winter Olympics, drill for oil. We might know that they’ve descended from Vikings but are now apparently very socially-minded Nordics.

But the view from inside Norway is, understandably, rather more nuanced than the stereotype. Norwegians are only slightly more blond and blue-eyed than people in the rest of the world. And there is a raging debate happening right now between citizens who want the country to stop oil and gas exploration (it is one of the world’s leading exporters of both) as it aims for net-zero emissions, and those who point to the potential job losses of such a move. Of course, Norwegians do share similarities, but the population is not a monolithic bloc of Jarlsberg cheese.

A natural response

The out-group homogeneity effect makes sense from a biological perspective, too. My research assistant, Hannah Rosenthal, points out that the farther away you are from a group, the more homogeneous it looks, simply because our eyesight has its limits. This is not just true for the nearsighted. The greater the distance, the harder it is to make out the details. You see the forest, but not the trees.

Taking the long view, stereotypes and biases have probably served us quite well. They probably saved a few lives.

When our ancestors first encountered a spear-carrying stranger from another tribe, it would have been prudent to first consider him as just like all the other “others” they’d come across. That is, as a possible threat. Getting into a defensive crouch to size him up would have been a smart first reaction. Only after examination, and exchanges, would the other tribe member solidify into an individual, perhaps one to be trusted.

Why it’s a problem

“So what?” you might respond.

Well, out-group homogeneity effect happens to be a source of bias. It leads to stereotyping, an oversimplified belief that people who share certain characteristics are pretty much all the same.

When we think of out-group members as more similar to one another, they risk being stereotyped, seen as interchangeable rather than as complex and unique individuals.

This also raises an interesting question. Do in-group members feel less need to become more diverse because the inside perspective feels quite heterogeneous already, thank you very much?

When you think about it, it would be odd if such a bias didn’t exist.

This is not an excuse to say, “Hey, what can I do? I’m human, we’ve all got our biases.”

Rather, it is a reminder to reflect on the fact that people in other groups are probably just as diverse as the group you belong to. That applies to groups based on ethnicity, political ideology, nationality, profession, and any other group characteristic you can come up with.

Getting up close and personal, that is, spending time with people from that “other” group, is a good remedy.

*Photo by Hudson Hintze on Unsplash



What the coronavirus teaches us about time and risk

How bad are the odds of dying from coronavirus?

That’s a question on many people’s minds these days. A lot of us are following the numbers closely and wondering, how safe am I?

To truly understand risks, you must account for the time period for which those risks are valid.

According to recent forecasts by the University of Washington’s predictive model, the U.S. may lose up to 68,000 people to Covid-19 before the pandemic is over. That number looks low: as of April 24, 2020, the U.S. has already recorded more than 50,000 deaths, and well over 2,000 per day are being added to the toll.

If this forecast is accurate, the risk of dying would be 1 in 4,853. Not great, but not alarming, either, at first glance. Those odds are lower than those of dying in a pedestrian accident, which are 1 in 541.
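The 1 in 4,853 figure is just the forecast death toll divided into the U.S. population. A minimal sketch of that back-of-the-envelope arithmetic (the 68,000 and 330 million figures come from the text):

```python
# Back-of-the-envelope odds: population divided by forecast deaths.
# Both figures come from the text: ~330 million people, up to 68,000 deaths.

def one_in_n(deaths: int, population: int) -> float:
    """Return N such that the average risk of dying is 1 in N."""
    return population / deaths

odds = one_in_n(deaths=68_000, population=330_000_000)
print(f"Average risk of dying of Covid-19: 1 in {odds:,.0f}")  # 1 in 4,853
```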

What’s wrong with this picture? Why are extreme lockdown measures being taken at the national, state, and local levels to deal with Covid-19, but not to mitigate many other, often preventable causes of death?

Even if the U.S. were to lose 2.2 million people, based on an alarming early forecast (which did not take into account containment measures), the risk to any one individual would still be “just” 1 in 166. That is far lower than dying of heart disease (1 in 6), and still lower than the risk of dying in a car accident (1 in 106), or even, supposedly, just from falling (1 in 111). (These estimates come from the U.S.-based National Safety Council.) The fatality risk per person from base jumping (pictured above) is around 1 in 60 annually.

Here’s the rub: the seemingly low “risk of death” odds related to the coronavirus are misleading. Why? Because they don’t account for the time span over which they play out.

Time is a key factor when computing odds of anything happening. You see, you may die in an accident over the next 40 years. But you may die of Covid-19 in the next 40 days.

That, in a nutshell, is why our response to dealing with Covid-19 is so urgent and extreme when compared to our response to dealing with heart disease, car crashes, and other fatalities.

The virus has exposed the fragility of our society

Covid-19 has revealed how fragile both our societies and economies are, at the local and the global level alike. It is a story of how a microscopic pathogen has wreaked global havoc. With a diameter of just 60-140 nanometers (laid end-to-end, roughly a quarter of a million would span an inch), the virus is nature’s nano-weapon against humanity.

The odds of succumbing may appear fairly low, but Covid-19 is scary because it is an imminent threat. In this post I’m going to explain why.

The invisible weapon against humanity

Let’s review. The coronavirus is adept at homing in on the most vulnerable among us, to disable and even kill. However, there is no guarantee that those who appear fit and healthy have dodged a bullet, as they can succumb as well.  But unlike being hit by a “normal” weapon, in many cases people will never even know they have been “attacked.” In other cases, it will only become apparent days later.  

Worst of all, the virus has weaponized humans against each other. We, the most social of creatures, can’t come physically close to one another because we might accidentally cause each other to die.

And so, almost every government in the world has told its citizens to keep a distance of at least six feet from each other. Entire countries are under stay-at-home orders of varying strictness. Businesses have been shut, economies are under massive strain and headed for collapse. It is as though the Great Depression is being resurrected like a zombie from the dead.

The odds of not making it

As bad as things are, in a way, things could be much, much worse. Only a small share of each country’s population shows symptoms of being infected, i.e. are reported as Covid-19 cases.

There is general agreement that these case numbers—tracked by Johns Hopkins and avidly followed by perhaps millions of people—are significantly under-counting the actual cases. The actual number of persons infected is not yet known, given the still unacceptably low testing rates in most countries. Even so, even if the rate of dying doubled, the chance of any single person succumbing would remain very low.

A study from the Stanford Prevention Research Center estimated the absolute risk of dying among individuals younger than 65 without underlying diseases at anywhere from 1.7 per million in Germany (a country that has so far managed to keep death rates relatively low) to 79 per million in New York, the U.S. epicenter of the virus. That comes to a relatively minuscule risk of 1 person out of 588,000 in Germany, and a still very low risk of 1 out of 12,658 in New York state.

These odds depend on age, gender and underlying health conditions, among other factors. Men, for example, are more at risk than women. And the risks are much greater for older adults than for the young.

In the U.S., the pandemic is taking a critical toll on African-American and Hispanic communities, due to the lethal combination of worse underlying health conditions and lower rates of health insurance coverage.

Let’s take demographics out of the equation for now, though, and consider the entire U.S. population of about 330 million people. That includes infected and non-infected, of course. And let’s remember that we don’t know the actual rate of infection, because not everyone is tested.

If U.S. fatalities from Covid-19 reach 68,000 before this is over, the absolute death rate for all ages would come to about 1 in 4,853. Looked at another way, if those are the odds of the average American dying of Covid-19, it means that 4,852 out of every 4,853 will survive this.

I am not minimizing the importance of the measures taken to try to control the virus. In fact, the projections are fairly “low” largely because of all the social distancing measures being taken. This may be lost on the “liberate America” protesters.

Healthcare and hospital systems are under strain in many areas. Clearly, they are not prepared to handle a surge in patients, just as morgues and cemeteries were not prepared to handle them. Doctors, nurses, and healthcare workers—the real heroes of this crisis—are putting their lives on the line while caring for patients. Often, they don’t have the equipment they need.

An outside observer might say, why all the fuss? Why the lockdowns if the risk of dying is so low, not much higher than a bad normal flu season?

Compare these numbers to the Black Death, which hit Europe in 1348. An astounding 30-60 percent of the population may have died. For those going through it, it must have been apocalyptic. Here’s a firsthand account from a chronicle kept by William de la Dene at the cathedral priory of Rochester 30 miles east of London, written in the year the plague arrived:

“A great mortality … destroyed more than a third of the men, women and children. As a result, there was such a shortage of servants, craftsmen, and workmen, and of agricultural workers and labourers, that a great many lords and people, although well-endowed with goods and possessions, were yet without service and attendance. Alas, this mortality devoured such a multitude of both sexes that no one could be found to carry the bodies of the dead to burial…”

We can count ourselves lucky that we’ve only got Covid-19. (But this does not mean we can afford complacency. The next virus to hit us could conceivably be much worse.)

Covid-19 and the other risks we face

To return to my initial argument: when you compare the odds of dying of Covid-19 with the odds of dying from other causes, it is essential to understand over what time period each risk plays out.

That is because there seem to be many other causes of death for which the odds of dying are much, much higher. They include cancer (1 in 7), suicide (1 in 86), car accident (1 in 106), or gun assault (1 in 298). Even the chances of dying from choking on food (1 in 2,618) or from being hit while bicycling (1 in 4,060) seem to be worse than dying from Covid-19, at least if you’re under 65.

The U.S.-based National Safety Council provides a long list of those odds:

Lifetime odds of death for selected causes, United States, 2018

Heart disease: 1 in 6
Cancer: 1 in 7
All preventable causes of death: 1 in 25
Chronic lower respiratory disease: 1 in 26
Suicide: 1 in 86
Opioid overdose: 1 in 98
Motor vehicle crash: 1 in 106
Fall: 1 in 111
Gun assault: 1 in 298
Pedestrian accident: 1 in 541
Motorcyclist: 1 in 890
Drowning: 1 in 1,121
Fire or smoke: 1 in 1,399
Choking on food: 1 in 2,618
Bicyclist: 1 in 4,060
Sunstroke: 1 in 7,770
Accidental gun discharge: 1 in 9,077
Electrocution, radiation, extreme temperatures, and pressure: 1 in 12,484
Sharp objects: 1 in 29,483
Hot surfaces and substances: 1 in 45,186
Hornet, wasp, and bee stings: 1 in 53,989
Cataclysmic storm: 1 in 54,699
Dog attack: 1 in 118,776
Lightning: 1 in 180,746

The odds above show the lifetime risk, based on 2018 data, of someone in the U.S. dying of these various causes, not the risk of dying of any of them during the year 2018 itself. (If that were the case, 47 million Americans would have died of cancer in 2018, not the estimated 609,000.)

Knowing that your chance of one day dying of cancer is 1 in 7 may be unpleasant, as may the 1 in 541 odds of dying in a pedestrian accident. But these risks do not reflect the chance of dying within the next few months; they play out over the rest of your lifetime, which may extend for many more decades.

The average age of the U.S. population is 38.2 years, and average life expectancy is 78.5, so the odds in the table above really cover, on average, 40 years of a person’s life span.

At 50 years old, I still have, actuarially speaking, about 34 years left to live, and 34 years during which I could die as a pedestrian crossing the street. I don’t like to think about it, but I can handle it. But I could die of Covid-19 in the next month. For a proper comparison, the 1 in 541 lifetime odds must be spread over those 408 months (12 months times 34 years). That makes the odds of me dying as a pedestrian next month about 1 in 220,000.
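That conversion can be sketched in a few lines. As a simplification, it assumes the lifetime risk is spread evenly over the remaining months:

```python
# Convert lifetime 1-in-N odds into rough odds for the next month,
# assuming (simplistically) the risk is spread evenly over one's
# remaining lifespan. Figures from the text: 1 in 541 lifetime odds of
# dying as a pedestrian, about 34 actuarial years left at age 50.

def monthly_one_in_n(lifetime_one_in: float, years_remaining: float) -> float:
    months = years_remaining * 12     # 34 years -> 408 months
    return lifetime_one_in * months   # odds lengthen as the window shrinks

n = monthly_one_in_n(lifetime_one_in=541, years_remaining=34)
print(f"Odds of dying as a pedestrian next month: about 1 in {n:,.0f}")
# about 1 in 220,728, i.e. the "about 1 in 220,000" above
```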

Now look again at the odds of dying of coronavirus: 1 in 4,853 in the very near future. Things look quite different.

To take the extreme case, the risk of any one of us dying eventually is, unfortunately, 100%. As the Game of Thrones saying reminds us, valar morghulis (“all men must die” in High Valyrian). At the other extreme, the risk of any one of us dying in the next 24 hours is close to zero.

There you have it. This is why there is, to put it mildly, a rather significant time dimension to risks.  And that is why Covid-19 is so brutal. It is not about the possibility of dying from it, one day, maybe in a few years, maybe far in the future. It is that you, your loved ones, your friends, your colleagues, may die of it tomorrow.

Be prudent, be safe, and treasure the life you have. Remember, the biggest risk in life is to die without having lived.

But let’s leave the last word to Game of Thrones author George R.R. Martin. As the inimitable Tyrion Lannister puts it, “Death is so terribly final, while life is full of possibilities.”


Check your outlier – is it a symptom or an anomaly?

The shocking United Airlines ejection of a passenger was an outlier

This week I’d like to talk about outliers. These are people, events, or data points so far from the norm that they attract unusual amounts of attention. Outliers make the news. Most of the news stories you read concern exceptions or unexpected events. They grab our attention. The disturbing United Airlines scene of a paying passenger being forcibly and violently ejected from the plane represents just such an outlier.

Last week, a passenger was brutally dragged off a United Airlines flight in Chicago by security guards who broke his teeth and nose, leaving him with blood streaming down his face and a concussion. He had refused to give up a seat he had paid for, and United decided that he and three other passengers (chosen randomly, according to some algorithm) had to leave the plane. The airline had offered $800 travel vouchers, but apparently no one had taken them. So it decided it was time to draw straws. And what for? So space could be made for four airline employees arriving at the last minute for the flight, bound for Louisville, Kentucky.

The scene of security personnel pulling the 69-year-old Asian American (he was born in Vietnam) through the aisle, to the horrified looks and gasps of other passengers, was filmed on smartphones and duly posted to the web. It has generated a huge outcry and calls for a boycott of United. The company made things worse when CEO Oscar Munoz issued a terse non-apology, blaming the passenger, Dr. Dao, for being “disruptive” and “belligerent.” Munoz wrote: “I apologize for having to re-accommodate these customers,” which is about as far away from saying sorry as it gets before entering antonym territory.

Meanwhile, there are plenty of news articles predicting that the public will get over it, and United Airlines will weather the storm. This is because many companies have overcome this type of scandal in the past, and because many customers don’t have a lot of choice when it comes to airlines. Consolidation among the big legacy airlines, blessed by the regulators, has ensured that.

The shocking incident is highly unusual, which is why it generated so much attention. We are used to hearing about passengers being escorted from flights for misbehaving, but in this case it was the airline which misbehaved (even if it did outsource the job to airport security).

Outliers – good for the news, but not so good for research?

In the research world, outliers are not a news opportunity. Most of the time, they’re viewed as a problem, sticking out like a sore thumb. Outliers raise questions about data reliability and validity. They distort the mean, leaving an inaccurate impression, even if the data is 100% correct. (Remember the one about how Bill Gates walks into a bar and suddenly everyone is a billionaire, on average?) The solution? Outliers are typically ignored or dropped from datasets by researchers.

But outliers can mean different things. They can be symptomatic or anomalous. Maybe an outlier highlights a larger problem, represents the tip of the iceberg, a leading indicator, a canary in the coal mine, i.e. a symptom of some larger phenomenon or a trend that is about to break. Or maybe it represents an exception to the rule, a bad apple, and can justifiably be disregarded. It is also possible that the outlier is an error. Before deciding how to deal with them or react to them, we need to understand what they mean.  If they’re signaling something, then even researchers need to take a closer look at them. Like a doctor, you should check your outlier in order to make a diagnosis.
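As a toy illustration of that checking step, with made-up numbers rather than data from any real study, here is how a researcher might flag an outlier with the common 1.5 × IQR rule before deciding whether it is a symptom, an anomaly, or an error:

```python
# Flag outliers with the 1.5 * IQR rule, then inspect them -- don't just
# drop them. The values below are invented; note how the single extreme
# value drags the mean while the median barely moves (the "Bill Gates
# walks into a bar" effect mentioned above).

import statistics

values = [32, 35, 38, 40, 41, 43, 45, 47, 50, 900]

q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < low or v > high]

print("mean:", statistics.mean(values))      # pulled far up by the outlier
print("median:", statistics.median(values))  # robust to it
print("worth a closer look (or a cup of tea):", outliers)
```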

In a 2002 article, Vijayendra Rao advocates “having tea with an outlier,” i.e. looking more closely at what they represent, and maybe even talking to them if they represent a survey respondent, to get a different perspective on the issues.

The outlier may have different intrinsic characteristics that set it apart. I was once asked to find a poor person in a village in Kazakhstan, in a region where I was conducting an evaluation for the Asian Development Bank. It wasn’t easy, because the people we talked to didn’t really consider themselves poor and had to scratch their heads when we asked them to point us toward a poor household. Finally, my team members and I were directed to the home of a single mother. We brought her some groceries and knocked on her door. She invited us in, and we sat down and talked to her. It turned out that she had emigrated from Ukraine (I can’t remember whether her husband had died or merely left her) and thus lacked a social support network. She had problems with her papers. There were also health issues. I don’t believe we actually had tea. She was an anomaly, an exception. She didn’t represent the typical inhabitants of the region. While we learned about what kind of factors might drive people into poverty, her case didn’t tell us much about poverty issues among the population as a whole.

The case for symptomatic

But if the outlier represents an extreme case of a phenomenon that is happening to a lesser degree elsewhere, then it takes on a different meaning. What do we have with United? I would argue that the case is symptomatic, not anomalous. Indeed, although it was well outside the norm (the chances of a passenger getting bumped from a flight remain 1 in 10,000, and the chance of losing your teeth in the process remains vanishingly small), the mood against United had been building for some time, which helps explain the outrage. United certainly did not have a good customer service reputation prior to the incident, and the extreme mistreatment represents, in many people’s eyes, all that is wrong with the company. The frustration and anger over poor service boiled over. United’s reputation was already solidly second-rate. It ranks 66 out of 100 global airlines, according to one survey.

My own experience flying United is far from pleasant, and presumably widely shared. Anyone who has flown with a European, Gulf region or Asian airline will know that US carriers in general deliver poor service. While with the better international carriers you might feel as though you were their guest, on most American carriers you feel like a revenue source that, inconveniently, must be processed, takes up physical space, and requires (minimal) attention. They get away with it because of a lack of competition, and because they know passengers put up with it because of the relatively low prices.

The irony is that Americans on the whole don’t tend to be unfriendly; quite the opposite. But once hired by an airline, I can only assume that they are processed through some kind of training module which strips away as much of their humanity as possible (although you will occasionally interact with a friendly crew member or ground staff whom the system clearly failed to process).

Finally, CEO Munoz’s initial reaction to back his staff and essentially blame the passenger was very telling, and fairly indicative of United Airlines’ attitudes in general. Judging by that first response, for Munoz the incident was not such a big deal. In other words, it was within the bounds of normalcy. That suggests it was symptomatic, not anomalous. Granted, Munoz did issue a proper apology days later, but it was tainted by the strong suspicion that it was a reaction to the airline’s falling share price, not some sort of recognition that its approach to customers in general is woefully lacking in common decency.

So while the news articles arguing that United will survive this debacle may be correct, that doesn’t mean this particular extreme behavior means nothing. I believe the evidence supports the view that it reveals a lot. It is symptomatic of a much larger problem: in a word, disrespect toward passengers, those important but still annoying revenue streams.

(If you’re curious, I will do my best to avoid giving United my business in the future, even if it costs extra.)

Inadvertent airline humor

I leave you with some one liners. They are taken verbatim from a link on United Airlines’ own website called, without a trace of irony, Shared Purpose and Values:

  • We Fly Right: On the ground and in the air, we hold ourselves to the highest standards in safety and reliability. We earn trust by doing things the right way and delivering on our commitments every day.
  • We Fly Friendly: Warm and welcoming is who we are.
  • We Fly Together: As a united United, we respect every voice, communicate openly and honestly, make decisions with facts and empathy, and celebrate our journey together.
  • We Fly Above & Beyond: With an ambition to win, a commitment to excellence, and a passion for staying a step ahead, we are unmatched in our drive to be the best.

Even setting aside the passenger ejection incident, anyone who has ever flown on United – an airline with some of the most unhelpful and unfriendly employees in the world – will be forced to acknowledge that they do have a dry sense of humor.


Give me a number, any number

When interviewing people as part of an evaluation, at some point I like to put them on the spot.

The interviewees will be well-informed about the program under evaluation. That’s how they were selected. They might be policy makers, managers, program implementers, sector specialists, or some other type of what we in the business call “key informants,” or “KIs.” The interviews are semi-structured, with a pre-determined set of questions or topics. That means the answers can be open-ended, in contrast to surveys, where most responses need to be kept succinct. The open-ended format allows the interviewer to probe, follow up, or clarify particular points. It’s not so different from how a journalist interviews a subject, or a police detective interviews a suspect. It’s a process of discovery as well as a matter of answering straightforward questions about the program.

During the interview, as we go through the questions and the respondent shares her assessments and opinions (often in many shades of grey) I’ll press her to take a stand and defend her position. I’ll ask for a number. I want a number that summarizes, say, her subjective assessment of the program.

Sure, I like words. I’m a writer, after all. You can learn a lot from words, but there are just so darn many out there! In its ability to synthesize a story, a number can almost be poetic…

In the middle of a discussion about Program A, I’ll ask something like: “Now, based on everything you know about this energy development project, how would you rate its impact on a scale of 1 to 5, where 1 means you noticed no impact at all and 5 means you noticed a very strong impact?” The key informant will come up with a number, say a “4.” I make a note of it, and then follow up with something like “Please elaborate. Tell me why you rated it a 4?” And the interviewee builds a case, offering more details, providing a rationale.

Sometimes the response you get is a surprise. A key informant will be criticizing a program left and right, but then rate it 4 out of 5. Why the apparent disconnect? Apparently all those criticisms carried less weight for the respondent, in the grand scheme of things, than I had assumed. You see, if I hadn’t asked for a number, I might well have walked away from the interview thinking, “Hmm, they thought that program sucked.” In fact, they thought the program was pretty good, but just had some caveats. You can find the opposite scenario as well, of course. A person speaks positively about a program, then rates it a 3. It could be they just had very high expectations.

These are scale questions, referred to as Likert-type scale questions after the psychologist who invented the concept, Rensis Likert. Such rating questions are extremely common in surveys. Who hasn’t responded to an online or in-person survey that included a scale question? Online sites including Amazon, Netflix, TripAdvisor, and Yelp use a similar approach to get us to rate products or services.

The concept has something in common with the efficient market hypothesis, which states that share prices reflect all currently available information. All the negatives, positives, and expectations are priced into that one number. Doctors use a pain scale to gauge a patient’s chronic pain. Therapists will ask their clients to rate their depression using a scale. Similarly, evaluators might use a scale to understand the degree of effectiveness of an intervention, or any number of other issues.

Typically, the scale question is used in closed-ended interviews, as part of surveys. Responses can be analyzed to obtain an aggregate measure across all 1,243 respondents, as well as for different subgroups. For example, you might find that, on average, female participants rate the program’s ability to improve their lives at 4.3, while male participants rate it at 3.8. Done well, this line of inquiry can be a valuable method for taking a population’s pulse on an issue.
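A sketch of that kind of subgroup aggregation, using invented ratings (the 4.3 vs. 3.8 split in the text is illustrative, and so are the numbers below):

```python
# Aggregate 1-5 scale ratings by subgroup. The responses are made up for
# illustration; a real analysis would also report counts and spread, not
# just the mean.

from statistics import mean

responses = [  # (subgroup, rating on a 1-5 scale)
    ("female", 5), ("female", 4), ("female", 4), ("female", 4),
    ("male", 4), ("male", 4), ("male", 3), ("male", 4),
]

by_group: dict[str, list[int]] = {}
for group, rating in responses:
    by_group.setdefault(group, []).append(rating)

for group, ratings in sorted(by_group.items()):
    print(f"{group}: average rating {mean(ratings):.2f} (n={len(ratings)})")
```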

The nice feature of open-ended interviews with key informants is that you are able to do a little digging after they’ve coughed up a number. In a survey, if you ask respondents to explain their answers, things can get complicated – those longish answers can’t be easily summarized numerically. (Of course, open-ended answers can be coded according to type, but then you lose a lot of that rich detail along the way.) You don’t face that constraint when conducting qualitative research. You face other constraints instead.

I find that rating questions are a good way of cutting through the dense jungle of information you can be pulled into when doing research. You’re walking along, making your observations of the flora and fauna, taking as much in as you can, using your machete to clear the way, trying to figure out whether you’re heading in the right direction (i.e. testing incoming information against your hypothesis, the path which may or may not lead you to the truth). And then you emerge onto a rocky outcropping…and all at once see the whole rainforest spread out below you. Aha! So that’s how the program looks.

Rather than being a reductive or sterile exercise, I’ve found that people being interviewed rather like this type of questioning. They appear to enjoy the exercise of mentally processing large amounts of information on a subject to generate a single number. And they like explaining how they got there.

Essentially, this is a way to leverage people’s cognitive functions. You’re engaging them in a kind of meta-cognition exercise, in which they examine and explain their own thought processes.

Try it out on yourself. On a five-point scale, rate your satisfaction with, say, the place you live; your own job performance…or how much of your life you spend online. Then justify that number in words. You will most likely find that your brain immediately begins sorting through a whole succession of factors, lining up the pros and cons, weighing them against each other.

It may be that I’ve spent far too much of my life evaluating stuff, but I honestly find this exercise quite revealing, stimulating even, in a cerebral sort of way.


Making the case for credibility

In an op-ed for the Washington Post on February 3, discussing journalism, Ted Koppel wrote that “we are already knee deep in an environment that permits, indeed encourages, the viral distribution of pure nonsense.” What is disconcerting is that many people may not care, as long as the nonsense aligns with their worldview.

Take note, evaluators, and anyone for whom collecting evidence is important. If until now the critical issue was ensuring your evidence was credible, henceforth the challenge may be convincing others that credibility even matters. We have entered an era in which information has gone from being something more or less firm to something fluid.

The term ‘post-truth’ was selected by Oxford Dictionaries as the word of the year for 2016, defined as “Relating to or denoting circumstances in which objective facts are less influential in shaping public opinion than appeals to emotion and personal belief.” While Oxford Dictionaries has highlighted a serious problem with its selection, I think it would be more accurate to call post-truth the euphemism of the year. Post-truth just smells Orwellian; its academic-sounding prefix adds a gloss of respectability to an insidious practice. Post-truth is not related to truth in the way post-modernism is related to modernism. The term hardly deserves to be dignified. A better description is ‘anti-truth.’ This would more accurately and honestly convey what happens when half-truths and falsehoods are spread, aimed at degrading the consensus on reality and contaminating public discourse.

Yes, there are grey areas when it comes to information. It can have multiple meanings. You can argue opposing sides by marshaling selective facts to make your case. (Lawyers are trained to do this.) Thus, it is accurate to note that under the Obama administration (January 2009 to January 2017) unemployment fell from 7.8 to 4.8 percent, which is a good thing. But you can also point to a fall in labor force participation rates from 65.7 to 62.9 percent. Not such a good thing. But in order to have a meaningful argument rather than, say, a shouting match, the basic facts must be accepted by, and accessible to, all. If one side says, we don’t trust the US Bureau of Labor Statistics (where these data come from), they’re just a bunch of liars, then there is no basis for conversation.

What seems to be occurring is that one side has become increasingly less interested in engaging in a meaningful argument and is happy to make stuff up, i.e. invent facts. And when credible evidence is produced, it is now often derided as false. Take, for example, the controversy over Obama’s birth certificate: although the certificate was made available in 2011, as of 2016, 41 percent of Republicans disagreed with the statement in an NBC News|SurveyMonkey poll that “Barack Obama was born in the United States.”

Will the skepticism of credible sources filter down to the technical research work conducted in the social sciences? Let us hope not, although the new Administration’s gag order on scientists in federal agencies is not encouraging. We may have to confront a whole new dilemma. No longer will it be sufficient to provide credible evidence, transparency of methodology, and detailed information on sources. We may need to defend the very concept of credibility, and make a case for why credibility matters to those who disagree.


An Illuminating Case: Mixed Methods and Reconciling Conflicting Findings


Warning – Contradictions Ahead!

I’m a big fan of using mixed methods in evaluation. That means combining qualitative and quantitative data, such as statistical data on household energy consumption and interview findings, where people reveal what it’s actually like to live with regular blackouts. This is not just because it’s always interesting to poke around at problems from different angles, but because the resulting analysis is generally much more layered and nuanced. Let me use a case study from a few years ago to illustrate what I’m talking about.

Applying different methods to the same problem is like stepping into the shoes of different blind men around an elephant, all trying to determine the nature of the creature. Instead of standing in front, holding the trunk and guessing it’s some kind of snake, or standing alongside, grabbing a leg and guessing it’s a tree, and so on, you test every dimension by shifting your position. A key advantage in using mixed methods is the ability to triangulate findings. Not only can you compare and cross-check your results – when done well, a more multi-layered, nuanced version of the underlying issues emerges. On the other hand, doing so can also complicate your life: there is always the risk of ending up with findings that don’t make a lot of sense.


Nonetheless, when findings derived from different methods not only highlight different attributes, but point in different directions, that is when the most interesting decision points and nuanced analysis can arise. Assuming the problem does not lie with the methodology or data collection itself (something which should always be checked), evaluators will need to do some probing to figure out why the data points in different directions.

We faced this issue as part of a 2004 World Bank-financed evaluation of the household impacts of electricity sector reforms in Moldova, using the ‘poverty and social impact analysis’ (PSIA) approach. The accompanying steep increase in electricity tariffs was seen by some as hurting the poor, and a reason to roll back reforms.

The data we analyzed showed a marked improvement in electricity consumption among the poorest 20% of households. However, the perception among poor households, gleaned through qualitative methods (focus groups), painted a less than rosy picture. We heard a lot of complaints from them. Based solely on the quantitative evidence, our report might have concluded that all was well, that concerns over hurting the poor with high prices were overblown. Conversely, if we had only used qualitative findings, we might have concluded that the reforms were indeed a negative for the poor.

Is raising electricity tariffs good or bad for the poor?


A bit of background: by 2003, with World Bank assistance, Moldova had privatized two-thirds of its electricity distribution network (the part of the grid which delivers electricity to customers). The aim was to improve sector performance by moving it to a commercial footing. Then, as now, Moldova was one of the poorest countries in Europe. A Spanish operator, Union Fenosa, had won the tender, with a pledge to invest $54 million in upgrading the infrastructure.

Prior to privatization, bills had gone unpaid, operating costs had not been covered, barter payments were common, and the network was severely degraded. Outside of Chisinau, the capital, most people could only count on a few hours of electricity per day. However, a Communist government had recently been elected and, with electricity tariffs rising steadily for years, some policy makers were now arguing that the reforms were having a punitive effect on poor households. There were fears among the donors that the privatization would be reversed, returning the distribution networks to state control. This is what triggered the study.

I was part of a team of Bank staff and consultants tasked with evaluating whether the poor were, in fact, worse off now than before the reform. As tariffs increased, were they consuming less electricity? Or were they perhaps cutting back on consumption in other areas? Either scenario would have pointed to a negative welfare effect.

The methods behind the madness

By matching billing data provided by the electricity company with data on household expenditures (from the country’s Household Budget Survey), we were able to track consumption patterns over the five-year period which coincided with tariff rises and privatization. Tariffs had begun climbing prior to privatization (to make the sector more attractive to potential bidders) and continued to rise afterwards. Over that period, they rose a whopping 300% in nominal terms, which translated into 26% in inflation-adjusted terms. This meant that the cost of electricity rose slightly faster than the average cost of other goods in the household consumption basket. That fact alone, however, does not provide a sufficient basis for a conclusion about welfare.
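For readers curious how the 300% nominal figure shrinks to 26% in real terms, the relationship is simple compounding. As a back-of-the-envelope sketch (assuming both figures are cumulative over the same period, as the text suggests), the two numbers together imply cumulative inflation of roughly 217%:

```python
# Relationship between nominal and real (inflation-adjusted) increases:
#   (1 + real) = (1 + nominal) / (1 + inflation)
# Figures from the text:
nominal_increase = 3.00   # tariffs rose ~300% in nominal terms
real_increase = 0.26      # ~26% after adjusting for inflation

# Solving for the cumulative inflation implied by the two figures:
implied_inflation = (1 + nominal_increase) / (1 + real_increase) - 1
print(f"Implied cumulative inflation: {implied_inflation:.0%}")  # ~217%
```

In other words, most of the headline tariff increase was eaten by Moldova’s high inflation over those years; the real price signal facing households was far more modest.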

With this in mind, we did not just analyze the data; we also talked to the population directly (thereby mixing our methods). A local research firm held 43 focus group discussions and 59 key informant interviews across the country. We wanted to hear the perspective of the users, the electricity consumers, on these changes. Many focus group participants related how bad conditions had become during the, literally, dark years after independence.

“About two years ago a girl was raped, and now I don’t permit them to leave the house in the night,” reported a woman named Anea.

“Everybody, children, the elderly, grown-ups, are affected by the darkness in the entrance hall and on the streets. It is very difficult to walk when we return home late in the evening. The roads are in a bad state, full of holes and one can easily fall down and break the neck,” Tamar said.

“In our entrance hall, the bag of an old woman was stolen, together with her pension for one month,” according to Elena.

What did the data show? We found that over the period electricity consumption among the poorest 20% had risen by 14.6%, even as electricity tariffs rose more than 300% in nominal terms and 26% in real terms. Of course, people weren’t consuming more electricity because the cost went up; they were consuming more because their incomes were increasing. We also found that among all other households (which we designated the ‘non-poor’), consumption rose by 3.2% (i.e. the poor experienced bigger consumption gains). Perhaps more importantly, there were no more rolling blackouts. From having on average 4 hours of electricity per day, everyone now enjoyed electricity service 24/7. Based on the data, we could confidently say that the poor were not being hurt by privatization with the concomitant tariff increases. (We did not seek to determine whether the electricity reform made the poor better off, a rather more complex and somewhat subjective question.)

Data vs. perceptions

So was the government wrong? Were the poor pleased with the changes? Hardly. Although many people acknowledged that things were better – they now had electricity 24/7 – the overriding message that emerged from focus group discussions was that many people were unsatisfied. They did not feel better off. They certainly did not express undying gratitude for the reforms. They complained about costs going up, about having to save, and about quality issues (e.g. voltage fluctuations). And they were unhappy about what they perceived as Union Fenosa’s excesses. The company had built itself a multi-story headquarters building, bought new company vehicles, and sent out electricity bills printed in colored ink on high-quality paper (replacing the cheap brown paper, a Soviet legacy, used previously). To top it off, it was seen to be spending money frivolously by paying for schoolchildren to go on outings to the circus and other measures aimed at burnishing its corporate image. All of this did not seem to impress the average Moldovan, who felt the company should instead have kept its tariffs lower.

Returning to our quantitative findings in the light of public perceptions, one particular piece of information was of critical importance for setting the context: electricity consumption levels in Moldova at the time of privatization were extremely low. The average household consumed just 50 kWh per month, the equivalent of a couple of light bulbs and a TV. This was close to one third of the Eastern European average, and a small fraction of developed-country consumption levels. Many households (exactly half, of course) fell below the average increase in electricity consumption, and many had to reduce their consumption as tariffs climbed.

“I think that the reform had a positive impact; the regular blackouts prior to the reform practically stopped the activity in many fields. However, there is the price issue,” according to Vasile.

“During winters I unplug the refrigerator, hence I pay 35 lei less per month,” a participant named Victoria said.

Did the qualitative findings invalidate our quantitative ones? I would say no; the two were complementary. They helped explain why the government had been hearing mostly negative feedback about the privatization. While we found that the poor were not being hurt financially, it would have been a stretch to say that privatization had been a boon for them. The difference between 50 and 57 kWh per month is not exactly a game changer or cause for rejoicing if you’re a poor household.
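The 50-to-57 kWh comparison follows directly from the numbers reported earlier. A minimal sketch, applying the 14.6% consumption increase for the poorest quintile to the 50 kWh/month average base cited in the text (the poorest households likely started even lower, so this is an upper-bound illustration):

```python
# Low-base arithmetic behind the "50 to 57 kWh" comparison.
# Figures from the text:
base_kwh = 50.0        # average household consumption at privatization, kWh/month
poor_increase = 0.146  # 14.6% consumption rise among the poorest 20%

new_kwh = base_kwh * (1 + poor_increase)
print(f"{base_kwh:.0f} kWh/month -> {new_kwh:.1f} kWh/month")  # 50 -> 57.3
```

A double-digit percentage gain sounds substantial, but on a base this small it amounts to about seven extra kilowatt-hours a month, which helps reconcile the upbeat statistics with the muted enthusiasm in the focus groups.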

This study illustrated the importance of using different methods of inquiry and, conversely, the risks of not doing so. If we had used one method only, we would have arrived at rather different conclusions – one too rosy (based on the consumption data) and one too bleak (based on people’s perceptions). Neither finding was right or wrong. They simply told different sides of the same story, and helped explain two very different but valid perspectives on the complex and charged (no pun intended) issue of privatization.

You can read the full report here: World Bank 2004 Moldova Electricity Reforms