Image by Gerd Altmann from Pixabay
When we think of prejudice, we tend to think of the most obvious ways that it shows itself. The current zeitgeist has surfaced the harm done by passive acceptance of injustice. As a data scientist, you have an important responsibility in making progress toward fairness. Often, the drive will have to come from you as the practitioner. It can be difficult to know how to approach this issue. I’m going to focus on three concrete ways in which you can work towards eliminating bias from your data science decisions.
Say the word “accuracy” in reference to a machine learning model, and data scientists will come out of the woodwork to explain why it’s not the right metric. They’ll continue by listing a half dozen better ways of quantifying effectiveness. It’s doubtful those same semantic warriors are as well versed in ways of measuring fairness in ML models. We want our models to be fair just as we want them to be effective and performant, so we should measure fairness. What you report about a model reflects your priorities, and measuring fairness should be a priority.
Measuring fairness in models can be challenging. In an attempt to avoid introducing bias, many datasets omit demographic information entirely. This is a great illustration of good intentions leading to bad results. You should record demographic information, even if it’s withheld from the model; without it, you cannot measure how your model performs across demographic lines. If your company is not recording this information, consider advocating for doing so.
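To make this concrete, here is a minimal sketch of the kind of audit that recorded demographics make possible: computing a standard performance metric (here, recall) separately for each group. The function name and the data are hypothetical; the group column is used only for the audit, not as a model input.

```python
def recall_by_group(y_true, y_pred, groups):
    """True-positive rate per demographic group.

    The group labels are needed only for this audit; they need not be
    (and often should not be) fed to the model itself.
    """
    tp, pos = {}, {}
    for t, p, g in zip(y_true, y_pred, groups):
        pos[g] = pos.get(g, 0) + t          # actual positives per group
        tp[g] = tp.get(g, 0) + (t and p)    # correctly flagged positives
    return {g: tp[g] / pos[g] for g in pos if pos[g] > 0}

# Hypothetical labels, predictions, and withheld-from-model group tags.
y_true = [1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0]
groups = ["X", "X", "X", "Y", "Y", "Y"]
print(recall_by_group(y_true, y_pred, groups))
```

If the recall for one group is markedly lower than another's, the model is missing the people in that group who actually need attention, even if its aggregate metrics look fine.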
Once you have the ability to measure fairness, you’ll need to choose the right metric. In a previous post, I discussed a metric for quantifying bias in healthcare called group benefit equality, and drilled into a somewhat technical description of why I believe it is the most appropriate measure in the context of healthcare. If you detect prejudice in your algorithm, this post provides a nice description of how you can use SHAP scores to start quantifying why the decision is biased in the way that it is.
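A sketch of one common formulation of group benefit equality follows; definitions vary in the literature, so treat this as an illustration rather than the canonical version from the post above. Here it is the ratio of the rate at which a group is predicted positive to the rate at which it is actually positive; a value near 1.0 for every group suggests the model allocates its "benefit" (positive predictions, and thus interventions) in proportion to actual need. The data is entirely hypothetical.

```python
from collections import defaultdict

def group_benefit(y_true, y_pred, groups):
    """Per-group ratio of predicted-positive rate to actual-positive rate.

    One common formulation of group benefit equality: a ratio near 1.0
    for every group means positive predictions track actual prevalence.
    """
    stats = defaultdict(lambda: {"pred": 0, "true": 0, "n": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        stats[g]["pred"] += p
        stats[g]["true"] += t
        stats[g]["n"] += 1
    return {
        g: (s["pred"] / s["n"]) / (s["true"] / s["n"])
        for g, s in stats.items()
        if s["true"] > 0
    }

# Hypothetical data in which the model under-serves group "B":
y_true = [1, 0, 1, 0, 1, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(group_benefit(y_true, y_pred, groups))
```

In this toy data, group A's ratio is 1.0 while group B's is well below 1.0: group B members who need intervention are being predicted negative, and would receive less of whatever resource the model directs.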
Researchers in the social sciences often have trouble generalizing their theories. One reason is that they conduct experiments on college students, who represent a narrow band of society. These populations are described by the acronym WEIRD: Western, Educated, Industrialized, Rich, and Democratic. That’s fine if you’re only trying to make predictions about those groups, but when you apply social theory to other parts of the world, the underlying assumptions about norms break down. Most datasets we use embed similar assumptions about how an individual will generate data, and those assumptions can often lead to inequality.
This effect can be particularly bad in healthcare. There are pronounced differences between the way rich and poor people use health systems. For example, poorer individuals may rely on ambulance services simply to get to routine appointments. This is difficult to anticipate if you’ve never lacked access to a car. Oftentimes, subject matter experts (SMEs) will be keenly aware of such phenomena. Collaborating with SMEs is extremely valuable for brainstorming ways to account for such effects, particularly during feature engineering.
The things that aren’t recorded in the data are an even greater challenge. Healthcare professionals know that a huge portion of what determines your health is your home and work environment. Access to healthy food, exposure to harmful chemicals, and easy access to green space are just a few health factors tied to wealth and zip code. Accounting for these effects is difficult, but it can often be the key to understanding why a group scores poorly on fairness metrics.
One of the biggest things that differentiates great data scientists from the rest is their ability to understand how their model impacts the real world. This matters because the intended use shapes how you define the population you make predictions about and how you define your outcome. While focusing on intended consequences will make you a good data scientist, considering unintended consequences will make you a great one.
For example, you might build a model to predict whether a person will be hospitalized due to a heart attack. One reasonable step might be to limit your population to individuals with a previous diagnosis of a heart condition. Filtering the population this way is a double-edged sword: while it will make your model more precise, it will also exclude individuals living with an undiagnosed condition. If a person gets regular checkups, it is more probable that the conditions building up to a heart attack are recorded. By excluding individuals with unusual usage patterns, you end up building a model that directs resources to individuals who are already getting the best care.
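The filtering problem above can be shown with a toy illustration. The records here are entirely hypothetical; the point is only that a seemingly reasonable cohort filter silently drops exactly the people whose conditions were never recorded in the first place.

```python
# Hypothetical patient records: 'prior_dx' marks a recorded heart
# condition, 'regular_care' marks routine checkup usage.
patients = [
    {"id": 1, "prior_dx": True,  "regular_care": True},
    {"id": 2, "prior_dx": True,  "regular_care": True},
    {"id": 3, "prior_dx": False, "regular_care": False},
    {"id": 4, "prior_dx": False, "regular_care": True},
]

# The "reasonable" filter: limit the cohort to previously diagnosed patients.
cohort = [p for p in patients if p["prior_dx"]]

# Who gets silently dropped? Anyone without a recorded diagnosis,
# which skews toward people with less engagement with the system.
excluded = [p for p in patients if not p["prior_dx"]]

print([p["id"] for p in cohort])    # patients already receiving care
print([p["id"] for p in excluded])  # patients the model can never reach
```

Checking who your filters exclude, and whether those exclusions correlate with care access, is a cheap sanity check to run before any modeling begins.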
One of the best aspects of working in healthcare data science is that providers’ and patients’ interests align. Healthcare providers want to predict severe medical complications so that interventions can be applied. For example, if you note that a person is at risk for a fall-related injury, the intervention might be as simple as getting them a step stool. It’s dead simple and really effective. Win-win. Many of the known inequalities in healthcare are impactable. Actively engaging with SMEs should guide your modeling efforts and help make them more effective. Your models have the potential to make people healthier, help the bottom line, and create a more equitable healthcare system. That’s as big a win-win-win as there is.
Interested in reading more about prejudice, fairness, and bias in healthcare data science? Check out these blog posts: