AI Machine Learning: Can We Put Our Faith in Bias?


A recent study found that the explanation methods meant to help users decide whether to trust a machine-learning model's predictions can be less accurate for disadvantaged minorities.


When the stakes are high, machine-learning models are sometimes used to support human decision-makers. For example, a model might predict which law school applicants are most likely to pass the bar exam, helping admissions officers decide which students to accept.

Because these models can contain millions of parameters, even AI researchers find it practically impossible to fully understand how they make predictions, and an admissions officer with no machine-learning background has even less insight into what is happening behind the scenes. Researchers therefore sometimes use explanation methods that mimic a larger model by producing simple approximations of its predictions. These more understandable approximations help users decide whether to trust the model's predictions.
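To make this concrete, here is a minimal sketch (in Python with scikit-learn) of the general idea behind a surrogate explanation model: a small, interpretable model is fit to the predictions of a larger black-box model, and fidelity is simply how often the two agree. The models, data, and variable names below are illustrative assumptions, not the specific methods or datasets used in the study.

```python
# Minimal sketch of a surrogate explanation model (illustrative, not the study's code).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# The complex model a decision-maker cannot inspect directly.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# The surrogate is trained to mimic the black box's predictions, not the true labels.
black_box_preds = black_box.predict(X)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, black_box_preds)

# Fidelity: how often the simple explanation agrees with the complex model.
fidelity = np.mean(surrogate.predict(X) == black_box_preds)
print(f"Overall fidelity: {fidelity:.3f}")
```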

But are these explanation methods fair? If an explanation method produces better approximations for men than for women, or for white people than for Black people, users may be more inclined to trust the model's predictions for some people but not for others.

MIT researchers rigorously analyzed the fairness of several widely used explanation methods. They found that the approximation quality of these explanations can vary dramatically between subgroups, and that quality is often significantly worse for minoritized subgroups.

In practice, if the approximation quality is lower for female applicants, there is a mismatch between the explanations and the model's predictions that could lead an admissions officer to wrongly reject more women than men.

When the MIT researchers realized how widespread these gaps are, they explored a variety of techniques to level the playing field. They were able to narrow some gaps, but not all of them.

"What this means in practice is that individuals may wrongly believe projections for some subgroups more than others." Improving explanation models is vital, but so is explaining the specifics of these models to end users. These gaps exist, therefore users should alter their expectations of what they will obtain when using these explanations," says lead author Aparna Balagopalan, a graduate student in the MIT Computer Science and Artificial Intelligence Laboratory's Healthy ML group (CSAIL).

The paper was co-authored by Balagopalan; CSAIL graduate students Haoran Zhang and Kimia Hamidieh; CSAIL postdoc Thomas Hartvigsen; Frank Rudzicz, associate professor of computer science at the University of Toronto; and senior author Marzyeh Ghassemi, assistant professor and head of the Healthy ML Group. The research will be presented at the ACM Conference on Fairness, Accountability, and Transparency.

High fidelity

Simplified explanation models can approximate the predictions of a more complex machine-learning model in a way humans can understand. An effective explanation model maximizes a quantity known as fidelity, which measures how well it matches the larger model's predictions.

Rather than concentrating on overall fidelity, the MIT researchers studied fidelity for subgroups of people in the model's dataset. In a dataset containing men and women, the fidelity should be very similar for each group, with both groups having fidelity close to that of the overall explanation model.

"If you only look at the average fidelity across all instances, you may be missing out on artifacts that exist in the explanatory model," Balagopalan explains.

They developed two metrics to measure fidelity gaps, or differences in fidelity between subgroups. One is the difference between the overall average fidelity and the fidelity of the worst-performing subgroup. The second is the average of the absolute differences in fidelity between all possible pairs of subgroups.
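As a rough illustration, the two gap metrics described above could be computed along the following lines. The function and variable names are hypothetical, and the paper's exact formulations may differ.

```python
# Illustrative computation of two fidelity-gap metrics (hypothetical names;
# the paper's exact definitions may differ).
from itertools import combinations
import numpy as np

def subgroup_fidelity_gaps(surrogate_preds, black_box_preds, groups):
    """groups: NumPy array of subgroup labels (e.g., 'male'/'female') per example."""
    agree = surrogate_preds == black_box_preds
    overall = agree.mean()
    per_group = {g: agree[groups == g].mean() for g in np.unique(groups)}

    # Metric 1: overall average fidelity minus the worst-performing subgroup's fidelity.
    worst_gap = overall - min(per_group.values())

    # Metric 2: mean absolute fidelity difference over all pairs of subgroups.
    pairs = list(combinations(per_group.values(), 2))
    mean_pairwise_gap = float(np.mean([abs(a - b) for a, b in pairs])) if pairs else 0.0

    return worst_gap, mean_pairwise_gap
```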

They used these metrics to search for fidelity gaps in two types of explanation models trained on four real-world datasets for high-stakes situations, such as predicting whether a patient will die in the ICU, whether a defendant will reoffend, or whether a law school applicant will pass the bar exam. Each dataset contained protected attributes, such as individuals' sex and race. Protected attributes are features that may not be used to make decisions, usually because of laws or organizational policies. The definition of these attributes can vary with the task in each decision setting.

The researchers found significant fidelity gaps across all datasets and explanation models. Fidelity for disadvantaged groups was frequently much lower, with gaps reaching as high as 21 percent in some cases. The fidelity gap between racial subgroups in the law school dataset was 7 percent, meaning the approximations for some subgroups were wrong 7 percent more often on average. If there are 10,000 applicants from those subgroups in the dataset, for example, a substantial number could be wrongly rejected, Balagopalan explains.

"I was astonished by how common these fidelity gaps are in all of the datasets we looked at." It is difficult to overstate how frequently explanations are employed as a "fix" for black-box machine-learning algorithms. "We show in this study that the explanation techniques themselves are flawed approximations that may be worse for some subgroups," Ghassemi explains.

Closing the Gaps

After identifying these fidelity gaps, the researchers tried several machine-learning approaches to close them. They trained the explanation models to identify regions of a dataset that could be prone to low fidelity and then focused on those samples. They also tried using balanced datasets with an equal number of samples from each subgroup.

These targeted training strategies reduced, but did not eliminate, the fidelity gaps.
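For intuition, a sketch of the balanced-dataset strategy mentioned above might look like the following. The subsampling approach and the names used here are illustrative assumptions, not the researchers' exact procedure.

```python
# Illustrative sketch: train the surrogate on a group-balanced subsample
# (hypothetical approach; not the researchers' exact procedure).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def balanced_indices(groups, seed=0):
    """Subsample indices so every subgroup contributes the same number of examples."""
    rng = np.random.default_rng(seed)
    unique, counts = np.unique(groups, return_counts=True)
    n = counts.min()
    picks = [rng.choice(np.where(groups == g)[0], size=n, replace=False) for g in unique]
    return np.concatenate(picks)

# Assuming X, black_box_preds, and a per-example `groups` array as in the earlier sketch:
# idx = balanced_indices(groups)
# surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[idx], black_box_preds[idx])
```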

The researchers then fine-tuned the explanation models to investigate why fidelity gaps arise in the first place. Their analysis showed that an explanation model may indirectly use protected group information, such as sex or race, that it can learn from the dataset even when group labels are hidden.

They plan to explore this conundrum further in future work. They also plan to study the implications of fidelity gaps in the context of real-world decision-making.

Balagopalan is encouraged that parallel research on explanation fairness from an independent lab has reached similar conclusions, underscoring the importance of understanding this problem thoroughly.

She offers some words of caution for machine-learning users as she looks forward to the next phase of this study.

"Select the explanation model with care. But, more crucially, consider the purpose of utilizing an explanatory model and who it will finally influence," she advises.

"I think this study is a really useful addition to the conversation about fairness in ML," says Krzysztof Gajos, Gordon McKay Professor of Computer Science at Harvard's John A. Paulson School of Engineering and Applied Sciences, who was not involved in the research. "What I found especially intriguing and impactful was the preliminary evidence that differences in explanation fidelity might have meaningful effects on the quality of judgments made by individuals supported by machine learning models." While the predicted difference in choice quality may appear slight (about 1%), we know that the cumulative consequences of such seemingly little variances may be life changing."
