Relevance for the boardroom
No matter how accurate algorithms might be, if they rely on incomplete or biased data they can produce results that are unfair, and correcting this tendency is not merely a technical problem, says Rogier. Apart from eliminating unfair outcomes, there are plenty of other reasons for boardrooms to address the issue. “There is legislation to comply with, so organizations can face fines or lose goodwill if they make a mistake. Besides, you can miss out on major opportunities as a result of overly conservative algorithms, for example if they only focus on the same target demographics that were approached in the past.”
Algorithms being unfair
As we explained earlier in our insight on challenges in Data Science, algorithms cannot, in themselves, be unfair; machine learning models are only as smart as the data they have been trained on. Simply put, algorithms convert input (data) into output, which means they essentially replicate historical patterns. If that historical data is biased, the bias will translate into unfair algorithms.
Data can, for instance, be biased with respect to certain groups of people, and if it is, even an accurate algorithm will continue to discriminate against those groups. And even with unbiased data, predictions about minority groups can still be inaccurate, Rogier warns. “It is a simple statistical fact that less can be said about small groups, and this is more likely to have negative consequences than positive ones. In a naive approach, algorithms will rely less on minority groups when producing output, because they are not as well represented in the data.”
_________________________________________________________________________________________________
“It is a simple statistical fact that less can be said about small groups,
and this is more likely to have negative consequences than positive ones.”
_________________________________________________________________________________________________
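To make that statistical point concrete, here is a minimal sketch in Python (hypothetical, invented data; not taken from ORTEC's work). It trains a simple classifier on a dataset in which one group is heavily underrepresented and then compares accuracy per group; the smaller group typically comes out worse, because the model is fitted almost entirely to the majority.

# Hypothetical illustration (all data invented): train one model on a
# dataset where one group is heavily underrepresented, then compare
# accuracy per group.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_group(n, shift):
    # Each group has a slightly different relationship between its
    # features and the outcome.
    X = rng.normal(shift, 1.0, size=(n, 3))
    y = (X.sum(axis=1) + rng.normal(0.0, 1.0, n) > 3 * shift).astype(int)
    return X, y

X_maj, y_maj = make_group(10_000, shift=0.0)  # well-represented group
X_min, y_min = make_group(200, shift=0.8)     # small minority group

X = np.vstack([X_maj, X_min])
y = np.concatenate([y_maj, y_min])
group = np.array([0] * len(y_maj) + [1] * len(y_min))

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0)

# One model for everyone, fitted almost entirely to the majority group.
model = LogisticRegression().fit(X_tr, y_tr)

for g, name in [(0, "majority"), (1, "minority")]:
    mask = g_te == g
    print(f"accuracy for {name} group: {model.score(X_te[mask], y_te[mask]):.3f}")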
You can train algorithms to give all people equal opportunities, but Rogier notes: “That does involve sacrificing accuracy, because you’re effectively making a moral decision rather than a data-driven one. For example: some companies use historical employee data to predict how well a prospective new hire will perform. Let’s imagine that in the past, men were significantly overrepresented in the workplace, which means that if you’re not very careful, you will end up with a model that has a clear preference for men.”
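One way to detect the preference Rogier describes is to compare how often the model recommends candidates from each group. The sketch below is a hypothetical Python illustration with invented data, not ORTEC's method: it computes the selection rate per gender and the gap between them (a demographic-parity style check).

# Hypothetical check with invented data: how often does a hiring model
# recommend candidates from each gender?
import pandas as pd

candidates = pd.DataFrame({
    "gender":    ["m", "m", "m", "m", "f", "f", "f", "f"],
    "predicted": [1,   1,   1,   0,   1,   0,   0,   0],  # 1 = "invite to interview"
})

selection_rates = candidates.groupby("gender")["predicted"].mean()
print(selection_rates)  # f: 0.25, m: 0.75 in this made-up example

# Demographic parity difference: 0 would mean both groups are invited
# at the same rate.
gap = selection_rates["m"] - selection_rates["f"]
print(f"selection-rate gap: {gap:.2f}")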
Fraud detection
Algorithms can be fed a varied diet of data, including information about where people come from, how old they are, what gender they are, and so on. The composition of this diet can have unintended consequences. Rogier explains: “The Dutch Tax and Customs Administration used a fraud detection system that took whether an individual held one passport or multiple passports as an input. This was a clear violation of the General Equal Treatment Act, which prohibits discrimination based on nationality, race, gender, and more. But even if you do not explicitly feed an algorithm sensitive input, you must be careful. Take insurers, for example, who use zip codes to price their products. Certain nationalities are over-represented in certain neighborhoods, which can ultimately mean that some nationalities pay more on average. You have to be aware of such issues and you must be able to explain them.”
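The zip-code example illustrates what is often called a proxy variable: a seemingly neutral input that correlates with a sensitive attribute. The following sketch is a hypothetical illustration with invented numbers: the pricing only looks at zip code, yet average premiums still differ per nationality.

# Hypothetical illustration of a proxy variable (all numbers invented):
# nationality is never fed to the pricing model, but because zip code
# correlates with nationality, average premiums still differ by nationality.
import pandas as pd

policies = pd.DataFrame({
    "zip_code":    ["1011", "1011", "1011", "2022", "2022", "2022"],
    "nationality": ["A",    "A",    "B",    "B",    "B",    "A"],
    "premium":     [52.0,   54.0,   53.0,   71.0,   69.0,   70.0],
})

# The pricing only ever looks at zip code ...
print(policies.groupby("zip_code")["premium"].mean())

# ... yet the outcome still differs per nationality, because the two
# variables are correlated in the portfolio.
print(policies.groupby("nationality")["premium"].mean())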
Image recognition
Image recognition software has similar problems, Rogier continues. This software will be incredibly important for self-driving cars in the future, as it will enable them to independently recognize pedestrians. “Now imagine that white people are detected in 99.99% of cases and black people in 99.98% of cases. Ultimately, this means that black people will be twice as prone to being hit, which, of course, is unacceptable. All individuals should be recognized at the same rate. Image recognition is also becoming more and more common in healthcare, where it is used to recognize certain diseases and conditions. For this software to work properly, however, there must be enough people with the same physical characteristics in the dataset used.”
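The jump from 99.99% to 99.98% sounds negligible, but it doubles the miss rate, which is where the “twice as prone” comes from. The arithmetic, as a minimal sketch:

# The arithmetic behind the quote: a small difference in detection rate
# means a doubled miss rate.
detection_rate_a = 0.9999  # e.g. white pedestrians
detection_rate_b = 0.9998  # e.g. black pedestrians

miss_a = 1 - detection_rate_a  # 1 in 10,000 missed
miss_b = 1 - detection_rate_b  # 2 in 10,000 missed

print(f"relative risk of being missed: {miss_b / miss_a:.1f}x")  # ~2.0x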
Rogier Emmen, Lead Consultant Data Science
"Remember that a dataset will always offer a limited view of reality and rarely shows the full picture."
Man vs. algorithm
Algorithms are not perfect, but there are still decisions and predictions to be made. Is a person with good intentions a better, more honest judge than an algorithm? “Yes and no,” says Rogier. “The advantage of human judgement over algorithmic judgement is that people have a much wider array of insight (context, time, values) and do not have to rely solely on the dataset. Besides, humans are capable of logical reasoning, which current AI methods are not. Algorithms, on the other hand, make much better use of the available data. When fed a large, high-quality dataset, a machine learning algorithm will always be more accurate than a person or even a group of people.”
Can we improve algorithms?
The importance of fairness and explainability has long been acknowledged. IBM and Google have already introduced toolkits to help discover biased data, and the European Commission has commissioned quality frameworks for developing algorithms. Rogier outlines a possible approach for projects that use sensitive data, based on the four steps ORTEC follows to make sure that a project is fair.
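As an aside on the toolkits mentioned above: a common check they automate is the so-called disparate impact ratio, which compares selection rates between groups (a widely used rule of thumb flags ratios below 0.8). The sketch below is a hypothetical illustration with invented numbers, not the output of any specific toolkit.

# Hypothetical example (invented numbers): the disparate impact ratio
# compares the selection rate of the unprivileged group to that of the
# privileged group. A common rule of thumb flags ratios below 0.8.
def disparate_impact(selected_a, total_a, selected_b, total_b):
    return (selected_a / total_a) / (selected_b / total_b)

ratio = disparate_impact(selected_a=30, total_a=100,   # unprivileged group
                         selected_b=50, total_b=100)   # privileged group
print(f"disparate impact ratio: {ratio:.2f}")  # 0.60 -> worth investigating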
_________________________________________________________________________________________________
“People want transparency and equal opportunities. They want to understand how these algorithms are treating them.”
_________________________________________________________________________________________________
Can all algorithms be fair?
Despite all our efforts, algorithms will never be perfectly fair, Rogier predicts. “They just don’t exist, just as you will never find a perfectly fair person. There is no holy grail for fairness, and what we perceive as ‘fair’ could also change over time. Fortunately, we are quickly becoming better at creating fair algorithms, rather than just focusing on accuracy. Public interest in the fairness of these algorithms is gradually increasing, thanks to pressure from lawmakers and regulators, the media, and society. Facebook, for instance, has started evaluating its own algorithms. People want transparency and equal opportunities: they want to understand how these algorithms are treating them and whether that is fair. That’s only reasonable, really.”
As Lead Consultant Data Science at ORTEC, with experience across a multitude of industries, Rogier Emmen proves that data science adds great value to our everyday lives. Despite any complexities, Rogier aims to keep solutions as simple as possible. Sharing his knowledge with others is what he loves to do, which is also one of the main reasons he teaches at The Analytics Academy.
January 2021
Algorithms are more widely used than ever before and affect people and organizations alike on an ever greater scale. Apart from being a useful way to optimize all sorts of operations, algorithms also have a downside: they can discriminate against certain people. The profiles used for fraud detection could, for instance, be questionable. Algorithms can also influence hiring processes, mortgage applications and online targeting, and in all these processes it is vital that people are treated fairly. The good news is that, step by step, we can make algorithms fairer.
This article on Explainable & Fair AI is the third part of our series on Data and AI in the Boardroom and is powered by Rogier Emmen, Lead Consultant Data Science.