AI has to defend or explain too!

The moment the neural network we built for making a Wi-Fi network smart started producing amazing results, our focus shifted to the mystery unfolding right in front of our eyes. We had built these models, but we didn't know how they worked. No one really knows how they do what they do. The only thing we can see is that their performance is superhuman.

The brain is very complex. It is, in effect, a complicated deep neural network: it has vast memory and is capable of pattern recognition, prediction, imagination, and all sorts of parallel computation, and yet no one really knows how it works.

New advancements like diffusion imaging have given scientists insights into the brain's inner workings and enabled them to "see" what is going on inside the brain when people are engaged in learning. There is evidence that learning and memory arise from the strengthening and weakening of connections among brain cells. However, what exactly is stored is still a mystery. You can't look inside a brain and tell that this person knows the meaning of supercalifragilisticexpialidocious. In fact, the brain is made of stuff that dies when you poke around in it. It learns only through received sensory information, emotional reactions, and processed experiences, and then mysteriously stores the learning. You can't transfer learning into it or extract learning from it, either. No copy and paste!

And what we have built is something very similar. Essentially, a black box.

Neural network models are brain-inspired. Instead of a programmer writing commands to solve a problem, the neural network generates its own algorithm from example data and the desired output. In effect, it builds an advanced algorithm embedded in the behavior of thousands of simulated neurons, arranged into hundreds of intricately interconnected layers, with the behavior of each neuron tweaked by back-propagation.

Let me give you an example of an image recognition neural network. The task is to look at an image and tell whether it shows a 'male' or a 'female.' A task that is trivial for a human brain is not so trivial for a neural network. If you look at the raw pixels of the image, there is no obvious pattern in them. Thus, the first step is training the network.

Without getting too technical, assume that the neural network is nothing but multiple layers of neurons, with weights on the connections between them. The first layer has thousands of neurons (one neuron for each pixel in the image), and the last layer has only two neurons: one should become '1' when the input image is 'male,' and the other should become '1' when the input image is 'female.' For convenience, let's call them the 'male' and 'female' neurons respectively. Now, the creator of the neural network feeds millions of images into this model, along with the expected result for each image, i.e., which neuron in the last layer should become '1' for that image. If the input image is 'male,' the 'male' neuron should be '1,' and if the input image is 'female,' the 'female' neuron should be '1.'

The learning begins. The neural network uses the initial weights and runs a formula to compute the values of the two neurons in the last layer. Its goal is to quantify how well or how poorly it is performing: did the 'male' neuron compute '1' for a 'male' image? Did the 'female' neuron compute '1' for a 'female' image? If not, it adjusts the weights, and it keeps adjusting the weights of neurons in all layers through many iterations until the 'male' neuron outputs '1' for all male images and the 'female' neuron outputs '1' for all female images. Now, when you feed in a new image that it has never seen before, the model runs the same formula over those learned weights and "predicts" whether the input image is "male" or "female." The neural network has created an algorithm.
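The training loop described above can be sketched in a few lines of Python. This is a toy, not a real face classifier: the "images" are random vectors, the labels follow a synthetic rule, and a single sigmoid output neuron stands in for the two-neuron 'male'/'female' output layer (equivalent for a binary decision).

```python
# Toy sketch of the training loop: forward pass, compare to label,
# back-propagate, adjust weights, repeat. Data and labels are synthetic.
import numpy as np

rng = np.random.default_rng(0)

# 200 "images" of 64 pixels each; label is 1 iff the mean pixel value
# is positive -- an arbitrary stand-in for real human-provided labels.
X = rng.normal(size=(200, 64))
y = (X.mean(axis=1) > 0).astype(float).reshape(-1, 1)

# Initial weights: one hidden layer of 16 neurons, one output neuron.
W1 = rng.normal(scale=0.1, size=(64, 16))
W2 = rng.normal(scale=0.1, size=(16, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(2000):
    # Forward pass: run the "formula" to compute the output neuron's value.
    h = np.tanh(X @ W1)
    p = sigmoid(h @ W2)

    # Backward pass (back-propagation): compute how each weight
    # contributed to the error, then nudge every weight to shrink it.
    grad_out = (p - y) / len(X)
    grad_W2 = h.T @ grad_out
    grad_W1 = X.T @ (grad_out @ W2.T * (1 - h**2))
    W2 -= 0.5 * grad_W2
    W1 -= 0.5 * grad_W1

# After training, the learned weights "predict" the label for each input.
accuracy = ((p > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The point of the sketch is what it does not contain: nowhere did we write a rule for deciding the label. The rule lives implicitly in `W1` and `W2`, thousands of numbers that no one can read, which is exactly the black box the text describes.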
Even the person who created and trained the network does not know what is being detected at the intermediate stages of the process—or why the model reaches the conclusion that it does. The model works brilliantly but what the creator has created is a black box.

You may have followed the second game of the historic Go match between Lee Sedol, one of the world's top players, and AlphaGo, the AI program built by Google. AlphaGo made a move that astonished everyone: every reporter, photographer, commentator, and even Lee Sedol himself. Fan Hui, the European Go champion who had lost five straight games to AlphaGo earlier, was also completely blown away. "It's not a human move. I've never seen a human play this move," he said. Indeed, the move that made no sense to the humans changed the path of play, and AlphaGo went on to win the game.

For AlphaGo or an AI-driven Wi-Fi network, it perhaps does not matter why they won or why the network performs better. In many applications, however, we cannot accept arbitrary decisions from AIs that we don't understand. If a doctor cannot explain why an AI arrived at a specific decision, the doctor may not be able to use its conclusions as a diagnostic tool. If an AI system cannot justify why a loan is being rejected, the financial institution may not be able to use it for loan processing. If an AI system cannot be held accountable, it cannot be used in self-driving cars, because the assignment of responsibility is unclear. And so on. If a service is going to be augmented with "AI," we need the rationale for how the algorithm arrived at its recommendation or decision.

The AI service needs to be built responsibly. It needs to be explainable. It needs to be transparent. It needs to be reliable.

And we should be able to prove that it's "fair." AI systems are only as good as the data we put into them, and bad data can contain implicit racial, gender, or ideological biases. You can see from the image recognition example above how human bias has the potential to creep in: the "creator" labels the input images as 'male' or 'female' based on the creator's own judgment. This is one of the worst problems in AI. Human bias is a channel through which undesirable decision-making logic creeps into a neural network. We should be able to determine whether the neural network has 'discriminated.'
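Even without opening the black box, one basic 'discrimination' check is possible from the outside: compare the model's decision rates across groups. Below is a minimal sketch of such a check (sometimes called demographic parity); the group labels, decisions, and the acceptable gap are all hypothetical placeholders, and real fairness auditing involves much more than this single number.

```python
# Sketch of an external fairness check: does the model say "yes"
# at similar rates for different groups? Data here is made up.
from collections import defaultdict

def approval_rates(decisions):
    """decisions: iterable of (group_label, model_said_yes) pairs."""
    totals, yeses = defaultdict(int), defaultdict(int)
    for group, said_yes in decisions:
        totals[group] += 1
        yeses[group] += said_yes
    return {g: yeses[g] / totals[g] for g in totals}

# Hypothetical audit log of six model decisions across two groups.
decisions = [("A", 1), ("A", 1), ("A", 0), ("B", 1), ("B", 0), ("B", 0)]
rates = approval_rates(decisions)
gap = max(rates.values()) - min(rates.values())
print(rates, f"parity gap = {gap:.2f}")
```

A large gap does not prove discrimination, and a small one does not rule it out, but tracking such metrics is one concrete way to make a claim of "fairness" testable rather than rhetorical.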

A human can get things wrong, but then they can explain themselves. Similarly, when a computer gets things wrong, we need to know why.

In fact, for the field of AI to reach any measurable sense of maturity, we'll need methods to debug, error-check, and understand the decision-making process of machines. A lack of trust has been at the heart of failures in some of the best-known AI efforts. Artificial intelligence is a transformational $15 trillion opportunity, but without explainability, it will never see widespread deployment.

Explainability is an essential element of a future in which we will have artificially intelligent machine partners. In a follow-up post, I will go over the state of the art in explainable AI as well as various approaches to building it.