“Who’s winning?” at Cricket: Communicating Machine Learning Algorithms
The communication of statistics to the general public is a well-covered topic. Statistics appear in all parts of our lives, from news reports and politicians’ speeches to many people’s day-to-day work. This has been particularly prominent in recent years, with COVID case rates and economic measures during the recent cost of living crisis.
Generally, statisticians and journalists have learned how to communicate these things effectively. One example is this BBC News article about why a fall in inflation does not mean a fall in prices. Inflation is a complex statistical measure, and it is carefully explained both in the article and throughout the BBC’s coverage. These days, however, simpler forms of statistical analysis are sometimes being replaced by machine learning.
In cricket, a recent innovation in the fan experience is WinViz, a win predictor algorithm developed by a company called CricViz. At any time during a game, it can tell you the likelihood of either team winning, in percentage terms. At the start of the game this will be close to 50:50, and if a team is particularly on top, they will have a higher percentage. In short, it allows fans to “objectively” answer the question “who is winning?”, which is less clear in cricket than in other sports. However, WinViz can be misunderstood and disliked. Part of the problem is that people think WinViz was “wrong” if it predicts that a team is going to win and they then lose, which leads them not to trust it in future. Correctly understood, a team winning when they only had a 20% chance of winning the match does not mean that the algorithm was wrong, just that you witnessed a remarkable and unlikely victory!
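To make the 20% point concrete, here is a minimal sketch in Python. It has nothing to do with how WinViz itself works, and the function name `underdog_win_rate` is made up purely for illustration. It assumes a predictor that is perfectly honest when it says “20%”, and simply counts how often the underdog still wins across many imaginary matches.

```python
import random


def underdog_win_rate(win_probability, n_matches, seed=0):
    """Simulate n_matches in which the underdog wins each one with
    probability win_probability, and return the fraction they won.

    Hypothetical helper for illustration only; not part of WinViz.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = sum(1 for _ in range(n_matches) if rng.random() < win_probability)
    return wins / n_matches


# Ten thousand imaginary matches where the predictor gave the underdog 20%.
rate = underdog_win_rate(0.20, 10_000)
print(f"Underdog won {rate:.1%} of simulated matches")
```

If the predictor is doing its job, the underdog should come out on top in roughly one match in five. A single upset therefore tells you very little; it is the long-run rate across many “20%” predictions that shows whether the algorithm can be trusted.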
Is it a fan’s fault for misunderstanding what the predictive algorithm is saying? Or should this be communicated more clearly? Is a percentage the best way to communicate likelihood to people who may not have a background in statistics? Perhaps, at the end of a game, WinViz should emphasize and celebrate when it predicted “wrongly”, because that means you have seen a more remarkable cricket match. There are few things more important than cricket [editor’s note: this view is held by Martin Shine and is not representative of Butterfly Data’s perspective of global importance], but this idea of communicating machine learning outputs clearly may be important elsewhere. These algorithms are already used extensively in many sectors, and I suspect we will see more communication of their outputs over time as they are used more in sectors such as healthcare and policing.
As this happens, great care will need to be taken to ensure not just that the outputs of these algorithms are correct, but also that they are understood by the people they affect.