Shine's Spotlight: ‘The Signal and the Noise’ by Nate Silver

One of our directors, Sara, recommended I read this book a long while ago, so I gave it a read for my next blog post.

Nate Silver has a background in statistics and making predictions: working as an economic consultant, playing professional poker, and building models for forecasting the future performance of Major League Baseball players. 

Silver then shot to prominence during the 2008 US Presidential election, particularly for his expertise in using polling data, when he correctly predicted the result in 49 of the 50 states. This book was released a couple of months before the 2012 Presidential election, presumably a calculated gamble, and it paid off: Silver went on to predict all 50 states correctly in that election, which launched him and this book into the spotlight.

He is the founder and editor of the data journalism site FiveThirtyEight. I already knew of this site because it also does sports modeling, and I use their data on the strengths of football teams to calculate the difficulty of upcoming fixtures for potential transfers in my Fantasy Football team (on that note, the new season has just started and after reading this book I really ought to win the Butterfly Data league…).

The book is called “The Signal and the Noise”, based on the idea that when you are trying to get insights from the real world, what you are looking for (the signal) is often drowned out by lots of unnecessary information (the noise). It covers a wide range of prediction-based topics, with the first half of the book explaining some good and bad models, and the second half covering how to improve these models. It dives deep into how to predict things really well, how to communicate uncertainty effectively, and how to improve our use of data.


This book is not the easiest read: it stands at 534 pages, and much of that is quite dense. Because of that, I am unsure whether I would recommend it, and I gave it 2 stars on Goodreads.


I will share a few points that I learnt from this book, which I found either useful or interesting:


The weatherman is lying to you

This book was full of praise for meteorologists. Predicting the weather is all about taking lots of data and feeding it into a model informed by our understanding of physics. Weather forecasters have also had the benefit of being able to test their predictions every single day, and hence have a wonderful feedback loop for refining their models. Even if they are sometimes wrong, it is remarkable how accurate weather predictions are these days.

However, a lot of the time when you see the weather forecast, they are actually deliberately lying to you! Many outlets will report the likelihood of rain as far higher than they actually think it is. The reasoning is that if you are told it is going to rain and it doesn’t, you will generally be pretty happy. However, if you are told it is not going to rain and then get caught outside without an umbrella, you will be very unhappy. So local news and weather channels will often give skewed predictions to keep their viewers and readers happy.


Has this scenario never happened before?

Silver is pretty damning throughout his book of the predictions that economists make, particularly because very few people saw the 2008 economic crisis coming. Part of this, though, I think is that the economy relies on so many factors that it is very hard to predict. Furthermore, you cannot fully rely on historical data, because in all walks of life you will regularly find yourself in a scenario that has never happened before. When this is the case, it’s important to clearly communicate the uncertainty in your predictions.


Communicate Uncertainty

When it comes to predicting outcomes with models, a really important thing is communicating how uncertain you are of your predictions. When something is being predicted, what is the likelihood of that prediction being true? What is the range of that prediction? Silver gives the example of a flood in North Dakota in 1997. Everyone knew that river levels were going to be very high. The infrastructure could handle up to 51 feet and the prediction was 49 feet, so everything would be fine, right? However, the forecast came with a margin of error of ±9 feet, apparently meaning there was about a 35% chance of flooding!

However, only the exact prediction was communicated, not the uncertainty around it. This meant most of the residents of the area thought everything would be fine and did not adequately prepare. Fortunately there was no loss of life, but there was extensive damage. When things are predicted, it is important that the uncertainty is also very clearly communicated.
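
To get a feel for where a figure like 35% could come from, here is a rough sketch of the arithmetic in Python. It is my own back-of-the-envelope version, not the forecasters' actual method: I am assuming the forecast error is normally distributed and that the 9 feet margin is a 90% interval, neither of which comes from the book. The function name and the coverage parameter are made up for illustration.

```python
from statistics import NormalDist


def overtop_probability(forecast_ft, levee_ft, margin_ft, coverage=0.90):
    """Chance the river crests above the levee height.

    Assumptions (mine, not the book's): forecast errors are normally
    distributed, and +/- margin_ft covers `coverage` of outcomes.
    """
    # z-score for a two-sided interval with the given coverage (about 1.645 for 90%)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    # Convert the margin of error into a standard deviation
    sigma = margin_ft / z
    # Probability mass above the levee height
    return 1 - NormalDist(mu=forecast_ft, sigma=sigma).cdf(levee_ft)


# Numbers from the flood example: 49 ft forecast, 51 ft levees, 9 ft margin
p = overtop_probability(forecast_ft=49, levee_ft=51, margin_ft=9)
print(f"Chance of flooding: {p:.0%}")
```

Under those assumptions the result lands in the mid-30s percent, in the same ballpark as the figure quoted above. A different assumed coverage for the margin would shift the number, which is rather the point: the 49 feet headline figure on its own hides all of this.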


Big Data can’t solve everything

This book was written 10 years ago, and the topic of big data is even bigger now than it was then. Silver, however, makes a variety of good points that the quantity of data really is not everything when building a predictive model. In 1997 grandmaster Kasparov famously took on the computer Deep Blue at chess. He partly knew how the program was built, in particular that it decided how to play its opening moves by using a database of all the openings in the recorded history of chess. Kasparov used this knowledge to his advantage, and by his 3rd move he had deliberately put his pieces in a position which had never been seen before in the recorded history of chess, rendering Deep Blue’s large database useless (there’s much more to this story than this, but it’s a neat example of how big data isn’t always useful).

It is worth considering whether the quality of your data, or how you have built any models, may be more important than the quantity of data you use.


Wait, isn’t this data science?

I was interested to notice that Nate Silver never uses the phrase ‘data science’ in the book, before realising that in 2012 the term was barely in use! Google search trends data shows that the term did not really take off until the mid-2010s.

Get in touch if you have any questions, thoughts or book recommendations for me to blog about. Maybe even check out my Goodreads to see what I’ve been reading!
