I really love statistics. It doesn’t lie, it shows you the real picture, and it removes biases… when applied properly. So I’m deepening my knowledge of statistics, and today I would like to share some recent discoveries I’ve made.

Many processes can be modeled using mathematical and statistical methods, and modeling them can reveal very interesting insights. We can also calculate the probability that a certain event occurs.

In statistics and probability theory there is a term, the Poisson distribution. It is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space, if these events occur with a known constant rate and independently of the time since the last event.

Yes, I understand, those words above are a little bit complicated, so I’ll show an example. For instance, an individual keeping track of the amount of mail they receive each day may notice that they receive an average of 4 letters per day. If receiving any particular piece of mail does not affect the arrival times of future pieces of mail, i.e., if pieces of mail from a wide range of sources arrive independently of one another, then a reasonable assumption is that the number of pieces of mail received in a day obeys a Poisson distribution. Other examples that may follow a Poisson distribution include the number of phone calls received by a call center per hour, the number of decay events per second from a radioactive source, the goals scored during the World Cup, the number of meteorites greater than 1 meter in diameter that strike Earth in a year, the number of patients arriving in an emergency room between 10 and 11 pm, and so on (Wikipedia).
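To make the mail example concrete, here is a small sketch that evaluates the Poisson formula P(X = k) = λ^k · e^(−λ) / k! for an average of 4 letters per day. The function name poisson_pmf and the range of k values printed are my own choices for illustration, not something from the Wikipedia example:

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average is lam (illustrative helper)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 4  # average number of letters per day, taken from the mail example

# Probability of receiving exactly 0, 1, ..., 8 letters in a day
for k in range(9):
    print(f"P({k} letters) = {poisson_pmf(k, lam):.4f}")
```

With an average of 4, the most likely outcomes are 3 and 4 letters, each with a probability of about 0.195, and the probabilities fall off quickly on both sides.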

There are a few rules to check before assuming the Poisson distribution is an appropriate model, but I won’t go into detail here; you can read all about them on Wikipedia. Instead, I will show you how to calculate the probability of an event based on the Poisson distribution.

Let’s assume we have the following information: the average number of goals scored in one La Liga game is about 3. We can build a model using the Poisson distribution and estimate, by simulation, the probability of 9 or more goals in one La Liga game. The code below will perform all the calculations:

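import numpy as np

# Simulate 1,000,000 games; each goal count is drawn from a Poisson distribution with mean 3
n_games = 1_000_000
s = np.random.poisson(3, n_games)

# Fraction of simulated games with 9 or more goals
p = np.mean(s >= 9)

# Print the estimated probability as a percentage
print(p * 100)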

Result: 0.3832, so there is less than a 0.4% chance that the teams will score 9 or more goals in one game. (This is a simulation, so the exact figure changes slightly from run to run.) Quite cool, isn’t it?

That’s it for today. Hopefully you enjoyed learning about the Poisson distribution and how we can use it as much as I did. Tomorrow I will prepare something new about distributions and probabilities. (I’m currently taking courses on statistical thinking and really enjoying them.)

Have a nice day and smile more, because:

“All the statistics in the world can’t measure the warmth of a smile.”

―Chris Hart