Intro
To make the plotting is simple (not that painful), in that post I will make few plots with different distributions.
First importing the libraries. I will use the matplotlib for plotting the data. SciPy one of the core libraries for data science calculations:
To make printing comfortable a made the printing function for Mean, Variance, Skew, Kurt:
import numpy as np # we will use it for generation the arrays
from scipy.stats import bernoulli, binom, poisson, norm, uniform, beta
import matplotlib.pyplot as plt
To make printing comfortable a made the printing function for Mean, Variance, Skew, Kurt:
def print_mvsk(*args):
t = args[0]
mean, var, skew, kurt, = float(t[0]),float(t[1]),float(t[2]),float(t[3])
sd = np.sqrt(var)
print(f'mean:{mean:.4f}\tvar:{var:.4f}\tskew:{skew:.4f}\nsd:{sd:.4f}\tkurt:{kurt:.4f}')
Bernoulli Distribution
Bernoulli is long representation of regular Probabilities as we all know it. for example - toss of the coin, or rolling the dice. etc.Example: Toss a fair coin once.
What is the distribution of the number of heads?
A single trial
The trail can result mutually exclusive results.
P(Success)=p
P(Failure)=1-p
Let X = 1 if the success occurs, and X=0 if a failure occurs.
Then X has a Bernoulli distribution:
$P(X=x)=p^x{(1-p)}^{1-x}$ Probability mass function of the Bernoulli distribution
Mean - $\mu=p$ mean
Variance $\sigma^2=p(1-p)$
This is how we code the Bernoulli distributions with using the scipy.stats.bernoulli
fig, ax = plt.subplots(1, 1) #creating the 1 plot
p = 1/6 # our probability of tossing the dice
x = [0,1] # the x axes
# print the distribution's properties
print_mvsk(bernoulli.stats(p, moments='mvsk'))
# generating the data for value 0, 1
data = bernoulli.pmf(x, p)
# vertical line
ax.vlines(x, 0, data, colors='y', lw=20)
plt.ylabel('Probability of winning in dice tos')
plt.xlabel('0 mean prob to lose \n 1 - chances to win')
plt.title('Bernulli Probability Distribution')
plt.grid(True)
plt.show()
# probability of tossing the coin
p = 1/2 #
x = [0,1]
fig, ax = plt.subplots(1, 1)<
data = bernoulli.pmf(x, p)
ax.vlines(x, 0, data, colors='y', lw=20)
plt.ylabel('Probability of winning in coin tos')
plt.xlabel('0 mean prob to lose \n 1 - chances to win')
plt.title('Bernulli Probability Distribution')
plt.grid(False)
plt.show()
Poisson
Suppose we are counting the number of occurrences of an event in a given unit of time, distance, area or volume.For example:
- The number of car accidents in a day.
- The number of dandelions in a square meter plot pf land.
- Events are occurring independently
- The probability that an event occurs in a given length of time does not change through time.
- Events are occurring randomly and independently.
Probability - $P\left(X=x\right)=\frac{\lambda^xe^{-\lambda}\ }{x!}$
where e=2.71828, x=0,1,2…
Mean - $\mu=\lambda$
Variance - $\sigma^2=\lambda$
Example:
Plutonium-239 is an isotope of plutonium that is used in nuclear weapons and reactors.
One nanogram of Plutonium-239 will have an average of 2.3 radioactive decays per second, and the number of decays will follow a Poisson distribution.
Q:What is the probability that in a 2 second period there are exactly 3 radioactive decay?
Let X represent the number of decays in a 2 second period. Since the mean is lambda and we have avg 2.3 per second, and we are looking for 2 seconds period
$\lambda=2.3\ast2=4.6$
$P\left(X=3\right)=\frac{{4.6}^3e^{-4.6}\ }{3!}=0.163$
Q:What is probability of no more then 3 decays?
$P\left(X\le3\right)=P\left(X=0\right)+P\left(X=1\right)+P\left(X=2\right)+P\left(X=3\right)=$
$\frac{{4.6}^0e^{-4.6}\ }{0!}+\frac{{4.6}^1e^{-4.6}\ }{1!}+\frac{{4.6}^2e^{-4.6}\ }{2!}+\frac{{4.6}^3e^{-4.6}\ }{3!}=$
$0.010+0.046+0.106+0.163=0.326$
We will use the scipy.stats.poisson functions:
fig, ax = plt.subplots(1, 1)
mu = 4.6 # our lambda
print_mvsk(poisson.stats(mu, moments='mvsk'))
poisson.ppf(0.01, mu)
x = np.arange(poisson.ppf(0.00001, mu),
poisson.ppf(0.99999, mu))
data = poisson.pmf(x, mu)
# creating new array
data2 = [0]*len(data)
# to show the answer with different color
data2[3]= poisson.pmf(3, mu)
ax.vlines(x, 0, data , colors='r', lw=18, alpha=1)
ax.vlines(x, 0, data2, colors='b', lw=18, alpha=1)
ax.vlines(x, 0, data2, colors='b', lw=18, alpha=1)
plt.ylabel('Probability')
plt.xlabel('Number of Decays')
plt.title('Plutonium-239 prob of having 3 decays ')
plt.show()
Normal / Gaussian Distribution
The random independent variables has Normal distribution. If you have allot random data it will tend to have Normal Distribution.Normal Distribution is Continues distribution (not discrete)
$f(x)=\frac{1}{\sqrt{2\pi\sigma}}e^{-\ \frac{1}{2\sigma^2}{(x-\mu)}^2}$
for
$-\infty<x<\infty$
$-\infty<\mu<\infty$
$\sigma>0$
$\sigma^2>0$
If X is a random variable that has a normal distribution with mean $\mu$ and variance $\sigma^2$, we write this as
$X~N(\mu,\sigma^2)$
The standard normal distribution is a normal distribution with mean 0 and variance 1.
fig, ax = plt.subplots(1, 1)
print_mvsk(norm.stats(moments='mvsk'))
# using the linspace function to create
# continues X axis, instead using the
# np.arrange with step attribute
x = np.linspace(norm.ppf(0.00001),
norm.ppf(0.99999), 1000)
data = norm.pdf(x)
# 'b-' means:
# Blue Color
# - means a solid line
ax.plot(x, data, 'b-', ms=1)
# cover all back ground with red
ax.vlines(x, 0, data, colors='r', lw=1, alpha=1)
# Cover 95 % (2Sigma) of Normal Dist with blue color
x_sigma2 = np.linspace(norm.ppf(0.025),
norm.ppf(0.975), 1000)
sigma2 = norm.pdf(x_sigma2)
ax.vlines(x_sigma2,0, sigma2, color='b', lw=1, alpha=.5, label='asd')
# Cover 1 sigma with green
p_sigma1 = norm.pdf(1)
x_sigma1 = np.linspace(norm.ppf(p_sigma1),
norm.ppf(1-p_sigma1), 1000)
sigma1 = norm.pdf(x_sigma1)
ax.vlines(x_sigma1,0, sigma1, color='g', lw=1, alpha=.5)
plt.ylabel('Prob')
plt.xlabel('Red 100%\nBlue 95%\nGreen=68.2')
plt.show()
Used Materials:
- My Azure Jupiter Notebook
- An Introduction to the Normal Distribution
- Matplotlib
- https://docs.scipy.org/doc/
- https://en.wikipedia.org/wiki/Poisson_distribution
- https://notebooks.azure.com/seanreed1111/libraries/scipy-stats/html/notebook1.ipynb
- https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.poisson.html
- http://www.compromath.com/2017/02/adding-specific-code-syntax-highlighter.html
No comments:
Post a Comment