Tuesday, February 27, 2018

Distributions and how to plot them with SciPy



Intro

To show that plotting can be simple (and not that painful), in this post I will make a few plots of different distributions.

First, import the libraries. I will use matplotlib for plotting the data. SciPy is one of the core libraries for data science calculations:

import numpy as np  # we will use it to generate the arrays
from scipy.stats import bernoulli, binom, poisson, norm, uniform, beta
import matplotlib.pyplot as plt

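To make printing comfortable, I made a helper function that prints the mean, variance, skew and kurtosis (plus the standard deviation):

def print_mvsk(*args):
    # args[0] is expected to be a (mean, var, skew, kurt) sequence
    t = args[0]
    mean, var, skew, kurt = float(t[0]), float(t[1]), float(t[2]), float(t[3])
    sd = np.sqrt(var)  # standard deviation is the square root of the variance
    print(f'mean:{mean:.4f}\tvar:{var:.4f}\tskew:{skew:.4f}\nsd:{sd:.4f}\tkurt:{kurt:.4f}')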
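
As a rough sketch of how this helper might be used, the snippet below passes it the four moments that SciPy returns from `stats(moments='mvsk')`. The helper is repeated so the snippet runs on its own. The parameters (`n = 10`, `p = 0.3`, `scale = 2`) are made up for illustration and are not taken from the post.

```python
import numpy as np
from scipy.stats import binom, norm

def print_mvsk(*args):
    # Same helper as above, repeated so this snippet is self-contained.
    mean, var, skew, kurt = (float(x) for x in args[0])
    sd = np.sqrt(var)
    print(f'mean:{mean:.4f}\tvar:{var:.4f}\tskew:{skew:.4f}\nsd:{sd:.4f}\tkurt:{kurt:.4f}')

# Hypothetical parameters, chosen only for illustration.
n, p = 10, 0.3

# stats(moments='mvsk') returns mean, variance, skew and (excess) kurtosis.
binom_stats = binom.stats(n, p, moments='mvsk')
print_mvsk(binom_stats)

norm_stats = norm.stats(loc=0, scale=2, moments='mvsk')
print_mvsk(norm_stats)

# The x values and probabilities one would hand to matplotlib for a pmf plot.
x = np.arange(0, n + 1)
pmf = binom.pmf(x, n, p)
```

For the binomial case the mean is n*p = 3 and the variance is n*p*(1-p) = 2.1, which is a quick way to check the printed values.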

Thursday, February 15, 2018

Performing Accurate Decimal Operations

In Python, floating-point calculations follow the IEEE 754 standard, so decimal fractions are stored as binary approximations.
>>> a = 4.1
>>> b = 2.2
>>> a + b
6.300000000000001

That means that
>>> (a + b) == 6.3
False
because Python's float type stores values in the machine's native binary floating-point format, which cannot represent most decimal fractions such as 4.1 or 2.2 exactly.

If you want exact decimal arithmetic, the Decimal class from the decimal module can help you with that.

>>> from decimal import Decimal
>>> a = Decimal('4.1')
>>> b = Decimal('2.2')
>>> a + b
Decimal('6.3')
>>> (a + b) == Decimal('6.3')
True
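
Two details are worth a quick sketch. First, `Decimal` is only exact when it is built from a string, because building it from a float copies the float's rounding error. Second, `quantize` rounds to a fixed number of places. The value `2.675` and the rounding mode below are illustrative choices, not from the post.

```python
from decimal import Decimal, ROUND_HALF_UP

# Built from strings, the values are exactly 4.1 and 2.2.
total = Decimal('4.1') + Decimal('2.2')
print(total)                           # 6.3

# Built from floats, Decimal copies the float's binary rounding error.
from_floats = Decimal(4.1) + Decimal(2.2)
print(from_floats == Decimal('6.3'))   # False

# quantize rounds to a fixed number of decimal places.
price = Decimal('2.675').quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
print(price)                           # 2.68
```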

Tuesday, February 06, 2018

Skype Admin

To get the list of devices where your Skype account is currently signed in:
- open a chat with anyone from your contact list
- type /showplaces and hit Enter

That command shows the list of "places" where you are signed in.

This is what I get:


GL

Saturday, February 03, 2018

10 useful linux commands

As part of the Galvanize Data Science Immersive program, we have a task to write a post about 10 Linux commands. :)

Dask is a great tool for parallel computing in Python.

But after you install it, running its tests can produce errors:

c:\ProgramData\Anaconda3\Lib\site-packages>py.test dask
============================= test session starts =============================
platform win32 -- Python 3.6.1, pytest-3.0.7, py-1.4.33, pluggy-0.4.0
rootdir: c:\ProgramData\Anaconda3\Lib\site-packages, inifile:
collected 3230 items / 1 errors / 5 skipped

=================================== ERRORS ====================================
__________ ERROR collecting dask/diagnostics/tests/test_profiler.py ___________
dask\diagnostics\tests\test_profiler.py:148: in <module>
    pytest.param(lambda: ResourceProfiler(dt=0.01),
E   AttributeError: module 'pytest' has no attribute 'param'
!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!
===================== 5 skipped, 1 error in 24.40 seconds =====================

pytest.param was added in pytest 3.1, so the installed pytest 3.0.7 is too old. To fix that problem, upgrade pytest:

pip install -U "pytest>=3.1.0"
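
To check whether an environment is affected before running the Dask tests, a small sketch like the following might help. The function name `pytest_has_param` is hypothetical. It only checks for the attribute named in the error message above.

```python
def pytest_has_param():
    """Return True/False if pytest is importable, or None if it is not installed."""
    try:
        import pytest
    except ImportError:
        return None
    # The failing test module needs pytest.param, which older pytest versions lack.
    return hasattr(pytest, 'param')

result = pytest_has_param()
print(result)
```

A result of False means the upgrade command above is needed.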

ls - list files and directories in a folder
  ls -ahl

htop - show running processes

mkdir - make a directory

cd - change the current directory

rmdir - remove an empty directory

touch - create an empty file (or update the timestamp of an existing one)

find - search for a file or directory

rm - remove a file

stat - get file info

grep - find text in files
  grep --color -n -i "soap" *.csv

netstat - get the list of open ports
  netstat -lntu

  • l = only services which are listening on some port
  • n = show port number, don't try to resolve the service name
  • t = tcp ports
  • u = udp ports
  • p = name of the program (not part of the command above; add it as netstat -lntup)