10 FREE Python Libraries for Mathematical and Scientific Computing that will TOTALLY SHOCK you
Matt Hancock, mhancock@math.fsu.edu
Why Python?
- Interpreted, object-oriented, general purpose language
- Syntax emphasizes readability
- Interactive environments make it ideal for prototyping
- Big standard library included (text manipulation, file I/O, etc.)
- Large, active development community for other libraries (math, science, database mgmt, web development, etc.)
- C / C++ / Fortran extensions possible
- Runs on most operating systems
- Free to use! Free to modify and redistribute!
See: Python beginner's guide
Rough outline
- Python basics: writing / running code, environments, etc.
- The Holy Trinity: NumPy, SciPy, Matplotlib
- Specialized libraries for math/science
Getting started
- Download and install Python
- Options for running code:
  - Write a script first, then execute it from a terminal; or
  - Execute scripts/commands from an interactive console or GUI.
Running a script from terminal
Create a new file, collatz.py, then run it from a terminal (for example, `python collatz.py`).
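The original collatz.py listing did not survive on this slide. As a rough stand-in, a script with that name might look like the sketch below; the function name `collatz_steps` and the chosen starting values are assumptions for illustration, not the author's code.

```python
# collatz.py: hypothetical stand-in for the script named on the slide.

def collatz_steps(n):
    """Count the Collatz steps needed to reach 1 from a positive integer n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Halve even numbers; map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when imported.
    for start in (6, 7, 27):
        print(start, collatz_steps(start))
```

Saved as collatz.py, this would be executed from a terminal with `python collatz.py`.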
Integrated Development Environment (IDE): Spyder
NumPy
- Multidimensional array objects of various data types (floating point, integer, bool, etc.)
- Array slicing / broadcasting
- Basic linear algebra, Fourier transforms, random number generation tools
- Tools for integrating Fortran code
- NumPy guide for Matlab users
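A minimal sketch of the features listed above, using small made-up arrays (all variable names and values here are illustrative assumptions):

```python
import numpy as np

# A 3x4 floating-point array holding 0..11.
a = np.arange(12, dtype=np.float64).reshape(3, 4)

# Slicing: the second column and the top-left 2x2 block (both are views).
col = a[:, 1]
block = a[:2, :2]

# Broadcasting: the column means have shape (4,) and are subtracted from every row.
centered = a - a.mean(axis=0)

# Basic linear algebra: solve the 2x2 system M x = b.
M = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(M, b)

# Fourier transform: a sine with 4 cycles over 32 samples peaks in frequency bin 4.
signal = np.sin(2 * np.pi * 4 * np.arange(32) / 32)
peak_bin = int(np.argmax(np.abs(np.fft.rfft(signal))))

# Random number generation with a seeded generator.
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)
```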
SciPy
scipy.cluster
- Hierarchical clustering
scipy.integrate
- Integration (uni- and multi-dimensional) and ODE solvers
scipy.interpolate
- Interpolation (uni- and multi-dimensional, splines, etc.)
scipy.io
- Various file readers / writers (including Matlab format)
scipy.linalg
- Linear algebra routines
scipy.ndimage
- Image processing routines (filters, interpolation, i/o, morphology, etc.)
scipy.optimize
- Optimization (too many methods to list!)
scipy.signal
- DSP routines (filters, wavelets, spectrogram analysis)
scipy.sparse
- Sparse matrix data structures and graph routines
scipy.special
- Various special functions (Bessel functions, orthogonal polynomials, etc.)
scipy.stats
- Statistical tests, pdf parameter fitting, Gaussian KDE, etc.
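To give a feel for how these submodules are used, here is a short sketch touching three of them (scipy.integrate, scipy.optimize, scipy.interpolate). The test functions are arbitrary choices for illustration.

```python
import numpy as np
from scipy import integrate, interpolate, optimize

# scipy.integrate: the integral of sin(x) over [0, pi] is exactly 2.
area, error_estimate = integrate.quad(np.sin, 0.0, np.pi)

# scipy.optimize: minimize a one-dimensional quadratic with its minimum at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2 + 1.0)

# scipy.interpolate: cubic spline through 20 samples of cos(x), evaluated at x = 1.
xs = np.linspace(0.0, 2.0 * np.pi, 20)
spline = interpolate.CubicSpline(xs, np.cos(xs))
value = float(spline(1.0))
```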
Matplotlib
- 2D / 3D plotting
- Renders $\rm\LaTeX$
- Animation writers - gif / mpeg
- Interactive GUI stuff - sliders, buttons, etc.
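A small plotting sketch, assuming a non-interactive backend and a temporary output file (both are choices made for this example). The labels use Matplotlib's built-in mathtext, which renders LaTeX-style math without a TeX installation.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 200)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label=r"$\sin(x)$")
ax.plot(x, np.cos(x), "--", label=r"$\cos(x)$")
ax.set_xlabel(r"$x$")
ax.set_title(r"Mathtext labels: $e^{i\pi} + 1 = 0$")
ax.legend()

# Write the figure to a PNG in the system temp directory (hypothetical file name).
out_path = os.path.join(tempfile.gettempdir(), "trig_demo.png")
fig.savefig(out_path, dpi=100)
plt.close(fig)
```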
Compiling Python code to C - Cython
- Involves typing variables and function return types / arguments
- Sometimes useful to speed up code after it has been prototyped in Python
- Take care to type *all* variables in performance-critical code; untyped ones are still handled as Python objects, so the code may stay slow
Cython example
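The ip_cython.pyx listing is not reproduced here. As a stand-in, and assuming the "ip" in the file name refers to an inner product, the pure-Python prototype below shows the kind of loop-heavy function one might later type and compile. The function name and data are hypothetical; the comments mark where Cython type declarations would typically go.

```python
def inner_product(u, v):
    """Inner product of two equal-length sequences via an explicit loop."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    # In a Cython version, the accumulator would typically be declared
    # as a C double and the loop index as a C integer.
    total = 0.0
    for i in range(len(u)):
        # With untyped arguments, each u[i] * v[i] is a Python object operation;
        # this is where typed arguments would pay off.
        total += u[i] * v[i]
    return total


u = [0.0, 1.0, 2.0, 3.0, 4.0]
v = [2.0, 2.0, 2.0, 2.0, 2.0]
result = inner_product(u, v)  # 2 * (0 + 1 + 2 + 3 + 4) = 20
```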
Compiling the ".pyx" Cython file to C:
# Creates ip_cython.c file.
cython ip_cython.pyx
# Creates ".so" file that can be imported as Python module.
gcc -shared -pthread -fPIC -fwrapv -O2 -Wall \
-fno-strict-aliasing -I/usr/include/python2.7 \
-o ip_cython.so ip_cython.c
$\approx$ 10x speedup (for a contrived example)!
Calling Fortran from Python - f2py
- Utility that comes packaged with NumPy
- Creates Python wrappers for Fortran modules/subroutines/functions
- Numerical variables / NumPy arrays are passed through the Python wrapper function to the Fortran routines, where they are treated as native Fortran objects
f2py example
File: ip_fortran.f90
f2py -c -m ip_fortran ip_fortran.f90
scikit-learn (aka "sklearn")
- Classification - SVM, Logistic regression, Random Forests, neural networks, etc.
- Regression - SVR, Random Forest Regression, ridge, LASSO, etc.
- Clustering - K-means, spectral clustering, etc.
- Dimensionality reduction - PCA, kernel PCA, Laplacian Eigenmaps, etc.
- Model selection - Grid search, cross validation, performance metrics
- Pre-processing - Filling missing values, encoding labels, etc.
Sklearn example - classification with SVM
python ./sklearn-example.py
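The contents of sklearn-example.py are not shown on the slide. A minimal sketch of SVM classification, assuming the bundled iris dataset, a standard-scaling step, and an RBF kernel (all of these are choices made for this example, not necessarily the author's):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a small labeled dataset: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for testing, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Pre-processing and classifier chained into one estimator.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

# Fraction of held-out samples classified correctly.
accuracy = model.score(X_test, y_test)
print("test accuracy:", accuracy)
```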
Symbolic Math - SymPy
- Used by Sage Math
- Calculus
- Combinatorics
- Equation solving
- Polynomials
SymPy example
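The slide's own listing is missing, so here is a brief sketch covering the listed areas (calculus, equation solving, polynomials, combinatorics). The specific expressions are arbitrary illustrations.

```python
import sympy as sp

x = sp.symbols("x")

# Calculus: symbolic derivative and an improper definite integral.
derivative = sp.diff(sp.sin(x) * sp.exp(x), x)
integral = sp.integrate(x * sp.exp(-x), (x, 0, sp.oo))

# Equation solving: roots of x^2 - 5x + 6.
roots = sp.solve(x**2 - 5 * x + 6, x)

# Polynomials: factor x^3 - 1 over the rationals.
factored = sp.factor(x**3 - 1)

# Combinatorics: the binomial coefficient "10 choose 3".
n_choose_k = sp.binomial(10, 3)
```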
Symbolic computation chains, symbolic differentiation, CPU/GPU compilation - Theano
- Create computation chains symbolically
- Symbolic differentiation
- Compiles symbolic Python computation chains to CPU/GPU code
- Widely used by neural network / "deep learning" folks, but it is not a neural-network-specific library (unlike, e.g., Caffe)
Matrix multiplication - Theano vs. NumPy / GPU vs. CPU
Theano example - grad. descent with symbolic gradient computation
$$
\text{Solve:} \;\; \min_a SSE := \min_a \sum_i (y_i - ax_i^2)^2
$$
$$
a_{n+1} \leftarrow a_n - \left. \frac{d SSE}{da} \right|_{a = a_n}
$$
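The Theano listing for this problem is not reproduced here. The sketch below illustrates the same idea (gradient descent driven by a symbolically computed derivative), but it swaps in SymPy for Theano. The synthetic data, the starting value, and the step size `learning_rate` are all assumptions; note that the update formula above is written without an explicit step size, whereas this sketch scales the derivative by one.

```python
import sympy as sp

# Hypothetical noise-free data generated from y = 3 * x^2.
xs = [k / 10.0 for k in range(-10, 11)]
ys = [3.0 * xi**2 for xi in xs]

# Build SSE(a) = sum_i (y_i - a * x_i^2)^2 symbolically in the unknown a.
a = sp.symbols("a")
sse = sum((yi - a * xi**2) ** 2 for xi, yi in zip(xs, ys))

# Symbolic differentiation, then compilation to a plain Python function.
grad = sp.diff(sse, a)
grad_fn = sp.lambdify(a, grad, modules="math")

learning_rate = 0.05  # assumed step size; not part of the slide's formula
a_n = 0.0             # assumed starting value
for _ in range(100):
    a_n = a_n - learning_rate * grad_fn(a_n)

print("estimated a:", a_n)
```

Because the data are noise-free, the iterates approach the generating coefficient 3.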
scikit-image
- More advanced image-processing methods (than, e.g., scipy.ndimage)
- Low-level ops: exposure, histogram equalization, edge detection
- Mesh generation: marching cubes
- Object detection: Hough transforms, corner/blob detection, template matching
- Gabor filtering / texture analysis
- Lots of filtering: deconvolution, denoising, in-painting
- Segmentation: active contours, random walker, watershed, Otsu thresholding
- Generated using skimage's marching cubes method on boolean volume.
- Meshed surface plotted using Matplotlib
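The figure described above is not available, but the meshing step can be sketched as follows. The boolean volume here is a made-up solid sphere, and the sketch assumes a recent scikit-image release in which `skimage.measure.marching_cubes` returns vertices, faces, normals, and values.

```python
import numpy as np
from skimage import measure

# Hypothetical boolean volume: a solid sphere of radius 10 inside a 32^3 grid.
grid = np.mgrid[:32, :32, :32]
volume = ((grid - 15.5) ** 2).sum(axis=0) < 10.0**2

# Extract the surface at the boundary between False (0) and True (1) voxels.
verts, faces, normals, values = measure.marching_cubes(
    volume.astype(np.float32), level=0.5
)

# verts holds (N, 3) vertex coordinates; faces holds (M, 3) vertex indices.
print(verts.shape, faces.shape)
```

The resulting triangles can then be passed to a 3D plotting routine, such as Matplotlib's mplot3d toolkit.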
Honorable mentions
- Pandas - data structures and analysis
- Pydicom - DICOM (standard medical image format) file parser
- TensorFlow - Google's machine learning library
- Keras - Neural network library that uses either Theano or TensorFlow
- Mayavi - True 3D visualization library, built on VTK
- SQLAlchemy - Object relational mapper for mapping Python object attributes to SQL databases
- Django - Web development framework
- Jinja - Templating engine, used by Mozilla, SourceForge, Instagram, NPR
The End
Go forth, and use Python.