Ali Sherief

Posted on Apr 13, 2020

Scipy at lightspeed ⚡ Part 1

#python

Having covered numpy in the last couple articles, we will now turn our attention to the other numerical module, scipy. Scipy is part of the family of numerical Python libraries, and it is used universally because of it's large repertoire of scientific funcitons, as the name implies.

This looks like it's going to be a multi-part series again, so buckle up and without further ado, lets see what it's got. This is what it's main documentation page says it has and can do:

Clustering algorithms
Physical and mathematical constants
Fast Fourier Transform routines
Integration and ordinary differential equation solvers
Interpolation and smoothing splines
Input and Output
Linear algebra
N-dimensional image processing
Orthogonal distance regression
Optimization and root-finding routines
Signal processing
Sparse matrices and associated routines
Spatial data structures and algorithms
Statistical distributions and functions

And other special functions. Notice how it has signal processing routines, things that I talked about in my extensive Python audio series. The funcitons contained here are more fundamental in their nature and handle the math behind audio transformations. Oftentimes, the audio libraries depend on scipy to provide these routines.

Most scipy submodules are under the base scipy module, such as scipy.linalg, scipy.optimize among others.

Relevant numpy features

Polynomials

This is a numpy feature but it's relevant to the discussion of scipy functions here. The numpy.poly1d class provides polynomial capabilities. Many math functions can operate on polynomials so a programmatic representation is needed for them. Polynomials are specified in the form p = numpy.poly1d([3,4,5]). This creates a polynomial of

3x**2+4x+5

. Each number in the list is used as a coefficient for the x in that particular position, but reversed. I mean that the first element in the list "3" will is used for the x of exponent 2, and because there are only three elements in the list, the polynomial created will have an order of 3-1=2, so a 2nd order polynomial (for it has an x squared term). The next element in the list, 4, will be used for the x with exponent 1. The final element, 5, is the constant (actually it's the x with power 0).

Integrals can be taken with its integ() method, derivatives with its deriv() method and you can even multiply two polynomials together.

>>> from numpy import poly1d
>>> p = poly1d([3,4,5])
>>> print(p)
   2
3 x + 4 x + 5
>>> print(p*p)
   4      3      2
9 x + 24 x + 46 x + 40 x + 25
>>> print(p.integ(k=6))
   3     2
1 x + 2 x + 5 x + 6
>>> print(p.deriv())
6 x + 4
>>> p([4, 5])
array([ 69, 100])

Function vectorization

This is also a numpy feature but it's worth pointing out that not only plain numbers cann be broadcasted, but also functions that take numbers and arrays. The return value of the function is subject to the same broadcasting rules as ordinary numpy arrays.

>>> def addsubtract(a,b):
...    if a > b:
...        return a - b
...    else:
...        return a + b
>>> vec_addsubtract = np.vectorize(addsubtract)
>>> vec_addsubtract([0,3,6,9],[1,3,5,7])
array([1, 6, 1, 2])

The np.vectorize() function will be important later on for vectorizing functions that call special scientific functions.

Logspace

Did you know that in addition to linspace() we also have logspace()? This will create a certain number of elements spaced with logarithmic length instead of linear length.

In [1]: numpy.linspace(1, 50)                                                   
Out[1]: 
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26.,
       27., 28., 29., 30., 31., 32., 33., 34., 35., 36., 37., 38., 39.,
       40., 41., 42., 43., 44., 45., 46., 47., 48., 49., 50.])

In [2]: numpy.logspace(1, 50)                                                   
Out[2]: 
array([1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05, 1.e+06, 1.e+07, 1.e+08,
       1.e+09, 1.e+10, 1.e+11, 1.e+12, 1.e+13, 1.e+14, 1.e+15, 1.e+16,
       1.e+17, 1.e+18, 1.e+19, 1.e+20, 1.e+21, 1.e+22, 1.e+23, 1.e+24,
       1.e+25, 1.e+26, 1.e+27, 1.e+28, 1.e+29, 1.e+30, 1.e+31, 1.e+32,
       1.e+33, 1.e+34, 1.e+35, 1.e+36, 1.e+37, 1.e+38, 1.e+39, 1.e+40,
       1.e+41, 1.e+42, 1.e+43, 1.e+44, 1.e+45, 1.e+46, 1.e+47, 1.e+48,
       1.e+49, 1.e+50])

# This is natural logarithm "ln"
# Logspaced arrays have the property that a pair of consecutive numbers in a logspace
# like the one above can be log'd with any base and subtracted from one another
# and result in the same difference (logarithmic distance).
# Note that this difference is not equal to the base of the logarithm.
In [10]: numpy.log(1.e+02) - numpy.log(1.e+01)                                  
Out[10]: 2.302585092994046    # This is *not* e

In [11]: numpy.log(1.e+03) - numpy.log(1.e+02)                                  
Out[11]: 2.302585092994045

# Base 10 logarithm but produces 1
In [12]: numpy.log10(1.e+03) - numpy.log10(1.e+02)                              
Out[12]: 1.0

In [13]: numpy.log10(1.e+04) - numpy.log10(1.e+03)                              
Out[13]: 1.0

The scipy functions

Bessel, gamma and other special functions

Basic mathematical explinations for these are required, but I'm not going to stuff a math analysis in this section, there are too many scipy fuctions to cover for that. Understanding the equations is certainly not required to read this section.

Almost all of these are universial functions (ufuncs) so that means they can take vectorized arguments and return vectorized results.

Airy functions

First off, the Airy functions are the ones which are solutions to the differential equation

\frac{d^2y}{dx^2} - xy = 0

They can be called in Scipy using scipy.special.airy().

>>> import numpy as np
>>> from scipy import special
>>> x = np.linspace(-15, 5, 201)
>>> ai, aip, bi, bip = special.airy(x)
>>>
>>> import matplotlib.pyplot as plt
>>> plt.plot(x, ai, 'r', label='Ai(x)')
>>> plt.plot(x, bi, 'b--', label='Bi(x)')
>>> plt.ylim(-0.5, 1.0)
>>> plt.grid()
>>> plt.legend(loc='upper left')
>>> plt.show()

There's also an exponentially scaled version of airy called special.airye(). The scaled results are calulated as follows:

${eAi = Ai * \it e ^ {\frac{2}{3} z \sqrt{z}}} \newline {eAip = Aip * \it e ^{\frac{2}{3} z \sqrt{z}}} \newline {eBi = Bi * \it e ^{-\lvert \frac{2}{3} \mathbb Re (z \sqrt{z})\rvert}} \newline {eBip = Bip * \it e ^{-\lvert \frac{2}{3} \mathbb Re (z \sqrt{z})\rvert}}$

As well as a function which computes the integral of the Airy function from 0 to x, called special.itairy().

Elliptic functions

Next, we have jacobi elliptic functions which are defined as:

u = \int_{0}^{\varphi}\frac{d\varphi}{\sqrt{1 - m \sin^2\theta}}

And this is located at special.ellipj(). Other elliptic functions are the incomplete/complete elliptic integrals of the first/second kinds, and there is a function for each of these four combinations.

From the scipy docstring:

ellipk -- Complete elliptic integral of the first kind.
ellipkm1 -- Complete elliptic integral of the first kind around m = 1
ellipkinc -- Incomplete elliptic integral of the first kind
ellipe -- Complete elliptic integral of the second kind
ellipeinc -- Incomplete elliptic integral of the second kind

Bessel functions

Bessel functions are solutions to the differential equation

x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - \alpha^2)y = 0

But the bessel function also has a mathematical definition, and it is

J_{\alpha}(x) = \sum_{i = 0}^{\infty}\frac{(-1)^m}{m!\Gamma(m+\alpha+1)}(\frac{x}{2})^{2m+\alpha}

The gamma symbol represents the Gamma function, which I will cover later. The forumla I posted is for the bessel function of the first kind. There are actually two kinds of bessel functions. Each of these kinds have an arbitrary number of integer or real orders. There is also a "Modified" bessel function with a different equation.

(It's turning out to be very easy and convenient for me to copy and paste docstrings here as long as I give attribution)

jv -- Bessel function of the first kind of real order and complex argument.
jve -- Exponentially scaled Bessel function of order v.
yn -- Bessel function of the second kind of integer order and real argument.
yv -- Bessel function of the second kind of real order and complex argument.
yve -- Exponentially scaled Bessel function of the second kind of real order.
kn -- Modified Bessel function of the second kind of integer order n
kv -- Modified Bessel function of the second kind of real order v
kve -- Exponentially scaled modified Bessel function of the second kind.
iv -- Modified Bessel function of the first kind of real order.
ive -- Exponentially scaled modified Bessel function of the first kind
hankel1 -- Hankel function of the first kind
hankel1e -- Exponentially scaled Hankel function of the first kind
hankel2 -- Hankel function of the second kind
hankel2e -- Exponentially scaled Hankel function of the second kind

There are also functions which let you look for the zeroes of bessel functions, where nt corresponds to a number:

jnjnp_zeros -- Compute zeros of integer-order Bessel functions Jn and Jn'.
jnyn_zeros -- Compute nt zeros of Bessel functions Jn(x), Jn'(x), Yn(x), and Yn'(x).
jn_zeros -- Compute zeros of integer-order Bessel function Jn(x).
jnp_zeros -- Compute zeros of integer-order Bessel function derivative Jn'(x).
yn_zeros -- Compute zeros of integer-order Bessel function Yn(x).
ynp_zeros -- Compute zeros of integer-order Bessel function derivative Yn'(x).
y0_zeros -- Compute nt zeros of Bessel function Y0(z), and derivative at each zero.
y1_zeros -- Compute nt zeros of Bessel function Y1(z), and derivative at each zero.
y1p_zeros -- Compute nt zeros of Bessel derivative Y1'(z), and value at each zero.

There are also optimized bessel functions available for orders 0 and 1:

j0 -- Bessel function of the first kind of order 0.
j1 -- Bessel function of the first kind of order 1.
y0 -- Bessel function of the second kind of order 0.
y1 -- Bessel function of the second kind of order 1.
i0 -- Modified Bessel function of order 0.
i0e -- Exponentially scaled modified Bessel function of order 0.
i1 -- Modified Bessel function of order 1.
i1e -- Exponentially scaled modified Bessel function of order 1.
k0 -- Modified Bessel function of the second kind of order 0, :math:K_0.
k0e -- Exponentially scaled modified Bessel function K of order 0
k1 -- Modified Bessel function of the second kind of order 1, :math:K_1(x).
k1e -- Exponentially scaled modified Bessel function K of order 1

As well as integrals and derivatives of the bessel functions:

itj0y0 -- Integrals of Bessel functions of order 0
it2j0y0 -- Integrals related to Bessel functions of order 0
iti0k0 -- Integrals of modified Bessel functions of order 0
it2i0k0 -- Integrals related to modified Bessel functions of order 0
besselpoly -- Weighted integral of a Bessel function.
jvp -- Compute nth derivative of Bessel function Jv(z) with respect to z.
yvp -- Compute nth derivative of Bessel function Yv(z) with respect to z.
kvp -- Compute nth derivative of real-order modified Bessel function Kv(z)
ivp -- Compute nth derivative of modified Bessel function Iv(z) with respect to z.
h1vp -- Compute nth derivative of Hankel function H1v(z) with respect to z.
h2vp -- Compute nth derivative of Hankel function H2v(z) with respect to z.

Please note that all of these functions are under the scipy.special submodule.

Side note: You might be wondering why extra functions are needed to compute integrals and derivatives of special functions. Well the answer is that integration and differentiation are analytical (i.e. symbolic, the kind you learned in high school) processes and these special functions I listed here are complicated, so they would either take too long or be completely incorrect using the analytical procedure. Efficient formulas for these integrals and derivatives have been discovered by mathematicians over the years.

Gamma and beta functions

The gamma and beta functions are closely related to each other. The gamma function is also used in the computation of the bessel function above. The gamma funciton is calculated as

\Gamma(x) = \int_{0}^{\infty}x^{z-1}e^{-x}dx,\quad\mathbb{Re}(z) > 0

Or even just, for positive greater than 0 integer n:

\Gamma(n) = (n-1)!

And the formula for the beta function is:

\Beta(x, y) = \int_{0}^{1}t^{x-1}(1-t)^{y-1}dt\quad\mathbb{Re}(x) > 0,\enspace \mathbb{Re}(y) > 0

Here are plots for both of these funcitons. For the beta function, since it is a function of two variables, it does not make sense to plot them on a normal 2D line graph, because there are two variables and a result, so a contour plot is used instead, with lighter red colors indicating values closer to zero, the blue values represent $\infty$ .

And of course here are the scipy functions for these. There are also different variants of the gamma function like the polygamma function and multivariate gamma.

gamma -- Gamma function.
gammaln -- Logarithm of the absolute value of the Gamma function for real inputs.
loggamma -- Principal branch of the logarithm of the Gamma function.
gammasgn -- Sign of the gamma function.
gammainc -- Regularized lower incomplete gamma function.
gammaincinv -- Inverse to gammainc
gammaincc -- Regularized upper incomplete gamma function.
gammainccinv -- Inverse to gammaincc
beta -- Beta function.
betaln -- Natural logarithm of absolute value of beta function.
betainc -- Incomplete beta integral.
betaincinv -- Inverse function to beta integral.
psi -- The digamma function.
rgamma -- Gamma function inverted
polygamma -- Polygamma function n.
multigammaln -- Returns the log of multivariate gamma, also sometimes called the generalized gamma.
digamma -- psi(x[, out])
poch -- Rising factorial (z)_m

Error functions

Yes, there are actually special functions with the mathematical name "error". But these do not represent whether something has gone wrong, they compute a value that is widely used in scientific applications which might have an inaccurate value. An example where error functions are used is in information theory, where they calculate the probability that a digital signal is noisy and has the wrong bits set.

The error funciton, commonly written as "erf", has the formula and graph:

erf(x) = \frac{1}{\sqrt{\pi}}\int_{-x}^{x}e^{t^{2}}dt = \frac{2}{\sqrt{\pi}}\int_{0}^{x}e^{t^{2}}dt

There is also the compliment of the error function which is $erfc(x) = 1 - erf(x)$ , as well as the imaginary error function $erfi(x) = -i \thickspace erf(i \thickspace x)$

erf -- Returns the error function of complex argument.
erfc -- Complementary error function, 1 - erf(x).
erfcx -- Scaled complementary error function, exp(x**2) * erfc(x).
erfi -- Imaginary error function, -i erf(i z).
erfinv -- Inverse function for erf.
erfcinv -- Inverse function for erfc.

Fresnel integrals

Closely related to the error functions are the sine and cosine Fresnel integrals. They have similar terms as the ones in the error function. They are defined as follows:

$S(x) = \int_{0}^{x} \sin(t^{2}) dt = \sum_{n=0}^{\infty}(-1^{n})\frac{x^{4n+3}}{(2n+1)!(4n+3)} \newline C(x) = \int_{0}^{x} \cos(t^{2}) dt = \sum_{n=0}^{\infty}(-1^{n})\frac{x^{4n+1}}{(2n)!(4n+1)}$

The Fresnel sine integral has a sine function in it, and the fresnel cosine integral has a cosine function in it.

In fact, there is a special graph made if you use C(t),S(t) as the x and y points. It's called an Euler spiral.

In scipy you use this fucniton like s, c = scipy.special.fresnel(x).

Bernoulli numbers

The Bernoulli numbers are a sequence of numbers which are used in many other numerical functions. The first few numbers are $B^{+} = 1, 1/2, 1/6, 0, -1/30, 0, 1/42 0, -1/30, 0, 5/66, 0, ...$ . Another commonly used version of the Bernoulli sequence is called $B^{-}$ , and the only difference is that it has $-1/2$ as the second element instead of $1/2$ .

The scipy function that makes Bernoulli numbers is called scipy.special.bernoulli(n), this returns the first N bernoulli numbers as an array.

Information theory functions

The only function I'm going to cover here is the one that produces entropy. The entropy function, $entr(x) = -x\log(x)$ if $x > 0$ , $0$ if $x = 0$ , and $\infty$ if $x < 0$ . A mathematical explination of this is beyond the scope of this blog post.

(Meta note: katex's cases environment doesn't seem to be making newlines properly, so I made it all inline.)

The function that computes this is scipy.special.entropy(x). As usual this is also a ufunc so you can pass an array of inputs and get an array of entropy results.

Combinatoric functions

The usual functions you expect are here, factorials, combinations, and permutations are available. There is also "double factorial" and "multifactorial" functions available which do what they say on the can. e.g. Double factorial of 5 is 5!! = 5 * 3 * 1 = 15

In order, the scipy funcitons for these are factorial, comb, perm, factorial2 and factorialk respectively.

There is also a function for calculating a binomial coefficient at scipy.special.binom().

Statistical distributions

We now cover the elephant in the room. There is a gold mine of statistical and distributions so I will only give a cursory overfiew of the most important ones. These are the most important distrubitions, and for brevity I'm only listing their culmunative distribution counterparts:

bdtr -- Binomial distribution cumulative distribution function.
btdtr -- Cumulative distribution function of the beta distribution.
btdtri -- The p-th quantile of the beta distribution.
fdtr -- F cumulative distribution function.
fdtri -- The p-th quantile of the F-distribution.
gdtr -- Gamma distribution cumulative distribution function.
nbdtr -- Negative binomial cumulative distribution function.
pdtr -- Poisson cumulative distribution function
stdtr -- Student t distribution cumulative distribution function
chdtr -- Chi square cumulative distribution function
ndtr -- Gaussian cumulative distribution function.
log_ndtr -- Logarithm of Gaussian cumulative distribution function.

Here's a graph of the binomial culmunative distribution:

Graph of the gaussian culmnunative distribution:

And lastly a graph of the student t distribution. There were too many distributions above to list all of their graphs here.

Miscellaneous convenience functions

The following are elementary functions that don't fit anywhere else, but which people might be exhausted from writing every time they create a new program or library:

cbrt -- Cube root of x
exp10 -- 10*x
exp2 -- 2*x
radian -- Convert from degrees to radians
cosdg -- Cosine of the angle x given in degrees.
sindg -- Sine of angle given in degrees
tandg -- Tangent of angle x given in degrees.
cotdg -- Cotangent of the angle x given in degrees.
log1p -- Calculates log(1+x) for use when x is near zero
expm1 -- exp(x) - 1 for use when x is near zero.
cosm1 -- cos(x) - 1 for use when x is near zero.
round -- Round to nearest integer
xlogy -- Compute x*log(y) so that the result is 0 if x = 0.
xlog1py -- Compute x*log1p(y) so that the result is 0 if x = 0.
logsumexp -- Compute the log of the sum of exponentials of input elements.
exprel -- Relative error exponential, (exp(x)-1)/x, for use when x is near zero.
sinc -- Return the sinc function.

Closing words

In summary, we went through the special functions in Scipy, and saw the basic mathematical structure of some of them.

If you see any errors in this post, please let me know so I can correct them.

Function declaration snippets were sourced from scipy docstrings.

Image by Craig Melville from Pixabay

DEV Community