Numerical integration --- Mathematica vs Python (w/ Scipy) performance

I'm about to tackle a problem that involves a lot of (multi-dimensional) numerical integrations and also subsequent optimizations, and so I want to make sure at least the integration step is as fast as possible.

The illustrative problem is simple. Suppose I want to compute $E[X]$ where $X \sim N(0,1)$. Of course, the analytical solution is simply $0$.

On the Mathematica side, I'll compare both the built-in NormalDistribution function and also a hand-built Gaussian PDF.

Mathematica via NormalDistribution

rvdist = NormalDistribution[];
Expectation[ x, x \[Distributed] rvdist] // AbsoluteTiming;

(* {0.001126, 2} *)

Mathematica via handwritten PDF

testpdf[x_] := Module[ {μ, σ},
   μ = 0; 
   σ = 1; 
   1/Sqrt[ 2 * Pi * σ^2] * 
    Exp[ - ((x - μ)^2/(2* σ^2))] 
   ];
NIntegrate[ 

  x * testpdf[x], {x, -Infinity, Infinity}] // AbsoluteTiming 
(* {0.009001, 0.} *)

Now let's turn to the Python with Scipy side.

import numpy as np
import scipy
import scipy.integrate
import scipy.stats
import time


def intfun1(): 
    rv = scipy.stats.norm 

    tic = time.time() 
    out = rv.expect( lambda x : x ) 
    toc = time.time() 

    print(out)
    print(toc - tic)


intfun1()
# 0.0
# 0.0021378993988

And also via a handwritten PDF, I have,

def intfun2(): 
    mu = 0 
    sig = 1 

    def npdf(x): 

        return 1.0 / np.sqrt( 2 * np.pi * sig**2 ) * np.exp( - (x - mu)**2 / (2 * sig**2) ) 

    tic = time.time()
    out = scipy.integrate.quad( lambda x : x * npdf(x), -np.inf, np.inf) 
    toc = time.time() 

    print(out)
    print(toc - tic)

intfun2() 

# (0.0, 0.0)
# 0.000141859054565

Perhaps I misunderstand what exactly goes on behind Mathematica's AbsoluteTiming and also Python's time.time(), but otherwise, it seems like Python has a substantial speed increase over Mathematica.

Questions:

Why is Mathematica's numerical integration slower than Python?

If possible, how to make Mathematica faster or as fast as Python?

If speed is really an issue (again, the application is really for a more complicated integration problem along with optimization), is it better to just write my problem in Python rather than Mathematica?

Answer

General comments

First, if you plan to use multi-dimensional integrals it is better to test with multi-dimensional integrals not with one dimensional ones. One might think that the test in the question is an appropriate one if multi-dimensional integration is done by the integrator in a recursive manner. This seems to be case for scipy.integrate.nquad (see scipy.integrate.nquad.html), but it is not for NIntegrate. NIntegrate constructs and utilizes proper multi-dimensional integration rules and/or strategies.

Second, I do not think this is a test from which we can make general conclusions for the speed of a numerical integrator. The integral is too specific: an odd function over (-Infinity, Infinity). (Evaluates to zero.) I assume it is chosen with the specific research to be undertaken in mind.

Third, for very high dimensions the more useful integration strategies are (quite) different than the useful integration strategies in low dimensions. The precision and accuracy goals sought after are much smaller. These observations make the selected test less relevant.

Fourth, NIntegrate plays very well with the optimization functions in Mathematica. I would assume you would be better off using Mathematica than Python, but I do not have much experience with NumPy and SciPy.

More technically

it is better to call the integration routine multiple times in order to get a better timing estimate. I wanted to modify and run both the Mathematica and Python tests like this but I found the installation of NymPy and SciPy to be too much work. For example:

def intfun3(ntimes): 
mu = 0 

sig = 1 

def npdf(x): 
    return 1.0 / np.sqrt( 2 * np.pi * sig**2 ) * np.exp( - (x - mu)**2 / (2 * sig**2) ) 

tic = time.time()
for i in range(1,ntimes) :
    out = scipy.integrate.quad( lambda x : x * npdf(x), -np.inf, np.inf)
toc = time.time() 


print out 
print (toc - tic) / ntimes 

intfun3(1000)

We can get NIntegrate to do the test around 5-6 times faster (on my laptop with Mathematica 10.2) by providing options settings that correspond to the default integration parameters arguments of scipy.integrate.quad. (I have read the descriptions of the parameters in scipy.integrate.quad.html ).

Here are the original and the modified tests:

testpdf[μ_, σ_, x_] := 1/Sqrt[2*Pi*`[Sigma]^2]*Exp[-((x - μ)^2/(2*σ^2))];`

n = 1000;

res = Do[NIntegrate[
     x*testpdf[0, 1, x], {x, -Infinity, Infinity}], {n}] // 
   AbsoluteTiming;
res[[1]]/n

(* Out[521]= 0.00485096 *)

n = 1000;
res = Do[NIntegrate[x*testpdf[0, 1, x], {x, -Infinity, Infinity}, 
     PrecisionGoal -> 8, AccuracyGoal -> 8, 

     Method -> {"DoubleExponential", 
       "SymbolicProcessing" -> 0}], {n}] // AbsoluteTiming;
res[[1]]/n

(* Out[533]= 0.00090782 *)

Using the option "SymbolicProcessing"->0 prevents NIntegrate to do symbolic preprocessing. (See "SymbolicProcessing".) For the integral we are discussing, with the default option settings NIntegrate detects it is an odd function over (-Infinity,Infinity) and integrates only over (0,Infinity) as a numerical check. See "EvenOddSubdivision"

The settings PrecisionGoal->8, AccuracyGoal->8 correspond to "epsabs=1.49e-08, epsrel=1.49e-08" in scipy.integrate.quad.html . Using the method "DoubleExponential" corresponds to the description "If one of the integration limits is infinite, then a Fourier integral is computed[...]" in scipy.integrate.quad.html .

Note that when using the option "SymbolicProcessing"->0, NIntegrate gives warnings that the integral does not converge quickly enough with the message:

NIntegrate::slwcon : "Numerical integration converging too slowly; suspect one of the following: singularity, value of the integration is 0, highly oscillatory integrand, or WorkingPrecision too small. "

Blog

Search This Blog