
Numerical integration --- Mathematica vs Python (w/ Scipy) performance


I'm about to tackle a problem that involves a lot of (multi-dimensional) numerical integrations and also subsequent optimizations, and so I want to make sure at least the integration step is as fast as possible.


The illustrative problem is simple. Suppose I want to compute $E[X]$ where $X \sim N(0,1)$. Of course, the analytical solution is simply $0$.


On the Mathematica side, I'll compare both the built-in NormalDistribution function and also a hand-built Gaussian PDF.


Mathematica via NormalDistribution


rvdist = NormalDistribution[];
Expectation[x, x \[Distributed] rvdist] // AbsoluteTiming

(* {0.001126, 0} *)

Mathematica via handwritten PDF


testpdf[x_] := Module[{μ, σ},
   μ = 0;
   σ = 1;
   1/Sqrt[2*Pi*σ^2]*Exp[-((x - μ)^2/(2*σ^2))]
   ];
NIntegrate[x*testpdf[x], {x, -Infinity, Infinity}] // AbsoluteTiming
(* {0.009001, 0.} *)

Now let's turn to Python with SciPy.


import numpy as np
import scipy
import scipy.integrate
import scipy.stats
import time


def intfun1():
    rv = scipy.stats.norm

    tic = time.time()
    out = rv.expect(lambda x: x)
    toc = time.time()

    print(out)
    print(toc - tic)


intfun1()
# 0.0
# 0.0021378993988

And via a handwritten PDF:


def intfun2():
    mu = 0
    sig = 1

    def npdf(x):
        return 1.0 / np.sqrt(2 * np.pi * sig**2) * np.exp(-(x - mu)**2 / (2 * sig**2))

    tic = time.time()
    out = scipy.integrate.quad(lambda x: x * npdf(x), -np.inf, np.inf)
    toc = time.time()

    print(out)
    print(toc - tic)


intfun2()
# (0.0, 0.0)
# 0.000141859054565

Perhaps I misunderstand what exactly goes on behind Mathematica's AbsoluteTiming and also Python's time.time(), but otherwise, it seems like Python has a substantial speed increase over Mathematica.


Questions:



  1. Why is Mathematica's numerical integration slower than Python?

  2. If possible, how to make Mathematica faster or as fast as Python?

  3. If speed is really an issue (again, the application is really for a more complicated integration problem along with optimization), is it better to just write my problem in Python rather than Mathematica?




Answer



General comments


First, if you plan to use multi-dimensional integrals, it is better to test with multi-dimensional integrals, not with one-dimensional ones. One might think that the test in the question is appropriate if the integrator performs multi-dimensional integration recursively, one dimension at a time. This seems to be the case for scipy.integrate.nquad (see scipy.integrate.nquad.html), but it is not for NIntegrate. NIntegrate constructs and utilizes proper multi-dimensional integration rules and/or strategies.
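To make the recursive case concrete, here is a minimal sketch of a two-dimensional test on the SciPy side. It is not from the question: the integrand, E[X^2 + Y^2] for two independent standard normals (exact value 2), and the helper names are my own choices for illustration.

```python
import numpy as np
from scipy import integrate


def bivariate_normal_pdf(x, y):
    """Joint density of two independent N(0, 1) variables."""
    return np.exp(-(x**2 + y**2) / 2.0) / (2.0 * np.pi)


def expected_squared_radius():
    """E[X^2 + Y^2] via nquad, which nests one-dimensional quad calls."""
    integrand = lambda x, y: (x**2 + y**2) * bivariate_normal_pdf(x, y)
    limits = [[-np.inf, np.inf], [-np.inf, np.inf]]
    value, abserr = integrate.nquad(integrand, limits)
    return value, abserr


value, abserr = expected_squared_radius()
print(value, abserr)  # value should be close to the exact answer 2
```

Timing a test like this one, rather than the one-dimensional integral, would say more about how the two systems compare on the actual problem.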


Second, I do not think this is a test from which we can make general conclusions for the speed of a numerical integrator. The integral is too specific: an odd function over (-Infinity, Infinity). (Evaluates to zero.) I assume it is chosen with the specific research to be undertaken in mind.
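As a hedged illustration of why the odd integrand is a special case, the sketch below compares it with the second moment E[X^2], which has the nonzero exact value 1. The choice of E[X^2] as a comparison integrand is mine, not the question's.

```python
import numpy as np
from scipy import integrate


def npdf(x, mu=0.0, sig=1.0):
    """Gaussian density, same formula as the handwritten PDF in the question."""
    return np.exp(-(x - mu)**2 / (2 * sig**2)) / np.sqrt(2 * np.pi * sig**2)


# Odd integrand: positive and negative contributions cancel.
first_moment, err1 = integrate.quad(lambda x: x * npdf(x), -np.inf, np.inf)

# Even integrand with a nonzero result: the integrator has to resolve
# the actual shape of the function to meet its error tolerances.
second_moment, err2 = integrate.quad(lambda x: x**2 * npdf(x), -np.inf, np.inf)

print(first_moment, err1)
print(second_moment, err2)
```

A benchmark built on the second integrand (or on integrands close to the real application) is less likely to reward an integrator for cancellation alone.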


Third, for very high dimensions the useful integration strategies are (quite) different from the ones that are useful in low dimensions. The precision and accuracy goals one can realistically aim for are also much lower. These observations make the selected test less relevant.
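For instance, in high dimensions one commonly falls back on Monte Carlo estimates. The following is only a sketch under my own assumptions (a 10-dimensional standard normal and the integrand ||X||^2, whose exact expectation equals the dimension); it is not a method taken from the question.

```python
import numpy as np


def mc_expected_squared_norm(dim=10, n_samples=200_000, seed=0):
    """Monte Carlo estimate of E[||X||^2] for X ~ N(0, I_dim).

    Returns the estimate and its standard error. The exact value is dim.
    """
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal((n_samples, dim))
    values = np.sum(samples**2, axis=1)
    estimate = values.mean()
    std_error = values.std(ddof=1) / np.sqrt(n_samples)
    return estimate, std_error


estimate, std_error = mc_expected_squared_norm()
print(estimate, std_error)
```

The standard error shrinks only like 1/sqrt(n_samples), which is why the attainable precision is far lower than the roughly 8 digits requested in the one-dimensional test.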


Fourth, NIntegrate plays very well with the optimization functions in Mathematica. I would assume you would be better off using Mathematica than Python, but I do not have much experience with NumPy and SciPy.


More technically


It is better to call the integration routine multiple times in order to get a better timing estimate. I wanted to modify and run both the Mathematica and Python tests like this, but I found the installation of NumPy and SciPy to be too much work. For example, the Python test could be modified like this:


def intfun3(ntimes):
    mu = 0
    sig = 1

    def npdf(x):
        return 1.0 / np.sqrt(2 * np.pi * sig**2) * np.exp(-(x - mu)**2 / (2 * sig**2))

    tic = time.time()
    for i in range(ntimes):  # run exactly ntimes integrations
        out = scipy.integrate.quad(lambda x: x * npdf(x), -np.inf, np.inf)
    toc = time.time()

    print(out)
    print((toc - tic) / ntimes)  # average seconds per call


intfun3(1000)
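An alternative that avoids hand-written timing loops is Python's standard timeit module. The sketch below is my own variant rather than something from the question; the helper names integrate_once and seconds_per_call are hypothetical.

```python
import timeit

import numpy as np
from scipy import integrate


def npdf(x, mu=0.0, sig=1.0):
    """Gaussian density, same formula as in the question."""
    return np.exp(-(x - mu)**2 / (2 * sig**2)) / np.sqrt(2 * np.pi * sig**2)


def integrate_once():
    """One call of the integral being benchmarked."""
    return integrate.quad(lambda x: x * npdf(x), -np.inf, np.inf)


def seconds_per_call(number=200, repeat=5):
    """Best average time per call over `repeat` batches of `number` calls."""
    totals = timeit.repeat(integrate_once, number=number, repeat=repeat)
    return min(totals) / number


print(seconds_per_call())
```

Taking the minimum over several batches reduces the influence of other processes on the machine, which matters when single calls take well under a millisecond.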

We can get NIntegrate to do the test around 5-6 times faster (on my laptop with Mathematica 10.2) by providing option settings that correspond to the default integration parameters of scipy.integrate.quad. (I have read the descriptions of the parameters in scipy.integrate.quad.html.)


Here are the original and the modified tests:


testpdf[μ_, σ_, x_] := 1/Sqrt[2*Pi*σ^2]*Exp[-((x - μ)^2/(2*σ^2))];

n = 1000;

res = Do[NIntegrate[
x*testpdf[0, 1, x], {x, -Infinity, Infinity}], {n}] //
AbsoluteTiming;
res[[1]]/n

(* Out[521]= 0.00485096 *)

n = 1000;
res = Do[NIntegrate[x*testpdf[0, 1, x], {x, -Infinity, Infinity},
     PrecisionGoal -> 8, AccuracyGoal -> 8,
     Method -> {"DoubleExponential",
       "SymbolicProcessing" -> 0}], {n}] // AbsoluteTiming;
res[[1]]/n

(* Out[533]= 0.00090782 *)

Using the option "SymbolicProcessing"->0 prevents NIntegrate from doing symbolic preprocessing. (See "SymbolicProcessing".) For the integral we are discussing, with the default option settings NIntegrate detects that the integrand is an odd function over (-Infinity, Infinity) and integrates only over (0, Infinity) as a numerical check. (See "EvenOddSubdivision".)


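The settings PrecisionGoal->8, AccuracyGoal->8 correspond to "epsabs=1.49e-08, epsrel=1.49e-08" in scipy.integrate.quad.html. I chose the method "DoubleExponential" based on the description "If one of the integration limits is infinite, then a Fourier integral is computed[...]" in scipy.integrate.quad.html. Note, however, that this sentence in the SciPy documentation belongs to the description of the 'cos' and 'sin' weight functions; for an unweighted integrand with infinite limits, quad calls a QUADPACK routine that maps the infinite range onto a finite interval, so the correspondence between the two methods is only approximate.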
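On the SciPy side, the tolerances can be written out explicitly, and full_output=1 makes quad return an info dictionary whose 'neval' entry counts integrand evaluations. This is a small sketch of my own for inspecting how much work the integrator does; the helper name quad_with_diagnostics is hypothetical.

```python
import numpy as np
from scipy import integrate


def npdf(x, mu=0.0, sig=1.0):
    """Gaussian density, same formula as in the question."""
    return np.exp(-(x - mu)**2 / (2 * sig**2)) / np.sqrt(2 * np.pi * sig**2)


def quad_with_diagnostics():
    """Run quad with its default tolerances spelled out and return diagnostics."""
    value, abserr, info = integrate.quad(
        lambda x: x * npdf(x),
        -np.inf,
        np.inf,
        epsabs=1.49e-8,  # default absolute tolerance
        epsrel=1.49e-8,  # default relative tolerance
        full_output=1,   # also return the info dictionary
    )
    return value, abserr, info["neval"]


value, abserr, neval = quad_with_diagnostics()
print(value, abserr, neval)
```

Comparing the number of integrand evaluations, rather than only wall-clock time, gives a fairer picture of how the two integrators behave under matched tolerance settings.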


Note that when using the option "SymbolicProcessing"->0, NIntegrate gives warnings that the integral does not converge quickly enough with the message:


NIntegrate::slwcon : "Numerical integration converging too slowly; suspect one of the following: singularity, value of the integration is 0, highly oscillatory integrand, or WorkingPrecision too small. "


