I am using Mathematica 10.3 on a large GNU/Linux system (CentOS 6.6) with 40 processors and 250 GB of RAM. It is a delight to watch it chew through linear algebra problems using all the processors.
However, the system seems to become unstable, and the Mathematica kernel may even crash, after evaluating expressions that use all of these resources. (A built-in example is to evaluate the NDEigensystem help notebook on the 40-core machine.) It seems as though the system has run out of permissible user processes.
Here are the limits reported by zsh:
~> limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 10MB
coredumpsize 0kB
memoryuse unlimited
maxproc 1024
descriptors 1024
memorylocked 64kB
addressspace unlimited
maxfilelocks unlimited
sigpending 256546
msgqueue 819200
nice 0
rt_priority 0
This leads me to believe that the relevant settings are unlimited (although maxproc and descriptors are both capped at 1024).
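To test the suspicion about user processes directly, the finite limits can be compared with the number of threads the user currently owns, since on Linux threads count against the per-user process limit. The following is only a minimal Python sketch, not a definitive diagnostic. The mapping from the zsh labels to RLIMIT_* constants and the helper names are assumptions made for illustration, and the limits it prints are those inherited by the Python process from the shell that launched it.

```python
import os
import resource

# Labels follow the zsh `limit` output above; pairing them with these
# RLIMIT_* constants is an assumption of this sketch.
LIMITS = {
    "maxproc": resource.RLIMIT_NPROC,
    "descriptors": resource.RLIMIT_NOFILE,
    "stacksize": resource.RLIMIT_STACK,
    "memorylocked": resource.RLIMIT_MEMLOCK,
}


def fmt(value):
    """Render a limit value the way the shell does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    """Return {label: (soft, hard)} for the limits listed in LIMITS."""
    return {label: resource.getrlimit(res) for label, res in LIMITS.items()}


def count_user_threads(uid=None, proc_root="/proc"):
    """Count threads owned by `uid` by walking /proc/<pid>/task (Linux only).

    Returns 0 if the proc filesystem is not available.
    """
    uid = os.getuid() if uid is None else uid
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return 0
    total = 0
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        path = os.path.join(proc_root, entry)
        try:
            if os.stat(path).st_uid != uid:
                continue
            total += len(os.listdir(os.path.join(path, "task")))
        except OSError:
            continue  # process exited or is not readable; skip it
    return total


if __name__ == "__main__":
    for label, (soft, hard) in read_limits().items():
        print(f"{label:13s} soft={fmt(soft)} hard={fmt(hard)}")
    print("threads owned by this user:", count_user_threads())
```

Running this in a second terminal while the heavy evaluation is in progress would show whether the thread count approaches the maxproc value.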
I have also explored various settings for OMP_NUM_THREADS and MKL_NUM_THREADS. With these set to smaller values, e.g. MKL_NUM_THREADS=8, the instability does not seem to appear. However, the other cores then remain unused, and these really need to be in play for Mathematica to remain competitive with other technologies such as R linked with MKL.
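One way to narrow down where the instability begins is to step the thread cap up from 8 toward 40 and launch the same job each time. The Python sketch below only sets the two environment variables named above for a child process. The command line (math -script job.m) and the list of thread counts are placeholders chosen for illustration, not a tested recipe.

```python
import os
import subprocess


def threaded_env(num_threads, base=None):
    """Return a copy of the environment with both thread caps set."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["MKL_NUM_THREADS"] = str(num_threads)
    return env


def run_with_threads(command, num_threads):
    """Run `command` with the capped environment and return its exit code."""
    return subprocess.run(command, env=threaded_env(num_threads)).returncode


if __name__ == "__main__":
    # Hypothetical job: replace with the actual kernel command and script.
    command = ["math", "-script", "job.m"]
    for n in (8, 16, 24, 32, 40):
        print(n, "threads -> exit code", run_with_threads(command, n))
```

A nonzero exit code at some thread count would at least bracket the point at which the kernel starts to fail.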
There are other settings for OMP and MKL, such as OMP_DYNAMIC, but I have not found a magic recipe to get this to work.
My question: have other users seen such (mis)behavior, and are there settings that can be invoked in order to get around these issues?