Skip to main content

performance tuning - Iterate until condition is met


I want to find the first 5 prime numbers of the form $n^6 + 1091$. I have used this code:


 Timing[Select[Table[n^6 + 1091, {n, 10000}], PrimeQ, 5]]


Which gives the desired answer, and takes 0.0156 seconds. What if I didn't have a rough idea of what values of $n$ to check?


Timing[Select[Table[n^6 + 1091, {n, 100000}], PrimeQ, 5]]

Takes 10 times as long.


So, my question is: Is there a simple(ish) way to make Mathematica stop when the desired number of solutions are found?



Answer



A more efficient use of Select


If the likely bound of the problem is easily stored in memory it is practical to generate a Range, which is fast, and then Select from that. Since the Range will be unpacked by Select you must consider the unpacked size. For example:


ByteCount @ Developer`FromPackedArray @ Range@1*^7



240 000 032



This is a reasonable starting size in many cases.


Finding five values take a quarter of a second:


Select[Range@1*^7, PrimeQ[#^6 + 1091] &, 5] // Length // Timing


{0.249, 5}


Fining all values with a seed <= 10^7:


Select[Range@1*^7, PrimeQ[#^6 + 1091] &] // Length // Timing


{19.797, 3338}

You can see that this scales pretty well, without excessive overhead in the case that values are found quickly, yet for your search 3338 results can be found before the limit is reached.


(The seeds found with this method need to be converted to primes with #^6 + 1091 &.)


A method for larger problems



If this will not work, either because you have no idea what the likely upper bound is, or it is too high to hold a Range in memory, then it will be most efficient to operate in blocks, due to Mathematica's vector optimizations.


First, there is a more efficient way to build the list of candidates:


Range[100]^6 + 1091

This takes full advantage of the vectorized operations available.


Pick a large enough block size that the average element processing time is relatively low, but not so large as to process more elements than are likely needed. I will pick a block size of 100,000 and I will try to find 5,000 solutions:


block = 100000;

result = {};


find = 5000;

hits = 0;

Do[
If[hits >= find, Return["Done!"]];
Select[(n block + Range[block])^6 + 1091, PrimeQ] //
(result = {result, #}; hits += Length@#) &,
{n, 0, 1*^9}
] ~Monitor~ n


The results are stored in a linked list rather than using Sow. This has the advantage of letting you open a sub-session and examine the results up to that point. For example, in a separate cell enter:


Flatten @ result // Short

And use F7 to show the results, then resume calculation.


The Monitor lets you see how many blocks have been processed.




Timings


I made the claim that these methods are efficient. Let me give some comparative timings to support my position. WReach's lazy lists code which Rojo used in his answer is a wonderful approach. It is not however, as written, fast. My methods are by comparison clunky but they are also more practical.


I will use a variation of Timo's timeAvg function:



SetAttributes[timeAvg, HoldFirst]

timeAvg[func_] := Do[If[# > 0.3, Return[#/5^i]] & @@ Timing@Do[func, {5^i}], {i, 0, 15}]

Compared to my first method:


Table[
{n,
Select[Range@1*^7, PrimeQ[#^6 + 1091] &, n] // timeAvg,
sIntegers[] ~sMap~ (#^6 + 1091 &) ~sFilter~ PrimeQ ~sTake~ n // sList // timeAvg
},

{n, {5, 15, 50, 150, 500}}
] // TableForm[#, TableHeadings -> {None, {"n", "Select", "Lazy"}}] &

Mathematica graphics


My Select method shows the overhead of generating and unpacking the Range but it soon catches up and exceeds the lazy lists method in performance. Remember also the human overhead of writing this if it is not going to be used many times. The Select method is very simple and direct.


Now, for my second method there is a tuning parameter: the block size. It could be argued that changing this parameter mid-test is not fair play so I will use a fixed block size of 1000.


finder[find_, block_: 1000] := Module[{result = {}, hits = 0},
Do[
If[hits >= find, Return[Flatten@result ~Take~ find]];
Select[(n block + Range[block])^6 + 1091, PrimeQ] //

(result = {result, #}; hits += Length@#) &,
{n, 0, 1*^9}
]
]

Table[
{n,
finder[n] // timeAvg,
sIntegers[] ~sMap~ (#^6 + 1091 &) ~sFilter~ PrimeQ ~sTake~ n // sList // timeAvg
},

{n, {5, 15, 50, 150, 500, 1500, 5000}}
] // TableForm[#, TableHeadings -> {None, {"n", "Finder", "Lazy"}}] &

Mathematica graphics


Here the superiority of the block-based method is fully apparent.


Comments

Popular posts from this blog

front end - keyboard shortcut to invoke Insert new matrix

I frequently need to type in some matrices, and the menu command Insert > Table/Matrix > New... allows matrices with lines drawn between columns and rows, which is very helpful. I would like to make a keyboard shortcut for it, but cannot find the relevant frontend token command (4209405) for it. Since the FullForm[] and InputForm[] of matrices with lines drawn between rows and columns is the same as those without lines, it's hard to do this via 3rd party system-wide text expanders (e.g. autohotkey or atext on mac). How does one assign a keyboard shortcut for the menu item Insert > Table/Matrix > New... , preferably using only mathematica? Thanks! Answer In the MenuSetup.tr (for linux located in the $InstallationDirectory/SystemFiles/FrontEnd/TextResources/X/ directory), I changed the line MenuItem["&New...", "CreateGridBoxDialog"] to read MenuItem["&New...", "CreateGridBoxDialog", MenuKey["m", Modifiers-...

How to thread a list

I have data in format data = {{a1, a2}, {b1, b2}, {c1, c2}, {d1, d2}} Tableform: I want to thread it to : tdata = {{{a1, b1}, {a2, b2}}, {{a1, c1}, {a2, c2}}, {{a1, d1}, {a2, d2}}} Tableform: And I would like to do better then pseudofunction[n_] := Transpose[{data2[[1]], data2[[n]]}]; SetAttributes[pseudofunction, Listable]; Range[2, 4] // pseudofunction Here is my benchmark data, where data3 is normal sample of real data. data3 = Drop[ExcelWorkBook[[Column1 ;; Column4]], None, 1]; data2 = {a #, b #, c #, d #} & /@ Range[1, 10^5]; data = RandomReal[{0, 1}, {10^6, 4}]; Here is my benchmark code kptnw[list_] := Transpose[{Table[First@#, {Length@# - 1}], Rest@#}, {3, 1, 2}] &@list kptnw2[list_] := Transpose[{ConstantArray[First@#, Length@# - 1], Rest@#}, {3, 1, 2}] &@list OleksandrR[list_] := Flatten[Outer[List, List@First[list], Rest[list], 1], {{2}, {1, 4}}] paradox2[list_] := Partition[Riffle[list[[1]], #], 2] & /@ Drop[list, 1] RM[list_] := FoldList[Transpose[{First@li...

plotting - How to draw lines between specified dots on ListPlot?

I would like to create a plot where I have unconnected dots and some connected. So far, I have figured out how to draw the dots. My code is the following: ListPlot[{{1, 1}, {2, 2}, {3, 3}, {4, 4}, {1, 4}, {2, 5}, {3, 6}, {4, 7}, {1, 7}, {2, 8}, {3, 9}, {4, 10}, {1, 10}, {2, 11}, {3, 12}, {4,13}, {2.5, 7}}, Ticks -> {{1, 2, 3, 4}, None}, AxesStyle -> Thin, TicksStyle -> Directive[Black, Bold, 12], Mesh -> Full] I have thought using ListLinePlot command, but I don't know how to specify to the command to draw only selected lines between the dots. Do have any suggestions/hints on how to do that? Thank you. Answer One possibility would be to use Epilog with Line : ListPlot[ {{1, 1}, {2, 2}, {3, 3}, {4, 4}, {1, 4}, {2, 5}, {3, 6}, {4, 7}, {1, 7}, {2, 8}, {3, 9}, {4, 10}, {1, 10}, {2, 11}, {3, 12}, {4, 13}, {2.5, 7}}, Ticks -> {{1, 2, 3, 4}, None}, AxesStyle -> Thin, TicksStyle -> Directive[Black, Bold, 12], Mesh -> Full, Epilog -> { Line[ ...