Performance: Collapsing repeated contiguous rows and columns of a matrix

In this answer I needed to collapse runs of contiguous zero-valued rows and columns in a matrix, leaving only two of each run in place, no matter how many there were originally.

I came up with this code:

m = RandomVariate[BinomialDistribution[1, 10^-3], {400, 400}]; 
rule = {h__, {0 ..}, w : {0 ..}, {0 ..}, t__} -> {h, w, w, t};

mClean = Transpose[Transpose[m //. rule] //. rule];
Dimensions@mClean

But it is way too slow.
I'm pretty sure this code can be improved. Any better ideas?

Answer

Linked-list-based solution

The real reason for the slowdown seems to be the usual one for ReplaceRepeated (//.): repeated copying of large arrays, since every successful replacement builds a new list. I can offer a solution that is still rule-based but uses linked lists to avoid this slowdown. Here are the auxiliary functions:

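(* True for an all-zero integer vector *)
zeroVectorQ[x_] := VectorQ[x, IntegerQ] && Total[Unitize[x]] == 0;

(* Convert {a, b, c} into the nested form ll[a, ll[b, ll[c, ll[]]]] *)
toLinkedList[l_List] := Fold[ll[#2, #1] &, ll[], Reverse[l]]

ClearAll[rzvecs];
(* Entry point: the first row goes straight into the accumulator,
   mirroring h__ in the original rule, which must match at least one row *)
rzvecs[mat_List] :=  rzvecs[ll[First@#, ll[]], Last@#] &@toLinkedList[mat];

(* Termination: zero or exactly two elements remain;
   flatten everything back into an ordinary list *)
rzvecs[accum_, rest : (ll[] | ll[_, ll[_, ll[]]])] := 
   List @@ Flatten[ll[accum, rest], Infinity, ll];

(* Three zero vectors followed by at least one more element
   (mirroring t__): drop one zero vector and try again *)
rzvecs[accum_, ll[head_?zeroVectorQ, ll[_?zeroVectorQ, tail : ll[_?zeroVectorQ, Except[ll[]]]]]] :=
   rzvecs[accum, ll[head, tail]];

(* Two consecutive zero vectors that the previous rule did not shorten:
   keep both (the second equals head, so head is pushed twice) *)
rzvecs[accum_, ll[head_?zeroVectorQ, ll[_?zeroVectorQ, tail_]]] :=
   rzvecs[ll[ll[accum, head], head], tail];

(* Anything else: move the head to the accumulator *)
rzvecs[accum_, ll[head_, tail_]] := rzvecs[ll[accum, head], tail];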
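
For readers who have not used this idiom, the following minimal sketch shows what the linked-list representation looks like and how it is flattened back into an ordinary list. The matrix rows and the accumulator contents are made-up values, and toLinkedList is repeated here only so that the snippet runs on its own.

```mathematica
ClearAll[ll, toLinkedList];
(* same definition as in the answer, repeated for self-containment *)
toLinkedList[l_List] := Fold[ll[#2, #1] &, ll[], Reverse[l]]

(* hypothetical input rows *)
rows = {{1, 0}, {0, 0}, {0, 1}};

lst = toLinkedList[rows]
(* ll[{1, 0}, ll[{0, 0}, ll[{0, 1}, ll[]]]] *)

(* head and tail are simply the first and last arguments *)
{First[lst], Last[lst]}
(* {{1, 0}, ll[{0, 0}, ll[{0, 1}, ll[]]]} *)

(* an accumulator grows by nesting on the left, as in ll[accum, head] *)
acc = ll[ll[ll[], {5, 5}], {6, 6}];

(* flattening only the ll heads restores a flat list of rows, in order;
   the rows themselves have head List and are left untouched *)
List @@ Flatten[ll[acc, lst], Infinity, ll]
(* {{5, 5}, {6, 6}, {1, 0}, {0, 0}, {0, 1}} *)
```

Taking the head or tail, or adding an element, only wraps or unwraps one ll[...] layer, so no large list has to be rebuilt at each step.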

Now the main function, which processes the rows, transposes, processes the columns (now rows), and transposes back:

removeZeroVectors[mat_] := Nest[Transpose[rzvecs[#]] &, mat, 2]
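
As a quick sanity check, here is a small worked example on a made-up 6×2 matrix (not from the original post). It applies the rule from the question, which is repeated so that the snippet is self-contained. With the definitions from this answer evaluated, removeZeroVectors should give the same result on this input.

```mathematica
(* rule from the question *)
rule = {h__, {0 ..}, w : {0 ..}, {0 ..}, t__} -> {h, w, w, t};

(* hypothetical test matrix: four contiguous zero rows in the middle *)
small = {{1, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 0}, {0, 1}};

(* the run of four zero rows is collapsed to two *)
Transpose[Transpose[small //. rule] //. rule]
(* {{1, 0}, {0, 0}, {0, 0}, {0, 1}} *)

(* requires the definitions from this answer to be evaluated first *)
removeZeroVectors[small] == Transpose[Transpose[small //. rule] //. rule]
(* True *)
```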

Benchmarks

Now the benchmarks, where rule is the replacement rule from the question:

m = RandomVariate[BinomialDistribution[1, 10^-3], {600, 600}];
(res = removeZeroVectors[m]); // AbsoluteTiming
(res1 = Transpose[Transpose[m //. rule] //. rule]); // AbsoluteTiming
res == res1

(*
    {0.046875, Null}
    {3.715820, Null}
    True

*)

Remarks

I have been promoting the use of linked lists for some time now. In my opinion, in Mathematica they allow one to stay at a higher level of abstraction while achieving very decent (for top-level code) performance. They also allow one to avoid many non-obvious performance-tuning tricks, which take time to come up with and even more time for others to understand. Algorithms expressed with linked lists are usually rather straightforward and can be read directly off the code.
