
performance tuning - Efficient way to replace a value in packed array of integers


When working with integer label matrices returned by functions such as MorphologicalComponents, ImageForestingComponents, etc., it is often necessary to replace a certain label (or a list of labels) with other label(s) without unpacking the matrix. The immediately obvious solutions via Replace/ReplaceAll or via Position unpack packed arrays and for this reason aren't appropriate.
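A quick way to check whether a given approach unpacks is to compare `Developer`PackedArrayQ` before and after the replacement. The sketch below uses a small made-up matrix; the name `m` and its values are only for illustration.

```mathematica
(* hypothetical small label matrix, packed explicitly *)
m = Developer`ToPackedArray[{{0, 1, 1}, {2, 2, 0}}];

Developer`PackedArrayQ[m]            (* packed status of the input *)
Developer`PackedArrayQ[m /. 2 -> 0]  (* packed status after ReplaceAll *)
```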


I can imagine writing a Do loop and modifying the matrix in place using Part, but that is ugly and predictably slow (although it is memory-efficient because it won't create a copy of the original matrix). Compilation can probably help with the performance, but I'm sure there must be a simpler way to perform such a basic operation without unpacking the matrix.


The question is: what is the best way to replace a list of values in a packed array of integers with other integer values while keeping the array packed?




Here are a couple of examples:





  1. Setting the largest component to be background:


    img = Import["http://i.stack.imgur.com/2a2j6.png"];
    cellM = MorphologicalComponents[ColorNegate@img, CornerNeighbors -> False];
    largest = SortBy[ComponentMeasurements[cellM, "Area"], Last][[-1, 1]];
    (*the obvious solution: inefficient and unpacks*)
    cellM2 = cellM /. largest -> 0;
    cellM2 // Colorize


  2. Shifting indices of all the components except the background (0) by a constant value (motivation: combining two label matrices):



    (*the obvious solution: inefficient and unpacks*)
    cellM3 = cellM /. i_Integer /; i != 0 :> i + 10000;
    cellM3 // Colorize


Answers to questions in the comments (now deleted):




  • In my limited experience, the most common needs fall into the two cases shown above:





    1. very few (usually only one) values need to be replaced;




    2. all the values except a very few (usually only one) need to be replaced.






  • Since the label matrices are intended for further processing with ComponentMeasurements, SelectComponents, etc., it is highly desirable to keep them packed simply to achieve decent timings. On the other hand, it is very easy to hit the memory limit of a typical laptop (or even a desktop PC) just by keeping 2-3 unpacked label matrices of typical modern photo size (for example, about 2000×1200 pixels) in memory during image processing.






Answer



Using the method I showed for "Directly change the background value of a SparseArray?" we can efficiently replace the background of a SparseArray. Conversion to sparse allows specification of the background. Therefore one replacement method is:


    fn1[array_?ArrayQ, old_, new_] :=
      SparseArray[array, Automatic, old] /.
        (sa : SparseArray)[a_, b_, _, d_] :> sa[a, b, new, d]

However, this does not achieve the goal of keeping the array packed, since the result is a SparseArray.


A better option appears to be the numeric approach that I alluded to in my "Related:" links and that ciao posted in a comment, with a tweak or two of my own:



    fn2[a_?ArrayQ, old_, new_] := BitXor[1, Unitize[a - old]] (new - old) + a;
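To see why this works, here is a minimal sketch on a small made-up matrix (the names `a`, `mask`, and `res` and the values are illustrative only). `Unitize[a - old]` is 0 exactly where an element equals `old`, `BitXor[1, ...]` flips that into a 0/1 mask of the matches, and multiplying the mask by `new - old` adds the correction only at those positions. Every step is listable integer arithmetic, which is why the array should remain packed.

```mathematica
(* small packed test matrix; the values are made up for illustration *)
a = Developer`ToPackedArray[{{0, 2, 2}, {1, 2, 3}}];
old = 2; new = 7;

mask = BitXor[1, Unitize[a - old]]  (* {{0, 1, 1}, {0, 1, 0}}: 1 where a == old *)
res = mask (new - old) + a          (* {{0, 7, 7}, {1, 7, 3}}: only the 2s changed *)

Developer`PackedArrayQ[res]         (* check whether the result is still packed *)
```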

Test:


    (* cellM and largest from the Question's example data *)

    (r0 = cellM /. largest -> 0); // RepeatedTiming
    (r1 = fn1[cellM, largest, 0]); // RepeatedTiming
    (r2 = fn2[cellM, largest, 0]); // RepeatedTiming

    r0 == r1 == r2



{0.0620, Null}

{0.00557, Null}

{0.00499, Null}

True


r1 is a SparseArray; conversion overhead is modest:


    Developer`ToPackedArray @ Normal @ r1; // RepeatedTiming


{0.0014, Null}

The second operation is easily recast in terms of this one: shift every label by the offset, then map the shifted background value back to 0, e.g.


    (s0 = cellM /. i_Integer /; i != 0 :> i + 10000); // RepeatedTiming

    (s2 = fn2[cellM + 10000, 10000, 0]); // RepeatedTiming

    s0 == s2


{0.3270, Null}

{0.00621, Null}

True
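For this particular case (shifting every nonzero label), a more direct variant is also possible. This is a sketch that was not part of the timings above, shown on a small made-up matrix; the names `labels`, `offset`, and `shifted` are illustrative. `Unitize[labels]` is 1 at every nonzero label and 0 at the background, so scaling it by the offset and adding it shifts only the non-background labels.

```mathematica
(* hypothetical small label matrix with background 0 *)
labels = Developer`ToPackedArray[{{0, 1, 1}, {2, 0, 3}}];
offset = 10000;

shifted = labels + offset Unitize[labels]
(* {{0, 10001, 10001}, {10002, 0, 10003}} *)

(* compare against the pattern-based replacement from the question *)
shifted == (labels /. i_Integer /; i != 0 :> i + offset)
```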




With the syntax change requested and extension to multiple replacements by repeated application:


    rep[
      a_ /; MatrixQ[a, IntegerQ],
      {rls__} | rls_ /; MatchQ[{rls}, {(_Integer -> _Integer) ..}]
     ] :=
     Fold[BitXor[1, Unitize[# - #2[[1]]]] (#2[[2]] - #2[[1]]) + # &, a, {rls}]
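One consequence of repeated application is worth keeping in mind: the rules are applied one after another, not simultaneously, so a value produced by an earlier rule can be picked up by a later one. The sketch below illustrates this with made-up values; `step` is a helper name used only here and performs the same arithmetic as a single pass of the Fold.

```mathematica
(* one replacement step, the same arithmetic as a single pass of the Fold above;
   `step` is a hypothetical helper name used only in this sketch *)
step[arr_, old_ -> new_] := BitXor[1, Unitize[arr - old]] (new - old) + arr;

m = {{1, 2}, {2, 3}};

seq = Fold[step, m, {1 -> 2, 2 -> 3}]  (* sequential:   {{3, 3}, {3, 3}} *)
sim = m /. {1 -> 2, 2 -> 3}            (* simultaneous: {{2, 3}, {3, 3}} *)
```

Where the rules allow it, ordering them so that no target value appears as the source of a later rule (here `{2 -> 3, 1 -> 2}`) avoids the chaining.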
