I have a list bdrs of SparseArrays, which represent the boundary matrices of a chain complex $R^{d_0}\overset{\partial_1}{\longleftarrow}R^{d_1}\longleftarrow\ldots\longleftarrow R^{d_{N-1}}\overset{\partial_N}{\longleftarrow}R^{d_N}$ over $R\!\in\!\{\mathbb{Z}, \mathbb{Q}, \mathbb{Z}_p\}$. I store this data as $d_0,\ldots,d_N$=dims, $N$=dim, $R$=p$\!\in\!\{$"Z",0,p$\}$. Typically, $\partial_k$ has dimensions $10^6\!\times\!10^6$ and density $10^{-5}$. See this or this or this link, to understand how SparseArray stores data.
For $k=1,\ldots,N$, I wish to (a) compute a sequence
Mm=$M_1,\ldots,M_N$, in which $M_k$ is the set of all invertible entries in $\partial_k$, that have zeros left of them (in their row) and below them (in their column). Then for every $(i,j)$-entry in $M_k$, I wish to (b) delete it from $\partial_k$, and delete the $i$-column from $\partial_{k-1}$ and $j$-row from $\partial_{k+1}$.
First, if $R\!=\!\mathbb{Z}_p$ I mod out the entries, then delete all zero entries, and sort the rest of the entries:
If[p!="Z" && p!=0, bdrs=Mod[bdrs,p]];
bdrs = Table[SparseArray[Sort@ArrayRules@bdrs[[k]],dims[[k;;k+1]]], {k,dim}];
Attempt 1:
Note that some $d_k$ may be $0$. The construction is done by
Mm = DeleteCases[ Table[
      i=bdrs[[k,All,j]]["NonzeroPositions"][[-1,1]]; w=bdrs[[k,i,j]];   
      If[bdrs[[k,i]]["NonzeroPositions"][[1,1]]==j && If[p=="Z", w^2==1, True], 
       bdrs[[k,i,j]]=0;  {i,j}->w, ""], 
      {k,dim}, {j, If[dims[[k]]*dims[[k+1]]==0, {}, 
        DeleteDuplicates@Flatten@bdrs[[k]]["ColumnIndices"]]}], "",2];
or
Mm = DeleteCases[ Table[ {i,j}=e; w=bdrs[[k,i,j]]; 
    If[bdrs[[k,All,j]]["NonzeroPositions"][[-1]]=={i} && If[p=="Z", w^2==1, True],
     bdrs[[k,i,j]]=0;  {i,j}->w, ""], 
    {k,dim}, {e, If[dims[[k]]*dims[[k+1]]==0, {}, 
      First/@ GatherBy[bdrs[[k]]["NonzeroPositions"], First]]}], "",2];
and then deletion of rows/columns:
Do[ If[1     If[kBoth constructions (a) are awfully slow on large matrices, but (b) is really fast.
Attempt 2:
Instead of a SparseArray, I use an Association of rows and of columns.
rows[bdr_] := With[
  {l=GatherBy[ ArrayRules[bdr][[1;;-2]] /.({i_,j_}->w_Integer) :> {i,j,w}, First]}, 
  Association@Table[r[[1,1]]->(Rest/@r), {r,l}]]; 
mm[r_,c_] := Module[{j,w}, Association@ DeleteCases[ Table[ {j,w}=r[i][[1]]; 
  If[c[j][[-1]]=={i,w} && If[p=="Z",w^2==1,True], {i,j}->w,""],{i,Keys@r}],""]];
Mm = Table[r=rows@bdrs[[k]]; c=rows@Transpose@bdrs[[k]]; mm[r,c], {k,dim}];
Here (a) is much faster, but if I replace each bdrs[[k]] with its association of rows and of columns, then achieving (b) becomes very slow (applying DeleteCases[#,MemberQ[#[[1,1]]&/@Mm[[k]],#]&]& to all values in the Association).
Comment:
Later, I will be computing a lot sums of rows, so I need to keep either bdrs or rows & columns of these matrices as Associations.
Examples for testing purposes:
Let us use a command, that builds up a chain complex from the faces of a simplicial complex:
chainComplexSC[bases_] := Module[{dim=Length@bases, dims=Length/@bases, basesk, baseskk, bdrs={}, entries, bdr, x},
   basesk = AssociationThread[bases[[1]],Range@dims[[1]]];     
   Do[ baseskk=AssociationThread[bases[[k]],Range@dims[[k]]];
    entries=Flatten[Table[{basesk[Delete[s,{{i}}]],baseskk[s]}->(-1)^(i+1),{s,Keys@baseskk},{i,k}],1];
    AppendTo[bdrs, SparseArray[entries, dims[[k-1;;k]]]]; 
   basesk=baseskk; Clear[baseskk];, {k,2,dim}]; bdrs];
sCxSimplices[facets_,k_] := ParallelCombine[ DeleteDuplicates@
   Flatten[Table[Subsets[s,{k}],{s,#}],1]&, facets,Union,Method->"CoarsestGrained"];
Let us create the $m\!\times\!n$ chessboard complex:
m = 4; 
n = 4; 
facets = FindClique[GraphComplement@LineGraph@CompleteGraph[{m, n}], Infinity, All];
dim = Max[Length /@ facets];
bases = Table[sCxSimplices[facets, k + 1], {k, 0, dim - 1}];
dims = Length /@ bases;
bdrs = chainComplexSC[bases];
A good testing example is $m\!=\!8, n\!=\!9$, which has the f-vector dims=$72, 2016, 28224, 211680, 846720, 1693440, 1451520, 362880$.
Answer
This might solve problem (a). You have to preprocess the input matrix A such that it has 1 at positions where the original matrix bdrs[[k]] had an invertible element and -1 at positions where the original matrix had a noninvertible nonzero element.
Here is a test matrix:
n = 10000;
m = 10000;
nedges = 4000000;
A = SparseArray`SparseArraySort@SparseArray[
    Transpose[{
       RandomInteger[{1, m}, nedges],
       RandomInteger[{1, n}, nedges]
       }] -> RandomChoice[{-1, 1}, nedges],
    {m, n}];
This computes the set of pairs (i,j} as described by OP; it runs through in about a quarter of a second.
M = A \[Function] Module[{Ainv, At, rowswithinv, firstinvcolumns, firstnonzerocolumns, rowswithleadinginv, leadinginvcols, lastnonzerorows},
   Ainv = SparseArray[Clip[A, {0, 1}]];
   rowswithinv = Random`Private`PositionsOf[Unitize[Differences[Ainv["RowPointers"]]], 1];
   If[Length[rowswithinv] > 0,
    firstinvcolumns = Flatten[Ainv["ColumnIndices"][[Ainv["RowPointers"][[rowswithinv]] + 1]]];
    firstnonzerocolumns = Flatten[A["ColumnIndices"][[A["RowPointers"][[rowswithinv]] + 1]]];
    With[{picker = Random`Private`PositionsOf[UnitStep[firstnonzerocolumns - firstinvcolumns], 1]},
     rowswithleadinginv = rowswithinv[[picker]];
     leadinginvcols = firstinvcolumns[[picker]];
     ];
    If[Length[leadinginvcols] > 0,
     At = Transpose[A[[All, leadinginvcols]]];
     lastnonzerorows = Flatten[At["ColumnIndices"][[At["RowPointers"][[2 ;; -1]]]]];
     With[{picker = Random`Private`PositionsOf[UnitStep[rowswithleadinginv - lastnonzerorows], 1]},
      Transpose[{
        rowswithleadinginv[[picker]],
        leadinginvcols[[picker]]
        }]
      ]
     ,
     {}
     ]
    ,
    {}
    ]
   ];
Now,
 M[A]; // AbsoluteTiming //First
0.227727
The only ingredients of measureable costs are the lines
Ainv = SparseArray[Clip[A, {0, 1}]]; // AbsoluteTiming // First
0.138703
and
At = Transpose[A[[All, leadinginvcols]]];
which costs about 0.096315 seconds.
However, I have not tested the implementation thoroughly, yet.
You might find this interesting. Your example can be generated much quicker by using Szabolcs' package "IGraphM`" along with the following code:
Needs["IGraphM`"]
facets = IGCliques[GraphComplement@LineGraph@CompleteGraph[{m, n}]];
bases = Sort@*Developer`ToPackedArray /@ GatherBy[facets, Length];
dims = Length /@ bases;
Moreover, the following function might be faster at generating the boundary operators:
chainComplexSC2[bases_] := 
  Module[{dim, dims, basesk, baseskk, bdrs = {}, entries, bdr, x, a, cf},
   dim = Length@bases;
   dims = Length /@ bases;
   cf = Compile[{{list, _Integer, 1}},
     Table[Delete[list, i], {i, 1, Length[list]}],
     CompilationTarget -> "WVM",
     RuntimeAttributes -> {Listable},
     Parallelization -> True
     ];
   basesk = AssociationThread[bases[[1]], Range@dims[[1]]];
   Do[
    baseskk = AssociationThread[bases[[k]], Range@dims[[k]]];
    a = SparseArray[
      Rule[
       Transpose[{
         Lookup[basesk, Flatten[cf[Keys@baseskk], 1]],
         Flatten[Transpose@ConstantArray[Range[dims[[k]]], k]]
         }],
       Flatten[ConstantArray[(-1)^Range[2, k + 1], dims[[k]]]]
       ],
      dims[[k - 1 ;; k]]
      ];
    AppendTo[bdrs, a];
    basesk = baseskk;
    Clear[baseskk];,
    {k, 2, dim}];
   bdrs
   ];
Comments
Post a Comment