performance tuning - Efficiently computing current flow betweenness centrality for graphs


Definitions:


Given a graph $G=(V,E),$ the current flow betweenness is a node-wise measure of the fraction of current that passes through a given node when a unit source-sink supply $b_{st}$ is applied: one unit of current is injected at the source $s$ ($b_{st}(s)=1$), extracted at the sink $t$ ($b_{st}(t)=-1$), and $b_{st}(v)=0$ for $v\in V\setminus \{s,t\}.$


For a fixed s-t pair, the throughput $\tau$ of a node $v$ is given by:


$$ \tau_{st}(v)=\frac{1}{2}\left(-|b_{st}(v)|+\sum_{e\ni v}|I(e)|\right) \tag{1} $$


where $b_{st}$ is the supply function defined above for the given $s,t$ pair, $I(e)$ is the current flowing through edge $e,$ and $e\ni v$ ranges over all edges incident on vertex $v$ (i.e. all edges that have $v$ as an endpoint, whether as tail or head).


The current flow betweenness centrality of a node $v$ is then the normalized sum of its throughputs over all possible supply pairs $s,t,$ i.e.:


$$ c(v)=\frac{1}{(n-1)(n-2)} \sum_{s,t\in V}\tau_{st}(v). \tag{2} $$
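As a small hand-worked check of $(1)$ and $(2)$ (an illustration only, not part of the implementation below), take the path graph $1 - 2 - 3$ with unit conductances, so $n=3.$ For the supply pairs $(1,3)$ and $(3,1)$ one unit of current flows through both edges and vertex $2$ is neither source nor sink; for the four pairs in which vertex $2$ is the source or the sink, exactly one of its edges carries the unit current:

$$ \tau_{13}(2)=\tau_{31}(2)=\tfrac{1}{2}\left(-0+1+1\right)=1, \qquad \tau_{12}(2)=\tau_{21}(2)=\tau_{23}(2)=\tau_{32}(2)=\tfrac{1}{2}\left(-1+1\right)=0, $$

$$ c(2)=\frac{1}{(3-1)(3-2)}\left(1+1\right)=1. $$

The same bookkeeping gives $c(1)=c(3)=0,$ which is the general behaviour of a degree-one vertex: its single edge carries current only when the vertex itself is the source or the sink.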





My implementation of the current-flow betweenness centrality goes as follows:



  • Given a graph $G,$ I compute its incidence matrix b, the corresponding Laplacian lap, and its inverse (stored as a LinearSolve function in S) only once at the beginning.

  • Then I have a module which takes n ($n=|V|$), b, S, conductances, supply nodes s,t and returns the list of currents through edges for the given $s,t$ pair as supply.

  • Then I have a module that computes $\tau_{st}$ given in $(1),$ in which I use a piecewise function for the supply $b_{st},$ and use Total[] to compute the sum in $(1).$

  • Then I have a module that computes $c$ given in $(2),$ where I use a Table to compute $\tau$ of $v$ for all possible $s,t$ and then again use Total to sum them.

  • Finally, to compute $c$ for all nodes I create a table that runs over all nodes and calls the module for $c.$


Actual implementation with a dummy random graph to showcase:


SeedRandom[123]

n = 15;
m = 20;
G = RandomGraph[{n, m}, VertexLabels -> "Name"]
edges = EdgeList[G];

GDirected =
Graph[Range[n], Map[#[[1]] -> #[[2]] &, edges],
VertexLabels -> "Name"]
conductances = ConstantArray[1., m];
b = -1.*Transpose[IncidenceMatrix[GDirected]];

lap = b\[Transpose].DiagonalMatrix[SparseArray[conductances]].b;
a = SparseArray[ConstantArray[1., {1, n}]];
A = ArrayFlatten[{{lap, a\[Transpose]}, {a, 0.}}];
S = LinearSolve[A];
\[Epsilon] = 1. 10^-8;
s = 1;
t = 2;

Edge current module:


edgecurrents[ncount_, invertedkirch_, incid_, conducarr_, nodei_, nodej_, threshold_] :=
  Module[{n = ncount, solver = invertedkirch, incidmat = incid,
    G = conducarr, source = nodei, sink = nodej, eps = threshold},
   appliedcurr = 1.;
   (* supply vector: +1 at the source, -1 at the sink *)
   J = SparseArray[{{source}, {sink}} -> {appliedcurr, -appliedcurr}, {n}, 0.];
   (* node potentials; the last entry of the solution is dropped *)
   psi = solver[Join[J, {0.}]][[;; -2]];
   edgecurr = G incidmat.psi;
   (* define current threshold to take care of small values *)
   foundcurrents = Threshold[edgecurr, eps];
   Return[foundcurrents, Module];
   ];

$\tau$ module:


tau[edgels_, currls_, source_, sink_, vertex_] :=
  Module[{edges = edgels, iedges = currls, s = source, t = sink,
    v = vertex},
   (* supply function b_st *)
   bst[u_, so_, to_] := Piecewise[{{1., u == so}, {-1., u == to}}, 0.];
   If[s == t,
    res = 0.,
    (* positions of the edges incident on v *)
    incidv =
     Flatten[Position[
       edges, (v \[UndirectedEdge] _ | _ \[UndirectedEdge] v)]];
    If[incidv == {},
     inoutcurrs = 0.;
     ,
     inoutcurrs = Total[Abs[Part[iedges, incidv]]];
     ];
    res = 0.5*(-Abs[bst[v, s, t]] + inoutcurrs);
    ];
   Return[res, Module];
   ];

$c$ module:


currinbet[vcount_, edgels_, conduc_, vertex_, threshold_] :=
  Module[{n = vcount, edges = edgels, conducmat = conduc, v = vertex,
    eps = threshold},
   (* S and b are the global solver and incidence matrix defined above *)
   taust =
    Table[tau[edges, edgecurrents[n, S, b, conducmat, s, t, eps], s, t, v],
     {s, n}, {t, n}];
   ccb = Total[taust, 2]/((n - 1)*(n - 2));
   Return[ccb, Module];
   ];

Example of currents for $s=1, t=2:$


edgecurrents[n, S, b, conductances, s, t, \[Epsilon]]
{0.640145, 0.359855, -0.0198915, -0.200723, -0.039783, -0.640145,
 -0.0994575, -0.0144665, 0., 0.0144665, -0.0198915, -0.0433996,
 0.0578662, -0.0144665, 0.359855, -0.359855, 0.101266, -0.0596745, 0.,
 0.}

and computing the current-flow betweenness for all nodes:


vccb = Threshold[
Table[currinbet[n, EdgeList[G], conductances, i, \[Epsilon]], {i, 1,
n}], \[Epsilon]]

{0.182869, 0.403493, 0.268327, 0.052163, 0.253522, 0.240516, \
0.524532, 0.135177, 0., 0.208672, 0.275441, 0., 0., 0.282883, \
0.246786}


The results were cross-checked against the existing Python library NetworkX for computing $c,$ and they are in perfect agreement. But sadly, efficiency-wise, I am doing terribly.




Improved notebook version after Henrik Schumacher's suggestions can be downloaded here, with a working example.




Questions:




  • I think I have minimized the cost of the edge-current calculations, since S is simply pre-computed, thanks to Henrik Schumacher's approach here. However, I have the feeling I might be doing some things terribly inefficiently from then onward, as my routine slows down drastically for larger graphs. Is there anywhere I could be doing things much more efficiently?





  • Is my module-based approach or use of tables also responsible for part of the slow-down?




  • Maybe one line of optimization would be to cast $(1)$ and $(2)$ into linear-algebraic computations to speed them up, but I currently do not see how to do so.




(Any general feedback for rendering the code more efficient is most welcome of course.)



Answer



One potential bottleneck is



incidv = Flatten[Position[edges, (v \[UndirectedEdge] _ | _ \[UndirectedEdge] v)]]

as it involves (i) a search in the rather long list of edges and (ii) pattern matching, which both tend to be rather slow.


A quicker way will be to compute all these lists at once via


vertexedgeincidences = IncidenceMatrix[G]["AdjacencyLists"];

and to access the v-th one like this:


incidv = vertexedgeincidences[[v]]
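A minimal sketch of this replacement, reusing the random graph from the question. The name viaPosition is hypothetical and introduced here only to compare against the pattern-matching version; the sketch assumes that the rows of IncidenceMatrix[G] follow VertexList[G] and its columns follow EdgeList[G].

```mathematica
SeedRandom[123];
G = RandomGraph[{15, 20}];
edges = EdgeList[G];

(* one pass: entry v lists the indices of the edges touching vertex v *)
vertexedgeincidences = IncidenceMatrix[G]["AdjacencyLists"];

(* pattern-matching version from the question, evaluated for every vertex *)
viaPosition = Table[
   Flatten[Position[edges, (v \[UndirectedEdge] _ | _ \[UndirectedEdge] v)]],
   {v, VertexList[G]}];

(* the two lists of edge indices are meant to coincide *)
vertexedgeincidences === viaPosition
```

The lookup vertexedgeincidences[[v]] is then a plain Part access, so no search or pattern matching is left inside the loops over $s,$ $t$ and $v.$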

The numbers



inoutcurrs = Total[Abs[Part[iedges, incidv]]];

can also all be computed at once for all v. This can be done with the help of the incidence matrix


B = IncidenceMatrix[G];

via


B.Abs[iedges]
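Written out in the notation of the question (a sketch, assuming $B$ is the unsigned vertex-edge incidence matrix with $B_{ve}=1$ if $v$ is an endpoint of $e$ and $0$ otherwise, $I_{st}$ is the vector of edge currents for the pair $s,t,$ and $|\cdot|$ acts elementwise), the product yields the sum in $(1)$ for every vertex at once:

$$ \left(B\,|I_{st}|\right)_v=\sum_{e\ni v}|I_{st}(e)|, \qquad \tau_{st}=\tfrac{1}{2}\left(B\,|I_{st}|-|b_{st}|\right). $$

Because $(2)$ is linear in $\tau_{st},$ the sum over pairs can be moved inside the product. With $\tau_{ss}=0$ as in the question's code, and $\sum_{s\neq t}|b_{st}(v)|=2(n-1)$ since $v$ is the source in $n-1$ ordered pairs and the sink in $n-1$ others, this gives

$$ c=\frac{1}{(n-1)(n-2)}\left(\tfrac{1}{2}\,B\sum_{s\neq t}|I_{st}|-(n-1)\,\mathbf{1}\right), $$

so the multiplication by $B$ is needed only once, after the absolute edge currents have been accumulated over all pairs. The small-value thresholding used in the question is ignored here.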

As a general suggestion: Whenever you find yourself evaluating a Sum or Total of something, try to rephrase it in terms of Dot products of vectors, matrices, etc.
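Following that suggestion through, one possible fully vectorized version is sketched below. It is not the code from the question or from this answer: the function name currentFlowBetweenness and all local names are hypothetical, the graph is assumed to be simple, undirected and connected with $n\geq 3,$ and the thresholding of small currents is omitted. It rests on three observations: the solver can be applied to all unit supplies at once, so that $I_{st}$ is the difference of columns $s$ and $t$ of one matrix F; the sum $\sum_{s<t}|f_s-f_t|$ over a row $f$ equals $\sum_i (2i-n-1)\,f_{(i)}$ for the sorted entries $f_{(i)}$; and the supply term in $(1)$ sums to $n-1$ for every vertex.

```mathematica
currentFlowBetweenness[g_Graph, conductances_List] :=
 Module[{verts, edges, n, bdir, B, lap, a, A, S, P, F, W},
  verts = VertexList[g];
  edges = EdgeList[g];
  n = Length[verts];
  (* oriented edge-vertex incidence (m x n), built as in the question *)
  bdir = -1.*Transpose[IncidenceMatrix[Graph[verts, DirectedEdge @@@ edges]]];
  (* unsigned vertex-edge incidence (n x m) *)
  B = IncidenceMatrix[g];
  lap = Transpose[bdir].DiagonalMatrix[SparseArray[conductances]].bdir;
  (* augmented system with the constraint that potentials sum to zero *)
  a = SparseArray[ConstantArray[1., {1, n}]];
  A = ArrayFlatten[{{lap, Transpose[a]}, {a, {{0.}}}}];
  S = LinearSolve[A];
  (* column s of P holds the potentials for a unit injection at vertex s *)
  P = S[Join[N[IdentityMatrix[n]], {ConstantArray[0., n]}]][[;; -2]];
  (* m x n matrix; currents for the pair s,t are F[[All, s]] - F[[All, t]] *)
  F = Normal[conductances*(bdir.P)];
  (* per edge: sum over unordered pairs s<t of |F[[e, s]] - F[[e, t]]| *)
  W = (Sort /@ F).(2.*Range[n] - n - 1);
  (* ordered pairs count twice, which cancels the factor 1/2 in (1) *)
  (B.W - (n - 1))/((n - 1) (n - 2))
  ]

(* usage on the three-vertex path with unit conductances *)
result = currentFlowBetweenness[PathGraph[Range[3]], {1., 1.}]
```

For the three-vertex path a hand calculation from $(1)$ and $(2)$ gives the values 0, 1, 0, so result should agree with that up to rounding. The design choice here is to replace the loops over $s,$ $t$ and $v$ by one solve with $n$ right-hand sides, one Sort per edge, and two matrix products.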

