Skip to main content

interoperability - Using non-trivial objects in RLink


I would like to use RLink to access igraph, which has an R interface. I do not know R; my motivation for learning it is to be able to access igraph easily. I installed igraph using install.packages("igraph"). This seems to have worked fine (I'm doing this on Windows, where package installation is supported). I can load the package using REvaluate["library(igraph)"]


Now I would like to reference some igraph functions and use them, in analogy to this:


sin = RFunction["sin"]
cos = RFunction["cos"]

sin@cos[1]

I did


fullgraph = RFunction["graph.full"]
diameter = RFunction["diameter"]

I created a graph using g = fullgraph[5] and I tried diameter[g]. The value it returns is {0.}. This is not the same what I get using REvaluate["diameter(graph.full(5))"], which returns {1.}.


What am I doing wrong?


Does RLink support the kinds of objects returned by igraph, or does it only support basic built-in types? Another thing I observed is that if I try these commands several times, then sin@cos[1] (with the above definitions) will stop working too in the sense that it will throw error about being unable to set variables (RLink's internal state probably becomes inconsistent), but I can't reproduce this consistently.



Answer




The problem


The full answer would require venturing into quite sophisticated matters, and also some of them are not yet fully clear to me. What I can tell right now that your use case hits the borderline between what can be fully imported into RLink (in the sense that it can then be equivalently exported back), and what appears to be imported correctly but in practice can only be used on the Mathematica side, while reverse import into R does not result into a correct object on the R side.


General situation with data types in RLink


Generally, RLink supports most of R objects, not just the basic data types. The reason it is able to do that is that R has a simple and consistent object model, so that the non-core types are usually represented by the core types with certain attributes. For example, a factor data type is actually an integer vector with additional attributes levels and class, and data frame is actually a list with class attribute set to data.frame, and a number of other attributes. A more detailed discussion of the subset of R's object model supported by RLink can be found in the tutorial on R data types in RLink.


The types explicitly not supported include environments, certain language constructs (type language), S4 classes, and a few other types. Still, the set of supported types is large enough for RLink to be able to import even quite complex R objects, such as e.g. results of liner model fit (lm) and nonlinear model fit (nlm) commands. However, it would be fair to say that currently RLink is biased towards importing things into Mathematica rather than exporting from Mathematica to R. Objects that are imported into Mathematica may contain parts which generally will not export correctly back to R.


Most of such non-exportable objects are represented on Mathematica side by RLink's heads RCode and REnvironment, but apparently your case falls under the same category (being, however, undetected by RLink). Since such reverse imports are needed if we want to call RLink's function references on Mathematica arguments, we do have a problem for such non-exportable types.


The case at hand


An attempt to import an igraph object from R into Mathematica results into a deceivingly simple expression:


REvaluate["graph.full(5)"]  



 RObject[{{{2., 3., 4., 5.}}, {{1., 3., 4., 5.}}, {{1., 2., 4., 5.}}, 
{{1., 2., 3., 5.}}, {{1., 2., 3., 4.}}, {Null}, {Null}, {Null}, {Null}},
RAttributes["class" :> {"igraph"}]]

however, sending this back to R does not reconstruct correctly the full igraph object (there are a number of technical details I am leaving out at the moment, such as the role of RList/ RVector interpretation ambiguity, and a few other things, but the long story short is that even when one does account for all that correctly, the R object is not exported from Mathematica properly).


Possible solutions


While I will dig deeper into the exact reasons for this behavior and expand this post when I get those, right now I can suggest a few possible workarounds, which hopefully will help you to get the job done.


A work-around based on parse, deparse and eval


First let us discuss a workaround based on R's functions parse, deparse and eval (more or less analogous to Mathematica's ToExpression, ToString and Evaluate). Define the following functions:



eval = RFunction["function(code){eval(parse(text=code))}"]

apply =
RFunction["function(fun,deparsedCode){
fun(eval(parse(text = deparsedCode)))
}"
]

There will be certain inconvenience since you will have to use deparse:


fgr = REvaluate["deparse(graph.full(5))"];


( I left out the output but it is a list of strings, which you can, if you wish, join together by using StringJoin).


To obtain the same result as before, you can use


eval[fgr]


 RObject[{{{2., 3., 4., 5.}}, {{1., 3., 4., 5.}}, {{1., 2., 4., 5.}}, 
{{1., 2., 3., 5.}}, {{1., 2., 3., 4.}}, {Null}, {Null}, {Null}, {Null}},
RAttributes["class" :> {"igraph"}]]


but now, calling your diameter will give the correct result:


apply[diameter, fgr]

(* {1.} *)

A better way: using R's closures to emulate object references in RLink


The above workaround has a few flaws characteristic to solutions using parsing (for example, ToString-ToExpression cycles in Mathematica). First, we are in general not guaranteed that this procedure will be robust for all inputs. Second, for large graphs this may seriously impact performance.


Of course, what your example did reveal is the need for object references (handles) for RLink, both for more complex types and for core types as well (that would be analogous to ReturnAsJavaObject in J/Link). While RLink does not "natively" support such a mechanism, it does support R closures, and here I will show how one can use them to emulate object references.


Define the following functions:


REvaluate["makeObjectRef <- function(obj){function(){obj}}"];


applyToRef = RFunction["function(fun,ref){makeObjectRef(fun(ref()))}"];

deref = RFunction["function(ref){ref()}"];

The function makeObjectRef you should use on the R side to wrap it around any value which you want to convert to a reference. What it does is to return a closure which would return that value when called with zero arguments. The functions applyToRef and deref are to be used on the Mathematica side. The former would take a function to apply, and a reference (the closure produced by makeObjectReference), and return a reference to the resulting object. The latter (deref) will "dereference" any reference created via makeObjectRef, and return the value of the R object.


Here is how we can use this. First, define a reference:


graphRef = REvaluate["makeObjectRef(graph.full(5))"];

Check:



deref[graphRef]


 RObject[{{{2., 3., 4., 5.}}, {{1., 3., 4., 5.}}, {{1., 2., 4., 5.}}, 
{{1., 2., 3., 5.}}, {{1., 2., 3., 4.}}, {Null}, {Null}, {Null}, {Null}},
RAttributes["class" :> {"igraph"}]]

Now, the result we look for:


deref[applyToRef [diameter, graphRef]]


(* {1.} *)

Remarks


Of the two workarounds I suggested, I would personally prefer the second one, based on the emulation of R object references using R closures. The emulation is good enough, and should not result in a serious overhead. The only serious problem is that the current function reference mechanism does not provide a function to release a function reference which is no longer needed (which is certainly a shortcoming. I could show a hack which would do that but it is implementation-dependent). So, one should be careful to not create too many references since they will hog the memory.


Your example certainly reveals a number of shortcomings currently present in RLink. I hope to have those addressed reasonably soon.


Comments

Popular posts from this blog

plotting - Plot 4D data with color as 4th dimension

I have a list of 4D data (x position, y position, amplitude, wavelength). I want to plot x, y, and amplitude on a 3D plot and have the color of the points correspond to the wavelength. I have seen many examples using functions to define color but my wavelength cannot be expressed by an analytic function. Is there a simple way to do this? Answer Here a another possible way to visualize 4D data: data = Flatten[Table[{x, y, x^2 + y^2, Sin[x - y]}, {x, -Pi, Pi,Pi/10}, {y,-Pi,Pi, Pi/10}], 1]; You can use the function Point along with VertexColors . Now the points are places using the first three elements and the color is determined by the fourth. In this case I used Hue, but you can use whatever you prefer. Graphics3D[ Point[data[[All, 1 ;; 3]], VertexColors -> Hue /@ data[[All, 4]]], Axes -> True, BoxRatios -> {1, 1, 1/GoldenRatio}]

plotting - Filling between two spheres in SphericalPlot3D

Manipulate[ SphericalPlot3D[{1, 2 - n}, {θ, 0, Pi}, {ϕ, 0, 1.5 Pi}, Mesh -> None, PlotPoints -> 15, PlotRange -> {-2.2, 2.2}], {n, 0, 1}] I cant' seem to be able to make a filling between two spheres. I've already tried the obvious Filling -> {1 -> {2}} but Mathematica doesn't seem to like that option. Is there any easy way around this or ... Answer There is no built-in filling in SphericalPlot3D . One option is to use ParametricPlot3D to draw the surfaces between the two shells: Manipulate[ Show[SphericalPlot3D[{1, 2 - n}, {θ, 0, Pi}, {ϕ, 0, 1.5 Pi}, PlotPoints -> 15, PlotRange -> {-2.2, 2.2}], ParametricPlot3D[{ r {Sin[t] Cos[1.5 Pi], Sin[t] Sin[1.5 Pi], Cos[t]}, r {Sin[t] Cos[0 Pi], Sin[t] Sin[0 Pi], Cos[t]}}, {r, 1, 2 - n}, {t, 0, Pi}, PlotStyle -> Yellow, Mesh -> {2, 15}]], {n, 0, 1}]

plotting - Mathematica: 3D plot based on combined 2D graphs

I have several sigmoidal fits to 3 different datasets, with mean fit predictions plus the 95% confidence limits (not symmetrical around the mean) and the actual data. I would now like to show these different 2D plots projected in 3D as in but then using proper perspective. In the link here they give some solutions to combine the plots using isometric perspective, but I would like to use proper 3 point perspective. Any thoughts? Also any way to show the mean points per time point for each series plus or minus the standard error on the mean would be cool too, either using points+vertical bars, or using spheres plus tubes. Below are some test data and the fit function I am using. Note that I am working on a logit(proportion) scale and that the final vertical scale is Log10(percentage). (* some test data *) data = Table[Null, {i, 4}]; data[[1]] = {{1, -5.8}, {2, -5.4}, {3, -0.8}, {4, -0.2}, {5, 4.6}, {1, -6.4}, {2, -5.6}, {3, -0.7}, {4, 0.04}, {5, 1.0}, {1, -6.8}, {2, -4.7}, {3, -1....