I would like to use RLink to access igraph, which has an R interface. I do not know R; my motivation for learning it is to be able to access igraph easily. I installed igraph using install.packages("igraph"). This seems to have worked fine (I'm doing this on Windows, where package installation is supported). I can load the package using REvaluate["library(igraph)"].
Now I would like to reference some igraph functions and use them, in analogy to this:
sin = RFunction["sin"]
cos = RFunction["cos"]
sin@cos[1]
I did
fullgraph = RFunction["graph.full"]
diameter = RFunction["diameter"]
I created a graph using g = fullgraph[5] and I tried diameter[g]. The value it returns is {0.}. This is not the same as what I get using REvaluate["diameter(graph.full(5))"], which returns {1.}.
What am I doing wrong?
Does RLink support the kinds of objects returned by igraph, or does it only support basic built-in types? Another thing I observed is that if I try these commands several times, then sin@cos[1] (with the above definitions) will stop working too, in the sense that it will throw an error about being unable to set variables (RLink's internal state probably becomes inconsistent), but I can't reproduce this consistently.
Answer
The problem
The full answer would require venturing into quite sophisticated matters, some of which are not yet fully clear to me. What I can tell right now is that your use case hits the borderline between what can be fully imported into RLink (in the sense that it can then be equivalently exported back), and what appears to be imported correctly but in practice can only be used on the Mathematica side, because exporting it back to R does not produce a correct object on the R side.
General situation with data types in RLink
Generally, RLink supports most R objects, not just the basic data types. The reason it is able to do that is that R has a simple and consistent object model, so that non-core types are usually represented by core types with certain attributes. For example, a factor is actually an integer vector with the additional attributes levels and class, and a data frame is actually a list with the class attribute set to data.frame, plus a number of other attributes. A more detailed discussion of the subset of R's object model supported by RLink can be found in the tutorial on R data types in RLink.
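The claim that non-core types are core types plus attributes can be checked in plain R, without RLink. The following is only a small sketch; the variable names f and df are illustrative and do not come from the original post.

```r
# A factor is an integer vector carrying "levels" and "class" attributes.
f <- factor(c("a", "b", "a"))
typeof(f)              # "integer"
attributes(f)          # the "levels" and "class" attributes
unclass(f)             # the underlying integer codes (levels are kept as an attribute)

# A data frame is a list whose "class" attribute is "data.frame".
df <- data.frame(x = 1:3, y = c("u", "v", "w"))
typeof(df)             # "list"
class(df)              # "data.frame"
names(attributes(df))  # also includes "names" and "row.names"
```

This is the structure RLink mirrors on the Mathematica side as the data part of RObject together with RAttributes.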
The types explicitly not supported include environments, certain language constructs (type language), S4 classes, and a few other types. Still, the set of supported types is large enough for RLink to be able to import even quite complex R objects, such as the results of the linear model fit (lm) and nonlinear model fit (nlm) commands. However, it would be fair to say that currently RLink is biased towards importing things into Mathematica rather than exporting from Mathematica to R. Objects that are imported into Mathematica may contain parts which generally will not export correctly back to R.
Most such non-exportable objects are represented on the Mathematica side by RLink's heads RCode and REnvironment. Your case apparently falls into the same category, but goes undetected by RLink. Since exporting back to R is needed whenever we call RLink's function references on Mathematica arguments, we do have a problem for such non-exportable types.
The case at hand
An attempt to import an igraph object from R into Mathematica results in a deceptively simple expression:
REvaluate["graph.full(5)"]
RObject[{{{2., 3., 4., 5.}}, {{1., 3., 4., 5.}}, {{1., 2., 4., 5.}},
{{1., 2., 3., 5.}}, {{1., 2., 3., 4.}}, {Null}, {Null}, {Null}, {Null}},
RAttributes["class" :> {"igraph"}]]
However, sending this back to R does not correctly reconstruct the full igraph object. (There are a number of technical details I am leaving out at the moment, such as the role of the RList / RVector interpretation ambiguity, and a few other things. The long story short is that even when one accounts for all of that correctly, the object is not exported from Mathematica to R properly.)
Possible solutions
While I will dig deeper into the exact reasons for this behavior and expand this post when I get those, right now I can suggest a few possible workarounds, which hopefully will help you to get the job done.
A work-around based on parse, deparse and eval
First let us discuss a workaround based on R's functions parse, deparse and eval (more or less analogous to Mathematica's ToExpression, ToString and Evaluate). Define the following functions:
eval = RFunction["function(code){eval(parse(text=code))}"]
apply =
RFunction["function(fun,deparsedCode){
fun(eval(parse(text = deparsedCode)))
}"
]
There is a certain inconvenience, since you will have to use deparse on the R side:
fgr = REvaluate["deparse(graph.full(5))"];
(I left out the output, but it is a list of strings, which you can, if you wish, join together using StringJoin.)
To obtain the same result as before, you can use
eval[fgr]
RObject[{{{2., 3., 4., 5.}}, {{1., 3., 4., 5.}}, {{1., 2., 4., 5.}},
{{1., 2., 3., 5.}}, {{1., 2., 3., 4.}}, {Null}, {Null}, {Null}, {Null}},
RAttributes["class" :> {"igraph"}]]
But now, calling your diameter through apply will give the correct result:
apply[diameter, fgr]
(* {1.} *)
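The round trip that this workaround relies on can be tried in plain R, without RLink or igraph. This is a minimal sketch on an ordinary list; the names original, code and rebuilt are illustrative only, and how faithfully a given igraph object survives deparse is not something this example demonstrates.

```r
# deparse() turns an R object into source text (a character vector,
# possibly with several elements for larger objects).
original <- list(a = 1:3, b = "text")
code <- deparse(original)

# parse() reads that text back as an expression and eval() evaluates it,
# which rebuilds the object on the R side.
rebuilt <- eval(parse(text = code))
identical(original, rebuilt)   # TRUE for this simple list
```

Because only strings cross the boundary between Mathematica and R, the object itself never has to be exported by RLink.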
A better way: using R's closures to emulate object references in RLink
The above workaround has a few flaws characteristic of solutions based on parsing (for example, ToString-ToExpression cycles in Mathematica). First, we are in general not guaranteed that this procedure will be robust for all inputs. Second, for large graphs this may seriously impact performance.
Of course, what your example did reveal is the need for object references (handles) in RLink, both for more complex types and for core types (analogous to ReturnAsJavaObject in J/Link). While RLink does not "natively" support such a mechanism, it does support R closures, and here I will show how one can use them to emulate object references.
Define the following functions:
REvaluate["makeObjectRef <- function(obj){function(){obj}}"];
applyToRef = RFunction["function(fun,ref){makeObjectRef(fun(ref()))}"];
deref = RFunction["function(ref){ref()}"];
The function makeObjectRef is to be used on the R side: wrap it around any value you want to convert to a reference. It returns a closure which returns that value when called with zero arguments. The functions applyToRef and deref are to be used on the Mathematica side. The former takes a function to apply and a reference (the closure produced by makeObjectRef), and returns a reference to the resulting object. The latter (deref) "dereferences" any reference created via makeObjectRef and returns the value of the R object.
Here is how we can use this. First, define a reference:
graphRef = REvaluate["makeObjectRef(graph.full(5))"];
Check:
deref[graphRef]
RObject[{{{2., 3., 4., 5.}}, {{1., 3., 4., 5.}}, {{1., 2., 4., 5.}},
{{1., 2., 3., 5.}}, {{1., 2., 3., 4.}}, {Null}, {Null}, {Null}, {Null}},
RAttributes["class" :> {"igraph"}]]
Now, the result we look for:
deref[applyToRef[diameter, graphRef]]
(* {1.} *)
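The closure mechanism itself can be tried in plain R. In this sketch the three helpers are written as ordinary R functions rather than being created through RFunction, which is an assumption made to keep the example self-contained; a numeric vector and sort stand in for the igraph object and diameter.

```r
# A "reference" is a zero-argument closure that remembers obj.
makeObjectRef <- function(obj) { function() { obj } }

# Apply fun to the referenced value and wrap the result in a new reference.
applyToRef <- function(fun, ref) { makeObjectRef(fun(ref())) }

# Calling the closure returns the stored value.
deref <- function(ref) { ref() }

ref <- makeObjectRef(c(3, 1, 2))
sortedRef <- applyToRef(sort, ref)
is.function(sortedRef)   # TRUE: still a reference, not the value
deref(sortedRef)         # 1 2 3
```

The value stays inside the closure on the R side, so only the function reference has to pass through RLink until deref is called.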
Remarks
Of the two workarounds I suggested, I would personally prefer the second one, based on the emulation of R object references using R closures. The emulation is good enough, and should not result in serious overhead. The only serious problem is that the current function reference mechanism does not provide a function to release a function reference which is no longer needed. This is certainly a shortcoming; I could show a hack that does it, but it is implementation-dependent. So one should be careful not to create too many references, since they will hog memory.
Your example certainly reveals a number of shortcomings currently present in RLink. I hope to have those addressed reasonably soon.