At some point I noticed that I was using Riffle
and Partition
together a lot. I would do things like
Partition[Riffle[{1,2,3},{4,5,6}],2]
or
Partition[Riffle[{1,2,3}, 6 , {2, -1, 2}], 2]
The question is: are better alternatives, in terms of memory and speed?
Answer
Here is a solution for both "idioms", using LibraryLink and some kind of custom generics. I'm not sure if I like the abstractions, but here is the answer anyway.
<< SymbolicC`
<< Developer`
<< CCompilerDriver`
<< CCodeGenerator`
The definitions below are made to help abstract this function, so that it works for integers, real numbers and complex numbers.
(*lfn is short for library function name*)
(*LL is short for LibraryLink*)
typeTableHeaders = {"type name", "c type", "LL generic name",
"abbreviation in lfn"};
typeTable =
{
{Integer, "mint", "Integer", "I"},
{Real, "mreal", "Real", "R"},
{Complex, "mcomplex", "Complex", "C"}
};
cTypes = typeTable[[All, 2]];
abstractFunctionName = "parif";
libraryFunctionNames =
abstractFunctionName <> "T" <> # <> "_T" & /@ typeTable[[All, 4]];
getters = StringJoin["MArgument_get", #] & /@ typeTable[[All, 3]];
dataGetters = "MTensor_get" <> # <> "Data" & /@ typeTable[[All, 3]];
tensorTypes = "MType_" <> # & /@ typeTable[[All, 3]];
libraryFunctionNamesSingle = # <> "Single" & /@ libraryFunctionNames;
libraryFunctionNamesMulti = # <> "Multi" & /@ libraryFunctionNames;
relations =
Hold@
{
getter -> getters,
dataGetter -> dataGetters,
type -> cTypes ,
tensorType -> tensorTypes
};
preRules = Thread /@ First@ MapAt[HoldPattern, relations, {1, All, 1}];
The next definitions distinguish between the "single" case where we want to riffle with a constant, and the "multi" case where we want to riffle two lists. I have not taken the effort of making this code more expressive. That is, there is some duplicate code.
preRulesSingle =
Append[
preRules,
Thread[libraryFunctionName -> libraryFunctionNamesSingle]
];
singleDIERule =
setupSecondInputAccess :>
Sequence[
CDeclare[type, "inputSingleton"],
CAssign["inputSingleton", CCall[getter, CArray["Args", 1]]]
];
rulesSingle =
Thread[
Join[
preRulesSingle
,
{
HoldPattern@inputElement -> "inputSingleton",
HoldPattern@determineNextInputElement :> Sequence[],
disownIfNeeded :> Sequence[],
singleDIERule
}
]
];
preRulesMulti =
Append[
preRules,
Thread[libraryFunctionName -> libraryFunctionNamesMulti]
];
multiDIERule =
setupSecondInputAccess :>
Sequence[
CDeclare["MTensor", "input2"],
CAssign["input2",
CCall["MArgument_getMTensor", CArray["Args", 1]]],
CDeclare[CPointerType[type], "input2DataPtr"],
CAssign[
"input2DataPtr",
CCall[CPointerMember["libData", dataGetter], {"input2"}]
]
];
rulesMulti =
Thread[
Join[
preRulesMulti
,
{
HoldPattern@inputElement -> CDereference["input2DataPtr"],
HoldPattern@determineNextInputElement :>
COperator[Increment, "input2DataPtr"],
disownIfNeeded :>
CCall[CPointerMember["libData", "MTensor_disown"], "input2"],
multiDIERule
}
]
];
Now we construct some SymbolicC with some tokens inside it
parifSCHeld =
Hold@
CFunction[
"int"
,
libraryFunctionName
,
{{"WolframLibraryData", "libData"}, {"mint",
"Argc"}, {CPointerType["MArgument"], "Args"}, {"MArgument",
"Res"}}
,
CBlock[
{
CDeclare["int", CAssign["err", "LIBRARY_NO_ERROR"]],
CDeclare["MTensor", "input"],
CDeclare["MTensor", "result"],
CDeclare["int", "inputLength"],
CDeclare["mint", CArray["resultDimensions", 2]],
CAssign[
"input",
CCall["MArgument_getMTensor", CArray["Args", 0]]
],
CAssign[
"inputLength",
CDereference[
CCall[
CPointerMember["libData", "MTensor_getDimensions"],
{"input"}
]
]
],
CAssign[
CArray["resultDimensions", 0],
"inputLength"
],
CAssign[
CArray["resultDimensions", 1],
2
]
,
CAssign["err",
CCall[
CPointerMember["libData", "MTensor_new"],
{tensorType, 2, "resultDimensions", CAddress["result"]}
]
]
,
CDeclare[CPointerType[type], "inputDataPtr"],
CAssign[
"inputDataPtr",
CCall[
CPointerMember["libData", dataGetter],
{"input"}
]
],
CDeclare[CPointerType[type], "resultDataPtr"],
CAssign[
"resultDataPtr",
CCall[
CPointerMember["libData", dataGetter],
{"result"}
]
]
,
CDeclare["long", CAssign["iter", 0]]
,
setupSecondInputAccess
,
CWhile[
COperator[Less, {"iter", "inputLength"}],
CBlock[
{
CAssign[
CDereference["resultDataPtr"],
CDereference["inputDataPtr"]
],
COperator[Increment, "resultDataPtr"],
COperator[Increment, "inputDataPtr"],
CAssign[
CDereference["resultDataPtr"],
inputElement
],
determineNextInputElement,
COperator[Increment, "resultDataPtr"],
COperator[Increment, "iter"]
}
]
]
,
CCall["MArgument_setMTensor", {"Res", "result"}],
CCall[CPointerMember["libData", "MTensor_disown"], "input"],
disownIfNeeded,
CReturn["err"]
}
]
];
Now, we turn it into a string, that would normally correspond to the contents of a .c file.
symbCWithReplacementsSingle = parifSCHeld //. rulesSingle;
symbCWithReplacementsMulti = parifSCHeld //. rulesMulti;
cStrings =
"DLLEXPORT" <> " " <> ToCCodeString@First@# & /@
Join[symbCWithReplacementsSingle, symbCWithReplacementsMulti];
cFunctionsString = StringRiffle[cStrings, "\n\n"];
boilerPlate = "
#include \"WolframLibrary.h\"
/* Return the version of Library Link */
DLLEXPORT mint WolframLibrary_getVersion( ) {
\treturn WolframLibraryVersion;
}
/* Initialize Library */
DLLEXPORT int WolframLibrary_initialize( WolframLibraryData \
libData) {
\treturn LIBRARY_NO_ERROR;
}
/* Uninitialize Library */
DLLEXPORT void WolframLibrary_uninitialize( WolframLibraryData \
libData) {
\treturn;
}
";
totalCString = boilerPlate <> cFunctionsString;
The code below creates the library.
libName = "parif";
CreateLibrary[totalCString, libName]
This loads the functions (again duplicate code)
parifFunctionsSingle =
LibraryFunctionLoad[
libName, #2, {{#, 1, "Shared"}, {#}}, {#, 2}] & @@@
Transpose@{typeTable[[All, 1]], libraryFunctionNamesSingle};
Set @@ {ToExpression["parifSingle" <> #, InputForm,
Unevaluated], #2} & @@@
Transpose@{typeTable[[All, 4]], parifFunctionsSingle};
parifFunctionsMulti =
LibraryFunctionLoad[
libName, #2, {{#, 1, "Shared"}, {#, 1, "Shared"}}, {#, 2}] & @@@
Transpose@{typeTable[[All, 1]], libraryFunctionNamesMulti};
Set @@ {ToExpression["parifMulti" <> #, InputForm,
Unevaluated], #2} & @@@
Transpose@{typeTable[[All, 4]], parifFunctionsMulti};
Timing comparisons
betterish[a_List,const_]:=Transpose[{a,ConstantArray[const,Length@a]}];
arFlatWiz[a_,b_]:= ArrayFlatten[{{{a}\[Transpose],b}}];
nn=10^7;
randomIntegers=RandomInteger[100,nn];
(resB=betterish[randomIntegers,2])//RepeatedTiming//First
(resAFW=arFlatWiz[randomIntegers, 2])//RepeatedTiming//First
(resSI=parifSingleI[randomIntegers,2])//RepeatedTiming//First
resB === resAFW===resSI
gives
0.201
0.213
0.0769
True
And
randomReals1=RandomReal[1,nn];
randomReals2=RandomReal[1,nn];
(resT=Transpose@{randomReals1, randomReals2})//RepeatedTiming//First
(resMRLL=parifMultiR[randomReals1,randomReals2])//RepeatedTiming//First
resMRLL===resT
gives
0.18
0.0838
True
So indeed we are faster than Transpose
here.
Comments
Post a Comment