syntax - Convenient string manipulation


With Mathematica I always feel that strings are "second class citizens." Compared to a language such as Perl, one must juggle a lot of code to accomplish the same task.


The available functionality is not bad, but the syntax is uncomfortable. While there are a few shorthand forms such as <> for StringJoin and ~~ for StringExpression, most of the string functionality lacks such syntax and relies on clumsy names such as StringReplace, StringDrop, StringReverse, Characters, CharacterRange, FromCharacterCode, and RegularExpression.


In Mathematica strings are handled like mathematical objects, allowing 5 "a" + "b" where "a" and "b" act as symbols. This is a feature that I would not change, even if doing so would not break stacks of code. Nevertheless it precludes certain terse string syntax wherein the expression 5 "a" + "b" would be rendered "aaaaab" for example.
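To make the point above concrete, here is a small sketch of how the expression behaves today, next to one way of spelling out the concatenation explicitly. The variable name expr is only for illustration.

```mathematica
(* Arithmetic treats strings as inert atoms, so this stays symbolic. *)
expr = 5 "a" + "b";
Head[expr]                                (* Plus *)

(* The "terse" reading has to be written with explicit string functions. *)
StringJoin[ConstantArray["a", 5], "b"]    (* "aaaaab" *)
```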




What is the best way to make string manipulation more convenient in Mathematica?


Ideas that come to mind, either alone or in combination, are:

  1. Overload existing functions to work on strings, e.g. Take, Replace, Reverse.

    • This was the original topic of my question to which Sasha replied. It was seen as inadvisable.

  2. Use shortened names for string functions, e.g. StringReplace >> StrRpl, Characters >> Chrs, RegularExpression >> RegEx

  3. Create new infix syntax for string functions, and possibly new string operations.

  4. Create a new container for strings, e.g. str["string"], and then definitions for various functions. (This was suggested by Leonid Shifrin.)

  5. A variation of (4), expand strings (automatically?) to characters, e.g. "string" >> str["s","t","r","i","n","g"] so that the characters can be seen by Part, Take, etc.

  6. Call another language such as Perl from within Mathematica to handle string processing.

  7. Create new string functions that conglomerate frequently used sequences of operations.
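As a rough illustration of idea (4), the container could be given its list-like behaviour through UpValues, so that the built-in functions themselves are never touched. This is only a sketch under assumptions: the set of functions covered and the choice to return results wrapped in str again are mine, not part of the original suggestion.

```mathematica
ClearAll[str];
(* Hypothetical wrapper: str["..."] marks a string for list-like handling. *)
(* The rules are attached to str (UpValues), not to Take, Drop, etc. *)
str /: Take[str[s_String], spec_] := str[StringTake[s, spec]];
str /: Drop[str[s_String], spec_] := str[StringDrop[s, spec]];
str /: Reverse[str[s_String]] := str[StringReverse[s]];
str /: Length[str[s_String]] := StringLength[s];

Take[str["string"], 3]      (* str["str"] *)
Drop[str["string"], 3]      (* str["ing"] *)
Reverse[str["abc"]]         (* str["cba"] *)
Length[str["string"]]       (* 6 *)
```

The cost of this design is that every string has to be wrapped and unwrapped explicitly, which is part of what idea (5) tries to address.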





Answer



I suggest an approach based on creating lexical and/or dynamic environments (custom scoping constructs, if you wish), inside which the rules of our "universe" are altered. I will illustrate with a dynamic environment:


ClearAll[withStringManipulations];
SetAttributes[withStringManipulations, HoldAll];
withStringManipulations[code_] :=
  Internal`InheritedBlock[{Take, Drop, Position, Join, Append,
    Prepend, Length, Part, Plus},
   Unprotect[Take, Drop, Position, Join, Append, Prepend, Length, Part, Plus];
   (* String-specific rules, added on top of the built-in definitions *)
   Take[s_String, pos_] := StringTake[s, pos];
   Drop[s_String, pos_] := StringDrop[s, pos];
   HoldPattern[Part[s_String, n_]] := StringTake[s, {n, n}];
   Join[ss__String] := StringJoin[ss];
   Append[s_String, ss_String] := StringJoin[s, ss];
   Prepend[s_String, ss_String] := StringJoin[ss, s];
   Length[s_String] := StringLength[s];
   (* Plus is Orderless, so it is intercepted through an OwnValue:
      strings are joined in the order given, anything else goes to the built-in Plus *)
   Plus =
    Function[Null,
     If[MatchQ[{##}, {__String}],
      StringJoin[##],
      (* else: temporarily remove the OwnValue and call the built-in Plus *)
      Module[{result, ov = OwnValues[Plus]},
       Unprotect[Plus];
       OwnValues[Plus] = {};
       result = Plus[##];
       OwnValues[Plus] = ov;
       Protect[Plus];
       result]]];
   Protect[Take, Drop, Position, Join, Append, Prepend, Length, Part, Plus];
   code
   ];

This is not a complete set of things you can do, just an example. Because I used Internal`InheritedBlock, the global versions of functions such as Part are never modified, so this is safe in the sense that it has no system-wide effects. With Plus, I had to go through some pain: it has the Orderless attribute, which I did not want to alter, but I wanted to avoid sorting when the arguments are strings.
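The localization can be checked in isolation. The following is a minimal sketch, separate from the environment above, that adds one string rule to Reverse inside Internal`InheritedBlock and then looks at Reverse again outside. The variable name inside is only for illustration; note that Internal`InheritedBlock is an undocumented function.

```mathematica
(* Inside the block, Reverse keeps its built-in behaviour and gains one rule. *)
inside = Internal`InheritedBlock[{Reverse},
   Unprotect[Reverse];
   Reverse[s_String] := StringReverse[s];
   Protect[Reverse];
   {Reverse["abc"], Reverse[{1, 2, 3}]}];

inside                   (* {"cba", {3, 2, 1}} *)

(* Outside the block the added rule is gone again. *)
DownValues[Reverse]      (* {} *)
Quiet[Reverse["abc"]]    (* stays unevaluated *)
```

A plain Block would not work here, because it clears the built-in behaviour of the localized symbols for the duration of its body.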


Some examples:


In[31]:= withStringManipulations["a"+"b"+"c"]
Out[31]= abc

In[32]:= withStringManipulations[1+2+3]
Out[32]= 6

In[34]:= withStringManipulations[With[{s = "abc"},Table[s[[i]],{i,Length[s]}]]]//InputForm
Out[34]//InputForm=
{"a", "b", "c"}

In[37]:= withStringManipulations[Append["abc","d"]]
Out[37]= abcd

As I said, this is just an example to illustrate the idea. Anyone interested can create their own environments by setting their own rules. This is IMO a very cheap and powerful way to reuse the system functions' syntax to one's liking, without endangering the system.



Be aware, however, that the above environment is dynamic in terms of scoping. It is therefore not suitable, for example, for writing higher-order functions that accept arbitrary user code, because Part and the other functions will also behave differently inside that code. (A user who knows exactly what the consequences are can cope, but in practice a package writer cannot depend on the user much.) It is also possible to create lexical environments, where the changes affect only the code literally present inside the environment.
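One possible way to build such a lexical environment is to rewrite the held code before evaluating it. The sketch below is an assumption about how this could look, not the construction the answer had in mind; the names strTake, strLength and withLexicalStringOps are hypothetical, and only Take and Length are covered.

```mathematica
ClearAll[strTake, strLength, withLexicalStringOps];
(* Hypothetical helpers: string-aware versions that fall back to the built-ins. *)
strTake[s_String, spec_] := StringTake[s, spec];
strTake[args___] := Take[args];
strLength[s_String] := StringLength[s];
strLength[args___] := Length[args];

SetAttributes[withLexicalStringOps, HoldAll];
(* Replace only the symbols literally present in the held code, then evaluate. *)
withLexicalStringOps[code_] :=
  ReleaseHold[Hold[code] /. {Take -> strTake, Length -> strLength}];

withLexicalStringOps[{Take["abcd", 2], Length["abcd"], Take[{1, 2, 3}, 2]}]
(* {"ab", 4, {1, 2}} *)
```

Because the replacement happens on the code as written, a function defined elsewhere that calls Take internally is not affected when it is invoked from inside the environment, which is the difference from the dynamic version.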

