
notebooks - NotebookFind and String Pattern Expressions


Is there a way that NotebookFind can be used to match string pattern expressions rather than just strings?


The documentation for NotebookFind states that only a string, box expression or complete cell can be used as the search term so my question is really whether or not pattern matching can be achieved through writing some additional code that wraps or replaces NotebookFind.


One obvious strategy would be to convert the notebook to a text representation using NotebookGet and then perform the pattern matching search on the text representation, but this is not ideal for my intended application because I would like any match that is found to be highlighted (by selecting it) much like NotebookFind already does.
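A minimal sketch of that text-based strategy, run on a hand-written Notebook expression rather than on the result of NotebookGet. The names sampleNb, pat, matches and positions are illustrative assumptions. It reports which strings match and where they sit in the expression, but it selects nothing in the front end, which is exactly the limitation described above.

```mathematica
(* sampleNb is a made-up stand-in for NotebookGet[nb] *)
sampleNb = Notebook[{
   Cell["call (foo1) here", "Text"],
   Cell["nothing to see", "Text"],
   Cell["plain foo2", "Text"]}];

(* any string that contains foo followed by a digit *)
pat = ___ ~~ "foo" ~~ DigitCharacter ~~ ___;

(* the matching strings and their positions inside the notebook expression *)
matches = Cases[sampleNb, s_String /; StringMatchQ[s, pat], Infinity];
positions = Position[sampleNb, s_String /; StringMatchQ[s, pat], Infinity];
```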


Eventually I would like to build a replacement for Mathematica's built-in Search and Replace functionality. Two key enhancements that I hope to provide are:





  1. the ability to search and replace across all open notebooks in the front-end or all notebooks in a selected directory (which is not too difficult to accomplish) and




  2. the ability to search and replace using string pattern expressions.




I realize that Workbench already offers these features. My goal is to enable users who prefer the notebook interface (rather than the .m editor promoted by Workbench) to continue developing complex multi-notebook packages from within the front-end.


Edit:


Celtschk proposes a strategy below in the comments that may provide a partial solution. One issue that remains unclear, however, is how to deal with the surrounding context of a pattern match when returning to NotebookFind.



Perhaps the following example will help clarify the potential problem. Without digressing into the theory of formal grammars, let's say that we want our string pattern language to be powerful enough to express not just wildcard patterns but also surrounding context. Imagine in particular that we want to find each occurrence of the string pattern "foo?" in some notebook that is enclosed by a pair of parentheses (not necessarily immediately surrounding the "foo?" pattern). We can do that easily using standard Mathematica string pattern expressions by operating on the string representation of the notebook.


Let's now assume that there is one occurrence of "foo1" and two occurrences of "foo2" in the notebook, the latter of which is not surrounded at any distance by a pair of parentheses. How would we then exclude the second "foo2" from being found when we return to NotebookFind to search for "foo1" and "foo2"?
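The scenario can be reproduced on a plain string. This is only a sketch: text is a made-up sample, and the regular expression is one possible way to express "foo followed by a digit, somewhere inside a pair of parentheses". The string search finds only the enclosed occurrences, whereas a plain search for "foo2" also hits the bare one.

```mathematica
(* made-up sample: foo1 and the first foo2 are enclosed, the last foo2 is not *)
text = "(a foo1 b) and (c foo2 d) and a bare foo2";

(* keep only the captured group of each parenthesised match *)
inParens = StringCases[text,
   RegularExpression["[(][^()]*?(foo\\d)[^()]*?[)]"] :> "$1"];

(* a context-free search for the realised string sees both foo2 occurrences *)
plainCount = StringCount[text, "foo2"];
```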


Of course we could have matched the entire string plus surrounding context (which in this case would include the surrounding pair of parentheses) when searching the string representation for parentheses-enclosed instances of "foo?" -- but this is not really what we want, and in certain instances could be quite inconvenient in a tool designed to assist the user in refactoring a large body of Mathematica code.



Answer



OK, so this is going to be a long one. This is definitely not a general-purpose implementation, but it shows the general idea that one could use.


So you basically want to be able to type something like NotebookFind["(.*?foo\d.*?)"], which would match, for example, "(something something foo4 dark side)". However, you only want it to highlight foo4 and not the rest. The way to do this is to first search through the notebook and establish that the pattern matches the entire string, then search only for the particular realised sub-expression "foo4", and finally work out which of the potentially many search results for foo4 coincides with the match for the entire pattern.


For the purposes of this implementation I'll assume that you have a RegularExpression pattern in which the part you want to highlight is the first captured subpattern (which means you enclose it in parentheses in the search string). The above pattern would then be RegularExpression["[(].*?(foo[\\d]+).*?[)]"]. We then:



  • search through the strings in the notebook expression for cases where this pattern matches,

  • extract the matched subexpression,

  • count how many times the subexpression is matched without the full pattern being matched, and

  • call NotebookFind[] enough times to land on the correct match.
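The steps above can be sketched on a single string. This is only an illustration under simplifying assumptions: text is a made-up sample, the names sub, fullStart, skips and calls are hypothetical, and the target is assumed to be the first occurrence of the sub-expression inside the full match.

```mathematica
text = "bare foo2, then ( a wrapped foo2 ) and foo2 again";
full = RegularExpression["[(].*?(foo\\d+).*?[)]"];

(* steps 1 and 2: the realised sub-expression of the first full match *)
sub = First@StringCases[text, full :> "$1"];

(* character position at which the first full match starts *)
fullStart = StringPosition[text, full][[1, 1]];

(* step 3: occurrences of sub that start before the full match *)
skips = Count[StringPosition[text, sub][[All, 1]], p_ /; p < fullStart];

(* step 4: a forward search for sub needs this many steps to reach the target *)
calls = skips + 1;
```

Here one bare foo2 precedes the parenthesised one, so a forward search for "foo2" has to be repeated twice to land on the intended occurrence.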


Here is the actual code. It only searches through strings and does not work for matching notebook-level expressions.


This function just creates a replacement rule, based on the search pattern, that performs the actual substitution.


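(* Build a rule: any string that fully matches the pattern is replaced by
   the list of its first captured groups ("$1") *)
StringPatternWrapper[stringpattern_] :=
  (a_String /; StringMatchQ[a, stringpattern]) :> StringCases[a, stringpattern :> "$1"]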
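As a quick illustration of what the wrapper produces, here is a sketch that applies it to a small hand-written list of cells. The definition is repeated so that the block is self-contained; rule, cells and found are illustrative names. Because StringMatchQ has to match the whole string, the pattern is padded with ".*?" on both sides.

```mathematica
StringPatternWrapper[stringpattern_] :=
  (a_String /; StringMatchQ[a, stringpattern]) :>
   StringCases[a, stringpattern :> "$1"];

rule = StringPatternWrapper[
   RegularExpression[".*?[(].*?(foo[\\d]+).*?[)].*?"]];

(* made-up cells: only the second contains a parenthesised foo *)
cells = {Cell["bare foo1", "Text"], Cell["wrapped (foo2) here", "Text"]};

(* each matching string yields the list of its captured groups *)
found = Cases[cells, rule, Infinity];
```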

This function finds the positions and cases of the matched pattern. The pattern provided to this function should first be passed through StringPatternWrapper[].


(* pattern is the rule from StringPatternWrapper; pattern[[1]] is its left-hand side *)
findPostionAndExactMatch[nbexp_, pattern_] :=
  {Position[nbexp, pattern[[1]], ∞],
   Cases[nbexp, pattern, ∞]} // ridiculousFormatingFunction

where ridiculousFormatingFunction is a messy function for reformatting the output.


 ridiculousFormatingFunction[list_] := 
Map[(a\[Function]Map[{a[[1]],#}&,a[[2]]]),Transpose[list]]//
(Flatten[Table[{#[[1,1]],#[[1,2]],n},{n,1,#[[2]]}]&/@Tally[Flatten[#,1]],1])&

And then a function for finding all the matches to the matched subexpression


findAllExactMatches[nbexp_,exact_] := 

Flatten[Map[Table[#, {Length@StringCases[nbexp[[Sequence@@#]],exact]}]&,
Position[nbexp,a_String/;StringMatchQ[a,exact],∞]],1]

Because some strings might contain more than one match, we need to adjust the numbers:


repeatNumberForMatch[match_,nbexp_] := 
First@Position[
findAllExactMatches[nbexp,RegularExpression[".*?"<>match[[2]]<>".*?"]],match[[1]]
][[match[[3]]]]

And finally we have a nice little function which returns a list of all the matches, which expressions they appear in, and how many times you need to call NotebookFind to reach each one.



 matchTable[nbexp_,pattern_] := Prepend[#,repeatNumberForMatch[#,nbexp]]&/@
findPostionAndExactMatch[nbexp,pattern]

Here is an example of the output from matchTable, using the example notebook provided below:


  matchTable[NotebookGet[nb], 
StringPatternWrapper[
RegularExpression[".*?" <> "[(].*?(foo[\\d]+).*?[)]" <> ".*?"]]] //
Prepend[#, {"Repeat find number", "Indices", "Exact Match",
"Number inside string"}] & // Grid


(Image: output from matchTable)


Here is a short usage example:


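(* Move to the top of the notebook, then step to the n-th occurrence of find *)
notebookFindN[nb_, find_, n_] :=
  (SelectionMove[nb, Before, Notebook]; Do[NotebookFind[nb, find, Next], {n}])

(* One button per match, labelled with the matched text; clicking it selects that match *)
clickerUI[nb_, pattern_] :=
  Button[#[[3]], notebookFindN[nb, #[[3]], #[[1]]]] & /@ matchTable[NotebookGet[nb], pattern]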

And some test code and a test notebook:


 nb = {

Cell["This is a direct match for the realised sub-pattern foo1 but not the full", "Text"],
Cell["This is another match identical to the realised one foo2, and still not the full, however this one needs to be skiped when using NotebookFind", "Text"],
Cell["And finally ( we have a full match for foo2 the pattern ) (foo2) <- That's another one, and so is that -> (foo4)", "Text"],
Cell["Some times a single string can have more then one entry foo2 foo2, so we need to count how many and which ones we are looking for, which makes the code slightly messy.", "Text"],
Cell["And finally ( we have one last full match for foo2 ) enclosed in parenthesis", "Text"]
} // CreateDocument;

clickerUI[nb,
StringPatternWrapper@
RegularExpression[".*?" <> "[(].*?(foo[\\d]+).*?[)]" <>".*?"]

] // Row

I hope this is of some help. Personally, I'd like to have code that could search through strings and notebook-level expressions equally well; however, I think that requires structuring the method better.

