parsing - How to get the longest bracket pairs from a string


In C#, regular expressions support balancing groups, so for example we can get


{  aa>, <a>}

from


"xx   aa> yy<a>"

using the regular expression


<((?<gro><)|(?<-gro>>)|[^<>])*(?(gro)(?!))>


A tool such as RegexBuddy can display this expression with syntax coloring.



How can this be done in Mathematica?


I would prefer to use a regular expression, but Mathematica seems to support only basic regular-expression features, not advanced ones such as dynamic regular expressions and balancing groups.


PS: Another example: how to get {[ab*[c]d], (b(x99))} from "dd9[ab*[c]d]esiddx(45x(b(x99))"?



Answer



First Case


str = "xx   aa> yy<a>";
StringCases[str,
RegularExpression["(?P<a><([^<>]|(?P>a))*>)"]
]
(* {" aa>", "<a>"} *)

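This works as follows: (?P<a>...) defines a group named a, which matches a < followed by any number of items and then a closing >. Each item is either a single character other than < or >, matched by [^<>], or a complete nested bracket pair, matched by (?P>a), which calls group a recursively. Because every recursive call has to consume its own closing >, only balanced substrings can match.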
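To see the recursion at work, here is a minimal sketch that applies the same pattern to two made-up test strings. The name demo and both strings are hypothetical and do not come from the question. The group a is defined by (?P<a>...), and (?P>a) calls it again for each nested pair.

```mathematica
(* Hypothetical test string, chosen only for illustration *)
demo = "p <q<r>s> t <u> v";

(* (?P<a> ... ) names the group a; (?P>a) recurses into it *)
StringCases[demo, RegularExpression["(?P<a><([^<>]|(?P>a))*>)"]]
(* {"<q<r>s>", "<u>"} *)

(* An opening bracket without a partner is skipped, and only the
   balanced inner pair is returned *)
StringCases["<a<b>", RegularExpression["(?P<a><([^<>]|(?P>a))*>)"]]
(* {"<b>"} *)
```

The first match is the whole outer pair rather than the inner "<r>", because StringCases scans left to right and the pattern consumes the nested pair through the recursive call.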




Second Case


str2 = "dd9[ab*[c]d]esiddx(45x(b(x99))"
StringCases[str2,
RegularExpression["(?P<a>(\\[|\\()([^\\[\\]\\(\\)]|(?P>a))*(\\]|\\)))"]
]
(* {"[ab*[c]d]", "(b(x99))"} *)

This works as above. Here, instead of < at the beginning of the (sub)string, we allow for [ or ( with (\\[|\\(). The other modifications are in line with this change.


Note that this regular expression may not be satisfactory for cases such as


str3 = "dd9[ab*[c]d)esiddx(45x(b(x99))";

(* The square bracket after d is replaced by a parenthesis. *)

StringCases[str3,
RegularExpression["(?P<a>(\\[|\\()([^\\[\\]\\(\\)]|(?P>a))*(\\]|\\)))"]
]
(* {"[ab*[c]d)", "(b(x99))"} *)

The first element starts with [ but ends with ). This can be avoided by capturing the opening bracket in a named group and adding a conditional test on that group:


StringCases[str3,
RegularExpression["(?P<a>((?P<b>\\[)|\\()([^\\[\\]\\(\\)]|(?P>a))*(?(b)\\]|\\)))"]
]
(* {"[c]", "(b(x99))"} *)

The opening [ is captured as the named group b. The conditional (?(b)\\]|\\)) then requires ] as the closing character if b matched, and ) otherwise.
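For comparison, the same type-matching requirement can be written without regular expressions as a stack-based scan. This is a swapped-in technique, not the method of the answer above, and outerBrackets is a hypothetical helper name. It is a minimal sketch that assumes a mismatched or stray closing bracket invalidates every opener still pending, since no balanced span can cross it.

```mathematica
(* Hypothetical helper: returns the outermost bracket substrings in which
   every closer matches the type of its opener. *)
outerBrackets[s_String] := Module[
  {closerOf = <|"[" -> "]", "(" -> ")"|>,
   chars = Characters[s], stack = {}, found = {}, c, start},
  Do[
    c = chars[[i]];
    Which[
      KeyExistsQ[closerOf, c],
        (* opener: remember its type and position *)
        AppendTo[stack, {c, i}],
      MemberQ[Values[closerOf], c],
        If[stack =!= {} && closerOf[stack[[-1, 1]]] === c,
          (* matching closer: record the span, drop spans nested inside it *)
          start = stack[[-1, 2]];
          stack = Most[stack];
          found = Append[Select[found, First[#] < start &], {start, i}],
          (* mismatched or stray closer: assumed to reset the scan *)
          stack = {}
        ]
    ],
    {i, Length[chars]}
  ];
  StringTake[s, #] & /@ found
]

outerBrackets["dd9[ab*[c]d]esiddx(45x(b(x99))"]
(* {"[ab*[c]d]", "(b(x99))"} *)

outerBrackets["dd9[ab*[c]d)esiddx(45x(b(x99))"]
(* {"[c]", "(b(x99))"} *)
```

On str2 and str3 this returns the same lists as the regular expressions above. The scan visits each character once and keeps only the spans that are not nested inside a later, larger one.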

