
Extracting lists from list based on length


I have a list that looks like this:


{{"ADS GY Equity", "", "", "ALV GY Equity", "", "", "BAS GY Equity", 
"", "", "BAYN GY Equity", "", "", "BEI GY Equity", "", "",
"BMW GY Equity", ""}, {"Date", "PX_LAST", "", "Date", "PX_LAST", "",

"Date", "PX_LAST", "", "Date", "PX_LAST", "", "Date", "PX_LAST",
"", "Date", "PX_LAST"}, {"addidas", 17.925, "", "alvac", 287.292,
"", "basse", 24.875, "", "bayern", 42.34, "", "begge", 23.667, "",
"BMW", 29.49}, {{2000, 1, 4, 0, 0, 0.}, 17.5,
"", {2000, 1, 4, 0, 0, 0.}, 285.934, "", {2000, 1, 4, 0, 0, 0.},
23.925, "", {2000, 1, 4, 0, 0, 0.}, 41.216,
"", {2000, 1, 4, 0, 0, 0.}, 21.333, "", {2000, 1, 4, 0, 0, 0.},
28.3}, {{2000, 1, 5, 0, 0, 0.}, 17.5, "", {2000, 1, 5, 0, 0, 0.},
294.078, "", {2000, 1, 5, 0, 0, 0.}, 23.375,
"", {2000, 1, 5, 0, 0, 0.}, 40.176, "", {2000, 1, 5, 0, 0, 0.}, 21.,

"", {2000, 1, 5, 0, 0, 0.}, 27.74}, {{2000, 1, 6, 0, 0, 0.}, 18.25,
"", {2000, 1, 6, 0, 0, 0.}, 297.698, "", {2000, 1, 6, 0, 0, 0.},
24.015, "", {2000, 1, 6, 0, 0, 0.}, 41.356,
"", {2000, 1, 6, 0, 0, 0.}, 21.867, "", {2000, 1, 6, 0, 0, 0.},
27.65}, {{2000, 1, 7, 0, 0, 0.}, 18., "", {2000, 1, 7, 0, 0, 0.},
305.977, "", {2000, 1, 7, 0, 0, 0.}, 25.,
"", {2000, 1, 7, 0, 0, 0.}, 43.08, "", {2000, 1, 7, 0, 0, 0.},
22.33, "", {2000, 1, 7, 0, 0, 0.}, 27.6}, {{2000, 1, 10, 0, 0, 0.},
18.272, "", {2000, 1, 10, 0, 0, 0.}, 307.742,
"", {2000, 1, 10, 0, 0, 0.}, 25.11, "", {2000, 1, 10, 0, 0, 0.},

44.644, "", {2000, 1, 10, 0, 0, 0.}, 22.667,
"", {2000, 1, 10, 0, 0, 0.}, 28.7}, {{2000, 1, 11, 0, 0, 0.},
18.103, "", "", "", "", {2000, 1, 11, 0, 0, 0.}, 23.995,
"", {2000, 1, 11, 0, 0, 0.}, 43.323, "", {2000, 1, 11, 0, 0, 0.},
21.883, "", {2000, 1, 11, 0, 0, 0.},
28.6}, {{2000, 1, 12, 0, 0, 0.}, 18., "", "", "",
"", {2000, 1, 12, 0, 0, 0.}, 24., "", {2000, 1, 12, 0, 0, 0.},
41.45, "", {2000, 1, 12, 0, 0, 0.}, 21.717,
"", {2000, 1, 12, 0, 0, 0.}, 28.19}, {{2000, 1, 13, 0, 0, 0.},
17.462, "", "", "", "", {2000, 1, 13, 0, 0, 0.}, 23.75,

"", {2000, 1, 13, 0, 0, 0.}, 41.169, "", {2000, 1, 13, 0, 0, 0.},
22.163, "", {2000, 1, 13, 0, 0, 0.},
27.4}, {{2000, 1, 14, 0, 0, 0.}, 16.837, "", "", "",
"", {2000, 1, 14, 0, 0, 0.}, 23.75, "", {2000, 1, 14, 0, 0, 0.},
42.152, "", {2000, 1, 14, 0, 0, 0.}, 22.583,
"", {2000, 1, 14, 0, 0, 0.}, 27.2}, {{2000, 1, 17, 0, 0, 0.}, 17.,
"", "", "", "", {2000, 1, 17, 0, 0, 0.}, 23.935, "", "", "",
"", {2000, 1, 17, 0, 0, 0.}, 23.33, "", {2000, 1, 17, 0, 0, 0.},
27.28}, {{2000, 1, 18, 0, 0, 0.}, 17., "", "", "",
"", {2000, 1, 18, 0, 0, 0.}, 23.975, "", "", "",

"", {2000, 1, 18, 0, 0, 0.}, 22.667, "", {2000, 1, 18, 0, 0, 0.},
27.}, {{2000, 1, 19, 0, 0, 0.}, 16.65, "", "", "",
"", {2000, 1, 19, 0, 0, 0.}, 24.92, "", "", "",
"", {2000, 1, 19, 0, 0, 0.}, 23.3, "", {2000, 1, 19, 0, 0, 0.},
27.59}}

I want to extract every list that has length 14 or 15. This should leave me with eight lists, because I also want to drop the lists/columns that contain no values.


This is how the output should look after extraction:


  {{{"ADS GY Equity", "", "BAS GY Equity", "", "BEI GY Equity", "", 
"BMW GY Equity", ""}, {"Date", "PX_LAST", "Date", "PX_LAST",

"Date", "PX_LAST", "Date", "PX_LAST"}, {"addidas", 17.925, "basse",
24.875, "begge", 23.667, "BMW", 29.49}, {{2000, 1, 4, 0, 0, 0.},
17.5, {2000, 1, 4, 0, 0, 0.}, 23.925, {2000, 1, 4, 0, 0, 0.},
21.333, {2000, 1, 4, 0, 0, 0.}, 28.3}, {{2000, 1, 5, 0, 0, 0.},
17.5, {2000, 1, 5, 0, 0, 0.}, 23.375, {2000, 1, 5, 0, 0, 0.},
21., {2000, 1, 5, 0, 0, 0.}, 27.74}, {{2000, 1, 6, 0, 0, 0.},
18.25, {2000, 1, 6, 0, 0, 0.}, 24.015, {2000, 1, 6, 0, 0, 0.},
21.867, {2000, 1, 6, 0, 0, 0.}, 27.65}, {{2000, 1, 7, 0, 0, 0.},
18., {2000, 1, 7, 0, 0, 0.}, 25., {2000, 1, 7, 0, 0, 0.},
22.33, {2000, 1, 7, 0, 0, 0.}, 27.6}, {{2000, 1, 10, 0, 0, 0.},

18.272, {2000, 1, 10, 0, 0, 0.}, 25.11, {2000, 1, 10, 0, 0, 0.},
22.667, {2000, 1, 10, 0, 0, 0.}, 28.7}, {{2000, 1, 11, 0, 0, 0.},
18.103, {2000, 1, 11, 0, 0, 0.}, 23.995, {2000, 1, 11, 0, 0, 0.},
21.883, {2000, 1, 11, 0, 0, 0.}, 28.6}, {{2000, 1, 12, 0, 0, 0.},
18., {2000, 1, 12, 0, 0, 0.}, 24., {2000, 1, 12, 0, 0, 0.},
21.717, {2000, 1, 12, 0, 0, 0.}, 28.19}, {{2000, 1, 13, 0, 0, 0.},
17.462, {2000, 1, 13, 0, 0, 0.}, 23.75, {2000, 1, 13, 0, 0, 0.},
22.163, {2000, 1, 13, 0, 0, 0.}, 27.4}, {{2000, 1, 14, 0, 0, 0.},
16.837, {2000, 1, 14, 0, 0, 0.}, 23.75, {2000, 1, 14, 0, 0, 0.},
22.583, {2000, 1, 14, 0, 0, 0.}, 27.2}, {{2000, 1, 17, 0, 0, 0.},

17., {2000, 1, 17, 0, 0, 0.}, 23.935, {2000, 1, 17, 0, 0, 0.},
23.33, {2000, 1, 17, 0, 0, 0.}, 27.28}, {{2000, 1, 18, 0, 0, 0.},
17., {2000, 1, 18, 0, 0, 0.}, 23.975, {2000, 1, 18, 0, 0, 0.},
22.667, {2000, 1, 18, 0, 0, 0.}, 27.}, {{2000, 1, 19, 0, 0, 0.},
16.65, {2000, 1, 19, 0, 0, 0.}, 24.92, {2000, 1, 19, 0, 0, 0.},
23.3, {2000, 1, 19, 0, 0, 0.}, 27.59}}}

Put simply, the extraction keeps the date and closing price columns only for those companies whose number of closing prices equals the largest number of closing prices found for any company.



Answer



Starting with your data assigned to dat, please try this:



tdat = Partition[dat\[Transpose], 2, 3];

newdat = Join @@ DeleteCases[tdat, {{__, ""}, _}]\[Transpose];

I am assuming that any primary column ending with "" is short and should be pared from the table.


Sample:


newdat // MatrixForm

[Image: the resulting table newdat displayed with MatrixForm]


Explanation



As requested, here is an explanation of my code. Because of the format of tables in Mathematica, it is usually much easier to operate on rows than on columns, so my first step is to Transpose the data. (\[Transpose] is a special transpose symbol; see the documentation for Transpose.)


When looking at the data, a structure becomes apparent: there are two columns of data in each group, and a spacing or dividing column containing nothing but "". (Incidentally, these dividing columns are completely omitted in my output; if you need them, ask, and I'll put them back in.) I therefore Partition the data into groups of two columns, moving three places between each group and thereby skipping the "empty" dividing columns. I call this data tdat.
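To see the offset form of Partition in isolation, here is a minimal sketch on a toy list. The symbols a1, a2, gap and so on are placeholders invented for illustration; gap stands in for a dividing row.

```mathematica
(* Toy stand-in for the transposed table: two data rows, then a divider *)
rows = {a1, a2, gap, b1, b2, gap, c1, c2};

(* Sublists of length 2, advancing 3 positions each time,
   so every third element (the divider) is skipped *)
Partition[rows, 2, 3]
(* {{a1, a2}, {b1, b2}, {c1, c2}} *)
```

The third argument is the offset between the starts of successive sublists; making it larger than the sublist length is what leaves the dividers out.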


Now that the data is in a row-based form and grouped, we can use fairly simple pattern matching to extract only the full-length sections. It is simpler in this case to delete the unwanted ones, so I use DeleteCases. My pattern is:


{{__, ""}, _}

This pattern is matched against each group of two rows (originally columns). It is a literal list with two elements. The second is _, which is short for Blank[] and matches any single expression; I use this because I want to allow any second column. The first element is {__, ""}, which matches a list (here a row, originally a column) that starts with any sequence of one or more elements (__, short for BlankSequence[]) and ends with a literal "". We therefore match, and delete, any data group whose first row (column) ends with "".
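The pattern can be checked on its own with MatchQ. The two groups below are made-up miniatures of the real data, with d1 and d2 as placeholder symbols standing in for date lists.

```mathematica
pattern = {{__, ""}, _};

(* A short group: its first row ends with "" *)
short = {{"ALV", "Date", d1, ""}, {"", "PX_LAST", 285.934, ""}};

(* A full-length group: its first row ends with a value *)
full = {{"ADS", "Date", d1, d2}, {"", "PX_LAST", 17.5, 17.5}};

MatchQ[short, pattern]
(* True *)

MatchQ[full, pattern]
(* False *)

(* DeleteCases therefore keeps only the full-length group *)
DeleteCases[{short, full}, pattern] === {full}
(* True *)
```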


Finally, the data groups are joined into a single table using Join and @@ (short for Apply), then transposed back to column form. Note that the low precedence of the \[Transpose] operator means that this transpose is done after the Join.
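A small sketch of this last step, using placeholder symbols a through h in place of real data, shows the flattening by one level and the order in which the operators apply:

```mathematica
(* Two groups of two rows each *)
groups = {{{a, b}, {c, d}}, {{e, f}, {g, h}}};

(* Join @@ groups is Join[{{a, b}, {c, d}}, {{e, f}, {g, h}}] *)
joined = Join @@ groups
(* {{a, b}, {c, d}, {e, f}, {g, h}} *)

Transpose[joined]
(* {{a, c, e, g}, {b, d, f, h}} *)

(* The postfix transpose is applied after the Join, as noted above *)
Join @@ groups\[Transpose] === Transpose[Join @@ groups]
(* True *)
```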


By the way, if you are doing this on a large volume of data, there are likely to be faster methods than pattern matching, perhaps using Pick in place of the DeleteCases step. However, I find pattern matching more versatile and concise, and therefore the best place to start unless speed or large data is a priority.
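One way the Pick variant might look is sketched below. This is an assumption about what such a replacement could be, not a benchmarked method; toy is invented data in the same shape as tdat, and keep is a hypothetical helper name.

```mathematica
(* Toy grouped data shaped like tdat: {dateRow, priceRow} per company *)
toy = {
   {{"A", "Date", d1, d2}, {"", "PX_LAST", 10, 11}},
   {{"B", "Date", d1, ""}, {"", "PX_LAST", 20, ""}},
   {{"C", "Date", d1, d2}, {"", "PX_LAST", 30, 31}}};

(* True for each group whose first row does not end with "" *)
keep = (# =!= "" &) /@ toy[[All, 1, -1]]
(* {True, False, True} *)

(* Pick the groups flagged True, then join and transpose as before *)
Transpose[Join @@ Pick[toy, keep]]
(* {{"A", "", "C", ""}, {"Date", "PX_LAST", "Date", "PX_LAST"},
    {d1, 10, d1, 30}, {d2, 11, d2, 31}} *)
```

With the real data, toy would be replaced by tdat. The test looks only at the last element of each first row, which is the same criterion the pattern {{__, ""}, _} expresses.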

