I have a list that looks like this:
{{"ADS GY Equity", "", "", "ALV GY Equity", "", "", "BAS GY Equity",
"", "", "BAYN GY Equity", "", "", "BEI GY Equity", "", "",
"BMW GY Equity", ""}, {"Date", "PX_LAST", "", "Date", "PX_LAST", "",
"Date", "PX_LAST", "", "Date", "PX_LAST", "", "Date", "PX_LAST",
"", "Date", "PX_LAST"}, {"addidas", 17.925, "", "alvac", 287.292,
"", "basse", 24.875, "", "bayern", 42.34, "", "begge", 23.667, "",
"BMW", 29.49}, {{2000, 1, 4, 0, 0, 0.}, 17.5,
"", {2000, 1, 4, 0, 0, 0.}, 285.934, "", {2000, 1, 4, 0, 0, 0.},
23.925, "", {2000, 1, 4, 0, 0, 0.}, 41.216,
"", {2000, 1, 4, 0, 0, 0.}, 21.333, "", {2000, 1, 4, 0, 0, 0.},
28.3}, {{2000, 1, 5, 0, 0, 0.}, 17.5, "", {2000, 1, 5, 0, 0, 0.},
294.078, "", {2000, 1, 5, 0, 0, 0.}, 23.375,
"", {2000, 1, 5, 0, 0, 0.}, 40.176, "", {2000, 1, 5, 0, 0, 0.}, 21.,
"", {2000, 1, 5, 0, 0, 0.}, 27.74}, {{2000, 1, 6, 0, 0, 0.}, 18.25,
"", {2000, 1, 6, 0, 0, 0.}, 297.698, "", {2000, 1, 6, 0, 0, 0.},
24.015, "", {2000, 1, 6, 0, 0, 0.}, 41.356,
"", {2000, 1, 6, 0, 0, 0.}, 21.867, "", {2000, 1, 6, 0, 0, 0.},
27.65}, {{2000, 1, 7, 0, 0, 0.}, 18., "", {2000, 1, 7, 0, 0, 0.},
305.977, "", {2000, 1, 7, 0, 0, 0.}, 25.,
"", {2000, 1, 7, 0, 0, 0.}, 43.08, "", {2000, 1, 7, 0, 0, 0.},
22.33, "", {2000, 1, 7, 0, 0, 0.}, 27.6}, {{2000, 1, 10, 0, 0, 0.},
18.272, "", {2000, 1, 10, 0, 0, 0.}, 307.742,
"", {2000, 1, 10, 0, 0, 0.}, 25.11, "", {2000, 1, 10, 0, 0, 0.},
44.644, "", {2000, 1, 10, 0, 0, 0.}, 22.667,
"", {2000, 1, 10, 0, 0, 0.}, 28.7}, {{2000, 1, 11, 0, 0, 0.},
18.103, "", "", "", "", {2000, 1, 11, 0, 0, 0.}, 23.995,
"", {2000, 1, 11, 0, 0, 0.}, 43.323, "", {2000, 1, 11, 0, 0, 0.},
21.883, "", {2000, 1, 11, 0, 0, 0.},
28.6}, {{2000, 1, 12, 0, 0, 0.}, 18., "", "", "",
"", {2000, 1, 12, 0, 0, 0.}, 24., "", {2000, 1, 12, 0, 0, 0.},
41.45, "", {2000, 1, 12, 0, 0, 0.}, 21.717,
"", {2000, 1, 12, 0, 0, 0.}, 28.19}, {{2000, 1, 13, 0, 0, 0.},
17.462, "", "", "", "", {2000, 1, 13, 0, 0, 0.}, 23.75,
"", {2000, 1, 13, 0, 0, 0.}, 41.169, "", {2000, 1, 13, 0, 0, 0.},
22.163, "", {2000, 1, 13, 0, 0, 0.},
27.4}, {{2000, 1, 14, 0, 0, 0.}, 16.837, "", "", "",
"", {2000, 1, 14, 0, 0, 0.}, 23.75, "", {2000, 1, 14, 0, 0, 0.},
42.152, "", {2000, 1, 14, 0, 0, 0.}, 22.583,
"", {2000, 1, 14, 0, 0, 0.}, 27.2}, {{2000, 1, 17, 0, 0, 0.}, 17.,
"", "", "", "", {2000, 1, 17, 0, 0, 0.}, 23.935, "", "", "",
"", {2000, 1, 17, 0, 0, 0.}, 23.33, "", {2000, 1, 17, 0, 0, 0.},
27.28}, {{2000, 1, 18, 0, 0, 0.}, 17., "", "", "",
"", {2000, 1, 18, 0, 0, 0.}, 23.975, "", "", "",
"", {2000, 1, 18, 0, 0, 0.}, 22.667, "", {2000, 1, 18, 0, 0, 0.},
27.}, {{2000, 1, 19, 0, 0, 0.}, 16.65, "", "", "",
"", {2000, 1, 19, 0, 0, 0.}, 24.92, "", "", "",
"", {2000, 1, 19, 0, 0, 0.}, 23.3, "", {2000, 1, 19, 0, 0, 0.},
27.59}}
I want to extract every list (column) that has length 14 or 15. This should leave me with eight lists, because I also want to drop the lists/columns with no values in them.
This is how the output should look after extraction:
{{{"ADS GY Equity", "", "BAS GY Equity", "", "BEI GY Equity", "",
"BMW GY Equity", ""}, {"Date", "PX_LAST", "Date", "PX_LAST",
"Date", "PX_LAST", "Date", "PX_LAST"}, {"addidas", 17.925, "basse",
24.875, "begge", 23.667, "BMW", 29.49}, {{2000, 1, 4, 0, 0, 0.},
17.5, {2000, 1, 4, 0, 0, 0.}, 23.925, {2000, 1, 4, 0, 0, 0.},
21.333, {2000, 1, 4, 0, 0, 0.}, 28.3}, {{2000, 1, 5, 0, 0, 0.},
17.5, {2000, 1, 5, 0, 0, 0.}, 23.375, {2000, 1, 5, 0, 0, 0.},
21., {2000, 1, 5, 0, 0, 0.}, 27.74}, {{2000, 1, 6, 0, 0, 0.},
18.25, {2000, 1, 6, 0, 0, 0.}, 24.015, {2000, 1, 6, 0, 0, 0.},
21.867, {2000, 1, 6, 0, 0, 0.}, 27.65}, {{2000, 1, 7, 0, 0, 0.},
18., {2000, 1, 7, 0, 0, 0.}, 25., {2000, 1, 7, 0, 0, 0.},
22.33, {2000, 1, 7, 0, 0, 0.}, 27.6}, {{2000, 1, 10, 0, 0, 0.},
18.272, {2000, 1, 10, 0, 0, 0.}, 25.11, {2000, 1, 10, 0, 0, 0.},
22.667, {2000, 1, 10, 0, 0, 0.}, 28.7}, {{2000, 1, 11, 0, 0, 0.},
18.103, {2000, 1, 11, 0, 0, 0.}, 23.995, {2000, 1, 11, 0, 0, 0.},
21.883, {2000, 1, 11, 0, 0, 0.}, 28.6}, {{2000, 1, 12, 0, 0, 0.},
18., {2000, 1, 12, 0, 0, 0.}, 24., {2000, 1, 12, 0, 0, 0.},
21.717, {2000, 1, 12, 0, 0, 0.}, 28.19}, {{2000, 1, 13, 0, 0, 0.},
17.462, {2000, 1, 13, 0, 0, 0.}, 23.75, {2000, 1, 13, 0, 0, 0.},
22.163, {2000, 1, 13, 0, 0, 0.}, 27.4}, {{2000, 1, 14, 0, 0, 0.},
16.837, {2000, 1, 14, 0, 0, 0.}, 23.75, {2000, 1, 14, 0, 0, 0.},
22.583, {2000, 1, 14, 0, 0, 0.}, 27.2}, {{2000, 1, 17, 0, 0, 0.},
17., {2000, 1, 17, 0, 0, 0.}, 23.935, {2000, 1, 17, 0, 0, 0.},
23.33, {2000, 1, 17, 0, 0, 0.}, 27.28}, {{2000, 1, 18, 0, 0, 0.},
17., {2000, 1, 18, 0, 0, 0.}, 23.975, {2000, 1, 18, 0, 0, 0.},
22.667, {2000, 1, 18, 0, 0, 0.}, 27.}, {{2000, 1, 19, 0, 0, 0.},
16.65, {2000, 1, 19, 0, 0, 0.}, 24.92, {2000, 1, 19, 0, 0, 0.},
23.3, {2000, 1, 19, 0, 0, 0.}, 27.59}}}
To put it simply: the result keeps the date and closing-price columns only for those companies whose number of closing prices equals the largest number of closing prices recorded for any company.
Answer
Starting with your data assigned to dat, please try this:
tdat = Partition[dat\[Transpose], 2, 3];
newdat = Join @@ DeleteCases[tdat, {{__, ""}, _}]\[Transpose];
I am assuming that any primary (date) column ending with "" is short and should be pared from the table.
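To make the two steps easier to follow, here is a scaled-down sketch on a made-up table. The name toyDat and its values are hypothetical and not taken from the question; it has two company blocks separated by one empty column, and the second block is short.

```mathematica
(* Hypothetical 4-row table: block "A" is complete, block "B" is short *)
toyDat = {
   {"A", "", "", "B", ""},
   {"Date", "PX", "", "Date", "PX"},
   {1, 10., "", 1, 20.},
   {2, 11., "", "", ""}
   };

(* Same two steps as above, applied to the toy table *)
toyT = Partition[toyDat\[Transpose], 2, 3];
toyNew = Join @@ DeleteCases[toyT, {{__, ""}, _}]\[Transpose]
(* {{"A", ""}, {"Date", "PX"}, {1, 10.}, {2, 11.}} *)
```

Only the "A" block survives, because the date column of the "B" block ends with "".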
Sample:
newdat // MatrixForm
Explanation
As requested, here is an explanation of my code. Because of the format of tables in Mathematica, it is usually much easier to operate on rows than on columns, so my first step is to Transpose the data. (\[Transpose] is a special transpose symbol; see the linked documentation.)
Looking at the data, a structure becomes apparent: each group has two columns of data, followed by a spacing or dividing column containing nothing but "". (Incidentally, these dividing columns are completely omitted in my output; if you need them, ask, and I'll put them back in.) I therefore Partition the data into groups of two columns, moving three places between each group and thereby skipping the "empty" dividing columns. I call this data tdat.
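As a small illustration of the length-2, offset-3 partitioning, consider a flat list in which each made-up label stands for one transposed column (the names are hypothetical):

```mathematica
(* Every third entry is an "empty" divider *)
rows = {"a1", "a2", "", "b1", "b2", "", "c1", "c2"};

(* Take 2 elements, then advance 3 positions, so the dividers are skipped *)
groups = Partition[rows, 2, 3]
(* {{"a1", "a2"}, {"b1", "b2"}, {"c1", "c2"}} *)
```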
Now that the data is in a row-based form and grouped, we can use fairly simple pattern matching to extract only the full-length sections. It is simpler in this case to delete the unwanted ones, so I use DeleteCases. My pattern is:
{{__, ""}, _}
This pattern will be matched to each group of two rows (originally columns). It is a literal list with two elements. The second is _, which is short for Blank[] and matches any single expression; I use this because I want to allow any second column. The first element is {__, ""}, which matches a list (here a row, originally a column) that starts with any series of elements (__, short for BlankSequence[]) and ends with a literal "". We therefore match any data group whose first row (column) ends with "" and delete it.
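The pattern can be tried in isolation with MatchQ. The two groups below are made-up miniatures of a short and a complete data group:

```mathematica
pattern = {{__, ""}, _};

(* Hypothetical short group: the date row ends with "" *)
shortGroup = {{"Date", {2000, 1, 4}, ""}, {"PX_LAST", 1.0, ""}};

(* Hypothetical complete group: the date row ends with a date *)
fullGroup = {{"Date", {2000, 1, 4}, {2000, 1, 5}}, {"PX_LAST", 1.0, 2.0}};

MatchQ[shortGroup, pattern]  (* True, so DeleteCases would remove it *)
MatchQ[fullGroup, pattern]   (* False, so it is kept *)
```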
Finally, the data groups are joined into a single table using Join and @@ (short for Apply), then transposed back to column form. Note that the low precedence of the \[Transpose] operator means that this transpose is done after the Join.
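The order of operations matters here. The sketch below uses a hypothetical 2x2x2 array m to show that the two possible groupings give different results; given the precedence described above, the postfix form should agree with the first one.

```mathematica
(* Hypothetical array standing in for the grouped data *)
m = {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}};

joinedFirst = Transpose[Join @@ m]
(* {{1, 3, 5, 7}, {2, 4, 6, 8}} *)

transposedFirst = Join @@ Transpose[m]
(* {{1, 2}, {5, 6}, {3, 4}, {7, 8}} *)

(* Compare the postfix form with the join-then-transpose grouping *)
(Join @@ m\[Transpose]) === joinedFirst
```

If in doubt, writing Transpose[Join @@ ...] explicitly avoids relying on operator precedence.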
By the way, if you are doing this on a large volume of data there will be faster methods than pattern matching, perhaps using Pick in place of the DeleteCases step. However, I find pattern matching more versatile and concise, and therefore the best place to start unless speed or large data is a priority.
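One possible shape for a Pick-based variant is sketched below on a made-up table. The names pickDat, pickT, keep and pickNew are hypothetical, and whether this is actually faster on large data has not been measured here.

```mathematica
(* Hypothetical 4-row table: block "A" is complete, block "B" is short *)
pickDat = {
   {"A", "", "", "B", ""},
   {"Date", "PX", "", "Date", "PX"},
   {1, 10., "", 1, 20.},
   {2, 11., "", "", ""}
   };

pickT = Partition[pickDat\[Transpose], 2, 3];

(* True for each group whose first row does not end with "" *)
keep = (Last[First[#]] =!= "" &) /@ pickT;

(* Pick keeps the groups flagged True; then join and transpose back *)
pickNew = Join @@ Pick[pickT, keep]\[Transpose]
(* {{"A", ""}, {"Date", "PX"}, {1, 10.}, {2, 11.}} *)
```

This applies the same assumption as the pattern: a group is dropped when the last entry of its first (date) row is "".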