
import - Plotting data with exponentials


I have some data consisting of 19000 sublists such as:


{"   7.9080000e+01   1.9283193e+04"}

Where the first number is the value for variable A and the second for variable B.


All my attempts to transform this format have failed so far. My best guess was to use ToExpression, but that did not work.


How can I transform such lists into a "plottable" format, by



  • Changing the String format?


  • Computing the e (the exponent)?

  • Importing the data differently?



Answer



You should be able to use ReadList on the string contents of each sublist. Here I'm just creating a small list containing three elements like the one you provided. The result can be plotted using ListPlot, for example.


In[20]:= in = {{"   7.9080000e+01   1.9283193e+04"}, 
{" 7.9080000e+01 1.9283193e+04"},
{" 7.9080000e+01 1.9283193e+04"}};

In[22]:= Table[ReadList[StringToStream@First[i], Number], {i, in}]


Out[22]= {{79.08, 19283.2}, {79.08, 19283.2}, {79.08, 19283.2}}
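As a minimal sketch of the plotting step, the parsed pairs can be passed straight to ListPlot. The helper name readNumbers and the axis labels are my own choices for illustration, not part of the original answer; the helper also closes each stream it opens, which may be worth doing when there are 19000 strings to parse.

```mathematica
(* Hypothetical helper: parse one string of whitespace-separated numbers,
   then close the stream that StringToStream opened *)
readNumbers[s_String] := Module[{stream = StringToStream[s], result},
  result = ReadList[stream, Number];
  Close[stream];
  result
]

(* Same shape as the sample data: each sublist holds a single string *)
in = {{"   7.9080000e+01   1.9283193e+04"},
      {" 7.9080000e+01 1.9283193e+04"},
      {" 7.9080000e+01 1.9283193e+04"}};

pairs = readNumbers[First[#]] & /@ in;

(* Plot variable B against variable A *)
ListPlot[pairs, AxesLabel -> {"A", "B"}]
```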

EDIT:


Following up on the comments, I should point out that this Table produces an array that is not packed. This means the evaluator does not know ahead of time that all of the values share one type (real, in this case), so it leans toward more general methods and uses more memory to store the table.


As the documentation for Developer`ToPackedArray points out, using Developer`ToPackedArray will not change results generated by Mathematica, but can enhance speed of execution and reduce memory usage.


To pack the result, we can simply follow ruebenko's suggestion and place Developer`ToPackedArray@ in front of our Table.
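A short sketch of that suggestion, assuming `in` is the list of single-string sublists defined earlier. Packing should only succeed if every row has the same length and every entry has the same numeric type, so the check with Developer`PackedArrayQ is included as a sanity test rather than a guarantee.

```mathematica
(* Wrap the Table from above in Developer`ToPackedArray *)
packed = Developer`ToPackedArray[
   Table[ReadList[StringToStream@First[i], Number], {i, in}]];

(* Check whether the result is actually packed *)
Developer`PackedArrayQ[packed]

(* Memory used by the result, for comparison with the unpacked Table *)
ByteCount[packed]
```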


TESTING EDIT:


I decided to test whether the ImportString approach proposed by Mr. Wizard or the ReadList approach is faster. To be fair, I separated out the ExportString step, presuming that the string would already be saved somewhere for importing. ReadList appears to be much faster (about 0.08 s versus 4.1 s), at least for the fabricated example I've created here. I'd be curious to see whether this also holds for 500's data.


In[21]:= data = Table["   7.9080000e+01   1.9283193e+04", {5000}];


In[22]:= Export["numbers.txt", data];

In[23]:= in = Partition[ReadList[StringToStream@Import["numbers.txt",
"Plaintext"], Record], 1];

In[24]:= (andyr = Table[ReadList[StringToStream@First[i], Number]
, {i, in}]); // AbsoluteTiming

Out[24]= {0.0780015, Null}


In[25]:= str = ExportString[in, "Table"];

In[26]:= (mrwiz = ImportString[str, "Table"]); // AbsoluteTiming

Out[26]= {4.1340795, Null}

In[27]:= andyr === mrwiz

Out[27]= True


I should also point out that this comparison is only fair if we assume that the data is already in memory. If it is not, the cost of importing should be factored into the ReadList approach.
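One way to fold the import cost into the measurement might be to let ReadList read the file directly, so that reading and parsing are timed together. This is only a sketch under an assumption: that numbers.txt holds one whitespace-separated pair of numbers per line, as in the fabricated example above. I have not timed it here.

```mathematica
(* Assumes numbers.txt contains one whitespace-separated pair per line.
   {Number, Number} makes ReadList return the values grouped in pairs. *)
(fromFile = ReadList["numbers.txt", {Number, Number}]); // AbsoluteTiming
```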

