Skip to main content

Is it possible to import dates and times directly as AbsoluteTime and by pass DateLists?



Date and time calculations are quite slow so I am exploring ways to minimize the number of calculations performed. As an example:


msft = FinancialData["MSFT", "Jan 1, 2005"]
Timing[DateListPlot[msft]]
{0.253632, *picture*}

whereas


Timing[
msft1 = msft;
msft1[[All, 1]] = AbsoluteTime /@ msft[[All, 1]];
DateListPlot[msft1]

]
{0.214658, *picture*}

So it is faster to convert to AbsoluteTime and make a plot than to stick with date lists (as a side issue this can actually be done even faster with ListPlot or ListLinePlot and a tick function).


When wanting to work with date lists to select subsets of data you will typically choose a search criteria in which the absolute time of elements is tested against some condition. So this is another example where it is useful to already have the data in absolute time.


So the question is whether it is possible to import dates directly into Mathematica and have them as absolute times rather than date lists. My two specific examples would be FinancialData as per the example above -- can this be made to output absolute times via some configuration rather than having to convert the data post import? Also for Excel spreadsheets containing columns with dates (i.e. date strings as per Excel formats) and data. Can these be imported and converted to absolute times during the import process instead of converted to, and output as, date lists?


(If solutions to this problem means that Import and FinancialData internals take date lists and convert to absolute time then I don't really see that as a solution. I am seeking ways to go direct to absolute time.)



Answer



The data in the file test.xls are




03/Jan/2000   45.46
04/Jan/2000 43.92
05/Jan/2000 44.38
06/Jan/2000 42.9
07/Jan/2000 43.46
10/Jan/2000 43.78
11/Jan/2000 42.65
12/Jan/2000 41.26
13/Jan/2000 42.04
14/Jan/2000 43.78


An alternative approach is to exploit the fact that Office documents are zipped collections of XML files. So,


Step 1: rename the source file by adding .zip to the file name: test.xlsx.zip.


Step 2: Import the appropriate xml file in the zip file, extract the data elements and re-format:


 Cases[Import["C:\\ your directory \\test.xlsx.zip", {"ZIP", "xl\\worksheets\\sheet1.xml"}], 
XMLElement["v", {}, {value_}] :> value, Infinity]
// Partition[#, 2] &

This gives:




{{"36528", "45.46"}, {"36529", "43.92"}, {"36530", "44.38"}, {"36531", "42.9"}, {"36532", "43.46"}, {"36535", "43.78"}, {"36536", "42.65"}, {"36537", "41.26"}, {"36538", "42.04"}, {"36539", "43.78"}}



where the first entry in each sublist is Excel's DATEVALUE (serial date number that counts the number of days from 1/1/1900).


Puzzle: I would expect that converting Excel's DATEVALUE to Mathematica's AbsoluteTime (number of seconds from 1/1/1900) would be as simple as multiplying the former by 24*60*60. But doing that with:


  excelDateValues = {"36528", "36529", "36530", "36531", "36532", "36535", "36536", "36537", "36538", "36539"}

and


  DateList /@ (24*60*60*ToExpression@excelDateValues)

gives




{{2000, 1, 5, 0, 0, 0.}, {2000, 1, 6, 0, 0, 0.}, {2000, 1, 7, 0, 0, 0.}, 
{2000, 1, 8, 0, 0, 0.}, {2000, 1, 9, 0, 0, 0.}, {2000, 1, 12, 0, 0, 0.},
{2000, 1, 13, 0, 0, 0.}, {2000, 1, 14, 0, 0, 0.}, {2000, 1, 15, 0, 0, 0.},
{2000, 1, 16, 0, 0, 0.}}

which is off by two days. Hopefully, there is a less naive approach to the get the right conversion factor to go from excel Datevalues to Mma AbsoluteTime so that a modified version of Cases[] above gives the desired result.


Puzzle resolved: Thanks to Mr.Wizard's reference, the historical background to the two-day discrepancy is explained beautifully in Joel Spolsky's great story . So, unless your data does contain dates going back early 1900's for most cases just subtracting 2 from final output dates should be ok. But ... things can get more complicated considering possible excel date system settings and varying defaults accross OSs. (see XL 1900 and 1904 date systems)


EDIT: Import uses the filename extension if no format is provided as the second argument. For zip files it returns the filenames in the zipped archive. For the example case


   Import["C:\\ your directory \\test.xlsx.zip"]


returns



 {"[Content_Types].xml", "_rels\\.rels","xl\\_rels\\workbook.xml.rels", "xl\\workbook.xml", "xl\\styles.xml",  "xl\\worksheets\\sheet1.xml", "xl\\theme\\theme1.xml", "customXml\\item1.xml", "customXml\\_rels\\item1.xml.rels", "customXml\\_rels\\item2.xml.rels", "docProps\\app.xml", "customXml\\itemProps2.xml", "customXml\\item2.xml", "customXml\\itemProps1.xml", "docProps\\core.xml"}

Comments

Popular posts from this blog

plotting - Plot 4D data with color as 4th dimension

I have a list of 4D data (x position, y position, amplitude, wavelength). I want to plot x, y, and amplitude on a 3D plot and have the color of the points correspond to the wavelength. I have seen many examples using functions to define color but my wavelength cannot be expressed by an analytic function. Is there a simple way to do this? Answer Here a another possible way to visualize 4D data: data = Flatten[Table[{x, y, x^2 + y^2, Sin[x - y]}, {x, -Pi, Pi,Pi/10}, {y,-Pi,Pi, Pi/10}], 1]; You can use the function Point along with VertexColors . Now the points are places using the first three elements and the color is determined by the fourth. In this case I used Hue, but you can use whatever you prefer. Graphics3D[ Point[data[[All, 1 ;; 3]], VertexColors -> Hue /@ data[[All, 4]]], Axes -> True, BoxRatios -> {1, 1, 1/GoldenRatio}]

plotting - Mathematica: 3D plot based on combined 2D graphs

I have several sigmoidal fits to 3 different datasets, with mean fit predictions plus the 95% confidence limits (not symmetrical around the mean) and the actual data. I would now like to show these different 2D plots projected in 3D as in but then using proper perspective. In the link here they give some solutions to combine the plots using isometric perspective, but I would like to use proper 3 point perspective. Any thoughts? Also any way to show the mean points per time point for each series plus or minus the standard error on the mean would be cool too, either using points+vertical bars, or using spheres plus tubes. Below are some test data and the fit function I am using. Note that I am working on a logit(proportion) scale and that the final vertical scale is Log10(percentage). (* some test data *) data = Table[Null, {i, 4}]; data[[1]] = {{1, -5.8}, {2, -5.4}, {3, -0.8}, {4, -0.2}, {5, 4.6}, {1, -6.4}, {2, -5.6}, {3, -0.7}, {4, 0.04}, {5, 1.0}, {1, -6.8}, {2, -4.7}, {3, -1....

functions - Get leading series expansion term?

Given a function f[x] , I would like to have a function leadingSeries that returns just the leading term in the series around x=0 . For example: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x)] x and leadingSeries[(1/x + 2 + (1 - 1/x^3)/4)/(4 + x)] -(1/(16 x^3)) Is there such a function in Mathematica? Or maybe one can implement it efficiently? EDIT I finally went with the following implementation, based on Carl Woll 's answer: lds[ex_,x_]:=( (ex/.x->(x+O[x]^2))/.SeriesData[U_,Z_,L_List,Mi_,Ma_,De_]:>SeriesData[U,Z,{L[[1]]},Mi,Mi+1,De]//Quiet//Normal) The advantage is, that this one also properly works with functions whose leading term is a constant: lds[Exp[x],x] 1 Answer Update 1 Updated to eliminate SeriesData and to not return additional terms Perhaps you could use: leadingSeries[expr_, x_] := Normal[expr /. x->(x+O[x]^2) /. a_List :> Take[a, 1]] Then for your examples: leadingSeries[(1/x + 2)/(4 + 1/x^2 + x), x] leadingSeries[Exp[x], x] leadingSeries[(1/x + 2 + (1 - 1/x...