Skip to main content

Is it possible to import dates and times directly as AbsoluteTime and by pass DateLists?



Date and time calculations are quite slow so I am exploring ways to minimize the number of calculations performed. As an example:


msft = FinancialData["MSFT", "Jan 1, 2005"]
Timing[DateListPlot[msft]]
{0.253632, *picture*}

whereas


Timing[
msft1 = msft;
msft1[[All, 1]] = AbsoluteTime /@ msft[[All, 1]];
DateListPlot[msft1]

]
{0.214658, *picture*}

So it is faster to convert to AbsoluteTime and make a plot than to stick with date lists (as a side issue this can actually be done even faster with ListPlot or ListLinePlot and a tick function).


When wanting to work with date lists to select subsets of data you will typically choose a search criteria in which the absolute time of elements is tested against some condition. So this is another example where it is useful to already have the data in absolute time.


So the question is whether it is possible to import dates directly into Mathematica and have them as absolute times rather than date lists. My two specific examples would be FinancialData as per the example above -- can this be made to output absolute times via some configuration rather than having to convert the data post import? Also for Excel spreadsheets containing columns with dates (i.e. date strings as per Excel formats) and data. Can these be imported and converted to absolute times during the import process instead of converted to, and output as, date lists?


(If solutions to this problem means that Import and FinancialData internals take date lists and convert to absolute time then I don't really see that as a solution. I am seeking ways to go direct to absolute time.)



Answer



The data in the file test.xls are




03/Jan/2000   45.46
04/Jan/2000 43.92
05/Jan/2000 44.38
06/Jan/2000 42.9
07/Jan/2000 43.46
10/Jan/2000 43.78
11/Jan/2000 42.65
12/Jan/2000 41.26
13/Jan/2000 42.04
14/Jan/2000 43.78


An alternative approach is to exploit the fact that Office documents are zipped collections of XML files. So,


Step 1: rename the source file by adding .zip to the file name: test.xlsx.zip.


Step 2: Import the appropriate xml file in the zip file, extract the data elements and re-format:


 Cases[Import["C:\\ your directory \\test.xlsx.zip", {"ZIP", "xl\\worksheets\\sheet1.xml"}], 
XMLElement["v", {}, {value_}] :> value, Infinity]
// Partition[#, 2] &

This gives:




{{"36528", "45.46"}, {"36529", "43.92"}, {"36530", "44.38"}, {"36531", "42.9"}, {"36532", "43.46"}, {"36535", "43.78"}, {"36536", "42.65"}, {"36537", "41.26"}, {"36538", "42.04"}, {"36539", "43.78"}}



where the first entry in each sublist is Excel's DATEVALUE (serial date number that counts the number of days from 1/1/1900).


Puzzle: I would expect that converting Excel's DATEVALUE to Mathematica's AbsoluteTime (number of seconds from 1/1/1900) would be as simple as multiplying the former by 24*60*60. But doing that with:


  excelDateValues = {"36528", "36529", "36530", "36531", "36532", "36535", "36536", "36537", "36538", "36539"}

and


  DateList /@ (24*60*60*ToExpression@excelDateValues)

gives




{{2000, 1, 5, 0, 0, 0.}, {2000, 1, 6, 0, 0, 0.}, {2000, 1, 7, 0, 0, 0.}, 
{2000, 1, 8, 0, 0, 0.}, {2000, 1, 9, 0, 0, 0.}, {2000, 1, 12, 0, 0, 0.},
{2000, 1, 13, 0, 0, 0.}, {2000, 1, 14, 0, 0, 0.}, {2000, 1, 15, 0, 0, 0.},
{2000, 1, 16, 0, 0, 0.}}

which is off by two days. Hopefully, there is a less naive approach to the get the right conversion factor to go from excel Datevalues to Mma AbsoluteTime so that a modified version of Cases[] above gives the desired result.


Puzzle resolved: Thanks to Mr.Wizard's reference, the historical background to the two-day discrepancy is explained beautifully in Joel Spolsky's great story . So, unless your data does contain dates going back early 1900's for most cases just subtracting 2 from final output dates should be ok. But ... things can get more complicated considering possible excel date system settings and varying defaults accross OSs. (see XL 1900 and 1904 date systems)


EDIT: Import uses the filename extension if no format is provided as the second argument. For zip files it returns the filenames in the zipped archive. For the example case


   Import["C:\\ your directory \\test.xlsx.zip"]


returns



 {"[Content_Types].xml", "_rels\\.rels","xl\\_rels\\workbook.xml.rels", "xl\\workbook.xml", "xl\\styles.xml",  "xl\\worksheets\\sheet1.xml", "xl\\theme\\theme1.xml", "customXml\\item1.xml", "customXml\\_rels\\item1.xml.rels", "customXml\\_rels\\item2.xml.rels", "docProps\\app.xml", "customXml\\itemProps2.xml", "customXml\\item2.xml", "customXml\\itemProps1.xml", "docProps\\core.xml"}

Comments

Popular posts from this blog

plotting - Filling between two spheres in SphericalPlot3D

Manipulate[ SphericalPlot3D[{1, 2 - n}, {θ, 0, Pi}, {ϕ, 0, 1.5 Pi}, Mesh -> None, PlotPoints -> 15, PlotRange -> {-2.2, 2.2}], {n, 0, 1}] I cant' seem to be able to make a filling between two spheres. I've already tried the obvious Filling -> {1 -> {2}} but Mathematica doesn't seem to like that option. Is there any easy way around this or ... Answer There is no built-in filling in SphericalPlot3D . One option is to use ParametricPlot3D to draw the surfaces between the two shells: Manipulate[ Show[SphericalPlot3D[{1, 2 - n}, {θ, 0, Pi}, {ϕ, 0, 1.5 Pi}, PlotPoints -> 15, PlotRange -> {-2.2, 2.2}], ParametricPlot3D[{ r {Sin[t] Cos[1.5 Pi], Sin[t] Sin[1.5 Pi], Cos[t]}, r {Sin[t] Cos[0 Pi], Sin[t] Sin[0 Pi], Cos[t]}}, {r, 1, 2 - n}, {t, 0, Pi}, PlotStyle -> Yellow, Mesh -> {2, 15}]], {n, 0, 1}]

plotting - Plot 4D data with color as 4th dimension

I have a list of 4D data (x position, y position, amplitude, wavelength). I want to plot x, y, and amplitude on a 3D plot and have the color of the points correspond to the wavelength. I have seen many examples using functions to define color but my wavelength cannot be expressed by an analytic function. Is there a simple way to do this? Answer Here a another possible way to visualize 4D data: data = Flatten[Table[{x, y, x^2 + y^2, Sin[x - y]}, {x, -Pi, Pi,Pi/10}, {y,-Pi,Pi, Pi/10}], 1]; You can use the function Point along with VertexColors . Now the points are places using the first three elements and the color is determined by the fourth. In this case I used Hue, but you can use whatever you prefer. Graphics3D[ Point[data[[All, 1 ;; 3]], VertexColors -> Hue /@ data[[All, 4]]], Axes -> True, BoxRatios -> {1, 1, 1/GoldenRatio}]

plotting - Adding a thick curve to a regionplot

Suppose we have the following simple RegionPlot: f[x_] := 1 - x^2 g[x_] := 1 - 0.5 x^2 RegionPlot[{y < f[x], f[x] < y < g[x], y > g[x]}, {x, 0, 2}, {y, 0, 2}] Now I'm trying to change the curve defined by $y=g[x]$ into a thick black curve, while leaving all other boundaries in the plot unchanged. I've tried adding the region $y=g[x]$ and playing with the plotstyle, which didn't work, and I've tried BoundaryStyle, which changed all the boundaries in the plot. Now I'm kinda out of ideas... Any help would be appreciated! Answer With f[x_] := 1 - x^2 g[x_] := 1 - 0.5 x^2 You can use Epilog to add the thick line: RegionPlot[{y < f[x], f[x] < y < g[x], y > g[x]}, {x, 0, 2}, {y, 0, 2}, PlotPoints -> 50, Epilog -> (Plot[g[x], {x, 0, 2}, PlotStyle -> {Black, Thick}][[1]]), PlotStyle -> {Directive[Yellow, Opacity[0.4]], Directive[Pink, Opacity[0.4]],