I have a list of list as below (I show part of it)
{{1902, 0.4662}, {1903, 0.22443}, {1905, 0.02936}, {1906,
0.02702}, {1908, -0.08354}, {1909, -0.05241}, {1911,
0.02388}, {1912, 0.03738}, {1914, 0.25015}, {1915, 0.2831}, {1917,
0.4415}, {1919, 0.18315}, {1921, 0.2256}, {1923, 0.24132}, {1926,
0.21473}, {1928, 0.29596}, {1930, 0.47693}, {1933, 0.41607}, {1935,
0.22161}, {1937, 0.3322}, {1940, 0.2099}, {1942, 0.23376}, {1944,
0.44114}, {1947, 0.15876}, {1949, 0.43953}, {1951, 0.71407}, {1954,
0.9595}, {1956, 0.59436}, {2000, 0.6832}, {2004, 0.86861}, {2007,
0.48201}, {2011, 0.70796}, {2015, 0.57029}, {2020, 0.61997}, {2026,
0.79266}, {2032, 0.78726}, {2038, 0.83884}}
For example, in {1902,0.4662} "1902" represent time 19:02 and 0.4662 represent the data at time 19:02.
What I want to do is to calculate average of the data in every 5 minutes. That is from 19:02 to 19:06, from 19:07 to 19:11. Notice the time step is not evenly distributed.
Histogram
can naturally count how many of the data are in each interval. I want to partition the list like Histogram
and then I can calculate average in each interval. And notice that the data may not be in a single day.
I can't figure out an elegant way to do this with built-in function. Can somebody help me?
Besides I think this is a very simple statistical calculation of data. I want to know what software could do this easily and directly.
Update: the full data sample is here http://en.textsave.org/VdL with date information
Answer
One approach:
Create intervals from dataset;
int = Table[Interval[{j, j + 4}], {j, 1902, 2038, 5}];
Calling your dataset
data
the means for the desired intervals can be obtained:Mean /@ GatherBy[data, IntervalMemberQ[int, #[[1]]] &][[All, All, 2]]
yielding:
{0.186753, -0.0373567, 0.19021, 0.283417, 0.228025, 0.386445,
0.31884, 0.27105, 0.33745, 0.437453, 0.77693, 0.6832, 0.86861,
0.594985, 0.57029, 0.61997, 0.79266, 0.78726, 0.83884}
EDIT
Mr. Wizard correctly pointed out the error in my code. My first edit was wrong. The easiest approach (it seems to me) is to convert times to temporal data.
f[x_] := {2013, 9, 28, IntegerPart[x/100],
100 (x/100 - IntegerPart[x/100])}
Then using TemporalData
td = TemporalData[{f[#[[1]]], #[[2]]} & /@ data];
answ = TemporalData`Aggregate[td, {5, "Minute"}]
The default function applied to the partitioned temporal data is mean.
This can be visualised:
DateListPlot[{td["Path"], answ["Path"]}, Joined -> {False, True}]
A good post is here.
Comments
Post a Comment