After a lot of trial and error, I arrived at a function that I think is fast enough for general data-science work (that is, applicable to millions of records).

At first, I tried to create two interpolation functions: one for the gap itself (a zero-order interpolation function), and one for the values (the `"PathFunction"` itself). But Mathematica's `Interpolation` function needs some revamping, both in speed and in options. For zero order, only `Ceiling` behaviour is available. That's OK, since we can "slide" the data and cover the `Floor` cases with a `MemberQ` test. I haven't tried all possible variations of this technique, and I might eventually have gotten better results with memoization (pre-building the `InterpolatingFunction`), but all versions felt very cumbersome.

But then I remembered Leonid's post. This led me to considerable speedups, even without any special memoization. This is the adapted version (not compiled):

```
timeSeriesInterpolationGap[ts_, moment_, gap_] :=
 Module[{n0, n1, m, pos, times = ts["Times"], values = ts["Values"],
   timeMoment, gapTime},
  (* accept a DateObject or a plain absolute time *)
  timeMoment = If[DateObjectQ[moment], AbsoluteTime[moment], moment];
  (* accept a Quantity gap (converted to seconds) or a plain number *)
  gapTime =
   If[QuantityQ[gap], QuantityMagnitude@UnitConvert[gap, "Seconds"], gap];
  Which[
   times[[1]] <= timeMoment < times[[-1]],
   (* binary search for the last timestamp at or below timeMoment *)
   n0 = 1; n1 = Length[times];
   While[n0 <= n1,
    m = Floor[(n0 + n1)/2];
    If[times[[m]] == timeMoment, Break[]];
    If[times[[m]] < timeMoment, n0 = m + 1, n1 = m - 1]];
   pos = If[times[[m]] < timeMoment, m, m - 1];
   (* interpolate linearly only if the surrounding gap is small enough *)
   If[times[[pos + 1]] - times[[pos]] <= gapTime,
    values[[pos]] + (values[[pos + 1]] - values[[pos]])/
       (times[[pos + 1]] - times[[pos]])*(timeMoment - times[[pos]]),
    Missing[]],
   timeMoment == times[[-1]],
   values[[-1]],
   True,
   Missing[]
   ]]
```

The function can be called with either absolute times or dates. The gap can be specified with time units (internally converted to seconds) or as a plain number (interpreted as seconds).
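For instance, on a toy series (the data below is made up purely to illustrate the calling conventions):

```
ts = TimeSeries[{{0, 1.}, {10, 2.}, {30, 5.}}];
timeSeriesInterpolationGap[ts, 5, 15]
(* 1.5 — the surrounding gap (10 s) is within the 15 s limit *)
timeSeriesInterpolationGap[ts, 20, Quantity[15, "Seconds"]]
(* Missing[] — the surrounding gap (20 s) exceeds the 15 s limit *)
```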

**WARNING:** it is not an exact replica of the `TimeSeries` interpolation function, since repeated timestamps are not treated correctly. To be an exact replica, it would have to pass the following tests:

```
TimeSeries[{{0, 1}, {0, 2}, {0, 3}, {0, 4}}][0]
(*5/2*)
TimeSeries[{{0, 1}, {0, 2}, {0, 3}, {1, 4}}][0.5]
(*3.*)
TimeSeries[{{0, 1}, {1, 2}, {1, 3}, {1, 4}}][0.5]
(*2.*)
```

Also, there's no treatment of `Missing` values, but that is also the case for the built-in `TimeSeries` functionality.

**Any help improving it to match the built-in behaviour, while staying fast, is welcome.**

On 300 000 records, the function is approximately three orders of magnitude faster than the solution presented by @gwr. One thing that hurts gwr's answer is that `FirstPosition` is two orders of magnitude slower than Leonid's function, which doesn't make much sense (that is, it doesn't make sense that Mathematica lacks a more optimized built-in). This led me to also test the `TimeSeriesWindow` function. It compares well with Leonid's function, which makes even less sense (I haven't checked whether we can see its code).

I also created a version that just does the interpolation, without any gap checking. I haven't merged both into the same code:

```
timeSeriesInterpolation[ts_, moment_] :=
 Module[{n0, n1, m, pos, times = ts["Times"], values = ts["Values"],
   timeMoment},
  (* accept a DateObject or a plain absolute time *)
  timeMoment = If[DateObjectQ[moment], AbsoluteTime[moment], moment];
  Which[
   times[[1]] <= timeMoment < times[[-1]],
   (* binary search for the last timestamp at or below timeMoment *)
   n0 = 1; n1 = Length[times];
   While[n0 <= n1,
    m = Floor[(n0 + n1)/2];
    If[times[[m]] == timeMoment, Break[]];
    If[times[[m]] < timeMoment, n0 = m + 1, n1 = m - 1]];
   pos = If[times[[m]] < timeMoment, m, m - 1];
   values[[pos]] + (values[[pos + 1]] - values[[pos]])/
      (times[[pos + 1]] - times[[pos]])*(timeMoment - times[[pos]]),
   timeMoment == times[[-1]],
   values[[-1]],
   True,
   Missing[]
   ]]
```

(again, this suffers from the same difference from the `TimeSeries` functionality listed above)

Testing this version against the built-in `TimeSeries` functionality, for the same 300 000 records:

[10^-6 s] The fastest is using `"PathFunction"`, having pre-built it first [0.5 s] and calling the pre-built version ourselves: `myFun[time_] = ts["PathFunction"][time]` (notice the `=` instead of `:=`).

[10^-3 s] Using `ts[time]` is 20% faster than `ts[date]`. Both include a pre-build time [0.5 s]. It is interesting that calling our own pre-built `Set` memoization of the `"PathFunction"` makes such a difference in speed (three orders of magnitude...).

[10^-5 s] Using `ts["PathFunction"][time]`. Obviously, it also includes the pre-build time. But again, it is strange that calling our own pre-built `Set` memoization of the `"PathFunction"` makes a difference in speed (one order of magnitude...).

[10^-4 s] Using `timeSeriesInterpolation[ts, time]` (and two times slower if using a date instead of a time), **BUT with NO PRE-BUILD** time. So it is almost as fast as `TimeSeries` can be, and doesn't need pre-building. **Obviously, the main advantage is that it's the only option with time-gap awareness.**

Why so much trouble? Real `TimeSeries` are not quite like the examples the documentation presents... They typically have millions of records, so 10^-6 s is fundamental. If only I could get 10^-6 s with the gap analysis...

An option allowing a certain tolerance on the gap would also be welcome. That is, if the searched timestamp falls inside a too-large gap but lies within just a small delta of a recorded timestamp, the function would return the value at that recorded timestamp (this is important when records are saved with different time precisions, which is common in field instrumentation, PLCs, etc.).
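One possible shape for such a tolerance (an untested sketch; `delta` is a hypothetical extra argument, in seconds) would replace the gap-check `If` in `timeSeriesInterpolationGap` with something like:

```
(* hypothetical sketch: when the gap is too large, still return the
   recorded value if the moment is within delta of either endpoint *)
If[times[[pos + 1]] - times[[pos]] <= gapTime,
 values[[pos]] + (values[[pos + 1]] - values[[pos]])/
    (times[[pos + 1]] - times[[pos]])*(timeMoment - times[[pos]]),
 Which[
  timeMoment - times[[pos]] <= delta, values[[pos]],
  times[[pos + 1]] - timeMoment <= delta, values[[pos + 1]],
  True, Missing[]]]
```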

It is also interesting to notice that if a `TimeSeries` is supplied with units, the `timeSeriesInterpolation` function is one order of magnitude slower. This seems to be due to a huge overhead in units arithmetic... But at least it works, while everything based on `"PathFunction"` breaks in the presence of `Quantity`. The results of the following tests are quite telling:

```
RepeatedTiming[Quantity[1.0, "Seconds"]/10]
(*{0.000273, Quantity[0.1, "Seconds"]}*)
RepeatedTiming[Quantity[QuantityMagnitude[Quantity[1.0, "Seconds"]]/10,
QuantityUnit[Quantity[1.0, "Seconds"]]]]
(*{0.0000667, Quantity[0.1, "Seconds"]}*)
RepeatedTiming[Quantity[Quantity[1.0, "Seconds"][[1]]/10,
Quantity[1.0, "Seconds"][[2]]]]
(*{0.0000278, Quantity[0.1, "Seconds"]}*)
RepeatedTiming[1/10]
(*{8.0*10^-7, 1/10}*)
```

There seems to be some space for optimizations...

From what you are saying I would assume that you would like interpolation iff $gap < 3$ (not $>3$) as you have written here? – gwr – 2016-05-20T16:50:01.947

@gwr correct... just corrected it... – P. Fonseca – 2016-05-20T17:05:34.677

My answer here may be useful to you http://mathematica.stackexchange.com/a/103435/43

– Andy Ross – 2016-05-21T03:36:38.197