Using Differences on data: trouble with floats and doubles

7

Consider the following data set, shown here after running FullForm. It is imported from a file in which the values are typically stored in the form 10.040:

data = {10.`,10.02`,10.04`,10.06`,10.08`,10.1`,10.12`,10.14`,10.16`,10.18`,10.2`,10.22`,10.24`,10.26`,10.28`,10.3`,10.32`,10.34`,10.36`,10.38`,10.4`,10.42`,10.44`,10.46`,10.48`,10.5`,10.52`,10.54`,10.56`,10.58`,10.6`,10.62`,10.64`,10.66`,10.68`,10.7`,10.72`,10.74`,10.76`,10.78`,10.8`,10.82`,10.84`,10.86`,10.88`,10.9`,10.92`,10.94`,10.96`,10.98`,11.`}

As you can see, there is a step of 0.02 between every pair of adjacent data points. If I run

DeleteDuplicates@Differences@data

I would expect:

{0.02}

Instead I get (on my computer, YMMV, and after FullForm):

{0.019999999999999574`,0.02000000000000135`}
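Both values sit within about 2*10^-15 of 0.02, which is consistent with binary rounding: 10.02 and 10.04 have no exact representation as machine doubles, so their differences carry the rounding error of the stored values. A minimal sketch for inspecting this, where sample is a hypothetical short subset of data and the exact digits printed may vary by machine:

sample = {10., 10.02, 10.04};     (* hypothetical short subset of data *)
SetPrecision[sample, 25]          (* the stored binary values to 25 digits; 10.02 and 10.04 are not exact *)
FullForm[Differences[sample]]     (* each entry is close to, but not exactly, 0.02 *)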

Erk. Now, I've run into this type of problem before (I'm looking at you, LabView), so I expect it has something to do with taking differences of doubles / floats / machine-precision numbers. In LabView, I fixed this by creating an Equivalent-type function which, given a list of numbers, did a "fuzzy" Union of sorts, and I can do the same in MMA:

DeleteDuplicates[Differences@data,Abs[#1-#2]/Min[#1, #2] < 10^-6 &]
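One caveat with the relative test above: dividing by Min[#1, #2] assumes both values are positive and nonzero. A minimal sketch of two alternatives, assuming an absolute tolerance of 10^-6 is acceptable for this data (the tolerance and the name sample are illustrative choices, not part of the original problem):

sample = {10., 10.02, 10.04, 10.06};            (* hypothetical short subset of data *)
d = Differences[sample];
DeleteDuplicates[d, Abs[#1 - #2] < 10^-6 &]     (* absolute tolerance; keeps one representative value *)
DeleteDuplicates[Round[d, 10^-6]]               (* snap to exact multiples of 10^-6 first; gives {1/50} *)

The Round version returns exact rationals, so the default exact comparison in DeleteDuplicates is enough and no custom test function is needed.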

Is this something I can stop from happening with some import parameters, or is there a better way of handling this?

tkott

Posted 2012-04-27T15:18:57.090

Reputation: 4 819

I don't think the test function in the last expression is doing what you think it's doing. Try for example DeleteDuplicates[{1, 2, 3, 4, 5, 10^-7}, #1 - #2 <= (#1 - #2)/Min[#1, #2] 10^-6 &] which returns {1}. This is because #1 - #2 <= (#1 - #2)/Min[#1, #2] 10^-6 & returns true for a pair of elements {a, b} iff 10^-6 <= a <= b or And[b <= a, b <= 10^-6]. You probably want something like Abs[#1-#2]/Min[#1, #2] < 10^-6 &. – Heike – 2012-04-27T16:09:55.440

@Heike oops, you're right, thanks! I typed that out too quickly, I guess. – tkott – 2012-04-27T16:11:46.247

1

Related: http://stackoverflow.com/q/6166895/618728 – Mr.Wizard – 2012-04-27T17:08:13.073

Answers

4

There are many ways to control the accuracy.

Here is one:

t = Table[x + RandomReal[{0, 10^-7}], {x, 0, 1, .1}]
Rationalize[#, 10^-3] & /@ Differences[t]
(* -> {1/10, 1/10, 1/10, 1/10, 1/10, 1/10, 1/10, 1/10, 1/10, 1/10}*)
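Applied to data like that in the question, the same idea might look like the sketch below. The tolerance 10^-6 and the name sample are assumptions chosen for illustration; any tolerance well above the rounding error (about 10^-15 here) and well below the step size should behave the same way.

sample = {10., 10.02, 10.04, 10.06};   (* hypothetical short subset of the question's data *)
DeleteDuplicates[Rationalize[#, 10^-6] & /@ Differences[sample]]
(* -> {1/50} *)
N[%]
(* -> {0.02} *)

Because the rationalized differences are exact, duplicates compare as identical without a custom test function.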

Dr. belisarius

Posted 2012-04-27T15:18:57.090

Reputation: 112 848

Ha, I had been using Rationalize separately today, but didn't consider it for this use case. I wonder if it's faster... – tkott – 2012-04-27T15:52:52.167