Why can I look at a graph and immediately find the closest point to another point, but it takes me O(n) time through programming?

Let me clarify:

Given a scatterplot of n points, if I want to mentally find the closest point to any point in the plot, I can immediately ignore most points in the graph, narrowing my choices down to some small, constant number of points nearby.

Yet, in programming, given a set of n points, finding the closest point to any one of them requires checking every other point, which takes ${\cal O}(n)$ time.

I am guessing that the visual sight of a graph is likely the equivalent of some data structure I am incapable of understanding; because with programming, by converting the points to a more structured form such as a quadtree, one can find the closest points to $k$ query points among the $n$ in $k\cdot\log(n)$ time, or amortized ${\cal O}(\log n)$ time per query.

But there are still no known ${\cal O}(1)$ amortized algorithms (that I can find) for point-finding after data restructuring.
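For concreteness, here is a minimal sketch of that "restructure, then query in logarithmic time" idea, using SciPy's k-d tree rather than a quadtree; the point set and query point are invented purely for illustration:

```python
# Build a spatial index once, then answer nearest-point queries cheaply.
# Uses a k-d tree (SciPy) instead of a quadtree; the idea is the same.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.random((100_000, 2))      # n points in the unit square (made up)

tree = cKDTree(points)                 # one-time build, O(n log n)
dist, idx = tree.query([0.5, 0.5])     # typically O(log n) per query
print(points[idx], dist)               # the closest stored point and its distance
```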

So why does this appear to be possible with mere visual inspection?

Ari

Posted 2014-03-17T06:05:56.817

Reputation: 669

34You are aware of all the points already and roughly where they are; the "software drivers" for your eyes have already done the hard work for you of interpreting the image. In your analogy you are considering this work "free" when in actual fact it isn't. If you already had a data structure that broke down the point positions into some sort of octree representation you could do much better than O(n). A lot of pre-processing happens in the subconscious part of your brain before the information even reaches the conscious part; never forget that in these sorts of analogies. – Richard Tingle – 2014-03-17T11:36:48.740

19I think at least one of your assumptions does not hold in general. assume all points arranged on a circle with 'small' perturbations and 1 additional point P being the center of the circle. If you want to find the closest point to P, you cannot dismiss any of the other points in the graph. – collapsar – 2014-03-17T12:29:22.507

4Because our brain is really amazing! Sounds like a cheap answer but it's true. We really don't know a whole lot about how our (apparently massively parallel) image processing works. – Carl Witthoft – 2014-03-17T12:29:49.030

7Well, basically, your brain uses space partitioning without you noticing. The fact that this appears really fast does not mean it's constant time - you're working with some finite resolution, and your image processing software is designed for that (and might even be handling all that in parallel). The fact that you're using a hundred million little CPUs to do the preprocessing doesn't put you in ${\cal O}(1)$ - it just does the complicated operation on a lot of little processors. And don't forget the plotting to the 2D paper - that on its own has to be at least ${\cal O}(n)$. – Luaan – 2014-03-17T13:01:53.833

2To add to all the previous people have said -- with some preprocessing, finding the closest point is not an O(n) problem. You might want to look into kd-trees, quad-trees and Voronoi tessellation. – TC1 – 2014-03-17T13:37:38.487

9Not sure if it has already been mentioned, but the human brain works very differently from a SISD von Neumann type computing system. Particularly relevant here is that, as I understand it, the human brain is inherently parallel and especially so when it comes to processing sensory stimuli: you can hear, see, and feel multiple things at the same time and be (roughly, anyway) aware of all of them simultaneously. I'm concentrating on writing a comment but see my desk, a can of soda, my jacket hanging on the door, the pen on my desk, etc. Your brain can check many points simultaneously. – Patrick87 – 2014-03-17T15:52:23.290

@Patrick87 as long as you have a fixed number of processors dedicated to the problem, parallelism does not affect complexity classes. – collapsar – 2014-03-17T16:34:30.390

1@collapsar although if the number of processors is very large, it can be reasonably treated as infinite (much like how software running in very large finite memory is still called "turing-complete"). – Brilliand – 2014-03-17T18:33:36.513

@Brilliand The viability of your approach depends on the expected size of problem instances. moreover, afaik, with regard to any task beyond detection of perceptual primitives (eg. high intensity gradients in vision, aka 'edges'), it is not clear how many neurons make up a 'processor' in the brain (nor whether the parallel processor model of the brain is really appropriate in the first place), so the claim of a 'reasonably infinite' processor count might be way too strong - however, that deeply invades the field of neurobiology. – collapsar – 2014-03-17T19:08:56.020

1Look into quadtrees. Very similar to how one might approach the problem from a visual direction. – Adam Davis – 2014-03-17T19:25:25.453

1>mere visual inspection... it turns out visual inspection is in fact a quite complex algorithm in itself ;) try implementing that in software and you have an algorithm that solves the problem just like you do – Niklas B. – 2014-03-17T23:19:06.010

For large graphs your brain cannot do this; for small graphs your brain has more “cpus” than there are points in the graph. – Ian Ringrose – 2014-03-18T11:08:47.913

Because a) your visual cortex is massively parallel, and b) the number of parallel computations in your visual cortex equals or exceeds the number of objects that your eyes can distinguish. Thus O(n/c) = O(1) so long as c >= n. – RBarryYoung – 2014-03-20T20:22:11.093

As an illustration to the importance of the computational model/data structures used (and kind of "just for fun" too), take a look at "spaghetti sort": http://en.wikipedia.org/wiki/Spaghetti_sort

– Kache – 2014-05-20T11:14:46.760

I think the heart of the question is the encoding of the data structure (among other things). Although it is true that in some models of (human) computation O(n) time is the limit, depending on the encoding of the input data it can also be O(1). In fact many NP-complete problems are only NP-complete using certain encodings of the data and not with others (e.g. the subset-sum problem) – Nikos M. – 2014-06-08T11:17:56.027

Answers

106

Your model of what you do mentally is incorrect. In fact, you operate in two steps:

  1. Eliminate all points that are too far, in $O(1)$ time.
  2. Measure the $m$ points that are about as close, in $\Theta(m)$ time.

If you've played games like pétanque (bowls) or curling, this should be familiar — you don't need to examine the objects that are very far from the target, but you may need to measure the closest contenders.

To illustrate this point, which green dot is closest to the red dot? (Only by a little over 1 pixel, but there is one that's closest.) To make things easier, the dots have even been color-coded by distance.

a cloud of points

This picture contains $m=10$ points which are nearly on a circle, and $n \gg 10$ green points in total. Step 1 lets you eliminate all but about $m$ points, but step 2 requires checking each of the $m$ points. There is no a priori bound for $m$.

A physical observation lets you shrink the problem size from the whole set of $n$ points to a restricted candidate set of $m$ points. This step is not a computation step as commonly understood, because it is based on a continuous process. Continuous processes are not subject to the usual intuitions about computational complexity and in particular to asymptotic analysis.

Now, you may ask, why can't a continuous process completely solve the problem? How does it come down to these $m$ points, and why can't we refine the process to get $m=1$?

The answer is that I cheated a bit: I presented a set of points which was generated to consist of $m$ almost-closest points and $n-m$ points which are farther away. In general, determining which points lie within a precise boundary requires a precise observation which has to be performed point by point. A coarse process of elimination lets you exclude many obvious non-candidates, but merely deciding which candidates are left requires enumerating them.

You can model this system in a discrete, computational world. Assume that the points are represented in a data structure that sorts them into cells on a grid, i.e. the point $(x,y)$ is stored in a list for the cell $(\lfloor x \rfloor, \lfloor y \rfloor)$. If you're looking for the points that are closest to $(x_0, y_0)$ and the cell that contains this point contains at most one other point, then it is sufficient to check the containing cell and the 8 neighboring cells. The total number of points in these 9 cells is $m$. This model respects some key properties of the human model:

  • $m$ is potentially unbounded — a degenerate worst case of e.g. points lying almost on a circle is always possible.
  • The practical efficiency depends on having selected a scale that matches the data (e.g. you'll save nothing if your dots are on a piece of paper and your cells are 1 km wide).
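Here is a minimal sketch of that grid model in Python. The cell size and the sample points are made up for illustration, and, as the bullets above note, the result is only meaningful when the cell scale matches the data (the sketch simply returns the nearest point found in the 3×3 neighbourhood):

```python
# Grid model: bucket each point into an integer cell, then answer a query by
# examining only the query's cell and its 8 neighbours (m candidates in total).
from collections import defaultdict
from math import floor, dist

def build_grid(points, cell=1.0):
    grid = defaultdict(list)
    for p in points:
        grid[(floor(p[0] / cell), floor(p[1] / cell))].append(p)
    return grid

def nearest_in_neighbourhood(grid, q, cell=1.0):
    cx, cy = floor(q[0] / cell), floor(q[1] / cell)
    candidates = [p for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  for p in grid[(cx + dx, cy + dy)] if p != q]
    # len(candidates) is the m of the answer: small for "nice" data,
    # unbounded in the degenerate (almost-on-a-circle) worst case.
    return min(candidates, key=lambda p: dist(p, q), default=None)

points = [(0.2, 0.3), (1.4, 0.9), (5.0, 5.0), (0.9, 0.8)]
grid = build_grid(points)
print(nearest_in_neighbourhood(grid, (1.0, 1.0)))   # -> (0.9, 0.8)
```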

Gilles

Posted 2014-03-17T06:05:56.817

Reputation: 29 838

9What's more, not all graphs can be projected into the plane so that Euclidean distances match distances in the graph (e.g. if the edge weights do not form a metric). – Raphael – 2014-03-17T16:16:51.893

4@Raphael I understood the question as being about computational geometry rather than graph theory, but indeed this is an additional complication. – Gilles – 2014-03-17T16:20:24.320

The human eye doesn't easily adapt to precise measurements - many people will find detecting the closest dot in your example simply impossible without measurement devices. – Brilliand – 2014-03-17T18:40:07.190

@Brilliand He takes that into account - step 1 is eliminating a bunch of points that are definitely not the closest. Step 2 is to measure, point by point, which one is the closest - which takes a measuring device and (at worst) M attempts. – EtherDragon – 2014-03-17T20:27:22.927

Is there a name for the algorithm that first sorts the points into a grid and then checks only the current grid cell? I have developed something similar (find all pairs with distance<m) because I needed it for my research (which is in geosciences), but have little knowledge of computer science. For example, how do I choose an optimal cell size? Too large cells and I'm calculating too many distances brute-force, too small cells and I spend too much time sorting the cells. – gerrit – 2014-03-17T21:04:16.207

@gerrit No idea, CG isn't my field. I suggest asking a question on this site. – Gilles – 2014-03-17T21:21:37.267

2

@Gilles Done. I just learned the term computational geometry.

– gerrit – 2014-03-17T21:26:02.343

I fail to see how you do step 1 on a computer in $O(1)$ time to systematically eliminate the $n$ points that are far away. I understand from your answer to @gerrit comment that you do not either. Then I do not see what you have solved, or even simply explained, since step 1 is the core of the question. – babou – 2014-03-17T23:17:23.070

@babou My point is that the computer doesn't do step 1 in general, but does so in a special case (where the grid size is guessed appropriately). – Gilles – 2014-03-17T23:26:31.567

2This might be a nit-pick, I can understand what you're trying to show, but as one who is colorblind "pick the closest green to the red" leads to a lot of head scratching about which points are which. Just something to think about in the future -- choose any other color combinations besides red/green! – tpg2114 – 2014-03-18T04:09:28.727

In a database with an appropriate index, I can eliminate points easily, based on the index data that is continuously kept up to date. – Simon Richter – 2014-03-18T06:33:05.943

3@tpg2114 Don't forget red/green is not the only type of color-blindness. Showing it with shape (or any attribute other than color) would be more inclusive still than "any other color combinations besides red/green". – Jonathan Van Matre – 2014-03-18T14:39:29.477

And @Gilles, I appreciated the apt application of the petanque metaphor. – Jonathan Van Matre – 2014-03-18T14:42:06.110

@tpg2114 I think the target (red) point should be clear from the purposes of the example (the point in the middle of the circle of ~equidistant points) – Janik Zikovsky – 2014-03-18T19:05:32.490

1@PsZk I acknowledged that -- I said I can understand the point being made. But I just wanted to leave a reminder that visualizations should keep a few things in mind, notably color-deficient readers. – tpg2114 – 2014-03-18T19:22:13.297

1I'm surprised no-one has brought up the relevance of optical illusions to the "visual inspection" method. We may think it's easy to pick the point "B" closest to point "A", but there's a significant margin for error, especially when other points are nearby, such as in the example graphic above. – Thomas – 2014-03-20T17:32:37.263

I think this answer glosses over measurement accuracy. From optical measurement with eyes, the dots in circle are equally close to the center, within the error bars of measurement. A suitable logarithmic scaling (be it with a wacky set of lenses, or with having different coordinate system when doing the plot) would allow eyes to still pick the closest point directly. – hyde – 2014-03-22T20:28:07.770

@hyde On the contrary, measurement accuracy is part of the issue here. You could get the same O(1) efficiency on a computer if you picked a grid with the right size, i.e. if the data was nice enough for a particular measurement accuracy. But if we're talking worst-case complexity, this has to include cases with an unbounded amount of contenders within a fixed accuracy scale. – Gilles – 2014-03-22T20:31:05.747

@Gilles I'm not sure I follow. IMO question implies coordinates on computer have enough precision to distinguish the closest point. Your answer (or at least the picture in it) depends on human brain not having enough precision to distinguish the closest point. If you took a low-rez photo of plot and fed it to a computer, it couldn't tell what is closest either. – hyde – 2014-03-22T20:47:00.833

@hyde The point is that to achieve this accuracy requires a pass through the data, to get to the right zoom level. And that pass costs O(n). – Gilles – 2014-03-22T20:54:37.267

38

The reason is that the data has been put into a "data structure" optimized for this query, and that the preprocessing time spent preparing the graph should be included in your measured time; it is proportional to the number of dots, giving an O(n) cost right there.

If you put the coordinates in a table listing the X and Y coordinates of each point, it would require a much larger mental effort to calculate the distances between points, sort the list of distances and choose the smallest.

An example of a query that does not work well visually would be to look at the night sky and - based on your view only and a table of coordinates for each star - locate the star closest to Earth, or determine which constellation has the smallest distance between the stars it consists of. Here you would need a zoomable, rotatable 3D model in order to determine this visually, whereas a computer would consider it essentially the same problem as your original.

Thorbjørn Ravn Andersen

Posted 2014-03-17T06:05:56.817

Reputation: 489

2+1 - I was scrolling down looking for someone making exactly this point. The representation of the incoming data is important - just try finding the median of a sorted list vs an unsorted one! – cloudfeet – 2014-03-17T13:11:06.803

20

This question starts from a faulty premise. You just think you can mentally find the nearest point in $O(1)$ time, but in fact, you can't. It feels like that because you're used to dealing with very small graphs, but small examples can be misleading when we're dealing with asymptotic complexity. If you tried to do this with a scatter plot of one billion points, you'd quickly discover that you can't do it in $O(1)$ time.

D.W.

Posted 2014-03-17T06:05:56.817

Reputation: 83 008

But this is not entirely true. Given a plot of one billion points, I don't even need to LOOK at 99.999% of the points on the graph. If I just draw a mental radius around the point I am considering, to include only the points I believe are likely contenders, I will be left with some small, constant number of points to now consider. – Ari – 2014-03-17T06:22:59.233

8Imagine placing a billion points along a circle, but all slightly perturbed a little bit, so your points form a fuzzy looking ring. To find the point closest to the center by eye, I don't see how you could do much better than checking all the points one-by-one. – Nick Alger – 2014-03-17T07:45:02.483

4@NickAlger So it's more like O(numberOfPointsAboutTheSameDistanceFromTheTargetPointAsTheClosestPoint), which is not necessarily related to n. Either way, I think an answer to this should present possible data structures in terms of how the human mind perceives and queries it. Simply saying it's not O(1) feels ... lazy? inadequate? – Dukeling – 2014-03-17T11:01:13.060

5@Dukeling O(something) refers to the worst case. If there are any layouts where human brain can't do it in constant time, then it's definitely not O(1). If there's some limit X where human brain can process X points in constant time, but can't process X*2 points at all - then it's not O(1). – Peteris – 2014-03-17T13:38:09.867

@Peteris I didn't say it's O(1), but rather simply that it's not necessarily dependent on n (while O(n) is also a correct complexity, since big-O is an upper bound, the complexity I stated above is a tighter bound). The answer mentions "a scatter plot with one billion points" - without Nick comment's interpretation, we could be left with a point easily identified as closer than any other (making that a million points, or 100 billion, won't necessarily make it take a different amount of time to identify the closest point, so definitely not directly dependent on n as this answer seems to imply). – Dukeling – 2014-03-17T14:27:49.540

3@Dukeling It is necessarily dependent on n, since in the worst case it is equal to n, and if you're given n arbitrary points you have to expect that it might be impossible to do it faster than C*n operations. – Peteris – 2014-03-17T15:47:31.097

2@Peteris I guess we disagree on what it means to be "necessarily dependent on n" and how to determine the closest upper bound. – Dukeling – 2014-03-17T16:47:39.320

1@Peteris: I think O(numberOfPointsAboutTheSameDistanceFromTheTargetPointAsTheClosestPoint) is accurate, and also that you're correct that there is no established relationship between numberOfPointsAboutTheSameDistanceFromTheTargetPointAsTheClosestPoint and n. However, that doesn't mean that the process is O(n). It's only O(n) if n is the same magnitude as numberOfPointsAboutTheSameDistanceFromTheTargetPointAsTheClosestPoint. Ex: for(a=0->n) for(b=0->m). He's saying O(n*m), and you're saying O(max(n,m)^2). – Mooing Duck – 2014-03-17T16:54:01.003

2@Peteris "O(something) refers to the worst case" -- no, not necessarily. (Though that seems to be point here.) – Raphael – 2014-03-18T07:54:25.540

Two things: First, the "human algorithm" makes a time-space tradeoff. It's using O(n) space to allow O(1) (ish) lookups. The same "is it O(1) or O(n)" confusion applies to radix sort, which can be viewed as O(n) or O(n log n) depending on precisely how you define things. I recommend reading the wikipedia discussion. Second, it takes time for the human to build the acceleration structure; and in this case (but not all cases) the human can take advantage of parallel processing in the retina. Otherwise that time would be O(n). – Paul Du Bois – 2014-03-24T22:40:03.517

15

The superiority of the visual inspection hinges on crucial premises which cannot be guaranteed in general:

  • scaling: You focus on the graphical representation of the area of interest. This means the geometry has been scaled down to fit into your field of vision; in the general setting this already requires $O(n)$ time for preprocessing.

  • count: (cf. the comment by Nick Alger on D.W.'s answer) Assume a point count exceeding the number of your retinal cells - you will not even identify all the points involved.

  • variance: (cf. the comment by Nick Alger on D.W.'s answer) Assume a set of points on a regular (e.g. hexagonal) grid, subjected to small random perturbations. If these perturbations fall below the resolution of your retina (or any other overlaid grid), you will not only be hard-pressed to detect the actual minimum distance, but will pick the wrong pair of points with high probability.

Assuming that the mental processes involve some rasterization of a geometrical representation (vs., say, a distance matrix), these processes do not scale arbitrarily with the size of the problem instance. Put differently, in a general setting a preprocessing sampling procedure running in $O(n)$ would be required. In human visual inspection, the parameters of this preprocessing are hardwired in the perception apparatus (number of retinal cells, retinal area), which makes this processing stage appear to be $O(1)$.

collapsar

Posted 2014-03-17T06:05:56.817

Reputation: 790

1The OP was factoring out the $O(n)$ visual inspection of all points since it has to be done by any system that takes them all into account. I think that is why he is considering amortized complexity $O(log(n))$. On the other hand, except for rare individuals, most people will not remember the graph from one question to the next. Also, the fact that the eye uses a raster image means that scanning is in constant time, and that focusing with enough precision on the relevant part is logarithmic. Small variations on long distances are not perceived anyway (see Weber-Fechner laws in my answer). – babou – 2014-03-17T11:13:17.720

As far as I understand, the OP factored out $O(n)$ for applying an algorithm that computes the NN for all $n$ points in $O(n\,\log\,n)$ preparing for repeated lookups later on. my point has rather been that no raster scanning in constant time will cater for all problem instances as distance differences might fall below the gridline spacing or $n$ may be greater than the number of grid points. focusing on the relevant part in logarithmic time does not do the trick for all points 'almost' on a circle with 1 point at the center as all points are relevant (i may have gotten you wrong though). – collapsar – 2014-03-17T12:53:52.410

14

  1. The computer is solving a different problem. It takes a list of points, not a rasterized image of points. Converting from a list to an image, i.e. "plotting" the points, takes O(n) time.

    Quick! Which is closest to (1,2):

    • (9, 9)
    • (5, 2)
    • (3,-2)
    • (4, 3)
    • (0, 4)
    • (1, 9)

    A lot harder, right? I bet if I made the list twice as long you'd have to do twice as much work - a plain linear scan, as sketched after this list.

  2. You're not aware of how much work your brain is doing. You don't "just know" which point is closer. Your brain is doing computational work to figure out that answer and make it available. The brain works on each point in parallel, so the time to finish stays roughly the same, but the amount of required work still increases with the number of points.
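For concreteness, here is the linear scan referred to in point 1, run on the quiz coordinates; every candidate has to be examined, which is exactly the $O(n)$ cost the computer pays:

```python
# Brute-force nearest neighbour over a plain coordinate list: every candidate
# must be checked, so the work grows linearly with the length of the list.
from math import dist

target = (1, 2)
candidates = [(9, 9), (5, 2), (3, -2), (4, 3), (0, 4), (1, 9)]

print(min(candidates, key=lambda p: dist(p, target)))   # (0, 4), at distance ~2.24
```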

Craig Gidney

Posted 2014-03-17T06:05:56.817

Reputation: 4 224

12

For the same reason that when you look at a triangle and know it is a triangle, you are forgetting the many millions of calculations you do without noticing them.

Neural networks

In effect you are a neural network that has been trained and loaded with masses upon masses of data.

Take the infant shape sorting game as an example:

a shape-sorting toy

When a child first interacts with this, it is likely they will attempt to insert shapes into the wrong holes; this is because they have not yet trained their brain or encountered enough data to build the networks. They can't make assumptions about edges, size, etc. to determine which shape fits which hole.

This seems obvious to you (I hope) because you have built these connections; you may even think it is intuitive and not need to break it down - for example, you just know the triangle fits the triangle and don't need to approximate the size, count the edges, etc. This is not true: you have done all that in your subconscious, and the only conscious thought you had was that it is a triangle. Many millions of computations happened: taking in a visual representation, understanding what it was representing, understanding what the individual elements are, and then estimating their distances. The fact that you had a large database of information to poll against made this simpler.

Your brain isn't binary

The data your brain works on isn't binary (as far as we know); it isn't true or false, but holds many states that we use to interpret the data. We also get things wrong often, even when we follow the correct process, because the data changes often. I would hazard a guess that our brains function a lot more like a quantum computer would, where the bits are in an approximate state until read. Whether our brain works like a computer at all really is not known.

Hence an algorithm for working with binary data will not work the same way; you can't compare the two. In your head you are working with concepts - rich data types that hold far more information - and you have the ability to craft links where they are not explicitly defined; upon seeing a triangle you may think of the cover of Pink Floyd's The Dark Side of the Moon.

the Dark Side of the Moon album cover

Back to the scatter graph: there is no reason you couldn't do this on a computer using a bitmap, measuring the distance from a point in increasing radii until you encounter another point. It is possibly the closest you could get to a human's approximation. It is likely to be much slower because of the data limitations, and because our brains don't necessarily care about computational complexity and take a complex route to do things.

It wouldn't be O(1), or even O(n) if n is the number of points; instead, its complexity now depends on the maximum linear distance from the selected point to a bound of the image.
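A minimal sketch of that expanding-radius idea, assuming the points have already been rasterized into a boolean bitmap; as described above, the cost depends on how far the nearest point (or the image boundary) is, not on how many points were plotted:

```python
# Sketch: expanding-ring search on a rasterized image (True marks a plotted point).
# Scan square "rings" of growing radius around the query pixel and keep the best
# Euclidean hit; every pixel on ring r is at Euclidean distance >= r, so we can
# stop once r exceeds the best distance found so far.  (Not optimized: a real
# version would walk only the ring border instead of re-checking the whole box.)
from math import hypot, inf

def nearest_on_bitmap(bitmap, x0, y0):
    h, w = len(bitmap), len(bitmap[0])
    best, best_d = None, inf
    for r in range(max(w, h)):
        if r > best_d:                                   # no closer hit is possible
            break
        for y in range(y0 - r, y0 + r + 1):
            for x in range(x0 - r, x0 + r + 1):
                if max(abs(x - x0), abs(y - y0)) != r:   # keep only the ring border
                    continue
                if 0 <= x < w and 0 <= y < h and bitmap[y][x]:
                    d = hypot(x - x0, y - y0)
                    if d < best_d:
                        best, best_d = (x, y), d
    return best, best_d

bitmap = [[False] * 10 for _ in range(10)]
bitmap[2][7] = bitmap[8][1] = True                       # two plotted points (made up)
print(nearest_on_bitmap(bitmap, 5, 5))                   # ((7, 2), ~3.61)
```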

tl;dr

Your brain is not a binary computer.


George Reith

Posted 2014-03-17T06:05:56.817

Reputation: 221

8

You are forgetting one important step: plotting all those points on the graph you are looking at.

This is by necessity an O(n) operation.

After this, a computer can use image recognition software to find the approximate points closest to the center in much the same way the human eye can. This is an O(sizeOfImage) operation in the worst case.

For a human to do the same as the computer, remember that the computer gets a list of coordinates and can only look at one at a time.

ratchet freak

Posted 2014-03-17T06:05:56.817

Reputation: 1 769

1If one picks a constant "resolution", one can use an algorithm which takes O(log(resolution)) time per point to plot them and identify all the points which are "close" to the point of interest. The O(log(resolution)) is vaguely analogous to the fact that it takes longer to plot points accurately on paper than to do so less precisely. Note also that increasing the resolution will increase the per-all-points cost of algorithms to eliminate non-candidate points, but reduce the number of non-nearest points that survive elimination. – supercat – 2014-03-17T19:49:12.173

7

My interpretation of the question:

I do not believe this question is to be taken simplistically as a computational geometry complexity issue. It should be better understood as saying: we perceive an ability to find the answer in constant time, when we can. What explains this perception, and, up to this explanation and to human limitations, can a computer do as well?

Thus the question should probably be seen first as a question for a psychologist. The issue is probably related to your perception of time and effort. Can your brain really perceive a difference between $O(1)$ time and $O(\log n)$ time? Specific counter-examples do not really matter, as in matters of perception we tend to think instinctively of average cost (complexity being probably too precise a concept psychologically). More accurately, we are more interested in the common case than in special cases where we feel we cannot readily answer the question.

This may be reinforced by the Weber-Fechner law, which states that our perception is to be measured on a logarithmic scale of the actual physical measure. In other words, we perceive relative variations rather than absolute variations. This is, for example, why sound intensity is measured in decibels.

If you apply this to the perception of the time we use to find the closest point, this is no longer $O(\log n)$ but $O_\psi(\log \log n)$, where $O_\psi$ is my just-invented Landau notation for "psychological complexity".

Actually I am cheating, because the psychological perception of the scatterplot's size also obeys the logarithmic law, which should compensate in this simple complexity relation. Still, it means that a scatterplot will always seem much simpler to us than it really is, especially for very large ones. But whatever the size we perceive, if we have a built-in logarithmic algorithm to find a closest point (such as neuronal quadtrees), then the perceived processing time is to be measured by $O_\psi(\log \log n)$, which for all practical purposes is probably perceptually indistinguishable from a constant, and there is necessarily some constant time added to it to start the recognition process and acknowledge the result.

Taking into account the physiological limitations

The above conclusion is further sustained when considering the image acquisition steps.

The OP was careful to separate the construction of a proper data structure, "such as a quadtree", which is amortized on several queries.

This does not work for most people who do not memorize the image. I think the image is scanned for each query, but that does not imply scanning all points: not the first time, and not for later queries.

The eye scans a raster image in constant time $T_{scan}$, independently of the size of the scene taken in, and with a fixed resolution defined by the retina structure (see below). Thus it gets a constant amount of information, possibly not distinguishing all points. It can then focus on the relevant part of the image, to distinguish the relevant points in another time $T_{scan}$, plus possibly the time to change the orientation and focus of the eye. Theoretically, this operation could have to be repeated, leading to logarithmic focussing, but I believe that in perceptual practice there is at most one additional step for focussing vision.

Scanning probably results in a structure in the brain that can be analyzed to find the answer. It may still contain a very large number of points. Though I do not know how the brain proceeds, it is not unreasonable to think that it is some kind of focussing process that takes at worst logarithmic time, possibly even less. This process is applied to the perceived image, which has a bounded size. This implies of course a bounded number of points, though it can be quite large. Thus there is a fixed upper bound $m$ on the information to be processed. Assuming logarithmic processing, and reusing the above analysis, the perceived processing time is $O_\psi(\log \log m)$.

The resolution of the human eye is fixed by the number of rods, which is about 125 million, i.e. about $2^{27}$. Using base 2 for the logs, the $O_\psi(\log \log m)$ bound gives for the processing an upper bound of about $\log_2(27)$, i.e. about 5 steps of whatever cost there is for a step. Using instead the estimated value of the eye's resolution, which is on the order of 500 megapixels, does not change the final result.
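Spelling out that arithmetic:
$$m \approx 1.25 \times 10^8 \approx 2^{27}, \qquad \log_2 m \approx 27, \qquad \log_2 \log_2 m \approx \log_2 27 \approx 4.75,$$
i.e. about 5 steps.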

Without knowing the actual units to be used, this simply shows that the variation in processing time is at worst on the same order as other constant-time operations. Hence it is quite natural that the perceived time for finding the closest point feels constant ... whether we determine the closest point or only a set of the closer points.

About counter-examples and a possible solution

It is of course easy to build counter-examples that make determining the closest point by eye very difficult among a small collection of closer points. This is why the OP is actually asking for an algorithm that quickly eliminates most of the points, except for the closest ones. This issue of the possible difficulty of choosing among several close points is addressed in many answers, with the paradigmatic example of the closest points lying almost on a circle around the reference point. Typically the Weber-Fechner law precludes being able to distinguish small distance variations over long enough distances. This effect may actually be increased by the presence of other points which, though eliminated, may distort the perception of distances. So trying to identify the closest point will be a harder task, and may well require specific examination steps, such as using instruments, that will completely destroy the feeling of constant time. But it seems clearly outside the range of experiments considered by the OP, hence not very relevant.

The question to answer, which is the question actually asked by the OP, is whether there is a way to eliminate most of the points, except possibly for the remaining few which seem to have very similar distances to the reference point.

Following our analysis of what may be hidden behind a perceived constant time, a computer solution that does it in $O(\log n)$ time could be considered satisfactory. On the other hand, relying on amortized cost should not really be acceptable, since the brain does not do it that way, afaik.

Rejecting amortized cost does not allow for a computer solution, since all points have to be looked at. This underscores a major difference between the computing power of the brain and human perception on one hand and digital computation on the other: the brain can use analog computation with properties that are quite different from those of digital computation. This is typically the case when billions of points are not distinguishable by the eye, which does not have the resolution to see anything but a big cloud with various shades of dark. But the eye can then focus on a relevant smaller part, and see a bounded number of points, containing the relevant ones. It does not have to know all points individually. For a computer to do the same, you would have to give it a similar sensor, rather than the precise numerical coordinates of each point. It is a very different problem.

"Mere visual inspection" is in some respects a lot more powerful than digital computation. And it is due also to the physics of sensors, not just to a possibly greater computing power of the brain.

babou

Posted 2014-03-17T06:05:56.817

Reputation: 16 731

i don't think it's an issue whether your brain can distinguish $O(1)$ from $O(\log{n})$ in tasks involving perception. as there are hardwired parameters of human perception (mainly resolution) you could argue that processing perceptual information is always $O(1)$. Also note that you trivially distinguish between $O(1)$ and $O(\log{n})$ when you solve a task beyond sheer perceptual recognition, e.g. locating a given number in a graphical representation of a balanced binary heap with labelled nodes. note that perceptual restraints do not matter as you only inspect the graphics locally. – collapsar – 2014-03-17T10:37:48.243

The other side of the scale is also distorted by human perception: most increasing functions will yield values of "forever" even for $n$ computers would laugh about if they were sentient. – Raphael – 2014-03-17T17:23:35.300

@collapsar This first approximation of my answer was actually stated as a question, regarding what can be perceived, as an introduction to what follows. This is further developed into the $O_\psi(log (log (n)))$ answer, which is much smaller. Then it has to be compared to other tasks that are constant time on the same order, and may not make much difference perceptually. This is not the case with your binary heap example. Sorry for late answer: network down for several hours. – babou – 2014-03-17T23:27:53.557

4

We had students in exams who, when asked how fast you can sort an array, would claim that computers are stupid, and need n*log(n) (or worse), while humans can do it faster.

The reply of my professor was always: I will give you a list of 10,000 items. Let's see if you can come up with a method that is faster than what a computer would do.

And then: how many processing cores are involved when you try to find the closest point? You are not a single-processor machine, you have a neural network, that has some flexibility when it comes to tasks like this.

Zane

Posted 2014-03-17T06:05:56.817

Reputation: 151

1Plus the various aspects of what you know about the data and what resources you have available when you need to sort. For instance, if your fellow students need to sort something that cannot fit completely in the room they are in. – Thorbjørn Ravn Andersen – 2014-03-17T17:37:06.497

@ThorbjørnRavnAndersen: this is a nice one to understand how important space-complexity is "something that cannot fit completely in the room" 8^) – Zane – 2014-03-18T17:49:24.890

3

I believe @Patrick87 gave you the clue: your eyes and brain are a massively parallel computing machine. Some argued that this does not explain the issue, because for arbitrarily large problems a finite number of parallel processors makes no difference.

But it does here: as hinted by many, your eyes (and brain) have a limited capacity to solve this problem, because one cannot fit an arbitrary number of points within the span of a normal human gaze. Your eyes need to be able to distinguish them for a start, and if there are too many, they will be so close that your eyes won't notice the difference. Bottom line: it is fast only for few enough, well-separated points that fit in your normal sight, i.e. very few. In other cases it will fail.

So, you can solve this problem in O(1) for small and simple cases that your brain can process in a breeze. Beyond that, it can't and therefore, it is not even O(anything) because it most likely fails.

carlosayam

Posted 2014-03-17T06:05:56.817

Reputation: 256

1

Nobody has mentioned that this problem can be solved very quickly on a computer with a spatial index. This is the equivalent of plotting out the points in an image for your eyes to scan quickly and eliminate most of the points.

There is a very good indexing scheme used by Google and others to find the nearest point(s), called a Geohash: http://en.wikipedia.org/wiki/Geohash
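At its core, a geohash interleaves the bits of the two coordinates so that points that are close together usually share a long key prefix. Below is a minimal sketch of that interleaving (a Z-order / Morton code with made-up sample points, not the real base32 geohash encoding):

```python
# Sketch of the idea behind a geohash: interleave coordinate bits so that
# nearby points usually end up with a long common key prefix.  Real geohashes
# add a base32 encoding, and queries also check neighbouring cells because the
# "usually" has boundary effects.
def interleave(x, y, bits=16):
    """x, y are floats in [0, 1); returns a 2*bits-bit integer key."""
    xi, yi = int(x * (1 << bits)), int(y * (1 << bits))
    key = 0
    for i in reversed(range(bits)):
        key = (key << 1) | ((xi >> i) & 1)
        key = (key << 1) | ((yi >> i) & 1)
    return key

points = [(0.20, 0.60), (0.21, 0.61), (0.90, 0.10)]
for p in sorted(points, key=lambda p: interleave(*p)):
    print(p, format(interleave(*p), "032b"))
# The two nearby points sort next to each other and share a long bit prefix,
# so a lookup can restrict its attention to keys with a matching prefix.
```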

I think that this will even up the contest in favour of the computer. I was not impressed with some of the answers that used linear thinking.

user3451435

Posted 2014-03-17T06:05:56.817

Reputation: 127

With a spatial index, the problem is still $\Theta(n)$ in the worst case. The spatial index gives you something like $\Theta(\lg n)$ for a random distribution. – Gilles – 2014-03-23T13:44:05.973

The point is that a spatial index makes it roughly as easy as it is for a human looking at a screen littered with dots. – reinierpost – 2014-03-24T22:26:35.853

1

If we consider the case of finding a nearest neighbour in a point set in $d$-dimensional Euclidean space, the complexity is typically bounded by the number of dimensions as it grows large (i.e. larger than the size of the dataset).

P. Agarwal, in a survey on range searching, lists a few logarithmic solutions to the problem. However, as the dimension grows large, the complexity of a query becomes $O(\log^{d-2} n)$. We do not consider the time to read the data or build the data structure, but only the time to query it.

The problem of finding the closest point to a node in a graph has a Euclidean expression whenever the graph can be embedded into Euclidean space with small enough distortion. And when using an adjacency list with weights, we still need to build the adjacency list.

Now your question effectively has a psychological aspect. The closest answer that has been studied is the "neural network" gestaltist perspective on the subject. If you build a neural network that can recognize the closest point to an element, and all the neurons of the network run in parallel, as is usual, then you effectively have $O(1)$ complexity. If our brain effectively functions "like" a neural network, then this may explain why we have the impression of a "gestalt".

user13675

Posted 2014-03-17T06:05:56.817

Reputation: 804

-1

Other answers are good, but how about a zen counter-question that stretches the basic reasoning/premise of the original question to extremes to show some faultiness [and is also the paradox at the core of AI research]:

If I can think with human intelligence, why can't I create a computer that thinks as well as a human?

There are multiple ways to answer your question, but basically, our thought processes and the brain's perceptual capabilities are not necessarily accessible to introspection, and the introspection we do apply to them can be misleading. For example, we can recognize objects, but we have no way to perceive or explain the quasi-algorithmic process that allows this. Also, there are many psychology experiments showing subtle distortions in our perception of reality, and in particular of time; see e.g. time perception.

It is generally thought/conjectured by scientists that the human brain does in fact employ algorithms, but they function differently from computerized ones; there is also a very large amount of parallel processing going on in the brain via neural networks, which cannot be compared sensibly to sequential computer algorithms. In mammals, a significant percentage of the entire brain volume is dedicated to visual processing.

In other words, human brains are in many ways highly optimized visual computers, and in some respects (such as object recognition) their capability currently exceeds that of the world's greatest supercomputers; that is due to the deficiencies, in comparison, of human-constructed software/hardware relative to biology that has been tuned/evolved/optimized over millions of years.

vzn

Posted 2014-03-17T06:05:56.817

Reputation: 9 131

The short answer, though, is that referring to the big-O $O(f(n))$ complexity of anything related to human perception verges on a major misuse of the basic mathematical concept, which measures mathematical functions - i.e. apples and oranges! – vzn – 2014-03-17T23:48:52.090

-2

Generally speaking, you are solving two different problems, and if you compete in the same competition, the complexity will be O(1) for both of you. Why? Let's make the situation a bit simpler - assume that you have a line with one red point and n green points. Your task is to find the green point which is closest to the red point. Everything is on the graph. Now what you do and what your program is doing is basically the same - just "walk away" (in both directions) from the red point and check whether the pixel you are looking at is white/black (background) or green. Now the complexity is O(1).

What's interesting is that some data representations give answers to some questions immediately (O(1)). A basic example is extremely simple - just count the white pixels in a black image (each pixel value is 0 = black or 1 = white). What you need to do is just add up all the pixel values - the complexity is the same for 1 white pixel and for 1000, but it depends on the image size: O(m), where m = image.width * image.height. Is it possible to do it faster? Of course - all we need to do is use a different method of storing the image, namely an integral image (summed-area table). Now calculating the result is O(1) (if you have the integral image already computed). Another way is to just store all white pixels in an array/vector/list/... and count its size - O(1).
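A minimal sketch of the integral-image (summed-area table) trick with NumPy; the random test image is made up, but the query really is four lookups, i.e. O(1), once the O(m) preprocessing pass has been paid:

```python
# Summed-area table (integral image): pay O(width*height) once, then count the
# white pixels in any axis-aligned rectangle with four lookups.
import numpy as np

img = np.random.default_rng(1).random((480, 640)) > 0.99   # made-up binary image

# Integral image, padded with a zero row/column so the query needs no branches:
# I[i, j] = number of white pixels in img[:i, :j].
I = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
I[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def count_white(r0, c0, r1, c1):
    """White pixels in rows r0..r1-1, columns c0..c1-1 - four lookups, O(1)."""
    return I[r1, c1] - I[r0, c1] - I[r1, c0] + I[r0, c0]

print(count_white(0, 0, *img.shape), img.sum())   # whole image: both counts agree
print(count_white(100, 200, 150, 260))            # any sub-rectangle, still O(1)
```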

cyriel

Posted 2014-03-17T06:05:56.817

Reputation: 97

"walking away" on a line is not $O(1)$. The red point can be arbitrarily far from the next green one. Counting the size of an array, vector or list also is not $O(1)$. – FrankW – 2014-03-22T06:19:01.583

@FrankW - so what's the complexity of "walking away"? I'm not trying to say that you are wrong, I just want to know. Counting the size of an array/vector/list - generally an array's size is constant so there is no need to count it; for a vector I'm not sure, I would say it depends on the implementation (but most likely in most implementations it's just a field of the object, so there is no need to count it); for a list you are right, it's not O(1) - my mistake. – cyriel – 2014-03-22T09:48:27.540

If you want to look at every (or every $k$-th for some constant $k$) pixel, it's $O(#pixels)$. array/vector/list: If you want to use a fixed size, you have to count (or know otherwise) the number of items you want to store beforehand. Also, I would not call it "counting", if you just read the number from a variable. – FrankW – 2014-03-23T00:32:52.780