A *perfect hash function* is an injective function from a set $S$ into a set of integers $\{0, 1, 2, \ldots, n\}$. If a perfect hash function exists for your data and storage needs, you can easily get $O(1)$ behavior. For instance, a hash table gives $O(1)$-per-query performance for the following task: given an array $l$ of integers and a set $S$ of integers, determine, for each $x \in S$, whether $l$ contains $x$. A pre-processing step builds a hash table from $l$ in $O(|l|)$; each element of $S$ is then checked against it in $O(|S|)$, for $O(|l| + |S|)$ altogether. A naive implementation using linear search would be $O(|l||S|)$; with binary search you can do $O(|S|\log|l|)$. (Note that this solution uses $O(|l|)$ space, since the hash table must map distinct integers in $l$ to distinct bins.)
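As a concrete sketch (function and variable names here are mine, not part of the question), a Python `set` plays the role of the hash table: building it is expected $O(|l|)$ and each membership query is expected $O(1)$, giving $O(|l| + |S|)$ overall.

```python
def contained_in(l, S):
    """For each x in S, report whether x occurs in the list l."""
    seen = set(l)                     # pre-processing: hash every element of l, O(|l|) expected
    return {x: x in seen for x in S}  # one O(1)-expected lookup per element of S

print(contained_in([4, 8, 15, 16, 23, 42, 8], {8, 9, 42}))
```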

EDIT: To clarify on how the hash table is generated in $O(|l|)$:

The list $l$ contains integers from a finite set $U \subset \mathbb{N}$, possibly with repeats, and $S \subseteq U$. We want to determine, for each $x \in S$, whether $x$ appears in $l$. To do so, we pre-compute a lookup table for the elements of $l$. The table encodes a function $h: U \rightarrow \{\text{true}, \text{false}\}$: initialize $h(x) = \text{false}$ for all $x \in U$, then scan linearly through the elements $y$ of $l$, setting $h(y) = \text{true}$. The scan takes $O(|l|)$ time (after an $O(|U|)$ initialization of the table), and the table takes $O(|U|)$ space.
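The construction above can be sketched as a direct-address table. Here I assume, for illustration, that $U = \{0, 1, \ldots, u_{\max}\}$, so the table is just a boolean array indexed by the integers themselves:

```python
def build_table(l, u_max):
    """Lookup table h over U = {0, ..., u_max}: h[x] is True iff x occurs in l."""
    h = [False] * (u_max + 1)  # h(x) = false for all x in U
    for y in l:                # linear scan through l: O(|l|)
        h[y] = True            # set h(y) = true
    return h                   # O(|U|) space

h = build_table([2, 3, 5, 7, 3], u_max=10)
print(h[5], h[6])
```

Each query is then a single array access, i.e. $O(1)$ with no hashing at all.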

Notice that my original analysis assumed that $l$ contained $\Omega(|U|)$ distinct elements. If it contains fewer (say, only $O(1)$ distinct elements), the space requirement may exceed $O(|l|$), although it is never more than $O(|U|)$.

EDIT2: The hash table can be stored as a simple array, with the identity function on $U$ as the hash function; the identity is trivially a perfect hash function. Note that $h$ above is the hash table itself, and it encodes a separate function (membership in $l$). I am being sloppy/confused in some of the above, but will try to improve it soon.

@Raphael I would be very interested in an answer that explains (along broad lines) when I can count on $O(1)$ amortized and when I can't. As for how the hash values are distributed, that's part of my question really: how can I know? I know hash functions are supposed to distribute values well; but if they always did, the worst case would never be reached, which doesn't make sense. – Gilles – 2012-03-15T20:52:25.743

Also be careful of premature optimization; for smallish (several thousand elements) data I have often seen $O(\log n)$ balanced binary trees outperform hash tables due to lower overhead (string comparisons are vastly cheaper than string hashes). – isturdy – 2013-05-06T12:59:39.873

Can you live with $O(1)$ amortised access time? In general, hash table performance will heavily depend on how much overhead for sparse hash tables you are prepared to tolerate and on how the actual hash values are distributed. – Raphael – 2012-03-12T19:31:07.927

Oh, btw: you can avoid linear worst-case behaviour by using (balanced) search trees instead of lists. – Raphael – 2012-03-12T19:31:54.027
