Why do we not combine random number generators?

60

9

There are many applications where a pseudorandom number generator is used. People implement one that they think is great, only to find later that it's flawed. Something like this happened recently with the JavaScript random number generator, and much earlier with RANDU. There are also issues of inappropriate initial seeding for generators such as the Mersenne Twister.

I cannot find examples of anyone combining two or more families of generators with the usual xor operator. If there is sufficient computing power to run things like java.SecureRandom or Twister implementations, why do people not combine them? ISAAC xor XORShift xor RandU should be a fairly good example, one where you can see the weakness of a single generator being mitigated by the others. It should also help with the distribution of numbers into higher dimensions, as the intrinsic algorithms are totally different. Is there some fundamental principle that says they shouldn't be combined?

If you were to build a true random number generator, people would probably advise that you combine two or more sources of entropy. Is my example different?

I'm excluding the common example of several linear feedback shift registers working together as they're from the same family.
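To make the proposal concrete, here is a minimal sketch in C of XORing two generators from different families: Marsaglia's 32-bit xorshift (shift triple 13, 17, 5) and RANDU (multiplier 65539, modulus 2^31). The struct and function names are hypothetical and chosen only for illustration, the two states are seeded separately, and none of this is suitable for cryptography.

```c
#include <assert.h>
#include <stdint.h>

/* Marsaglia's 32-bit xorshift with shifts (13, 17, 5); state must be nonzero. */
uint32_t xorshift32_next(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* RANDU: x <- 65539 * x mod 2^31; state must be odd. */
uint32_t randu_next(uint32_t *state)
{
    *state = (uint32_t)(65539u * *state) & 0x7FFFFFFFu;
    return *state;
}

/* Hypothetical combiner holding one state per generator. */
struct combined_rng {
    uint32_t xorshift_state;
    uint32_t randu_state;
};

/* Advance both generators and XOR their outputs. */
uint32_t combined_next(struct combined_rng *g)
{
    assert(g->xorshift_state != 0u);       /* xorshift gets stuck at zero */
    assert((g->randu_state & 1u) != 0u);   /* RANDU needs an odd state */
    return xorshift32_next(&g->xorshift_state) ^ randu_next(&g->randu_state);
}
```

With an odd seed every RANDU output is odd, so its lowest bit is constant. In the combined output that bit is simply the complement of the xorshift bit, which is the kind of mitigation the question has in mind. RANDU only produces 31 bits, so the top bit of the result comes from xorshift alone.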

Paul Uszak

Posted 2016-05-20T01:32:54.023

Reputation: 585

The answer might depend on the application. What do you want to use the pseudorandom sequence for? – Yuval Filmus – 2016-05-20T11:48:23.180

I think this is a good question. I guess that the XOR product of an "independent" set of non-trivial PRNGs will be at least as hard to predict as the result of any one of them. (But not necessarily any more random, and also dependent on your definition of "independent", and of course slower.) – mwfearnley – 2016-05-20T14:52:45.890

1

Have you found Fortuna (https://en.wikipedia.org/wiki/Fortuna_%28PRNG%29)? It sounds like it's close to what you describe, in that it aggregates various random sources into one.

– Little Code – 2016-05-20T18:05:26.687

I don't want to compete with the good answers here, but one practical reason we don't do this is because doing so makes it very easy to develop the illusion of better random behaviors. If one goes down this path, it becomes very easy to mess up and actually weaken your PRNG. Thus, budding developers are encouraged not to go down this path, simply to avoid putting themselves in difficult positions. – Cort Ammon – 2016-05-20T20:05:46.897

1@LittleCode Actually it sounds different altogether. Fortuna outputs data from a single hash function. It just messes about with a lot of weak entropy collection mechanisms before (re)hashing it through a single output function. My question related to outputting from several functions (why not 10 of them?). If this is a fill device, speed is irrelevant anyway. – Paul Uszak – 2016-05-21T22:15:09.070

1

The late George Marsaglia, a noted researcher in the field of PRNGs who invented multiple new PRNG types such as multiply-with-carry and xor-shift, did precisely this when he proposed the KISS generator in the 1990s, which is a combination of three PRNGs of different types. I have been using KISS successfully for the past twenty years, though not for cryptography, of course. A useful secondary source with regard to KISS is this 2011 paper by Greg Rose, in which he points out an issue with one of the constituent PRNGs that doesn't invalidate the combining concept.

– njuffa – 2016-05-21T22:17:48.347

4Knuth relates that naively combining pseudorandom number generators (using one random number to choose which generator to use) resulted in a function which converges to a fixed value! So, back in the days just before the microcomputer revolution, he warned us never to mix random generators. – JDługosz – 2016-05-22T21:19:02.080

Answers

7

IIRC (and this is from memory), the 1955 RAND bestseller A Million Random Digits did something like this. Before computers were cheap, people picked random numbers out of this book.

The authors generated random bits with electronic noise, but that turned out to be biased (it's hard to make a flip-flop spend exactly equal times on the flip and the flop). However, combining bits made the distribution much more uniform.
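A short calculation shows why combining helps. It assumes that the two bits $b_1, b_2$ being combined are independent and share the same bias $\varepsilon$; that is a simplification for illustration, not a description of the exact procedure used for the book:

$$ \Pr[b_1 = 1] = \Pr[b_2 = 1] = \tfrac12 + \varepsilon \quad\Longrightarrow\quad \Pr[b_1 \oplus b_2 = 1] = 2\left(\tfrac12 + \varepsilon\right)\left(\tfrac12 - \varepsilon\right) = \tfrac12 - 2\varepsilon^2. $$

For example, a source that emits 1 with probability 0.55 ($\varepsilon = 0.05$) yields an XOR that is 1 with probability 0.495, so the bias shrinks from 0.05 to 0.005.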

Amateur

Posted 2016-05-20T01:32:54.023

Reputation: 94

44

Sure, you can combine PRNGs like this, if you want, assuming they are seeded independently. However, it will be slower and it probably won't solve the most pressing problems that people have.

In practice, if you have a requirement for a very high-quality PRNG, you use a well-vetted cryptographic-strength PRNG and you seed it with true entropy. If you do this, your most likely failure mode is not a problem with the PRNG algorithm itself; the most likely failure mode is lack of adequate entropy (or maybe implementation errors). Xor-ing multiple PRNGs doesn't help with this failure mode. So, if you want a very high-quality PRNG, there's probably little point in xor-ing them.

Alternatively, if you want a statistical PRNG that's good enough for simulation purposes, typically the #1 concern is either speed (generate pseudorandom numbers really fast) or simplicity (don't want to spend much development time on researching or implementing it). Xor-ing slows down the PRNG and makes it more complex, so it doesn't address the primary needs in that context, either.

As long as you exhibit reasonable care and competence, standard PRNGs are more than good enough, so there's really no reason why we need anything fancier (no need for xor-ing). If you don't have even minimal levels of care or competence, you're probably not going to choose something complex like xor-ing, and the best way to improve things is to focus on more care and competence in the selection of the PRNG rather than on xor-ing.

Bottom line: Basically, the xor trick doesn't solve the problems people usually actually have when using PRNGs.

D.W.

Posted 2016-05-20T01:32:54.023

Reputation: 83 008

3"lack of adequate entropy ... Xoring multiple PRNGs doesn't help with this" -- indeed it can hinder, since you increase the amount of entropy needed to seed your PRNGs. Which is why you don't want to make it routine practice to combine well-vetted PRNGs, even though it does indeed protect you against one of those well-vetted PRNGs turning out to be complete rubbish (in the implementation you're using). – Steve Jessop – 2016-05-20T12:13:39.163

Another reason is that implementation bugs are far, far, far more common than fundamental problems with algorithms, so the simpler the better. A standard algorithm can at least be tested against another implementation or reference values, a custom-made xor can't. – Gilles – 2016-05-20T20:05:04.370

1@D.W. Why "seeded independently?" Since my question relates to combinations of different families of generators, each family should produce a unique output sequence from identical seeds. For example, java.SecureRandom and RC4 could easily be seeded from the same key, then combined. – Paul Uszak – 2016-05-20T22:57:35.407

In research, "good enough for simulation purposes" is not about simplicity, and not even about speed. It's about how well a particular algorithm works, that is: how well is it understood? How independent do the numbers seem to be? How well do the results of its use match the results of true randomness? – Mołot – 2016-05-21T22:59:33.390

@PaulUszak You have an algorithm, a seed, and a pseudorandom sequence. If you seed them together, you still effectively have one algorithm, even if a more complicated one. It's still one seed to find and one algorithm to "guess" in order to predict the next number. You are not getting any more entropy. You are not making the attacker's life any harder. – Mołot – 2016-05-21T23:03:20.470

1@D.W. The big assumption in your answer is "use a well-vetted cryptographic-strength PRNG". In reality this is practically impossible to ascertain; as with most cryptographic ciphers, hashes and so on, weaknesses are found over time. They were "well-vetted" against the knowledge of yesterday or yesteryear. – Shiv – 2016-05-22T02:38:17.863

@Shiv, my point is that if you use a well-vetted crypto-strength PRNG, algorithmic weaknesses are much rarer than the other kinds of flaws I describe in my answer. – D.W. – 2016-05-22T02:55:26.060

@D.W. I have to admit I cannot understand your twin issues of combined generators being either slower or more prone to bugs. The generator is but a minor bit of code taken from common libraries when compared to the main application. If you're Monte Carlo modelling the derivatives market, predicting next week's weather or simulating a hydrogen bomb explosion, you may have hundreds of lines of code. A xor B xor C xor D seems cheap at twice the price for certainty. Nyet? – Paul Uszak – 2016-05-24T15:47:52.403

1@PaulUszak, I don't think I ever argued that xor-ing two generators makes it more prone to bugs. I'm saying that, if you choose a good PRNG (just one), one of the most likely failure modes is a failure of seeding or an implementation failure, and xor-ing two generators doesn't help with either. (Of course, if the single PRNG doesn't fail, xor-ing two generators isn't useful, either.) So basically it's addressing the wrong problem. In other words, xor-ing generators doesn't increase certainty much, because it doesn't address the most important causes of uncertainty. – D.W. – 2016-05-24T15:55:55.537

18

In fact, something of a breakthrough has just been announced by doing precisely this.

University of Texas computer science professor David Zuckerman and PhD student Eshan Chattopadhyay found that a "high-quality" random number could be generated by combining two "low-quality" random sources.

Here's their paper: Explicit Two-Source Extractors and Resilient Functions

NietzscheanAI

Posted 2016-05-20T01:32:54.023

Reputation: 734

8This is a purely theoretical paper on a different topic which has absolutely no practical relevance, despite the PR efforts by UT. – Yuval Filmus – 2016-05-20T07:36:35.217

4@Yuval Filmus - would you care to expand on that comment? – NietzscheanAI – 2016-05-20T07:37:46.173

8There's a big divide between theory and practice. Usually practitioners don't care about theory, and vice versa. In this case the PR branch of UT decided to latch on an excellent theoretical paper, describing it as practically relevant, which it isn't. The problems considered in the paper are not so interesting from a practical perspective, and have simple solutions that work well enough, though it's impossible to prove that they do. – Yuval Filmus – 2016-05-20T07:42:01.460

2Moreover, this particular paper is just one work in the theoretical area of extractors. You could bill any other paper in the area in the same way. They are all about combining weak sources to create a strong source. The difference is just in the parameters. – Yuval Filmus – 2016-05-20T07:44:13.220

3Finally, the construction in the paper is most probably an overkill, not something you would ever want to implement. Concrete parameters for this type of construction are hard to determine, and they're usually extremely bad, since the papers always focus on the asymptotic regime, and ignore constants. – Yuval Filmus – 2016-05-20T07:46:33.867

1@YuvalFilmus Actually it's not that theoretical. I measure entropy by compressibility, so if a weak source compresses to say 1% of original size, I consider that a 1% source. I have XORed a 3% source with a 0.1% source. You get a surprising 30% resultant source. Now imagine if you combined two or three 95% sources... – Paul Uszak – 2016-05-20T23:06:09.987

2@PaulUszak Well, you're not using their combination method. XORing is not good enough for their theoretical purposes, so they develop a much more sophisticated combining mechanism, which is their real innovation. Combining weak random sources to get a strong one is the goal of the entire field, though in practice it's a solved problem. Your operating system uses such a mechanism to produce random numbers from several weak "natural" sources. – Yuval Filmus – 2016-05-21T07:30:46.430

9

Suppose that $X_1,\ldots,X_n$ is a pseudorandom binary sequence. That is, each $X_i$ is a random variable supported on $\{0,1\}$, and the variables $X_1,\ldots,X_n$ are not necessarily independent. We can think of this sequence being generated in the following way: first we sample a uniformly random key $K$, and then use some function $f(K)$ to generate the pseudorandom sequence.

How do we measure how good the pseudorandom sequence $X_1,\ldots,X_n$ is? While it is possible to measure how good a particular realization is (say using Kolmogorov complexity), here I will concentrate on measures which depend on the entire distribution of the random variable $(X_1,\ldots,X_n)$. One such example is entropy, but we will only require two properties of our measure $L$: (a larger $L(\cdot)$ means a more random sequence)

  • If $y_1,\ldots,y_n$ is a deterministic sequence (i.e., a fixed sequence) then $L(X_1 \oplus y_1, \ldots, X_n \oplus y_n) = L(X_1,\ldots,X_n)$.

  • If $\vec{X^0},\vec{X^1}$ are two independent pseudorandom sequences, $T \in \{0,1\}$ is an independent random bit, and $\vec{Z} = \vec{X^T}$, then $L(\vec{Z}) \geq \min(L(\vec{X^0}),L(\vec{X^1}))$.

The first property means that the measure is invariant under flipping the $i$th bit. The second property means that if we mix two distributions $\vec{X^0},\vec{X^1}$, then the result is at least as good as the worse one.

Any reasonable randomness measure will satisfy the first property. The second property is satisfied by most popular measures such as entropy $H$ and min-entropy $H_\infty$.

We can now state and prove a theorem showing that XORing two pseudorandom sequences is always a good idea.

Theorem. Let $\vec{X},\vec{Y}$ be two independent pseudorandom sequences of the same length, and let $L$ be an admissible randomness measure (one satisfying the two conditions above). Then $$ L(\vec{X} \oplus \vec{Y}) \geq \max(L(X),L(Y)). $$

Proof. Suppose $L(X) \geq L(Y)$. Then $X \oplus Y$ is a mixture of the distributions $X \oplus y$, mixed according to the distribution of $Y$. Since $L(X \oplus y) = L(X)$ and a mixture is at least as good as the worst distribution being mixed, we obtain $L(X \oplus Y) \geq L(X)$. $~\square$

What this theorem means is that if you XOR two pseudorandom sequences generated using two independent keys, the result is always at least as good as the better sequence being XORed, with respect to any admissible randomness measure.
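For min-entropy the theorem can be checked numerically on small examples. The sketch below is an illustration rather than part of the proof: it works with 3-bit outcomes, computes the exact distribution of $X \oplus Y$ for independent $X$ and $Y$, and reports the largest point probability. Min-entropy is $-\log_2$ of that value, so a smaller maximum means a better score. The names `xor_distribution`, `max_prob` and `NVALUES` are hypothetical.

```c
#include <assert.h>

#define NVALUES 8   /* 3-bit outcomes 0..7 */

/* Exact distribution of X xor Y for independent X ~ px and Y ~ py. */
void xor_distribution(const double px[NVALUES], const double py[NVALUES],
                      double out[NVALUES])
{
    for (int z = 0; z < NVALUES; z++) {
        out[z] = 0.0;
        for (int x = 0; x < NVALUES; x++) {
            /* y = x xor z is the only y with x xor y == z */
            out[z] += px[x] * py[x ^ z];
        }
    }
}

/* Largest point probability; min-entropy is -log2 of this value. */
double max_prob(const double p[NVALUES])
{
    double m = 0.0;
    for (int i = 0; i < NVALUES; i++) {
        assert(p[i] >= 0.0);   /* probabilities cannot be negative */
        if (p[i] > m) {
            m = p[i];
        }
    }
    return m;
}
```

Taking $X$ uniform on $\{0,1\}$ and $Y$ uniform on $\{0,1,2,3\}$, the XOR is uniform on $\{0,1,2,3\}$: its largest probability is 0.25, matching the better of the two inputs, as the theorem predicts.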

In practice, in order to use two independent keys, we probably expand one key to two keys in a pseudorandom fashion. The two keys are then not independent. However, if we use an "expensive" way to expand the one key into two keys, we expect the resulting two keys to "look" independent, and so for the theorem to hold "morally". In theoretical cryptography there are ways of making this statement precise.


Should we, then, XOR two pseudorandom number generators? If we are not restricted by speed, then that's certainly a good idea. But in practice we have a speed limit. We can then ask the following question. Suppose that we are given two PRNGs, each with a parameter $T$ that controls the running time (and so the strength) of the generator. For example, $T$ could be the length of an LFSR, or the number of rounds. Suppose we use one PRNG with parameter $T_1$, the other with parameter $T_2$, and XOR the result. We can assume that $T_1 + T_2 = t$, so that the total running time is constant. What is the best choice of $T_1,T_2$? Here there is a tradeoff which is hard to answer in general. It may be that the setting $(t/2,t/2)$ is much worse than either $(t,0)$ or $(0,t)$.

The best advice here is to stick to a popular PRNG which is considered strong. If you can spare more time for generating your sequence, XOR several copies, using independent keys (or keys generated by expanding a single key using an expensive PRNG).

Yuval Filmus

Posted 2016-05-20T01:32:54.023

Reputation: 167 283

Comments are not for extended discussion; this conversation has been moved to chat. Once you come to a constructive end, please edit the answer to incorporate the results of your discussion.

– Raphael – 2016-05-21T10:19:26.403

4

I'll give this a shot, since I'm sufficiently disturbed by the advice given in some of the other answers.

Let $\vec{X},\vec{Y}$ be infinite bit sequences generated by two RNGs (not necessarily PRNGs which are deterministic once initial state is known), and we're considering the possibility of using the sequence $\vec{X} \oplus \vec{Y}$ with the hope of improving behavior in some sense. There are lots of different ways in which $\vec{X} \oplus \vec{Y}$ could be considered better or worse compared to each of $\vec{X}$ and $\vec{Y}$; here are a small handful that I believe are meaningful, useful, and consistent with normal usage of the words "better" and "worse":

  • (0) Probability of true randomness of the sequence increases or decreases
  • (1) Probability of observable non-randomness increases or decreases (with respect to some observer applying some given amount of scrutiny, presumably)
  • (2) Severity/obviousness of observable non-randomness increases or decreases.

First let's think about (0), which is the only one of the three that has any hope of being made precise. Notice that if, in fact, either of the two input RNGs really is truly random, unbiased, and independent of the other, then the XOR result will be truly random and unbiased as well. With that in mind, consider the case when you believe $\vec{X},\vec{Y}$ to be truly random unbiased isolated bit streams, but you're not completely sure. If $\varepsilon_X,\varepsilon_Y$ are the respective probabilities that you're wrong about each of them (and those two errors are independent), then the probability that $\vec{X} \oplus \vec{Y}$ is not-truly-random is then $\leq \varepsilon_X \varepsilon_Y \lt \min\{\varepsilon_X,\varepsilon_Y\}$, in fact much less since $\varepsilon_X,\varepsilon_Y$ are assumed very close to 0 ("you believe them to be truly random"). And in fact it's even better than that, when we also take into account the possibility of $\vec{X},\vec{Y}$ being truly independent even when neither is truly random: $$ \begin{eqnarray*} \Pr(\vec{X} \oplus \vec{Y} \mathrm{\ not\ truly\ random}) \leq \min\{&\Pr(\vec{X} \mathrm{\ not\ truly\ random}), \\ &\Pr(\vec{Y} \mathrm{\ not\ truly\ random}), \\ &\Pr(\vec{X},\vec{Y} \mathrm{\ dependent})\}. \end{eqnarray*} $$ Therefore we can conclude that in sense (0), XOR can't hurt, and could potentially help a lot.

However, (0) isn't interesting for PRNGs, since in the case of PRNGs none of the sequences in question have any chance of being truly random.

Therefore for this question, which is in fact about PRNGs, we must be talking about something like (1) or (2). Since those are in terms of properties and quantities like "observable", "severe", "obvious", "apparent", we're now talking about Kolmogorov complexity, and I'm not going to try to make that precise. But I will go so far as to make the hopefully uncontroversial assertion that, by such a measure, "01100110..." (period=4) is better than "01010101..." (period=2), which is better than "00000000..." (constant).

Now, one might guess that (1) and (2) will follow the same trend as (0), and that therefore the conclusion "XOR can't hurt" might still hold. However, note the significant possibility that neither $\vec{X}$ nor $\vec{Y}$ was observably non-random, but that correlations between them cause $\vec{X} \oplus \vec{Y}$ to be observably non-random. The most severe case of this, of course, is when $\vec{X} = \vec{Y}$ (or $\vec{X} = \mathrm{not}(\vec{Y})$), in which case $\vec{X} \oplus \vec{Y}$ is constant, the worst of all possible outcomes; in general, it's easy to see that, regardless of how good $\vec{X}$ and $\vec{Y}$ are, $\vec{X}$ and $\vec{Y}$ need to be "close" to independent in order for their xor to be not-observably-nonrandom. In fact, being not-observably-dependent can reasonably be defined as $\vec{X} \oplus \vec{Y}$ being not-observably-nonrandom.

Such surprise dependence turns out to be a really big problem.


An example of what goes wrong

The question states "I'm excluding the common example of several linear feedback shift registers working together as they're from the same family". But I'm going to exclude that exclusion for the time being, in order to give a very simple clear real-life example of the kind of thing that can go wrong with XORing.

My example will be an old implementation of rand() that was on some version of Unix circa 1983. IIRC, this implementation of the rand() function had the following properties:

  • the value of each call to rand() was 15 pseudo-random bits, that is, an integer in range [0, 32767].
  • successive return values alternated even-odd-even-odd; that is, the least-significant-bit alternated 0-1-0-1...
  • the next-to-least-significant bit had period 4, the next after that had period 8, ... so the highest-order bit had period $2^{15}$.
  • therefore the sequence of 15-bit return values of rand() was periodic with period $2^{15}$.

I've been unable to locate the original source code, but I'm guessing from piecing together a couple of posts in https://groups.google.com/forum/#!topic/comp.os.vms/9k4W6KrRV3A that it did precisely the following (C code), which agrees with my memory of the properties above:

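#define RAND_MAX 32767
static unsigned int next = 1;   /* generator state; assumed to be 32 bits wide */
int rand(void)
{
    /* linear congruential step; the arithmetic wraps modulo 2^32 */
    next = next * 1103515245 + 12345;
    /* return only the low 15 bits of the state */
    return (next & RAND_MAX);
}
void srand(seed)
unsigned int seed;
{
    next = seed;
}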

As one might imagine, trying to use this rand() in various ways led to an assortment of disappointments.

For example, at one point I tried simulating a sequence of random coin flips by repeatedly taking:

rand() & 1

i.e. the least significant bit. The result was simple alternation heads-tails-heads-tails. That was hard to believe at first (must be a bug in my program!), but after I convinced myself it was true, I tried using the next-least-significant bit instead. That's not much better, as noted earlier-- that bit is periodic with period 4. Continuing to explore successively higher bits revealed the pattern I noted earlier: that is, each next higher-order bit had twice the period of the previous, so in this respect the highest-order bit was the most useful of all of them. Note however that there was no black-and-white threshold "bit $i$ is useful, bit $i-1$ is not useful" here; all we can really say is the numbered bit positions had varying degrees of usefulness/uselessness.
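The bit-period pattern can be reproduced with a small sketch. It assumes the reconstruction of rand() given earlier is accurate and that the state is 32 bits wide; `old_rand_next` and `bit_repeats_every` are hypothetical names introduced here so as not to clash with the C library's rand().

```c
#include <assert.h>
#include <stdint.h>

/* One step of the reconstructed rand(): 32-bit state, 15-bit output. */
uint32_t old_rand_next(uint32_t *state)
{
    *state = (uint32_t)(*state * 1103515245u + 12345u);
    return *state & 0x7FFFu;
}

/*
 * Returns 1 if output bit `bit` has the same value at positions i and
 * i + lag for the first n positions i, and 0 otherwise.
 */
int bit_repeats_every(uint32_t seed, unsigned bit, unsigned lag, unsigned n)
{
    assert(bit < 15u);                 /* outputs are only 15 bits wide */
    uint32_t lead = seed;
    uint32_t trail = seed;
    for (unsigned i = 0; i < lag; i++) {
        old_rand_next(&lead);          /* advance one copy by `lag` draws */
    }
    for (unsigned i = 0; i < n; i++) {
        uint32_t a = old_rand_next(&trail);
        uint32_t b = old_rand_next(&lead);
        if (((a ^ b) >> bit) & 1u) {
            return 0;                  /* the two draws disagree in this bit */
        }
    }
    return 1;
}
```

For bit k (counting from 0 at the least significant end), a lag of 2^(k+1) always reports a repeat, while a lag of 2^k never does, because that bit flips every 2^k draws. For k = 0 this is the heads-tails alternation described above.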

I also tried things like scrambling the results further, or XORing together values returned from multiple calls to rand(). XORing pairs of successive rand() values was a disaster, of course-- it resulted in all odd numbers! For my purposes (namely producing an "apparently random" sequence of coin flips), the constant-parity result of the XOR was even worse than the alternating even-odd behavior of the original.

A slight variation puts this into the original framework: that is, let $\vec{X}$ be the sequence of 15-bit values returned by rand() with a given seed $s_X$, and $\vec{Y}$ the sequence from a different seed $s_Y$. Again, $\vec{X} \oplus \vec{Y}$ will be a sequence of either all-even or all-odd numbers, which is worse than the original alternating even/odd behavior.
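Both XOR experiments are easy to replay. The sketch below again assumes the reconstructed generator with a 32-bit state, and the function names are hypothetical. It counts how many XOR results are odd, first for successive outputs of one stream and then for two streams started from different seeds.

```c
#include <assert.h>
#include <stdint.h>

/* One step of the reconstructed rand(): 32-bit state, 15-bit output. */
uint32_t old_rand_next(uint32_t *state)
{
    *state = (uint32_t)(*state * 1103515245u + 12345u);
    return *state & 0x7FFFu;
}

/* Number of odd values among n XORs of successive outputs of one stream. */
unsigned count_odd_successive_xor(uint32_t seed, unsigned n)
{
    assert(n > 0u);   /* need at least one pair for the count to mean anything */
    unsigned odd = 0;
    uint32_t prev = old_rand_next(&seed);
    for (unsigned i = 0; i < n; i++) {
        uint32_t cur = old_rand_next(&seed);
        odd += (prev ^ cur) & 1u;
        prev = cur;
    }
    return odd;
}

/* Number of odd values among n XORs of two streams with separate seeds. */
unsigned count_odd_two_seed_xor(uint32_t seed_x, uint32_t seed_y, unsigned n)
{
    assert(n > 0u);
    unsigned odd = 0;
    for (unsigned i = 0; i < n; i++) {
        odd += (old_rand_next(&seed_x) ^ old_rand_next(&seed_y)) & 1u;
    }
    return odd;
}
```

The first count always equals n, since every XOR of neighbouring outputs is odd. The second is either 0 or n: all results are even when the two seeds have the same parity and all are odd when they differ.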

In other words, this is an example where XOR made things worse in the sense of (1) and (2), by any reasonable interpretation. It's worse in several other ways as well:

  • (3) The XORed least-significant-bit is obviously biased, i.e. has unequal frequencies of 0's and 1's, unlike any numbered bit position in either of the inputs which are all unbiased.
  • (4) In fact, for every bit position, there are pairs of seeds for which that bit position is biased in the XOR result, and for every pair of seeds, there are (at least 5) bit positions that are biased in the XOR result.
  • (5) The period of the entire sequence of 15-bit values in the XOR result is either 1 or $2^{14}$, compared to $2^{15}$ for the originals.

None of (3),(4),(5) is obvious, but they are all easily verifiable.
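As one way of doing that verification for (5), here is a brute-force sketch that measures the period of the XORed stream for a given pair of seeds. It relies on the reconstructed generator with an assumed 32-bit state, and `xor_stream_period` is a hypothetical helper. Because each input stream repeats after $2^{15}$ draws, the period of the XOR must divide $2^{15}$, so only powers of two need to be tried.

```c
#include <assert.h>
#include <stdint.h>

#define FULL_PERIOD 32768u   /* 2^15: period of each 15-bit input stream */

/* One step of the reconstructed rand(): 32-bit state, 15-bit output. */
uint32_t old_rand_next(uint32_t *state)
{
    *state = (uint32_t)(*state * 1103515245u + 12345u);
    return *state & 0x7FFFu;
}

/* Smallest p such that Z[i + p] == Z[i] for all i, where Z = X xor Y. */
uint32_t xor_stream_period(uint32_t seed_x, uint32_t seed_y)
{
    static uint16_t z[FULL_PERIOD];
    for (uint32_t i = 0; i < FULL_PERIOD; i++) {
        z[i] = (uint16_t)(old_rand_next(&seed_x) ^ old_rand_next(&seed_y));
    }
    /* Sanity check: both inputs repeat after 2^15 draws, so Z must too. */
    assert(z[0] == (uint16_t)(old_rand_next(&seed_x) ^ old_rand_next(&seed_y)));

    /* The true period divides 2^15, so it is a power of two. */
    for (uint32_t p = 1; p < FULL_PERIOD; p *= 2) {
        int repeats = 1;
        for (uint32_t i = 0; i < FULL_PERIOD && repeats; i++) {
            repeats = (z[i] == z[(i + p) % FULL_PERIOD]);
        }
        if (repeats) {
            return p;
        }
    }
    return FULL_PERIOD;
}
```

Identical seeds give period 1, since the XOR is constantly zero, and so do seeds that differ by $2^{14}$, where the XOR is constantly 0x4000. Seeds that differ by $2^{13}$ give period $2^{14}$.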


Finally, let's consider re-introducing the prohibition of PRNGs from the same family. The problem here, I think, is that it's never really clear whether two PRNGs are "from the same family", until/unless someone starts using the XOR and notices (or an attacker notices) things got worse in the sense of (1) and (2), i.e. until non-random patterns in the output cross the threshold from not-noticed to noticed/embarrassing/disastrous, and at that point it's too late.

I'm alarmed by other answers here which give unqualified advice "XOR can't hurt" on the basis of theoretical measures which appear to me to do a poor job of modelling what most people consider to be "good" and "bad" about PRNGs in real life. That advice is contradicted by clear and blatant examples in which XOR makes things worse, such as the rand() example given above. While it's conceivable that relatively "strong" PRNGs could consistently display the opposite behavior when XORed to that of the toy PRNG that was rand(), thereby making XOR a good idea for them, I've seen no evidence in that direction, theoretical or empirical, so it seems unreasonable to me to assume that happens.

Personally, having been bitten by surprise by XORing rand()s in my youth, and by countless other assorted surprise correlations throughout my life, I have little reason to think the outcome will be different if I try similar tactics again. That is why I, personally, would be very reluctant to XOR together multiple PRNGs unless very extensive analysis and vetting has been done to give me some confidence that it might be safe to do so for the particular RNGs in question. As a potential cure for when I have low confidence in one or more of the individual PRNGs, XORing them is unlikely to increase my confidence, so I'm unlikely to use it for such a purpose. I imagine the answer to your question is that this is a widely held sentiment.

Don Hatch

Posted 2016-05-20T01:32:54.023

Reputation: 179

So how do you explain A5/1 usage by literally billions of people? – Paul Uszak – 2016-05-25T11:37:54.417

@PaulUszak I have no idea. Does A5/1 being used by billions of people contradict something I said? – Don Hatch – 2016-05-25T13:55:15.897

It's three prngs (actually from the same family) xored together to form a better one in the way that disturbs and alarms you... – Paul Uszak – 2016-05-26T01:14:12.437

What I'm disturbed and alarmed by is the unqualified advice "if you're not sure, go ahead and XOR together a bunch of RNGs; it can't make things worse". I didn't mean to say or imply that XOR is bad in all cases, and I don't have any opinion at all about A5/1 or the use of XOR in it. Would it help if I change my final silly summary statement to make this clearer? – Don Hatch – 2016-05-26T02:03:18.180

1I replaced the simplistic "just say no to XORing RNGs" at the end with something more real and hopefully less misleading. – Don Hatch – 2016-05-26T02:32:25.883

1

DISCLAIMER: This answer is strictly about "Why are we not doing it" and not "here's a mathematical proof why it can or can't work". I don't claim that XOR introduces (or doesn't introduce) any cryptographic vulnerabilities. My point is only that experience shows us that even the simplest schemes almost always introduce unforeseen consequences, and this is why we avoid them.

"Randomness" is just the tip of the iceberg when it comes to RNGs and PRNGs. There are other qualities that are important, e.g. uniformity.

Imagine a common die, which is quite a good RNG on its own. But now let's say you need a 1-5 range instead of 1-6. The first thing that comes to mind is to simply erase the 6 face and replace it with an extra 1. The "randomness" remains (results are still truly random), but uniformity suffers greatly: now 1 is twice as likely as each of the other outcomes.

Combining results from multiple RNGs is a similarly slippery slope. E.g. simply adding two dice throws completely wipes out any uniformity, as "7" is now 6 times more likely than "2" or "12". I agree that XOR looks better than addition at first glance, but in PRNGs nothing turns out as it looks at first glance.
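The dice arithmetic can be tallied directly. The sketch below counts the ways each sum of two dice can occur and, for comparison, the ways each XOR of two 3-bit values can occur. It assumes fair, independent inputs in both cases, and the function names are hypothetical.

```c
#include <assert.h>

/* ways[s] = number of rolls (a, b) of two dice with a + b == s, s in 2..12. */
void dice_sum_counts(unsigned ways[13])
{
    unsigned total = 0;
    for (int s = 0; s < 13; s++) {
        ways[s] = 0;
    }
    for (int a = 1; a <= 6; a++) {
        for (int b = 1; b <= 6; b++) {
            ways[a + b]++;
            total++;
        }
    }
    assert(total == 36u);   /* 6 * 6 equally likely rolls */
}

/* ways[v] = number of pairs (a, b) of 3-bit values with (a ^ b) == v. */
void xor_counts(unsigned ways[8])
{
    unsigned total = 0;
    for (int v = 0; v < 8; v++) {
        ways[v] = 0;
    }
    for (int a = 0; a < 8; a++) {
        for (int b = 0; b < 8; b++) {
            ways[a ^ b]++;
            total++;
        }
    }
    assert(total == 64u);   /* 8 * 8 equally likely pairs */
}
```

The sum 7 can be rolled 6 ways, while 2 and 12 can each be rolled only 1 way. Every XOR value, by contrast, arises from exactly 8 pairs, so XOR keeps uniform independent inputs uniform. That says nothing about inputs that are correlated or non-uniform.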

This is why we tend to stick to known implementations: someone spent loads of time and money on researching them, and all the shortcomings are well known, understood and can be worked around. When you roll your own, you potentially create vulnerabilities, and you should put in similar effort to prove it sound. As the dice addition example shows, combining can be not much different from creating a new one from scratch.

Security is a chain, only as strong as its weakest component. A rule of thumb in security: whenever you combine 2 things, you usually get a sum of flaws, not a sum of strengths.

Agent_L

Posted 2016-05-20T01:32:54.023

Reputation: 223

6Strongly disagree. If you XOR a truly random sequence with an arbitrary sequence, you still get a truly random sequence. Similarly, if you XOR two independent pseudorandom sequences (i.e., generated with different keys), you get something at least as strong as each one individually. – Yuval Filmus – 2016-05-20T10:28:46.767

3This seems wrong to me. The usual case here is that I think I have two very high quality RNGs producing essentially truly random bits, but there's a tiny chance epsilon that I might be (perhaps grossly) mistaken about one (or, much less likely, both) of them. If I xor them together, as long as I'm right about at least one of them, the result will be truly random, and I'm good. So by combining them I've reduced my chance of having a bad RNG from roughly epsilon/2 to extremely tiny epsilon^2, which is definitely a win. I suspect similar dynamics hold even in less cut-and-dried cases. – Don Hatch – 2016-05-20T10:30:07.230

@YuvalFilmus Random - yes. Uniform - probably not. The point is that you have to prove it, not just "feel" it. – Agent_L – 2016-05-20T10:31:18.893

2I'm still not convinced. When I wrote "truly random" I meant "uniformly random". If you XOR a uniformly random sequence with an arbitrary sequence, you get a uniformly random sequence. – Yuval Filmus – 2016-05-20T10:32:38.657

I suppose there might be two potential flaws in my reasoning though: (1) if these are pseudo-random generators, even if they are both excellent, the deterministic process driving both may be correlated. E.g. in the extreme case, they are both the same or one is a trivial transform of the other, in which case their XOR will be zero or something with very low entropy. (2) In the supposedly very-unlikely epsilon-squared case that I'm wrong about both of them, i.e. both low quality, their XOR might expose the error in a far more prominent and embarrassing way than either of them individually. – Don Hatch – 2016-05-20T10:37:56.197

@YuvalFilmus I think I need to clarify it more that my answer is strictly about "why we don't use", and not "XOR is proven to be bad" (because I can't prove it). – Agent_L – 2016-05-20T10:42:48.017

1@Agent_L Assuming that the two keys being used to generate the two pseudorandom sequences are independent, it's hard to see how XORing makes things worse. If you fix one of the keys, you still get the other pseudorandom sequence XORed by some fixed sequence. Does a pseudorandom sequence get worse if you XOR it to a fixed sequence? – Yuval Filmus – 2016-05-20T10:46:23.560

@YuvalFilmus you said "if you XOR two independent pseudorandom sequences (i.e., generated with different keys)". I'm not sure what you mean by different keys. Suppose the two RNGs use the same algorithm, but starting with different seeds. Would that qualify? If so, I don't think I'd trust that at all: I'd suspect it's likely to grossly magnify slight correlations between, say, the sequence Xi and X(i+100) that would otherwise not be a problem. In other words, this is definitely not an example of independent sources. – Don Hatch – 2016-05-20T10:48:18.043

2@DonHatch Certainly, that would qualify. Let's say that your PRNG generates a sequence of length 100, then a noisy version of the same sequence, and so on. Suppose the bitwise correlation of the second copy with the first is $\Pr[X_{i+100} = X_i] = (1+\epsilon)/2$. The XORed sequence $Z_i = X_i \oplus Y_i$ satisfies $\Pr[Z_{i+100} = Z_i] = (1+\epsilon^2)/2$. Since $\epsilon^2 \leq |\epsilon|$, it's fair to say that the correlations have not been "grossly magnified", but rather grossly reduced. – Yuval Filmus – 2016-05-20T10:53:41.497

3@YuvalFilmus You are probably correct that the correlation between item i and item i+100 got grossly reduced, but that's not the point. For a very specific and real-life example: I remember the old crappy rand() implementation on unix had periodic behavior in the lowest-order bit of each 31-bit integer returned, which most people didn't notice. Xor that sequence of ints with a shifted copy of itself (which is what you get when you use a different seed) with an unfortunate shift size, and you'll get all even numbers. That's much worse than the problem in the original sequence, for most purposes. – Don Hatch – 2016-05-20T11:09:38.143

2@DonHatch You are confusing the complexity of a given sequence with the complexity of the random distribution. What you say might be true for the complexity of a single realization, but definitely not for the complexity of the random distribution. See my answer for details. – Yuval Filmus – 2016-05-20T11:17:41.293

@YuvalFilmus I don't think I'm confusing anything, but of course that could be my confusion talking. :-) Anyway I replied to your answer, see what you think. – Don Hatch – 2016-05-20T11:33:24.637

1@DonHatch You're right, there's no confusion, just different definitions. It seems much harder to capture your type of definition formally. – Yuval Filmus – 2016-05-20T11:36:44.823

Yuval and Don, please move the discussion to chat. Agent_L, you may want to join them. Please edit the question to incorporate any insight the discussion leads to

– Raphael – 2016-05-21T10:22:54.560