## How is BIP 158 wallet rescanning supposed to work with xpub keys?


Now, let's say we have an HD wallet (BIP32), and/or an xpub key. How would we decide whether to download the block, based on the filter? Do we have to iterate over all possible derivations/addresses (potentially thousands) for each filter? Is there a more efficient solution?


No, you essentially have to do exactly that: test the block filter for each address that might have gotten used by your wallet. That might not be quite as bad as you seem to think, though.

Deterministic wallet backups usually come with a master secret and the derivation path. If you have both these pieces of information, only one derivation path needs to be searched.

Then, your wallet would usually generate a number of addresses in advance as a form of lookahead. The size of this buffer is often referred to as the gap limit. If any of the wallet's addresses appear in a transaction during the rescan, the buffer is refilled with additional new addresses before continuing to scan from that point on. The underlying assumptions are that addresses are generally used in the order they are generated, and that personal use of a wallet does not generate a lot of unused addresses. Therefore, light client wallets often employ a fairly small gap limit by default.
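To make the lookahead concrete, here is a minimal sketch of a gap-limit rescan. Everything in it is a stand-in chosen for illustration: `derive_script` fakes BIP32 derivation with a hash, a block filter is modeled as a plain `frozenset` rather than a real Golomb-coded set (so it has no false positives), and the gap limit of 20 is an assumed default, not a value taken from any particular wallet.

```python
import hashlib

GAP_LIMIT = 20  # assumed default; real wallets choose their own value


def derive_script(index: int) -> bytes:
    """Hypothetical stand-in for BIP32 child derivation plus script building."""
    return hashlib.sha256(b"demo-xpub/0/%d" % index).digest()


def filter_matches_any(block_filter: frozenset, scripts) -> bool:
    """Stand-in for a BIP 158 match-any query against one block filter."""
    return any(script in block_filter for script in scripts)


def rescan(block_filters, blocks, gap_limit=GAP_LIMIT):
    """Scan blocks in order; return (downloaded heights, used address indices)."""
    watched = [derive_script(i) for i in range(gap_limit)]
    index_of = {script: i for i, script in enumerate(watched)}
    used = set()
    downloaded = []

    for height, block_filter in enumerate(block_filters):
        # Skip the download unless the filter matches a watched script.
        if not filter_matches_any(block_filter, watched):
            continue
        downloaded.append(height)

        # Re-check the block after each refill, since newly derived
        # scripts may also appear in this same block.
        progress = True
        while progress:
            progress = False
            for script in blocks[height]:
                i = index_of.get(script)
                if i is not None and i not in used:
                    used.add(i)
                    progress = True
            # Refill so gap_limit unused scripts follow the highest used one.
            while used and len(watched) < max(used) + 1 + gap_limit:
                new_script = derive_script(len(watched))
                index_of[new_script] = len(watched)
                watched.append(new_script)

    return downloaded, sorted(used)
```

Each filter is tested against the whole watched set, so the cost per block grows with the number of watched scripts, not with the number of addresses the xpub could theoretically derive. A payment to an index beyond the current lookahead window is missed, which is the trade-off a small gap limit accepts.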

Some wallet backups additionally store the creation date of the wallet, which allows a rescan to skip all blocks before the corresponding block height.
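As a rough illustration of turning a creation date into a starting height, the sketch below walks a list of block header timestamps and returns the first height worth scanning. The function name, the plain list of timestamps, and the two-hour safety margin are assumptions made for this example; the margin is there because block timestamps are not guaranteed to increase strictly.

```python
SAFETY_MARGIN = 2 * 60 * 60  # assumed margin in seconds


def first_height_to_scan(header_times, birthday, margin=SAFETY_MARGIN):
    """Return the first height whose timestamp is at or after birthday - margin.

    header_times: block header timestamps (Unix seconds), indexed by height.
    birthday: wallet creation time (Unix seconds).
    """
    cutoff = birthday - margin
    for height, timestamp in enumerate(header_times):
        if timestamp >= cutoff:
            return height
    # No block is recent enough, so there is nothing to rescan yet.
    return len(header_times)
```

Filters for heights below the returned value never need to be fetched or tested.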

Ok, so if I get this, given you have the derivation path, eg. m/44/..., you have to check all derivable addresses within that path, up to the gap limit? Or can you stop searching if the first derivable address isn't a hit? – cloudhead – 2020-12-24T23:45:43.523

Since the user might give out the first address and then never get paid to it, that would make it likely that you don't find the funds associated with the wallet. You'd want to derive "the first gap limit" addresses in the path and scan the blocks in order for hits. – Murch – 2020-12-24T23:51:35.537

Gotcha. Re-reading your post makes me understand why it's called a gap-limit, since it's the maximum number of unused addresses between two addresses that are used. Hence every time you get a hit, you look-ahead for another gap-limit addresses. – cloudhead – 2020-12-25T00:09:34.937

Yep, you got it! :) – Murch – 2020-12-25T00:22:38.723