How do you make a carrier better for vocoding (with a plugin)?


There are a lot of synths that sound smooth and well rounded, but as soon as I try to use them as a carrier for a vocoder, they sound like complete crap. What gives? How can I improve them? What makes a good vocoder carrier?


Speech is anything but "well-rounded". Vowels start at the vocal cords with basically the most overtone-rich signal possible, a pulse train. If you look at the frequency content of such a pulse train, its Fourier transform is also a pulse train, with constant-height peaks at the multiples of the fundamental frequency. This is the "carrier" of the natural vocoder that you yourself are: the pulse train is then shaped in the frequency domain, with intentional attenuation and resonances, by the vocal tract. The first few resonance peaks (named "formants") carry most of the vowel content; the shaping higher up is characteristic of the sound quality of the speech.
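The claim about the pulse train's spectrum is easy to check numerically. The following is a minimal sketch using NumPy; the function name `pulse_train_spectrum` and the period and length values are assumptions chosen for illustration (at a 48kHz sample rate, a period of 100 samples would correspond to a 480Hz fundamental).

```python
import numpy as np

def pulse_train_spectrum(period=100, n_periods=50):
    """Magnitude spectrum of an ideal pulse train.

    One unit sample every `period` samples, `n_periods` cycles in total
    (both values are illustrative assumptions).
    """
    x = np.zeros(period * n_periods)
    x[::period] = 1.0
    return np.abs(np.fft.rfft(x))

mag = pulse_train_spectrum()
peaks = np.nonzero(mag > 1e-6)[0]   # bins that actually carry energy

print("spacing between spectral lines (bins):", np.unique(np.diff(peaks)))
print("spread of line heights:", np.ptp(mag[peaks]))
```

The spectral lines are evenly spaced and all the same height, which is what makes a pulse train such a rich excitation: every harmonic is available at full strength for the vocal tract (or a vocoder's filter bank) to sculpt.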

For consonants, you are working either with plosives at various points in the vocal tract (basically single pulses) or with fricatives, again at various points in the vocal tract (basically white noise, energy distribution across frequency similar to pulses but without their phase correlation).

Now when vocoding, you basically extract the mouth shape, in the form of the formants and some of the voice's sound quality, as a filter and supply this "shape" with a new excitation. So you need either a sound with a comparable fundamental frequency and pulse-train characteristics, or pulses or noise (vocoders tend to have special signal paths for consonants so that you don't end up with a complete hot-potato vowels-only result).
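To make "extract the shape, supply a new excitation" concrete, here is a deliberately crude sketch of a channel vocoder built on short-time FFTs with NumPy. It is not how any particular plugin works: the function `band_vocoder`, the equal-width bands, the frame sizes, and the 16kHz test signals are all assumptions for illustration, and the output gain is arbitrary. The modulator is a single tone at 2250Hz standing in for a formant, and the two carriers are a 200Hz sine and a naive 200Hz sawtooth.

```python
import numpy as np

def band_vocoder(modulator, carrier, n_fft=1024, hop=256, n_bands=16):
    """Scale each carrier band by the modulator's level in that band."""
    n = min(len(modulator), len(carrier))
    window = np.hanning(n_fft)
    # Equal-width bands over the rfft bins (an assumption; real vocoders
    # usually space their bands non-uniformly).
    edges = np.linspace(0, n_fft // 2 + 1, n_bands + 1).astype(int)
    out = np.zeros(n)
    for start in range(0, n - n_fft + 1, hop):
        m_spec = np.fft.rfft(window * modulator[start:start + n_fft])
        c_spec = np.fft.rfft(window * carrier[start:start + n_fft])
        for lo, hi in zip(edges[:-1], edges[1:]):
            # Band "envelope": RMS magnitude of the modulator in this band.
            level = np.sqrt(np.mean(np.abs(m_spec[lo:hi]) ** 2)) / window.sum()
            c_spec[lo:hi] *= level
        # Overlap-add the shaped carrier frame.
        out[start:start + n_fft] += window * np.fft.irfft(c_spec, n_fft)
    return out

def rms(x):
    return np.sqrt(np.mean(x ** 2))

FS = 16000                                   # assumed sample rate in Hz
n = np.arange(FS)                            # one second of samples
modulator = np.sin(2 * np.pi * 2250 * n / FS)   # stand-in "formant"
sine_carrier = np.sin(2 * np.pi * 200 * n / FS)
saw_carrier = 2.0 * ((n % 80) / 80) - 1.0       # naive 200 Hz sawtooth

sine_out = band_vocoder(modulator, sine_carrier)
saw_out = band_vocoder(modulator, saw_carrier)
print("sine carrier output RMS:", rms(sine_out))
print("saw carrier output RMS: ", rms(saw_out))
```

The sine carrier has nothing in the band where the modulator lives, so almost nothing comes out; the sawtooth has harmonics there, so the "formant" gets impressed on it.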

So most importantly, you are looking for a suitable carrier to work with. "Smooth and well-rounded" isn't it, because its frequency content falls off too fast. You want a fundamental frequency similar to that of the original voice, and an overtone-rich sound with pulse-train qualities. That means looking in the direction of brass, saxophone, or accordion without a tone chamber (a tone chamber already sculpts its own formants).

For purely synthesized sounds, you'll be better off with harsher-sounding wave shapes since the filter banks will have more material to work with then.

If you use smooth sounds with few overtones, vowels will have no chance to get recognizably impressed on them and you'll basically get a constant "uh" sound of varying volume with whatever consonants the vocoder is prepared to pass through with some different mechanism.
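To put rough numbers on how quickly smooth waveforms run out of overtones, the sketch below compares how much of each waveform's energy lies above roughly the tenth harmonic. The 48kHz sample rate, the 100Hz fundamental, the 1050Hz cutoff, and the helper name `energy_fraction_above` are all assumptions for illustration; the waveforms are naive (not band-limited), which is good enough for a comparison.

```python
import numpy as np

FS = 48000      # sample rate in Hz (assumed)
PERIOD = 480    # samples per cycle, i.e. a 100 Hz fundamental

def energy_fraction_above(x, cutoff_hz, fs=FS):
    """Fraction of the signal's (DC-free) energy above cutoff_hz."""
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return power[freqs > cutoff_hz].sum() / power.sum()

n = np.arange(FS)                    # one second of samples
phase = (n % PERIOD) / PERIOD        # 0..1 ramp, one per cycle
saw = 2.0 * phase - 1.0

waves = {
    "sine": np.sin(2 * np.pi * phase),
    "triangle": 2.0 * np.abs(saw) - 1.0,
    "sawtooth": saw,
    "pulse train": (n % PERIOD == 0).astype(float),
}

# Cutoff sits just above the 10th harmonic (1000 Hz).
fractions = {name: energy_fraction_above(w, 1050.0) for name, w in waves.items()}
for name, frac in fractions.items():
    print(f"{name:12s} {frac:.2e}")
```

The harsher the shape, the larger the share of energy in the upper bands, which is exactly the material the vocoder's filter bank needs in order to render recognizable vowels.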


When working with plugins, it might make sense to try working at a sample rate of 96kHz rather than 48kHz (or 88.2kHz rather than 44.1kHz if you are producing for CD).

The kind of filter envelope processing a vocoder does will generally cause frequency bleeding, and a digital synth will often provide waveforms that aren't properly band-limited with respect to the sampling frequency anyway, so they alias.

Using a significantly higher sampling frequency will move a majority of possible problems into inaudible frequency ranges where the final filter/downsampling stage of processing to arrive at your nominal sampling frequency can remove them.
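The effect of the higher rate on a badly band-limited synth waveform can be estimated with a small experiment. The sketch below generates a naive sawtooth at two sample rates and measures what share of the energy below 20kHz is not on a harmonic of the fundamental, i.e. is aliasing. The function name `audible_alias_fraction`, the 1234Hz fundamental, and the 20kHz "audible" limit are assumptions for illustration; a real synth or vocoder plugin will behave differently in detail.

```python
import numpy as np

def audible_alias_fraction(fs, f0=1234, audible_hz=20000):
    """Share of sub-20 kHz energy of a naive sawtooth that is aliasing.

    Uses exactly one second of signal, so FFT bin k corresponds to k Hz
    and the true harmonics fall on bins f0, 2*f0, 3*f0, ...
    """
    n = np.arange(fs)
    phase = ((f0 * n) % fs) / fs          # integer arithmetic avoids drift
    saw = 2.0 * phase - 1.0               # naive, not band-limited
    power = np.abs(np.fft.rfft(saw)) ** 2
    total = power[1:audible_hz].sum()     # skip DC
    harmonic = power[f0:audible_hz:f0].sum()
    return 1.0 - harmonic / total

alias_48k = audible_alias_fraction(48000)
alias_96k = audible_alias_fraction(96000)
print(f"aliased share below 20 kHz at 48 kHz: {alias_48k:.4f}")
print(f"aliased share below 20 kHz at 96 kHz: {alias_96k:.4f}")
```

At the higher rate, most of the folded-back components land above the audible range, where the final downsampling filter can discard them; only the much weaker, very high harmonics still fold into the audible band.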


