## Artifacts when exporting sound as 8-bit WAV

1

I lead a game-development team. My sound designer gave me some sound effects as 32-bit stereo sounds with a 44100 Hz sampling rate. The game (actually a mod of an old game) requires sounds to be 8-bit mono sounds with a 22050 Hz sampling rate.

We're trying to export the sounds in the required format, but there are serious artifacts -- the resulting sound has a "ringing" effect, even in areas where the input was silent.

My questions are:

1. Why is this happening?
2. How can we export the sounds and obtain a decent result?

Here's an example input sound:

https://www.dropbox.com/s/96d8lmabt8608oa/Taking%20Damage.mp3?dl=0

And here's an output:

https://www.dropbox.com/s/tk6ag6fp2pzb1pd/CBHMWNCE.wav?dl=0

That doesn't make sense to me: the input is actually 22.05 kHz, and the output is 44.1 kHz (simply a speed change without resampling). What have you done there? – leftaroundabout – 2017-05-25T10:01:25.190

Oops. I got the input/output backwards. – James Koppel – 2017-05-25T20:44:05.083

Part of this will be that 8-bit sound just doesn't sound very good. And a bigger part will be that at a sampling rate of 22.05 kHz, you're starting to cut out significant amounts of actual sound information. – Linuxios – 2017-05-26T15:36:00.177

2

Given that your input is a 44.1 kHz, 32-bit stereo file and your desired output is a 22.05 kHz, 8-bit mono file, your request actually involves three separate processes:

• resampling
• quantization
• down mixing stereo to mono
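To make two of these steps concrete, here is a minimal Python sketch of down-mixing and quantization, assuming the samples are already available as floats in the range -1.0 to 1.0. The function names `downmix_to_mono` and `quantize_to_u8` are made up for this illustration, and this is not necessarily how FFmpeg or SoX do it internally (they may also apply dither, for instance). The one firm fact it relies on is that 8-bit WAV PCM is unsigned, with silence at 128.

```python
def downmix_to_mono(left, right):
    """Average the two channels sample by sample (floats in [-1.0, 1.0])."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def quantize_to_u8(samples):
    """Map floats in [-1.0, 1.0] to unsigned 8-bit values (0..255, silence = 128)."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))            # clamp out-of-range input
        value = int(round(s * 127.0)) + 128   # 8-bit WAV PCM is unsigned, centred on 128
        out.append(max(0, min(255, value)))
    return bytes(out)


# Hypothetical sample data: the last pair is fully out of phase and cancels to silence.
left = [0.0, 0.25, -0.25, 1.0]
right = [0.0, 0.25, -0.25, -1.0]
mono = downmix_to_mono(left, right)
pcm = quantize_to_u8(mono)
print(list(pcm))  # [128, 160, 96, 128]
```

With only 256 possible values, every sample is rounded quite coarsely, which is one reason 8-bit audio sounds noticeably rougher than the source.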

Note that these processes can change the sound audibly, so you might instead ask the sound designer to deliver the assets in the required format (i.e. 22.05 kHz, 8-bit, mono). That makes him or her responsible for checking that the delivered assets sound as intended.

These three processes can all be done in one pass with FFmpeg or SoX. However, for the stereo-to-mono step, you should check:

• that the delivered files are actually stereo and not double mono (identical left and right channels), in which case you should use only one of the channels.
• that summing the left and right channels doesn't lead to audible phase cancellation or clipped peaks.
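Both checks can be roughed out in a few lines of Python. This is only a sketch under the assumption that the two channels are available as lists of floats in the range -1.0 to 1.0; the helper names (`is_double_mono`, `peak_after_sum`, `mono_energy_ratio`) are hypothetical, and listening to the result remains the real test.

```python
def is_double_mono(left, right, tolerance=1e-6):
    """True if both channels carry (almost) exactly the same samples."""
    return all(abs(l - r) <= tolerance for l, r in zip(left, right))


def peak_after_sum(left, right):
    """Largest absolute value of left + right; above 1.0 a plain sum would clip."""
    return max(abs(l + r) for l, r in zip(left, right))


def mono_energy_ratio(left, right):
    """Energy of the averaged mono mix relative to the mean channel energy.

    About 1.0 for identical channels, near 0.0 when the channels cancel out.
    """
    mono_energy = sum(((l + r) / 2.0) ** 2 for l, r in zip(left, right))
    channel_energy = (sum(l * l for l in left) + sum(r * r for r in right)) / 2.0
    return 1.0 if channel_energy == 0.0 else mono_energy / channel_energy


# Hypothetical sample data for illustration.
left = [0.5, -0.5, 0.8]
inverted = [-s for s in left]

print(is_double_mono(left, left))         # True: just keep one channel
print(peak_after_sum(left, left))         # above 1.0: a plain sum would clip
print(mono_energy_ratio(left, inverted))  # 0.0: the channels cancel completely
```

Averaging the channels instead of summing them avoids the clipping case, but it does nothing about phase cancellation.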

FFmpeg example: `ffmpeg -i input.wav -acodec pcm_u8 -ar 22050 -ac 1 output.wav`

SoX example: `sox input.wav -r 22050 -b 8 -c 1 out.wav`

0

Well, what you have there is still the same sound data as you originally had. The only thing that changed is the playback rate: the WAVE header now specifies half the sample rate, but the samples themselves are untouched. If you do that, every frequency in the signal halves, i.e. every tone comes out an octave lower and lasts twice as long.
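The effect is easy to reproduce with Python's standard `wave` module. The sketch below writes the same (hypothetical) one-second 440 Hz test tone twice, changing only the sample rate declared in the header; `make_wav` and `duration_seconds` are helper names invented for this example.

```python
import io
import math
import wave


def make_wav(samples_u8, rate):
    """Write unsigned 8-bit mono samples into an in-memory WAV with the given header rate."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)      # 1 byte per sample = 8-bit
        w.setframerate(rate)   # this only sets the rate stored in the header
        w.writeframes(bytes(samples_u8))
    return buf.getvalue()


def duration_seconds(wav_bytes):
    """Playback duration implied by the header: frame count / declared sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


# One second of a 440 Hz tone, generated at 44100 samples per second.
cycles = 440
samples = [int(128 + 100 * math.sin(2 * math.pi * cycles * n / 44100)) for n in range(44100)]

original = make_wav(samples, 44100)
relabelled = make_wav(samples, 22050)   # same samples, header rate halved

for name, data in (("original", original), ("relabelled", relabelled)):
    seconds = duration_seconds(data)
    print(name, seconds, "s,", cycles / seconds, "Hz")
```

The relabelled file plays for twice as long, and the same 440 cycles spread over that time give a tone of 220 Hz, one octave lower.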

I wonder how you could possibly end up there, since every audio-editing program worth its salt will do the correct thing instead, namely resample the audio data. In simplest terms this means you leave out every other sample, so despite the halved sample rate you still get the same playback speed. Actually it's slightly more complicated than that (the signal has to be low-pass filtered first, otherwise frequencies above half the new sample rate fold back as aliasing), but that needn't worry you: just let a standard program do the resampling for you. For example, with ffmpeg:

`$ ffmpeg -i Taking\ Damage.mp3 -ar 22.05k CBHMWNCE.wav`
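For the curious, the resampling step can be sketched in plain Python as a low-pass filter followed by dropping every other sample. This is a simplified illustration, not what ffmpeg actually runs: the windowed-sinc filter, its 63 taps, and the cutoff of 0.22 times the input rate are all assumptions chosen for this example, and the function names are hypothetical.

```python
import math


def lowpass_kernel(num_taps=63, cutoff=0.22):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of the input sample rate."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2.0 * cutoff if x == 0 else math.sin(2.0 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]   # normalise to unity gain at 0 Hz


def halve_sample_rate(samples, taps=None):
    """Low-pass filter, then keep every other sample (decimation by 2)."""
    taps = taps or lowpass_kernel()
    half = len(taps) // 2
    out = []
    for i in range(0, len(samples), 2):   # only compute the outputs that are kept
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, rate=44100, count=2000):
    return [math.sin(2.0 * math.pi * freq_hz * n / rate) for n in range(count)]


low_in, high_in = tone(1000.0), tone(15000.0)   # 15 kHz is above the new limit of 11.025 kHz
low_out = halve_sample_rate(low_in)
high_out = halve_sample_rate(high_in)
naive_high = high_in[::2]                        # dropping samples without filtering

print("1 kHz, filtered:   ", rms(low_out[50:-50]))
print("15 kHz, filtered:  ", rms(high_out[50:-50]))
print("15 kHz, unfiltered:", rms(naive_high))
```

The 1 kHz tone should pass through almost unchanged and the filtered 15 kHz tone should nearly vanish, while the unfiltered version keeps its full level and shows up as a spurious lower-pitched tone. That is the aliasing the filter is there to prevent.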