Mixing and resampling systems

The mixing system is typically the most processor-intensive part of an audio runtime, mainly because moving sources need high-quality resampling and sample replay must be optimised.

This chapter explains, in words, code and diagrams, how real-time mixing works. It addresses diegetic 3D sounds and non-diegetic sounds such as most speech and music; mono, stereo and soundfield assets; sample rate conversion; sub-bass sweeteners; Ambisonic soundfield rotation and filtering; windowing; DFTs and FFTs (discrete and fast Fourier transforms, including constant-Q variants); anti-aliasing; and the Nyquist limit.

It fully demonstrates three resampling techniques (linear, Hermite and sinc) and ways to refine and combine them, frequency-band peak metering to BBC, Nordic, DIN and US standards, and the avoidance of clicks in the output as sources move, start and stop. Examples use C++ and equivalent SIMD vector intrinsics, incorporating optimisations suitable for AltiVec, ARM NEON, ARM64, AVX, SPE, SPU, SSE and VMX parallel acceleration hardware, along with compiler-specific tips.

The practical strengths and weaknesses of floating-point arithmetic are revealed in necessary detail, including overflows, underflows, denormalised arithmetic and what makes some games much slower when paused than in normal play!
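The denormal issue can be sketched in portable C++. The example below assumes IEEE 754 single precision and default floating-point settings (no fast-math or flush-to-zero mode); the helper names flushDenormal and stepsUntilSubnormal are hypothetical. It assumes one common explanation for the paused-game slowdown: when inputs fall silent, decaying feedback values in filters and reverbs drift into the subnormal range, which many processors handle far more slowly than normal values.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Replace any subnormal (denormalised) value with zero.
// Hypothetical helper; hardware flush-to-zero modes are an alternative.
inline float flushDenormal(float x)
{
    return (std::fabs(x) < std::numeric_limits<float>::min()) ? 0.0f : x;
}

// Count how many multiplications by 'decay' it takes for 'state' to leave
// the normal range, as happens to a feedback path once its input is silent.
// Assumes 0 < decay < 1 and a non-zero normal starting state.
inline int stepsUntilSubnormal(float state, float decay)
{
    assert(decay > 0.0f && decay < 1.0f);
    int steps = 0;
    while (std::fpclassify(state) == FP_NORMAL) {
        state *= decay;
        ++steps;
    }
    return steps;
}
```

Applying flushDenormal to each feedback state variable after it is updated keeps the value at exactly zero once it has decayed below the normal range, at the cost of one comparison per sample.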

Ambisonic mixing to 5.1 speakers
A simple scheme for mixing Ambisonic and non-diegetic content to 5.1 channel cinema speakers

Further reading