Libvorbis Sample Rate Conversion and Resampling
This article explains how the libvorbis codec library
handles audio sample rates during the encoding and decoding processes.
You will learn whether libvorbis includes native resampling
capabilities or if it relies on external libraries to convert sample
rates.
No Native Sample Rate Conversion in Libvorbis
The libvorbis library does not perform sample rate
conversion. It is designed strictly as an audio compression and
decompression engine. Its core responsibility is to convert raw PCM
audio into compressed Vorbis packets (encoding) and to turn those
packets back into raw PCM audio (decoding).
Resampling is a separate digital signal processing (DSP) task with its
own quality and performance trade-offs, and libvorbis leaves it to other
components. Consequently, libvorbis encodes at whatever sample rate the
input PCM already has: the sample rate of the encoded stream always
equals the sample rate of the input.
Handling Resampling During Encoding
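When encoding audio with libvorbis, you must configure
the encoder to match the sample rate of your input PCM data. For
example, if you feed 44.1 kHz PCM data into the encoder, you must
set the encoder up with a rate of 44100 Hz, which is recorded in the
rate field of the vorbis_info struct. The
resulting Vorbis stream will be flagged and encoded at 44.1 kHz.
libvorbis cannot detect a mismatch: if the declared rate differs from
the real rate of the samples, the file will play back at the wrong speed
and pitch.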
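One way to avoid a mismatch is to read the rate from the input itself rather than hard-coding it. The following is a minimal sketch, assuming the PCM arrives as a WAV file and using only Python's standard wave module; the helper names pcm_rate_and_channels and make_silent_wav are hypothetical, and the encoder call itself is not shown.

```python
import io
import wave


def pcm_rate_and_channels(wav_bytes: bytes) -> tuple[int, int]:
    """Return (sample_rate_hz, channels) from a WAV file held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getframerate(), wav.getnchannels()


def make_silent_wav(rate_hz: int, channels: int, frames: int) -> bytes:
    """Build a small 16-bit silent WAV, used here only as demo input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * channels * frames)
    return buf.getvalue()


demo = make_silent_wav(44100, 2, 100)
rate_hz, channels = pcm_rate_and_channels(demo)
# These are the values the encoder setup would be given.
print(rate_hz, channels)  # prints: 44100 2
```

The values read here are the ones to hand to the encoder setup, so the rate stored in the stream always reflects the real input.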
If you need to change the sample rate during the encoding process
(for example, downsampling a 96 kHz studio master to a 44.1 kHz Ogg
Vorbis file), you must perform this conversion before
the audio data reaches libvorbis. Developers typically
achieve this by piping the audio through an external resampling library.
Popular choices include:
- SoX Resampler library (libsoxr): A high-quality, fast resampling library.
- Secret Rabbit Code (libsamplerate): A widely used library offering various converter types ranging from high-quality sinc interpolation to fast linear interpolation.
- SpeexDSP: A lightweight DSP library whose resampler is often used in embedded or real-time systems.
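To make the ordering concrete, the sketch below converts the rate of a mono buffer before it would be handed to the encoder. It uses plain linear interpolation written from scratch, not the API of any library listed above, and the function name resample_linear is hypothetical. It applies no anti-aliasing filter, so it is an illustration of where the step sits in the pipeline, not a substitute for a dedicated resampler.

```python
def resample_linear(samples: list[float], src_rate: int, dst_rate: int) -> list[float]:
    """Resample mono samples by linear interpolation (illustrative only)."""
    if not samples or src_rate == dst_rate:
        return list(samples)
    out_len = len(samples) * dst_rate // src_rate
    step = src_rate / dst_rate  # input samples advanced per output sample
    out = []
    for i in range(out_len):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# One second of a 96 kHz ramp, reduced to 44.1 kHz before encoding.
master = [n / 96000 for n in range(96000)]
converted = resample_linear(master, 96000, 44100)
print(len(converted))  # prints: 44100
```

After this step the encoder would be configured for 44100 Hz, the rate of the converted buffer, not the 96000 Hz of the original master.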
Handling Resampling During Decoding
During decoding, libvorbis reads the identification header of the
Vorbis stream, which records the original sample rate, and outputs PCM
audio at that exact rate. If a Vorbis file was encoded at 32 kHz,
libvorbis will output 32 kHz PCM audio.
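For illustration, the rate can be read directly from the raw identification packet. The sketch below follows the field layout in the Vorbis I specification (packet type 1, the ASCII string "vorbis", a 32-bit version, an 8-bit channel count, then a 32-bit little-endian sample rate). The function name vorbis_id_header_rate is hypothetical and the demo packet is built by hand; a real application would normally let libvorbis parse the header and read the rate from vorbis_info.

```python
import struct


def vorbis_id_header_rate(packet: bytes) -> tuple[int, int]:
    """Return (channels, sample_rate_hz) from a Vorbis identification packet."""
    if len(packet) < 16 or packet[0] != 1 or packet[1:7] != b"vorbis":
        raise ValueError("not a Vorbis identification header")
    # Little-endian, unpadded: uint32 version, uint8 channels, uint32 rate.
    version, channels, rate = struct.unpack_from("<IBI", packet, 7)
    if version != 0:
        raise ValueError("unsupported Vorbis version")
    return channels, rate


# Hand-built demo packet: stereo, 32 kHz, nominal bitrate 112000.
demo_packet = (
    b"\x01vorbis"
    + struct.pack("<IBI", 0, 2, 32000)
    + struct.pack("<iii", 0, 112000, 0)  # max, nominal, min bitrate
    + bytes([0xB8, 0x01])                # block sizes, framing bit
)
print(vorbis_id_header_rate(demo_packet))  # prints: (2, 32000)
```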
If the playback hardware or the audio subsystem (such as Windows
WASAPI, macOS CoreAudio, or Linux ALSA) requires a specific sample rate
(like 48 kHz), the decoding application must handle the conversion. The
application must take the decoded PCM buffer from libvorbis
and pass it through an external resampling library or the operating
system’s native audio mixer before sending it to the audio hardware.
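The decision the application has to make can be sketched as follows. This is a minimal example, assuming the application already knows the decoded rate and the rate the device requires; plan_output is a hypothetical helper, and the frame count it returns is only the nominal figure, since a real resampler keeps filter state between buffers and may return slightly more or fewer frames per call.

```python
from fractions import Fraction


def plan_output(decoded_rate: int, device_rate: int, frames: int) -> tuple[bool, Fraction, int]:
    """Return (needs_resampling, ratio, nominal_output_frames) for one buffer."""
    ratio = Fraction(device_rate, decoded_rate)  # exact conversion ratio
    return decoded_rate != device_rate, ratio, int(frames * ratio)


# A 32 kHz stream played on a device that requires 48 kHz.
needed, ratio, out_frames = plan_output(32000, 48000, 1024)
print(needed, ratio, out_frames)  # prints: True 3/2 1536
```

When the two rates already match, the decoded buffer can go to the audio device unchanged.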