What Libvorbis Header Packets Must Be Written First
When encoding audio using the libvorbis library, the
encoder must generate and write three specific header packets before any
compressed audio data packets can be processed or saved. This article
explains these three essential header packets—the Identification,
Comment, and Setup headers—their distinct purposes, and how they are
generated in the initialization sequence to ensure a fully compliant
Vorbis stream.
The Three Required Vorbis Headers
To initialize a Vorbis stream, libvorbis requires three
header packets to be written in a strict, sequential order. Without
these headers, a Vorbis decoder cannot understand how to interpret the
subsequent audio packets.
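As a rough illustration of the ordering rule, the sketch below checks that three raw packets carry the header type bytes the Vorbis I specification assigns (1 for Identification, 3 for Comment, 5 for Setup), each followed by the ASCII string "vorbis". The function names vorbis_header_type and vorbis_headers_in_order are hypothetical helpers invented for this example. They are not part of libvorbis, which performs its own checks when decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Packet type bytes assigned to the three headers by the Vorbis I spec. */
enum { VORBIS_HDR_ID = 1, VORBIS_HDR_COMMENT = 3, VORBIS_HDR_SETUP = 5 };

/* Hypothetical helper: returns 1, 3 or 5 when the packet begins with a
 * valid header prefix, or 0 for anything else (audio data, short buffers). */
static int vorbis_header_type(const unsigned char *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < 7 || memcmp(pkt + 1, "vorbis", 6) != 0)
        return 0;
    if (pkt[0] == VORBIS_HDR_ID || pkt[0] == VORBIS_HDR_COMMENT ||
        pkt[0] == VORBIS_HDR_SETUP)
        return pkt[0];
    return 0;
}

/* Hypothetical helper: returns 1 if the packets appear in the required
 * Identification, Comment, Setup order, and 0 otherwise. */
static int vorbis_headers_in_order(const unsigned char *p[3],
                                   const size_t len[3])
{
    static const int expected[3] = {
        VORBIS_HDR_ID, VORBIS_HDR_COMMENT, VORBIS_HDR_SETUP
    };
    for (int i = 0; i < 3; i++) {
        if (vorbis_header_type(p[i], len[i]) != expected[i])
            return 0;
    }
    return 1;
}
```

A check like this only inspects the seven-byte prefix of each packet; it says nothing about whether the rest of the header is well formed.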
1. The Identification Header (Info Packet)
The Identification header is always the first packet in the stream. It contains the high-level parameters of the audio stream that a decoder needs before it can set itself up.
- Vorbis Version: Identifies the version of the Vorbis bitstream.
- Audio Channels: Specifies the number of audio channels (e.g., 1 for mono, 2 for stereo).
- Sample Rate: Defines the playback sample rate in Hz (e.g., 44100 or 48000).
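- Bitrate Hints: Declares the maximum, nominal, and minimum bitrates for the stream. These are hints only, and a field is meaningful only when it is greater than zero.
- Block Sizes: Specifies the short and long block sizes used by the stream, each stored as a power-of-two exponent.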
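To make the field list concrete, here is a minimal sketch that reads those fields from a raw Identification packet, assuming the 30-byte layout given in the Vorbis I specification: a type byte of 0x01, the string "vorbis", then little-endian fields, a byte holding the two block size exponents, and a framing bit. The id_header struct and parse_id_header function are hypothetical names for this example. libvorbis itself stores these values in vorbis_info.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct for this sketch; libvorbis uses vorbis_info. */
typedef struct {
    uint32_t version;
    uint8_t  channels;
    uint32_t sample_rate;
    int32_t  bitrate_max, bitrate_nominal, bitrate_min;
    unsigned blocksize_short, blocksize_long;
} id_header;

/* Reads a 32-bit little-endian value. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the packet is not a valid
 * Identification header. */
static int parse_id_header(const unsigned char *pkt, size_t len,
                           id_header *out)
{
    assert(pkt != NULL && out != NULL);
    if (len < 30 || pkt[0] != 0x01 || memcmp(pkt + 1, "vorbis", 6) != 0)
        return -1;

    out->version         = read_le32(pkt + 7);
    out->channels        = pkt[11];
    out->sample_rate     = read_le32(pkt + 12);
    out->bitrate_max     = (int32_t)read_le32(pkt + 16);
    out->bitrate_nominal = (int32_t)read_le32(pkt + 20);
    out->bitrate_min     = (int32_t)read_le32(pkt + 24);
    /* Block sizes are stored as exponents: low nibble first. */
    out->blocksize_short = 1u << (pkt[28] & 0x0F);
    out->blocksize_long  = 1u << (pkt[28] >> 4);

    /* Version must be 0, channels and rate non-zero, framing bit set. */
    if (out->version != 0 || out->channels == 0 ||
        out->sample_rate == 0 || !(pkt[29] & 0x01))
        return -1;
    return 0;
}
```

The sketch does not check every constraint in the specification (for example, the allowed range of block sizes), so treat it as a reading aid rather than a validator.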
2. The Comment Header (Metadata Packet)
The Comment header is the second packet written to the stream. It contains human-readable metadata using the VorbisComment format.
- Vendor String: Identifies the library used to encode the file (e.g., “Xiph.Org libVorbis I”).
- User Comments: Contains standard tag fields such as track title, artist name, album, and date in a simple KEY=value format.
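The byte layout behind these two fields can be sketched as follows, assuming the VorbisComment structure from the Vorbis I specification: a type byte of 0x03, the string "vorbis", a length-prefixed vendor string, a comment count, each comment as a length-prefixed KEY=value string, and a final framing bit. All lengths are 32-bit little-endian. The helper names here are hypothetical. In real code libvorbis builds this packet for you from a vorbis_comment filled in with calls such as vorbis_comment_add_tag(), and it supplies its own vendor string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes a 32-bit little-endian value; returns the bytes written. */
static size_t put_le32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v & 0xFF);
    dst[1] = (unsigned char)((v >> 8) & 0xFF);
    dst[2] = (unsigned char)((v >> 16) & 0xFF);
    dst[3] = (unsigned char)((v >> 24) & 0xFF);
    return 4;
}

/* Writes a length-prefixed string with no terminating NUL. */
static size_t put_string(unsigned char *dst, const char *s)
{
    size_t n = strlen(s);
    put_le32(dst, (uint32_t)n);
    memcpy(dst + 4, s, n);
    return 4 + n;
}

/* Hypothetical helper: writes a Comment header into buf and returns the
 * number of bytes written, or 0 if cap is too small. */
static size_t build_comment_header(unsigned char *buf, size_t cap,
                                   const char *vendor,
                                   const char *const *tags, size_t ntags)
{
    assert(buf != NULL && vendor != NULL);

    /* prefix + vendor length + vendor + count + framing byte */
    size_t need = 7 + 4 + strlen(vendor) + 4 + 1;
    for (size_t i = 0; i < ntags; i++)
        need += 4 + strlen(tags[i]);
    if (need > cap)
        return 0;

    size_t pos = 0;
    buf[pos++] = 0x03;                      /* packet type: comment */
    memcpy(buf + pos, "vorbis", 6);
    pos += 6;
    pos += put_string(buf + pos, vendor);
    pos += put_le32(buf + pos, (uint32_t)ntags);
    for (size_t i = 0; i < ntags; i++)
        pos += put_string(buf + pos, tags[i]);
    buf[pos++] = 0x01;                      /* framing bit */
    return pos;
}
```

Keeping the size calculation separate from the writes means the function never touches the buffer unless the whole packet fits.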
3. The Setup Header (Codebooks Packet)
The Setup header is the third and final header packet. It is typically the largest of the three headers because it contains the structural decode maps and lookup tables needed to reconstruct the audio.
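- Codebooks: The Huffman and vector quantization (VQ) codebooks used to decode the audio packets.
- Floors and Residues: Floor (spectral envelope) and residue (fine spectral detail) configurations. The time-domain transform fields that precede them are placeholders in Vorbis I.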
- Mappings and Modes: Instructions on how to combine floors and residues to reconstruct the final audio channels.
How to Generate the Headers in Code
In a standard C program using libvorbis, these headers are generated with the vorbis_analysis_headerout() function. It is called after the vorbis_info and vorbis_comment structures have been initialized and a vorbis_dsp_state has been set up from the vorbis_info with vorbis_analysis_init().
The API populates three distinct ogg_packet structures representing the headers:
vorbis_dsp_state vd;  // encoder state, built from the configured vorbis_info vi
ogg_packet header_id;
ogg_packet header_comment;
ogg_packet header_setup;
vorbis_analysis_init(&vd, &vi);
// Generate the three header packets (returns 0 on success)
vorbis_analysis_headerout(&vd, &vc, &header_id, &header_comment, &header_setup);
These packets must then be written to the bitstream transport layer (usually an Ogg logical bitstream via ogg_stream_packetin()) in the exact order they were generated: header_id first, header_comment second, and header_setup third.
Only after these three packets have been written and flushed (for example with ogg_stream_flush(), so that audio data starts on a fresh page) can the encoder begin processing actual audio frames.