What Libvorbis Header Packets Must Be Written First

When encoding audio with the libvorbis library, the encoder must generate and write three header packets before any compressed audio packets can be written: the Identification, Comment, and Setup headers. This article explains the purpose of each header and how the three are generated during initialization so that the result is a valid Vorbis stream.

The Three Required Vorbis Headers

To initialize a Vorbis stream, libvorbis requires three header packets to be written in a strict, sequential order. Without these headers, a Vorbis decoder cannot understand how to interpret the subsequent audio packets.
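As a rough illustration of the ordering rule, the sketch below checks that three packets look like the Identification, Comment, and Setup headers in that order. It relies on the common header layout from the Vorbis I specification: one packet-type byte (1, 3, or 5) followed by the six ASCII characters "vorbis". The packet_view type and both function names are hypothetical helpers invented for this example; they are not part of libvorbis.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one raw packet; not a libvorbis or libogg type. */
typedef struct {
    const unsigned char *data;
    size_t len;
} packet_view;

/* Returns the header type (1 = identification, 3 = comment, 5 = setup)
   if the packet starts with a Vorbis common header, or 0 otherwise
   (for example, for an audio packet). */
int vorbis_header_type(const unsigned char *data, size_t len)
{
    if (len < 7)
        return 0;
    if (memcmp(data + 1, "vorbis", 6) != 0)
        return 0;
    if (data[0] != 1 && data[0] != 3 && data[0] != 5)
        return 0;
    return data[0];
}

/* Returns 1 if the three packets are the identification, comment, and
   setup headers in exactly that order, 0 otherwise. */
int vorbis_headers_in_order(const packet_view pkts[3])
{
    static const int expected[3] = { 1, 3, 5 };
    for (int i = 0; i < 3; i++) {
        if (vorbis_header_type(pkts[i].data, pkts[i].len) != expected[i])
            return 0;
    }
    return 1;
}
```

A check like this only inspects the first seven bytes of each packet. It does not validate the header contents, which is the decoder's job.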

1. The Identification Header (Info Packet)

The Identification header is always the first packet in the stream. It carries the basic stream parameters that a decoder needs before it can set up anything else.

Vorbis Version: Identifies the version of the Vorbis bitstream (0 for Vorbis I).
Audio Channels: Specifies the number of audio channels (e.g., 1 for mono, 2 for stereo).
Sample Rate: Defines the playback sample rate in Hz (e.g., 44100 or 48000).
Bitrate Limits: Declares the maximum, nominal, and minimum bitrates for the stream. These act as hints, and any of them may be left unset.
Block Sizes: Declares the short and long block sizes used by the stream.
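To make the field list concrete, here is a minimal sketch that reads these fields from a raw Identification header packet. It assumes the 30-byte layout given in the Vorbis I specification (little-endian integers, two 4-bit block size exponents packed into one byte, and a final framing bit). The vorbis_id_fields struct and the function names are hypothetical, chosen for this example. libvorbis itself exposes these values through vorbis_info after decoding the headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the decoded fields; not a libvorbis type. */
typedef struct {
    uint32_t version;
    uint8_t  channels;
    uint32_t sample_rate;
    int32_t  bitrate_max;
    int32_t  bitrate_nominal;
    int32_t  bitrate_min;
    unsigned blocksize_short;
    unsigned blocksize_long;
} vorbis_id_fields;

/* Reads an unsigned 32-bit little-endian integer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses a raw identification header packet.
   Returns 0 on success, -1 if the packet is malformed. */
int parse_vorbis_id_header(const unsigned char *d, size_t len,
                           vorbis_id_fields *out)
{
    vorbis_id_fields f;
    unsigned e0, e1;

    if (len < 30)
        return -1;
    if (d[0] != 1 || memcmp(d + 1, "vorbis", 6) != 0)
        return -1;

    f.version         = read_le32(d + 7);
    f.channels        = d[11];
    f.sample_rate     = read_le32(d + 12);
    /* The bitrate fields are signed 32-bit values in the bitstream. */
    f.bitrate_max     = (int32_t)read_le32(d + 16);
    f.bitrate_nominal = (int32_t)read_le32(d + 20);
    f.bitrate_min     = (int32_t)read_le32(d + 24);

    /* Low nibble: short block exponent; high nibble: long block exponent. */
    e0 = d[28] & 0x0Fu;
    e1 = d[28] >> 4;

    if (f.version != 0 || f.channels == 0 || f.sample_rate == 0)
        return -1;
    if (e0 < 6 || e1 > 13 || e0 > e1)   /* block sizes span 64..8192 */
        return -1;
    if ((d[29] & 1) == 0)               /* framing bit must be set */
        return -1;

    f.blocksize_short = 1u << e0;
    f.blocksize_long  = 1u << e1;
    *out = f;
    return 0;
}
```

The output struct is only written once every check has passed, so a caller never sees partially filled fields from a rejected packet.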

2. The Comment Header (Metadata Packet)

The Comment header is the second packet written to the stream. It contains human-readable metadata using the VorbisComment format.

Vendor String: Identifies the library used to encode the file (e.g., “Xiph.Org libVorbis I”).
User Comments: Contains standard tag fields such as track title, artist name, album, and date in a simple KEY=value format.
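The layout of the Comment header can be sketched in the same way. The example below assumes the structure described in the Vorbis I specification: the common header, a length-prefixed vendor string, a comment count, one length-prefixed string per comment, and a framing bit. All lengths are 32-bit little-endian values. The byte_span type and the function names are hypothetical. In a libvorbis program, the vorbis_comment structure provides this data directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pointer-plus-length view; the bytes are NOT NUL-terminated. */
typedef struct {
    const unsigned char *ptr;
    uint32_t len;
} byte_span;

/* Reads an unsigned 32-bit little-endian integer. */
static uint32_t load_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads one length-prefixed string at *pos and advances *pos past it.
   Requires *pos <= len on entry. Returns 0 on success, -1 if truncated. */
static int read_lp_string(const unsigned char *d, size_t len,
                          size_t *pos, byte_span *out)
{
    uint32_t n;
    if (len - *pos < 4)
        return -1;
    n = load_u32_le(d + *pos);
    *pos += 4;
    if (len - *pos < n)
        return -1;
    out->ptr = d + *pos;
    out->len = n;
    *pos += n;
    return 0;
}

/* Parses a raw comment header packet. Stores the vendor string and up to
   max_comments comment spans. Returns the number of comments in the
   packet, or -1 if the packet is malformed. */
int parse_vorbis_comment_header(const unsigned char *d, size_t len,
                                byte_span *vendor,
                                byte_span *comments, size_t max_comments)
{
    size_t pos = 7;
    uint32_t count;

    if (len < 7 || d[0] != 3 || memcmp(d + 1, "vorbis", 6) != 0)
        return -1;
    if (read_lp_string(d, len, &pos, vendor) != 0)
        return -1;
    if (len - pos < 4)
        return -1;
    count = load_u32_le(d + pos);
    pos += 4;

    for (uint32_t i = 0; i < count; i++) {
        byte_span s;
        if (read_lp_string(d, len, &pos, &s) != 0)
            return -1;
        if (i < max_comments)
            comments[i] = s;
    }

    /* The framing bit follows the last comment. */
    if (pos >= len || (d[pos] & 1) == 0)
        return -1;
    return (int)count;
}
```

Each returned span points into the original packet buffer, so the buffer must outlive the spans. Splitting a comment into its key and value is a matter of locating the first '=' character within the span.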

3. The Setup Header (Codebooks Packet)

The Setup header is the third and final header packet. It is typically the largest of the three headers because it contains the codebooks and the rest of the decoder configuration needed to reconstruct the audio.

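Codebooks: The Huffman and vector quantization codebooks used to decode the floor and residue data.
Time-Domain Transforms: Placeholder entries that are unused in Vorbis I.
Floors and Residues: Configurations for the spectral envelope (floor) and the remaining spectral detail (residue).
Mappings and Modes: Instructions on how to combine floors and residues to reconstruct the final audio channels.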

How to Generate the Headers in Code

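In a standard C program that uses libvorbis, these headers are generated with the vorbis_analysis_headerout() function. Before calling it, the program must initialize and configure the vorbis_info structure, initialize the vorbis_comment structure, and set up a vorbis_dsp_state with vorbis_analysis_init(). Note that the first argument to vorbis_analysis_headerout() is the vorbis_dsp_state, not the vorbis_info.

The API populates three distinct ogg_packet structures representing the headers:

ogg_packet header_id;
ogg_packet header_comment;
ogg_packet header_setup;

// Generate the three header packets.
// vd is a vorbis_dsp_state set up with vorbis_analysis_init(&vd, &vi);
// vc is an initialized vorbis_comment.
vorbis_analysis_headerout(&vd, &vc, &header_id, &header_comment, &header_setup);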

These packets must then be written to the transport layer, usually an Ogg logical bitstream, by passing each one to ogg_stream_packetin() in the order they were generated: header_id first, header_comment second, and header_setup third. The resulting header pages should then be forced out with ogg_stream_flush() so that the first audio packet starts on a fresh page, as the Ogg mapping for Vorbis requires. Only after the three headers have been written and flushed should the encoder begin submitting audio data.