How Libvorbis Represents Metadata Tags

This article provides a technical overview of how the libvorbis library manages and represents metadata tags like artist, title, and album. It explains the underlying Vorbis Comment specification, the binary structure of the metadata packet, and how key-value pairs are formatted within Ogg Vorbis audio files.

The Vorbis Comment Format

The libvorbis library stores metadata using a format called Vorbis Comments. Unlike ID3 tags in MP3 files, which define dedicated binary frames for specific fields, Vorbis Comments are plain-text pairs with free-form field names.

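Metadata in Vorbis is stored as an ordered list of comment strings (the specification calls them comment vectors), each holding one key-value pair. The field name and the value are separated by the first equals sign in the string: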

FIELD_NAME=Field Value
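
As a minimal sketch of how such a string can be taken apart, the Python snippet below splits each entry at the first equals sign. The helper name split_comment and the sample tags are hypothetical and are not part of libvorbis. The sketch relies on two rules from the Vorbis Comment specification: field names are case-insensitive, and a field name may appear more than once.

```python
def split_comment(entry: str) -> tuple[str, str]:
    """Split a 'KEY=Value' comment string at the first '='."""
    name, sep, value = entry.partition("=")
    if not sep:
        raise ValueError(f"not a valid comment (no '='): {entry!r}")
    # Field names are case-insensitive, so normalise them for comparison.
    return name.upper(), value


# Hypothetical sample entries; the last one shows that the value itself
# may contain '=' because only the first one acts as the separator.
tags: dict[str, list[str]] = {}
for entry in ["ARTIST=Example Artist", "title=Example Title", "COMMENT=a=b"]:
    name, value = split_comment(entry)
    # A field name may repeat, so values are collected in a list.
    tags.setdefault(name, []).append(value)
```

Storing values in a list per field name keeps repeated fields (for example, several ARTIST entries) instead of silently overwriting them.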

Structure of the Comment Packet

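In an Ogg Vorbis stream, metadata is stored in the second header packet, known as the Comment Header. The libvorbis API exposes the contents of this packet through the vorbis_comment struct. Like every Vorbis header, the packet begins with a 7-byte prefix: a packet-type byte (0x03 for the comment header) followed by the ASCII string “vorbis”. The rest of the packet is laid out as follows: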

  1. Vendor String Length: A 32-bit unsigned integer (little-endian) indicating the length of the vendor string in bytes.
  2. Vendor String: A UTF-8 string identifying the library that encoded the file (e.g., “Xiph.Org libVorbis I”).
  3. User Comment Count: A 32-bit unsigned integer (little-endian) specifying the total number of metadata tags.
  4. User Comments: A list of comments, where each entry consists of:
    • A 32-bit unsigned integer (little-endian) giving the length of the comment string in bytes.
    • The actual comment string in “KEY=Value” format (UTF-8, not null-terminated).
  5. Framing Bit: A single bit at the end of the packet that must be set to 1. A decoder treats an unset framing bit, or a packet that ends before it, as an error.
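
The layout above can be sketched in Python using only the standard struct module. This is an illustrative reading of the five fields, not how libvorbis itself is implemented. The function names build_comment_body and parse_comment_body are hypothetical, and the sketch assumes the input starts at the vendor string length, with the common Vorbis header prefix (packet-type byte and “vorbis” marker) already stripped.

```python
import struct


def build_comment_body(vendor: str, comments: list[str]) -> bytes:
    """Serialise fields 1-5 of the layout (hypothetical helper)."""
    vendor_bytes = vendor.encode("utf-8")
    out = struct.pack("<I", len(vendor_bytes)) + vendor_bytes
    out += struct.pack("<I", len(comments))
    for comment in comments:
        data = comment.encode("utf-8")
        # Length prefix, then the raw string with no null terminator.
        out += struct.pack("<I", len(data)) + data
    # Framing bit: the lowest bit of the final byte is set to 1.
    return out + b"\x01"


def parse_comment_body(data: bytes) -> tuple[str, list[str]]:
    """Return (vendor, comments) from a body starting at the vendor length."""
    offset = 0

    def read_u32() -> int:
        nonlocal offset
        if offset + 4 > len(data):
            raise ValueError("truncated length field")
        (value,) = struct.unpack_from("<I", data, offset)
        offset += 4
        return value

    def read_string() -> str:
        nonlocal offset
        length = read_u32()
        if offset + length > len(data):
            raise ValueError("truncated string")
        text = data[offset:offset + length].decode("utf-8")
        offset += length
        return text

    vendor = read_string()
    count = read_u32()
    comments = [read_string() for _ in range(count)]
    # The framing bit must be present and set.
    if offset >= len(data) or not data[offset] & 1:
        raise ValueError("framing bit not set")
    return vendor, comments


# Round trip with made-up sample values.
body = build_comment_body(
    "Example Encoder", ["ARTIST=Example Artist", "TITLE=Example Title"]
)
vendor, comments = parse_comment_body(body)
```

Every length is checked against the remaining buffer before a slice is taken, because the length fields come from the file and a corrupt or hostile value could otherwise point past the end of the packet.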

Key Features of Vorbis Metadata