Understanding the libvorbis vorbis_comment Structure
This article gives an overview of the vorbis_comment structure in the
libvorbis library. It covers the fields defined in this C structure,
how metadata is formatted, and how the library handles audio tagging
(such as artist name, track title, and album info) in Ogg Vorbis
files.
In the libvorbis library, the
vorbis_comment structure is the primary container used to
read, write, and manage metadata. Similar to ID3 tags in MP3 files,
Vorbis comments store tag information as human-readable text.
The vorbis_comment Structure Definition
In the C library header (<vorbis/codec.h>), the
structure is defined as follows:
typedef struct vorbis_comment {
    char **user_comments;
    int   *comment_lengths;
    int    comments;
    char  *vendor;
} vorbis_comment;
Description of Structure Fields
The vorbis_comment structure contains four key
fields:
- user_comments: An array of character pointers (strings). Each entry
in this array represents a single metadata tag.
- comment_lengths: An array of integers giving the length (in bytes)
of the corresponding string in the user_comments array. These lengths
are the authoritative string sizes: in the bitstream, Vorbis comments
are length-prefixed rather than null-terminated, and they are 8-bit
clean, so strlen() is not a reliable way to measure them. For
convenience, libvorbis also null-terminates each string when decoding.
- comments: An integer giving the total number of user comments stored
in the user_comments array.
- vendor: A null-terminated UTF-8 string that identifies the library or
encoder used to generate the Vorbis bitstream (for example,
“AO; Lancer(20061110)” or “Xiph.Org libVorbis I 20200113”).
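As a minimal sketch of how these fields fit together, the helper below copies one entry out of the structure using its stored length instead of strlen(). The demo_comment type and demo_copy_comment function are hypothetical names introduced for illustration. demo_comment is a local mirror of the four fields so the sketch compiles without libvorbis installed; real code would include <vorbis/codec.h> and use vorbis_comment directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical local mirror of vorbis_comment (same four fields), used
   only so this sketch builds without the libvorbis headers. */
typedef struct demo_comment {
    char **user_comments;
    int   *comment_lengths;
    int    comments;
    char  *vendor;
} demo_comment;

/* Copy entry `index` into `out` as a C string, trusting the stored
   length rather than strlen(). Returns the number of bytes copied, or
   -1 if the index is out of range or `out` is too small. */
int demo_copy_comment(const demo_comment *dc, int index,
                      char *out, size_t out_size)
{
    assert(dc != NULL && out != NULL);

    if (index < 0 || index >= dc->comments)
        return -1;

    int len = dc->comment_lengths[index];
    if (len < 0 || (size_t)len >= out_size)
        return -1;              /* no room for the bytes plus the NUL */

    memcpy(out, dc->user_comments[index], (size_t)len);
    out[len] = '\0';
    return len;
}
```

Checking the index against comments and the length against the buffer size keeps the copy in bounds even if an entry's bytes are not followed by a terminator.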
How Metadata is Formatted
The metadata strings stored within the user_comments
array follow a strict “FIELD=value” format:
- UTF-8 Encoding: All comments must be encoded in UTF-8.
- Field Names: Field names are case-insensitive (though conventionally
written in uppercase) and consist of ASCII characters in the range
0x20 through 0x7D, excluding the equals sign (=, 0x3D). Common field
names include TITLE, ARTIST, ALBUM, DATE, and GENRE.
- Field Values: The value follows the first = sign and contains the
actual metadata content.
- Duplicate Fields: The format explicitly allows duplicate field names.
For example, a track with two artists can have two separate comment
entries: ARTIST=Artist A and ARTIST=Artist B.
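The formatting rules can be sketched in C as follows. The helpers below are hypothetical (the demo_ names are not part of libvorbis) and assume the field-name range of 0x20 through 0x7D given in the Vorbis comment specification. Treating an empty field name as invalid is a choice made by this sketch. For real lookups, libvorbis provides vorbis_comment_query and vorbis_comment_query_count.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the `len`-byte entry looks like FIELD=value, where FIELD
   is non-empty and uses only bytes 0x20..0x7D (the first '=' ends the
   name, so '=' itself can never be part of it). Returns 0 otherwise. */
int demo_is_valid_entry(const char *entry, size_t len)
{
    assert(entry != NULL);

    const char *eq = memchr(entry, '=', len);
    if (eq == NULL || eq == entry)
        return 0;

    for (const char *p = entry; p < eq; ++p) {
        unsigned char c = (unsigned char)*p;
        if (c < 0x20 || c > 0x7D)
            return 0;
    }
    return 1;
}

/* If the entry's field name equals `field` (ASCII case-insensitive),
   returns a pointer to the value and stores its length in *value_len.
   Returns NULL when the field name does not match. */
const char *demo_match_field(const char *entry, size_t len,
                             const char *field, size_t *value_len)
{
    assert(entry != NULL && field != NULL && value_len != NULL);

    size_t flen = strlen(field);
    if (len <= flen || entry[flen] != '=')
        return NULL;

    for (size_t i = 0; i < flen; ++i) {
        if (toupper((unsigned char)entry[i]) !=
            toupper((unsigned char)field[i]))
            return NULL;
    }
    *value_len = len - flen - 1;
    return entry + flen + 1;
}

/* Counts how many entries carry the given field name; duplicates are
   allowed, so the result may be greater than 1. */
int demo_count_field(char **entries, const int *lengths, int count,
                     const char *field)
{
    int matches = 0;
    for (int i = 0; i < count; ++i) {
        size_t vlen;
        if (demo_match_field(entries[i], (size_t)lengths[i], field, &vlen))
            ++matches;
    }
    return matches;
}
```

With the entries ARTIST=Artist A and artist=Artist B, demo_count_field reports two matches for ARTIST, which reflects both the case-insensitive comparison and the allowance for duplicate fields.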