XM File Format
The Unofficial XM File Format Specification
FastTracker II, ADPCM and Stripped Module Subformats
Vladimir Kameñar
CelerSMS
Bogota, Colombia
The author and publisher have taken care in the preparation of this book, but make no expressed or
implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed
for incidental or consequential damages in connection with or arising out of the use of the information
contained herein.
Kameñar, Vladimir
The Unofficial XM File Format Specification / Vladimir Kameñar – 1st ed.
ISBN 978-958-53602-0-4
004.6
OCLC 1262695345
Contents
Preface
Introduction
General layout of the XM file
XM header
Pattern header
Instrument header
Sample header
Sample data
Pattern format
Annex A: Volumes and envelopes
Annex B: Periods and frequencies
Preface
XM is one of the major module formats. Module files, also known as tracker modules or tracker music,
are a family of music file formats that originated on the Amiga computer in the late 1980s. They became an
important part of the demoscene subculture. Modules are still popular among professional
musicians, demosceners and enthusiasts. Many module songs are available for free. For example, the
Mod Archive [1] has one of the largest online collections of free tracker music.
Module files are different from popular audio formats like MP3 or MIDI. The contents of a module file
include both the song data (sequences of notes, loops, effects) and the sample data. The sample data
describe the instruments used to reproduce the tracks. The module player software uses this data to
recreate a virtual set of instruments and make them play the song. MIDI files don’t include the sample
data. Therefore, only a standard set of instruments is available and the playback can sound different
depending on the synthesizer hardware or software. On the other hand, prerecorded audio formats
like MP3 or OGG, when uncompressed, contain the actual PCM audio data. Therefore, it is possible to
convert a MIDI file to a module file and a module file to a prerecorded audio format like MP3. However, the
reverse conversion is almost impossible.
Module files can be very compact. A full song can fit into less than 1 KB. For example, the song Minimal
III by SofT MANiAC in XM format is just 905 bytes without compression.
There are module file players for virtually every operating system, including mobile platforms.
This specification explains the bits and the bytes of the XM file format. There are several open source
XM player implementations in different languages, from low-level assembly to high-level C++ and Java
and even JavaScript. Looking into the source code and reading the specification can help you understand
how it works.
There are other popular module file formats, for example: MOD, IT, S3M and V2M [2].
Vladimir Kameñar
Bogota, Colombia
August 2021
[1] The Mod Archive https://modarchive.org
[2] Farbrausch demo tools https://github.com/farbrausch/fr_public
Introduction
This specification describes the structure of regular FastTracker II, ADPCM-compressed and Stripped
Module files. It is unofficial because the author doesn’t hold the copyright to the XM file format. There are
many extensions to the original XM format, for example: ADPCM-compressed, OggVorbis-compressed,
Stripped Module and others. Many XM files aren’t compliant with the original standard because of
accidental or intentional modifications. Therefore, it’s hardly possible to describe a global XM standard.
The original specification doesn’t cover the extensions. That’s why the µFMOD developers decided to
write and maintain this unofficial specification.
The original FastTracker II file format was introduced in 1994 by Triton, a famous demoscene team [3]. It is
commonly referred to as the “eXtended Module” (hence the file extension). The XM format is an extended
version of the original MOD format introducing several new features, for example: multi-sample
instruments [4], volume and panning envelopes [5], sample looping [6], portamento frequency tables, new
extended effect commands and basic pattern compression, among other improvements.
Then, in the early 2000s, ByteRaven and Wodan from TNT/NO-ID corrected and extended the original XM
specification.
Another popular extension is OXM, the OggVorbis-compressed XM [7]. It preserves the original XM file
structure. The instrument samples are compressed in OggVorbis stream format. There are at least 2
known OXM subformats. None of them are covered in this document. The Firelight FMOD library supports
OXM files.
[3] Mr.H, The XM module format description for XM files version $0104, Triton, 1994
[4] Sawyer, Ben et al., Game Developer's Marketplace, Coriolis Group Books, ISBN 1576101770, 1998
[5] Parekh, Ranjan, Principles of Multimedia, Tata McGraw-Hill Education, ISBN 9780070588332, 2006
[6] Alves de Abreu, Valter Miguel, Analysing trackers and their formats, University of Porto, S2CID 192364225, 2018
[7] Sweet, Michael, MOD File Sequencing, Addison-Wesley, ISBN 9780321961587, 2014
The Stripped Module file format is another non-standard XM subformat. It was introduced in µFMOD in
2006. A Stripped Module file is smaller than a regular XM, because its headers are more compact. The
Stripped Module format is a superset of the original XM. Therefore, a player supporting stripped modules
can also support regular XM files. The audio content is unaffected while converting a regular XM file to
stripped format. There is a free open source tool, XMStrip [8], which converts regular XM to Stripped Module
format and vice versa. The same tool can recover damaged or otherwise corrupt XM files.
There are more non-standard XM extensions. For example, some trackers use proprietary effect
commands to trigger software events. There are other extensions adding Text2Speech (TTS) metadata,
watermarks and so on. Unfortunately, very little or no documentation is provided to support these features
or at least ignore them safely while loading a non-standard XM file.
This document describes only the original FastTracker II file format, the ADPCM extension and the
Stripped Module. A comprehensive description of the effect commands is available in Thunder’s
MODFIL10.TXT.
[8] μFMOD Guide https://ufmod.sourceforge.io/Win32/en.htm
General layout of the XM file
[Figure: layout diagram of the XM file. Recoverable labels: XM header, 1st sample header, Additional information]
XM header
ID text
Should read 'Extended module: ' in a normal XM file. In a Stripped XM this field usually contains
just nulls. Some people clear or scramble this magic text in their XM files when embedding into an
EXE to prevent others from ripping the track. Don't rely on this string when checking an XM file for
validity.
Module name
Should be an ASCII string padded with spaces. Might be zero padded or empty as well (all spaces
or all nulls). Some authors store random data here. Don't rely on Module name being a valid ASCII
string.
0x1A
The hex value 0x1A in a normal XM file or 0x00 in a Stripped Module. Since most players check
this field, the XMStrip tool clears it to prevent players that don't actually support the stripped format
from incorrectly loading a Stripped XM. The value 0x1A is the ASCII SUB character (Ctrl+Z), which
DOS treats as an end-of-file marker in text files. For example, if you print the contents of an XM file
using the DOS TYPE command, the output stops after “Extended module: ” and the module’s name.
None of the following binary contents are printed. That’s how it was supposed to be if everybody
respected the standards.
Tracker name
Should read 'FastTracker v2.00 ' or 'FastTracker II ' but some trackers (e.g. DigiTracker) use
this field for other purposes (DigiTracker stores the Composer's name here). Should contain nulls
in a Stripped XM. If this field doesn’t contain valid data, it doesn’t mean that the XM file is corrupt.
Version number
Hi-byte major and low-byte minor version numbers in a normal XM or 0x0000 in a Stripped XM. For
example, 0x0104 means v1.4. None of the extensions use this field for identifying themselves. So,
v1.4 doesn’t mean that it’s a standard XM file.
Header size
Total size of the following header data until the header of the 1st pattern. Using this value is the
only way to locate the header of the 1st pattern. In a normal XM the minimal value is 20 + 256 =
0x00000114. This is so because the pattern order table in a normal XM has a fixed size of 256
bytes. For example, if the pattern order table consists of the following indexes:
0 1 4 1
then a normal XM stores the whole 256-byte table, padded after the 4th entry (values in hex):
00 01 04 01 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Some trackers pad the unused entries with negative values instead of zeroes. The actual pattern order table length is
given in the [Song length] field (see below). So, you can easily identify the padding bytes. That’s
why these bytes are not stored in a Stripped XM file and the [Header size] could be lower than
0x00000114. The minimal value is 20 + 1 (at least 1 pattern should exist). In both normal and
Stripped XM files [Header size] + 60 is the exact offset where the 1st pattern header begins. The
only difference is that no padding exists in a Stripped XM. The same example pattern order in a
Stripped XM would be:
00 01 04 01
This uses only 4 bytes instead of 256. However, the formula is exactly the same in both cases.
Additional data may exist between the pattern order table and the 1st pattern header (normally you
would just skip this additional data). Some trackers store the composer's name here, metadata and
so on.
There is no practical maximum header size (except the obvious 2^32 - 1 value), because nothing limits
the amount of additional data. That’s why you should always use the [Header size] value + 60 to
jump directly to the 1st pattern header.
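The offset rule above can be sketched in C as follows. This is a minimal illustration, not a reference loader: it assumes that [Header size] is a little-endian doubleword stored at file offset 60 (the offset table is not reproduced in this section), and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t xm_read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Compute the file offset of the 1st pattern header as [Header size] + 60.
 * Assumption: [Header size] is a little-endian dword at file offset 60.
 * Returns 0 on success, -1 if the file is too short or the computed
 * offset does not fall inside the file.
 */
static int xm_first_pattern_offset(const uint8_t *file, size_t file_size,
                                   size_t *offset_out)
{
    uint32_t header_size;

    if (file_size < 64)
        return -1;                        /* no room for [Header size] */
    header_size = xm_read_u32le(file + 60);
    if (header_size >= file_size - 60)
        return -1;                        /* offset past the end of file */
    *offset_out = (size_t)header_size + 60;
    return 0;
}
```

The same function works for normal and Stripped XM files, because only the value of [Header size] differs between them.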
Song length
This is the pattern order table size in bytes, as mentioned earlier. Maximum value is 256. Minimum
is 1, because a zero-length song doesn’t make sense.
Restart position
Zero-based index into the pattern order table where playback continues after the end of the
table is reached. Some trackers append an empty (silent) pattern at the end and set the restart
position to that pattern to prevent looping. When the restart position holds an invalid index (greater than
[Song length]), a zero value should be used instead.
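A loader might sanitize the restart position as sketched below. The function name is hypothetical. Note one deliberate assumption: because the index is zero-based, a value equal to [Song length] also points past the table, so this sketch is slightly stricter than the text and resets that value to zero as well.

```c
#include <stdint.h>

/*
 * Return a usable restart position.
 * Assumption: an index equal to song_length is treated as invalid too,
 * since valid zero-based indexes are 0 .. song_length - 1.
 */
static uint16_t xm_effective_restart(uint16_t restart_position,
                                     uint16_t song_length)
{
    return (restart_position >= song_length) ? 0 : restart_position;
}
```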
Number of channels
The number of mixing channels. This has nothing to do with mono/stereo. Defines the number of
columns in the pattern packets. In the original FastTracker II specification the maximum value was
32. However, XM files containing more than 32 channels do exist. µFMOD suggests 64 as a more
realistic maximum value.
Number of patterns
The number of patterns stored in the file. There should be at least 1 pattern. The maximum value
is 256. Don’t confuse this with [Song length]!
Number of instruments
In the range from 0 to 128.
Flags
Only LSB (bit 0) is used. It defines the frequency table:
0 = Amiga
1 = Linear
Pattern order table
Lists the patterns in playback order. For example:
05 08 03 08 ...
Pattern #5 will be the first one to play, if it exists in the file. If it doesn’t exist, an empty pattern
(dummy pattern) should be used instead. After #5 finishes playback, #8 will be triggered (once
again, if it exists). Then, #3 will start playing. Then, #8 will be used once again and so on.
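The lookup described above could be sketched as follows. The marker XM_EMPTY_PATTERN and the function name are hypothetical; a real player would substitute its own dummy pattern.

```c
#include <stdint.h>

/* Hypothetical marker meaning "play the empty (dummy) pattern". */
#define XM_EMPTY_PATTERN (-1)

/*
 * Return the pattern number to play at a given position of the pattern
 * order table, or XM_EMPTY_PATTERN if the position is outside the table
 * or refers to a pattern that is not stored in the file.
 */
static int xm_pattern_at(const uint8_t *order_table, uint16_t song_length,
                         uint16_t number_of_patterns, uint16_t position)
{
    if (position >= song_length)
        return XM_EMPTY_PATTERN;
    if (order_table[position] >= number_of_patterns)
        return XM_EMPTY_PATTERN;
    return order_table[position];
}
```

With the order table 05 08 03 08 and only 6 patterns stored in the file, positions 0 and 2 yield patterns 5 and 3, while positions 1 and 3 fall back to the empty pattern.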
Pattern header
Packing type
Almost always 0. Doesn't mean anything.
Also note that whenever a pattern number in the pattern order table is higher than the actual
number of patterns (this is common for converted S3Ms), you should play the standard empty
pattern. If the packed pattern data size is not 0, it might not match the exact size in bytes. This is more
like a boolean value. So, it's recommended to parse the following pattern data until you have read a
total of (number of rows) * (number of channels) notes.
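One way to follow this recommendation is to walk the packed data note by note and count the bytes actually used. The sketch below assumes the packing scheme described in the Pattern format section: a first byte with the MSB set is a flag byte whose bits 0 to 4 each announce one following byte, and any other first byte starts a full 5-byte note. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the bytes occupied by rows * channels notes of pattern data.
 * Returns the byte count, or 0 if the data ends before all notes are
 * read (0 is also returned for a pattern with no notes at all).
 */
static size_t xm_packed_pattern_size(const uint8_t *data, size_t avail,
                                     unsigned rows, unsigned channels)
{
    size_t pos = 0;
    unsigned long remaining = (unsigned long)rows * channels;

    while (remaining-- > 0) {
        size_t need;

        if (pos >= avail)
            return 0;
        if (data[pos] & 0x80) {
            /* Packed note: flag byte plus one byte per set flag bit. */
            uint8_t flags = data[pos] & 0x1F;
            need = 1;
            while (flags) {
                need += flags & 1;
                flags >>= 1;
            }
        } else {
            /* Unpacked note: note, instrument, volume, effect, parameter. */
            need = 5;
        }
        if (need > avail - pos)
            return 0;
        pos += need;
    }
    return pos;
}
```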
Instrument header
Instrument size
The instrument header size in bytes. The standard minimal value is 29, meaning a totally empty
instrument. In a Stripped Module the minimal value is 4, and any of the following fields whose offset
lies beyond the instrument size should be treated as zero.
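A loader can apply this rule with small accessor functions that return zero for fields lying outside the stored header. The sketch below is illustrative only: the function names are hypothetical, offsets are relative to the start of the instrument header, and the caller is assumed to have at least [Instrument size] bytes available at `header`.

```c
#include <stdint.h>

/* Read a byte field, or 0 if it lies beyond the stored instrument header. */
static uint8_t xm_inst_u8(const uint8_t *header, uint32_t inst_size,
                          uint32_t offset)
{
    return (offset < inst_size) ? header[offset] : 0;
}

/* Read a little-endian word field, or 0 if it is not fully stored. */
static uint16_t xm_inst_u16(const uint8_t *header, uint32_t inst_size,
                            uint32_t offset)
{
    if (offset > inst_size || inst_size - offset < 2)
        return 0;
    return (uint16_t)(header[offset] | (header[offset + 1] << 8));
}
```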
Instrument name
A space-padded or zero-padded ASCII string, identifying the current instrument’s name, but could
be an arbitrary sequence of bytes.
Instrument type
Almost always 0. Doesn't mean anything.
Sample header size
The size of the sample header in bytes. You should use the "sample header size" (first field of the
second part of the instrument header) to compute the total size of the 2 headers stored in the file.
Sample header
Note: If the sample loop length is 0, the sample is *NOT* a looping one, even if the "Forward loop" bit is
set in the "TYPE" field.
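The note above translates into a simple check. The sketch assumes that the loop mode occupies the two low-order bits of the "TYPE" field (0 = no loop, 1 = forward loop, 2 = ping-pong loop); that bit layout is not reproduced in this section, so treat it as an assumption. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Return nonzero if the sample should be played as a looping sample.
 * Assumption: bits 0-1 of the TYPE field select the loop mode.
 */
static int xm_sample_is_looping(uint8_t type, uint32_t loop_length)
{
    return (type & 0x03) != 0 && loop_length != 0;
}
```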
Sample data
Regular samples are stored as either 8- or 16-bit signed delta values. The size of the sample buffer in
bytes is given in the [Sample header].[Sample length] field. To convert to real data:
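old = 0;                        /* running sum, same signed width as the samples */
for (i = 0; i < len; i++) {     /* len = number of sample values, not bytes */
    old += sample[i];           /* sample[i] is a signed delta */
    real_sample[i] = old;       /* decoded sample value */
}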
For example, take the following 8-bit sample data (in hex):
00 01 FF 02 01 FE ...
In decimal:
0 1 –1 2 1 –2 ...
Let’s convert it to real data. The first sample value will be 0. The second one will be the previous (0) plus
the delta value (1): 0 + 1 = 1. Next one: 1 + (-1) = 0 and so on:
0 1 0 2 3 1 ...
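The 8-bit case can be written as a small self-contained C function. This is a sketch with a hypothetical function name. It keeps the running sum in an unsigned byte so that overflow wraps around in a well-defined way, and converts to a signed value only when storing.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode 8-bit signed delta values into real 8-bit signed samples. */
static void xm_delta_decode8(const uint8_t *src, int8_t *dst, size_t len)
{
    uint8_t acc = 0;    /* running sum, wraps modulo 256 */
    size_t i;

    for (i = 0; i < len; i++) {
        acc = (uint8_t)(acc + src[i]);
        /* Reinterpret 0..255 as -128..127 without relying on casts. */
        dst[i] = (int8_t)(acc < 128 ? acc : acc - 256);
    }
}
```

Feeding it the bytes 00 01 FF 02 01 FE from the example produces 0 1 0 2 3 1.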
ADPCM-compressed samples are stored in the following format:
The compression table is a 16-byte array, similar to a color palette in a bitmap file. Every byte in the array
represents a delta value. Instead of storing the delta values directly in the sample data buffer, an ADPCM
sample consists of 4-bit indexes into the compression table, where the actual deltas are stored. The
contents of the compression table are almost always the same:
00 01 02 04 08 10 20 40 FF FE FC F8 F0 E0 D0 C0
or in decimal:
0 1 2 4 8 16 32 64 –1 –2 –4 –8 –16 –32 –48 –64
However, the actual values may be different (at least in theory). The compressed sample data is located
at offset 16 just after the compression table. Its size should be calculated as follows:
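old = 0;                                      /* running sum of deltas (signed 8-bit) */
for (i = 0, j = 0; i < len; i++, j += 2) {    /* len = number of packed bytes */
    index = sample[16 + i];                   /* one byte holds two 4-bit indexes */
    old += (signed char)sample[index & 0xF];  /* low-order nibble goes first */
    real_sample[j] = old;
    old += (signed char)sample[index >> 4];   /* then the high-order nibble */
    real_sample[j + 1] = old;
}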
An ADPCM example:
00 01 02 04 08 10 20 40 FF FE FC F8 F0 E0 D0 C0 00 08 00 18 31 02 81 ...
The first 16 bytes are the compression table. The 4-bit indexes begin at offset 16. Every sample byte holds 2
indexes. For example, 0x31 holds index 1 in the low-order nibble and index 3 in the high-order nibble.
Let’s extract all indexes starting at byte offset 16 (low-order nibble goes first):
0 0 8 0 0 0 8 1 1 3 2 0 1 8 ...
The corresponding delta values (from the preceding compression table) are:
00 00 FF 00 00 00 FF 01 01 04 02 00 01 FF ...
In decimal:
0 0 –1 0 0 0 –1 1 1 4 2 0 1 –1 ...
Now, let’s perform the delta conversion, as we did in the previous example:
0 0 –1 –1 –1 –1 –2 –1 0 4 6 6 7 6 ...
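The whole ADPCM walk-through can be reproduced with the following sketch. The function name is hypothetical, and the caller is assumed to know the number of packed bytes that follow the 16-byte compression table.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode an ADPCM sample: a 16-byte compression table followed by
 * packed_len bytes, each holding two 4-bit table indexes.
 * dst must have room for 2 * packed_len samples.
 * Returns the number of decoded samples.
 */
static size_t xm_adpcm_decode(const uint8_t *src, size_t packed_len,
                              int8_t *dst)
{
    const uint8_t *table = src;         /* 16 delta values */
    const uint8_t *packed = src + 16;   /* 4-bit indexes, two per byte */
    uint8_t acc = 0;                    /* running sum, wraps modulo 256 */
    size_t i, j = 0;

    for (i = 0; i < packed_len; i++) {
        acc = (uint8_t)(acc + table[packed[i] & 0x0F]);  /* low nibble first */
        dst[j++] = (int8_t)(acc < 128 ? acc : acc - 256);
        acc = (uint8_t)(acc + table[packed[i] >> 4]);    /* then high nibble */
        dst[j++] = (int8_t)(acc < 128 ? acc : acc - 256);
    }
    return j;
}
```

Applied to the 23 example bytes above (16 table bytes plus 7 packed bytes), it yields the 14 values 0 0 -1 -1 -1 -1 -2 -1 0 4 6 6 7 6.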
Pattern format
Note
Possible values (when MSB = 0):
0 No note
1 C-1 Do
2 C#1 Di (Sharp)
3 D-1 Re
4 D#1 Ri (Sharp)
5 E-1 Mi
6 F-1 Fa
7 F#1 Fi (Sharp)
8 G-1 Sol
9 G#1 Si (Sharp)
10 A-1 La
11 A#1 Li (Sharp)
12 B-1 Ti
13 C-2 Do 2nd octave
...
96 B-8 Ti 8th octave
97 Key off (aka 'Note off')
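The table above follows a regular pattern: 12 notes per octave, starting with C-1 at value 1. A helper that turns a note value into its display name might look like this. The function name and the "---", "===" and "???" placeholders are illustrative choices, not part of the format.

```c
#include <stdint.h>
#include <string.h>

/*
 * Write the 3-character name of a note value (MSB clear) into out.
 * out must hold at least 4 bytes, including the terminating NUL.
 */
static void xm_note_name(uint8_t note, char out[4])
{
    static const char names[12][3] = {
        "C-", "C#", "D-", "D#", "E-", "F-",
        "F#", "G-", "G#", "A-", "A#", "B-"
    };

    if (note == 0) {
        strcpy(out, "---");             /* no note */
    } else if (note == 97) {
        strcpy(out, "===");             /* key off */
    } else if (note > 97) {
        strcpy(out, "???");             /* not a valid note value */
    } else {
        unsigned n = note - 1u;         /* 0 = C-1 ... 95 = B-8 */
        out[0] = names[n % 12][0];
        out[1] = names[n % 12][1];
        out[2] = (char)('1' + n / 12);  /* octave digit 1..8 */
        out[3] = '\0';
    }
}
```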
A simple packing scheme is also adopted, so that the patterns don’t become too large. The MSB
in the note value is used for the compression. If the bit is set, then the other bits are interpreted as
follows:
bit 0 set: note follows
bit 1 set: instrument follows
bit 2 set: volume column byte follows
bit 3 set: effect type follows
bit 4 set: effect parameter follows
For example, the following two note packets are equivalent:
01 01 00 00 00
and
83 01 01
0x83 in binary is 10000011. Since the MSB is set, the low-order bits are interpreted as follows:
note follows, instrument follows, volume column not present, effect type not present, effect
parameter not present. So, the volume column, the effect type and the effect parameter should be
cleared to 0.
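The unpacking of a single note can be sketched as follows. The structure and function names are hypothetical. The sketch assumes the flag order used in the example above: bits 0 to 4 of the flag byte stand for note, instrument, volume column, effect type and effect parameter, in that order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one unpacked note. */
typedef struct {
    uint8_t note;
    uint8_t instrument;
    uint8_t volume;
    uint8_t effect;
    uint8_t param;
} XmNote;

/*
 * Unpack one note from pattern data.
 * Returns the number of bytes consumed, or 0 if the data is truncated.
 * Fields that are not present in a packed note are cleared to 0.
 */
static size_t xm_unpack_note(const uint8_t *data, size_t avail, XmNote *out)
{
    uint8_t fields[5] = {0, 0, 0, 0, 0};
    uint8_t flags = 0x1F;       /* unpacked note: all 5 bytes present */
    size_t pos = 0;
    int bit;

    if (avail == 0)
        return 0;
    if (data[0] & 0x80) {       /* MSB set: this byte is a flag byte */
        flags = data[0] & 0x1F;
        pos = 1;
    }
    for (bit = 0; bit < 5; bit++) {
        if (flags & (1u << bit)) {
            if (pos >= avail)
                return 0;
            fields[bit] = data[pos++];
        }
    }
    out->note = fields[0];
    out->instrument = fields[1];
    out->volume = fields[2];
    out->effect = fields[3];
    out->param = fields[4];
    return pos;
}
```

Both example packets decode to the same note: the unpacked form consumes 5 bytes and the packed form consumes 3.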
Effects
0 Arpeggio
1 (*) Porta up
2 (*) Porta down
3 (*) Tone porta
4 (*) Vibrato
5 (*) Tone porta+Volume slide
6 (*) Vibrato+Volume slide
7 (*) Tremolo
8 Set panning
9 Sample offset
A (*) Volume slide
B Position jump
C Set volume
D Pattern break
E1 (*) Fine porta up
E2 (*) Fine porta down
E3 Set gliss control
E4 Set vibrato control
E5 Set finetune
E6 Set loop begin/loop
E7 Set tremolo control
E9 Retrig note
EA (*) Fine volume slide up
EB (*) Fine volume slide down
EC Note cut
ED Note delay
EE Pattern delay
F Set tempo/BPM
G (010h) Set global volume
H (*) (011h) Global volume slide
I (012h) Unused
J (013h) Unused
K (014h) Unused
L (015h) Set envelope position
M (016h) Unused
N (017h) Unused
O (018h) Unused
P (*) (019h) Panning slide
Q (01ah) Unused
R (*) (01bh) Multi retrig note
S (01ch) Unused
T (01dh) Tremor
U (01eh) Unused
V (01fh) Unused
W (020h) Unused
X1 (*) (021h) Extra fine porta up
X2 (*) (021h) Extra fine porta down
_______________________________________________________________________
(*) = If the command byte is zero, the last nonzero byte for the command should be used.
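The rule for the commands marked with (*) amounts to a small per-command memory. A minimal sketch, assuming the player keeps one memory byte per channel for each such command (which commands share a memory slot is not covered here); the function name is hypothetical:

```c
#include <stdint.h>

/*
 * Resolve the parameter of a (*) command.
 * memory points to the last nonzero parameter seen for this command
 * on this channel; it is updated when a new nonzero value arrives.
 */
static uint8_t xm_effect_param(uint8_t param, uint8_t *memory)
{
    if (param != 0)
        *memory = param;
    return *memory;
}
```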
Annex A: Volumes and envelopes
The envelopes are processed once per frame, including the frames where new notes are read, and not
only on the frames where no new notes are read. This is also true for the instrument vibrato and the fadeout.
Annex B: Periods and frequencies
Don’t attempt to implement the 2^N operation as a bit shift, because N is actually a real number!
Most players use floating point arithmetic to implement the latter formula. If you can’t or don’t want
to use floating point operations, you can use a 768-entry doubleword array, like ModPlug does. No other
known way exists to compute the linear Frequency values.
When using linear Freq Tables, the distance between two octaves is 255 portamento units. For example, the
effect "1FF" at speed 1 will slide the pitch one octave up.
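For reference, the linear frequency table formulas are commonly quoted from Triton's original format description in the form below. They are not reproduced in this section, so treat them as an assumption to be checked against that document; Note is taken here as a zero-based note number (after adding the sample's relative note) and FineTune as the signed finetune value.

```latex
\begin{align*}
\text{Period}    &= 10 \cdot 12 \cdot 16 \cdot 4 \;-\; \text{Note} \cdot 16 \cdot 4 \;-\; \frac{\text{FineTune}}{2} \\
\text{Frequency} &= 8363 \cdot 2^{N}, \qquad
N = \frac{6 \cdot 12 \cdot 16 \cdot 4 - \text{Period}}{12 \cdot 16 \cdot 4} = \frac{4608 - \text{Period}}{768}
\end{align*}
```

Under the same assumption, the 768-entry table approach follows from splitting the exponent into an integer and a fractional part:

```latex
\begin{align*}
4608 - \text{Period} &= 768\,q + r, \qquad 0 \le r < 768 \\
\text{Frequency} &= 8363 \cdot 2^{r/768} \cdot 2^{q}
\end{align*}
```

Only the factor 2^(r/768) needs a real-valued exponent, so its 768 possible values can be precomputed, while the factor 2^q is an ordinary bit shift (to the right when q is negative).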
WORD PeriodTab[] = {
907,900,894,887,881,875,868,862,856,850,844,838,832,826,820,814,
808,802,796,791,785,779,774,768,762,757,752,746,741,736,730,725,
720,715,709,704,699,694,689,684,678,675,670,665,660,655,651,646,
640,636,632,628,623,619,614,610,604,601,597,592,588,584,580,575,
570,567,563,559,555,551,547,543,538,535,532,528,524,520,516,513,
508,505,502,498,494,491,487,484,480,477,474,470,467,463,460,457
};