Last Updated March 29, 2011

FMOD

FMOD is a commercial audio library made by Firelight Technologies that plays audio files in many formats on many different platforms. It is used in games and other software applications to provide audio functionality.

FMOD is composed of 3 main parts. We will be using the low-level API, known as "FMOD Ex".

This section is an overview of the FMOD Ex API and its features.

TERMINOLOGY / BASIC CONCEPTS (taken from fmodex.chm)

Introduction

Throughout FMOD documentation certain terms and concepts will be used. This section will explain some of these to alleviate confusion.

When you see an API function highlighted as a link, it is recommended that you check the API reference for more detail.

Samples vs bytes vs milliseconds

Within FMOD functions you will see references to PCM samples, bytes and milliseconds.
To explain the difference, a diagram has been provided to show how raw PCM sample data is stored in FMOD buffers.

In this diagram you will see that a stereo sound has its left/right data interleaved one after the other.

A left/right pair (a sound with 2 channels) is called a sample.
Because this is made up of 16-bit data, 1 sample = 4 bytes (2 channels of 2 bytes each).
If the sample rate, or playback rate, is 44.1kHz, or 44100 samples per second, then 1 sample is 1/44100th of a second, or roughly 1/44th of a millisecond. Therefore 44100 samples = 1 second, or 1000ms, worth of data.
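
The interleaved layout described above can be sketched in C++ as follows. This is only an illustration, assuming 16-bit stereo data; the helper name splitStereo and the use of std::vector are assumptions made for the example and are not part of the FMOD API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an FMOD function): splits an interleaved
// 16-bit stereo buffer laid out as L R L R ... into two mono buffers.
void splitStereo(const std::vector<std::int16_t>& interleaved,
                 std::vector<std::int16_t>& left,
                 std::vector<std::int16_t>& right)
{
    const std::size_t channels = 2;
    // One "sample" in the sense used here is one left/right pair.
    const std::size_t samples = interleaved.size() / channels;

    left.resize(samples);
    right.resize(samples);
    for (std::size_t i = 0; i < samples; ++i) {
        left[i]  = interleaved[i * channels];      // even positions: left
        right[i] = interleaved[i * channels + 1];  // odd positions: right
    }
}
```

Each pair occupies 2 * sizeof(std::int16_t) = 4 bytes, which matches the "1 sample = 4 bytes" figure for 16-bit stereo data.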

To convert between the different terminologies, the following formulas can be used.
ms = samples * 1000 / samplerate.
samples = ms * samplerate / 1000.
samplerate = samples * 1000 / ms.
bytes = samples * bits * channels / 8.
samples = bytes * 8 / bits / channels.
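
As a worked illustration, the formulas above could be written as small C++ helpers. The function names are hypothetical (they are not FMOD functions), and the sketch assumes integer arithmetic, so results are truncated toward zero.

```cpp
#include <cstdint>

// Hypothetical helpers for the conversion formulas; not part of FMOD.
// 64-bit integers keep the intermediate products from overflowing.

std::uint64_t samplesToMs(std::uint64_t samples, std::uint64_t sampleRate)
{
    return samples * 1000 / sampleRate;
}

std::uint64_t msToSamples(std::uint64_t ms, std::uint64_t sampleRate)
{
    return ms * sampleRate / 1000;
}

std::uint64_t samplesToBytes(std::uint64_t samples, std::uint64_t bits,
                             std::uint64_t channels)
{
    return samples * bits * channels / 8;
}

std::uint64_t bytesToSamples(std::uint64_t bytes, std::uint64_t bits,
                             std::uint64_t channels)
{
    return bytes * 8 / bits / channels;
}
```

For example, samplesToMs(44100, 44100) gives 1000, and samplesToBytes(1, 16, 2) gives 4, in line with the 16-bit stereo case described earlier.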

Some functions like Sound::getLength provide the length in milliseconds, bytes and samples to avoid needing to do these calculations.

Sounds. Samples vs compressed samples vs streams.
When a sound is loaded, it is either decompressed as a static sample into memory as PCM (samples), loaded into memory in its native format and decompressed at runtime (compressed samples), or streamed and decoded in realtime (in chunks) from external media such as a hard disk or CD (streams).
"Samples" are good for small sounds that need to be played more than once at a time, for example sound effects. These generally use little or no CPU to play back and can be hardware accelerated.
"Streams" are good for large sounds that are too large to fit into memory and need to be streamed from disk into a small ring buffer that FMOD manages. These take a small amount of CPU and disk bandwidth, depending on the file format. For example, MP3 takes more CPU power to decode in realtime than an uncompressed PCM WAV file does. A streaming sound can only be played once at a time, not multiple times simultaneously, because it has only 1 file handle per stream and 1 ring buffer to decode into.
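
To make the ring buffer idea more concrete, here is a minimal C++ sketch of a stream with a single read position and a single ring buffer. It is an illustration only and not FMOD's internal implementation: the class name StreamSketch is hypothetical, and a std::vector stands in for the file on disk.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch of a streaming sound: one read cursor into the source
// (standing in for the file handle) and one small ring buffer.
class StreamSketch {
public:
    StreamSketch(std::vector<std::int16_t> source, std::size_t ringSize)
        : source_(std::move(source)), ring_(ringSize) {}

    // Decode step: copy source data into whatever ring space is free.
    // Returns the number of values written.
    std::size_t fill()
    {
        std::size_t written = 0;
        while (count_ < ring_.size() && sourcePos_ < source_.size()) {
            ring_[(readPos_ + count_) % ring_.size()] = source_[sourcePos_++];
            ++count_;
            ++written;
        }
        return written;
    }

    // Playback step: the mixer consumes up to n values from the ring.
    // Returns the number of values actually read.
    std::size_t read(std::int16_t* out, std::size_t n)
    {
        std::size_t got = 0;
        while (got < n && count_ > 0) {
            out[got++] = ring_[readPos_];
            readPos_ = (readPos_ + 1) % ring_.size();
            --count_;
        }
        return got;
    }

private:
    std::vector<std::int16_t> source_;  // stands in for the file on disk
    std::size_t sourcePos_ = 0;         // the single "file handle" position
    std::vector<std::int16_t> ring_;    // the single ring buffer
    std::size_t readPos_ = 0;           // next value to hand to the mixer
    std::size_t count_ = 0;             // values currently buffered
};
```

Because the source position and the ring buffer are shared state, two simultaneous playbacks of the same object would compete for the same data, which is the limitation the paragraph above describes.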

Hardware vs Software
FMOD Ex supports hardware accelerated sound playback, via DirectSound or console hardware APIs, but it also has its own fallback software mixing mechanism.

Software sounds (created with FMOD_SOFTWARE) sometimes have a higher CPU impact, but can do much more, for example complex looping, realtime analysis, effects and sample accurate synchronization.

Hardware Pros.
Usually lower latency. (Although on consoles, or with ASIO output on Windows, using FMOD_SOFTWARE can give extremely low latency, as low as 2-5ms.)
Less CPU time. (Although on Windows software mixing is a lot faster, due to bad sound card driver design and inefficiencies in the DirectSound API.)
On Windows, access to hardware reverb per voice.
Free hardware obstruction / occlusion on Windows.

Hardware Cons.
No point to point looping on win32.
No access to hardware effects per voice. Most PC sound cards and consoles do not support hardware accelerated effects such as lowpass, distortion, flange, chorus etc.
No loop count control. A sound can only be looped infinitely or not at all.
Sometimes a lot slower than FMOD software mixing on Windows.

Software Pros.
Consistent sound on every platform; there is no variation in playback.
Sample accurate synchronization callbacks and events.
Cross platform reverb.
Complex looping and loop counts.
Reverse sample playback.
Spectrum analysis.
Filters per channel or for the global mix, to perform effects such as lowpass, distortion, flange, chorus etc.
Complex DSP network construction for realtime sound synthesis.
Access to final mix buffer to allow analyzing, drawing to screen, or saving to file.

Software Cons.
Latency on some sound devices can be high.
Memory usage is higher due to allocation of mix units and mix buffers, or simply the fact of having to store sounds in main RAM rather than sound RAM.

Channels and sounds.
When you have loaded your sounds, you will want to play them. To play them, use System::playSound, which returns a pointer to a channel (or an FMOD_CHANNEL handle).
It is generally recommended that the channel index passed to System::playSound always be FMOD_CHANNEL_FREE. This means FMOD will choose a non-playing channel for you to play on.
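
The idea of letting the library pick a non-playing channel can be illustrated with a small C++ sketch. This is not the FMOD API and does not claim to mirror how FMOD selects channels internally; ChannelSketch, pickFreeChannel and playOnFreeChannel are hypothetical names, and the behaviour when every channel is busy (returning -1) is an assumption of the sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a channel; a real channel holds far more state.
struct ChannelSketch {
    bool playing = false;
};

// Returns the index of the first non-playing channel, or -1 if all are busy.
int pickFreeChannel(const std::vector<ChannelSketch>& channels)
{
    for (std::size_t i = 0; i < channels.size(); ++i) {
        if (!channels[i].playing) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Marks a free channel as playing and returns its index (-1 if none is free).
int playOnFreeChannel(std::vector<ChannelSketch>& channels)
{
    const int index = pickFreeChannel(channels);
    if (index >= 0) {
        channels[static_cast<std::size_t>(index)].playing = true;
    }
    return index;
}
```

The caller never names a channel index; it only receives whichever channel was chosen, which is the convenience the recommendation above is aiming for.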

2D vs 3D.
A 3D sound source is a channel that has a position and a velocity. When a 3D channel is playing, its volume, speaker placement and pitch will be affected automatically based on the relation to the listener.
A listener is the player, or the game camera. Like a sound source, it has a position and a velocity, but it also has an orientation.

The distance between the listener and the source determines the volume.
The relative velocity of the listener and the source determines the pitch (doppler effect).
The orientation of the listener relative to the source determines the pan or speaker placement.
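
The three relationships above can be sketched numerically in C++. This uses a generic textbook model (inverse distance rolloff and the classic doppler formula), which is an assumption for illustration and not necessarily the exact math FMOD applies. All names are hypothetical, minDistance is assumed to be greater than zero, listenerRight is assumed to be a unit vector, and speeds are assumed to be well below speedOfSound.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static float length(Vec3 v)      { return std::sqrt(dot(v, v)); }

// Volume from distance: full volume inside minDistance, inverse rolloff beyond.
float distanceGain(Vec3 listenerPos, Vec3 sourcePos, float minDistance)
{
    const float d = length(sub(sourcePos, listenerPos));
    return minDistance / std::max(d, minDistance);
}

// Pitch multiplier from relative velocity (doppler effect).
// Velocities are projected onto the line from the source to the listener.
float dopplerPitch(Vec3 listenerPos, Vec3 listenerVel,
                   Vec3 sourcePos, Vec3 sourceVel, float speedOfSound)
{
    const Vec3 toListener = sub(listenerPos, sourcePos);
    const float d = length(toListener);
    if (d == 0.0f) return 1.0f;  // no direction defined: leave pitch unchanged
    const float vListener = dot(toListener, listenerVel) / d;
    const float vSource   = dot(toListener, sourceVel) / d;
    return (speedOfSound - vListener) / (speedOfSound - vSource);
}

// Pan in [-1, 1] from listener orientation: -1 is fully left, +1 fully right.
float panFromOrientation(Vec3 listenerPos, Vec3 listenerRight, Vec3 sourcePos)
{
    const Vec3 toSource = sub(sourcePos, listenerPos);
    const float d = length(toSource);
    if (d == 0.0f) return 0.0f;  // source at the listener: centre it
    return dot(toSource, listenerRight) / d;
}
```

With this model, a source approaching the listener yields a pitch multiplier above 1, and a source directly to the listener's right yields a pan of +1.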

A 2D sound differs simply in that it is not affected by the 3D sound listener: the listener does not apply doppler, attenuation or speaker placement to it.
A 2D sound can call Channel::setSpeakerMix, Channel::setSpeakerLevels or Channel::setPan, whereas a 3D sound cannot.
A 3D sound can call any function with the word 3D in the function name, whereas a 2D sound cannot.

For a more detailed description of 3D sound, read the tutorial in the documentation on 3D sound.
