DirectX/10.0/Direct3D/Direct Sound

This tutorial will cover the basics of using Direct Sound in DirectX 11 as well as how to load and play .wav audio files. It is based on the code in the previous DirectX 11 tutorials. Before we start the code part of the tutorial I will cover a couple of basics about Direct Sound in DirectX 11 as well as a bit about sound formats.

The first thing you will notice is that in DirectX 11 the Direct Sound API is still the same one from DirectX 8. The only major difference is that hardware sound mixing is generally not available on the latest Windows operating systems. For security and operating system consistency, all hardware calls now have to go through a security layer. Older sound cards used DMA (direct memory access), which was very fast but doesn't work with this new Windows security model. As a result all sound mixing is now done at the software level, and no hardware acceleration is directly available to this API.

The nice thing about Direct Sound is that it does not tie you to a particular audio file format. In this tutorial I cover the .wav audio format, but you can replace the .wav code with .mp3 or anything you prefer. You can even use your own audio format if you have created one. Direct Sound is easy to use: you create a sound buffer with the playback format you would like, convert your audio data into that buffer's format as you copy it in, and then it is ready to play. This simplicity is a big part of why so many applications use Direct Sound.
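
To make the "copy your audio format into the buffer's format" step concrete, here is a minimal sketch of one such conversion. It assumes the source data is 8-bit unsigned mono PCM and the playback buffer expects 16-bit signed stereo PCM. The function name ConvertMono8ToStereo16 is hypothetical and is not part of Direct Sound or of this tutorial's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Convert 8-bit unsigned mono PCM into 16-bit signed stereo PCM.
// 8-bit samples are unsigned with silence at 128, while 16-bit samples
// are signed with silence at 0.
std::vector<int16_t> ConvertMono8ToStereo16(const std::vector<uint8_t>& mono8)
{
    std::vector<int16_t> stereo16;
    stereo16.reserve(mono8.size() * 2);

    for (uint8_t sample : mono8)
    {
        // Re-center around zero, then scale from the 8-bit to the 16-bit range.
        int16_t converted = static_cast<int16_t>((static_cast<int>(sample) - 128) * 256);

        // Duplicate the mono sample into the left and right channels.
        stereo16.push_back(converted);
        stereo16.push_back(converted);
    }

    // Every input sample produces exactly one stereo frame.
    assert(stereo16.size() == mono8.size() * 2);
    return stereo16;
}
```

The resulting vector could then be copied into a secondary buffer in place of the raw .wav data. A compressed format such as .mp3 would need a real decoder in front of this step.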

Note that Direct Sound does use two different kinds of buffers which are primary and secondary buffers. The primary buffer is the main sound memory buffer on your default sound card, USB headset, and so forth. Secondary buffers are buffers you create in memory and load your sounds into. When you play a secondary buffer the Direct Sound API takes care of mixing that sound into the primary buffer which then plays the sound. If you play multiple secondary buffers at the same time it will mix them together and play them in the primary buffer. Also note that all buffers are circular so you can set them to repeat indefinitely.
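
The idea of mixing several secondary buffers into one output can be sketched in a few lines. This is only an illustration of the concept, assuming two 16-bit PCM streams with the same layout. It is not how the Direct Sound mixer is actually implemented, and MixBuffers is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mix two 16-bit PCM streams by summing matching samples and clamping
// the result to the 16-bit range. The shorter stream is treated as silence
// once it runs out.
std::vector<int16_t> MixBuffers(const std::vector<int16_t>& a, const std::vector<int16_t>& b)
{
    std::vector<int16_t> mixed(std::max(a.size(), b.size()), 0);

    for (std::size_t i = 0; i < mixed.size(); ++i)
    {
        // Sum in a wider type so the addition itself cannot overflow.
        int32_t sum = 0;
        if (i < a.size()) sum += a[i];
        if (i < b.size()) sum += b[i];

        // Clamp instead of wrapping so loud passages clip rather than turn into noise.
        sum = std::min<int32_t>(32767, std::max<int32_t>(-32768, sum));
        mixed[i] = static_cast<int16_t>(sum);
    }

    return mixed;
}
```

With Direct Sound you never write this loop yourself. Playing two secondary buffers at once is enough for the API to do the equivalent work for you.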

To start the tutorial we will first look at the updated framework. The only new class is the SoundClass which contains all the DirectSound and .wav format functionality. I have removed the other classes to simplify this tutorial.

Soundclass.h

The SoundClass encapsulates the DirectSound functionality as well as the .wav audio loading and playing capabilities.

///////////////////////////////////////////////////////////////////////////////
// Filename: soundclass.h
///////////////////////////////////////////////////////////////////////////////
#ifndef _SOUNDCLASS_H_
#define _SOUNDCLASS_H_

The following libraries and headers are required for DirectSound to compile properly.

/////////////
// LINKING //
/////////////
#pragma comment(lib, "dsound.lib")
#pragma comment(lib, "dxguid.lib")
#pragma comment(lib, "winmm.lib")
 
 
//////////////
// INCLUDES //
//////////////
#include <windows.h>
#include <mmsystem.h>
#include <dsound.h>
#include <stdio.h>
 
 
///////////////////////////////////////////////////////////////////////////////
// Class name: SoundClass
///////////////////////////////////////////////////////////////////////////////
class SoundClass
{
private:

The WaveHeaderType structure used here is for the .wav file format. When loading in .wav files I first read in the header to determine the required information for loading in the .wav audio data. If you are using a different format you will want to replace this header with the one required for your audio format.

	struct WaveHeaderType
	{
		char chunkId[4];
		unsigned long chunkSize;
		char format[4];
		char subChunkId[4];
		unsigned long subChunkSize;
		unsigned short audioFormat;
		unsigned short numChannels;
		unsigned long sampleRate;
		unsigned long bytesPerSecond;
		unsigned short blockAlign;
		unsigned short bitsPerSample;
		char dataChunkId[4];
		unsigned long dataSize;
	};
 
public:
	SoundClass();
	SoundClass(const SoundClass&);
	~SoundClass();

Initialize and Shutdown will handle everything needed for this tutorial. The Initialize function will initialize DirectSound, load in the .wav audio file, and then play it once. Shutdown will release the .wav file and shut down DirectSound.

	bool Initialize(HWND);
	void Shutdown();
 
private:
	bool InitializeDirectSound(HWND);
	void ShutdownDirectSound();
 
	bool LoadWaveFile(char*, IDirectSoundBuffer8**);
	void ShutdownWaveFile(IDirectSoundBuffer8**);
 
	bool PlayWaveFile();
 
private:
	IDirectSound8* m_DirectSound;
	IDirectSoundBuffer* m_primaryBuffer;

Note that I only have one secondary buffer as this tutorial only loads in one sound.

	IDirectSoundBuffer8* m_secondaryBuffer1;
};
 
#endif
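
The fixed WaveHeaderType layout assumes the "data" chunk immediately follows a 16-byte "fmt " chunk. Some .wav files carry extra chunks (for example a "LIST" chunk) in between, and those would be rejected by a fixed 44-byte header. As a hedged alternative, the sketch below walks the RIFF chunk list to locate a chunk by its tag. It assumes the whole file has already been read into memory, and the names ChunkLocation, ReadLE32, and FindChunk are hypothetical helpers that are not part of this tutorial's SoundClass.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Where a chunk's payload lives inside the file image.
struct ChunkLocation
{
    bool found;
    std::size_t offset;  // Byte offset of the chunk payload.
    uint32_t size;       // Payload size in bytes.
};

// Read a 32-bit little-endian value, which is how RIFF stores sizes.
static uint32_t ReadLE32(const uint8_t* p)
{
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Scan the chunks that follow the 12-byte "RIFF"/size/"WAVE" header.
ChunkLocation FindChunk(const std::vector<uint8_t>& file, const char* tag)
{
    assert(std::strlen(tag) == 4);
    ChunkLocation result = { false, 0, 0 };

    if (file.size() < 12 || std::memcmp(file.data(), "RIFF", 4) != 0 ||
        std::memcmp(file.data() + 8, "WAVE", 4) != 0)
    {
        return result;
    }

    std::size_t pos = 12;
    while (pos + 8 <= file.size())
    {
        uint32_t size = ReadLE32(&file[pos + 4]);
        if (std::memcmp(&file[pos], tag, 4) == 0)
        {
            result.found = true;
            result.offset = pos + 8;
            result.size = size;
            return result;
        }

        // Skip this chunk. Chunks are padded to an even number of bytes.
        pos += 8 + static_cast<std::size_t>(size) + (size & 1);
    }

    return result;
}
```

A loader built on this would call FindChunk once for "fmt " and once for "data", then validate that the reported payload actually fits inside the file before reading from it.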

Soundclass.cpp

///////////////////////////////////////////////////////////////////////////////
// Filename: soundclass.cpp
///////////////////////////////////////////////////////////////////////////////
#include "soundclass.h"

Use the class constructor to initialize the private member variables that are used inside the sound class.

SoundClass::SoundClass()
{
	m_DirectSound = 0;
	m_primaryBuffer = 0;
	m_secondaryBuffer1 = 0;
}
 
 
SoundClass::SoundClass(const SoundClass& other)
{
}
 
 
SoundClass::~SoundClass()
{
}
 
 
bool SoundClass::Initialize(HWND hwnd)
{
	bool result;

First initialize the DirectSound API as well as the primary buffer. Once that is initialized then the LoadWaveFile function can be called which will load in the .wav audio file and initialize the secondary buffer with the audio information from the .wav file. After loading is complete then PlayWaveFile is called which then plays the .wav file once.

	// Initialize direct sound and the primary sound buffer.
	result = InitializeDirectSound(hwnd);
	if(!result)
	{
		return false;
	}
 
	// Load a wave audio file onto a secondary buffer.
	result = LoadWaveFile("../Engine/data/sound01.wav", &m_secondaryBuffer1);
	if(!result)
	{
		return false;
	}
 
	// Play the wave file now that it has been loaded.
	result = PlayWaveFile();
	if(!result)
	{
		return false;
	}
 
	return true;
}

The Shutdown function first releases the secondary buffer which held the .wav file audio data using the ShutdownWaveFile function. Once that completes, this function then calls ShutdownDirectSound which releases the primary buffer and the DirectSound interface.

void SoundClass::Shutdown()
{
	// Release the secondary buffer.
	ShutdownWaveFile(&m_secondaryBuffer1);

	// Shutdown the Direct Sound API.
	ShutdownDirectSound();
 
	return;
}

InitializeDirectSound handles getting an interface pointer to DirectSound and the default primary sound buffer. Note that you can query the system for all the sound devices and then grab the pointer to the primary sound buffer for a specific device. However, I've kept this tutorial simple and just grabbed the pointer to the primary buffer of the default sound device.

bool SoundClass::InitializeDirectSound(HWND hwnd)
{
	HRESULT result;
	DSBUFFERDESC bufferDesc;
	WAVEFORMATEX waveFormat;
 
 
	// Initialize the direct sound interface pointer for the default sound device.
	result = DirectSoundCreate8(NULL, &m_DirectSound, NULL);
	if(FAILED(result))
	{
		return false;
	}
 
	// Set the cooperative level to priority so the format of the primary sound buffer can be modified.
	result = m_DirectSound->SetCooperativeLevel(hwnd, DSSCL_PRIORITY);
	if(FAILED(result))
	{
		return false;
	}

We have to set up the description of how we want to access the primary buffer. The dwFlags are the important part of this structure. In this case we just want to set up a primary buffer description with the capability of adjusting its volume. There are other capabilities you can grab but we are keeping it simple for now.

	// Setup the primary buffer description.
	bufferDesc.dwSize = sizeof(DSBUFFERDESC);
	bufferDesc.dwFlags = DSBCAPS_PRIMARYBUFFER | DSBCAPS_CTRLVOLUME;
	bufferDesc.dwBufferBytes = 0;
	bufferDesc.dwReserved = 0;
	bufferDesc.lpwfxFormat = NULL;
	bufferDesc.guid3DAlgorithm = GUID_NULL;
 
	// Get control of the primary sound buffer on the default sound device.
	result = m_DirectSound->CreateSoundBuffer(&bufferDesc, &m_primaryBuffer, NULL);
	if(FAILED(result))
	{
		return false;
	}

Now that we have control of the primary buffer on the default sound device we want to change its format to our desired audio file format. Here I have decided we want high quality sound so we will set it to uncompressed CD audio quality.

	// Setup the format of the primary sound buffer.
	// In this case it is a .WAV file recorded at 44,100 samples per second in 16-bit stereo (cd audio format).
	waveFormat.wFormatTag = WAVE_FORMAT_PCM;
	waveFormat.nSamplesPerSec = 44100;
	waveFormat.wBitsPerSample = 16;
	waveFormat.nChannels = 2;
	waveFormat.nBlockAlign = (waveFormat.wBitsPerSample / 8) * waveFormat.nChannels;
	waveFormat.nAvgBytesPerSec = waveFormat.nSamplesPerSec * waveFormat.nBlockAlign;
	waveFormat.cbSize = 0;
 
	// Set the primary buffer to be the wave format specified.
	result = m_primaryBuffer->SetFormat(&waveFormat);
	if(FAILED(result))
	{
		return false;
	}
 
	return true;
}
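
The two derived WAVEFORMATEX fields are simple arithmetic on the other three. For the CD format used above, one sample frame is (16 / 8) * 2 = 4 bytes and one second of audio is 44100 * 4 = 176,400 bytes. The sketch below reproduces that calculation with a small stand-in struct so it can be tried without the Windows headers. PcmFormat and the function names are hypothetical and only mirror the relevant WAVEFORMATEX fields.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the three WAVEFORMATEX fields the others derive from.
struct PcmFormat
{
    uint16_t channels;
    uint32_t samplesPerSec;
    uint16_t bitsPerSample;
};

// Size in bytes of one sample frame (one sample for every channel).
uint16_t BlockAlign(const PcmFormat& format)
{
    assert(format.bitsPerSample % 8 == 0);
    return static_cast<uint16_t>((format.bitsPerSample / 8) * format.channels);
}

// Bytes consumed by one second of audio.
uint32_t AvgBytesPerSec(const PcmFormat& format)
{
    return format.samplesPerSec * BlockAlign(format);
}

// Buffer size needed to hold a whole number of seconds of audio.
uint32_t BytesForSeconds(const PcmFormat& format, uint32_t seconds)
{
    return AvgBytesPerSec(format) * seconds;
}
```

The same arithmetic is handy when sizing a secondary buffer for streaming, where the buffer holds a fixed duration of audio rather than a whole file.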

The ShutdownDirectSound function handles releasing the primary buffer and DirectSound interfaces.

void SoundClass::ShutdownDirectSound()
{
	// Release the primary sound buffer pointer.
	if(m_primaryBuffer)
	{
		m_primaryBuffer->Release();
		m_primaryBuffer = 0;
	}
 
	// Release the direct sound interface pointer.
	if(m_DirectSound)
	{
		m_DirectSound->Release();
		m_DirectSound = 0;
	}
 
	return;
}
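
The check, Release, and null-out pattern is repeated for every interface pointer in this class. One common way to avoid the repetition is a small template helper. The sketch below is an optional convenience and not something the tutorial code uses. SafeRelease is a hypothetical name, and FakeInterface is a made-up stand-in so the helper can be tried without dsound.h.

```cpp
#include <cassert>

// Release a COM-style interface pointer if it is set, then null it out
// so a second call is harmless.
template <typename T>
void SafeRelease(T*& pointer)
{
    if (pointer)
    {
        pointer->Release();
        pointer = nullptr;
    }
}

// Hypothetical stand-in for a Direct Sound interface. Real interfaces
// expose Release() through IUnknown.
struct FakeInterface
{
    unsigned long refCount = 1;

    unsigned long Release()
    {
        assert(refCount > 0);
        return --refCount;
    }
};
```

With such a helper, the body of ShutdownDirectSound would reduce to one SafeRelease call for m_primaryBuffer followed by one for m_DirectSound.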

The LoadWaveFile function handles loading in a .wav audio file and then copying the data onto a new secondary buffer. If you want to support different formats you would replace this function or write a similar one.

bool SoundClass::LoadWaveFile(char* filename, IDirectSoundBuffer8** secondaryBuffer)
{
	int error;
	FILE* filePtr;
	unsigned int count;
	WaveHeaderType waveFileHeader;
	WAVEFORMATEX waveFormat;
	DSBUFFERDESC bufferDesc;
	HRESULT result;
	IDirectSoundBuffer* tempBuffer;
	unsigned char* waveData;
	unsigned char *bufferPtr;
	unsigned long bufferSize;

To start, first open the .wav file and read in the header of the file. The header will contain all the information about the audio file so we can use that to create a secondary buffer to accommodate the audio data. The audio file header also tells us where the data begins and how big it is. You will notice I check for all the needed tags to ensure the audio file is not corrupt and is the proper wave file format, containing the RIFF, WAVE, fmt, and data tags as well as the WAVE_FORMAT_PCM audio format. I also do a couple of other checks to ensure it is a 44.1 kHz stereo 16-bit audio file. If it is mono, 22.05 kHz, 8-bit, or anything else then it will fail, ensuring we are only loading the exact format we want.

	// Open the wave file in binary.
	error = fopen_s(&filePtr, filename, "rb");
	if(error != 0)
	{
		return false;
	}
 
	// Read in the wave file header.
	count = fread(&waveFileHeader, sizeof(waveFileHeader), 1, filePtr);
	if(count != 1)
	{
		fclose(filePtr);
		return false;
	}

	// Check that the chunk ID is the RIFF format.
	if((waveFileHeader.chunkId[0] != 'R') || (waveFileHeader.chunkId[1] != 'I') || 
	   (waveFileHeader.chunkId[2] != 'F') || (waveFileHeader.chunkId[3] != 'F'))
	{
		fclose(filePtr);
		return false;
	}

	// Check that the file format is the WAVE format.
	if((waveFileHeader.format[0] != 'W') || (waveFileHeader.format[1] != 'A') ||
	   (waveFileHeader.format[2] != 'V') || (waveFileHeader.format[3] != 'E'))
	{
		fclose(filePtr);
		return false;
	}

	// Check that the sub chunk ID is the fmt format.
	if((waveFileHeader.subChunkId[0] != 'f') || (waveFileHeader.subChunkId[1] != 'm') ||
	   (waveFileHeader.subChunkId[2] != 't') || (waveFileHeader.subChunkId[3] != ' '))
	{
		fclose(filePtr);
		return false;
	}

	// Check that the audio format is WAVE_FORMAT_PCM.
	if(waveFileHeader.audioFormat != WAVE_FORMAT_PCM)
	{
		fclose(filePtr);
		return false;
	}

	// Check that the wave file was recorded in stereo format.
	if(waveFileHeader.numChannels != 2)
	{
		fclose(filePtr);
		return false;
	}

	// Check that the wave file was recorded at a sample rate of 44.1 KHz.
	if(waveFileHeader.sampleRate != 44100)
	{
		fclose(filePtr);
		return false;
	}

	// Ensure that the wave file was recorded in 16 bit format.
	if(waveFileHeader.bitsPerSample != 16)
	{
		fclose(filePtr);
		return false;
	}

	// Check for the data chunk header.
	if((waveFileHeader.dataChunkId[0] != 'd') || (waveFileHeader.dataChunkId[1] != 'a') ||
	   (waveFileHeader.dataChunkId[2] != 't') || (waveFileHeader.dataChunkId[3] != 'a'))
	{
		fclose(filePtr);
		return false;
	}

Now that the wave file header has been verified we can set up the secondary buffer we will load the audio data onto. We first have to set the wave format and buffer description of the secondary buffer, similar to how we did for the primary buffer. Because this is a secondary buffer and not the primary one, the dwFlags and dwBufferBytes values are different.

	// Set the wave format of secondary buffer that this wave file will be loaded onto.
	waveFormat.wFormatTag = WAVE_FORMAT_PCM;
	waveFormat.nSamplesPerSec = 44100;
	waveFormat.wBitsPerSample = 16;
	waveFormat.nChannels = 2;
	waveFormat.nBlockAlign = (waveFormat.wBitsPerSample / 8) * waveFormat.nChannels;
	waveFormat.nAvgBytesPerSec = waveFormat.nSamplesPerSec * waveFormat.nBlockAlign;
	waveFormat.cbSize = 0;
 
	// Set the buffer description of the secondary sound buffer that the wave file will be loaded onto.
	bufferDesc.dwSize = sizeof(DSBUFFERDESC);
	bufferDesc.dwFlags = DSBCAPS_CTRLVOLUME;
	bufferDesc.dwBufferBytes = waveFileHeader.dataSize;
	bufferDesc.dwReserved = 0;
	bufferDesc.lpwfxFormat = &waveFormat;
	bufferDesc.guid3DAlgorithm = GUID_NULL;

Now the way to create a secondary buffer is fairly strange. The first step is to create a temporary IDirectSoundBuffer with the sound buffer description you set up for the secondary buffer. If this succeeds then you can use that temporary buffer to create an IDirectSoundBuffer8 secondary buffer by calling QueryInterface with the IID_IDirectSoundBuffer8 parameter. If this succeeds then you can release the temporary buffer and the secondary buffer is ready for use.

	// Create a temporary sound buffer with the specific buffer settings.
	result = m_DirectSound->CreateSoundBuffer(&bufferDesc, &tempBuffer, NULL);
	if(FAILED(result))
	{
		fclose(filePtr);
		return false;
	}

	// Test the buffer format against the direct sound 8 interface and create the secondary buffer.
	result = tempBuffer->QueryInterface(IID_IDirectSoundBuffer8, (void**)secondaryBuffer);

	// Release the temporary buffer whether or not the query succeeded so it is never leaked.
	tempBuffer->Release();
	tempBuffer = 0;

	if(FAILED(result))
	{
		fclose(filePtr);
		return false;
	}

Now that the secondary buffer is ready we can load in the wave data from the audio file. First I load it into a memory buffer so I can check and modify the data if I need to. Once the data is in memory you then lock the secondary buffer, copy the data to it using a memcpy, and then unlock it. This secondary buffer is now ready for use. Note that locking the secondary buffer can actually hand back two pointers and two sizes to write to. This is because it is a circular buffer: if you start writing in the middle of it, the locked region may wrap around past the end, so the first pointer and size cover the part up to the end of the buffer and the second pair covers the remainder at the start. This is useful for streaming audio and such. In this tutorial we create a buffer that is the same size as the audio file and write from the beginning to make things simple.

	// Move to the beginning of the wave data which starts at the end of the data chunk header.
	fseek(filePtr, sizeof(WaveHeaderType), SEEK_SET);
 
	// Create a temporary buffer to hold the wave file data.
	waveData = new unsigned char[waveFileHeader.dataSize];
	if(!waveData)
	{
		return false;
	}
 
	// Read in the wave file data into the newly created buffer.
	count = fread(waveData, 1, waveFileHeader.dataSize, filePtr);
	if(count != waveFileHeader.dataSize)
	{
		delete [] waveData;
		fclose(filePtr);
		return false;
	}

	// Close the file once done reading.
	error = fclose(filePtr);
	if(error != 0)
	{
		delete [] waveData;
		return false;
	}

	// Lock the secondary buffer to write wave data into it.
	result = (*secondaryBuffer)->Lock(0, waveFileHeader.dataSize, (void**)&bufferPtr, (DWORD*)&bufferSize, NULL, 0, 0);
	if(FAILED(result))
	{
		delete [] waveData;
		return false;
	}

	// Copy the wave data into the buffer.
	memcpy(bufferPtr, waveData, waveFileHeader.dataSize);

	// Unlock the secondary buffer after the data has been written to it.
	result = (*secondaryBuffer)->Unlock((void*)bufferPtr, bufferSize, NULL, 0);
	if(FAILED(result))
	{
		delete [] waveData;
		return false;
	}
	
	// Release the wave data since it was copied into the secondary buffer.
	delete [] waveData;
	waveData = 0;
 
	return true;
}
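
LoadWaveFile always locks from offset zero, so it only ever needs the first pointer. When writing into the middle of a circular buffer, as you would for streaming, the write may wrap and be split into two regions. The sketch below imitates that split on an ordinary byte vector so the arithmetic is easy to follow. It is an illustration under that assumption and does not call Direct Sound. LockRegions, ComputeLockRegions, and WriteCircular are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// The two regions a wrapped write is split into, mirroring the two
// pointer and size pairs a lock on a circular buffer hands back.
struct LockRegions
{
    std::size_t offset1;
    std::size_t bytes1;
    std::size_t offset2;
    std::size_t bytes2;
};

LockRegions ComputeLockRegions(std::size_t bufferBytes, std::size_t writeOffset, std::size_t writeBytes)
{
    assert(bufferBytes > 0 && writeOffset < bufferBytes && writeBytes <= bufferBytes);

    LockRegions regions = { writeOffset, 0, 0, 0 };

    // The first region runs from the write offset up to the end of the buffer.
    std::size_t untilEnd = bufferBytes - writeOffset;
    regions.bytes1 = std::min(writeBytes, untilEnd);

    // Whatever is left over wraps around to the start of the buffer.
    regions.bytes2 = writeBytes - regions.bytes1;
    return regions;
}

// Copy source bytes into the circular buffer, wrapping if needed.
void WriteCircular(std::vector<uint8_t>& buffer, std::size_t writeOffset, const uint8_t* source, std::size_t bytes)
{
    LockRegions regions = ComputeLockRegions(buffer.size(), writeOffset, bytes);

    std::memcpy(buffer.data() + regions.offset1, source, regions.bytes1);
    if (regions.bytes2 > 0)
    {
        std::memcpy(buffer.data() + regions.offset2, source + regions.bytes1, regions.bytes2);
    }
}
```

In real streaming code the two memcpy calls would target the two pointers returned by Lock, and both pairs would then be passed back to Unlock.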

ShutdownWaveFile just does a release of the secondary buffer.

void SoundClass::ShutdownWaveFile(IDirectSoundBuffer8** secondaryBuffer)
{
	// Release the secondary sound buffer.
	if(*secondaryBuffer)
	{
		(*secondaryBuffer)->Release();
		*secondaryBuffer = 0;
	}

	return;
}

The PlayWaveFile function will play the audio file stored in the secondary buffer. The moment you use the Play function it will automatically mix the audio into the primary buffer and start it playing if it wasn't already. Also note that we set the position to start playing at the beginning of the secondary sound buffer, otherwise it will continue from where it last stopped playing. And because the buffer was created with the DSBCAPS_CTRLVOLUME capability we can control its volume, so we set it to maximum here.

bool SoundClass::PlayWaveFile()
{
	HRESULT result;
 
 
	// Set position at the beginning of the sound buffer.
	result = m_secondaryBuffer1->SetCurrentPosition(0);
	if(FAILED(result))
	{
		return false;
	}
 
	// Set volume of the buffer to 100%.
	result = m_secondaryBuffer1->SetVolume(DSBVOLUME_MAX);
	if(FAILED(result))
	{
		return false;
	}
 
	// Play the contents of the secondary sound buffer.
	result = m_secondaryBuffer1->Play(0, 0, 0);
	if(FAILED(result))
	{
		return false;
	}
 
	return true;
}
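
SetVolume does not take a percentage. The value is an attenuation in hundredths of a decibel, where DSBVOLUME_MAX is 0 (no attenuation) and DSBVOLUME_MIN is -10000 (treated as silence). If you would rather think in terms of a linear gain between 0 and 1, a conversion along the following lines can help. This is a sketch: the two constants are copies of the dsound.h values so the example builds without the Windows headers, and LinearGainToDirectSoundVolume is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// Assumed copies of DSBVOLUME_MAX and DSBVOLUME_MIN from dsound.h,
// both expressed in hundredths of a decibel.
const long kVolumeMax = 0;
const long kVolumeMin = -10000;

// Convert a linear gain in the range [0, 1] to a SetVolume-style value.
long LinearGainToDirectSoundVolume(double gain)
{
    assert(gain >= 0.0 && gain <= 1.0);

    // Zero gain has no finite decibel value, so map it straight to the minimum.
    if (gain <= 0.0)
    {
        return kVolumeMin;
    }

    // Decibels are 20 * log10(gain), and the API wants hundredths of a decibel.
    long volume = std::lround(2000.0 * std::log10(gain));

    // Clamp very small gains to the lowest value the API accepts.
    if (volume < kVolumeMin)
    {
        volume = kVolumeMin;
    }

    return volume;
}
```

For example, a gain of 0.5 maps to roughly -602, which is an attenuation of about 6 dB, and the result can be passed directly to SetVolume on a buffer created with DSBCAPS_CTRLVOLUME.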

Systemclass.h

////////////////////////////////////////////////////////////////////////////////
// Filename: systemclass.h
////////////////////////////////////////////////////////////////////////////////
#ifndef _SYSTEMCLASS_H_
#define _SYSTEMCLASS_H_
 
 
///////////////////////////////
// PRE-PROCESSING DIRECTIVES //
///////////////////////////////
#define WIN32_LEAN_AND_MEAN
 
 
//////////////
// INCLUDES //
//////////////
#include <windows.h>
 
 
///////////////////////
// MY CLASS INCLUDES //
///////////////////////
#include "inputclass.h"
#include "graphicsclass.h"

Here we include the new SoundClass header file.

#include "soundclass.h"
 
 
////////////////////////////////////////////////////////////////////////////////
// Class name: SystemClass
////////////////////////////////////////////////////////////////////////////////
class SystemClass
{
public:
	SystemClass();
	SystemClass(const SystemClass&);
	~SystemClass();
 
	bool Initialize();
	void Shutdown();
	void Run();
 
	LRESULT CALLBACK MessageHandler(HWND, UINT, WPARAM, LPARAM);
 
private:
	void Frame();
	void InitializeWindows(int&, int&);
	void ShutdownWindows();
 
private:
	LPCWSTR m_applicationName;
	HINSTANCE m_hinstance;
	HWND m_hwnd;
 
	InputClass* m_Input;
	GraphicsClass* m_Graphics;

We create a new private variable for the SoundClass object.

	SoundClass* m_Sound;
};
 
 
/////////////////////////
// FUNCTION PROTOTYPES //
/////////////////////////
static LRESULT CALLBACK WndProc(HWND, UINT, WPARAM, LPARAM);
 
 
/////////////
// GLOBALS //
/////////////
static SystemClass* ApplicationHandle = 0;
 
 
#endif

Systemclass.cpp

I will just cover the functions that have changed since the previous tutorial.

////////////////////////////////////////////////////////////////////////////////
// Filename: systemclass.cpp
////////////////////////////////////////////////////////////////////////////////
#include "systemclass.h"
 
 
SystemClass::SystemClass()
{
	m_Input = 0;
	m_Graphics = 0;

Initialize the new SoundClass object to null in the class constructor.

	m_Sound = 0;
}
 
 
bool SystemClass::Initialize()
{
	int screenWidth, screenHeight;
	bool result;


	// Initialize the width and height of the screen to zero before sending the variables into the function.
	screenWidth = 0;
	screenHeight = 0;

	// Initialize the windows api.
	InitializeWindows(screenWidth, screenHeight);

	// Create the input object.  This object will be used to handle reading the keyboard input from the user.
	m_Input = new InputClass;
	if(!m_Input)
	{
		return false;
	}

	// Initialize the input object.
	result = m_Input->Initialize(m_hinstance, m_hwnd, screenWidth, screenHeight);
	if(!result)
	{
		MessageBox(m_hwnd, L"Could not initialize the input object.", L"Error", MB_OK);
		return false;
	}

	// Create the graphics object.  This object will handle rendering all the graphics for this application.
	m_Graphics = new GraphicsClass;
	if(!m_Graphics)
	{
		return false;
	}

	// Initialize the graphics object.
	result = m_Graphics->Initialize(screenWidth, screenHeight, m_hwnd);
	if(!result)
	{
		return false;
	}

Here is where we create the SoundClass object and then initialize it for use. Note that in this tutorial the initialization will also start the wave file playing.

	// Create the sound object.
	m_Sound = new SoundClass;
	if(!m_Sound)
	{
		return false;
	}
 
	// Initialize the sound object.
	result = m_Sound->Initialize(m_hwnd);
	if(!result)
	{
		MessageBox(m_hwnd, L"Could not initialize Direct Sound.", L"Error", MB_OK);
		return false;
	}
 
	return true;
}
 
 
void SystemClass::Shutdown()
{

In SystemClass::Shutdown we also shut down the SoundClass object and release it.

	// Release the sound object.
	if(m_Sound)
	{
		m_Sound->Shutdown();
		delete m_Sound;
		m_Sound = 0;
	}
 
	// Release the graphics object.
	if(m_Graphics)
	{
		m_Graphics->Shutdown();
		delete m_Graphics;
		m_Graphics = 0;
	}

	// Release the input object.
	if(m_Input)
	{
		m_Input->Shutdown();
		delete m_Input;
		m_Input = 0;
	}

	// Shutdown the window.
	ShutdownWindows();
	
	return;
}

Summary

The engine now supports the basics of Direct Sound. It currently just plays a single wave file once you start the program.

To Do Exercises

1. Recompile the program and ensure it plays the wave file in stereo sound. Press escape to close the window after.

2. Replace the sound01.wav file with your own 44.1 kHz, 16-bit, 2-channel audio wave file and run the program again.

3. Rewrite the program to load two wave files and play them simultaneously.

4. Change the wave to loop instead of playing just once.