Game Creation with XNA/Audio Sound/Creation


Creation

Notes: decibel, frequency, oscillators, DFT, FFT (dissecting a tone into sine waves), ADSR envelopes, MIDI, well temperament, overtones, timbre, pitch, amplitude, phase, 3D sound, ear anatomy, sound tutorials, free software, sequencer, noise and tones

Creating a sound is easy: almost anything we do creates sound. In musical contexts, sound is created by acoustic or electric instruments, or by analog or digital hardware. To use sounds in a game they must first be recorded and digitized, either during the recording process itself or afterwards. It is increasingly difficult to find places on earth that are free of man-made sound, so games that try to imitate reality should have sound in almost every sequence, even if only in the background. Filmmakers record background noise repeatedly over the course of a shoot to increase the authenticity of a film. There are several basic steps in working with sound: recording, manipulation (effects) and playback (reproduction). XNA Game Studio 4 added classes for handling MP3s and for capturing and playing back sound from a headset, so even a user's voice can be processed in the same way as a normal recording.

Recording

In general, sound is recorded in analog or digital form. Because of its low start-up cost and its easy, precise editing, digital recording is the more popular of the two.

A typical computer based recording studio setup

Digital audio recording is the act of recording a sound by taking discrete samples of its wave form and turning them into digital information that can be stored or processed. Digital recording is typically done on a computer, but can also be done with a stand-alone recorder with a hard drive, or a handheld device with flash memory.

A hand held digital recorder

The sampling rate is measured in hertz and is the number of times per second a sound is sampled. The bit depth, measured in bits, is how much information is stored each time a sample is taken. Higher bit depths offer a more accurate approximation of a wave form. A "CD quality" audio recording is 16 bits at a 44.1 kHz sampling rate. Generally, the highest quality digital recordings are 24 bit at 192 kHz. Historically, due to space limitations, games were limited to 8 bit recordings. These "classic" game sound effects and music are easily distinguished from their more modern counterparts.

It is comparatively easy to record digitally, for several reasons. Digital recording, in its most basic form, requires only a computer. With the use of plugins, a computer can generate most of the sounds a user might need. More elaborate setups might include an audio interface for recording live instruments or MIDI signals. Live sound (microphone or instrument input) and computer generated sound can be seamlessly mixed in audio software. Editing is nonlinear and simple.
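
The relationship between sampling rate, bit depth and data size can be illustrated with a small Python sketch. It is not part of XNA; the function names (sample_sine, quantize, bytes_per_second) are made up for this illustration, and the sketch assumes a plain sine wave as the sound being recorded.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Take discrete samples of a sine wave, as floats in [-1.0, 1.0]."""
    count = int(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

def quantize(samples, bit_depth):
    """Round each sample to the nearest signed integer level of the bit depth."""
    max_level = 2 ** (bit_depth - 1) - 1  # 127 for 8 bit, 32767 for 16 bit
    return [round(s * max_level) for s in samples]

def bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Uncompressed data rate of a recording."""
    return sample_rate_hz * bit_depth * channels // 8

samples = sample_sine(440.0, 44100, 0.01)  # 10 ms of a 440 Hz tone
pcm8 = quantize(samples, 8)                # coarse, "classic game" resolution
pcm16 = quantize(samples, 16)              # CD resolution

print(len(samples))                        # samples taken in 10 ms
print(bytes_per_second(44100, 16, 2))      # stereo, CD quality: 176400
```

The 8 bit version can only take 255 distinct values, so its rounding error is far larger than that of the 16 bit version, which is one reason old game audio sounds rougher.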

a mixture of slide guitar, bass guitar and software plugins

A user can cut, copy and paste pieces of a recording and arrange them as desired. These functions can also be performed across projects and platforms.

Compression: Until recently, MP3 was by far the most popular format for compressing an audio file. MP3s are satisfactory for a game if they are primary compressions (i.e. the first time a full quality audio file has been compressed) at a bit rate of 160 kbit/s or above. Any bit rate below that begins to sound "lossy." As of version 4, XNA Game Studio has WAV and MP3 importer classes, meaning a game's sound quality is basically up to the creator.
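
To get a feeling for how much a 160 kbit/s MP3 saves, the following Python sketch compares it with uncompressed "CD quality" stereo audio. The function names are invented for this example, and the three minute track length is only an assumption.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed audio in kbit/s."""
    return sample_rate_hz * bit_depth * channels / 1000

def compression_ratio(pcm_kbps, compressed_kbps):
    """How many times smaller the compressed stream is."""
    return pcm_kbps / compressed_kbps

def size_bytes(bit_rate_kbps, seconds):
    """File size of a constant bit rate stream, ignoring headers."""
    return int(bit_rate_kbps * 1000 * seconds / 8)

cd_kbps = pcm_bit_rate_kbps(44100, 16, 2)
print(cd_kbps)                                    # 1411.2
print(round(compression_ratio(cd_kbps, 160), 2))  # 8.82
print(size_bytes(cd_kbps, 180))                   # 3 minutes uncompressed
print(size_bytes(160, 180))                       # 3 minutes at 160 kbit/s
```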

Analog audio recording is the act of recording a sound wave in its entirety, as an electronic signal, typically onto magnetic tape. Before an analog recording can be put on CD or used in a game, it must be digitized. The signal can be recorded with less noise if this conversion is done during recording rather than as a separate step. This form of recording is typically ruled out by modern musicians because of the expense and the time it requires. The need for an engineer, mixing board, tape machine, tape reels and sound room all contribute to the cost. Editing is more laborious because it is linear. That is, an engineer cannot simply copy one good part of a recording to multiple parts of a song. Editing means physically cutting the tape, or rerecording part by part.

A condenser microphone

Microphones use a principle similar to that of the human eardrum to receive sound. Inside a microphone, a membrane or set of ribbons is displaced by a sound wave and triggers an electrical signal, also a wave form. That is, a microphone translates the sound wave (most often vibrating air) into an electrical wave form. There are two general kinds of microphone. Dynamic microphones are passive: they use a magnet and a coil to generate the electrical signal and need no external power. Condenser microphones need an external power source, called phantom power, to function. This is commonly 48 volts and is sent to the microphone through its cable from a mixer or microphone amplifier.

MIDI allows separate external synthesizers and other audio equipment to communicate with each other and was an essential part of any studio until USB began replacing its hardware in the early 2000s.

Acoustic instruments are the predecessors to electric instruments and need no amplification to be heard. They are recorded by using a microphone to pick up their sound.

Electric instruments (e.g. guitars and bass guitars) use the vibration of strings over magnetic coils to generate an electrical signal. To be heard, these signals must be amplified and sent through loudspeakers, which vibrate the air. When struck without amplification, the strings also make sound waves but they are not strong enough to be heard more than a few meters away from the instrument being played. The overtones and harmonics created by stringed instruments, especially by a piano, are extremely difficult to emulate using digital technology.

A rack mountable audio interface

An audio interface (AI), or sound card, converts the analog signals it receives into digital information a computer can process. These analog signals are usually generated by microphones, electric instruments or synthesizers. Sound signals generated by the computer itself are already digital and do not need to be sent through an AI in order to be processed. In order for the signals being processed by the computer (whether they started out analog or digital) to be heard, they need to be sent back out through an AI, which converts the digital signals back into analog signals, and then through loudspeakers or headphones.

Recording software, or a sequencer, processes the signals that are generated by a computer or converted using an AI, and can produce signals of its own using plugins. These plugins can also emulate analog effects or instruments. The sound options available to a game creator have increased along with the performance of recording software. Historically, creators were limited to very small sound files. Modern game consoles have more processing power and random access memory and can handle much larger, higher quality sound files. It is commonplace for bands to license songs to video game makers for game soundtracks.

Traditionally, sound effects were recorded in much the same way as music: in a studio, with someone performing the sound (e.g. breaking glass or footsteps) in front of a microphone. In recent years, with the availability of innumerable sound sample libraries, game makers, like filmmakers, use mostly prerecorded samples for sound effects. Sound effects are extremely important to a player's experience of a game, especially in realistic games where sounds are required to be as authentic as possible.

Reproduction

Sound reproduction uses much the same process as recording, but in reverse. A tape or record is played, or a digital file is read, and the result is converted back into sound waves. This is usually done with speakers or headphones. Accurate sound reproduction is vital to the experience of a game.

around-ear and in-ear headphones

Speakers and headphones are the rough equivalent of microphones but are used for sound output instead of sound input. The electrical signals being played back are sent through an amplifier, which strengthens the signal, and then through a cable to the speakers, where a magnet is used to set the speaker's membrane in motion. This membrane vibrates the air, sending sound waves into the space in front of and behind the membrane. Speakers are usually contained in some sort of housing, which needs to be tuned for accurate sound reproduction. Housings for headphone speakers come in three general types: over-ear, around-ear and in-ear. These types have two configurations. They can be open, which projects sound outward as well as into the ear, or closed, which blocks outside noise and keeps sound from escaping.

A typical "nearfield" studio monitor

Audio Effects

Audio effects are used to change existing sounds which are recorded or generated by software or by synthesizers and are usually user configurable. Traditionally they were encased in boxes, or pedals, that could be activated with the foot of a musician during a musical performance or in larger rack mountable formats for use in a recording studio. Software plugins are able to emulate most formerly hardware based effects.

A distortion pedal
  • Filter

The filter is a commonly used effect. Its function is to cut off frequencies above or below a defined frequency, known as the cutoff. The frequencies around the cutoff can be amplified, which is known as resonance. There are different types of filters and many different approaches to building them, each with individual characteristics. Here we distinguish them only by their cutoff types:

  1. Lowpass filter

Allows lower frequencies through to the output stage, cutting higher frequencies.

  2. Highpass filter

Allows higher frequencies through to the output stage, cutting lower frequencies.

  3. Bandpass filter
  4. Notch filter
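
The lowpass and highpass behaviour described above can be tried out with a very simple filter. The following Python sketch uses a first-order ("one-pole") design, which is an assumption made for brevity: it has a gentle slope and no resonance, unlike the filters found in most synthesizers. All function names are invented for this example.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """First-order lowpass: each output moves a fraction `a` toward the input."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, y = [], 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out

def one_pole_highpass(samples, cutoff_hz, sample_rate_hz):
    """Highpass built as the input minus its lowpassed version."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate_hz)
    return [x - l for x, l in zip(samples, low)]

def sine(freq_hz, sample_rate_hz, count):
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

def peak(samples):
    return max(abs(s) for s in samples)

rate = 44100
low_tone = sine(100, rate, rate)      # one second at 100 Hz
high_tone = sine(10000, rate, rate)   # one second at 10 kHz

# With a 1 kHz cutoff, the lowpass keeps the low tone and weakens the high one.
print(peak(one_pole_lowpass(low_tone, 1000, rate)))
print(peak(one_pole_lowpass(high_tone, 1000, rate)))
# The highpass does the opposite.
print(peak(one_pole_highpass(low_tone, 1000, rate)))
print(peak(one_pole_highpass(high_tone, 1000, rate)))
```
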
  • Equalizer

Boosts or cuts certain frequency bands in a signal.

  • Delay

Repeats an incoming signal at the output stage, making the output sound like an echo of the original input.
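
A minimal Python sketch of such an echo is shown here. It assumes a feedback delay built on a circular buffer; the parameter names (delay_samples, feedback, mix) are chosen for this illustration and do not come from any particular product.

```python
def delay(samples, delay_samples, feedback=0.5, mix=0.5):
    """Add decaying echoes of the input, spaced delay_samples apart."""
    buffer = [0.0] * delay_samples  # circular buffer holding the delayed signal
    out = []
    pos = 0
    for x in samples:
        delayed = buffer[pos]                  # what went in delay_samples ago
        out.append(x + mix * delayed)          # dry signal plus echo
        buffer[pos] = x + feedback * delayed   # feed part of the echo back in
        pos = (pos + 1) % delay_samples
    return out

# A single click followed by silence shows the echoes clearly.
impulse = [1.0] + [0.0] * 9
echoed = delay(impulse, delay_samples=3, feedback=0.5, mix=0.5)
print(echoed)
```

Each echo is half as loud as the previous one because of the feedback value of 0.5. A feedback of 0 would give exactly one repeat.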

  • Reverb
  • Flanger
  • Phaser
  • Chorus
  • Unison
  • Distortion

Manipulates or deforms an incoming signal.
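
One simple way to deform a signal is hard clipping: the signal is amplified and everything beyond a threshold is cut off. The Python sketch below assumes this particular kind of distortion; real distortion pedals and plugins use many other curves. The function name hard_clip is invented for this example.

```python
def hard_clip(samples, gain, threshold=1.0):
    """Amplify the signal, then flatten everything beyond +/- threshold."""
    return [max(-threshold, min(threshold, gain * s)) for s in samples]

quiet = [0.1, 0.5, -0.8]
print(hard_clip(quiet, gain=4.0))  # loud parts are flattened at the threshold
```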

  • Waveshaping

Synthesizer

Synthesizers use electronic circuits to generate electric signals. They can be analog, digital or a combination of both.

  • Subtractive synthesis

Most analog and digital synthesizers use subtractive synthesis. At the heart of these synthesizers are one or more oscillators that produce a frequency spectrum rich in overtones. These sounds can then be filtered by a low-pass, band-pass, high-pass or notch filter.

  • Additive synthesis

Instead of filtering overtones out, as subtractive synthesis does, additive synthesis adds overtones to the base note.
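
As a rough Python sketch of this idea, the function below sums sine waves at whole-number multiples of the base frequency. The function name and the choice of amplitudes (odd overtones at 1/k, which approaches a square wave) are assumptions made for this example.

```python
import math

def additive_tone(base_hz, harmonic_amplitudes, sample_rate_hz, count):
    """Sum sine waves at 1x, 2x, 3x, ... the base frequency."""
    out = []
    for n in range(count):
        t = n / sample_rate_hz
        value = 0.0
        for k, amp in enumerate(harmonic_amplitudes, start=1):
            value += amp * math.sin(2 * math.pi * k * base_hz * t)
        out.append(value)
    return out

# Odd overtones with amplitude 1/k approximate a square wave.
square_like = [1.0 / k if k % 2 == 1 else 0.0 for k in range(1, 20)]
tone = additive_tone(220.0, square_like, 44100, 441)

# A single entry gives a pure sine wave with no overtones.
pure = additive_tone(220.0, [1.0], 44100, 441)
```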

  • FM synthesis

FM synthesis, short for frequency modulation synthesis, is an approach that has its origin in telecommunications engineering. The main idea is to create overtones by varying a carrier wave's frequency with another, modulating wave. The carrier wave's frequency gets higher where the modulating wave's value is positive and lower where the modulating wave's value is negative.
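
The following Python sketch shows one way this could look in code. It assumes sine waves for both carrier and modulator, and the parameter deviation_hz (how far the carrier frequency swings) is a name chosen for this example.

```python
import math

def fm_tone(carrier_hz, modulator_hz, deviation_hz, sample_rate_hz, count):
    """Sine carrier whose frequency is pushed up and down by a sine modulator."""
    out = []
    phase = 0.0
    for n in range(count):
        out.append(math.sin(phase))
        modulator = math.sin(2 * math.pi * modulator_hz * n / sample_rate_hz)
        # Positive modulator values raise the frequency, negative ones lower it.
        instantaneous_hz = carrier_hz + deviation_hz * modulator
        phase += 2 * math.pi * instantaneous_hz / sample_rate_hz
    return out

tone = fm_tone(440.0, 110.0, 200.0, 44100, 441)
```

With deviation_hz set to 0 the result is a plain sine wave at the carrier frequency. Larger values add more overtones.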

  • PM synthesis

Phase modulation synthesis is very similar in its acoustic results to frequency modulation. Instead of manipulating the frequency of the carrier wave, its phase gets manipulated by a modulation wave.
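
In code, the difference from frequency modulation is that the modulating wave is added directly to the carrier's phase. The Python sketch below assumes sine waves for both; the parameter name index (the strength of the modulation, in radians) is chosen for this example.

```python
import math

def pm_tone(carrier_hz, modulator_hz, index, sample_rate_hz, count):
    """Sine carrier whose phase is shifted by a sine modulator."""
    out = []
    for n in range(count):
        t = n / sample_rate_hz
        modulator = math.sin(2 * math.pi * modulator_hz * t)
        out.append(math.sin(2 * math.pi * carrier_hz * t + index * modulator))
    return out

tone = pm_tone(440.0, 110.0, 2.0, 44100, 441)
```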

  • Wavetable synthesis

A wavetable is essentially a collection of samples. An oscillator picks a small window of these samples and repeats it. This window can be moved while it is playing.
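
A very reduced Python sketch of this window idea follows. The function names and the way the window is moved (a list of start positions, each repeated a fixed number of times) are assumptions for illustration; real wavetable oscillators also interpolate between samples and windows.

```python
import math

def loop_window(table, start, length, count):
    """Repeat table[start:start + length] until `count` samples are produced."""
    window = table[start:start + length]
    return [window[n % len(window)] for n in range(count)]

def scan_wavetable(table, window_length, starts, repeats):
    """Play each window `repeats` times, moving the window between positions."""
    out = []
    for start in starts:
        window = table[start:start + window_length]
        out.extend(window * repeats)
    return out

# A table made of a sine cycle followed by a cruder, square-like cycle.
table = [math.sin(2 * math.pi * n / 64) for n in range(64)]
table += [1.0 if n < 32 else -1.0 for n in range(64)]

steady = loop_window(table, 0, 64, 256)            # loops the sine cycle
moving = scan_wavetable(table, 64, [0, 32, 64], 4)  # timbre changes over time
```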

  • Granular synthesis

Granular synthesis is also based on an existing sample wave file, like wavetable synthesis, but this wave sample is cut into many small pieces, called grains, which are between 1 and 50 milliseconds long.
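
Cutting a sample into grains and playing them back in a new order can be sketched in Python as follows. The function names are invented for this example, and the sketch leaves out the fade-in and fade-out that are normally applied to each grain to avoid clicks.

```python
def slice_grains(samples, sample_rate_hz, grain_ms):
    """Cut a recording into consecutive grains of grain_ms milliseconds."""
    grain_len = max(1, int(sample_rate_hz * grain_ms / 1000))
    return [samples[i:i + grain_len]
            for i in range(0, len(samples), grain_len)]

def rearrange(grains, order):
    """Build a new sound by playing the grains in the given order."""
    out = []
    for index in order:
        out.extend(grains[index])
    return out

recording = [n / 44100 for n in range(44100)]  # stand-in for one second of audio
grains = slice_grains(recording, 44100, 20)    # 20 ms grains, 882 samples each
stutter = rearrange(grains, [0, 0, 0, 10, 10, 49])
```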

Mood in games (with examples)

  • Action game
Action games typically use only sound effects and simple background music. The music has a catchy melody, which means avoiding large leaps between notes and keeping the background music singable. To create an exciting mood, use a fast tempo. The key should be major, so that the melody sounds happy.
In addition, there should be a sound notification when the player scores a point, removes a line and so on.
E.g. Tetris
The melody of the background music is very catchy, simple and singable. There are no large leaps between notes.
Sound notifications:
- Removing a line:
This could be a space sound, something like http://www.flashkit.com/soundfx/Electronic/Other/Spacely_-Daniel_D-8815/index.php
- Turning a shape:
This could be a short sound, such as a little tick.


  • Shooter game


  • Adventure game


  • Role playing game


  • Strategy games


  • Simulation game


Links and sources

http://msdn.microsoft.com/en-us/library/bb417503.aspx