The History of Audio and Music in Video Games
Written by David Weaver
Since the earliest days of the medium, music has held an important place in video game audio. Both technical and creative constraints strongly shaped early video game audio, with the earliest game music generated by a small sound chip on board the console. Chiptunes, as they were called, contained simple melodies, with a common configuration being four channels: three synthesized waveforms and one noise channel for sound effects. The sound still features heavily in pop culture and throwback genres, and “8-bit” remains a popular style among music producers for certain sound textures.
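As a rough illustration of that four-channel layout, the sketch below mixes two pulse waves, a triangle wave, and a noise source into one stream of samples. It does not model any particular sound chip: the function names, the 8 kHz sample rate, and the note frequencies are all assumptions chosen for the example.

```python
import random

SAMPLE_RATE = 8000  # Hz; an arbitrary low rate chosen for illustration


def pulse(freq, t, duty=0.5):
    """Pulse wave: +1 for the first `duty` fraction of each cycle, otherwise -1."""
    phase = (freq * t) % 1.0
    return 1.0 if phase < duty else -1.0


def triangle(freq, t):
    """Triangle wave moving linearly between +1 and -1 over each cycle."""
    phase = (freq * t) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0


def mix_frame(t, rng):
    """Average three tone channels and one noise channel into a single sample."""
    channels = [
        pulse(440.0, t, duty=0.5),   # hypothetical lead melody
        pulse(660.0, t, duty=0.25),  # hypothetical harmony
        triangle(110.0, t),          # hypothetical bass line
        rng.uniform(-1.0, 1.0),      # noise channel for percussion and effects
    ]
    return sum(channels) / len(channels)


def render(duration_s, seed=0):
    """Render `duration_s` seconds of the four-channel mix as a list of floats."""
    rng = random.Random(seed)
    n = int(duration_s * SAMPLE_RATE)
    return [mix_frame(i / SAMPLE_RATE, rng) for i in range(n)]
```

Averaging the four channels keeps every output sample within the range -1 to 1, which is the simplest way to avoid clipping in a sketch like this.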
Music was usually monophonic, and either looped or used only between stages or at the start of a new game. There simply was not enough memory to use it much more than that. The first game to use a continuous background soundtrack was Space Invaders, released in 1978. So popular that arcades themselves came to be nicknamed “Spacies”, it featured a four-note descending chromatic passacaglia repeating in a loop. The loop was dynamic and interacted with the player, increasing in pace as the enemies descended. From these first steps in video games, it was obvious that interactive music could increase the level of enjoyment for the player.
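The idea of a loop that speeds up with the action can be sketched in a few lines. This is not how the original arcade hardware worked; the `progress` value, the interval limits, and the function names are hypothetical, chosen only to show a four-note loop whose spacing shrinks as the enemies advance.

```python
def loop_step_interval(progress, slowest_s=1.0, fastest_s=0.1):
    """Seconds between notes.

    `progress` is an assumed measure of how far the enemies have advanced:
    0.0 at the start of a wave, 1.0 when they reach the player.
    """
    progress = max(0.0, min(1.0, progress))  # clamp to the valid range
    return slowest_s - (slowest_s - fastest_s) * progress


def note_times(progress_values):
    """Return (start_time, note_index) pairs for a repeating four-note loop."""
    t = 0.0
    events = []
    for i, progress in enumerate(progress_values):
        events.append((t, i % 4))  # cycle through notes 0, 1, 2, 3
        t += loop_step_interval(progress)
    return events
```

Calling `note_times([0.0, 0.5, 1.0, 1.0, 1.0])` yields five note events whose gaps shrink as the progress value rises.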
From there, music developed rapidly, producing fan favourites that would permeate both Japanese and Western culture. The classic 1981 arcade game Frogger took one of the earliest dynamic approaches to video game music, with more than eleven different gameplay tracks in addition to level-starting and game-over themes. The Commodore 64 home computer, released in 1982, was capable of early forms of filtering effects, meaning that even greater levels of dynamic audio could be created. At this stage, the filtering amounted to little more than an “underwater” effect that was either on or off, but it was still an incredibly versatile system compared with what had come before. With all this processing power, it seemed only a matter of time until different types of waveforms, and eventually the ability to play 4-bit samples on a fourth sound channel, became available. That channel carried many of the noisy sound effects in early games that found huge audiences.
As home consoles moved into the 16-bit era, the hybrid approach to composing (sampled and tone-generated) gained prominence. The Sega Genesis offered more advanced graphics than the NES and improved sound synthesis features, but largely kept the same approach to sound design. Ten channels in total were available for tone generation in stereo, one of which could play PCM samples, compared with the NES’s five channels in mono, one of them for PCM. As before, the PCM channel was often used for percussion samples and noisy sound effects.
Even the Game Boy, in its initial iteration, managed to have two pulse wave generators, one 4-bit PCM wave sample channel, one noise generator, and one audio input from the cartridge. The Game Boy had only one speaker, but the headphone port did output stereo sound.
Through the Mega CD/Sega CD, gamers gained a preview of the direction video game music would take with streamed audio, and the use of both sampled and sequenced music continues in game consoles even today. The huge data storage benefit of optical media would be coupled with progressively more powerful audio generation hardware and higher-quality samples in the fifth generation. In 1994, the CD-ROM-equipped PlayStation supported 24 channels of 16-bit samples at sample rates of up to 44.1 kHz, equal to CD audio in quality. It also sported a few hardware DSP effects such as reverb. Even so, many Square titles continued to use sequenced music, such as Final Fantasy VII, Legend of Mana, and Final Fantasy Tactics.
In 1996, the Nintendo 64, still using a solid-state cartridge, supported an integrated and scalable sound system that was potentially capable of 100 channels of PCM and an improved sample rate of 48 kHz. Because of the cost of solid-state memory, however, games for the N64 typically had samples of lesser quality than those on its CD-based competitors, and music tended to be simpler in construction. Much game audio of the period was still played through tiny speakers in arcades, alongside a hundred other loud machines, yet strong audio content remained highly sought after in game development. After all, the theme tunes and background music of some of the most well-known video games of all time were becoming familiar to players.
Concerts and renditions of popular themes have become a huge part of the musical repertoire, as anyone who has ever seen the Super Mario theme played on PVC pipes can attest. Symphony orchestras have performed “The Music of Final Fantasy” and “The Music of Zelda” to sold-out crowds in concert halls all over the world.
Using entirely pre-recorded music had many advantages over sequencing in terms of sound quality. Music could be produced freely with any kind and number of instruments, allowing developers to simply record one track to be played back during the game. Quality was limited only by the effort put into mastering the track itself. The memory costs that were previously a concern were somewhat addressed as optical media became the dominant medium for games. CD-quality audio allowed for music and voice that had the potential to be truly indistinguishable from any other source or genre of music.
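To see why storage had been such a concern, it helps to work out how much space uncompressed CD-quality audio takes. The short sketch below does the arithmetic for 16-bit samples at 44.1 kHz; the stereo channel count and the three-minute track length are assumptions made for the example, and the function names are illustrative.

```python
def pcm_bytes_per_second(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def pcm_track_size_bytes(duration_s, **fmt):
    """Size in bytes of an uncompressed PCM track lasting `duration_s` seconds."""
    return duration_s * pcm_bytes_per_second(**fmt)


# Hypothetical three-minute stereo track at CD quality.
track_bytes = pcm_track_size_bytes(180)
print(f"{track_bytes / (1024 * 1024):.1f} MiB for a three-minute track")
```

A single track of that length runs to tens of megabytes, far more than cartridge-era memory budgets allowed but a small fraction of an optical disc.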
As processing power increased dramatically in the sixth generation of home consoles, it became possible to apply special effects in real time to streamed audio. In a snowboarding game, for example, if a snowboarder takes to the air after jumping from a ramp, the music softens or muffles a bit, and the ambient noise of wind and rushing air becomes louder to emphasize being airborne. When the snowboarder lands, the music resumes regular playback until its next “cue”. The music in action games such as these changes dynamically to match the amount of danger. Stealth-based games sometimes rely on such music as well, either by handling streams differently or by dynamically changing the composition of a sequenced soundtrack.
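The airborne effect described above can be approximated with a low-pass filter on the music plus a change in mix levels. The sketch below uses a basic one-pole low-pass filter; it is one simple way to get a muffled sound, not the method any particular game used, and the cutoff frequency, gain values, and function names are all assumptions.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate=48_000):
    """Muffle a signal by attenuating content above roughly `cutoff_hz`."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)  # move the output a fraction of the way to the input
        out.append(y)
    return out


def mix_for_state(music, wind, airborne, sample_rate=48_000):
    """Blend music and wind ambience according to whether the rider is airborne."""
    if airborne:
        # Hypothetical settings: muffle the music and bring the wind forward.
        music = one_pole_lowpass(music, cutoff_hz=800.0, sample_rate=sample_rate)
        music_gain, wind_gain = 0.6, 0.8
    else:
        # Hypothetical settings: full music with a little background wind.
        music_gain, wind_gain = 1.0, 0.2
    return [music_gain * m + wind_gain * w for m, w in zip(music, wind)]
```

A real implementation would keep the filter state between audio buffers and fade the gains over a few milliseconds, so that take-off and landing do not produce audible clicks.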
The Xbox 360 supported 16-bit sampling and playback at 48 kHz, streaming, and up to 256 simultaneous audio channels. The PlayStation 4 handles a huge number of audio streams and supports hardware sample rates of up to 192 kHz, though most audio is still delivered at around 44.1 kHz or 48 kHz. The later generations are all capable of running multiple instances of heavy convolution reverb and delivering incredibly detailed sound experiences, with little standing in the way other than an audio programmer’s experience and creativity.