>>> 2024-01-21 multi-channel audio part 1 (PDF)
Stereophonic or two-channel audio is so ubiquitous today that we tend to refer to all kinds of pieces of consumer audio reproduction equipment as "a stereo." As you might imagine, this is a relatively modern phenomenon. While stereo audio in concept dates to the late 19th century, it wasn't common in consumer settings until the 1960s and 1970s. Those were very busy decades in the music industry, and radio stations, records, and film soundtracks all came to be distributed primarily in stereo.
Given the success of stereo, though, one wonders why larger numbers of channels have met more limited success. There are, as usual, a number of factors. For one, two-channel audio was thought to be "enough" by some, considering that humans have two ears. Now it doesn't quite work this way in practice, and we are more sensitive to the direction from which sound comes than our binaural system would suggest. Still, there are probably diminishing returns, with stereo producing the most notable improvement in listening experience over mono.
There are also, though, technical limitations at play. The dominant form of recorded music during the transition to stereo was the vinyl record. There is a fairly straightforward way to record stereo on a record, by using a cartridge with coils on two opposing axes. This is the limit, though: you cannot add additional channels as you have run out of dimensions in the needle's free movement.
This was probably the main cause of the failure of quadraphonic sound, the first music industry attempt at pushing more channels. Introduced almost immediately after stereo in the 1970s, quadraphonic or four-channel sound seemed like the next logical step. It couldn't really be encoded on records, so a matrix encoding system was used in which the front-rear difference was encoded as phase shift in the left and right channels. In practice this system worked poorly, and especially early quadraphonic systems could sound noticeably worse than the stereo version. Wendy Carlos, an advocate of quadraphonic sound but harsh critic of musical electronics, complained bitterly about the inferiority of so-called quadraphonic records when compared to true four-channel recordings, for example on tape.
Of course, four-channel tape players were vastly more expensive than record players in the 1970s, as they ironically remain today. Quadraphonic sound was in a bind: it was either too expensive or too poor of quality to appeal to consumers. Quadraphonic radio using the same matrix encoding, while investigated by some broadcasters, had its own set of problems and never saw permanent deployment. Alan Parsons famously produced Pink Floyd's "Dark Side of the Moon" in quadraphonic sound; the effort was a failure in several ways but most memorably because, by the time of the album's release in 1973, the quadraphonic experiment was essentially over.
Sound with three or more channels would have its comeback just a few years later, though, through the efforts of a different industry. Understanding this requires backtracking a bit to consider the history of cinema prints.
Many are probably at least peripherally aware of Cinerama, an eccentric-seeming film format that used three separate cameras, and three separate projectors, to produce an exceptionally widescreen image. Cinerama's excess was not limited to the picture: it involved not only the three 35mm film reels for the three screen panels, but also a fourth 35mm film that was entirely coated with a magnetic substrate and was used to store seven channels of audio. Five channels were placed behind the screen, effectively becoming center, left, right, left side, and right side. The final two tracks were played back behind the audience, as the surround left and surround right.
Cinerama debuted in 1952, decades before 35mm films would typically carry even stereo audio. Like quadraphonic sound later, Cinerama was not particularly successful. By the time stereo records were common, Cinerama had been replaced by wider film formats and anamorphic formats in which the image was horizontally compressed by the lens of the camera, and expanded by the lens of the projector. Late Cinerama films like 2001: A Space Odyssey were actually filmed in Super Panavision 70 and projected onto Cinerama screens from a single projector with a specialized lens.
There's a reason people talk so much about Cinerama, though. While it was not a commercial success, it was influential on the film industry to come. Widescreen formats, mostly anamorphic, would become increasingly common in the following decades. It would take years longer, but so would seven-channel theatrical sound.
"Surround sound," as these multi-channel formats came to be known in the late '50s, would come and go in theatrical presentations throughout the mid-century even as the vast majority of films were presented monaurally, with only a single channel. Most of these relied on either a second 35mm reel for audio only, or the greater area for magnetic audio tracks allowed by 70mm film. Both of these options were substantially more expensive for the presenting theater than mono, limiting surround sound mostly to high-end theaters and premiers. For surround sound to become common, it had to become cheap.
1971's A Clockwork Orange (I will try not to fawn over Stanley Kubrick too much but you are learning something about my film preferences here) employed a modest bit of audio technology, something that was becoming well established in the music industry but was new to film. The magnetic recordings used during the production process employed Dolby Type A noise reduction, similar to what became popular on compact cassette tapes, for a slight improvement in audio quality. The film was still mostly screened in magnetic mono, but it was the beginning of a profitable relationship between Dolby Labs and the film industry. Over the following years a number of films were released with Dolby Type A noise reduction on the actual distribution print, and some theaters purchased decoders to use with these prints. Dolby had bigger ambitions, though.
Around the same time, Kodak had been experimenting with the addition of stereo audio to 35mm release prints, using two optical tracks. They applied Dolby noise reduction to these experimental prints, and brought Dolby in to consult. This presented the perfect opportunity to implement an idea Dolby had been considering. Remember the matrix encoded quadraphonic recording that had been a failure for records? Dolby licensed a later-generation matrix decoder design from Sansui, and applied it to Kodak's stereo film soundtracks, allowing separation into four channels. While the music industry had placed the four channels at the four corners of the soundstage, the film industry had different tastes, driven mostly by the need to place dialog squarely in the center of the field. Dolby's variant of quadraphonic audio was used to present left, right, center, and a "surround" or side channel. This audio format went through several iterations, including much improved matrix decoding, and along the way picked up a name that is still familiar today: Dolby Stereo.
That Dolby Stereo is, in fact, a quadraphonic format reflects a general atmosphere of terminological confusion in the surround sound industry. Keep this in mind.
One of Dolby Stereo's most important properties was its backwards compatibility. The two optical tracks could be played back on a two-channel (or actually stereo) system and still sound alright. They could even be placed on the print alongside the older magnetic mono audio, providing compatibility with mono theaters. This compatibility with fewer channels became one of the most important traits in surround sound systems, and somewhat incidentally served to bring them to the consumer. Since the Dolby Stereo soundtrack played fine on a two-channel system, home releases of films on formats like VHS and Laserdisc often included the original Dolby Stereo audio from the print. A small industry formed around these home releases, licensing the Dolby technology to sell consumer decoders that could recover surround sound from home video.
For cost reasons these decoders were inferior to Dolby's own in several ways, and to avoid the hazard of damage to the Dolby Stereo brand, Dolby introduced a new marketing name for consumer Dolby Stereo decoders: Dolby Surround.
By the 1980s, Dolby Stereo, or Dolby Surround, had become the most common audio format on theatrical presentations and their home video releases. Even some television programs and direct-to-video material were recorded in Dolby Surround. Consumer stereo receivers, in the variant that came to be known as the home theater receiver, often incorporated Dolby Surround decoders. Improvements in consumer electronics brought the cost of proper Dolby Stereo decoders down, and so the home systems came to resemble the theatrical systems as well. Seeking a new brand to unify the whole mess of Dolby Stereo and Dolby Surround (which, confusingly, were often 4 and 3 channel, respectively), Dolby seems to have turned to the "Advanced Logic" and "Full Logic" terms once used by manufacturers of quadraphonic decoders. Dolby's theatrical sound solution came to be known as Dolby Pro Logic. A Dolby Pro Logic decoder processed two audio channels to produce a four-channel output. According to a modern naming convention, Dolby Pro Logic is a 4.0 system: four full-bandwidth channels.
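The two-channels-in, four-channels-out idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Dolby's actual design: the function names and the -3 dB gain constant are assumptions made for the example, the surround phase shift used by real encoders is left out, and the decoder is a plain "passive" sum-and-difference matrix with none of the active steering that a Pro Logic decoder adds.

```python
import math

# Assumed -3 dB gain (1/sqrt(2)) on the center and surround feeds.
G = 1.0 / math.sqrt(2.0)


def matrix_encode(left, right, center, surround):
    """Fold four channels (equal-length lists of samples) into two.

    Center goes equally and in phase to both totals; surround goes
    equally but with opposite sign, so it cancels in a mono sum.
    """
    lt = [l + G * c + G * s for l, c, s in zip(left, center, surround)]
    rt = [r + G * c - G * s for r, c, s in zip(right, center, surround)]
    return lt, rt


def passive_decode(lt, rt):
    """Recover four channels using only sums and differences."""
    left = list(lt)    # left and right are passed straight through
    right = list(rt)
    center = [G * (a + b) for a, b in zip(lt, rt)]    # in-phase content
    surround = [G * (a - b) for a, b in zip(lt, rt)]  # out-of-phase content
    return left, right, center, surround


# A center-only signal (think dialog) comes back at full level in the
# decoded center and not at all in the surround, but it also leaks into
# the decoded left and right at -3 dB.
lt, rt = matrix_encode([0.0], [0.0], [1.0], [0.0])
left, right, center, surround = passive_decode(lt, rt)
```

Playing `lt` and `rt` directly on a two-channel system is the backwards-compatible case: nothing needs decoding, and the center and surround material is simply spread across both speakers. The leakage into left and right in the decoded output is one reason practical decoders went beyond a passive matrix.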
This entire thing, so far, has been a preamble to the topic I actually meant to discuss. It's an interesting preamble, though! I just want to apologize that I didn't mean to write a history of multi-channel audio distribution and so this one isn't especially complete. I left out a number of interesting attempts at multi-channel formats, of which the film industry produced a surprising number, and instead focused on the ones that were influential and/or used for Kubrick films.
Dolby Pro Logic, despite its impressive name, was still an analog format, based on an early '70s technique. Later developments would see an increase in the number of channels, and the transition to digital audio formats.
Recall that 70mm film provided six magnetic audio channels, which were often used in an approximation of the seven-channel Cinerama format. Dolby experimented with this six-channel format as well, confusingly also under the Dolby Stereo product name. During the '70s, Dolby observed that the ability of humans to differentiate the source of a sound is significantly reduced as the sound becomes lower in frequency. This had obvious potential for surround sound systems, enabling something analogous to chroma subsampling in video. The lower-frequency component of surround sound does not need to be directional, and for a sense of directionality the high frequencies are most important.
Besides, bassheads were coming to the film industry. The long-used Academy response curve fell out of fashion during the '70s, in part due to Dolby's work, in part due to generally improved loudspeaker technology, and in part due to the increasing popularity of bass-heavy action films. Several 70mm releases used one or more of the audio channels as dedicated bass channels.
For the 1979 film Apocalypse Now in its 70mm print, Dolby premiered a 5.1 format in which three full-bandwidth channels were used for center, left, and right, two channels with high-pass filtering were used for surround left and surround right, and one channel with low-pass filtering was used for bass. Apocalypse Now was not, in fact, the first film to use this channel configuration, but Dolby promoted it far more than the studios had.
Interestingly, while I know less about live production history, the famous cabaret Moulin Rouge apparently used a 5.1 configuration during the 1980s. Moulin Rouge was prominent enough to give the 5.1 format a boost in popularity, perhaps particularly important because of the film industry's indecision on audio formats.
The seven-channel concept of the original Cinerama must have hung around in the film industry, as there was continuing interest in a seven-channel surround configuration. At the same time, the music industry widely adopted eight-channel tape recorders for studio use, making eight-channel audio equipment readily available. The extension to 7.1 surround, adding left and right side channels to the 5.1 configuration, was perhaps obvious. Indeed, what I find strangest about 7.1 is just how late it was introduced to film. Would you believe that the first film released (not merely remastered or mixed for Blu-Ray) in 7.1 was 2010's Toy Story 3?
7.1 home theater systems were already fairly common by then, a notable example of a modern trend afflicting the film industry: the large installed base and cost avoidance of the theater industry means that consumer home theater equipment now evolves more quickly than theatrical systems. Indeed, while 7.1 became the gold standard in home theater audio during the 2000s, 5.1 remains the dominant format in theatrical sound systems today.
Systems with more than eight channels are now in use, but haven't caught on in the consumer setting. We'll talk about those later. For most purposes, eight-channel 7.1 surround sound is the most complex you will encounter in home media. The audio may take a rather circuitous route to its 7.1 representation, but, well, we'll get to that.
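The "N.M" names used here are just counts: the number before the dot is the full-bandwidth channels and the number after it is the band-limited bass channels. A small sketch makes the arithmetic explicit. The function name is made up for this example, and it only handles the two-number form discussed in this article.

```python
def channel_count(layout):
    """Split a layout name like "5.1" into (full_bandwidth, bass, total)."""
    parts = layout.split(".")
    full_bandwidth = int(parts[0])
    # A name with no second number, or with ".0", has no bass channel.
    bass = int(parts[1]) if len(parts) > 1 else 0
    return full_bandwidth, bass, full_bandwidth + bass


for name in ("4.0", "5.1", "7.1"):
    print(name, channel_count(name))
```

So 5.1 is six channels in total and 7.1 is eight, which is why 7.1 lines up so neatly with eight-channel studio equipment.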
Let's shift focus, though, and talk a bit about the actual encodings. Audio systems up to 7.1 can be implemented using analog recording, but numerous analog channels impose practical constraints. For one, they are physically large, making it infeasible to put even analog 5.1 onto 35mm prints. Prestige multi-channel audio formats like that of IMAX often avoided this problem by putting the audio onto an entirely separate film reel (much like Cinerama back at the beginning), synchronized with the image using a pulse track and special equipment. This worked well but drove up costs considerably. Dolby Stereo demonstrated that it was possible to matrix four channels into two channels (with limitations), but considering the practical bandwidth of the magnetic or optical audio tracks on film you couldn't push this technique much further.
Remember that the theatrical audio situation changed radically during the 1970s, going from almost universal mono audio to four channels as routine and six channels for premieres and 70mm. During the same decade, the music reproduction industry, especially in Japan, was exploring another major advancement: digital audio encoding.
The Compact Disc standard was finalized in 1980, and the first players went on sale in 1982. Numerous factors contributed to the rapid success of CDs over vinyl and, to a lesser but still great extent, the compact cassette. One of them was the quality of the audio reproduction. CDs were a night and day change: records could produce an excellent result but almost always suffered from dirt and damage. Cassette tapes were better than most of us remember but still had limited bandwidth and a high noise floor, requiring Dolby noise reduction for good results. The CD, though, provided lossless digital audio.
Audio is encoded on an audio CD in PCM format. PCM, or pulse code modulation, is a somewhat confusing term that originated in the telephone industry. If we were to reinvent it today, we would probably just call it digital modulation. To encode a CD, audio is sampled (at 44.1 kHz for historic reasons) and quantized to 16 bits. A CD carries two channels, stereo, which was by then the universal format for music. Put together, those add up to about 1.4 Mbps. This was a very challenging data rate in 1980, and indeed, practical CD players relied on the fact that the data did not need to be read perfectly (error correcting codes were used) and did not need to be stored (going directly to a digital to analog converter). These were conveniently common traits of audio reproduction systems, and the CD demonstrated that digital audio was far more practical than the computing technology of the time would suggest.
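The data rate is simple multiplication, and working it out also shows how quickly more channels would outgrow it. A minimal sketch follows; the function name is made up for this example, and it counts only raw audio bits, ignoring the error correction and framing overhead a real disc carries.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


cd_stereo = pcm_bitrate(44_100, 16, 2)
six_channel = pcm_bitrate(44_100, 16, 6)
eight_channel = pcm_bitrate(44_100, 16, 8)

print(cd_stereo)      # 1411200, about 1.4 Mbps
print(six_channel)    # 4233600, three times the CD rate
print(eight_channel)  # 5644800, four times the CD rate
```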
The future of theatrical sound would be digital. Indeed, many films would be distributed with their soundtracks on CD.
There remained a problem, though: a CD could encode two channels. Even four channels wouldn't fit within the data rate CD equipment was capable of, much less six or eight. The film industry would need formats that could encode six or eight channels of audio into either the bandwidth of a two-channel signal or into precious unused space on 35mm film prints.
Many ingenious solutions were developed. A typical 35mm film print today contains three distinct representations of the audio: a two-channel analog optical track between the frame and the sprocket holes (which could encode Dolby Stereo), a continuous 2D barcode along the outer edges of the film, outside the sprocket holes, which carries the SDDS (Sony Dynamic Digital Sound) digital signal, and individual 2D barcodes between the sprocket holes which encode the Dolby Digital signal. Finally, a small pulse pattern between the optical track and the frame provides a time code used for synchronization with audio played back from a CD, the DTS system.
But then, a typical 35mm film print today wouldn't exist, as 35mm film distribution has all but disappeared. Almost all modern film is played back entirely digitally from some sort of flexible stream container. You would think, then, that the struggles of encoding multi-channel audio are over. Many media container formats can, after all, contain an arbitrary number of audio channels.
Nothing is ever so simple. Much like a dedicated audio reel adds cost, multiple audio channels inflate file sizes, media cost, and in the era of playback from optical media, could stress the practical read rate. Besides, constraints of the past have a way of sticking around. Every multichannel audio format to find widespread success in the film industry has done so by maintaining backwards compatibility with simple mono and stereo equipment. That continues to be true today: modern multi-channel digital audio formats are still mostly built as extensions of an existing stereo encoding, not as truly new arbitrary-channel formats.
At the same time, the theatrical sound industry has begun a transition away from channel-centric audio formats and towards a more flexible system that is much further removed from the actual playback equipment.
Another trend has emerged since 1980 as well, which you probably already suspected from the multiple formats included in 35mm prints. Dolby's supremacy in multi-channel audio was never as complete as I made it sound, although they did become (and for some time remained) the most popular surround sound solution. They have always had competition, and that's still true today. Just as 35mm prints came with the audio in multiple formats, current digitally distributed films often do as well.
In Part 2, I'll get to the topic I meant to write about today before I got distracted by history: the landscape of audio formats included in digitally distributed films and common video files today, and some of the ways they interact remarkably poorly with computers. We're going to talk about:
- Dolby Digital/AC-3/AC-4
- Dolby Atmos
- MPEG Surround/MPEG-H 3D
- HDMI (ugh)
- And more!
Postscript: Film dweebs will of course wonder where George Lucas is in this story. His work on the Star Wars trilogy led to the creation of THX, a company that will long be remembered for its distinctive audio identity. The odd thing is that THX was never exactly a technology company, although it was closely involved in sound technology developments of the time. THX was essentially a certification agency: THX theaters installed equipment by others (Altec Lansing, for much of the 20th century), and used any of the popular multi-channel audio formats.
To be a THX-certified theater, certain performance requirements had to be met, regardless of the equipment and format in use. THX certification requirements included architectural design standards for theaters, performance specifications for audio equipment, and a specific crossover configuration designed by Lucasfilm.
In 2002, Lucasfilm spun out THX and it essentially became a rental brand, eventually shuffled into the ownership of gamer headphone manufacturer Razer, where it remains today. THX certification still pops up in some consumer home theater equipment but is no longer part of the theatrical audio industry.
Incidentally, Kubrick did not adapt to Dolby Stereo. Despite his early experience with Dolby noise reduction, all of his films would be released in mono except for 2001 (six-channel audio only in the Cinerama release) and Eyes Wide Shut (edited in Dolby Stereo after Kubrick's death).