Originally written by Fred Ginsburg for publication in Student Filmmakers Magazine
Everyone who has ever taken a film class has heard about the importance of recording a few seconds of Room Tone on the set. However, unless you are an editor, you may not fully understand just what good Room Tone is, why we need it, how it differs from background tracks, and how to manipulate it to maximize the power of your noise-cancelling software.
Let’s start with what Room Tone is, and is not. Simply defined, Room Tone (or RT as editors often abbreviate it in track naming) is the sound on the set that is present within the dialogue scene during those moments when the actors are not speaking. It is the actual background noise or ambiance that is recorded behind the dialogue. It is often very subtle or faint, but includes the sounds found at the location or on the set. Examples include ventilation, lighting hum, camera fans, human presence (cast & crew), appliances, traffic rumble, and so on.
So why do we need it? Because sometimes we need to fill in gaps that occur when we have to make edits in the soundtrack. Maybe a distracting noise goes off while we are filming, and the editor snips it out. Or mutes a cough, sneeze, or bad clothing noise. Or the Director snaps their fingers as a cue, or shouts a word or two of “motivation”.
These edits can leave holes in the soundtrack that leap out at us as sudden, unexpected silence and void. Editors need to plug these gaps with what we would have heard on the soundtrack if the actors had paused speaking for a moment.
The key to recording usable room tone is to provide the editor with material that will readily match the surrounding dialogue scene. That means that the room tone should be recorded no louder and no quieter than what the mics would have heard during the actual take. The room tone should be recorded with the same dominant microphone (usually the boom, but sometimes the key talent’s lavalier), at the same pickup angle, same height, and roughly the same location in the set.
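Once the files are on a computer, one rough way to check that match is to compare the average (RMS) level of the room tone against a pause inside the take. The sketch below is only an illustration. The function names `rms_dbfs` and `levels_match` and the 1.5 dB tolerance are assumptions made for this example, not an industry standard, and the samples are assumed to be floats between -1.0 and 1.0.

```python
import math

def rms_dbfs(samples):
    """Average (RMS) level of a clip in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(rms)

def levels_match(room_tone, take_pause, tolerance_db=1.5):
    """True if the two clips sit within tolerance_db of each other.

    tolerance_db is a hypothetical figure chosen for this sketch.
    """
    a = rms_dbfs(room_tone)
    b = rms_dbfs(take_pause)
    if a == b:  # also covers two clips of digital silence
        return True
    return abs(a - b) <= tolerance_db

# Illustrative data: a faint test tone standing in for real room tone.
tone = [0.1 * math.sin(2 * math.pi * k / 100) for k in range(1000)]
slightly_hotter = [1.1 * s for s in tone]   # under 1 dB louder
twice_as_loud = [2.0 * s for s in tone]     # about 6 dB louder
```

Here `levels_match(tone, slightly_hotter)` passes while `levels_match(tone, twice_as_loud)` fails. A level check says nothing about tonal character, so it complements listening and does not replace it.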
Good room tone often includes bad sounds. There may be numerous unwanted or minimally distracting ambient sounds present on the shooting set, but these are precisely the sounds that will also be present during the take.
Imagine that you are repairing a small hole in the wall of your office. Step one, get some spackling putty and fill in the hole. Spread it smoothly around, so that the surface of the wall is even. Let it dry, step back, and take a look. What you will see is a splotch of white.
So find an old paint can in the closet, note the brand and color name, and then go down to the paint store and order up a small batch. Return to your spackled wall, and paint over the patched area with the correct color of paint. Let it dry, step back, and take a look.
It is still not a match, and your painted patch remains very obvious to the eye. Why? It is the same exact paint color, but it is CLEAN. It has not accumulated the aging from airborne contaminants. It lacks DIRT.
Lucky for you, your friend is a set decorator, and knows how to make your clean patch blend in with the rest of the wall by rubbing some DIRT onto the patch and feathering it in gradually with the surrounding wall.
Room Tone is kind of like that. CLEAN SILENCE sticks out. But imperfect silence can be blended in.
Keep in mind that CLEAN SILENCE does not only occur when you snip out entire frames from your soundtrack, but also becomes a factor when you replace live dialogue with ADR (looped or replacement dialogue), ISO tracks (such as a “clean” lavalier devoid of everything other than “close-up” dialogue), or a mash-up of varying angles & takes.
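The patch-and-feather idea can be sketched in a few lines. This is a simplified illustration, not how any particular editing program does it. The function name `patch_gap` and its parameters are hypothetical, and audio is treated as a plain list of float samples. The bad region is replaced outright with room tone, and a short linear crossfade on each side feathers the patch into the surrounding track.

```python
def patch_gap(track, room_tone, start, end, fade=256):
    """Replace track[start:end] with room tone, crossfading at both edges.

    fade is the crossfade length in samples on each side of the patch.
    Room tone is looped if it is shorter than the patched region.
    """
    if not 0 <= start < end <= len(track):
        raise ValueError("start/end must describe a region inside the track")
    if not room_tone:
        raise ValueError("room tone clip is empty")

    lo = max(0, start - fade)          # where the fade-in begins
    hi = min(len(track), end + fade)   # where the fade-out finishes
    out = list(track)

    for i in range(lo, hi):
        rt = room_tone[(i - lo) % len(room_tone)]
        if i < start:
            # Feather in: room tone weight rises toward the patch.
            w = (i - lo + 1) / (start - lo + 1)
        elif i >= end:
            # Feather out: room tone weight falls away after the patch.
            w = (hi - i) / (hi - end + 1)
        else:
            w = 1.0                    # inside the hole: room tone only
        out[i] = (1.0 - w) * track[i] + w * rt
    return out
```

For example, `patch_gap(track, room_tone, 48000, 52800)` would cover a tenth of a second starting one second into a 48 kHz track. Real editors usually do this by ear on a timeline, and the default fade length here is only a placeholder.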
There are two ways to record usable Room Tone on the set.
But first, here is how NOT to record RT. Inform the Assistant Director that you need to capture some room tone for the editor. The AD will then advise you to wait a little bit until they finish filming the scene in progress. Eventually the Director will pronounce that it is a “wrap”, and then the AD will call out that “sound is going to record some room tone”. However, during the milliseconds between the Director’s lips uttering “wra” and before the “p” – it will be as if a tornado has struck the set! Cast will scurry off; noisy lights & camera systems will be put to sleep; and impatient crew members will dart off to attack craft services over at base camp. As the dust clears, the sound mixer will discover that the newly emptied set sounds nothing at all like the crowded, busy, and noisy soundstage it was during the take.
What we needed to record was an RT that perfectly matched the conditions on the set DURING filming.
To achieve this, try to record a couple of seconds at the beginning of every take. After you announce the scene/take into your personal slate mic, quickly close your slate mic, open up your primary or dominant mic, set the fader or volume level to where you normally ballpark it for dialogue, and then enjoy a “zen moment”. Give it a couple of seconds before you re-open your slate mic and shout “Speed!”
When the camera operator hears you shout “Speed,” that is when the operator should echo “Mark It” to the clapstick holder. Note that the sound mixer is the person designated to verbally identify the scene/take number; NOT the clapper person. The clapper person should only utter “Marker” just prior to closing the sticks and NOT repeat the scene/take number aloud.
At the end of each take, the sound mixer should also record a few moments of room tone before hitting the STOP button on the recorder. This provides the editor with a little bit of tail for audio transitions (fades, dissolves) as well as additional RT. It also is a good practice on the set in case the Director suddenly has a change of heart about cutting and asks talent to repeat their last bit. I always wait and make sure that the camera operator has cut the camera before I turn off the recorder.
In addition to capturing those zen moments of room tone at the beginning of the take, you should also record up to 30 seconds of good RT during the first take of each major scene (distinct audio location, not just a minor change of camera angle). Wait until the actors are in position, camera & lights active, clapstick in place. At the command to “Roll sound”, speak the scene/slate number into your mic, but do not shout “speed”. Instead, announce loudly so that the whole set can hear (or have your boomperson echo it on the set) for everyone to remain quiet so that we can record 30 seconds of room tone, as required by the studio executives. Always invoke the highest deity to inspire compliance.
Close your slate mic, open your dominant mic, and proceed to record your 30 seconds of “silence”. Of course, no one on the set will remain quiet for 30 whole seconds; you will be lucky to capture several seconds before the set is awash with whispering and soft conversation. So after your room tone has turned to walla, loudly announce that you are done with room tone and that we are good to go. Shout “speed”.
You should only attempt this 30 seconds of room tone “gotcha” at the beginning of Take One. After the first take, the actors are into their character and stoked for upcoming takes. The Director is in the zone, and usually not fond of delays. For takes two and above, just rely on sneaking in your RT during your “zen moment”.
Background track, unlike room tone, is not the sound of the set in between the actors’ words, but the sound of the SCENE as the editors envision it. Sometimes a background track is the real sound of the location; other times it is from an effects library, perhaps recorded at a different time/location, or even created in a recording lab.
Background tracks (BG) are usually continuous under a sequence, and serve to solidify numerous cuts and angles within the finished scene. Background tracks create the universe that surrounds the action. They are recorded for clarity of detail, at optimum volume levels, with the best microphones and placement possible. They are sound effects.
During the scene, background tracks are mixed in and their levels actively adjusted to set the mood without overpowering the dialogue or sound effects.
Sometimes, editors might remove as much existing RT as possible and just rely on the BG to serve in its place. But this happens later during the sound design/editing.
When you record Background tracks, your goal is to get the audio as perfect and clean as possible. Room tone is recorded low & faint, at the same level that it would have been during dialogue recording. But BG is recorded at hearty levels, and left to post-production to mix.
Try to record a series of background tracks from varying perspectives. This allows the editors and post-production mixers to stitch together a multi-dimensional surround that evolves as the characters move through the set, or at least establishes relative locations of background track elements in the surround mix.
An important tool in sound editing is noise reduction. Short, interruptive noises can be snipped out and replaced by room tone. If the noise takes place concurrently with dialogue, software such as iZotope RX might allow the editor to surgically remove simple sounds without disturbing the sounds around them (think of it as Photoshop for audio).
Most of the time, though, noise reduction is performed by intelligent software that uses sophisticated algorithms to differentiate between bad ambiance and good dialogue. If you have properly recorded room tone, the software can sample it and “learn” how to filter it out during dialogue.
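To give a feel for what “learning” a noise sample can mean, here is a toy version of a classic textbook technique called spectral subtraction. It is offered as an assumption about the general idea, not as the algorithm inside any commercial product. Everything in it is hypothetical and simplified: the function names, the tiny 16-sample frame, and the perfectly steady hum are chosen so the example stays short and runs with nothing but the Python standard library.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform (fine for tiny demo frames)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def learn_profile(room_tone, size):
    """Average magnitude per frequency bin over whole room-tone frames."""
    frames = [room_tone[i:i + size]
              for i in range(0, len(room_tone) - size + 1, size)]
    if not frames:
        raise ValueError("room tone is shorter than one frame")
    mags = [[abs(c) for c in dft(f)] for f in frames]
    return [sum(col) / len(mags) for col in zip(*mags)]

def subtract_noise(frame, profile):
    """Reduce each bin's magnitude by the learned noise, keeping its phase."""
    cleaned = []
    for c, noise_mag in zip(dft(frame), profile):
        mag = max(abs(c) - noise_mag, 0.0)   # never go below zero
        cleaned.append(cmath.rect(mag, cmath.phase(c)))
    return idft(cleaned)

# Illustrative data: a steady hum as "room tone", a second tone as "dialogue".
FRAME = 16
room_tone = [0.2 * math.sin(2 * math.pi * 2 * t / FRAME) for t in range(64)]
dialogue = [0.5 * math.sin(2 * math.pi * 5 * t / FRAME) for t in range(FRAME)]
noisy = [d + h for d, h in zip(dialogue, room_tone[:FRAME])]

profile = learn_profile(room_tone, FRAME)
cleaned = subtract_noise(noisy, profile)
```

In this contrived case the hum is removed and the “dialogue” tone survives almost untouched. Real ambiance is never this steady and always overlaps the voice in frequency, which is why the quality of the room tone sample matters so much.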
Some software only samples ongoing room tone and tries to ease it out during the scene, sort of like a high-tech noise gate: you set a threshold level, and the gate remains “open” in the presence of (louder) dialogue but cuts the (lower-level) ambiance that fails to exceed the threshold. But this fancy software looks at more than just raw loudness in order to separate dialogue from ambiance, although things can get dicey when dealing with low-level speech or unusual background noises.
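For comparison, the plain noise gate mentioned above, the kind that looks only at raw loudness, can be sketched as follows. This is a bare-bones illustration, with a hypothetical function name and parameters. Real gates add attack and release smoothing so they do not chatter, and the smarter software described here goes well beyond this.

```python
import math

def noise_gate(samples, threshold, block=4, floor_gain=0.0):
    """Pass blocks whose RMS level reaches threshold; attenuate the rest.

    threshold is a linear level (0.0 to 1.0), block is the number of
    samples measured at a time, floor_gain is applied to "closed" blocks.
    """
    out = []
    for i in range(0, len(samples), block):
        chunk = samples[i:i + block]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        gain = 1.0 if rms >= threshold else floor_gain
        out.extend(s * gain for s in chunk)
    return out
```

With a threshold of 0.1, a block of faint ambiance around 0.01 is muted while a block of dialogue around 0.5 passes unchanged. The weakness is easy to see: soft speech that falls under the threshold gets cut too, and any ambiance underneath loud dialogue passes straight through.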
This becomes a serious problem when the background noise is rapidly changing in a random pattern. Examples include running water, and nearby traffic noise with Doppler affecting the sound of vehicles as they approach and then zoom away.
A technique that can help software deal with these ever changing ambient noises is to actively capture them during the take onto an ISO track. The ISO track becomes a living reference track for the software to use while cleaning the take.
The key to making this work is to record this ambient background track so that it accurately reflects what your main dialogue track is also hearing naturally combined with the dialogue.
You need to capture the background with the exact same make/model of microphone that is picking up the (combined) track of dialogue + noise. That microphone must be angled identically to your primary mic, and at the same height. Use the same volume level as well.
You want both tracks to be perfect twins of each other, except that one track includes noise + dialogue and the other track only noise. Both mics should be equidistant from the offending noise. The closer the match that you can achieve in the field, the better the reference track for the noise reduction software to monitor during the take!
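To see why a closely matched twin track helps, consider a classic textbook approach called adaptive noise cancellation, shown here with a normalized LMS filter. This is a swapped-in illustration of the principle and not a description of how any specific noise reduction product uses an ISO track. The function name `nlms_cancel`, the filter length, the step size, and the synthetic signals are all assumptions made for the sketch.

```python
import math
import random

def nlms_cancel(primary, reference, taps=4, mu=0.1, eps=1e-8):
    """Subtract the part of primary that can be predicted from reference.

    primary   : dialogue + noise (the main mic)
    reference : noise only (the twin mic on the ISO track)
    Returns the residual, which approximates the dialogue.
    """
    weights = [0.0] * taps
    recent = [0.0] * taps              # latest reference samples, newest first
    cleaned = []
    for p, r in zip(primary, reference):
        recent = [r] + recent[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, recent))
        error = p - noise_estimate     # what the reference cannot explain
        power = sum(x * x for x in recent) + eps
        weights = [w + mu * error * x / power
                   for w, x in zip(weights, recent)]
        cleaned.append(error)
    return cleaned

# Synthetic stand-ins: random hiss as the noise, a slow tone as "dialogue".
random.seed(0)
N = 4000
noise = [random.uniform(-1.0, 1.0) for _ in range(N)]
dialogue = [0.3 * math.sin(2 * math.pi * k / 50) for k in range(N)]
primary = [d + 0.8 * n for d, n in zip(dialogue, noise)]

cleaned = nlms_cancel(primary, noise)
```

The filter can only remove what the reference track and the main track have in common. If the reference mic hears the noise differently from the main mic, less of it is removed, and if any dialogue leaks into the reference, some dialogue gets removed along with the noise.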
Bear in mind that sound files obtained from different model mics, different angles, different heights, and different Doppler timing will NOT equate to the same noise profile. Good quality software tries to account for these differences, but may only be able to improve things partially without affecting the dialogue profiles themselves. You improve your chances by improving the reference samples.