Although the sensitivity and performance of microphones have improved considerably since the days of Alexander Graham Bell, they still have one big drawback that researchers from Carnegie Mellon University may have finally overcome by using a pair of run-of-the-mill video cameras.
Put a microphone in a room with a bunch of musicians, and while you’ll capture every last note and nuance of their individual performances, you’ll be left with a single recording in which everything is mixed together. To make that performance sound even better, you ideally want every instrument and musician captured separately, so a sound engineer with a skilled ear can mix the individual performances back together.
Software tools have been developed for extracting individual sounds from an audio recording, but the results simply aren’t as good as what you’d get when capturing one sound source with one microphone. That’s why mixing boards are often so gigantic and elaborate: countless microphones with limited pickup patterns need to be set up to capture every component of a musical performance, from vocals to instruments, and that adds up to a lot of gear.
There’s really no way to redesign microphones to differentiate the captured sound vibrations moving through the air, which is why researchers at the Robotics Institute in Carnegie Mellon University’s School of Computer Science have turned to video cameras instead. Strum the strings of a guitar, and it will not only produce sound waves that travel through the air, but it will also make the guitar itself vibrate in the process. With the right equipment, those vibrations can be visualized and analyzed to recreate the sounds being produced, even if no sounds are recorded.
Optical microphones, as these camera systems are called, are not a new idea, but what the CMU researchers have come up with, and shared in a recently published paper, ‘Dual-Shutter Optical Vibration Sensing,’ is a way to make them work using low-end, and more affordable, camera gear.
The new system shines a bright laser light source on a vibrating surface, like a guitar’s body, and captures the movements of the resulting speckled pattern of light. Since human hearing extends to sounds oscillating as fast as 20,000 times every second, optical microphones have typically relied on expensive high-speed cameras to capture physical vibrations oscillating just as quickly. But the new CMU system uses cameras running at just 63 frames per second, which would seemingly miss the high-speed movements of a vibration happening 20,000 times every second.
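To see why 63 frames per second looks hopeless on its own, it helps to run the sampling arithmetic. The sketch below is an illustration of the standard Nyquist limit, not anything taken from the CMU paper; the function names are hypothetical, and it assumes the camera samples the vibration exactly once per frame.

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency that can be captured unambiguously at this sample rate."""
    return sample_rate_hz / 2


def aliased_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling at sample_rate_hz.

    A tone above the Nyquist limit folds back to its distance from the
    nearest whole multiple of the sample rate.
    """
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)


if __name__ == "__main__":
    fps = 63  # frame rate quoted in the article
    print("Nyquist limit at 63 fps:", nyquist_limit(fps), "Hz")
    print("Rate needed for 20,000 Hz:", 2 * 20_000, "samples per second")
    for tone in (440, 20_000):
        print(tone, "Hz would appear as", aliased_frequency(tone, fps), "Hz")
```

Under these assumptions, anything faster than about 31.5 oscillations per second folds back into a much lower apparent frequency, which is why a camera sampling once per frame cannot follow audio-rate vibrations by itself.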
The clever breakthrough here is the use of two different types of cameras at the same time: one with a global shutter that captures entire frames of video, resulting in distinct speckled patterns, and one with a rolling shutter that captures frames line-by-line from the top to the bottom of the sensor, resulting in distorted speckled patterns that actually contain more information about how they move back and forth over time.
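The reason a rolling shutter carries extra timing information can be sketched with simple arithmetic: every row of the sensor is exposed at a slightly different moment, so each row acts as its own time sample. The example below is a rough illustration only. The article does not state the sensor's row count; the 1,000 rows used here is an assumption chosen because it makes the numbers line up with the 63,000-per-second figure reported for the system, and the function names are hypothetical. It also assumes rows are read out evenly across the whole frame period with no idle time.

```python
def effective_sample_rate(frames_per_second, rows_per_frame):
    """Row readouts per second if each row counts as one time sample."""
    return frames_per_second * rows_per_frame


def row_timestamps(frames_per_second, rows_per_frame, frame_index=0):
    """Capture time in seconds of each row in one rolling-shutter frame.

    Assumes rows are read one after another, evenly spaced over the
    frame period (an idealization of a real sensor).
    """
    frame_period = 1.0 / frames_per_second
    row_period = frame_period / rows_per_frame
    frame_start = frame_index * frame_period
    return [frame_start + row * row_period for row in range(rows_per_frame)]


if __name__ == "__main__":
    fps, rows = 63, 1000  # 1000 rows is an assumed value, not from the article
    print("Row samples per second:", effective_sample_rate(fps, rows))
    times = row_timestamps(fps, rows)
    print("Gap between consecutive rows (s):", times[1] - times[0])
```

A global-shutter camera at the same frame rate would contribute only one timestamp per frame, which is why the two cameras play different roles.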
Using a custom algorithm, the frames captured by the two cameras can be compared with each other to more accurately determine the movements of the vibrating speckled laser patterns up to 63,000 times every second, or as fast as a pricey high-speed camera could.
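The comparison step can be illustrated with a toy model. This is not the researchers' algorithm, which works on two-dimensional speckle images and is not described in detail here; it is a deliberately simplified sketch in which the speckle pattern is a one-dimensional list of brightness values, the vibration shifts it sideways by a whole number of pixels, and each rolling-shutter row is matched against an undistorted reference (standing in for the global-shutter frame) by brute-force correlation. All names and parameter values are hypothetical.

```python
import math
import random


def circular_shift(pattern, shift):
    """Shift a 1-D pattern right by `shift` pixels, wrapping around."""
    n = len(pattern)
    return [pattern[(i - shift) % n] for i in range(n)]


def estimate_shift(reference, row, max_shift):
    """Integer shift in [-max_shift, max_shift] that best aligns reference to row."""
    n = len(reference)
    best_shift, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        # Correlation between the row and the reference shifted by s pixels.
        score = sum(reference[(i - s) % n] * row[i] for i in range(n))
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift


def simulate_frame(reference, fps, rows, tone_hz, amplitude_px):
    """Build one rolling-shutter frame: each row sees the pattern at a later time."""
    row_rate = fps * rows  # assumed evenly spaced row readout
    true_shifts, rolling_rows = [], []
    for r in range(rows):
        t = r / row_rate
        shift = round(amplitude_px * math.sin(2 * math.pi * tone_hz * t))
        true_shifts.append(shift)
        rolling_rows.append(circular_shift(reference, shift))
    return true_shifts, rolling_rows


if __name__ == "__main__":
    rng = random.Random(0)
    reference = [rng.random() - 0.5 for _ in range(64)]  # stand-in speckle row
    truth, frame = simulate_frame(reference, fps=63, rows=1000,
                                  tone_hz=2000.0, amplitude_px=3.0)
    recovered = [estimate_shift(reference, row, max_shift=3) for row in frame]
    print("All row shifts recovered:", recovered == truth)
```

In this toy setting, the per-row shifts trace out the vibration at the row readout rate rather than the frame rate. A real system would also have to handle sub-pixel motion, noise, and two-dimensional patterns, none of which are modeled here.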
The approach allows audio to be individually extracted from several sources in a single video, such as multiple musicians each playing their own guitar, or even several speakers all playing different music.
The extracted audio isn’t as clear or high-fidelity as what a traditional microphone can capture, but the optical microphone could provide mixing engineers with an easy way to monitor individual instruments during a live performance, and over time there’s little doubt the quality of the extracted audio will continue to improve. The system has other interesting applications outside of music. A video camera monitoring all the machines on a factory floor, or pointed at the engine of a running car, could determine when individual parts or components are making an abnormal sound, indicating that maintenance may be required before a minor fault becomes a serious one.