Towards automation for sound effects

MIT/CSAIL researchers add realistic sounds to silent videos, a step toward automating sound effects for movies.

MIT researchers have developed a computer system that independently adds realistic sounds to silent videos. Although the technology is nascent, it’s a step toward automating sound effects for movies.

“From the gentle blowing of the wind to the buzzing of laptops, at any given moment there are so many ambient sounds that aren’t related to what we’re actually looking at,” says MIT PhD student Andrew Owens. “What would be really exciting is to somehow simulate sound that is less directly associated to the visuals.”

The notion of artificial sound generation has been around for some time now, with concepts such as procedural audio, and in many ways it’s long overdue that the same attention and computing power afforded to visual effects be directed towards sound generation. CSAIL, which houses the World Wide Web Consortium directed by Tim Berners-Lee, is the largest research laboratory at MIT and one of the world’s most important centres of information technology research. I have found several articles discussing this new development and have selected sections of them here:
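To give a flavour of what procedural audio means in practice, here is a minimal, illustrative sketch of my own (not from the MIT work; all names and parameters are assumptions chosen for illustration). It synthesizes a crude wind-like sound by shaping white noise with a slowly wandering one-pole low-pass filter, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.io import wavfile

SR = 44100  # sample rate in Hz

def wind_sound(duration=3.0, seed=0):
    """Crude procedural 'wind': white noise shaped by a slowly moving low-pass filter."""
    rng = np.random.default_rng(seed)
    n = int(duration * SR)
    noise = rng.standard_normal(n)

    # A slowly varying 'gust' envelope, made by interpolating low-rate random values.
    gust = np.interp(np.arange(n), np.linspace(0, n, 64), rng.random(64))

    # One-pole low-pass filter whose coefficient wanders with the gusts:
    # a higher coefficient lets more high frequencies through (a brighter gust).
    out = np.zeros(n)
    y = 0.0
    for i in range(n):
        a = 0.005 + 0.03 * gust[i]
        y += a * (noise[i] - y)
        out[i] = y * (0.4 + 0.6 * gust[i])  # gusts also modulate loudness

    out /= np.max(np.abs(out)) + 1e-9  # normalize to [-1, 1]
    return out

wavfile.write("wind.wav", SR, (wind_sound() * 32767).astype(np.int16))
```

Real procedural audio systems are far more sophisticated, but the principle is the same: the sound is generated from a model rather than played back from a recording.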

see demonstration video here 

The following is a selection from those articles:

“Researchers envision future versions of similar algorithms being used to automatically produce sound effects for movies and TV shows, as well as to help robots better understand objects’ properties.

“When you run your finger across a wine glass, the sound it makes reflects how much liquid is in it,” says CSAIL PhD student Andrew Owens, who was lead author on an upcoming paper describing the work. “An algorithm that simulates such sounds can reveal key information about objects’ shapes and material types, as well as the force and motion of their interactions with the world.”


The team used techniques from the field of “deep learning,” which involves teaching computers to sift through huge amounts of data to find patterns on their own. Deep learning approaches are especially useful because they free computer scientists from having to hand-design algorithms and supervise their progress.

The paper’s co-authors include recent PhD graduate Phillip Isola and MIT professors Edward Adelson, Bill Freeman, Josh McDermott, and Antonio Torralba. The paper will be presented later this month at the annual conference on Computer Vision and Pattern Recognition (CVPR) in Las Vegas.

In a series of videos of drumsticks striking things — including sidewalks, grass and metal surfaces — the computer learned to pair each clip with a fitting sound effect, such as the sound of a drumstick hitting a piece of wood or of rustling leaves.

The findings are an example of the power of deep learning, a type of artificial intelligence currently in vogue in tech circles. With deep learning, a computer system learns to recognize patterns in huge piles of data and applies what it learns in useful ways.

In this case, the researchers at MIT’s Computer Science and Artificial Intelligence Lab recorded about 1,000 videos of a drumstick scraping and hitting real-world objects. These videos were fed to the computer system, which learned what sounds are associated with various actions and surfaces. The sound of the drumstick hitting a piece of wood is different from the sound it makes when it disrupts a pile of leaves.

Once the computer system had all these examples, the researchers gave it silent videos of the same drumstick hitting other surfaces, and they instructed the computer system to pair an appropriate sound with the video.
To do this, the computer selects a pitch and loudness that fits what it sees in the video, and it finds an appropriate sound clip in its database to play with the video.
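As a rough illustration of that retrieval step (a sketch under my own assumptions, not the paper’s actual code; the database and feature names below are hypothetical), the matching can be thought of as a nearest-neighbour lookup: compare the sound features predicted from the video against the features of every training clip and play back the closest match:

```python
import numpy as np

# Hypothetical database: each entry pairs a sound-feature vector (summarizing
# qualities like pitch and loudness over time) with the clip's waveform.
database_features = np.random.rand(1000, 64)                 # 1,000 training clips, 64-D features
database_waveforms = [np.zeros(22050) for _ in range(1000)]  # placeholder half-second waveforms

def retrieve_sound(predicted_features):
    """Return the waveform of the training clip whose features best match the prediction."""
    # Euclidean distance between the predicted features and every stored example.
    dists = np.linalg.norm(database_features - predicted_features, axis=1)
    best = int(np.argmin(dists))
    return database_waveforms[best]

# Usage: a (hypothetical) model predicts sound features for a silent video segment,
# and the retrieved clip is spliced into the soundtrack at the moment of impact.
predicted = np.random.rand(64)
clip = retrieve_sound(predicted)
```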

To demonstrate their accomplishment, the researchers then played half-second video clips for test subjects, who struggled to tell whether the clips included an authentic sound or one that the computer system had added artificially.
But the technology is not perfect, as MIT PhD candidate Andrew Owens, the lead author on the research, acknowledged. When the team tried longer video clips, the computer system would sometimes misfire and play a sound when the drumstick was not striking anything. Test subjects immediately knew the audio was not real.

And the researchers were able to get the computer to produce fitting sounds only when they used videos with a drumstick. Creating a computer that automatically provides the best sound effect for any video — the kind of development that could disrupt the sound-effects industry — remains out of reach for now.

Although the technology world has seen significant strides of late in artificial intelligence, there are still big differences in how humans and machines learn. Owens wants to push computer systems to learn more as an infant learns about the world, by physically poking and prodding its environment. He sees potential for other researchers to use sound recordings and interactions with materials such as sidewalk cement as a step toward machines better understanding our physical world.


taken from this article and this webpage


The Computer Science and Artificial Intelligence Laboratory – known as CSAIL – is the largest research laboratory at MIT and one of the world’s most important centers of information technology research.
CSAIL and its members have played a key role in the computer revolution. The Lab’s researchers have been key movers in developments like time-sharing, massively parallel computers, public key encryption, the mass commercialization of robots, and much of the technology underlying the ARPANet, Internet and the World Wide Web.  
CSAIL members (former and current) have launched more than 100 companies, including 3Com, Lotus Development Corporation, RSA Data Security, Akamai, iRobot, Meraki, ITA Software, and Vertica. The Lab is home to the World Wide Web Consortium (W3C), directed by Tim Berners-Lee, inventor of the Web and a CSAIL member.


LSM Research Seminar – Marie Thompson


Blog Post by Dr Dean Lockwood:

Staff and students from the Lincoln School of Media (including Audio Production) welcomed Marie Thompson for the LSM Research Seminar series which took place Wednesday, 30th October.

Marie is an artist and researcher based in Newcastle upon Tyne. She is currently a PhD candidate at Newcastle University, based in the International Centre for Music Studies. Her thesis uses a Spinozist notion of affect to critically rethink the correlation between noise, ‘unwantedness’ and ‘badness’, so as to more fully allow for the use of noise as a musical resource. She is the co-editor of the collection Sound, Music, Affect: Theorizing Sonic Experience (New York: Bloomsbury, 2013). Marie is also regularly audible as a noisemaker and improviser. She plays solo as Tragic Cabaret and in the band Beauty Pageant. Here is Marie’s abstract for her talk at our research seminar:

‘Rethinking noise, rethinking noise music: Affect, relationality and the poetics of transgression’:

‘In this paper, I outline a relational, ethico-affective approach to noise that works to disrupt the definitive correlation between noise, ‘unwantedness’ and ‘badness’. Rather than defining noise as a type of sound, or a subjective judgement of sound, noise is posited as a productive, transformative force and a necessary component of material relations. This approach to noise, I argue, is advantageous: firstly, because it allows for the noise that occurs out of (human) earshot, insofar as it no longer relies upon a constitutive listening subject; and secondly, because it allows for noise’s capacity to be good as well as bad, generative as well as destructive. A greater space is thus made for noise’s positively productive capacity, which has been readily explored within the arts.

In the second half of this paper, I discuss how a relational, ethico-affective approach to noise provides a means of (re)conceptualising noise music that moves away from the language of failure, taboo and contradiction. Rather than approaching noise music in terms of transgression, which is underlined by a dualistic conceptualisation of the relationship between (wanted, ‘good’) music and (unwanted, ‘bad’) noise, I suggest that noise music can be understood as an act of exposure, in that it foregrounds the presence of noise that is always already within the technical-musical system.’