Neural Networks Elevate Granular Synthesis

In the ever-evolving landscape of sound synthesis, a groundbreaking approach is making waves. Granular synthesis, a technique that builds sound by scheduling and mixing sequences of small waveform windows (grains), has long been a staple in audio generation. However, it’s not without its limitations. The grain space is typically organized by a set of handcrafted acoustic descriptors, so its quality is bound by the quality of those descriptors. Moreover, traversing this space isn’t continuously invertible to signal, and it doesn’t render any structured temporality.
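
To make the classical technique concrete, here is a minimal sketch of traditional granular synthesis in Python with NumPy: it slices a source signal into Hann-windowed grains and overlap-adds them in a rearranged order. The grain size, hop, and windowing choices are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

def granulate(signal, grain_size=1024, hop=256, order=None):
    """Classical granular synthesis: slice a signal into windowed grains,
    then overlap-add them in a (possibly rearranged) order."""
    window = np.hanning(grain_size)
    # Slice the source into overlapping, windowed grains.
    starts = np.arange(0, len(signal) - grain_size, hop)
    grains = [signal[s:s + grain_size] * window for s in starts]
    if order is None:
        order = np.random.permutation(len(grains))  # rearrange the grain sequence
    # Overlap-add the selected grains at a fixed hop.
    out = np.zeros(hop * len(order) + grain_size)
    for i, g in enumerate(order):
        out[i * hop:i * hop + grain_size] += grains[g]
    return out

# Example: granulate one second of a 220 Hz sine tone into a shuffled texture.
sr = 44100
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 220 * t)
texture = granulate(tone, grain_size=1024, hop=256)
```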

Enter Adrien Bitton, Philippe Esling, and Tatsuya Harada, who have demonstrated that generative neural networks can implement granular synthesis while mitigating many of its shortcomings. Their approach involves replacing the audio descriptor basis with a probabilistic latent space, learned through a Variational Auto-Encoder. This learned grain space is invertible, meaning that sound can be continuously synthesized when traversing its dimensions. Additionally, original grains aren’t stored for synthesis, and structured paths can be learned within this latent space by training a higher-level temporal embedding over arranged grain sequences.
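
For readers who want a sense of the mechanics, the sketch below shows a toy grain-level Variational Auto-Encoder in PyTorch. The layer sizes, grain length, and loss weighting are assumptions for illustration only and do not reproduce the authors' architecture; the point is that the decoder can turn any latent point back into a waveform grain, which is what makes the learned grain space invertible.

```python
import torch
import torch.nn as nn

class GrainVAE(nn.Module):
    """Illustrative grain-level VAE: encode a waveform grain into a latent
    vector and decode it back. Layer sizes are assumptions, not the paper's."""
    def __init__(self, grain_size=1024, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(grain_size, 512), nn.ReLU(),
            nn.Linear(512, 2 * latent_dim),  # outputs mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, grain_size), nn.Tanh(),  # waveform grain in [-1, 1]
        )

    def forward(self, grain):
        mu, logvar = self.encoder(grain).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        recon = self.decoder(z)
        # ELBO terms: reconstruction error plus KL divergence to the unit Gaussian prior.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
        recon_loss = torch.mean((recon - grain) ** 2)
        return recon, recon_loss + kl

# Decoding any point in the latent space yields a grain, so a path through the
# space can be rendered as sound by overlap-adding the decoded grains.
model = GrainVAE()
grains = torch.randn(8, 1024)   # a batch of placeholder grains
recon, loss = model(grains)
```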

The model’s versatility is noteworthy. It can be applied to a wide range of libraries, from pitched notes to unpitched drums and environmental noises. The researchers have reported experiments on common granular synthesis processes, as well as novel ones like conditional sampling and morphing. This innovative approach to sound synthesis could potentially revolutionize the way we create and manipulate audio, opening up new possibilities for musicians, sound designers, and audio engineers alike.
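
As a rough illustration of morphing, one could interpolate between the latent codes of two grains and decode each intermediate point, as in the hypothetical sketch below (re-using the toy GrainVAE defined above). This is only a plausible reading of latent-space morphing, not the authors' exact procedure.

```python
import torch

def morph(model, grain_a, grain_b, steps=16):
    """Sketch of latent-space morphing: linearly interpolate between the
    latent codes of two grains and decode each intermediate point."""
    with torch.no_grad():
        mu_a, _ = model.encoder(grain_a).chunk(2, dim=-1)
        mu_b, _ = model.encoder(grain_b).chunk(2, dim=-1)
        alphas = torch.linspace(0.0, 1.0, steps).unsqueeze(-1)
        z_path = (1 - alphas) * mu_a + alphas * mu_b   # straight line in latent space
        return model.decoder(z_path)                   # (steps, grain_size) waveform grains

model = GrainVAE()  # the toy model from the sketch above
morphed = morph(model, torch.randn(1, 1024), torch.randn(1, 1024))
```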
