AI for sound: an ERC grant for Prof. Gaël Richard

Professor Gaël Richard, executive director of Hi! Paris and professor at Télécom Paris, an IMT (Institut Mines-Télécom) school and a member of the Institut Polytechnique de Paris, has been awarded an ERC Advanced Grant by the European Union for a project on machine listening, and more specifically on artificial intelligence for sound.

Read also: Audio and machine learning: Gaël Richard’s award-winning project (on I’m Tech, IMT’s science and technology news)

Since its creation, Télécom Paris has become a major French player in the field of training and innovation in digital technologies. One of the major challenges facing today’s society is the transition to a low-carbon society. Télécom Paris proudly welcomes the ERC Advanced Grant awarded to Gaël Richard, one of its prominent professors, who recently received the Grand Prix IMT Académie des Sciences.

Machine Listening, or AI for Sound, is defined as the general field of Artificial Intelligence applied to audio analysis, understanding and synthesis by a machine. Access to ever-increasing supercomputing facilities, combined with the availability of huge (although largely unannotated) data repositories, has led to a significant trend towards purely data-driven machine learning approaches.

The field has rapidly moved towards end-to-end neural approaches, which aim to solve the machine learning problem directly on raw acoustic signals but often take only loose account of the nature and structure of the processed data.

The main consequences are that the models:

  • are overly complex, requiring massive amounts of data to be trained and extreme computing power to be efficient (in terms of task performance)
  • remain largely unexplainable and non-interpretable.

To overcome these major shortcomings, we believe that our prior knowledge about the nature of the processed data, their generation process and their perception by humans should be explicitly exploited in neural-based machine learning frameworks.

The aim of the HI-Audio project is to build such hybrid deep approaches, combining parameter-efficient and interpretable signal, musicological and physics-based models with highly tailored deep neural architectures.
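To make the idea of a parameter-efficient, interpretable signal model more concrete, here is a minimal sketch. It is not taken from the HI-Audio project. It assumes a simple harmonic (sum-of-sinusoids) model of a sound with a known fundamental frequency, and it uses an ordinary least-squares fit as a stand-in for the neural encoder that a hybrid approach would train to predict the model parameters. The function names `harmonic_basis` and `fit_harmonic_amplitudes`, the sampling rate and the test tone are all illustrative assumptions.

```python
import numpy as np


def harmonic_basis(f0, n_harmonics, sr, n_samples):
    """Cosine and sine columns at integer multiples of f0 (the signal model)."""
    t = np.arange(n_samples) / sr
    k = np.arange(1, n_harmonics + 1)
    phase = 2 * np.pi * f0 * np.outer(t, k)           # (n_samples, n_harmonics)
    return np.hstack([np.cos(phase), np.sin(phase)])  # (n_samples, 2 * n_harmonics)


def fit_harmonic_amplitudes(signal, f0, n_harmonics, sr):
    """Least-squares stand-in for a neural encoder (an assumption of this sketch).

    Returns one amplitude per harmonic: a handful of interpretable numbers
    instead of thousands of raw samples.
    """
    basis = harmonic_basis(f0, n_harmonics, sr, len(signal))
    coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    cos_part, sin_part = coeffs[:n_harmonics], coeffs[n_harmonics:]
    return np.hypot(cos_part, sin_part)


# Hypothetical test tone: 0.1 s at 16 kHz, fundamental at 200 Hz, three harmonics.
sr = 16000
f0 = 200.0
true_amps = np.array([1.0, 0.5, 0.25])
t = np.arange(sr // 10) / sr
signal = sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t) for k, a in enumerate(true_amps))

est = fit_harmonic_amplitudes(signal, f0, n_harmonics=3, sr=sr)
print(np.round(est, 3))
```

In this toy setting, 1,600 audio samples are summarised by three amplitudes that each have a direct acoustic meaning. A hybrid model in the spirit described above would replace the least-squares step with a learned network while keeping the signal model as the decoder.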

The research directions pursued in HI-Audio will exploit novel deterministic and statistical models of audio and sound environments, together with dedicated neural auto-encoders and generative networks. They will target specific applications, including speech and audio scene analysis, music information retrieval, and sound transformation and synthesis.

About Télécom Paris

Télécom Paris is one of the top four French general engineering schools. Recognized for its close work with companies, this public school guarantees excellent employability in all sectors and ranks as the number one major digital engineering school (Le Figaro 2020 and 2021 ranking). With excellent teaching and innovative practices and techniques, Télécom Paris is at the heart of a unique innovation ecosystem, based on the interaction and transversality of its training, interdisciplinary research, its business incubator based in Paris, and its recent location in Palaiseau in the heart of the Institut Polytechnique de Paris campus. Its LTCI laboratory is promoted by the HCERES as a flagship unit in the field of digital sciences with a remarkable international influence, an exceptional volume of activities geared towards the socio-economic world and companies, and a strong commitment to training. A founding member of the Institut Polytechnique de Paris and an IMT (Institut Mines-Télécom) school, Télécom Paris is positioning itself as one of the leaders in digital innovation on the Saclay plateau.

Press contact

Stéphane Potelle, chief of staff, Télécom Paris
Tel. +33 6 13 25 18 51

Header image source GarryKillian/Freepik