Project: Rapid and Accurate Audio Browsing by Structural Pattern Modeling

The main goal of the project is to develop methods for sound comparison and searching, as well as several end-user applications that exploit these methods to retrieve sound or music samples based on specified properties of their content. The methods will be implemented in software that can quickly and accurately search a large database of audio samples and retrieve samples that are similar to a given query. The software will also be able to search on other selected properties of the sample content. The audio samples may be any kind of music, songs, movie soundtracks, radio recordings, speeches or other types of sound. Based on the similarity of the audio signals, the software will identify the most similar sound samples in the database, which may contain millions of samples. The patterns we use are relevant to musical structure modelling and allow searching for spectral similarity as well as for relevant successions of timbres.

Audionamix has developed technology that allows the separation of many different parts of a monaural sound signal. The signal is first divided into packets, each containing the data from a certain time interval. Each packet is then decomposed on spectral shape dictionaries that allow efficient modelling of audio data. The dictionaries are made of hundreds to thousands of Power Spectral Density (PSD) vectors. The PSD vectors describe different properties of the sound within the time window of the packet. By comparing the PSD vector series of two sounds, it will be possible to identify similar sound samples.

Sencel Bioinformatics AS has for many years been at the forefront of the development of rapid and sensitive software for searching huge public databases of DNA and protein sequences.
Sencel has employed a range of parallel computing technologies to achieve high-speed searches and is currently developing the fastest search tool that uses the gold-standard Smith-Waterman sequence comparison algorithm on common microprocessors.

The comparison of sound signals and the comparison of genetic sequences have a lot in common. Several general signal processing techniques may be employed in the analysis of both. In both cases we are interested in identifying parts of one signal that resemble parts of another. In both cases the general sequential order of the signals needs to be conserved. And in both cases, parts of a signal may be missing, or a new part may have been inserted between the conserved parts. These shared characteristics make both tasks local alignment problems. The optimal solution to a local alignment problem can be found by dynamic programming. For genetic sequences, the well-known algorithm developed by Smith and Waterman identifies the optimal local alignment. The algorithm is based on a scoring system in which corresponding pieces of the two signals (one residue in a genetic sequence, or one time slice of an audio signal) are scored according to their degree of similarity using a scoring matrix, while deletions and insertions are penalized. We believe that the algorithms developed for genetic sequences can be adapted to the comparison of sound samples. To do so, we need to develop appropriate scoring systems for comparing the different parts of the sounds. This work will be done during the first 12 months of the project. As we have already prototyped such scoring systems, we are confident that this task will be successful.

The core algorithms for comparing sounds can be used in different types of software products for various user groups. We therefore plan to develop and commercialize several software products.
Each product will address a clearly identified need in the market. We will target three applications:
- Sample recommendation for Computer Assisted Composition
- Enhanced music recommendation for online audio broadcasting
- Music-to-video aggregation for User Generated Content web platforms

The digital content creation (DCC) market has seen a healthy period of growth. The total DCC market grew 16% from $2.6 billion to reach more than $3 billion in 2006. The fastest growing segments in the future will be interactive development and video, as the web offers new distribution networks and new programming approaches that enable small, compelling applications to be developed that extend the power of individual web sites. Jon Peddie Research (JPR) expects the total DCC market to reach $4.8 billion in 2012 (source: JPR). The two main markets targeted by our project are audio creation software and internet audio broadcasting. Both markets are growing very fast. Two of our software applications (sample recommendation for Computer Assisted Composition and music-to-video aggregation) are B2C oriented, and the third (enhanced music recommendation for internet audio broadcasting) is B2B oriented.
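
The local alignment scoring described earlier in this abstract can be illustrated with a minimal Python sketch. It is not the project's implementation. It assumes that each sound is represented as a sequence of PSD-like vectors (one per time slice), that a pair of slices is scored by cosine similarity minus a threshold, and that insertions and deletions carry a linear gap penalty. The names `cosine` and `local_align`, and the `gap` and `threshold` values, are hypothetical choices made only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[float]  # assumed representation: one PSD-like vector per time slice


def cosine(a: Frame, b: Frame) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def local_align(
    query: List[Frame],
    target: List[Frame],
    gap: float = 0.5,        # hypothetical linear gap penalty
    threshold: float = 0.5,  # hypothetical similarity cut-off
) -> Tuple[float, Tuple[int, int], Tuple[int, int]]:
    """Smith-Waterman style local alignment of two frame sequences.

    Returns (best score, (q_start, q_end), (t_start, t_end)) where the
    ranges are half-open slices into `query` and `target`.
    """
    n, m = len(query), len(target)
    # H[i][j]: best score of a local alignment ending at query[i-1], target[j-1]
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    # start[i][j]: cell where the alignment ending at (i, j) began
    start = [[(i, j) for j in range(m + 1)] for i in range(n + 1)]
    best, best_end = 0.0, (0, 0)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Positive for similar slices, negative for dissimilar ones.
            s = cosine(query[i - 1], target[j - 1]) - threshold
            candidates = [
                (H[i - 1][j - 1] + s, start[i - 1][j - 1]),  # align both slices
                (H[i - 1][j] - gap, start[i - 1][j]),        # skip a query slice
                (H[i][j - 1] - gap, start[i][j - 1]),        # skip a target slice
            ]
            score, origin = max(candidates, key=lambda c: c[0])
            if score > 0:
                H[i][j], start[i][j] = score, origin
            else:
                # Local alignment: never carry a non-positive score forward.
                H[i][j], start[i][j] = 0.0, (i, j)
            if H[i][j] > best:
                best, best_end = H[i][j], (i, j)

    (q_start, t_start), (q_end, t_end) = start[best_end[0]][best_end[1]], best_end
    return best, (q_start, q_end), (t_start, t_end)
```

For example, with three mutually orthogonal toy frames `A`, `B` and `C`, calling `local_align([A, B, C], [C, A, B, C, B])` locates the query inside the target at slices 1 to 3. This quadratic-time version is only meant to show the scoring idea; searching millions of samples would need the kind of parallel, optimised implementation the abstract refers to.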

Acronym RAABSPM (Reference Number: 5189)
Duration 01/01/2010 - 31/12/2012
Project Topic This project aims to develop a software suite for accurate, structural audio modeling. We intend to design a new generation of tools for audio information retrieval. The software will be based on Audionamix's and Sencel's respective expertise in audio signal processing and parallel computing.
Project Results
(after finalisation)
Two web-based applications for efficient query-by-example searches in sound and music databases have been developed: the AudioHelix prototype and Simitunes. Research on new uses of bioinformatics methods for audio applications has been performed, published, and integrated into the AudioHelix prototype. The results achieved on the targeted use cases are not good enough to envisage direct commercialisation.
Network Eurostars
Call Eurostars Cut-Off 3

Project partner

Number Name Role Country
2 Audionamix Partner France
2 Sencel Bioinformatics AS Coordinator Norway