March 7, 2017

Advances in Nonlinear Speech Processing: International by Bertrand Rivet, Jonathon Chambers (auth.), Jordi

This volume contains the proceedings of NOLISP 2009, an ISCA Tutorial and Workshop on Non-Linear Speech Processing held at the University of Vic (Catalonia, Spain) during June 25-27, 2009. NOLISP 2009 was preceded by three editions of this biennial event, held in 2003 in Le Croisic (France), in 2005 in Barcelona, and in 2007 in Paris. The main idea of the NOLISP workshops is to present and discuss new ideas, techniques and results related to alternative approaches in speech processing that may depart from the mainstream. In order to work at the front-end of the subject area, the following domains of interest were defined for NOLISP 2009:

1. Non-linear approximation and estimation
2. Non-linear oscillators and predictors
3. Higher-order statistics
4. Independent component analysis
5. Nearest neighbors
6. Neural networks
7. Decision trees
8. Non-parametric models
9. Dynamics for non-linear systems
10. Fractal methods
11. Chaos modeling
12. Non-linear differential equations

The initiative to organize NOLISP 2009 at the University of Vic (UVic) came from the UVic Research Group on Signal Processing and was supported by the Hardware-Software Research Group. We would like to acknowledge the financial support obtained from the Ministry of Science and Innovation of Spain (MICINN), University of Vic, ISCA, and EURASIP. All contributions to this volume are original. They were subject to a double-blind refereeing procedure before their acceptance for the workshop and were revised after being presented at NOLISP 2009.

The main principle is to project face images (viewed as intensity vectors) into a space where data scattering is maximized. Such a space is obtained by applying Principal Component Analysis (PCA) to a training set composed of numerous face images. Its direction vectors are called eigenfaces, as they are eigenvectors of the covariance matrix of the training data. Such a method may easily be extended to any visual object, provided that enough training data are available. It has thus been applied to lips (eigenlips) and tongues (eigentongues) within the framework of our experiments concerning audiovisual speech recognition (OUISPER project).
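The eigenface construction described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names `fit_eigen_images` and `project` are hypothetical, the training images are random synthetic arrays rather than real faces, lips or tongues, and the eigenvectors are obtained through an SVD of the centered data, which is one common way to compute them and not necessarily the one used by the authors.

```python
import numpy as np

def fit_eigen_images(images, n_components):
    """PCA on flattened images; returns (mean, components, variances).

    Hypothetical helper: each row of `components` is one "eigen-image",
    i.e. an eigenvector of the training-data covariance matrix.
    """
    X = np.asarray(images, dtype=float)
    X = X.reshape(len(X), -1)            # each image becomes one intensity vector
    mean = X.mean(axis=0)
    Xc = X - mean                        # center the data
    # Rows of Vt are the covariance eigenvectors, sorted by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (len(X) - 1)      # corresponding covariance eigenvalues
    return mean, Vt[:n_components], variances[:n_components]

def project(images, mean, components):
    """Coordinates of the images in the space spanned by the eigen-images."""
    X = np.asarray(images, dtype=float)
    X = X.reshape(len(X), -1)
    return (X - mean) @ components.T

# Synthetic stand-in for a training set: 20 images of 8x8 pixels (assumption).
rng = np.random.default_rng(0)
images = rng.random((20, 8, 8))

mean, eigen_images, variances = fit_eigen_images(images, n_components=5)
codes = project(images, mean, eigen_images)   # 20 images, 5 coefficients each
```

Each image is then represented by a handful of coefficients (`codes`) instead of its full intensity vector, and these coefficients can serve as visual features.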

PCA has already been presented in the section on eigenfaces. LDA is much more appropriate for classification, since the classes (phones) are then known. Unlike PCA, which only maximizes the overall scattering of the data, LDA maximizes inter-class scattering while at the same time minimizing intra-class scattering.

This method is a multivariate statistical analysis that aims at jointly transforming two signals (the acoustic and the visual one when performing audiovisual synchrony analysis) in order to maximize their covariance.
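A joint transformation of two signals that maximizes their covariance can be illustrated as follows. The sketch assumes the simplest setting, in which each signal is projected onto a single unit-norm direction; the two directions are then the leading left and right singular vectors of the cross-covariance matrix. The function name `max_covariance_directions` and the synthetic "acoustic" and "visual" features are hypothetical and only serve the illustration.

```python
import numpy as np

def max_covariance_directions(X, Y):
    """Unit vectors a, b that maximize cov(X @ a, Y @ b).

    X and Y hold one observation per row (same number of rows).
    Returns (a, b, maximal covariance).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (len(X) - 1)         # cross-covariance matrix
    U, s, Vt = np.linalg.svd(C)
    # Leading singular vectors give the best pair of projection directions.
    return U[:, 0], Vt[0], s[0]

# Synthetic example (assumption): both streams share one hidden source.
rng = np.random.default_rng(0)
shared = rng.standard_normal(500)
X = np.outer(shared, [1.0, 0.5, 0.0]) + 0.1 * rng.standard_normal((500, 3))  # "acoustic"
Y = np.outer(shared, [0.0, 2.0]) + 0.1 * rng.standard_normal((500, 2))       # "visual"

a, b, cov_ab = max_covariance_directions(X, Y)
audio_1d = X @ a      # transformed acoustic signal
visual_1d = Y @ b     # transformed visual signal
```

The covariance between `audio_1d` and `visual_1d` equals the largest singular value of the cross-covariance matrix, so no other pair of unit-norm directions can yield a larger value.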

Fig. 11. Audiovisual ASR results: (a) audio-visual decision-fusion results at -5 dB; (b) audio-visual parameter-fusion results.

An SVD-based matching method was introduced for spatial matching between keypoints [36]. It relies on the proximity and exclusion principles enunciated by Ullman [40], which impose one-to-one correspondences. Let us consider two sets of keypoints, and let R be the distance matrix between them. The matching consists in searching for pairs (i, j) that minimize R_ij. Searching for one-to-one correspondences may be facilitated if some projection matrix Q brings R closer to the identity matrix I.
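One well-known way to turn this idea into code is sketched below. It is an assumption about the general family of SVD-based matching methods, not a reproduction of the exact formulation in [36]: distances are first converted into a Gaussian proximity matrix G (width `sigma`, a hypothetical parameter), the singular values of G are replaced by ones, and a pair (i, j) is accepted only if the resulting entry dominates both its row and its column, which enforces one-to-one correspondences. The function name `svd_match` and the sample points are illustrative.

```python
import numpy as np

def svd_match(points_a, points_b, sigma=1.0):
    """One-to-one keypoint matching via SVD (illustrative sketch).

    Returns a list of index pairs (i, j): point i of A matched to point j of B.
    """
    A = np.asarray(points_a, dtype=float)
    B = np.asarray(points_b, dtype=float)
    # Distance matrix R between the two sets of keypoints.
    R = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    # Proximity matrix: close to 1 for nearby points, close to 0 for distant ones.
    G = np.exp(-R**2 / (2.0 * sigma**2))
    # Replace the singular values of G by ones: P is the closest "orthogonal-like"
    # matrix to G, which amplifies good pairings and suppresses ambiguous ones.
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    P = U @ Vt
    row_best = P.argmax(axis=1)
    col_best = P.argmax(axis=0)
    # Exclusion principle: keep (i, j) only if it is the best in its row AND column.
    return [(i, int(j)) for i, j in enumerate(row_best) if col_best[j] == i]

# Three keypoints and a shuffled, slightly displaced copy (synthetic data).
points_a = [[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]]
points_b = [[0.1, 5.0], [0.0, 0.1], [5.1, 0.0]]
matches = svd_match(points_a, points_b)
```

The choice of `sigma` controls how quickly proximity decays with distance; it should be of the order of the expected displacement between corresponding keypoints.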

