MSc SS Thesis Presentation

The cocktail party problem: GSVD-beamformers in reverberant environments

Derk-Jan Hulsinga

Hearing aids, as a form of audio preprocessing, are increasingly common in everyday life. The goal of this thesis is to implement a blind approach to the cocktail party problem and to challenge some of the assumptions regularly made in the literature. We approach the problem as wideband frequency-domain blind source separation (FD-BSS). The assumption of continuous source activity, common in this field of research, is dropped. Instead, detection of the number of active users is implemented as a preprocessing step, which ensures the appropriate number of demixing vectors for each time-frequency bin. The validity of the standard mixing model used for STFTs is challenged by examining the response of a linear array.

Source separation is achieved with demixing vectors based on the generalized singular value decomposition (GSVD), derived in a model-based approach. While most permutation solvers offer an a posteriori solution for all users, we looked at finding local solutions for a single user. Combining this with the user identification, called the alignment step, we conclude that the permutation problem can be reduced to selecting a demixing vector for each discrete time-frequency instance. The correlation coefficient proves to be a sufficient metric for coupling reconstructions to the original data, as it selects most of the active time-frequency bins.
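
As an illustration of selection by correlation coefficient, the following is a minimal sketch and not the implementation used in the thesis. It assumes that the outputs of K candidate demixing vectors are available as an array of shape (K, F, T), that a reference signal of shape (F, T) exists to compare against, and that the correlation is taken between magnitude envelopes over a block of T frames in each frequency bin. The function name `pick_by_correlation` and all array layouts are hypothetical.

```python
import numpy as np

def pick_by_correlation(candidates, reference):
    """Select, per frequency bin, the candidate most correlated with a reference.

    candidates: array (K, F, T), outputs of K candidate demixing vectors (assumed layout)
    reference:  array (F, T), signal the reconstructions are coupled to (assumed layout)
    Returns (idx, rho): idx has shape (F,), rho has shape (K, F).
    """
    # Work on magnitude envelopes so that the phase ambiguity plays no role.
    cand_env = np.abs(candidates)
    ref_env = np.abs(reference)

    # Remove the mean over time, as in the Pearson correlation coefficient.
    cand_c = cand_env - cand_env.mean(axis=-1, keepdims=True)
    ref_c = ref_env - ref_env.mean(axis=-1, keepdims=True)

    # Correlation coefficient per candidate and frequency bin.
    num = (cand_c * ref_c[None]).sum(axis=-1)
    den = np.linalg.norm(cand_c, axis=-1) * np.linalg.norm(ref_c, axis=-1)[None]
    rho = num / np.maximum(den, 1e-12)  # guard against silent bins

    # Index of the best-matching candidate in each frequency bin.
    return rho.argmax(axis=0), rho
```

Because the correlation coefficient is invariant to scaling, a candidate that matches the reference up to a gain factor is still selected, which suits the scaling ambiguity inherent in blind separation.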

In simulations, our demixing vectors achieve intelligibility, measured by STOI, comparable to that of the techniques they are compared with, and they are more robust to smaller sample sizes than the theoretically SINR-optimal MVDR.
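
For context on the sample-size remark, the sketch below computes standard MVDR weights, w = R^{-1} a / (a^H R^{-1} a), from a sample covariance matrix. It is a textbook formulation, not code from the thesis. The function name `mvdr_weights`, the snapshot layout (M sensors by N snapshots), and the optional diagonal `loading` term are assumptions made for illustration.

```python
import numpy as np

def mvdr_weights(snapshots, steering, loading=0.0):
    """MVDR beamformer weights from a sample covariance matrix.

    snapshots: complex array (M, N), N snapshots from M sensors (assumed layout)
    steering:  complex array (M,), steering vector of the target
    loading:   optional diagonal loading, an assumption for small N
    """
    M, N = snapshots.shape
    # Sample covariance estimate; it is poorly conditioned when N is small
    # and singular when N < M, unless loading > 0.
    R = snapshots @ snapshots.conj().T / N
    R = R + loading * np.eye(M)

    # Solve R x = a rather than forming the inverse explicitly.
    Ri_a = np.linalg.solve(R, steering)

    # Normalise so that the response towards the target is exactly one.
    return Ri_a / (steering.conj() @ Ri_a)
```

The weights satisfy the distortionless constraint w^H a = 1 by construction. Their quality depends on how well the sample covariance approximates the true one, which is where few snapshots become a problem.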

