Speaker independent visual-only language identification

Newman, Jacob and Cox, Stephen (2010) Speaker independent visual-only language identification. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2010-03-14 - 2010-03-19.

Full text not available from this repository.

Abstract

We describe experiments in visual-only language identification (VLID), in which only lip shape, appearance and motion are used to determine the language of a spoken utterance. In previous work, we showed that this is possible in speaker-dependent mode, i.e. identifying which language a multi-lingual speaker is using. Here, by appropriately modifying techniques that have been successful in audio language identification, we extend the work to discriminating two languages in speaker-independent mode. Our results indicate that even with viseme recognition accuracy as low as about 34%, reasonable discrimination can be obtained. A simulation of degraded viseme recognition indicates that high VLID accuracy should be achievable with viseme recognition error rates of the order of 50%.
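The audio LID techniques the abstract alludes to are typically phonotactic: an n-gram language model is trained over recognized unit sequences for each language, and an utterance is assigned to the language whose model scores it highest. Below is a minimal, hypothetical sketch of that idea applied to viseme token sequences; the viseme labels, the bigram order, and the add-one smoothing are illustrative assumptions, not the paper's actual configuration.

```python
from collections import defaultdict
import math

def train_bigram_lm(sequences, smoothing=1.0):
    """Train an add-one smoothed bigram model over viseme token sequences.

    Returns a scoring function mapping a sequence to its log-likelihood.
    """
    counts = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for seq in sequences:
        # Pad with start/end markers so boundary visemes are modeled too.
        for prev, cur in zip(["<s>"] + seq, seq + ["</s>"]):
            counts[prev][cur] += 1
            vocab.add(cur)
    vocab.add("</s>")

    def log_prob(seq):
        total = 0.0
        for prev, cur in zip(["<s>"] + seq, seq + ["</s>"]):
            c = counts[prev]
            # Additive smoothing so unseen bigrams get nonzero probability.
            total += math.log((c[cur] + smoothing) /
                              (sum(c.values()) + smoothing * len(vocab)))
        return total

    return log_prob

def identify_language(lms, viseme_seq):
    """Return the language whose viseme LM gives the sequence the highest score."""
    return max(lms, key=lambda lang: lms[lang](viseme_seq))

# Toy usage with made-up viseme labels for two languages.
english = [["V1", "V3", "V2"], ["V1", "V2", "V2", "V3"]]
french = [["V4", "V4", "V1"], ["V4", "V1", "V4"]]
lms = {"English": train_bigram_lm(english), "French": train_bigram_lm(french)}
print(identify_language(lms, ["V4", "V1", "V4", "V4"]))  # → French
```

In practice the recognized viseme sequences are noisy, which is why the paper's finding that discrimination survives recognition error rates around 50% matters: the language models only need the bigram statistics to differ between languages, not a perfect transcription.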

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Faculty of Science > Research Groups > Data Science and AI
Depositing User: Nicola Talbot
Date Deposited: 14 Mar 2011 09:47
Last Modified: 10 Dec 2024 01:14
URI: https://ueaeprints.uea.ac.uk/id/eprint/26045
DOI: 10.1109/ICASSP.2010.5495071
