Newman, Jacob and Cox, Stephen (2010) Speaker independent visual-only language identification. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, 2010-03-14 - 2010-03-19.
Abstract
We describe experiments in visual-only language identification (VLID), in which only lip shape, appearance and motion are used to determine the language of a spoken utterance. In previous work, we showed that this is possible in speaker-dependent mode, i.e. identifying the language spoken by a multilingual speaker. Here, by appropriately modifying techniques that have been successful in audio language identification, we extend the work to discriminating two languages in speaker-independent mode. Our results indicate that even with viseme accuracy as low as about 34%, reasonable discrimination can be obtained. A simulation of degraded viseme recognition performance indicates that high VLID accuracy should be achievable with viseme recognition errors of the order of 50%.
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Faculty \ School: | Faculty of Science > School of Computing Sciences |
| UEA Research Groups: | Faculty of Science > Research Groups > Interactive Graphics and Audio; Faculty of Science > Research Groups > Smart Emerging Technologies; Faculty of Science > Research Groups > Data Science and AI |
| Depositing User: | Nicola Talbot |
| Date Deposited: | 14 Mar 2011 09:47 |
| Last Modified: | 10 Dec 2024 01:14 |
| URI: | https://ueaeprints.uea.ac.uk/id/eprint/26045 |
| DOI: | 10.1109/ICASSP.2010.5495071 |