Resolution Limits on visual speech recognition

Bear, Helen L.; Harvey, Richard; Lan, Yuxuan

doi:10.1109/ICIP.2014.7025274

Bear, Helen L., Harvey, Richard ORCID: https://orcid.org/0000-0001-9925-8316 and Lan, Yuxuan (2014) Resolution Limits on visual speech recognition. In: International Conference on Image Processing, 2008-10-12 - 2008-10-15.

Full text not available from this repository.

Abstract

Visual-only speech recognition depends on a number of factors that can be difficult to control, such as lighting, identity, motion, emotion and expression. But some factors, such as video resolution, are controllable, so it is surprising that there is not yet a systematic study of the effect of resolution on lip-reading. Here we use a new data set, the Rosetta Raven data, to train and test recognizers so we can measure the effect of video resolution on recognition accuracy. We conclude that, contrary to common practice, resolution need not be that great for automatic lip-reading. However, it is highly unlikely that automatic lip-reading can work reliably when the distance between the bottom of the lower lip and the top of the upper lip is less than four pixels at rest.

Item Type:	Conference or Workshop Item (Paper)
Faculty \ School:	Faculty of Science; Faculty of Science > School of Computing Sciences
UEA Research Groups:	Faculty of Science > Research Groups > Visual Computing and Signal Processing (former - to 2025); Faculty of Science > Research Groups > Smart Emerging Technologies (former - to 2025)
Depositing User:	Pure Connector
Date Deposited:	09 Mar 2015 07:39
Last Modified:	18 Jun 2026 21:12
URI:	https://ueaeprints.uea.ac.uk/id/eprint/52353
DOI:	10.1109/ICIP.2014.7025274
