Inferring the structure of a tennis game using audio information

Huang, Qiang and Cox, Stephen (2010) Inferring the structure of a tennis game using audio information. IEEE Transactions on Audio, Speech, and Language Processing, 19 (7). pp. 1925-1937. ISSN 1558-7916

We describe a novel framework for inferring the low-level structure of a sports game (tennis) using only the information available on the audio track of a video recording of the game. Our goal is to segment the game into a sequence of points, the natural unit for describing a tennis match. The framework is hierarchical: at the lowest level it identifies audio events, then match (i.e. semantic) events, and at the highest level, game points. Each type of event is detected with a technique suited to its characteristics, and these techniques are coupled in a probabilistic framework. The techniques consist of Gaussian mixture models and a hierarchical language model to detect sequences of audio events, a maximum entropy Markov model to infer match events from these audio events, and multigrams to infer the segmentation of a sequence of match events into sequences of points in a tennis game. Our results are promising, giving an F-score greater than 0.7 for the final detection of points.
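The final stage described in the abstract, segmenting a sequence of match events into points with multigrams, can be illustrated with a minimal sketch. It is not the authors' implementation. The event labels (`serve`, `rally`, `fault`, `applause`), the unit probabilities, and the function name `segment` are all hypothetical, chosen only for illustration. In a multigram model the unit probabilities would normally be estimated from data; here they are set by hand, and only the dynamic-programming search for the most probable segmentation is shown.

```python
import math


def segment(events, unit_logprob, max_len=5):
    """Return the most probable split of `events` into known units.

    `unit_logprob` maps a tuple of event labels to its log-probability.
    Returns a list of tuples, or None if no segmentation is possible.
    """
    n = len(events)
    best = [-math.inf] * (n + 1)  # best[i]: best log-prob of events[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last unit
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            unit = tuple(events[start:end])
            if unit in unit_logprob and best[start] > -math.inf:
                score = best[start] + unit_logprob[unit]
                if score > best[end]:
                    best[end] = score
                    back[end] = start
    if best[n] == -math.inf:
        return None
    # Follow the back-pointers from the end to recover the units.
    units = []
    end = n
    while end > 0:
        start = back[end]
        units.append(tuple(events[start:end]))
        end = start
    units.reverse()
    return units


# Hypothetical unit inventory with hand-set probabilities (assumptions).
unit_logprob = {
    ("serve", "rally", "applause"): math.log(0.6),
    ("serve", "fault", "serve", "rally", "applause"): math.log(0.3),
    ("serve", "fault"): math.log(0.1),
}

events = ["serve", "rally", "applause",
          "serve", "fault", "serve", "rally", "applause"]
points = segment(events, unit_logprob)
print(points)
```

With these assumed probabilities, treating the fault and second serve as part of one point scores higher than splitting them into separate units, so the search returns two points rather than three.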

Item Type: Article
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Smart Emerging Technologies
Faculty of Science > Research Groups > Interactive Graphics and Audio
Depositing User: Nicola Talbot
Date Deposited: 14 Mar 2011 10:15
Last Modified: 12 Jan 2024 01:20
DOI: 10.1109/TASL.2010.2103059
