Liu, Li and Shao, Ling (2013) Learning discriminative representations from RGB-D video data. In: IJCAI '13 Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence. UNSPECIFIED, 1493–1500. ISBN 978-1-57735-633-2
Full text not available from this repository.

Abstract
Recently, the low-cost Microsoft Kinect sensor, which can capture real-time high-resolution RGB and depth visual information, has attracted increasing attention for a wide range of applications in computer vision. Existing techniques extract hand-tuned features from the RGB and the depth data separately and fuse them heuristically, which does not fully exploit the complementarity of the two data sources. In this paper, we introduce an adaptive learning methodology to automatically extract (holistic) spatio-temporal features, simultaneously fusing the RGB and depth information, from RGB-D video data for visual recognition tasks. We address this as an optimization problem using our proposed restricted graph-based genetic programming (RGGP) approach, in which a group of primitive 3D operators are first randomly assembled into graph-based combinations and then evolved generation by generation by evaluating them on a set of RGB-D video samples. Finally, the best-performing combination is selected as the (near-)optimal representation for a pre-defined task. The proposed method is systematically evaluated on a new hand gesture dataset, SKIG, which we collected ourselves, and on the public MSR Daily Activity 3D dataset. Extensive experimental results show that our approach leads to significant advantages compared with state-of-the-art hand-crafted and machine-learned features.
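Since the full text is not available from this repository, the following is only an illustrative toy sketch of the generation-by-generation search the abstract describes: candidate combinations of primitive 3D operators are randomly assembled, scored on sample data, and refined by selection, crossover and mutation until the best-performing combination is kept. The operator names, the linear-chain candidate encoding, and the placeholder fitness function are all assumptions for illustration; they are not the authors' RGGP operators, graph restrictions, or evaluation pipeline.

```python
import random

# Toy sketch of an evolutionary search over operator combinations.
# Candidates are simplified to linear chains of named primitive operators;
# the paper's actual method evolves restricted graph-based combinations
# evaluated by recognition performance on RGB-D video samples.

PRIMITIVE_OPS = ["grad3d", "gabor3d", "laplacian3d", "pool", "normalize"]


def random_candidate(max_len=5):
    """Randomly assemble a combination of primitive operators."""
    return [random.choice(PRIMITIVE_OPS) for _ in range(random.randint(2, max_len))]


def fitness(candidate):
    """Placeholder score; in the paper this would be recognition accuracy
    of the features the candidate extracts from RGB-D video samples."""
    return sum(len(op) for op in candidate) / (1 + len(candidate))


def crossover(a, b):
    """Combine two candidates by swapping tails."""
    cut_a, cut_b = random.randrange(1, len(a)), random.randrange(1, len(b))
    return a[:cut_a] + b[cut_b:]


def mutate(candidate, rate=0.2):
    """Randomly replace operators with a small probability."""
    return [random.choice(PRIMITIVE_OPS) if random.random() < rate else op
            for op in candidate]


def evolve(pop_size=30, generations=20):
    """Evolve the population generation by generation and return the best."""
    population = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = [mutate(crossover(*random.sample(survivors, 2)))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print("Best operator combination:", best, "score:", round(fitness(best), 3))
```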
| Item Type: | Book Section |
| --- | --- |
| Faculty \ School: | Faculty of Science > School of Computing Sciences |
| Depositing User: | Pure Connector |
| Date Deposited: | 10 Feb 2017 02:27 |
| Last Modified: | 31 Oct 2022 14:30 |
| URI: | https://ueaeprints.uea.ac.uk/id/eprint/62412 |