Shen, Yuming, Liu, Li, Shao, Ling and Song, Jingkuan (2017) Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval. In: IEEE International Conference on Computer Vision. The Institute of Electrical and Electronics Engineers (IEEE), ITA, pp. 4117-4126.
PDF (Accepted manuscript, 9MB)
Abstract
Cross-modal hashing is usually regarded as an effective technique for large-scale textual-visual cross retrieval, where data from different modalities are mapped into a shared Hamming space for matching. Most of the traditional textual-visual binary encoding methods only consider holistic image representations and fail to model descriptive sentences. This renders existing methods inappropriate to handle the rich semantics of informative cross-modal data for quality textual-visual search tasks. To address the problem of hashing cross-modal data with semantic-rich cues, in this paper, a novel integrated deep architecture is developed to effectively encode the detailed semantics of informative images and long descriptive sentences, named as Textual-Visual Deep Binaries (TVDB). In particular, region-based convolutional networks with long short-term memory units are introduced to fully explore image regional details while semantic cues of sentences are modeled by a text convolutional network. Additionally, we propose a stochastic batch-wise training routine, where high-quality binary codes and deep encoding functions are efficiently optimized in an alternating manner. Experiments are conducted on three multimedia datasets, i.e. Microsoft COCO, IAPR TC-12, and INRIA Web Queries, where the proposed TVDB model significantly outperforms state-of-the-art binary coding methods in the task of cross-modal retrieval.
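The matching step mentioned in the abstract, comparing items from different modalities in a shared Hamming space, can be illustrated with a minimal sketch. This is not the TVDB model and does not reproduce its encoders or training routine; it only assumes that some encoder has already produced fixed-length binary codes for both sentences and images. The function names (`hamming_distance`, `rank_by_hamming`), the item labels, and the hand-picked 8-bit codes are hypothetical and chosen purely for illustration.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code: int, database: dict) -> list:
    """Return (item, distance) pairs sorted from nearest to farthest."""
    scored = [(name, hamming_distance(query_code, code))
              for name, code in database.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical 8-bit image codes; a real system would use longer codes
# produced by a learned image encoder.
image_codes = {
    "img_dog": 0b10110010,
    "img_cat": 0b10010100,
    "img_car": 0b01001101,
}

# Hypothetical code for a text query, assumed to come from a text encoder
# that maps into the same Hamming space as the image encoder.
query_code = 0b10110110

ranking = rank_by_hamming(query_code, image_codes)
print(ranking)
```

Because the distance reduces to an XOR followed by a bit count, ranking a large database is cheap in both memory and time, which is the usual motivation for binary codes in large-scale retrieval.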
Item Type: | Book Section |
---|---|
Faculty \ School: | Faculty of Science > School of Computing Sciences |
Depositing User: | Pure Connector |
Date Deposited: | 16 Aug 2017 05:08 |
Last Modified: | 22 Oct 2022 00:03 |
URI: | https://ueaeprints.uea.ac.uk/id/eprint/64523 |
DOI: | 10.1109/ICCV.2017.441 |