Document retrieval using image features

Smith, Dan and Harvey, Richard (2010) Document retrieval using image features. In: SAC ACM, 2010-01-01.


This paper describes a new approach to document classification based on visual features alone. Text-based retrieval systems perform poorly on noisy text. We have conducted a series of experiments using cosine distance as our similarity measure, varying both the number of local interest points selected per page and the number of nearest-neighbour points used in the similarity calculations. We have found that a distance-based measure of similarity outperforms a rank-based measure except when there are few interest points. We show that using visual features substantially outperforms text-based approaches for noisy text, giving average precision in the range 0.4 to 0.43 in several experiments retrieving scientific papers.
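
The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the descriptors, how point-level distances are combined into a page-level score, or how the rank-based measure is defined. The sketch assumes each page is a list of interest-point descriptor vectors and, as one plausible distance-based score, averages the cosine distance from each query point to its k nearest points on the candidate page. The function names `cosine_distance`, `page_distance` and `average_precision` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a zero vector is treated as maximally dissimilar.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def page_distance(query_points, page_points, k=1):
    """Distance-based page score (an assumed aggregation, not the paper's).

    For each descriptor on the query page, take its k nearest descriptors
    on the candidate page by cosine distance, then average all of those
    distances. Lower values mean more similar pages.
    """
    total = 0.0
    count = 0
    for q in query_points:
        nearest = sorted(cosine_distance(q, p) for p in page_points)[:k]
        total += sum(nearest)
        count += len(nearest)
    return total / count if count else float("inf")


def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance is a list of booleans in ranked order, True where the
    retrieved document is relevant. This assumes the list covers the whole
    collection, so every relevant document appears somewhere in it.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0
```

To rank a collection under these assumptions, compute `page_distance` between the query page and every candidate page, sort the candidates by ascending distance, and pass the resulting relevance flags to `average_precision`.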

Item Type: Conference or Workshop Item (Paper)
Faculty \ School: Faculty of Science > School of Computing Sciences
UEA Research Groups: Faculty of Science > Research Groups > Interactive Graphics and Audio
Faculty of Science > Research Groups > Smart Emerging Technologies
Depositing User: Vishal Gautam
Date Deposited: 11 Mar 2011 16:38
Last Modified: 22 Apr 2023 02:43
DOI: 10.1145/1774088.1774098
