21 February 2013
Improving Bag-of-visual-Words Model with Spatial-Temporal Correlation for Video Retrieval
Meeting Room 10, 2nd Floor, JLB
12:30pm - 13:45pm
An interesting video information retrieval modality is to query by a video example that expresses the user's intent. Most state-of-the-art approaches to Query-by-Example (QBE) video retrieval are based on the Bag-of-visual-Words (BovW) representation of visual content. This representation, however, ignores spatial-temporal information, which is important for measuring similarity between videos. Directly incorporating such information into the video representation is computationally expensive for large-scale data sets, in terms of both storage and similarity measurement; it is also static, failing to reflect how the discriminative power of visual words changes across queries. To tackle these limitations, in this paper we propose to discover Spatial-Temporal Correlations (STC) imposed by the query example to improve the BovW model for video retrieval. The STC, expressed as spatial proximity and relative motion coherence between different visual words, is crucial for identifying the discriminative power of the visual words. We develop a novel technique to emphasize the most discriminative visual words during similarity measurement, and incorporate this STC-based approach into the standard inverted index architecture.
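The retrieval pipeline the abstract describes can be illustrated with a minimal sketch (not the authors' implementation): a BovW inverted index in which each visual word carries a query-dependent weight standing in for the STC-derived discriminative power. The `stc_weight` values here are assumed placeholders; a real system would derive them from spatial proximity and motion coherence in the query example.

```python
# Minimal QBE sketch: BovW inverted index with hypothetical per-query
# visual-word weights emulating the STC-based emphasis from the abstract.
from collections import defaultdict

def build_inverted_index(videos):
    """videos: {video_id: {visual_word: count}} -> {visual_word: [(video_id, count)]}."""
    index = defaultdict(list)
    for vid, hist in videos.items():
        for word, count in hist.items():
            index[word].append((vid, count))
    return index

def score(query_hist, index, stc_weight):
    """Weighted dot-product scoring over the inverted index.

    stc_weight is a hypothetical per-visual-word weight; the paper would
    compute it from the query's spatial-temporal correlations.
    """
    scores = defaultdict(float)
    for word, q_count in query_hist.items():
        w = stc_weight.get(word, 1.0)  # default: no re-weighting
        for vid, d_count in index.get(word, []):
            scores[vid] += w * q_count * d_count
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy collection of two videos as visual-word histograms.
videos = {
    "v1": {"w1": 3, "w2": 1},
    "v2": {"w2": 4, "w3": 2},
}
index = build_inverted_index(videos)
query = {"w1": 1, "w2": 2}
weights = {"w1": 2.0, "w2": 0.5}  # assumed STC weights, for illustration only
ranking = score(query, index, weights)
```

Because only postings for the query's visual words are scanned, the STC re-weighting adds no storage overhead to the index itself, which is the appeal of folding it into the standard inverted-index architecture.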
I am a PhD student at Robert Gordon University, supervised by Professor Dawei Song. My research focuses on content-based multimedia information retrieval, and my progress has been published in mainstream information retrieval conferences such as CIKM, SIGIR and ECIR. My research interests include information retrieval, multimedia information retrieval, computer vision, image processing and video understanding.