The field of computer vision and multimedia has grown rapidly during the past decade, driven by an increasing need to understand the semantic content of image and video data from popular image- and video-sharing communities such as Flickr and YouTube. However, the performance of vision and multimedia understanding systems depends heavily on the choice of data representation. Over the past years, researchers have proposed a plethora of visual features spanning a wide spectrum, from very local to full-image and from low-level to high-level semantic features. In practice, low-level computer vision features are sometimes too primitive to capture useful semantic information in images or videos. On the other hand, high-level features are too semantic to capture the inherent meaning of images or videos. Mid-level intermediate representations can therefore be regarded as a trade-off between low-level and high-level features. Recent research in computer vision and multimedia shows that intermediate representations perform well in practice on recognition tasks. Moreover, the success of deep learning in vision and multimedia problems supports the effectiveness of these intermediate representations, because deep learning can be seen as a system that learns a set of intermediate representations for specific tasks. Developing optimal intermediate feature representations for images and videos is a crucial step for computer vision and multimedia applications.
This special issue serves as a forum for researchers all over the world to discuss their work and recent advances in intermediate representations for computer vision and multimedia applications. Both state-of-the-art works and literature reviews are welcome for submission. Papers addressing interesting computer vision and multimedia applications are especially encouraged. Topics of interest include, but are not limited to:
Different intermediate representation strategies for vision and multimedia problems, e.g., mid-level feature learning for object recognition/detection, attribute learning for object recognition/detection, classifier-specific intermediate representation for multimedia event detection.
Novel machine learning approaches for learning intermediate representations
Intermediate representations via deep learning frameworks
Multi-modal intermediate representation
Survey papers on intermediate representation for vision and multimedia problems (these go through the regular review process)
Real-world computer vision and multimedia applications based on intermediate representation, e.g., event detection, cross-media retrieval, object recognition, object detection, action recognition, object tracking, location-based services.
Paper Submission: February 1, 2016
First Notification: April 15, 2016
Revised Manuscript: June 30, 2016
Notification of Acceptance: September 30, 2016
Final Manuscript Due: December, 2016
Yan Yan, Department of Information Engineering and Computer Science, University of Trento, Italy, Email: [email protected]
Yahong Han, Department of Computer Science, Tianjin University, China, Email: [email protected]
Petia Radeva, Faculty of Mathematics, Universitat de Barcelona, Spain, Email: [email protected]
Qi Tian, Department of Computer Science, University of Texas at San Antonio, USA, Email: [email protected]