Sun, 24 Jun 2012
Audio and Multimedia Methods for Large-Scale Video Analysis

First ACM International Workshop at ACM Multimedia 2012 
29 October-2 November in Nara, Japan

*** Submission deadline: July 1st, 2012 ***

Media sharing sites on the Internet and the one-click upload capability of smartphones have led to a deluge of online multimedia content. Every day, thousands of videos are uploaded to the web, creating an ever-growing demand for methods to make them easier to retrieve, search, and index. While visual information is a very important part of a video, acoustic information often complements it. This is especially true for the analysis of consumer-produced, unconstrained videos from social media networks, such as YouTube uploads or Flickr content.

The diversity in content, recording equipment, environment, quality, etc. poses significant challenges to the current state of the art in multimedia analytics. Because this data comes from non-professional, consumer sources, it often has little or no manual labeling. Large-scale multimodal analysis of audio-visual material can help overcome this problem and provide training and testing material across modalities for language understanding, human action recognition, and scene identification algorithms, with applications in robotics, interactive agents, etc. Speech and audio provide a natural modality to summarize and interact with the content of videos. Therefore, speech and audio processing is critical for multimedia analysis that goes beyond traditional classification and retrieval applications.

The goal of the 1st ACM International Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis (AMVA) is to bring together researchers and practitioners in this newly emerging field, and to foster discussion on future directions of the topic by providing a forum for focused exchanges on new ideas, developments, and results. The aim is to build a strong community and a venue that at some point can become its own conference.

Topics include novel acoustic and multimedia methods for
  * video retrieval, search, and organization
  * video navigation and interactive services
  * information extraction and summarization
  * combination, fusion, and integration of the audio, 
    visual, and other streams
  * feature extraction and machine learning on "wild" data

Submissions: Workshop submissions of 4-6 pages should be formatted according to the ACM Multimedia author kit. Submission system link:

Important dates: 
Workshop paper submission: July 1st, 2012 
Notification of acceptance: August 7th, 2012 
Camera-ready submission to Sheridan: August 15th, 2012

Organizers: 
Gerald Friedland, ICSI Berkeley (USA) 
Daniel P. W. Ellis, Columbia University (USA)
Florian Metze, Carnegie Mellon University (USA)

Panel Chair: 
Ajay Divakaran, SRI/Sarnoff (USA)

