This is a joint research project with the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey.

The aim of the project is to research methods that will lead to machines capable of making sense of interactions between humans and of understanding real-world events. We are beginning with a human activity, a sports game, that is "easy" to understand because:

  • the goals are well-defined;
  • there are a small number of well-defined events;
  • the syntax of these events is tightly circumscribed by a set of rules.
     

The game we have chosen is tennis. The project is nearly complete, and we have made important research contributions in the following areas:

  • detection of salient audio events;
  • use of contextual information to disambiguate audio events;
  • detection of anomalous events.

References

  1. Huang, Q., Cox, S.J., Inferring the Structure of a Tennis Game using Audio Information, IEEE Transactions on Audio, Speech & Language Processing, Vol. 19, No. 7, pp. 1925–1937, September 2011
  2. Zhou, X., Huang, Q., Xie, L., Cox, S.J., A Two Layered Data Association Approach For Ball Tracking, Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Vancouver, 2013 (submitted)
  3. Huang, Q., Cox, S.J., Detection Of Anomalous Events In A Tennis Game Using Multimodal Information, Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Vancouver, 2013 (submitted)
  4. Huang, Q., Cox, S.J., Zhou, X., Xie, L., Detection of Ball Hits in a Tennis Game Using Audio and Visual Information, Proc. Asia-Pacific Signal and Information Processing Association (APSIPA), Hollywood, December 2012
  5. Huang, Q., Cox, S.J., Improved Audio Event Detection By Use Of Contextual Noise, Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Kyoto, 2012
  6. Huang, Q., Cox, S.J., Improved Detection of Ball Hit Events in a Tennis Game Using Multimodal Information, Proc. International Conference on Auditory-Visual Speech Processing (AVSP), 2011
  7. Huang, Q., Cox, S.J., Learning Score Structure from Spoken Language for A Tennis Game, Proc. 14th International Conference on Spoken Language Processing (Interspeech), Florence, August 2011
  8. Huang, Q., Cox, S.J., Shallow Parsing of a Tennis Game from Audio Events, Fourth International Conference on Intelligent Information Technology Application (IITA 2010), Qinhuangdao, China, November 5 - 7, 2010
  9. Huang, Q., Cox, S.J., Using High-level Information to Detect Key Audio Events in a Tennis Game, Proc. 13th International Conference on Spoken Language Processing (Interspeech), Makuhari, September 2010
  10. Huang, Q., Cox, S.J., Hierarchical Language Modeling for Audio Events Detection in a Sports Game, Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Dallas, 2010

Research Team

Dr. Qiang Huang, Prof. Stephen Cox