Career History

Stephen Cox trained first as a physicist and then as an electronic engineer, and began his career at the UK Government Communications Centre developing signal-processing algorithms. He then joined British Telecom's research laboratories to work on speech recognition, and spent two years at the speech research unit of the Royal Signals and Radar Establishment (now QinetiQ) at Malvern, where he researched the adaptation of speech recognition algorithms to new speakers. He returned to BT to lead a team of researchers developing speech recognition algorithms for use on the UK telephone network. He joined the School of Computing Sciences at UEA as a lecturer in 1991 and was appointed professor in 2003. His research interests include speech recognition, music processing, audio identification and automatic lip-reading, and he is the author or co-author of over 100 publications in these fields. He was an invited consultant at AT&T Bell Labs, New Jersey, in 1994, a visiting scientist at Nuance Communications Inc., CA, in 2000, and an invited researcher at Apple Inc., CA, in 2010. He has acted as a consultant and reviewer for several national governments as well as the European Commission, and also consults for industry. He is a senior member of the Institute of Electrical and Electronics Engineers and a former member of the IEEE Speech and Language Technical Committee.

For a full list of his publications, most of them downloadable, go to http://www2.cmp.uea.ac.uk/~sjc/

 

All Publications


Zhou, X., Huang, Q., Xie, L., Cox, S. (2013) A two-layered data association approach for ball tracking. (Poster)

Howell, D., Theobald, B., Cox, S. (2013) Confusion modelling for automated lip-reading using weighted finite-state transducers. (Paper)

DeMarco, A., Cox, S. (2013) Native accent classification via I-Vectors and speaker compensation fusion. (Paper)

Newman, J., Cox, S. (2012) Language Identification Using Visual Features. In IEEE Transactions on Audio, Speech and Language Processing, 20, pp. 1936-1947. (Article)

Zhou, X., Huang, Q., Xie, L., Cox, S. (2012) Detection of Ball Hits in a Tennis Game using audio and visual information. (Poster)

DeMarco, A., Cox, S. (2012) Iterative classification of Regional British Accents via I-Vector Space. (Poster)

Huang, Q., Cox, S. (2012) Improved Audio Event Detection by use of Contextual Noise. (Poster)

Bowden, R., Cox, S., Harvey, R., Lan, Y., Ong, E., Owen, G., Theobald, B. (2012) Is automated conversion of video to text a reality? In: Proceedings of SPIE - The International Society for Optical Engineering. ISBN 9780819492876. (Chapter)

DeMarco, A., Cox, S. (2012) Iterative Classification Of Regional British Accents In I-Vector Space. (Paper)

DeMarco, A., Cox, S. (2011) An Accurate and Robust Gender Identification Algorithm. (Poster)

Huang, Q., Cox, S. (2011) Improved Detection of Ball Hit Events in a Tennis Game Using Multimodal Information. (Paper)

Huang, Q., Cox, S. (2011) Iterative Improvement of Speaker Segmentation using High-level Knowledge. (Paper)

Huang, Q., Cox, S. (2011) Learning Score Structure from Spoken Language for A Tennis Game. (Poster)

Huang, Q., Cox, S. (2010) Inferring the Structure of a Tennis Game using Audio Information. In IEEE Transactions on Audio, Speech, and Language Processing, 19, pp. 1925-1937. (Article)

Huang, Q., Cox, S. (2010) Using High-Level Information to Detect Key Audio Events in a Tennis Game. (Paper)

Huang, Q., Cox, S. (2010) Hierarchical language modeling for audio events detection in a sports game. (Paper)

Newman, J., Cox, S. (2010) Speaker independent visual-only language identification. (Paper)

West, K., Cox, S. (2010) Incorporating cultural representations of features into audio music similarity estimation. In IEEE Transactions on Audio, Speech, and Language Processing, 18, pp. 625-637. (Article)

Huang, Q., Cox, S. (2010) Shallow Parsing of a Tennis Game from Audio Events. (Paper)

Newman, J., Theobald, B., Cox, S. (2010) Limitations of Visual Speech Recognition. (Paper)


Key Research Interests

Stephen Cox is part of the Speech, Language and Audio Processing Group.

His principal research interest is in speech processing, especially automatic speech recognition. Current research projects are in the use of speaker adaptation for speech recognition, speech synthesis, confidence measures for speech recognisers and automatic routing of telephone enquiries. He is the author of over 60 papers in the field of speech processing.
 

Publications:

Caballero Morales, S.O. and Cox, S.J., Modelling Confusion-Matrices to Improve Speech Recognition Accuracy, with an Application to Dysarthric Speech. Proc. 10th International Conference on Spoken Language Processing (Interspeech), Antwerp, August 2007

Read, I. and Cox, S.J., Automatic Pitch Accent Prediction for Text-To-Speech Synthesis. Proc. 10th International Conference on Spoken Language Processing (Interspeech), Antwerp, August 2007

Cox, S.J., On Estimation of Speakers’ Confusion Matrices from Sparse Data. Proc. 11th International Conference on Spoken Language Processing (Interspeech), Brisbane, September 2008

Caballero Morales, S.O. and Cox, S.J., Application of Weighted Finite-State Transducers to Improve Recognition Accuracy for Dysarthric Speech. Proc. 11th International Conference on Spoken Language Processing (Interspeech), Brisbane, September 2008

Cox, S.J., Harvey, R., Lan, Y., Newman, J.L. and Theobald, B.J., The challenge of multispeaker lip-reading. Proc. International Conference on Auditory-Visual Speech Processing 2008, Tangalooma, Australia.

Theobald, B.J., Harvey, R., Cox, S.J., Lewis, C. and Owen, G.P., Lip-reading enhancement for law enforcement. Proc. SPIE conference on Optics and Photonics for Counterterrorism and Crime Fighting, G. Owen and C. Lewis, Eds., vol. 6402, September 2006, pp. 205–9.

Newman, J.L. and Cox, S.J., Automatic Visual-Only Language Identification: A Preliminary Study. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Taiwan, 2009.

Caballero Morales, S.O. and Cox, S.J., Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers. EURASIP Journal on Advances in Signal Processing. Volume 2009 (2009), Article ID 308340, 14 pages. doi:10.1155/2009/308340.

Caballero Morales, S.O. and Cox, S.J., On the Estimation and the Use of Confusion-Matrices for Improving ASR Accuracy. Proc. 12th International Conference on Spoken Language Processing (Interspeech), Brighton, September 2009.

Watkins, C. and Cox, S.J., Example-Based Speech Recognition using Formulaic Phrases. Proc. 12th International Conference on Spoken Language Processing (Interspeech), Brighton, September 2009.

Newman, J.L. and Cox, S.J., Speaker Independent Visual-Only Language Identification. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Dallas, 2010.

Huang, Q. and Cox, S.J., Hierarchical Language Modeling for Audio Events Detection in a Sports Game. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Dallas, 2010.

 

Selected Publications:

Read, I. and Cox, S. J., Stochastic and Syntactic Techniques for Predicting Phrase Breaks. Computer Speech and Language, Volume 21, Issue 3, Page(s) 519-542, 2007.

Huang, Q. and Cox, S. J., Task-Independent Call-Routing. Speech Communication, Volume 48, Issues 3-4, Page(s) 374-389, 2006.

Cox, S. J., Lincoln, M., Nakisa, M., Wells, M., Tutt, M. and Abbott, S., The Development and Evaluation of a Speech to Sign Translation System to Assist Transactions. Int. Journal of Human Computer Interaction, Volume 16, Issue 2, Page(s) 141-161, 2003.

Cox, S. J. and Dasmahapatra, S., High Level Approaches to Confidence Estimation in Speech Recognition. IEEE Transactions on Speech and Audio Processing, Volume 10, Issue 7, Page(s) 460-471, 2002.
 

External Activities and Indicators of Esteem

  • Member, IEEE Speech and Language Processing Technical Committee, 2006
  • Chairman, UK Institute of Acoustics Speech Group, 1998-2003
  • Keynote speech, AMI Workshop on “Multimodal Interaction and Related Machine Learning Algorithms”, Martigny, Switzerland, 2004

Key Responsibilities

Director of Graphics, Vision and Speech Language Laboratory
Head of Speech Language Group
Director of Research