Biography

Website: https://www.uea.ac.uk/computing/speech-language-and-audio-processing

Follow this link for details of current PhD opportunities in Computing Sciences. But feel free to email me to discuss projects outside these areas and alternative sources of funding.

Career History

Stephen Cox trained first as a physicist and then as an electronic engineer, and began his career at the UK Government Communications Centre developing signal-processing algorithms. He then joined British Telecom's research laboratories to work on speech recognition, and spent two years at the speech research unit of the Royal Signals and Radar Establishment (now QinetiQ) at Malvern, where he researched the adaptation of speech recognition algorithms to new speakers. He returned to BT to lead a team of researchers developing speech recognition algorithms for use on the UK telephone network. He joined the School of Computing Sciences at UEA as a lecturer in 1991 and was appointed professor in 2003. His research interests include speech recognition, music processing, audio identification and automatic lip-reading, and he is the author or co-author of over 100 publications in these fields. He was an invited consultant at AT&T Bell Labs, New Jersey, in 1994, a visiting scientist at Nuance Communications Inc., CA, in 2000, and an invited researcher at Apple Inc., CA, in 2010. He has acted as a consultant and reviewer for several national governments as well as the European Commission, and also consults for industry. He is a senior member of the Institute of Electrical and Electronics Engineers and a former member of the IEEE Speech and Language Processing Technical Committee.

For a full list of my publications, most downloadable, go to http://www2.cmp.uea.ac.uk/~sjc/

 

All Publications

Howell, D., Cox, S., Theobald, B. (2016). Visual units and confusion modelling for automatic lip-reading. In Image and Vision Computing, 51, pp. 1-12. (Article)


Zhou, X., Xie, L., Huang, Q., Cox, S., Zhang, Y. (2015). Tennis Ball Tracking using a Two-Layered Data Association Approach. In IEEE Transactions on Multimedia, 17, pp. 145-156. (Article)


Bear, H. L., Cox, S., Harvey, R. (2015). Speaker-independent machine lip-reading with speaker-dependent viseme classifiers. (Paper)


Yan, F., Kittler, J., Windridge, D., Christmas, W., Mikolajczyk, K., Cox, S., Huang, Q. (2014). Automatic annotation of tennis games: An integration of audio, vision, and learning. In Image and Vision Computing, 32, pp. 896-903. (Article)


Najafian, M., de Marco, A., Cox, S., Russell, M. (2014). Supervised and Unsupervised Adaptation to Regional Accented Speech using Limited Data for Automatic Speech Recognition. (Conference contribution)


Zhou, X., Huang, Q., Xie, L., Cox, S. (2013). A two-layered data association approach for ball tracking. (Paper)


Howell, D., Theobald, B., Cox, S. (2013). Confusion modelling for automated lip-reading using weighted finite-state transducers. (Paper)


DeMarco, A., Cox, S. (2013). Native accent classification via I-Vectors and speaker compensation fusion. (Paper)


Cox, S., Harvey, R., Lan, Y., Bowden, R., Ong, E., Owen, G., Theobald, B. (2013). Recent developments in automated lip-reading. (Paper)


Newman, J., Cox, S. (2012). Language Identification Using Visual Features. In IEEE Transactions on Audio, Speech, and Language Processing, 20, pp. 1936-1947. (Article)


Zhou, X., Huang, Q., Xie, L., Cox, S. (2012). Detection of Ball Hits in a Tennis Game using audio and visual information. (Paper)


Huang, Q., Cox, S. (2012). Improved Audio Event Detection by use of Contextual Noise. (Paper)


Bowden, R., Cox, S., Harvey, R., Lan, Y., Ong, E., Owen, G., Theobald, B. (2012). Is automated conversion of video to text a reality? ISBN 9780819492876. (Chapter)


Demarco, A., Cox, S. (2012). Iterative Classification Of Regional British Accents In I-Vector Space. (Paper)


Demarco, A., Cox, S. (2011). An Accurate and Robust Gender Identification Algorithm. (Paper)


Huang, Q., Cox, S. (2011). Improved Detection of Ball Hit Events in a Tennis Game Using Multimodal Information. (Paper)


Huang, Q., Cox, S. (2011). Iterative Improvement of Speaker Segmentation using High-level Knowledge. (Paper)


Huang, Q., Cox, S. (2011). Learning Score Structure from Spoken Language for A Tennis Game. (Paper)


Huang, Q., Cox, S. (2010). Inferring the Structure of a Tennis Game using Audio Information. In IEEE Transactions on Audio, Speech, and Language Processing, 19, pp. 1925-1937. (Article)


Huang, Q., Cox, S. (2010). Using High-Level Information to Detect Key Audio Events in a Tennis Game. (Paper)


Key Research Interests

Stephen Cox is part of the Speech, Language and Audio Processing Group.

His principal research interest is in speech processing, especially automatic speech recognition. Current research projects are in the use of speaker adaptation for speech recognition, speech synthesis, confidence measures for speech recognisers and automatic routing of telephone enquiries. He is the author of over 60 papers in the field of speech processing.
 

Publications:

Caballero Morales, S.O. and Cox, S.J., Modelling Confusion-Matrices to Improve Speech Recognition Accuracy, with an Application to Dysarthric Speech. Proc. 10th International Conference on Spoken Language Processing (Interspeech), Antwerp, August 2007

Read, I. and Cox, S.J., Automatic Pitch Accent Prediction for Text-To-Speech Synthesis. Proc. 10th International Conference on Spoken Language Processing (Interspeech), Antwerp, August 2007

Cox, S.J., On Estimation of Speakers’ Confusion Matrices from Sparse Data. Proc. 11th International Conference on Spoken Language Processing (Interspeech), Brisbane, September 2008

Caballero Morales, S.O. and Cox, S.J., Application of Weighted Finite-State Transducers to Improve Recognition Accuracy for Dysarthric Speech. Proc. 11th International Conference on Spoken Language Processing (Interspeech), Brisbane, September 2008

Cox, S.J., Harvey, R., Lan, Y., Newman, J.L. and Theobald, B.J., The challenge of multispeaker lip-reading. Proc. International Conference on Auditory-Visual Speech Processing 2008, Tangalooma, Australia.

Theobald, B.J., Harvey, R., Cox, S.J., Lewis, C. and Owen, G.P., Lip-reading enhancement for law enforcement. Proc. SPIE conference on Optics and Photonics for Counterterrorism and Crime Fighting, G. Owen and C. Lewis, Eds., vol. 6402, September 2006, pp. 205–9.

Newman, J.L. and Cox, S.J., Automatic Visual-Only Language Identification: A Preliminary Study. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Taiwan, 2009.

Caballero Morales, S.O. and Cox, S.J., Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers. EURASIP Journal on Advances in Signal Processing. Volume 2009 (2009), Article ID 308340, 14 pages. doi:10.1155/2009/308340.

Caballero Morales, S.O. and Cox, S.J., On the Estimation and the Use of Confusion-Matrices for Improving ASR Accuracy. Proc. 12th International Conference on Spoken Language Processing (Interspeech), Brighton, September 2009.

Watkins, C. and Cox, S.J., Example-Based Speech Recognition using Formulaic Phrases. Proc. 12th International Conference on Spoken Language Processing (Interspeech), Brighton, September 2009.

Newman, J.L. and Cox, S.J., Speaker Independent Visual-Only Language Identification. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Dallas, 2010.

Huang, Q. and Cox, S.J., Hierarchical Language Modeling for Audio Events Detection in a Sports Game. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Dallas, 2010.

 

Selected Publications:

Read, I. and Cox, S. J., Stochastic and Syntactic Techniques for Predicting Phrase Breaks. Computer Speech and Language, Volume 21, Issue 3, Page(s) 519-542, 2007.

Huang, Q. and Cox, S. J., Task-Independent Call-Routing. Speech Communication, Volume 48, Issues 3-4, Page(s) 374-389, 2006.

Cox, S. J., Lincoln, M., Nakisa, M., Wells, M., Tutt, M. and Abbott, S., The Development and Evaluation of a Speech to Sign Translation System to Assist Transactions. Int. Journal of Human Computer Interaction, Volume 16, Issue 2, Page(s) 141-161, 2003.

Cox, S. J. and Dasmahapatra, S., High Level Approaches to Confidence Estimation in Speech Recognition. IEEE Transactions on Speech and Audio Processing, Volume 10, Issue 7, Page(s) 460-471, 2002.
 

External Activities and Indicators of Esteem

  • Member, IEEE Speech and Language Processing Technical Committee, 2006
  • Chairman, UK Institute of Acoustics Speech Group, 1998-2003
  • Keynote speech, AMI Workshop on “Multimodal Interaction and Related Machine Learning Algorithms”, Martigny, Switzerland, 2004

Key Responsibilities

Director of Graphics, Vision and Speech Language Laboratory
Head of Speech Language Group
Director of Research