
Faculty

Prof Stephen Cox

Job Title: Professor of Computing Science
Contact: S dot J dot Cox at uea dot ac dot uk, Tel: +44 (0)1603 59 2582
Location: Biology 2.20

Career

Stephen Cox trained first as a physicist and then as an electronic engineer, and began his career at the UK Government Communications Centre developing signal-processing algorithms. He then joined British Telecom's research laboratories to work on speech recognition, and spent two years at the speech research unit of the Royal Signals and Radar Establishment (now QinetiQ) at Malvern, where he researched the adaptation of speech recognition algorithms to new speakers. He returned to BT to lead a team of researchers developing speech recognition algorithms for use on the UK telephone network. He joined the School of Computing Sciences at UEA as a lecturer in 1991 and was appointed professor in 2003. His research interests include speech recognition, music processing, audio identification and automatic lip-reading, and he has authored or co-authored over 100 publications in these fields. He was an invited consultant at AT&T Bell Labs, New Jersey, in 1994, a visiting scientist at Nuance Communications Inc., CA, in 2000, and an invited researcher at Apple Inc., CA, in 2010. He has acted as a consultant and reviewer for several national governments as well as the European Commission, and also consults for industry. He is a senior member of the Institute of Electrical and Electronics Engineers and a former member of the IEEE Speech and Language Technical Committee.

For a full list of his publications, most of them downloadable, go to http://www2.cmp.uea.ac.uk/~sjc/

PhD Projects

The following opportunities exist to pursue a PhD with Prof. Cox:

Multimodal Understanding of Events

Reducing the Footprint and Increasing the Robustness of Speech Recognition Systems

Face-mapping for Lip Reading

Website

Key Research Interests

Stephen Cox is part of the Speech, Language and Music Group

His principal research interest is in speech processing, especially automatic speech recognition. Current research projects are in the use of speaker adaptation for speech recognition, speech synthesis, confidence measures for speech recognisers and automatic routing of telephone enquiries. He is the author of over 60 papers in the field of speech processing.
 
Publications:

Caballero Morales, S.O. and Cox, S.J., Modelling Confusion-Matrices to Improve Speech Recognition Accuracy, with an Application to Dysarthric Speech. Proc. 10th International Conference on Spoken Language Processing (Interspeech), Antwerp, August 2007

Read, I. and Cox, S.J., Automatic Pitch Accent Prediction for Text-To-Speech Synthesis. Proc. 10th International Conference on Spoken Language Processing (Interspeech), Antwerp, August 2007

Cox, S.J., On Estimation of Speakers’ Confusion Matrices from Sparse Data. Proc. 11th International Conference on Spoken Language Processing (Interspeech), Brisbane, September 2008

Caballero Morales, S.O. and Cox, S.J., Application of Weighted Finite-State Transducers to Improve Recognition Accuracy for Dysarthric Speech. Proc. 11th International Conference on Spoken Language Processing (Interspeech), Brisbane, September 2008

Cox, S.J., Harvey, R., Lan, Y., Newman, J.L. and Theobald, B.J., The challenge of multispeaker lip-reading. Proc. International Conference on Auditory-Visual Speech Processing 2008, Tangalooma, Australia.

Theobald, B.J., Harvey, R., Cox, S.J., Lewis, C. and Owen, G.P., Lip-reading enhancement for law enforcement. Proc. SPIE conference on Optics and Photonics for Counterterrorism and Crime Fighting, G. Owen and C. Lewis, Eds., vol. 6402, September 2006, pp. 205–9.

Newman, J.L. and Cox, S.J., Automatic Visual-Only Language Identification: A Preliminary Study. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Taiwan, 2009.

Caballero Morales, S.O. and Cox, S.J., Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers. EURASIP Journal on Advances in Signal Processing. Volume 2009 (2009), Article ID 308340, 14 pages. doi:10.1155/2009/308340.

Caballero Morales, S.O. and Cox, S.J., On the Estimation and the Use of Confusion-Matrices for Improving ASR Accuracy. Proc. 12th International Conference on Spoken Language Processing (Interspeech), Brighton, September 2009.

Watkins, C. and Cox, S.J., Example-Based Speech Recognition using Formulaic Phrases. Proc. 12th International Conference on Spoken Language Processing (Interspeech), Brighton, September 2009.

Newman, J.L. and Cox, S.J., Speaker Independent Visual-Only Language Identification. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Dallas, 2010.

Huang, Q. and Cox, S.J., Hierarchical Language Modeling for Audio Events Detection in a Sports Game. Proc. IEEE Conf. on Acoustics, Speech and Signal Processing, Dallas, 2010.

 

Selected Publications:

Read, I. and Cox, S. J., Stochastic and Syntactic Techniques for Predicting Phrase Breaks. Computer Speech and Language, Volume 21, Issue 3, Page(s) 519-542, 2007.

Huang, Q. and Cox, S. J., Task-Independent Call-Routing. Speech Communication, Volume 48, Issues 3-4, Page(s) 374-389, 2006.

Cox, S. J., Lincoln, M., Nakisa, M., Wells, M., Tutt, M. and Abbott, S., The Development and Evaluation of a Speech to Sign Translation System to Assist Transactions. Int. Journal of Human Computer Interaction, Volume 16, Issue 2, Page(s) 141-161, 2003.

Cox, S. J. and Dasmahapatra, S., High Level Approaches to Confidence Estimation in Speech Recognition. IEEE Transactions on Speech and Audio, Volume 10, Issue 7, Page(s) 460-471, 2002.
 




Past Research Projects and Grants

Project Title | Start Date | End Date | Funding Body | Project Members
LILiR2 Language Independent Lip Reading | 31/5/2007 | 29/9/2010 | EPSRC | Richard Harvey, Stephen Cox, Barry Theobald
Classification of anaesthesia breath sounds (COABS) -AU076 | 1/2/2003 | 1/10/2003 | Ipswich NHS | N/A
Essential sign language | 1/10/2002 | 6/9/2004 | CEC (Framework 1-5) | Ralph Elliott, Stephen Cox, John Glauert
Annotation of speech database (Nuance 3) | 1/4/2002 | 30/6/2002 | Nuance Communications Inc. | Stephen Cox, Ben Milner
VISTA - Virtual interface for a set-top box agent (LINK) | 12/3/2002 | 11/9/2003 | ESRC | Stephen Cox, Caroline Rose, Hugh Graham, Nicholas Wilkinson
Annotation of speech database | 14/1/2002 | 22/2/2002 | Nuance Communications Inc. | Stephen Cox, Ben Milner
Annotation of a large speech database | 1/6/2001 | 11/1/2002 | Nuance Communications Inc. | Stephen Cox, Ben Milner, Gavin Cawley
Telephone speech recognition prototype for deaf users | 31/3/2001 | 31/8/2002 | RNID | N/A
Virtual Signing: Capture, Animation, Storage and Transmission (VISICAST) | 1/1/2000 | 31/12/2002 | CEC (Framework 1-5) | Ralph Elliott, Stephen Cox, Andrew Bangham, John Glauert
Virtual signing, capture, animation storage and transmission | 1/1/2000 | 31/12/2002 | Post Office Counters Ltd | N/A
Confidence measures for speech recognition | 16/2/1998 | 15/8/2001 | EPSRC | Stephen Cox, Gavin Cawley

Number of items: 61.

Article

Huang, Q and Cox, S (2010) Inferring the Structure of a Tennis Game using Audio Information. IEEE Transactions on Audio, Speech, and Language Processing, 19 (7). pp. 1925-1937. ISSN 1558-7916

West, K. and Cox, S. (2010) Incorporating cultural representations of features into audio music similarity estimation. IEEE Transactions on Audio, Speech, and Language Processing, 18 (3). pp. 625-637. ISSN 1558-7916

Caballero Morales, Santiago-Omar and Cox, Stephen (2009) Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers. EURASIP Journal on Advances in Signal Processing, 2009, Article ID 308340. ISSN 1687-6172

Read, Ian and Cox, Stephen J. (2007) Stochastic and Syntactic Techniques for Predicting Phrase Breaks. Computer Speech and Language, 21 (3). pp. 519-542. ISSN 0885-2308

Glauert, J. R. W., Elliott, R., Cox, S. J., Tryggvason, J. and Sheard, M. (2006) VANESSA - A system for communication between deaf and hearing people. Technology and Disability, 18 (4). pp. 207-216. ISSN 1055-4181

Huang, Q. and Cox, S.J. (2006) Task-Independent Call-Routing. Speech Communication, 48 (3-4). pp. 374-389. ISSN 0167-6393

Cox, S. J. and Vinagre, L. (2004) Modelling of Confusions in Aircraft Call Signs. Speech Communication, 42 (3-4). pp. 289-312. ISSN 0167-6393

Wray, A., Cox, S. J., Lincoln, M. and Tryggvason, J. (2004) A formulaic approach to translation at the post office: reading the signs. Language and Communication, 24 (1). pp. 59-75. ISSN 0271-5309

Cox, S. J., Lincoln, M., Tryggvason, J., Nakisa, M., Wells, M., Tutt, M. and Abbott, S. (2003) The Development and Evaluation of a Speech to Sign Translation System to Assist Transactions. International Journal of Human-Computer Interaction, 16 (2). pp. 141-161. ISSN 1044-7318

Cox, S. J. and Dasmahapatra, S. (2002) High-Level Approaches to Confidence Estimation in Speech Recognition. IEEE Transactions on Speech and Audio Processing, 10 (7). pp. 460-471. ISSN 1063-6676

Cox, S. J., Marshall, I. and Safar, E. (2002) What are the difficulties of translating English to Sign language? The Linguist, 41 (1). pp. 6-10. ISSN 0268-5965

Matthews, I, Cootes, TF, Bangham, JA, Cox, SJ and Harvey, RW (2002) Extraction of Visual Features for Lipreading. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24 (2). pp. 198-213. ISSN 0162-8828

Theobald, Barry, Cox, Stephen, Cawley, Gavin and Milner, Ben (1999) Fast Method of Channel Equalisation for Speech Signals and its Implementation on a DSP. IEE Electronics Letters, 35 (16). pp. 1309-1311. ISSN 0013-5194

Cox, Stephen (1995) Predictive Speaker Adaptation in Speech recognition. Computer Speech and Language, 9 (1). pp. 1-17.

Cox, Stephen J. (1992) Speaker adaptation in Speech Recognition using Linear Regression Techniques. Electronics Letters, 28 (22). pp. 2093-2094. ISSN 0013-5194

Monograph

Linford, PW, Tattersall, GD, Cox, SJ and Rush, SA (1992) Matching Algorithms to Hardware. Research Report. UEA/BT.

Conference or Workshop Item

DeMarco, A and Cox, SJ (2011) An Accurate and Robust Gender Identification Algorithm. In: 14th International Conference on Spoken Language Processing (Interspeech), August 2011, Florence.

Huang, Q and Cox, SJ (2011) Iterative Improvement of Speaker Segmentation using High-level Knowledge. In: 14th International Conference on Spoken Language Processing (Interspeech), August 2011, Florence.

Huang, Q and Cox, SJ (2011) Learning Score Structure from Spoken Language for A Tennis Game. In: 14th International Conference on Spoken Language Processing (Interspeech), August 2011, Florence.

Huang, Q and Cox, SJ (2011) Improved Detection of Ball Hit Events in a Tennis Game Using Multimodal Information. In: International Conference on Auditory-visual Speech Processing, 2011, Volterra.

Huang, Qiang and Cox, Stephen (2010) Using High-Level Information to Detect Key Audio Events in a Tennis Game. In: 11th Annual Conference of the International Speech Communication Association (INTERSPEECH), September 26-30, 2010, Makuhari, Chiba, Japan.

Huang, Qiang and Cox, Stephen (2010) Hierarchical language modeling for audio events detection in a sports game. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, 14-19 March 2010, Dallas, TX.

Newman, Jacob and Cox, Stephen (2010) Speaker independent visual-only language identification. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, 14-19 March 2010, Dallas, TX.

Huang, Q and Cox, S (2010) Shallow Parsing of a Tennis Game from Audio Events. In: Fourth International Conference on Intelligent Information Technology Application (IITA 2010), November 5 - 7, 2010, Qinhuangdao, China. (In Press)

Newman, Jacob, Theobald, Barry and Cox, Stephen (2010) Limitations of Visual Speech Recognition. In: Proceedings of the International Conference on Auditory-Visual Speech Processing, Hakone, Kanagawa, Japan.

Caballero Morales, Omar and Cox, Stephen (2009) On the Estimation and the Use of Confusion-Matrices for Improving ASR Accuracy. In: 10th Annual Conference of the International Speech Communication Association (INTERSPEECH), September 6-10, 2009, Brighton, United Kingdom.

Watkins, Christopher and Cox, Stephen (2009) Example-Based Speech Recognition Using Formulaic Phrases. In: 10th Annual Conference of the International Speech Communication Association (INTERSPEECH), September 6-10, 2009, Brighton, United Kingdom.

Newman, Jacob and Cox, Stephen (2009) Automatic visual-only language identification: A preliminary study. In: IEEE International Conference on Acoustics, Speech and Signal Processing, 19-24 April 2009, Taipei, Taiwan.

Caballero Morales, Omar and Cox, Stephen (2008) Application of Weighted Finite-State Transducers to Improve Recognition Accuracy for Dysarthric Speech. In: 9th Annual Conference of the International Speech Communication Association (INTERSPEECH), September 22-26, 2008, Brisbane, Australia.

Cox, Stephen (2008) On Estimation of a Speaker's Confusion Matrix from Sparse Data. In: 9th Annual Conference of the International Speech Communication Association (INTERSPEECH), September 22-26, 2008, Brisbane, Australia.

Cox, Stephen, Harvey, Richard, Lan, Yuxuan, Newman, Jacob and Theobald, Barry (2008) The Challenge of Multispeaker Lip-Reading. In: International Conference on Auditory-Visual Speech Processing, September 26-29, 2008, Queensland, Australia.

Caballero-Morales, Omar and Cox, Stephen (2007) Modelling Confusion-Matrices to Improve Speech Recognition Accuracy, with an Application to Dysarthric Speech. In: 8th Annual Conference of the International Speech Communication Association (Interspeech), August 27-31, 2007, Antwerp, Belgium.

Jenkins, Marie-Claire, Churchill, Richard, Cox, Stephen J. and Smith, Dan J. (2007) Analysis of user interaction with service oriented chatbot systems. In: Proceedings of the 12th international conference on Human-computer interaction: intelligent multimodal interaction environments, Beijing, China.

Read, Ian and Cox, Stephen J. (2007) Automatic Pitch Accent Prediction for Text-To-Speech Synthesis. In: 8th Annual Conference of the International Speech Communication Association (INTERSPEECH-2007), August 27-31, 2007, Antwerp, Belgium.

Theobald, Barry, Harvey, Richard, Cox, Stephen, Lewis, Colin and Owens, Gari (2006) Lip-reading enhancement for law enforcement. In: Proceedings of SPIE 6402 Conference on Optics and Photonics for Counterterrorism and Crime Fighting, 11 September 2006, Stockholm, Sweden.

West, K., Cox, S. J. and Lamere, P. (2006) Incorporating Machine Learning into Music Similarity Estimation. In: Proceedings of the 1st ACM workshop on Audio and music computing multimedia, October 23 - 27, 2006, Santa Barbara, California, USA.

Cox, S. J. (2005) A Discriminative Approach to Phrase Break Modelling. In: Proceedings of 9th European Conference on Speech Communication and Technology (INTERSPEECH'05), September 4-8, 2005, Lisbon, Portugal.

West, K. and Cox, S. J. (2005) Finding an Optimal Segmentation for Audio Genre Classification. In: Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005), 11 - 15 September 2005, London, UK.

Read, Ian and Cox, Stephen J. (2005) Stochastic and Syntactic Techniques for Predicting Phrase Breaks. In: Proceedings of the 9th European Conference on Speech Communication and Technology (INTERSPEECH-2005), September 4-8, 2005, Lisbon, Portugal.

Huang, Q. and Cox, S. J. (2004) Mixture Language Models for Call Routing. In: 8th International Conference on Spoken Language Processing (Interspeech 2004), October 4-8, 2004, Jeju Island, Korea.

Read, I. and Cox, S. J. (2004) Using Part-Of-Speech Tags for Predicting Phrase Breaks. In: 8th International Conference on Spoken Language Processing (Interspeech 2004), October 4-8, 2004, Jeju Island, Korea.

Huang, Q. and Cox, S. J. (2004) Automatic Call Routing with Multiple Language Models. In: Proceedings of the HLT-NAACL 2004 Workshop on Spoken Language Understanding for Conversational Systems and Higher Level Linguistic Information for Speech Processing, 2-7 May 2004, Boston, Massachusetts, USA.

Huang, Q. and Cox, S. J. (2004) Improving Phoneme Recognition of Telephone Quality Speech. In: IEEE International Conference on Acoustics, Speech and Signal Processing, 17-21 May 2004.

Cox, S. J. (2004) Using Context to Correct Phone Recognition Errors. In: 8th International Conference on Spoken Language Processing (Interspeech 2004), October 4-8, 2004, Jeju Island, Korea.

Tan, L., Cox, S. J., Bell, G. D. and Mansfield, M. (2004) Can we modify existing automatic speech recognition technology to reliably and safely monitor respiration in patients sedated with Propofol? In: British Society of Gastroenterology Annual Meeting, 21 to 24 March, 2004, Scottish Exhibition and Conference Centre, Glasgow.

West, K. and Cox, S. J. (2004) Features and Classifiers for the Automatic Classification of Musical Audio Signals. In: Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR 2004), 10-15 October 2004, Barcelona, Spain.

Cox, Stephen J. and Cawley, Gavin C. (2003) The Use of Confidence Measures in Vector Based Call-Routing. In: 8th European Conference on Speech Communication and Technology (EUROSPEECH 2003), September 1-4, 2003, Geneva, Switzerland.

Huang, Qiang and Cox, Stephen J. (2003) Automatic Call-Routing Without Transcriptions. In: Proceedings of the 8th European Conference on Speech Communication and Technology (EUROSPEECH 2003), September 1-4, 2003, Geneva, Switzerland.

Lambert, T., Breen, Andrew P., Eggleton, Barry, Cox, Stephen J. and Milner, Ben P. (2003) Unit Selection in Concatenative TTS Synthesis Systems Based on Mel Filter Bank Amplitudes and Phonetic Context. In: Proceedings of the 8th European Conference on Speech Communication and Technology (EUROSPEECH 2003), September 1-4, 2003, Geneva, Switzerland.

Shao, Xu, Milner, Ben P. and Cox, Stephen J. (2003) Integrated Pitch and MFCC Extraction for Speech Reconstruction and Speech Recognition Applications. In: Eurospeech-2003 — 8th European Conference on Speech Communication and Technology, September 1-4, 2003, Geneva, Switzerland.

Cox, S. J. (2003) Discriminative Techniques in Call Routing. In: IEEE Conference on Acoustics, Speech and Signal Processing (ICASSP '03), 6-10 April 2003, Hong Kong.

Lincoln, M. and Cox, S. J. (2003) A Comparison of Language Processing Techniques for a Constrained Speech Translation System. In: IEEE Conference on Acoustics, Speech and Signal Processing (ICASSP '03), 6-10 April 2003, Hong Kong.

Cox, Stephen J. (2002) Speech and Language Processing for a Constrained Speech Translation System. In: 7th International Conference on Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA.

Cox, Stephen J., Lincoln, Michael, Tryggvason, Judy, Nakisa, Melanie, Wells, Mark, Tutt, Marcus and Abbott, Sanja (2002) TESSA, a system to aid communication with deaf people. In: ASSETS 2002, Fifth International ACM SIGCAPH Conference on Assistive Technologies, 8-10th July 2002, Edinburgh, Scotland.

Cox, S. J. and Shahshahani, B (2001) A Comparison of some Different Techniques for Vector Based Call-Routing. In: Proceedings of the 7th European Conference on Speech Communication and Technology (EUROSPEECH-2001), September 3-7, 2001, Aalborg, Denmark.

Cox, S.J. and Shahshahani, B (2001) Improved Techniques for automatic call-routing. In: Institute of Acoustics Workshop on Innovation in Speech Processing (WISP-2001), 2-3 April 2001, Stratford-upon-Avon, UK.

Cox, S. J. and Dasmahapatra, S. (2000) A Semantically-Based Confidence Measure for Speech Recognition. In: Sixth International Conference on Spoken Language Processing (ICSLP 2000), October 16-20, 2000, Beijing, China.

Cox, Stephen (2000) Speaker Normalisation in the MFCC Domain. In: Sixth International Conference on Spoken Language Processing (ICSLP 2000), October 16-20, 2000, Beijing, China.

Dasmahapatra, S. and Cox, S. J. (2000) Meta-Models for Confidence Estimation in Speech Recognition. In: IEEE Conference on Acoustics, Speech and Signal Processing (ICASSP '00), 5-9 June 2000, Istanbul, Turkey.

Bangham, JA, Cox, SJ, Elliott, R, Glauert, JRW, Marshall, I, Rankov, S and Wells, M (2000) Virtual Signing: Capture, Animation, Storage and Transmission - an overview of the ViSiCAST Project. In: IEE Colloquium on Speech and Language Processing for the Disabled and Elderly, 4 April 2000, London, UK.

Bangham, JA, Cox, SJ, Lincoln, M, Marshall, I, Tutt, M and Wells, M (2000) Signing for the deaf using virtual humans. In: IEE Colloquium on Speech and Language processing for Disabled and Elderly.

This list was generated on Sat Apr 6 09:25:52 2013 BST.

External Activities and Indicators of Esteem

  • Member IEEE Speech and Language Processing Technical Committee, 2006
  • Chairman, UK Institute of Acoustics Speech Group, 1998-2003
  • Keynote speech, AMI Workshop on “Multimodal Interaction and Related Machine Learning Algorithms”, Martigny, Switzerland, 2004

Key Responsibilities

Director of Graphics, Vision and Speech Language Laboratory
Head of Speech Language Group
Director of Research