
CMPSMI05 - AUDIO AND VISUAL PROCESSING

Module Code:
CMPSMI05
Department:
Computing Sciences
Credit Value:
20
Level:
M
Organiser:
Dr. Barry Theobald
This module examines computer processing of audio and visual signals. The theoretical basis of the analysis and processing of digital signals is explored in depth. This theory is then applied to the design of systems that achieve particular effects on signals representing audio and visual data. It also enables us to understand how one of the most significant technologies of recent times, the mobile telephone, is able to transmit speech at a low data rate (speech coding). Other application areas to be covered include speech recognition and synthesis, image coding, and audio-visual processing, including audio-visual synthesis (talking heads) and recognition (speech recognition augmented by lip-reading). This module assumes that students have studied sound, image or signal processing to at least undergraduate level 2.

In lectures, handouts will be distributed for material that is difficult or lengthy to copy from the board, e.g. derivations of formulae. These handouts will also be available on Blackboard. However, the handouts are not comprehensive, and students are expected to make their own notes from the lecturers' notes on the board. In workshops, students will be expected to tackle problems individually, with help available from the seminar leader and one other teacher. For some workshops, the class will read sections of a textbook beforehand and then analyse the material in the workshop. Laboratory work (MATLAB programming) will take place during timetabled laboratory periods using networked personal computers. CMP teaching laboratories running MATLAB are available to CMP students during term time outside timetabled teaching hours.


Required purchases

  • Meddins R. (2000), Introduction to Digital Signal Processing
  • Holmes W. (2002), Speech Synthesis and Recognition


Possible alternative purchases:

  • Gonzalez R. and Woods R. (2008), Digital Image Processing, Prentice Hall


Submission

Written coursework should be submitted following standard CMP practice. Students are advised to refer to the Guidelines and Hints on Written Work in CMP.

Deadlines

Coursework should be submitted before 23:59 on the deadline day. Paper copies can be submitted via the drop boxes in the LTS Hub up to 22:00, and there will be a ‘late box’ in the Library for submissions between 22:00 and midnight.

If coursework is handed in after the deadline day or an agreed extension:

Work submitted | Marks deducted
On the day following the due date | 10 marks
On either the 2nd or 3rd day after the due date | 20 marks
On the 4th day after the due date and before the 20th day after the due date | All the marks the work merits if submitted on time (i.e. no marks awarded)
After 20 working days | Work will not be marked and a mark of zero will be entered
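
As an illustration only, the deduction schedule above can be sketched in Python. The function name `late_mark` is hypothetical, and the sketch makes three assumptions that the schedule does not state: `days_late` counts days after the due date (0 meaning on time), a mark cannot fall below zero, and any submission from the 4th day onwards is recorded as zero. The official wording takes precedence over this sketch.

```python
def late_mark(merited_mark: int, days_late: int) -> int:
    """Return the mark recorded after the lateness deduction.

    merited_mark: the mark the work would earn if submitted on time.
    days_late: days after the due date (0 = on time). Assumption: the
    caller decides whether these are calendar or working days.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late == 0:
        deduction = 0
    elif days_late == 1:
        deduction = 10
    elif days_late <= 3:
        deduction = 20
    else:
        # From the 4th day all merited marks are deducted; after 20 working
        # days the work is not marked, which also records a mark of zero.
        deduction = merited_mark
    # Assumption: a deduction never takes the mark below zero.
    return max(0, merited_mark - deduction)
```

For example, under these assumptions work meriting 65 that is handed in two days late would be recorded as `late_mark(65, 2)`, i.e. 65 less the 20-mark deduction.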


All extension requests will be managed through the LTS Hub. A request for an extension to a deadline for the submission of work for assessment should be submitted by the student to the appropriate Learning and Teaching Service Hub, prior to the deadline, on a University Extension Request Form accompanied by appropriate evidence. Extension requests will be considered by the appropriate Learning and Teaching Service Manager in those instances where (a) acceptable extenuating circumstances exist and (b) the request is submitted before the deadline. All other cases will be considered by a Coursework Coordinator in CMP.

Plagiarism

Plagiarism is the copying or close paraphrasing of published or unpublished work, including the work of another student, without due acknowledgement. Plagiarism is regarded as a serious offence by the University, and all cases will be investigated. Possible consequences of plagiarism include deduction of marks and disciplinary action, as detailed in UEA's Policy on Plagiarism and Collusion.

 


Module specific:

  • Understanding of the effects of a sampled representation of a continuous signal
  • Understanding of basic digital signal processing techniques (DSP)
  • Understanding of the operation, use and meaning of the z-transform
  • Ability to design and compute responses of simple discrete systems to simple input signals using z-transform techniques.
  • Ability to apply DSP techniques to image filtering, using MATLAB
  • Appreciation and understanding of the techniques of orthogonal transformation as applied in image coding
  • Knowledge of the processes involved in speech production and the fundamentals of articulatory and acoustic phonetics
  • Understanding and analysis of the source-filter model of speech production and how it is applied in speech coding
  • An overview of the processes involved in speech recognition, appreciation of the problems that underlie each and an appreciation of how stochastic modelling can be used
  • Appreciation of the different approaches to speech synthesis and the advantages and disadvantages of each

Transferable skills:

  • Understanding of DSP and its application, which is applicable in a number of technical areas
  • Enhanced MATLAB programming skills
  • Skills in the manipulation of sound and image files
  • Enhanced problem-solving skills
  • Increased knowledge and appreciation of speech and language

Subject specific:

  • In-depth understanding of DSP and its application
  • Ability to design discrete-time filters and process signals with them
  • Ability to use DSP techniques to process audio and video signals for use in audio and video coding and recognition systems
  • Understanding of the important areas in image and speech technology and of the research issues in them

Total hours: 49

Lectures: 22 hours, Content (with provisional weekly topics)

  1. Introduction to module; review of complex numbers.
  2. DSP theory and practice 1 (Introduction to DSP, FIR and IIR structures)
  3. DSP theory and practice 2 (z-transform; step and impulse response)
  4. DSP theory and practice 3 (Sampling; z-plane representation; frequency response)
  5. DSP theory and practice 4 (Filter Design Approaches)
  6. DSP theory and practice 5 (DFT & FFT)
  7. Image processing I (2D FIR filters)
  8. Image processing II (compression)
  9. Image processing III (transforms)
  10. Image processing IV (wavelets)
  11. Introduction to the speech signal
  12. Source/filter model of speech production I
  13. Source/filter model of speech production II
  14. Speech recognition: the front end
  15. Speech recognition: stochastic methods and search methods
  16. Speech recognition: acoustic and language modelling
  17. Speech synthesis I
  18. Speech synthesis II
  19. Speech dialogue systems
  20. Audio-visual speech
  21. Revision I (image)
  22. Revision II (audio)

Workshops: 15 hours, Content (with provisional weekly schedule)

  1. Complex numbers I
  2. DSP exercises I
  3. Assignment I briefing
  4. DSP exercises II
  5. DSP exercises III
  6. Assignment II briefing
  7. Speech coding
  8. Speech recognition I
  9. Speech recognition II
  10. Speech synthesis

Laboratory Work: 12 hours, Content (with provisional weekly schedule)

  1. DSP 1
  2. Image processing 1
  3. Image processing 2
  4. Front-end speech processing
  5. Speech coding
  6. Vowel recognition

Coursework