Module
CMPC3M03 - INFORMATION RETRIEVAL
- Module Code: CMPC3M03
- Department: Computing Sciences
- Credit Value: 20
- Level: 3
- Organiser: Mr. Stephen Cox
Lecture notes and materials will be made available via Blackboard; references to books, papers and other on-line resources are given in lectures.
Manning, C. D., Raghavan, P. and Schütze, H. (2008) Introduction to Information Retrieval, Cambridge University Press, ISBN-10: 0521865719
(There is a companion website and preliminary version of this book available).
- Levene, M. (2006) An Introduction to Search Engines and Web Navigation, Addison Wesley, ISBN: 0-321-30677-5
- Belew, R. K. (2000) Finding Out About: A Cognitive Perspective on Search Engine Technology and the WWW, CUP, ISBN: 0-521-63028-2
Submission:
Written coursework should be submitted by following the standard CMP practice. Students are advised to refer to the Guidelines and Hints on Written Work in CMP.
Deadlines:
If coursework is handed in after the deadline or an agreed extension, marks are deducted as follows:
| Work submitted | Marks deducted |
| --- | --- |
| After 15:00 on the due date and before 15:00 on the day following the due date | 10 marks |
| After 15:00 on the second day after the due date and before 15:00 on the third day after the due date | 20 marks |
| After 15:00 on the third day after the due date and before 15:00 on the 20th day after the due date | All the marks the work merits if submitted on time (i.e. no marks awarded) |
| After 20 working days | Work will not be marked and a mark of zero will be entered |
Saturdays and Sundays will NOT be taken into account for the purposes of calculation of marks deducted.
All extension requests will be managed through the LTS Hub. A request for an extension to a deadline for the submission of work for assessment should be submitted by the student to the appropriate Learning and Teaching Service Hub, prior to the deadline, on a University Extension Request Form accompanied by appropriate evidence. Extension requests will be considered by the appropriate Learning and Teaching Service Manager in those instances where (a) acceptable extenuating circumstances exist and (b) the request is submitted before the deadline. All other cases will be considered by a Coursework Coordinator in CMP.
For more details, including how to apply for an extension due to extenuating circumstances, download Submission for Work Assessment (PDF, 39KB).
Plagiarism:
Plagiarism is the copying or close paraphrasing of published or unpublished work, including the work of another student, without due acknowledgement. Plagiarism is regarded as a serious offence by the University, and all cases will be investigated. Possible consequences of plagiarism include deduction of marks and disciplinary action, as detailed by UEA's Policy on Plagiarism and Collusion.
Module specific:
- To introduce the classic vector-based and probabilistic models of information retrieval
- To survey some common techniques for improving information retrieval (e.g. relevance feedback, link analysis algorithms, recommender systems)
- To introduce natural language processing techniques that are used to improve the performance of information retrieval systems
- To describe the main issues in engineering scalable information retrieval systems
- To understand the main issues and measures used in evaluating the performance of information retrieval and document classification systems
- To give students practical experience of information retrieval systems, experimental work and evaluation
- To describe some of the principal approaches and issues in retrieving non-text information (e.g. music and video)
- To introduce more specialised topics related to information retrieval and natural language processing (e.g. information extraction, document classification)
Module specific:
On completion of this module students should achieve the following:
- Understanding and experience of a range of information retrieval techniques and models and their application, particularly in Web searching
- Understanding of how natural language processing techniques may be applied to information retrieval
- Appreciation of the main issues and techniques in multimedia information retrieval
- Appreciation of the achievements and limitations of current information retrieval approaches
Transferable skills:
On completion of this module students should achieve the following:
- Improved research and communications skills
- Improved programming skills
- Further experience of report-writing
Total hours: 40
Lectures: 20 hours (with provisional weekly schedule)
- The IR process
- Indexes and terms
- Vector space modelling
- Probabilistic models of IR
- Relevance, feedback, evaluation
- Text classification
- Web search
- Phrases, parts of speech and stemming
- Language modelling, parsing
- Audio and music retrieval
- Image and video retrieval
Workshops: 0 hours
Laboratory work: 20 hours (with provisional weekly schedule)
- MATLAB programming (if required)
- Indexing for IR systems
- Document ranking with the vector space model
- Probabilistic document ranking
- Evaluation
- N-gram language modelling
- Other topics depending on the coursework topic
Examination with Coursework or Project


