Organisations such as companies, government bodies and research centres collect vast amounts of data on their customers, providers, employees, processes and the like. Storing and accessing large amounts of data is not as difficult as it used to be; the challenge nowadays is to use this data effectively. Organisations need to recognise patterns and trends within the data and to present this information to management in a manner that enables it to be properly exploited.
Data mining provides a set of techniques to unearth such patterns. The data mining researchers within the school have developed a methodology for using a range of procedures to deliver patterns of an appropriate standard and quality for management. Members of the group have also developed sophisticated techniques for data mining including novel approaches to rule induction, time series analysis, clustering and ensemble techniques. These techniques are rooted in artificial intelligence, information theory and statistics. As data becomes more diverse, encompassing diagrams, pictures, charts and sound, so data mining techniques need to become more diverse, and the group is now increasingly working with multimedia.
In recent years, world-wide interest in Knowledge Discovery and Data Mining (KDD) has soared. The idea that databases can be mined for interesting patterns has appealed to a wide range of organisations. Typical KDD projects may investigate customer behaviour, plan direct marketing, detect fraudulent activity or identify machine faults.
Organisations that aim to undertake KDD projects require considerable expert knowledge of both the data and the KDD methods so that high-quality, valid and interesting results can be obtained. In the years since KDD was introduced, the number of approaches, algorithms and software packages has grown rapidly, as has the size of the databases being collected.
Members of the laboratory have made significant contributions to techniques for data mining and KDD in the last 10 years, in particular: KDD methodologies; use of metaheuristics for rule and tree induction; all-rule induction; clustering techniques; feature subset selection; and feature construction. They have also worked on many applications in the financial services industry, medicine and telecommunications.
Current research on techniques focuses on multi-objective metaheuristic techniques, clustering methodologies, techniques for aggregated data mining, the use of genetic programming, methodologies for medical data mining and ensemble systems.
Applications can often generate new challenges. For example, medical data mining often combines conventional data mining (based on tabular data) with the mining of other structures (text, sound, images). The objective is to infer knowledge from multiple sources relating to a patient. Even collecting and collating the data is often a challenge in itself. Another challenge is handling and mining large time series. Many applications (medical and others) generate terabytes and petabytes of data on a regular basis. These need to be mined using both novel techniques and heavyweight computing resources.
The group has a research laboratory with access to a range of sophisticated data mining software. There are over 20 postgraduate students working on data mining.
The Data Mining group at UEA can be contacted at:
Data Mining Research Group,
c/o Prof. V J Rayward-Smith,
School of Computing Sciences
University of East Anglia
Norwich
NR4 7TJ, UK
Tel: +44 (0)1603 592850
Fax: +44 (0)1603 593345
Email: vjrs@uea.ac.uk

