Title:  Data Mining & Knowledge Discovery 
Long Title:  Data Mining & Knowledge Discovery 
Field of Study: 
Data Format

Valid From: 
Semester 2, 2014/15 (January 2015) 
Module Delivered in 
no programmes

Module Coordinator: 
David Goulding 
Module Author: 
AINE NI SHE 
Module Description: 
Data mining, the discovery of valuable patterns and knowledge within large amounts of data, has become a popular interdisciplinary subject in recent years. Since its conception in the early 1990s, the subject has received a huge amount of attention from the research community, the IT industry and beyond. In this module the learner will study a variety of data mining algorithms and models and will investigate how these can be used to solve various real-world problems. 
Learning Outcomes 
On successful completion of this module the learner will be able to: 
LO1 
Describe the concepts, principles, methods and techniques of data mining and knowledge discovery. 
LO2 
Apply appropriate data preprocessing and exploration techniques to specified data mining problems. 
LO3 
Design and implement appropriate data mining solutions for a specified data mining problem by using a suitable method, e.g. an algorithm, statistical technique, computer program or mathematical model. 
LO4 
Evaluate, select and interpret patterns and knowledge discovered as a result of applying data mining solutions to specified data mining problems. 
Prerequisite learning 
Module Recommendations
This is prior learning (or a practical skill) that is strongly recommended before enrolment in this module. You may enrol in this module if you have not acquired the recommended learning but you will have considerable difficulty in passing (i.e. achieving the learning outcomes of) the module. While the prior learning is expressed as named CIT module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s). 
No recommendations listed 
Incompatible Modules
These are modules which have learning outcomes that are too similar to the learning outcomes of this module. You may not earn additional credit for the same learning and therefore you may not enrol in this module if you have successfully completed any modules in the incompatible list. 
No incompatible modules listed 
Corequisite Modules

No Corequisite modules listed 
Requirements
This is prior learning (or a practical skill) that is mandatory before enrolment in this module is allowed. You may not enrol on this module if you have not acquired the learning specified in this section.

No requirements listed 
Corequisites

No corequisites listed 
Module Content & Assessment
Indicative Content 
Data Mining Overview
Background to data mining; Understanding the differences between data, information and knowledge; Objectives of data mining; Knowledge discovery in databases; Data mining applications: marketing, finance, banking, fraud detection, manufacturing, telecommunications, discovering knowledge on the Internet; Current state of data mining.

Principles of Data Mining
Data mining process/approaches, e.g. CRISP-DM, SEMMA; Categories of data mining problems; Evaluation and interpretation of output patterns.

Data Mining Model Functions
Investigate some of the following supervised and unsupervised techniques: classification, clustering, dependency modelling, sequence modelling, data summarisation, change and deviation analysis/anomaly detection. Matching the model function(s) to the data mining problem at hand.

Data Mining Model Representations
Using a data mining tool to mine the data, investigate some of the following data mining representations: decision trees and rules; neural networks; machine learning; case-based reasoning; data visualisation: clustering, hierarchies, self-organised networks, geopositioning/landscaping.

Interpretation & Refinement
Interpreting patterns, removing redundant patterns, translating patterns, refining the data mining process based on knowledge learned. Testing and validating the accuracy of the models using various techniques, e.g. simple split, k-fold cross-validation, bootstrapping.
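
As an illustration of two of the validation techniques named above, the following is a minimal sketch in Python using only the standard library. The function names `k_fold_indices` and `bootstrap_sample` are hypothetical and are not taken from the module materials or from any of the listed software packages; the sketch only produces index sets, on the assumption that a model would then be trained and scored on each one.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper for illustration: every sample is used for
    testing exactly once across the k folds.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def bootstrap_sample(n_samples, seed=0):
    """Return (in_bag, out_of_bag) indices for one bootstrap resample.

    The in-bag indices are drawn with replacement; the samples never
    drawn form the out-of-bag set, which can serve as a test set.
    """
    rng = random.Random(seed)
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


if __name__ == "__main__":
    for train, test in k_fold_indices(10, 3):
        print(len(train), "training samples,", len(test), "test samples")
    in_bag, out_of_bag = bootstrap_sample(10)
    print(len(in_bag), "in-bag draws,", len(out_of_bag), "out-of-bag samples")
```

A simple split corresponds to taking a single (train, test) pair; k-fold cross-validation averages the model's score over all k pairs, and bootstrapping repeats the resampling step with different seeds.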

Data Mining Software
Using data mining and forecasting software (e.g. SAS, RapidMiner, R, SPSS) to manipulate algorithms, build and test models for a variety of data sets.

Assessment Breakdown  % 
Course Work  50.00% 
End of Module Formal Examination  50.00% 
Course Work 
Assessment Type | Assessment Description | Outcome addressed | % of total | Assessment Date
Project | Design and implement an appropriate data mining solution for a specified data mining problem. | 2,3 | 25.0 | Week 8
Project | Evaluate, select and interpret patterns and knowledge discovered as a result of applying a data mining solution to a specified data mining problem. | 2,4 | 25.0 | Week 12
End of Module Formal Examination 
Assessment Type | Assessment Description | Outcome addressed | % of total | Assessment Date
Formal Exam | End-of-Semester Final Examination | 1,2,3,4 | 50.0 | End-of-Semester
Reassessment Requirement 
Repeat examination
Reassessment of this module will consist of a repeat examination. It is possible that there will also be a requirement to be reassessed in a coursework element.

The institute reserves the right to alter the nature and timings of assessment.
Module Workload
Workload: Full Time 
Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload
Lecture | Theory | 2.0 | Every Week | 2.00
Lab | Utilising data mining tools to apply theory covered in lectures | 2.0 | Every Week | 2.00
Independent & Directed Learning (Non-contact) | Application of theory to project | 3.0 | Every Week | 3.00
Total Hours: 7.00
Total Weekly Learner Workload: 7.00
Total Weekly Contact Hours: 4.00
Workload: Part Time 
Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload
Lecture | Theory | 2.0 | Every Week | 2.00
Lab | Utilising data mining tools to apply theory covered in lectures | 2.0 | Every Week | 2.00
Independent & Directed Learning (Non-contact) | Application of theory to project | 3.0 | Every Week | 3.00
Total Hours: 7.00
Total Weekly Learner Workload: 7.00
Total Weekly Contact Hours: 4.00
Module Resources
Recommended Book Resources 

 Jiawei Han, Micheline Kamber, Jian Pei, 2011, Data Mining: Concepts and Techniques, Third Edition [ISBN: 0123814790]
 Ian H. Witten, Eibe Frank, Mark A. Hall, 2011, Data Mining: Practical Machine Learning Tools and Techniques, Third Edition [ISBN: 0123748569]
 Supplementary Book Resources 

 Daniel T. Larose, 2004, Discovering Knowledge in Data: An Introduction to Data Mining [ISBN: 0471666572]
 Richard J. Roiger, Michael W. Geatz, Data Mining: A Tutorial-based Primer [ISBN: 0321223497]
 Andy Field, Jeremy Miles, 2010, Discovering Statistics Using SAS, 1st Ed. [ISBN: 9781849200929]
 This module does not have any article/paper resources 

Other Resources 

 Website: Pang-Ning Tan, 2006, Introduction to Data Mining
 