Anomaly detection is the process of identifying unusual patterns in events, observations, or data sets that do not conform to expected normal behaviour. This module gives learners a comprehensive introduction to the theory underpinning anomaly detection and equips them to apply a range of anomaly detection techniques (such as clustering and rule-based algorithms) to real-world problems such as fraud detection.
Learning Outcomes
On successful completion of this module the learner will be able to:
LO1
Apply statistical algorithms to anomaly detection for a specific application domain.
LO2
Compare the performance of a range of classification-based machine learning algorithms applied to anomaly detection problems.
LO3
Implement a clustering-based anomaly detection technique.
LO4
Develop an online model for anomaly detection over big-data streams.
Pre-requisite learning
Module Recommendations
This is prior learning (or a practical skill) that is strongly recommended before enrolment in this module. You may enrol in this module if you have not acquired the recommended learning but you will have considerable difficulty in passing (i.e. achieving the learning outcomes of) the module. While the prior learning is expressed as named CIT module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).
Incompatible Modules
These are modules which have learning outcomes that are too similar to the learning outcomes of this module. You may not earn additional credit for the same learning and therefore you may not enrol in this module if you have successfully completed any modules in the incompatible list.
Requirements
This is prior learning (or a practical skill) that is mandatory before enrolment in this module is allowed. You may not enrol on this module if you have not acquired the learning specified in this section.
No requirements listed
Module Content & Assessment
Indicative Content
Statistical Techniques
Overview and application of a range of parametric and non-parametric statistical techniques for anomaly detection such as change point detection, Gaussian mixture models and hidden Markov models.
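As an illustration of the change point detection topic listed above, the following is a minimal sketch of a two-sided CUSUM detector in plain Python. It is not taken from the module materials: the function name `cusum_change_point`, the parameters `baseline_size`, `drift` and `threshold`, and the sample data are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def cusum_change_point(series, baseline_size=20, drift=0.5, threshold=5.0):
    """Return the first index at which a two-sided CUSUM signals a mean shift.

    The first `baseline_size` values are assumed to be anomaly-free and are
    used to estimate the normal mean and standard deviation (an assumption
    made for this sketch). Returns None if no shift is signalled.
    """
    baseline = series[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance; cannot standardise")

    upper = 0.0  # accumulates evidence of an upward shift
    lower = 0.0  # accumulates evidence of a downward shift
    for i in range(baseline_size, len(series)):
        z = (series[i] - mu) / sigma
        upper = max(0.0, upper + z - drift)
        lower = max(0.0, lower - z - drift)
        if upper > threshold or lower > threshold:
            return i
    return None


# Illustrative data: a signal oscillating around 10 that jumps to around 13
# from index 30 onwards.
stable = [9.5 if i % 2 == 0 else 10.5 for i in range(30)]
shifted = [12.5 if i % 2 == 0 else 13.5 for i in range(10)]

print("change signalled at index:", cusum_change_point(stable + shifted))
print("no change case:", cusum_change_point(stable))
```

The `drift` term makes the detector ignore small fluctuations, while `threshold` trades detection delay against false alarms; both would need tuning for a real application domain.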
Classification Models
Anomaly detection using a range of relevant machine learning classification techniques such as neural networks, SVMs, rule-based algorithms, ensemble techniques, distance-based and density-based algorithms.
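To make the distance-based family mentioned above concrete, here is a small sketch that scores each point by the distance to its k-th nearest neighbour, so that isolated points receive high scores. The function name `knn_distance_scores`, the choice of `k` and the sample points are assumptions for illustration only, and the brute-force search is suitable only for small data sets.

```python
import math


def knn_distance_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbour.

    Larger scores indicate points that lie far from their neighbours.
    Brute force: O(n^2) distance computations.
    """
    if k < 1 or k >= len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Illustrative data: a tight cluster near the origin and one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (10.0, 10.0)]
scores = knn_distance_scores(points, k=2)

most_anomalous = scores.index(max(scores))
print("most anomalous point:", points[most_anomalous])
```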
Unsupervised Model and Evaluation
Application of unsupervised models such as LOF, COF, LOCI and CBLOF to anomaly detection problems. The role of dimensionality reduction techniques such as PCA and feature selection. Best practice evaluation techniques such as F1 scores and ROC curves.
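The evaluation measures named above can be sketched in a few lines of plain Python. The helper names `f1_score` and `roc_auc`, the labels, the scores and the 0.3 decision threshold are assumptions for this example; the AUC is computed through its rank interpretation (the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal point) rather than by integrating the ROC curve.

```python
def f1_score(y_true, y_pred):
    """F1 for the anomaly class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels (1 = anomaly) and anomaly scores from some detector.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.3 else 0 for s in scores]  # assumed decision threshold

print("F1:", f1_score(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

F1 depends on the chosen threshold, whereas ROC AUC summarises ranking quality across all thresholds, which is one reason both are commonly reported for anomaly detectors.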
Anomaly Detection at Scale
Implementation and deployment of a model for real-time anomaly detection in a big data environment, using Spark Streaming and MLlib, for a specific application domain.
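The core idea of an online model, updating its state one record at a time without storing the stream, can be sketched without any cluster. The example below is plain Python, not Spark Streaming or MLlib; it uses Welford's running mean and variance with a z-score rule. The class name `OnlineZScoreDetector`, its parameters and the sample stream are assumptions made for illustration.

```python
import math


class OnlineZScoreDetector:
    """Flags values that lie far from the running mean of the stream so far."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # observations to see before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Score x against the statistics seen so far, then fold x in.

        Returns True if x is flagged as anomalous.
        """
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True

        # Welford's update: constant memory, single pass.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


# Illustrative stream: readings near 10 with one spike at position 30.
stream = [10 + 0.1 * ((i % 5) - 2) for i in range(30)] + [25.0, 10.0]

detector = OnlineZScoreDetector()
flags = [detector.update(x) for x in stream]
print("flagged positions:", [i for i, f in enumerate(flags) if f])
```

In this simple version flagged values still update the running statistics; a production design might exclude them or use a decaying window so that the model adapts to drift, and in a distributed setting the same per-record update would run inside the streaming framework's stateful operators.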
Assessment Breakdown
Course Work: 100.00%
Course Work
| Assessment Type | Assessment Description | Outcome addressed | % of total | Assessment Date |
|---|---|---|---|---|
| Project | Perform a comparative analysis of a range of statistical techniques versus classification models to detect anomalies. Standard methodologies should be applied and the performance should be comprehensively evaluated. | 1,2 | 50.0 | Week 7 |
| Project | By employing appropriate research methods, the student is expected to apply unsupervised techniques to implement and deploy a model for real-time anomaly detection in a big data environment. | 3,4 | 50.0 | Week 13 |
No End of Module Formal Examination
Reassessment Requirement
Coursework Only: This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.
The institute reserves the right to alter the nature and timings of assessment.
Module Workload
Workload: Full Time
| Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload |
|---|---|---|---|---|
| Lecture | Delivers the concepts and theories underpinning the learning outcomes. | 2.0 | Every Week | 2.00 |
| Lab | Application of learning to case studies and project work. | 2.0 | Every Week | 2.00 |
| Independent & Directed Learning (Non-contact) | Student undertakes independent study. The student reads recommended papers and practices implementation. | 3.0 | Every Week | 3.00 |

Total Hours: 7.00
Total Weekly Learner Workload: 7.00
Total Weekly Contact Hours: 4.00
Workload: Part Time
| Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload |
|---|---|---|---|---|
| Lecture | Delivers the concepts and theories underpinning the learning outcomes. | 2.0 | Every Week | 2.00 |
| Lab | Application of learning to case studies and project work. | 2.0 | Every Week | 2.00 |
| Independent & Directed Learning (Non-contact) | Student undertakes independent study. The student reads recommended papers and practices implementation. | 3.0 | Every Week | 3.00 |

Total Hours: 7.00
Total Weekly Learner Workload: 7.00
Total Weekly Contact Hours: 4.00
Module Resources
Recommended Book Resources
Sumeet Dua, Xian Du 2011, Data Mining and Machine Learning in Cybersecurity, Auerbach Publications [ISBN: 978-143983942]
Supplementary Book Resources
Ted Dunning, Ellen Friedman 2013, Practical Machine Learning: A New Look at Anomaly Detection, 1 Ed., O'Reilly Media [ISBN: 978-149191160]
Recommended Article/Paper Resources
Varun Chandola, Arindam Banerjee, Vipin Kumar 2009, Anomaly detection: A survey, ACM Computing Surveys (CSUR)