
DATA8001 - Data Science and Analytics

Title: Data Science and Analytics
Long Title: Data Science and Analytics
Module Code: DATA8001
Duration: 1 Semester
Credits: 5
NFQ Level: Advanced
Field of Study: Data Format
Valid From: Semester 1 - 2018/19 (September 2018)
Module Delivered in 2 programme(s)
Module Coordinator: David Goulding
Module Author: Aengus Daly
Module Description: This module will provide the learner with an overview of the important themes in the growing field of data science and analytics. The learner will study the established methods and technologies and also investigate new and emerging trends. Emphasis will be placed on statistical theory, mathematical algorithmic design and modelling concepts. The context and use of data analytics in real-world settings will be investigated, with topics such as data privacy, data security and ethics. Data analytics/mining software, e.g. R, will be used in both the lectures and labs.
Learning Outcomes
On successful completion of this module the learner will be able to:
LO1 Describe the field of data science and analytics, its concepts, technologies and historical roots. Give a detailed overview of the main approaches to developing a data analytics/mining project lifecycle.
LO2 Perform exploratory data analysis using data science/mining software packages.
LO3 Find patterns and solutions within a data set using data mining and/or statistical modelling techniques.
LO4 Describe a number of data mining and business intelligence concepts and techniques.
LO5 Develop a deep understanding of data protection, data privacy and other ethical issues.
Pre-requisite learning
Module Recommendations

This is prior learning (or a practical skill) that is strongly recommended before enrolment in this module. You may enrol in this module if you have not acquired the recommended learning but you will have considerable difficulty in passing (i.e. achieving the learning outcomes of) the module. While the prior learning is expressed as named MTU module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).

Incompatible Modules
These are modules which have learning outcomes that are too similar to the learning outcomes of this module. You may not earn additional credit for the same learning and therefore you may not enrol in this module if you have successfully completed any modules in the incompatible list.
No incompatible modules listed
Co-requisite Modules
No Co-requisite modules listed
Requirements

This is prior learning (or a practical skill) that is mandatory before enrolment in this module is allowed. You may not enrol in this module if you have not acquired the learning specified in this section.

No requirements listed

Module Content & Assessment

Indicative Content
Investigate the data science and analytics landscape, its historical development, terminology and technologies; big data concepts, structured and unstructured data types.
Data analytics project life cycle
Use of the CRISP-DM framework to manage a data analytics project with its variety of actors and challenges. Investigate case studies in the field, covering a variety of approaches and technologies, including successes, failures, new developments and unusual applications of analytics.
Data quality, pre-processing and EDA
Cleaning/scrubbing data techniques, ETL (Extract, Transform, Load) systems and methods; data pre-processing: zero variance, dummy variables, correlations, linear dependencies. Use of exploratory data analysis, summary statistics, plots and visualisations.
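As an illustration only, the pre-processing checks named above (zero variance, dummy variables, correlations, linear dependencies) can be sketched in a few lines. The module itself uses R; this sketch uses plain Python with the standard library, and the toy table, column names and helper functions are all hypothetical, not taken from the module materials.

```python
from math import sqrt
from statistics import mean, pvariance


def zero_variance_columns(table):
    """Return the names of numeric columns whose values never change."""
    return [
        name
        for name, values in table.items()
        if all(isinstance(v, (int, float)) for v in values)
        and pvariance(values) == 0
    ]


def dummy_variables(values):
    """Encode a categorical column as 0/1 indicator columns.

    The first level (alphabetically) is dropped so that the indicators
    are not linearly dependent on an intercept term.
    """
    levels = sorted(set(values))
    return {
        f"is_{level}": [1 if v == level else 0 for v in values]
        for level in levels[1:]
    }


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric columns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy data: "income" is exactly 2 * "age" (a linear dependency)
# and "constant" carries no information (zero variance).
data = {
    "age": [23, 35, 41, 52],
    "income": [46, 70, 82, 104],
    "constant": [1, 1, 1, 1],
    "region": ["north", "south", "north", "west"],
}

print(zero_variance_columns(data))
print(dummy_variables(data["region"]))
print(pearson(data["age"], data["income"]))
```

A correlation of magnitude close to 1 between two predictors, as between the two numeric columns here, is one simple signal of a linear dependency worth removing before modelling.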
Data analytical techniques
Overview of data mining, regression and classification, pattern recognition, anomaly detection and visualisation techniques. Investigate how these techniques are used in a real-world setting, e.g. profit-testing scenarios, key performance indicators (KPIs), dashboards and balanced scorecards.
Data science and analytics theory
Statistics, sampling theory, MLE (Maximum Likelihood Estimation), overview of statistical learning theory, algorithmic design; characteristics, strengths and weaknesses of models; decision trees and ensemble techniques, e.g. random forests; discuss testing and validation of models.
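To make the MLE topic concrete, here is a minimal sketch, assuming independent observations from a normal distribution. Under that assumption the maximum likelihood estimates have a closed form: the sample mean, and the variance computed with divisor n rather than n - 1. The sample values and function names are hypothetical, and Python is used for illustration although the module works in R.

```python
from math import log, pi


def normal_mle(sample):
    """Closed-form maximum likelihood estimates (mean, variance) for a normal model."""
    n = len(sample)
    mu = sum(sample) / n
    var = sum((x - mu) ** 2 for x in sample) / n  # divisor n, not n - 1
    return mu, var


def normal_log_likelihood(sample, mu, var):
    """Log-likelihood of the sample under a normal model with the given mean and variance."""
    n = len(sample)
    squared_error = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * log(2 * pi * var) - squared_error / (2 * var)


# Hypothetical sample, chosen so the estimates are easy to check by hand.
sample = [4.0, 5.0, 7.0, 8.0]
mu_hat, var_hat = normal_mle(sample)

# The log-likelihood at the estimates should be at least as high as at
# any nearby parameter values.
best = normal_log_likelihood(sample, mu_hat, var_hat)
nearby = normal_log_likelihood(sample, mu_hat + 0.5, var_hat)
print(mu_hat, var_hat, best > nearby)
```

Comparing the log-likelihood at the estimates against nearby parameter values is a quick sanity check that the closed form does sit at a maximum.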
Data analytics techniques and software technologies
Introduction to various data analytics techniques, methods and predictive models. Explore how to load data and carry out initial data exploration. Use a variety of data analytics technologies, e.g. R and Excel.
Technical report writing
Investigate how to write a technical report - structure and narrative of documents, referencing, bibliography and awareness of expected audience.
Ethics, data privacy and security
Investigate ethics, data privacy, security, data protection legislation, including GDPR and related topics in data governance.
Assessment Breakdown | %
Course Work | 40.00%
End of Module Formal Examination | 60.00%
Course Work
Assessment Type | Assessment Description | Outcome addressed | % of total | Assessment Date
Project | Solve a data analytics problem using R or similar data analytics software and produce a report. | 2,3,4 | 40.0 | Week 9
End of Module Formal Examination
Assessment Type | Assessment Description | Outcome addressed | % of total | Assessment Date
Formal Exam | End-of-Semester Final Examination | 1,4,5 | 60.0 | End-of-Semester
Reassessment Requirement
Repeat examination
Reassessment of this module will consist of a repeat examination. It is possible that there will also be a requirement to be reassessed in a coursework element.

The institute reserves the right to alter the nature and timings of assessment.


Module Workload

Workload: Full Time
Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload
Lecture | Theory and Case Studies | 2.0 | Every Week | 2.00
Lab | Computer-based lab | 2.0 | Every Week | 2.00
Independent & Directed Learning (Non-contact) | Independent Study | 3.0 | Every Week | 3.00
Total Hours 7.00
Total Weekly Learner Workload 7.00
Total Weekly Contact Hours 4.00
Workload: Part Time
Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload
Lecture | Theory and Case Studies | 2.0 | Every Week | 2.00
Lab | Computer-based lab | 2.0 | Every Second Week | 1.00
Independent Learning | Independent Study | 4.0 | Every Week | 4.00
Total Hours 8.00
Total Weekly Learner Workload 7.00
Total Weekly Contact Hours 3.00

Module Resources

Recommended Book Resources
  • Peter Bruce, Andrew Bruce 2017, Practical Statistics for Data Scientists, 1st Ed., O'Reilly Media California, USA [ISBN: 9781491952962]
  • Foster Provost, Tom Fawcett 2013, Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, O'Reilly Media Cambridge UK [ISBN: 1449361323]
  • Matthew North 2012, Data Mining for the Masses, Global Text Project [ISBN: 0615684378]
Supplementary Book Resources
  • Robert Kabacoff 2015, R in Action, 2nd Ed., Manning New York [ISBN: 1617291382]
  • Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani 2013, An Introduction to Statistical Learning, Springer-Verlag New York [ISBN: 9781461471370]
  • Norman Matloff 2011, The Art of R Programming, 1st Ed., No Starch Press San Francisco [ISBN: 9781593273842]
  • Andy Field, Jeremy Miles 2012, Discovering Statistics Using SAS, 1st Ed. [ISBN: 1849200920]
  • Efraim Turban, Ramesh Sharda, Dursun Delen 2011, Decision Support and Business Intelligence Systems, 9th Ed., Pearson Prentice Hall New Jersey [ISBN: 013610729X]
Recommended Article/Paper Resources
  • Hugh Watson 2011, Business Analytics Insight: Hype or Here to Stay?, Business Intelligence Journal, Vol. 16, No. 1, 1-8
  • Vijay Khatri, Carol V. Brown 2010, Designing data governance, Communications of the ACM, Volume 53 Issue 1
Other Resources

Module Delivered in

Programme Code Programme Semester Delivery
CR_SDAAN_8 Higher Diploma in Science in Data Science & Analytics 1 Mandatory
CR_SDAAN_9 Master of Science in Data Science & Analytics 1 Mandatory

Cork Institute of Technology
Rossa Avenue, Bishopstown, Cork
