About
The course teaches students comprehensive and specialised subjects in data science; it develops sophisticated skills in statistics and mathematical modelling, along with the ability to code in support of such analyses. It further grounds students in the disciplinary history and methodology of data science, preparing them either for further study or for work as practitioners in the field. The program prominently features a major capstone project, requiring students to identify a real-world problem that would benefit from a data-driven approach; to collect and prepare the data to address the problem; and to build visualisations in support of their arguments. The combination of rigorous mathematical training with practical approaches gives learners the ability to develop their skills autonomously after graduation, turning them into lifelong learners of data science methods.
Course Structure
About
Most industry analysis starts with exploratory data analysis, and a thorough study of it helps learners to perform data health checks and provide initial business insights.
The module helps learners to understand and compute descriptive statistics and to present data using appropriate graphs and diagrams; it serves as a foundation for advanced analytics.
This module also introduces the basics of programming in R and Python, the languages most commonly used for data science.
The module culminates in practices related to data management, which is essential for both exploratory data analysis and advanced analytics. In particular, the module focuses on SQL as a highly practical language for data preprocessing, addresses ways to connect SQL with R and Python tools, and teaches the skills required to prepare data for machine learning and efficient data modelling.
Core Reading List:
R for Data Science: Import, Tidy, Transform, Visualise, and Model Data
by Garrett Grolemund and Hadley Wickham (paperback, 25 July 2016)
Hands-On Exploratory Data Analysis with Python: Perform EDA techniques to understand, summarise, and investigate your data
by Suresh Kumar Mukhiya and Usman Ahmed (paperback, 27 March 2020)
Supplementary Reading List:
Exploratory Data Analysis with R
Radhika Datar, Harish Garg
Publisher: Packt Publishing (31 May 2019)
ISBN: 178980437X
Intended learning outcomes
- Methods of distribution
- Best practices used to visually display data.
- Best practices related to data analysis and management, especially for large data sets.
- Key strategies related to the most appropriate measures of central tendency.
- Autonomously gather material, including from large data sets, and organise it into effective visualisations for analysis.
- Assess symmetry of data using measures of skewness.
- Accurately visualise and analyse data relationships.
- Autonomously connect SQL to R and Python to efficiently demonstrate data modelling processes through industry application.
- Import and export datasets and create data frames within R and Python, and connect these to SQL for preprocessing.
- Troubleshoot problems and be prepared to make leadership decisions related to industry methods and principles of data analysis and management.
- Independently work in R, Python, and SQL development environments.
- Manage data sets using a variety of functions, including acting autonomously to identify problems and relevant solutions for data wrangling.
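The outcomes above combine SQL preprocessing with descriptive statistics. As a minimal sketch only, the example below uses Python's built-in `sqlite3` and `statistics` modules; the `orders` table, its columns, and the sample values are hypothetical and are not taken from the module materials, which may rely on other databases and on R.

```python
import sqlite3
import statistics

# Hypothetical sample data: (region, order_value). None marks a missing value.
rows = [
    ("north", 10.0), ("north", 12.0), ("south", 11.0),
    ("south", None), ("east", 13.0), ("east", 54.0),
]

def load_clean_values(records):
    """Use SQL for preprocessing: drop missing values and sort the rest."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (region TEXT, order_value REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", records)
        cursor = conn.execute(
            "SELECT order_value FROM orders "
            "WHERE order_value IS NOT NULL ORDER BY order_value"
        )
        return [value for (value,) in cursor]
    finally:
        conn.close()

def skewness(values):
    """Population (moment) skewness: mean cubed deviation over sigma cubed."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sigma ** 3)

values = load_clean_values(rows)
summary = {
    "n": len(values),
    "mean": statistics.fmean(values),
    "median": statistics.median(values),
    "skewness": skewness(values),
}
print(summary)
```

In this toy data the single large order pulls the mean (20.0) well above the median (12.0) and gives a positive skewness, which is the kind of signal that suggests the median may be the more appropriate measure of central tendency.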
About
This module gives learners an in-depth, practical understanding of statistical distributions and hypothesis testing.
The distributions covered include the Binomial, Poisson, Normal, Log-Normal, Exponential, t, F, and Chi-Square. Both parametric and non-parametric tests used in research problems are covered in this unit.
The module will help learners to formulate research hypotheses, select appropriate tests of hypotheses, write programs (primarily in R) to perform hypothesis testing, and draw inferences from the output generated. After taking this course, students will understand the broad directions of statistical inference and use this information to make informed choices in analysing data.
Core Reading List:
Statistical Inference For Everyone
Copyright Year: 2017
Brian Blais, Bryant University
Intended learning outcomes
- The relevance of R to calculate probabilities.
- Discrete and continuous random variables.
- Key strategies related to distributions of observed data.
- Select topics for the advanced management of parametric and non-parametric tests.
- Understand and use statistical hypothesis testing concepts and terminology.
- Autonomously perform tests for normality and common distributions.
- Analyse data relationships using covariance.
- Evaluate standard types of distributions.
- Demonstrate self-direction and industry practices in developing solutions for hypothesis testing.
- Efficiently analyse the concept of variance through a variety of models.
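To illustrate how a probability distribution feeds into a hypothesis test, here is a minimal sketch of an exact one-sided binomial test. It is written in Python for illustration only (the module itself works primarily in R), and the coin-toss scenario and the 0.05 significance level are assumptions made for the example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_upper_tail(k, n, p):
    """One-sided p-value P(X >= k) under the null hypothesis X ~ Binomial(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

# Hypothetical experiment: 9 heads in 10 tosses; H0: the coin is fair (p = 0.5).
n_trials, n_heads, alpha = 10, 9, 0.05
p_value = binomial_upper_tail(n_heads, n_trials, 0.5)
reject_null = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0 at alpha = {alpha}: {reject_null}")
```

The p-value here is 11/1024 (about 0.0107), so under these assumptions the null hypothesis of a fair coin would be rejected at the 5% level.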
About
This module provides a strong foundation for predictive modelling. Its objective is to define the entire modelling process with the help of real-life case studies.
Many concepts are common across predictive modelling methods, so they are covered in detail in this module.
Students will learn how to carry out exploratory data analysis to gain insights and prepare data for predictive modelling, an essential skill valued in many industries.
The module also builds on material covered in the module Exploratory Data Analysis to include hands-on applications of the summarisation and visualisation of datasets through plots, presenting results in compelling and meaningful ways.
Core Reading List:
Mastering Predictive Analytics with R, Second Edition
James D. Miller, Rui Miguel Forte
Publisher: Packt
Publication date: August 2017
Predictive Analytics with Python, 1st Edition
Alvaro Fuentes
Publisher: Packt
Intended learning outcomes
- Key strategies and best practices related to assessing the goodness of fit of a model.
- Industry applications of normality tests.
- The step-by-step construction of regression models.
- Test value assumptions using multiple predictors.
- Autonomously carry out global and individual testing of parameters used in defining predictive models.
- Evaluate machine learning models on a limited data sample.
- Demonstrate self-direction in calculating variance inflation factors.
- Efficiently manage troubleshooting issues that arise in connection to data not explained by a model.
- Solve problems and be prepared to take leadership decisions related to the methods and correlation of variables.
- Apply a professional and scholarly approach to real-world problems pertaining to the estimation of model parameters.
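Several of the outcomes above (estimating model parameters, assessing goodness of fit, examining what a model leaves unexplained) can be seen in the simplest regression case. The sketch below fits a one-predictor least-squares line in plain Python; the spend and sales figures are hypothetical, and the module's own case studies and tooling may differ.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: advertising spend (x) and sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = fit_simple_ols(spend, sales)
fit = r_squared(spend, sales, b0, b1)
# Residuals are the part of the data not explained by the model.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(spend, sales)]
print(f"intercept = {b0:.2f}, slope = {b1:.2f}, R^2 = {fit:.4f}")
```

With an intercept in the model, the residuals sum to zero (up to rounding), which is a quick sanity check on any least-squares fit.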
About
Power BI and Excel are fundamental parts of the data analytics toolkit, and a strong understanding of them provides a basis for more advanced data analytics with other techniques and technologies. In this unit, learners will gain experience in collecting, processing, analysing, and communicating data using Excel. Data visualisation is also a powerful way to communicate meaning in data and support business decision-making, so this unit covers the main commercial tools used in data visualisation, such as Tableau and Power BI, enabling learners to create a wide range of graphs, charts, and dashboards and to use them appropriately in context.
Intended learning outcomes
- Theories and contemporary practices in business analytics.
- Key strategies related to deploying data in business operations and management.
- Select topics related to industry-specific uses of Power BI.
- Autonomously solve problems in the domain of visualising business data.
- Autonomously identify opportunities for the use of Excel and Power BI in business contexts.
- Employ Excel (including tools such as pivot tables and basic visualisations) and Power BI to surface insights about business operations.
- Apply a professional and scholarly approach to data analytics within a business context.
- Demonstrate self-direction in research and originality in addressing the availability of data for business operations.
- Solve problems related to the use of dashboards and visualisations for business management.
- Act autonomously in identifying research problems and solutions related to applications of Excel and Power BI for analytics.
About
Machine learning algorithms are new-generation algorithms used in conjunction with classical predictive modelling methods.
In this Machine Learning 1 module, learners will understand applications of the support vector machine, k-nearest neighbours, and naive Bayes algorithms for classification and regression problems. Additionally, students will develop practical machine learning and data science skills, covering the theoretical basics of a broad range of machine learning concepts and methods with practical applications to sample datasets.
Reading List:
Introduction to Machine Learning with Python: A guide for Data Scientists, Andreas Müller and Sarah Guido, 1st Edition. (O’Reilly Media, 2016).
Intended learning outcomes
- Decision boundaries that help classify data points.
- The industry relevance of the apriori algorithm.
- Models intended to predict the value of a target variable.
- Regression models with binary target variables.
- Appraise classification methods and the support vector machine algorithm.
- Use algorithms to make predictions and apply neural networks to classification problems.
- Apply decision tree and random forest algorithms to classification and regression problems.
- Act autonomously in identifying neural networks for classification problems.
- Apply a professional and scholarly approach to Bayes' theorem and its applications.
- Efficiently manage issues in connection to machine learning algorithms.
- Demonstrate self-direction in bootstrapping and aggregation.
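As one illustration of the classifiers named in this module, the following is a minimal k-nearest neighbours sketch in plain Python. The training points, the class labels, and the choice of k = 3 are hypothetical; in practice a library implementation, such as those covered in the reading list, would normally be used.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical two-feature training set with two classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (6.5, 7.0), (7.0, 6.0)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction_near_origin = knn_predict(points, labels, (1.2, 1.4))
prediction_far = knn_predict(points, labels, (6.4, 6.2))
print(prediction_near_origin, prediction_far)
```

The implied decision boundary sits between the two clusters: a query point takes the label held by the majority of its three closest training points.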
About
Machine learning algorithms are new-generation algorithms used in conjunction with classical predictive modelling methods.
In this Machine Learning 2 module, students build on the knowledge gained from Machine Learning 1 and go on to understand applications of decision tree and random forest algorithms, and of neural networks, for classification and regression problems. Additionally, students will develop practical machine learning and data science skills, covering the theoretical basics of a broad range of machine learning concepts and methods with practical applications to sample datasets.
Reading List:
Introduction to Machine Learning with Python: A guide for Data Scientists, Andreas Müller and Sarah Guido, 1st Edition. (O’Reilly Media, 2016).
Intended learning outcomes
- Regression models with binary target variables.
- Models intended to predict the value of a target variable.
- Decision boundaries that help classify data points.
- The industry relevance of the apriori algorithm.
- Use algorithms to make predictions and apply neural networks to classification problems.
- Apply decision tree and random forest algorithms to classification and regression problems.
- Appraise classification methods and the support vector machine algorithm.
- Demonstrate self-direction in bootstrapping and aggregation.
- Apply a professional and scholarly approach to Binary Logistic Regression and its applications.
- Act autonomously in identifying neural networks for classification problems.
- Efficiently manage issues in connection to decision tree and random forest machine learning algorithms.
About
In this module, time series forecasting methods are introduced and explored. Students will gain a working knowledge of the nature of time series data and the processes used to analyse it, and will learn to recognise and understand the trends that exist within that data. This information will be used to make predictions or forecasts.
Students will analyse and forecast macroeconomic variables such as GDP and inflation. Additionally, students will work with complex financial models using ARCH and GARCH, ARIMA, time series regression, exponential smoothing, and other models.
Core Reading List:
Hands-On Time Series Analysis with R
Rami Krispin
Publisher: Packt
Publication date: May 2019
Intended learning outcomes
- Models related to time series analysis.
- Conversion of non-stationary time series data into stationary time series data.
- Key strategies related to the concept of seasonal decomposition.
- Validate Auto Regressive Integrated Moving Average (ARIMA) models and use estimation.
- Implement panel data regression methods.
- Assess the concepts and uses of time series analysis and test for stationarity in time series data.
- Create synthetic contextualised discussions of key issues related to components of time series.
- Efficiently manage industry-level issues in connection to trend analysis.
- Demonstrate self-direction in developing real-world applications for serial correlation.
- Solve problems and be prepared to take leadership decisions related to the methods and principles of residual analysis.
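Two of the ideas listed above, converting a trending series into a stationary one and exponential smoothing, can be sketched in a few lines. This is an illustrative Python example only: the index values and the smoothing parameter `alpha=0.5` are hypothetical, and the module's own exercises (for example, those based on the R reading list) may use different tools.

```python
def difference(series, lag=1):
    """Lagged differencing: y[t] - y[t - lag], used to remove a trend."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels; the last level is the one-step forecast."""
    level = series[0]
    levels = [level]
    for observation in series[1:]:
        level = alpha * observation + (1 - alpha) * level
        levels.append(level)
    return levels

# Hypothetical quarterly index with a linear upward trend.
index = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0]

differenced = difference(index)
levels = simple_exponential_smoothing(index, alpha=0.5)
forecast = levels[-1]
print(differenced)  # constant differences: the trend has been removed
print(forecast)
```

First differencing turns the linear trend into a constant series. The smoothed forecast (108.0625) lags behind the latest observation (110.0), which illustrates why simple exponential smoothing on its own suits series without a trend.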
About
In this module, students will look at analysing unstructured data, such as that found in social media posts, newspaper articles, videos, and more.
Specifically, students will look at techniques for text mining and natural language processing, using R and Python code to produce graphical representations of unstructured data and carry out sentiment analysis.
This module focuses on learning key concepts, tools, and methodologies for natural language processing and emphasises hands-on learning through guided tutorials and real-world examples.
Core Reading List:
Text Mining with R
Julia Silge and David Robinson.
O’Reilly
Natural Language Processing with Python
Steven Bird, Ewan Klein and Edward Loper.
O’Reilly
Intended learning outcomes
- Industry applications in the domain of language processing.
- Principles and applications of sentiment analysis.
- Key strategies related to structured data versus unstructured data and the features of each.
- Principles and applications of text analysis.
- Process text data to generate insights.
- Perform sentiment analysis on unstructured data.
- Process text data and strings, and perform pattern matching with regular expressions in R and Python.
- Efficiently manage issues that arise in connection to text mining.
- Apply a professional and scholarly approach to research problems pertaining to natural language processing.
- Demonstrate self-direction in applying solutions related to text mining.
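The text-processing outcomes above (tokenising strings, pattern matching, and sentiment analysis) can be illustrated with a very small lexicon-based scorer. This is a sketch under stated assumptions: the word lists and the sample reviews are invented for the example, and real projects would rely on curated sentiment lexicons or trained models.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; real projects use curated sentiment lexicons.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z']+", text.lower())

def sentiment_score(text):
    """Count of positive tokens minus count of negative tokens."""
    counts = Counter(tokenize(text))
    positives = sum(counts[word] for word in POSITIVE)
    negatives = sum(counts[word] for word in NEGATIVE)
    return positives - negatives

reviews = [
    "Great course, I love the hands-on labs!",
    "Poor audio and terrible pacing.",
    "The syllabus covers text mining.",
]
scores = [sentiment_score(review) for review in reviews]
print(scores)  # prints [2, -2, 0]
```

A score above zero marks a review as positive, below zero as negative, and zero as neutral. Such a simple count ignores negation and context, which is one reason more advanced methods are studied.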
About
Data reduction is a key process in business analytics projects. In this module, learners study data reduction methods such as Principal Component Analysis, Factor Analysis, and Multidimensional Scaling.
Students will develop skills related to the formation of segments using cluster analysis methods. Students will also analyse the resulting segments, a key technique for large datasets because intrinsic information emerges in detail once the data are segmented thoughtfully.
Required Reading Material:
Applied Unsupervised Learning with R
Alok Malik, Bradford Tuckfield
Publisher: Packt
Copyright Year: March 2019
ISBN: 9781789956399
Intended learning outcomes
- Estimating the optimum number of clusters using hierarchical clustering.
- Key strategies related to the concept of data reduction.
- Select topics for the advanced implementation of eigenvectors.
- Algorithms relevant to multivariate methods.
- Apply an in-depth domain-specific knowledge and understanding to multivariate analysis.
- Define Principal Component Analysis (PCA) and its derivations and assess their application.
- Critically understand and implement hierarchical and non-hierarchical cluster analysis and assess their outputs.
- Solve problems and be prepared to take leadership decisions related to the methods and principles of visualising the level of similarity of individual cases of a dataset.
- Act autonomously in the estimation of loading matrices and interpreting factor solutions.
- Demonstrate self-direction in research and originality in developing scoring models.
- Apply industry best practices for resolving issues pertaining to factor analysis.
About
This module builds on the concepts introduced in the module Fundamentals of Predictive Modelling.
In this module, learners are introduced to model development for categorical dependent variables. Binary dependent variables are encountered in many domains, such as risk management, marketing, and clinical research, and this unit covers detailed model-building processes for them. A primary goal of the module is for students to be able to select and successfully apply appropriate advanced regression models in applied settings.
The module will culminate with multinomial models and ordinal scaled variables.
Core Reading List:
Mastering Predictive Analytics with R, Second Edition
James D. Miller, Rui Miguel Forte
Publisher: Packt
Publication date: August 2017
Predictive Analytics with Python, 1st Edition
Alvaro Fuentes
Publisher: Packt
Intended learning outcomes
- Comparing data to a known distribution.
- Determining if a sample follows a normal distribution.
- The implementation of binomial regression in real-world settings.
- Develop applications using more than two categories of dependent, outcome, or explanatory variables.
- Critically assess the effect of several variables upon the time a specified result takes to occur.
- Develop models using one or more predictor variables to predict the target variable classes.
- Efficiently estimate model parameters.
- Act autonomously in developing estimates of unknown population parameters.
- Demonstrate self-direction in global hypothesis testing.
- Solve problems related to generalised linear models through link functions.
About
This unit provides learners with an opportunity to apply key knowledge and skills through project work. They will be able to select a project from a specific domain and will be required to carry out various data management, exploratory data analysis, data visualisation and predictive modelling tasks. The Data Science in Practice work should deepen their engagement with this material, and should prepare students for engaging fully with contemporary research methods in data science.
Reading List (General):
Gao, G., Mishra, B., & Ramazzotti, D. (2018). Causal data science for financial stress testing. J. Comput. Sci., 26, 294-304.
Chen, H., Lundberg, S.M., & Lee, S. (2018). Hybrid Gradient Boosting Trees and Neural Networks for Forecasting Operating Room Data. ArXiv, abs/1801.07384. (Machine Learning)
Miller, James D. and Rui Miguel Forte. Mastering Predictive Analytics with R: Machine Learning Techniques For Advanced Models. Second Ed. Birmingham: Packt, 2017.
Fuentes, Alvaro. Mastering Predictive Analytics with Python. Birmingham: Packt, 2018. (Data Analytics in Business)
Hands-On Exploratory Data Analysis with R: Become an Expert in Exploratory Data Analysis Using R Packages, Radhika Datar and Harish Garg, 1st Edition. (Packt Publishing, 2019). 266 pages
Hands-On Exploratory Data Analysis with Python: Perform EDA Techniques to Understand, Summarize, and Investigate Your Data, Suresh Kumar Mukhiya and Usman Ahmed, 1st Edition. (Packt Publishing, 2020). 352 pages.
Intended learning outcomes
- Theories and contemporary practices in data analytics.
- Select topics related to industry-specific uses of programming in R, Python, and MySQL, as well as data visualisation tools.
- Key strategies related to statistical modelling and predictive analytics.
- Autonomously solve problems in the domain of data analytics.
- Autonomously identify opportunities for the use of Python, R, MySQL, and data visualisation tools.
- Employ statistical modelling and predictive analytics within real-world business contexts.
- Act autonomously in identifying research problems and solutions related to data visualisation and analytics.
- Apply a professional and scholarly approach to data analytics within a real-world context.
- Demonstrate self-direction in research and originality in addressing statistical analysis and predictive modelling.
- Solve problems related to the use of programming and data modelling in real-world applications.
Entry Requirements
Application Process
- Submit initial application: Complete the online application form with your personal information.
- Documentation review: Submit required transcripts, certificates, and supporting documents.
- Assessment: Your application will be evaluated against program requirements.
- Interview: Selected candidates may be invited for an interview.
- Decision: Receive an admission decision.
- Enrollment: Complete registration and prepare to begin your studies.