About
The course teaches comprehensive and specialised subjects in data science; it develops sophisticated skills in statistics, mathematical modelling, and coding in support of such analyses. It further grounds students in the disciplinary history and methodology of data science, preparing them either for further study or for work as practitioners in the field. The course prominently features a major capstone project, which requires students to identify a real-world problem that would benefit from a data-driven approach, to collect and prepare the data needed to address it, and to build visualisations in support of their arguments. The combination of rigorous mathematical training with practical approaches equips learners to develop their skills autonomously after graduation, making them lifelong learners of data science methods.
Course Structure
About
This module provides a strong foundation for predictive modelling. Its
objective is to define the entire modelling process with the help of real-life
case studies.
Many concepts are common across predictive modelling methods and are
therefore covered in detail in this module.
Students will learn how to carry out exploratory data analysis to gain
insights and prepare data for predictive modelling, an essential skill valued
in many industries.
The module also builds on information covered in the module Exploratory
Data Analysis to include hands-on applications of the summarization and
visualisation of datasets through plots to present results in compelling and
meaningful ways.
Core Reading List:
Mastering Predictive Analytics with R, Second Edition. James D. Miller and Rui Miguel Forte. Publisher: Packt. Publication date: August 2017.
Predictive Analytics with Python, 1st Edition. Alvaro Fuentes. Publisher: Packt.
Teachers





Intended learning outcomes
- Key strategies and best practices related to assessing the goodness of fit of a model.
- Industry applications of normality tests.
- The step-by-step construction of regression models.
- Test value assumptions using multiple predictors.
- Autonomously carry out global and individual testing of parameters used in defining predictive models.
- Evaluate machine learning models on a limited data sample.
- Demonstrate self-direction in calculating variance inflation factors.
- Efficiently manage troubleshooting issues that arise in connection to data not explained by a model.
- Solve problems and be prepared to take leadership decisions related to the methods and correlation of variables.
- Apply a professional and scholarly approach to real-world problems pertaining to the estimation of model parameters.
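Two of the ideas named in the outcomes above, goodness of fit and variance inflation factors, can be illustrated with a minimal Python sketch that uses only the standard library. The function names (`fit_line`, `r_squared`, `vif_two_predictors`) and the data are hypothetical and are not taken from the course materials; the VIF helper assumes a model with exactly two predictors.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(x, y):
    """Share of the variance in y explained by the fitted line."""
    a, b = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2), with R^2 from regressing one predictor on the other.

    Assumption: the model has exactly two predictors, so both share this VIF.
    """
    return 1 / (1 - r_squared(x1, x2))

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(fit_line(x, y), r_squared(x, y))
print(vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

A VIF close to 1 suggests little collinearity between the predictors; larger values indicate that one predictor is largely explained by the other.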
About
Most industry analysis starts with exploratory data analysis, and a thorough study of it will help learners to perform data health checks and provide initial business insights.
The module will help the learner to understand and compute descriptive statistics and to present data using appropriate graphs and diagrams. It also serves as a foundation for advanced analytics.
This module also introduces the basics of programming in R and Python, the languages most commonly used for data science.
The module culminates in practices related to data management, which is essential for both exploratory data analysis and advanced analytics. In particular, the module focuses on SQL as a highly practical language for data preprocessing, addresses ways to connect SQL with R and Python tools, and teaches the skills required to prepare data for machine learning and efficient data modelling.
Core Reading List:
R for Data Science: Import, Tidy, Transform, Visualise, and Model Data
by Garrett Grolemund and Hadley Wickham (25 July 2016)
Hands-On Exploratory Data Analysis with Python: Perform EDA techniques to understand, summarise, and investigate your data
by Suresh Kumar Mukhiya and Usman Ahmed (27 March 2020)
Supplementary Reading List:
Exploratory Data Analysis with R
Radhika Datar, Harish Garg
Publisher: Packt Publishing (31 May 2019)
ISBN: 178980437X
Teachers





Intended learning outcomes
- Methods of distribution
- Best practices used to visually display data.
- Best practices related to data analysis and management, especially for large data sets.
- Key strategies related to the most appropriate measures of central tendency.
- Autonomously gather material, including from large data sets, and organise it into effective visualisations for analysis.
- Assess symmetry of data using measures of skewness.
- Accurately visualise and analyse data relationships.
- Autonomously connect SQL to R and Python to efficiently demonstrate data modelling processes through industry application.
- Import and export datasets and create data frames within R and Python, and connect these to SQL for preprocessing.
- Troubleshoot problems and be prepared to make leadership decisions related to industry methods and principles of data analysis and management.
- Independently work in R, Python, and SQL development environments.
- Manage data sets using a variety of functions, including acting autonomously to identify problems and relevant solutions for data wrangling.
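The workflow described in this module, preprocessing in SQL and then summarising in Python, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses Python's built-in `sqlite3` module with an in-memory database, and the `sales` table, its columns, and the helper names `load_clean_values` and `skewness` are hypothetical.

```python
import sqlite3
from statistics import mean, median, pstdev

def load_clean_values(conn):
    """Use SQL for preprocessing: drop missing amounts and sort the rest."""
    rows = conn.execute(
        "SELECT amount FROM sales WHERE amount IS NOT NULL ORDER BY amount"
    ).fetchall()
    return [row[0] for row in rows]

def skewness(values):
    """Population (moment) skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical in-memory table standing in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("north", 12.0), ("south", 11.0),
     ("south", None), ("east", 13.0), ("east", 40.0)],
)

values = load_clean_values(conn)
print("mean:", mean(values), "median:", median(values))
print("skewness:", skewness(values))
```

A mean well above the median, together with positive skewness, points to a right-skewed distribution, in which case the median is usually the more representative measure of central tendency.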
About
This module provides learners with an in-depth, practical understanding of statistical distributions and hypothesis testing.
Statistical distributions include Binomial, Poisson, Normal, Log-Normal, Exponential, t, F, and Chi-Square. Parametric and non-parametric tests used in research problems are covered in this unit.
The module will help learners to formulate research hypotheses, select appropriate tests of hypotheses, write primarily R programs to perform hypothesis testing, and to draw inferences using the output generated. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analysing data.
Core Reading List:
Statistical Inference For Everyone
Copyright Year: 2017
Brian Blais, Bryant University
Teachers





Intended learning outcomes
- The relevance of R to calculate probabilities.
- Discrete and continuous random variables.
- Key strategies related to distributions of observed data.
- Select topics for the advanced management of parametric and non-parametric tests.
- Understand and use statistical hypothesis testing concepts and terminology.
- Autonomously perform tests for normality and common distributions.
- Analyse data relationships using covariance.
- Evaluate standard types of distributions.
- Demonstrate self-direction and industry practices in developing solutions for hypothesis testing.
- Efficiently analyse the concept of variance through a variety of models.
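The module works primarily in R, but the same probability calculations can be sketched in Python with the standard library. The example below is an illustration only: the function names are hypothetical, and the z-test assumes a known population standard deviation.

```python
from math import comb, exp, factorial
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a one-sample z-test.

    Assumption: the population standard deviation sigma is known.
    """
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical inputs, for illustration only.
print(binomial_pmf(3, 10, 0.5))
print(poisson_pmf(2, 1.5))
print(z_test_p_value(sample_mean=52, mu0=50, sigma=10, n=100))
```

A p-value below the chosen significance level leads to rejecting the null hypothesis that the population mean equals `mu0`.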
About
This unit provides learners with an opportunity to apply key knowledge and skills through project work. They will be able to select a project from a specific domain and will be required to carry out various data management, exploratory data analysis, data visualisation and predictive modelling tasks. The Data Science in Practice work should deepen their engagement with this material, and should prepare students for engaging fully with contemporary research methods in data science.
Reading List (General):
Gao, G., Mishra, B., & Ramazzotti, D. (2018). Causal data science for financial stress testing. J. Comput. Sci., 26, 294-304.
Chen, H., Lundberg, S.M., & Lee, S. (2018). Hybrid Gradient Boosting Trees and Neural Networks for Forecasting Operating Room Data. ArXiv, abs/1801.07384. (Machine Learning)
Miller, James D. and Rui Miguel Forte. Mastering Predictive Analytics with R: Machine Learning Techniques For Advanced Models. Second Ed. Birmingham: Packt, 2017.
Fuentes, Alvaro. Mastering Predictive Analytics with Python. Birmingham: Packt, 2018. (Data Analytics in Business)
Hands-On Exploratory Data Analysis with R: Become an Expert in Exploratory Data Analysis Using R Packages, Radhika Datar and Harish Garg, 1st Edition. (Packt Publishing, 2019). 266 pages
Hands-On Exploratory Data Analysis with Python: Perform EDA Techniques to Understand, Summarize, and Investigate Your Data, Suresh Kumar Mukhiya and Usman Ahmed, 1st Edition. (Packt Publishing, 2020). 352 pages.
Teachers




Intended learning outcomes
- Theories and contemporary practices in data analytics.
- Select topics related to industry-specific uses of programming in R, Python, and MySQL, as well as data visualisation tools.
- Key strategies related to statistical modelling and predictive analytics.
- Autonomously solve problems in the domain of data analytics.
- Autonomously identify opportunities for the use of Python, R, MySQL, and data visualisation tools.
- Employ statistical modelling and predictive analytics within real-world business contexts.
- Act autonomously in identifying research problems and solutions related to data visualisation and analytics.
- Apply a professional and scholarly approach to data analytics within a real-world context.
- Demonstrate self-direction in research and originality in addressing statistical analysis and predictive modelling.
- Solve problems related to the use of programming and data modelling in real-world applications.
About
This module builds on the concepts introduced in the module Fundamentals of Predictive Modelling.
In this module, learners are introduced to model development for categorical dependent variables. Binary dependent variables are encountered in many domains, such as risk management, marketing, and clinical research, and this unit covers detailed model building processes for them. Additionally, a primary goal of the module is for students to be able to select and successfully apply appropriate advanced regression models in applied settings.
The module will culminate with multinomial models and ordinal scaled variables.
Core Reading List:
Mastering Predictive Analytics with R, Second Edition. James D. Miller and Rui Miguel Forte. Publisher: Packt. Publication date: August 2017.
Predictive Analytics with Python, 1st Edition. Alvaro Fuentes. Publisher: Packt.
Teachers





Intended learning outcomes
- Comparing data to a known distribution.
- Determining if a sample follows a normal distribution.
- The implementation of binomial regression in real world settings.
- Develop applications using more than two categories of dependent, outcome, or explanatory variables.
- Critically assess the effect of several variables upon the time a specified result takes to occur.
- Develop models using one or more predictor variables to predict the target variable classes.
- Efficiently estimate model parameters.
- Act autonomously in developing estimates of unknown population parameters.
- Demonstrate self-direction in global hypothesis testing.
- Solve problems related to generalised linear models through link functions.
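For a binary dependent variable, the link function in a generalised linear model is typically the logit. The following minimal Python sketch shows the logit, its inverse, and how a linear predictor maps to a probability. The function names and the coefficients are hypothetical and chosen purely for illustration; no model is being fitted here.

```python
from math import exp, log

def logit(p):
    """Link function: maps a probability in (0, 1) to the log-odds."""
    return log(p / (1 - p))

def inverse_logit(eta):
    """Inverse link: maps a linear predictor back to a probability."""
    return 1 / (1 + exp(-eta))

def predict_probability(intercept, slope, x):
    """Binary logistic model with one predictor: P(y = 1 | x)."""
    return inverse_logit(intercept + slope * x)

# Hypothetical coefficients, not estimated from any data.
intercept, slope = -1.0, 0.5
for x in (0.0, 2.0, 4.0):
    print(x, predict_probability(intercept, slope, x))

# exp(slope) is the multiplicative change in the odds per unit increase in x.
print("odds ratio:", exp(slope))
```

Working on the log-odds scale keeps the linear predictor unbounded while guaranteeing that predicted probabilities stay between 0 and 1.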
About
In this module, time series forecasting methods are introduced and explored. Students will gain a working knowledge of the nature of time series data and the processes used to analyse it, and will learn to recognise and understand trends within that data with confidence. This knowledge will be used to make predictions or forecasts.
Students will analyse and forecast macroeconomic variables such as GDP and inflation. Additionally, students will work with complex financial models using ARCH and GARCH, ARIMA, time series regression, exponential smoothing, and other models.
Core Reading List:
Hands on Time Series Analysis with R
Rami Krispin
Publisher: Packt
Copyright Year: May 2019
Teachers





Intended learning outcomes
- Models related to time series analysis.
- Conversion of non-stationary time series data into stationary time series data.
- Key strategies related to the concept of seasonal decomposition.
- Validate Auto Regressive Integrated Moving Average (ARIMA) models and use estimation.
- Implement panel data regression methods.
- Assess the concepts and uses of time series analysis and test for stationarity in time series data.
- Create synthetic contextualised discussions of key issues related to components of time series.
- Efficiently manage industry-level issues in connection to trend analysis.
- Demonstrate self-direction in developing real-world applications for serial correlation.
- Solve problems and be prepared to take leadership decisions related to the methods and principles of residual analysis.
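Two of the techniques named above, differencing to remove a trend and exponential smoothing, can be sketched in a few lines of Python. This is a simplified illustration with hypothetical function names and invented data; it does not test for stationarity or select the smoothing parameter, both of which a real analysis would need.

```python
def difference(series, lag=1):
    """Differencing: subtract the value `lag` steps earlier from each point."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level at each step.

    The last level serves as the one-step-ahead forecast.
    """
    level = series[0]  # initialise the level with the first observation
    levels = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels

# Hypothetical series with a linear trend.
trend = [3, 5, 7, 9, 11]
print(difference(trend))  # first differences of a linear trend are constant

# Hypothetical noisy series; alpha controls how quickly old data is forgotten.
noisy = [10.0, 12.0, 11.0, 13.0, 12.0]
print(simple_exponential_smoothing(noisy, alpha=0.5)[-1])
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one produces a smoother, slower-moving level.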
About
This course covers advanced statistical inference techniques, focusing on resampling methods such as bootstrapping, permutation tests, and jackknife techniques. It also introduces Monte Carlo simulations and Bayesian statistical approaches with applications in biomedical research and clinical decision-making. Students will gain hands-on experience with computational resampling techniques using R to enhance their ability to conduct robust statistical analyses.
Teachers
Intended learning outcomes
- Differentiate between traditional statistical inference and Bayesian approaches in the context of biomedical data analysis.
- Explain the principles and applications of resampling methods, including bootstrapping, permutation tests, and jackknife techniques.
- Summarize the role of Monte Carlo simulations in enhancing the robustness of statistical analyses in clinical research.
- Construct Bayesian models to address real-world clinical decision-making problems with appropriate prior information.
- Design Monte Carlo simulations to assess statistical properties such as variability and confidence intervals in biostatistical studies.
- Implement resampling techniques using R to analyze biomedical datasets and interpret the results effectively.
- Integrate computational resampling techniques into the workflow of biostatistical consulting projects, ensuring accuracy and reproducibility in online collaborative environments.
- Communicate complex statistical findings from resampling and Bayesian analyses to non-technical stakeholders through clear visualizations and reports.
- Demonstrate proficiency in selecting and applying appropriate advanced statistical methods to solve practical problems in biostatistics.
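The course uses R for its hands-on work, but the logic of the resampling methods it names is language-independent. As a rough sketch, the Python example below implements a percentile bootstrap confidence interval for a mean and a two-sided permutation test for a difference in means. The function names, default resample counts, seeds, and data are all assumptions made for illustration.

```python
import random
from statistics import mean

def bootstrap_ci(sample, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        mean(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical measurements, for illustration only.
treated = [2.1, 2.5, 2.8, 3.0, 3.2, 3.9, 4.1, 4.4]
control = [1.8, 2.0, 2.2, 2.4, 2.9, 3.0, 3.1, 3.3]
print(bootstrap_ci(treated))
print(permutation_p_value(treated, control))
```

Fixing the seed makes the results reproducible, which matters when resampling output is reported in a consulting or clinical context.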
About
This course covers the fundamental principles of clinical trials, including trial phases, regulatory guidelines, ethical considerations, and Good Clinical Practice. Students will explore randomization techniques, endpoint selection, and bias reduction while understanding the biostatistician’s role in study design, sample size estimation, interim analysis, and regulatory reporting. The module covers Statistical Analysis Plans (SAP), interpretation of trial results, and compliance with regulatory standards, ensuring students develop industry-relevant expertise in biostatistical applications in clinical research.
Teachers
Intended learning outcomes
- Explain the fundamental principles of clinical trial design, including trial phases, regulatory guidelines, and ethical considerations in biomedical research.
- Describe various randomization techniques, endpoint selection methods, and bias reduction strategies used in clinical trials.
- Outline the biostatistician's responsibilities in developing Statistical Analysis Plans (SAP), performing interim analyses, and ensuring compliance with Good Clinical Practice (GCP).
- Develop a basic Statistical Analysis Plan (SAP) using templates and guidelines relevant to clinical research standards.
- Interpret clinical trial results accurately by analyzing statistical outputs and understanding their implications for regulatory submissions.
- Apply sample size estimation methods to design statistically sound clinical trials for different study phases.
- Collaborate effectively in a virtual team to address statistical challenges in clinical trial design and reporting.
- Adhere to ethical and regulatory standards when managing biostatistical tasks in clinical research, ensuring integrity and compliance.
- Demonstrate the ability to critically evaluate clinical trial protocols and suggest improvements from a biostatistical perspective.
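Sample size estimation, mentioned in the outcomes above, can be illustrated with the standard normal-approximation formula for comparing two means, n = 2(z_{1-α/2} + z_{1-β})² σ² / δ² per arm. The Python sketch below is a simplified illustration under stated assumptions: two arms with equal allocation, a two-sided test, a common known standard deviation, and no adjustment for dropout or interim analyses. The function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Participants needed per arm to detect a mean difference `delta`.

    Assumptions: two arms, equal allocation, two-sided test at level
    `alpha`, common standard deviation `sigma`, normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return ceil(n)  # round up: a fraction of a participant is not possible

# Hypothetical design inputs, for illustration only.
print(sample_size_per_arm(delta=0.5, sigma=1.0))
print(sample_size_per_arm(delta=0.25, sigma=1.0))
```

Because the required sample size scales with 1/δ², halving the detectable difference roughly quadruples the number of participants, which is why the choice of a clinically meaningful effect size dominates trial planning.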
Entry Requirements
Application Process
Submit initial Application
Complete the online application form with your personal information
Documentation Review
Submit required transcripts, certificates, and supporting documents
Assessment
Your application will be evaluated against program requirements
Interview
Selected candidates may be invited for an interview
Decision
Receive an admission decision
Enrollment
Complete registration and prepare to begin your studies