The course teaches students comprehensive and specialised subjects in computer science: cutting-edge engineering skills for solving real-world problems with computational thinking and tools, as well as soft skills in communication, collaboration, and project management that enable students to succeed in real-world business environments. Most of the program is case- or project-based, with students learning by solving real-world problems end to end. The program begins with core courses that focus on computational thinking and problem solving from first principles. The core courses are followed by specialization courses that teach various aspects of building real-world systems, and then by more advanced courses on research-level topics covering state-of-the-art methods. The program concludes with a capstone project in which students either build an end-to-end solution to a real-world problem or work on a research topic. The program also focuses on teaching students the ability to learn, so that they become lifelong learners who constantly upgrade their skills. Students can choose from a spectrum of courses to specialize in a specific sub-area of computer science such as Artificial Intelligence and Machine Learning, Cloud Computing, Software Engineering, or Data Science.
Target Audience
- Ages 19-30, 31-65, 65+
Target Group
This course is designed for individuals who wish to enhance their knowledge of computer science and its applications across different fields of employment. It is designed for those who will have responsibility for planning, organizing, and directing technological operations. In all cases, the target group should be prepared to pursue substantial academic studies, and students must qualify for the course of study by entrance application. A prior computer science degree is not required; however, the course does assume technical aptitude and targets students with finance, engineering, or other STEM training or professional experience.
Mode of attendance
Online/Blended Learning
Structure of the programme
Please note that this structure may be subject to change based on faculty expertise and evolving academic best practices. This flexibility ensures we can provide the most up-to-date and effective learning experience for our students.
The Master of Science in Computer Science combines asynchronous components (lecture videos, readings, and assignments) with synchronous meetings attended by students and a teacher during a video call. Asynchronous components accommodate the schedules of students from diverse work-life situations, and synchronous meetings provide accountability and motivation. Students have direct access to their teacher and their peers at all times through direct messages and group chat; teachers are also able to initiate voice and video calls with students outside the regularly scheduled synchronous sessions. Modules are offered continuously on a publicly advertised schedule consisting of cohort sequences designed to accommodate adult students at different paces. Although few formal prerequisites are identified throughout the programme, enrollment in courses depends on advisement from Woolf faculty and staff.
The degree has three tiers. The first tier is required for all students, who must take 15 ECTS. In the second tier, students must select 45 ECTS from the elective tracks. Under the guidance of the Academic Staff at Woolf, students may either select exclusively from one specialization track (in which case they will earn that specialization) or mix tracks (in which case they will finish without a specialization). Tier three may be completed in two different ways: a) by completing a 30 ECTS Advanced Applied Computer Science capstone project, or b) by completing a 10 ECTS Applied Computer Science project and 20 ECTS of electives from the program.
Grading System
Scale: 0-100 points
Components: 60% of the mark derives from the average of the assignments, and 40% of the mark derives from the cumulative examination
Passing requirement: minimum of 60% overall
Dates of Next Intake
Rolling admission
Pass rates
2023 pass rates will be publicised in the next cycle, contingent upon ensuring sufficient student data for anonymization.
Identity Malta’s VISA requirement for third country nationals: https://www.identitymalta.com/unit/central-visa-unit/
This course helps students translate advanced mathematical, statistical, and scientific concepts into code. It is a module for writing code to solve real-world problems. It introduces programming concepts (such as control structures, recursion, classes, and objects) assuming no prior programming knowledge, making the course accessible to advanced professionals from scientific fields such as Biology, Physics, Medicine, Chemistry, and Civil and Mechanical Engineering. After building a strong foundation for converting scientific knowledge into programming concepts, the course dives deeply into Object-Oriented Programming and its methodologies. It also covers when and how to use built-in data structures such as 1-dimensional and 2-dimensional arrays before introducing the concepts of computational complexity, helping students write optimized code using appropriate data structures and algorithmic design methods. Students learn these concepts using a modern programming language such as Java or Python. The course equips students to identify and solve computer programming problems in scientific fields at a graduate level, and prepares them to handle the advanced data structures and algorithm design methods covered in the separate module, ‘Data Structures’.
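As a flavour of the kind of exercise covered early in this module, the short Python sketch below (a hypothetical example, not taken from the course materials) combines a simple class with a recursive method:

```python
# Hypothetical example: a class with a recursive method, the kind of
# exercise used when introducing classes, objects and recursion.
class Polynomial:
    """Represents a polynomial by its coefficients, lowest degree first."""

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)

    def evaluate(self, x, index=0):
        """Recursively evaluate the polynomial at x."""
        if index == len(self.coefficients):
            return 0
        return self.coefficients[index] + x * self.evaluate(x, index + 1)


p = Polynomial([1, 0, 2])   # represents 1 + 0*x + 2*x^2
print(p.evaluate(3))        # prints 19
```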
The ability to solve problems is a skill, and as with any other skill, the more one practices, the better one gets. So how exactly does one practice problem solving? Learning about different problem-solving strategies and when to use them is a good start. Problem solving is a process, and most strategies provide steps that help you identify the problem and choose the best solution.
Building a toolbox of problem-solving strategies will improve problem solving skills. With practice, students will be able to recognize and choose among multiple strategies to find the most appropriate one to solve complex problems. The course will focus on developing problem-solving strategies such as abstraction, modularity, recursion, iteration, bisection, and exhaustive enumeration.
The course will also introduce arrays and some of their real-world applications, such as prefix sums, carry forward, subarrays, and 2-dimensional matrices. Examples will include industry-relevant problems and dive deeply into building their solutions with various approaches, recognizing the limitations of each (i.e., when to use a data structure and when not to).
By the end of this course, a student can choose the data structure best suited to a given problem and devise a strategy that optimizes both time and space complexity.
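As a concrete illustration of one of the array techniques named above, here is a minimal, hypothetical Python sketch of the prefix-sum idea (the exact exercises in the course may differ):

```python
def prefix_sums(values):
    """Return a list where entry i is the sum of values[0..i-1]."""
    sums = [0]
    for v in values:
        sums.append(sums[-1] + v)
    return sums

def range_sum(sums, left, right):
    """Sum of values[left..right] in O(1) using the precomputed prefix sums."""
    return sums[right + 1] - sums[left]

data = [3, 1, 4, 1, 5, 9]
sums = prefix_sums(data)
print(range_sum(sums, 1, 3))   # 1 + 4 + 1 = 6
```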
Mathematics and computer science are closely related fields. Problems in computer science are often formalized and solved with mathematical methods. It is likely that many important problems currently facing computer scientists will be solved by researchers skilled in algebra, analysis, combinatorics, logic and/or probability theory, as well as computer science.
This course covers elementary discrete mathematics for computer science and engineering. Topics may include asymptotic notation and growth of functions; permutations and combinations; counting principles; discrete probability. Further selected topics may also be covered, such as recursive definition and structural induction; state machines and invariants; recurrences; generating functions.
Students will be able to explain and apply the basic methods of discrete (noncontinuous) mathematics in computer science. They will be able to use these methods in subsequent courses in the design and analysis of algorithms, computability theory, software engineering, and computer systems.
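For instance, the counting principles mentioned above can be explored directly in Python; the snippet below (an illustrative example, not part of the syllabus) checks the binomial coefficient against explicit enumeration:

```python
from itertools import combinations
from math import comb

# Number of ways to choose 3 items out of 6, computed two ways:
# by the closed-form binomial coefficient and by explicit enumeration.
items = range(6)
by_formula = comb(6, 3)                        # 6! / (3! * 3!) = 20
by_enumeration = sum(1 for _ in combinations(items, 3))
print(by_formula, by_enumeration)              # 20 20
```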
This course aims to build a strong foundational knowledge of the data structures (DS) used extensively in computing. The module starts by introducing time and space complexity notations and their estimation for code snippets, which helps students make trade-offs between data structures while solving real-world computational problems. The module introduces the most widely used basic data structures: dynamic arrays, multi-dimensional arrays, lists, strings, hash tables, binary trees, balanced binary trees, priority queues, and graphs. It discusses multiple implementation variations for each of these data structures, along with the space and time trade-offs of each implementation. In this course, students implement these data structures from scratch to gain a solid understanding of their inner workings, and are also introduced to the built-in data structures available in various programming languages/libraries such as Python/NumPy/C++ STL/Java/JavaScript. Students solve real-world problems where they must choose an optimal DS for the computational problem at hand.
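To give a taste of the "implement it from scratch" approach, here is a minimal, hypothetical Python sketch of a dynamic array that doubles its capacity when full (course assignments are more elaborate):

```python
class DynamicArray:
    """A minimal dynamic array that doubles its backing storage when full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._data = [None] * self._capacity

    def append(self, value):
        if self._size == self._capacity:        # amortized O(1) append:
            self._grow(2 * self._capacity)      # occasional O(n) resize
        self._data[self._size] = value
        self._size += 1

    def _grow(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data, self._capacity = new_data, new_capacity

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._data[index]


arr = DynamicArray()
for i in range(10):
    arr.append(i * i)
print(arr[3])   # 9
```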
Spreadsheets for Data Understanding introduces students to the principles and techniques of data cleaning, handling data sets of varying sizes, and visualizing data/data storytelling. Students will also learn the basics of predictive modelling from data sets. These are all introduced through the means of Microsoft Excel, the industry-standard spreadsheet program. Students will learn how to use inbuilt functions, as well as techniques such as creating and modifying pivot tables.
Structured Query Language (SQL) is key to working with data in relational databases, a task at the core of data science and analytics. In this course, students will learn all the major keywords and clauses used to extract data, best practices for formatting SQL queries, and how to generate meaningful insights from the results.
The focus is at all times on real-world uses of SQL queries, syntax, and expression, to allow students to begin professional-level work as quickly as possible.
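By way of illustration, the Python/SQLite sketch below shows the kind of aggregate query the course works toward; the table and column names here are invented for the example:

```python
import sqlite3

# Hypothetical example: a tiny in-memory database and an aggregate query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "EU", 80.0), (3, "US", 250.0)],
)

query = """
    SELECT region, COUNT(*) AS n_orders, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
"""
for row in conn.execute(query):
    print(row)        # ('US', 1, 250.0) then ('EU', 2, 200.0)
```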
This course aims to build a strong foundational knowledge of the data analytics techniques used extensively in the data science field. Tableau is a powerful data visualisation tool used in the business analytics industry to process and visualise raw business data in a presentable and understandable format. Tableau is widely used by data analytics departments and data analytics companies across fields for its ease of use and efficiency. Tableau connects to relational databases, Online Analytical Processing cubes, spreadsheets, and cloud databases to generate graphical visualisations. The course starts with visualisations and moves to an in-depth look at the different chart and graph functions, calculations, mapping, and other functionality. Students will be taught quick table calculations, reference lines, different types of visualisations, bands and distributions, parameters, motion charts, trends and forecasting, formatting, stories, performance recording, and advanced mapping.
At the end of this course, students will be prepared, if they desire, to earn industry desktop certifications as a Tableau Desktop Specialist, a Tableau Certified Associate, or a Tableau Certified Professional.
The Business Case Study is a course in which learners identify a real-world business problem; its objective is to help students rigorously solve a technically challenging business problem by applying the concepts, techniques, and tools learnt in the program. Students typically pick a known business problem or identify a business case where data analytics can be used, choosing the topic in discussion with the course instructor(s). Students also have the option of choosing a business problem from their professional organization, but any external supervisors must be approved by the instructor(s). Students start by identifying a business problem and proposing a methodology to solve it, then decide which technical and business tools will be used in the solution. They first work on the real-world data, cleaning and processing it using techniques learnt in this program, and then apply the algorithms, coding language, and tools they believe will give the best results. At the end of the case study, students should be able to present the business problem and solution either via Jupyter notebooks or via a blog post.
This course provides a strong mathematical and applicative introduction to Deep Learning. The module starts with the perceptron model as an oversimplified approximation of a biological neuron. We motivate the need for a network of neurons and show how they can be connected to form multi-layer perceptrons (MLPs). This is followed by a rigorous treatment of the back-propagation algorithm and the limitations identified in the 1980s. Students study how modern deep learning took off with improved computational tools and data sets. We teach modern activation units (such as ReLU and SELU) and how they overcome problems with the classical Sigmoid and Tanh units. Students learn weight initialization methods, regularization by dropout, batch normalization, etc., to ensure that deep MLPs can be trained successfully. The module teaches variants of Gradient Descent designed specifically for deep learning systems, such as Adam, AdaGrad, and RMSProp. Students also learn AutoEncoders, VAEs, and Word2Vec as unsupervised, encoding deep-learning architectures. We apply all of the foundational theory to various real-world problems using TensorFlow 2 and Keras. Students also learn how TensorFlow 2 works internally, with specific focus on computational graph processing.
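As a minimal illustration of the tooling mentioned above (a sketch only, with toy data; the architectures used in the course are more involved), a small Keras MLP with ReLU units, dropout, and the Adam optimizer might look like this:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 1000 samples, 20 features, binary labels.
x = np.random.rand(1000, 20).astype("float32")
y = (x.sum(axis=1) > 10).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),                  # regularization by dropout
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=5, batch_size=32, verbose=0)
print(model.evaluate(x, y, verbose=0))
```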
This course teaches students how to analyse the ways users engage with a service. This method, called product analytics, helps businesses track and analyse user data. Students will learn more deeply what is required to move a product from idea to implementation, through to launch, and then on to iterative improvements. The course teaches how to measure progress, validate or update product hypotheses, and present product learnings.
Students will also gain experience in making informed decisions, presenting findings, and making an analytics-informed business case to win support for a product.
This course aims to build the core competency of building real-world, end-to-end ML systems and deploying them into production for a variety of problems and scenarios. Students learn about a variety of ML systems, ranging from high-throughput, low-latency internet-scale systems to low-compute, energy-constrained IoT devices such as smart watches. Students study the ML lifecycle and its components in detail. We also use real-world ML platforms such as Google’s Kubeflow, TensorFlow Lite, and Amazon’s SageMaker to implement real systems and understand the engineering trade-offs and challenges. Students also learn relevant technologies and tools such as containerization (Docker), container orchestration (Kubernetes), and Git, which are used extensively in real-world scalable ML systems. This is a hands-on course in which we solve multiple real-world cases and discuss solutions built by various companies and organizations to give students a comprehensive understanding of varied systems and design choices.
This course helps students translate mathematical, statistical, and scientific concepts into code. It is a foundational course for writing code to solve Data Science, ML, and AI problems. It introduces basic programming concepts (such as control structures, recursion, classes, and objects) from scratch, assuming no prerequisites, making the course accessible to students from non-computational scientific fields such as Biology, Physics, Medicine, Chemistry, and Civil and Mechanical Engineering. After building a strong foundation, the course dives deep into core mathematical libraries such as NumPy, SciPy, and Pandas. Students also learn when and how to use built-in data structures such as lists, dicts, sets, and tuples. The module introduces the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods. It does not dive deep into data structures and algorithm design methods; that material is covered in the ‘Data Structures and Algorithms’ module. This course is valuable for all students specializing in mathematical sub-areas of CS such as ML, Data Science, and Scientific Computing.
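To illustrate the style of library use this course builds toward, here is a small, self-contained Python sketch with made-up data (illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical measurements from three lab runs.
df = pd.DataFrame({
    "run": ["a", "a", "b", "b", "c", "c"],
    "temperature": [21.0, 22.5, 19.8, 20.1, 23.3, 22.9],
})

# Vectorized NumPy arithmetic instead of an explicit Python loop.
df["temperature_k"] = df["temperature"].to_numpy() + 273.15

# Group-wise summary with Pandas.
print(df.groupby("run")["temperature_k"].mean())
```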
This course focuses on building basic classification and regression models and understanding them rigorously with both a mathematical and an applicative focus. The module starts with a basic introduction to the high-dimensional geometry of points, distance metrics, hyperplanes, and hyperspheres. We build on top of this to introduce the mathematical formulation of logistic regression as finding a separating hyperplane. Students learn to solve the optimization problem using vector calculus and gradient descent (GD) based algorithms. The module introduces computational variations of GD such as mini-batch and stochastic gradient descent. Students also learn other popular classification and regression methods such as k-Nearest Neighbours, Naive Bayes, Decision Trees, and Linear Regression, and how each of these techniques behaves in various real-world situations such as the presence of outliers, imbalanced data, and multi-class classification. Students learn the bias-variance trade-off and various techniques to avoid overfitting and underfitting, and study these algorithms from a Bayesian viewpoint along with geometric intuition. This module is hands-on, and students apply all of these classical techniques to real-world problems.
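The gradient-descent formulation described above can be sketched in a few lines of NumPy; this is an illustrative toy implementation on made-up data (full-batch GD, no regularization), not the course's reference code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy, roughly linearly separable data: two Gaussian blobs in 2-D.
x = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(500):                       # plain (full-batch) gradient descent
    p = sigmoid(x @ w + b)                 # predicted probabilities
    grad_w = x.T @ (p - y) / len(y)        # gradient of the log-loss w.r.t. w
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean((sigmoid(x @ w + b) > 0.5) == y)
print(w, b, accuracy)
```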
This course introduces more advanced ML techniques such as ensembles: bagging, boosting, cascading, and stacking classifiers and regressors. It covers both the theoretical foundations and the applicative details of these techniques, along with popular implementations of boosting such as LightGBM, CatBoost, and XGBoost. Students also delve into kernel methods, with specific focus on SVMs for classification and regression. Students study state-of-the-art, model-agnostic feature-importance and model-interpretability techniques such as LIME and SHAP, as well as classical NLP text-encoding methods such as bag-of-words and TF-IDF. The module teaches classical methods in time series analysis and forecasting such as ARMA and ARIMA, and shows how to pose time series forecasting problems as regression and classification problems to leverage well-studied ML techniques. This is followed by domain- and problem-specific feature engineering techniques that are often helpful in real-world problem solving. Students study methods such as error analysis and ablative analysis to debug and understand where a model performs well and where it does not, which further helps in designing appropriate features. Students also study model calibration techniques such as Platt scaling and isotonic regression. Later in the course, we cover how to build recommender systems using content-based and collaborative filtering methods, including a detailed walk-through of the Netflix Prize (2009) solution and various recent advances in RecSys.
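As a pocket-sized illustration of the ensemble idea (using scikit-learn rather than the LightGBM/CatBoost/XGBoost libraries named above, on synthetic data), one might compare a single tree against a bagged and a boosted ensemble:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

x, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging (random forest)": RandomForestClassifier(n_estimators=200, random_state=0),
    "boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    score = cross_val_score(model, x, y, cv=5).mean()   # 5-fold CV accuracy
    print(f"{name}: {score:.3f}")
```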
This course provides a comprehensive overview of computer vision problems and how they can be tackled using various Convolutional Neural Networks (CNNs). Students start with classical image processing operations such as edge detection, convolution, shape detectors, and colour space conversions. This is followed by a foundational understanding of deep convolutional neural networks and how their training and evaluation work. We introduce CNN-specific layers such as pooling and upsampling layers, as well as data augmentation techniques that are very helpful for image-related problems. This is followed by a deep dive into the internals of popular CNN architectures such as AlexNet, VGGNet, and ResNet. Students also learn how to use these methods practically for transfer learning. Students study how various computer-vision tasks such as image segmentation, image generation, object detection and localization, and contrastive learning can be performed using state-of-the-art algorithms for each task. Most of these techniques are studied directly from the original research papers and the open-source code provided by the authors, and students implement some of these algorithms from scratch in this course.
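For example, the transfer-learning workflow mentioned above can be sketched with a pretrained backbone in Keras; this is a simplified outline with an invented 5-class problem, not the course's full pipeline:

```python
import tensorflow as tf

# Pretrained ResNet50 backbone with its ImageNet classification head removed.
backbone = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
backbone.trainable = False          # freeze the pretrained features

# Small trainable head for a hypothetical 5-class problem.
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_images, train_labels, epochs=3)   # with a real, labelled dataset
```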
This course focuses on modelling sequences (text, music, time series, genes) using deep learning models. We start with a simple Recurrent Neural Network and its limitations with long sequences. Students learn LSTMs and GRUs, which can handle significantly longer sequences, to model sequence data such as text, music, gene sequences, and time series. We study variations of the LSTM such as bi-directional LSTMs and encoder-decoder architectures. This is followed by a detailed study of the attention mechanism and Transformer-based models, which are currently the state of the art for NLP and sequence modelling. The module teaches encoder-decoder Transformers, BERT and its variants, and the GPT-1, 2, and 3 models from architectural, mathematical, and practical viewpoints. Students learn to implement many of these complex models from scratch (using TensorFlow 2 and Keras) to gain a deeper understanding of how they work internally. Students study popular applications of deep learning in NLP such as part-of-speech tagging, question-answering systems, conversational engines (chatbots), and low-latency semantic search. For each of these problems, students study cutting-edge deep learning models along with code implementations.
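A minimal Keras sketch of the sequence-modelling setup described above (toy shapes and random data, invented purely for illustration) might be:

```python
import numpy as np
import tensorflow as tf

# Toy data: 500 integer-encoded sequences of length 40, binary labels.
x = np.random.randint(1, 1000, size=(500, 40))
y = np.random.randint(0, 2, size=(500,))

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=1000, output_dim=32),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),   # bi-directional LSTM
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=32, verbose=0)
print(model.predict(x[:3]).ravel())
```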
This course aims to help learners understand various techniques and algorithms to visualize, analyse, and understand high-dimensional data, which is very common in Data Science and ML. The module starts with linear algebraic methods such as Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) for obtaining linear projections of high-dimensional data. This is followed by more advanced, nonlinear, state-of-the-art techniques such as t-SNE and UMAP for visualizing high-dimensional data. Each of these techniques is covered in full mathematical detail from first principles and applied to real-world datasets from NLP, genomics, and internet data. Students also study how PCA and SVD relate to general matrix factorization techniques. To analyse and understand high-dimensional unlabelled data, students learn clustering techniques such as K-Means, Gaussian Mixture Models, Hierarchical Clustering, and DBSCAN. The module shows how some of these techniques are mathematically related to matrix factorization. Students study various outlier detection techniques based on density, proximity, factorization, and cluster analysis.
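For instance, the projection-then-clustering workflow can be sketched with scikit-learn on synthetic data (illustrative only):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic high-dimensional data with three hidden clusters.
x, _ = make_blobs(n_samples=300, n_features=50, centers=3, random_state=0)

# Linear projection down to 2 dimensions with PCA.
x_2d = PCA(n_components=2).fit_transform(x)

# Cluster the projected points with K-Means.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(x_2d)
print(labels[:10])
```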
This course is a follow-up to Introduction to Problem-Solving Techniques: Part 1, and as part of their academic planning process with Woolf staff, students will ordinarily take that course first.
Part 2 deepens the approach to data structures by including such topics as stacks, queues, linked lists, and trees, and we discuss in detail real-world applications of each approach and their comparative strengths and limitations (i.e., when to use a data structure and when not to). This course also covers hashing techniques along with recursion and subset problems, and includes rigorous homework and assignments as it introduces more than four data structures.
By the end of this course, a student can choose the data structure best suited to a given problem and devise a strategy that optimizes both time and space complexity.
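As a small taste of the material, here is a hypothetical Python sketch of a stack built on a singly linked list (the course treats these structures, along with queues and trees, in far more depth):

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class Stack:
    """Last-in, first-out stack backed by a singly linked list."""

    def __init__(self):
        self._top = None

    def push(self, value):                 # O(1)
        self._top = Node(value, self._top)

    def pop(self):                         # O(1)
        if self._top is None:
            raise IndexError("pop from empty stack")
        value, self._top = self._top.value, self._top.next
        return value


s = Stack()
for ch in "abc":
    s.push(ch)
print(s.pop(), s.pop(), s.pop())   # c b a
```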
This core course equips the student with knowledge of database management systems, operating systems, and computer networks. At the end of the course, students will have a critical understanding of the architecture of computers and networks, as well as how programs interact with them. Students begin by mapping data storage problems (as they had done in Relational Databases) to understand how data is stored in a distributed network, along with related issues such as concurrency. Subsequently, students cover operating systems with an overview of process scheduling, process synchronisation, and memory management techniques, including disk scheduling. The module concludes with computer networks, where we discuss all of the computer network layers and their protocols in detail.
This course builds upon the introductory JavaScript course to acquaint students with popular, modern frameworks for building the front end. We focus on three very popular frameworks/libraries: React.js, jQuery, and AngularJS. We start with React.js, the most popular and advanced of the three. Students learn about components and data flow in order to architect real-world front ends using React.js, through multiple code examples and code walkthroughs from scratch. We also dive into React Native, a cross-platform framework for building native mobile and smart-TV apps using JavaScript, which lets students build applications for various platforms using only JavaScript. jQuery is one of the oldest and most widely used JavaScript libraries, and students cover it in detail, focusing on how jQuery simplifies event handling, AJAX, and HTML DOM tree manipulation, and how it creates CSS animations. We also provide a hands-on introduction to AngularJS for architecting model-view-controller (MVC) based dynamic web pages.
This is a foundational course on building server-side (or backend) applications using popular JavaScript runtime environments such as Node.js. Students learn event-driven programming for building scalable backends for web applications. The module teaches various aspects of Node.js such as setup, the package manager, client-server programming, and connecting to various databases and REST APIs. Most of these concepts are covered in a hands-on manner with real-world examples and applications built from scratch using Node.js on Linux servers. The course also provides an introduction to Linux server administration and scripting, with special focus on web development and networking, and students learn to use Linux monitoring tools (such as Monit) to track the health of servers. The module also introduces Express.js, a popular lightweight framework for Node.js applications. Given the practical nature of this course, assignments/projects involve building actual website backends for ecommerce, online learning, and/or photo-sharing.
A distributed system is an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate to perform a single task or a small set of related tasks.
Goals of a Distributed System:
● Transparency -> the end user does not know what lies behind the system or how it works internally.
● Scalability -> the ability of the system to grow.
● Availability -> the system's uptime.
The module will carefully examine three case studies, with attention to such topics as:
● Basics of High Level System Design and Consistent Hashing (see the sketch after this list)
● Caching
● CAP Theorem
● Replication and Master-Slave
● NoSQL
● Differences between SQL and NoSQL
● Multi Master
● Apache Zookeeper & Apache Kafka
● Case Study on ElasticSearch
● AWS S3 and Quad Trees
● Design Distributed Crawler
● Microservices and Containerisation
● Hotstar & IRCTC System design
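As referenced in the first bullet above, here is a compact, hypothetical Python sketch of consistent hashing with virtual nodes; it is illustrative only and ignores replication and rebalancing details covered in the module:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps keys to servers so that adding/removing a server moves few keys."""

    def __init__(self, servers, virtual_nodes=100):
        self._ring = []                       # sorted list of (hash, server)
        for server in servers:
            for i in range(virtual_nodes):
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key):
        """Return the first server clockwise from the key's position."""
        h = self._hash(key)
        index = bisect.bisect(self._ring, (h,)) % len(self._ring)
        return self._ring[index][1]


ring = ConsistentHashRing(["server-a", "server-b", "server-c"])
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", ring.lookup(key))
```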
Data is the fuel driving all major organisations. In this course, we help you understand how to process data at scale, from the fundamentals of distributed processing to designing data warehouses and writing ETL (Extract, Transform, Load) pipelines that process batch and streaming data. The course gives you a comprehensive view of the complete data engineering lifecycle.
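As a toy illustration of the ETL pattern (field names and the output file are invented for the example; production pipelines use dedicated frameworks):

```python
import pandas as pd

# Extract: read raw records (an in-memory table stands in for a source file here).
raw = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "amount": ["10.5", "3.0", "7.25", "not-a-number"],
})

# Transform: coerce types, drop bad rows, aggregate per user.
raw["amount"] = pd.to_numeric(raw["amount"], errors="coerce")
clean = raw.dropna(subset=["amount"])
per_user = clean.groupby("user_id", as_index=False)["amount"].sum()

# Load: write the result to a destination (a CSV file in this sketch).
per_user.to_csv("per_user_totals.csv", index=False)
print(per_user)
```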
Every organisation builds products to solve the pain points of its customers. Product managers are a critical part of an organisation; they make sure that evolving customer needs and market trends are observed and converted into delightful solutions that help the business achieve its outcomes.
In this course, students gain a fundamental understanding of product management practices and a comprehensive view of the complete product management lifecycle.
This course introduces learners to the advanced concepts and practical applications of Generative AI. Students will explore the underlying theories, architectures, and methodologies that drive state-of-the-art generative models. By integrating hands-on projects and case studies, participants will apply these models in real-world scenarios, enhancing both their theoretical understanding and practical proficiency. The course aims to equip learners with the necessary skills to develop, fine-tune, and evaluate generative models, fostering creativity and innovation in various domains including art, music, and content creation.
This course focuses on both architectural design and practical, hands-on learning of the most widely used cloud services. The module extensively uses Amazon Web Services (AWS) to show real-world code examples of various cloud services, but it also covers the core concepts and architectures in a platform-agnostic manner so that students can easily translate these learnings to other cloud platforms (such as Azure and GCP). The module starts with virtualization and how virtualized compute instances are created and configured. Students learn how to auto-scale applications using load balancers and how to build fault-tolerant applications across a geographically distributed cloud. As relational databases are widely used in most enterprises, students learn how to migrate and scale (both vertically and horizontally) these databases on the cloud while ensuring enterprise-grade security. Virtual private clouds (VPCs) enable us to create a logically isolated virtual network of compute resources; students learn to set up a VPC using virtualized compute servers on AWS, covering the basics of networking along the way. Students also learn the architecture and practical aspects of distributed object storage and how it enables low-latency, highly available data storage on the cloud.
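For instance, interacting with distributed object storage from code might look like the boto3 sketch below; the bucket name and local file are placeholders, and AWS credentials and region configuration are assumed to be set up separately:

```python
import boto3

# Assumes AWS credentials and a default region are already configured.
s3 = boto3.client("s3")

bucket = "example-course-bucket"          # placeholder bucket name
s3.upload_file("report.csv", bucket, "reports/report.csv")   # local file is a placeholder

# List what is stored under the 'reports/' prefix.
response = s3.list_objects_v2(Bucket=bucket, Prefix="reports/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```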
This course aims to build a strong foundational knowledge of the data structures (DS) used extensively in computing. The module starts by introducing time and space complexity notations and their estimation for code snippets, which helps students make trade-offs between data structures while solving real-world computational problems. The module introduces the most widely used basic data structures: dynamic arrays, multi-dimensional arrays, lists, strings, hash tables, binary trees, balanced binary trees, priority queues, and graphs. It discusses multiple implementation variations for each of these data structures, along with the space and time trade-offs of each implementation. In this course, students implement these data structures from scratch to gain a solid understanding of their inner workings, and are also introduced to the built-in data structures available in various programming languages/libraries such as Python/NumPy/C++ STL/Java/JavaScript. Students solve real-world problems where they must choose an optimal DS for the computational problem at hand.
This is a foundational and mandatory course which aims to build students' ability to apply various algorithmic design methods to provide optimal solutions to computational problems. The course starts with time and space complexity analysis of divide-and-conquer algorithms using recursion-tree methods and the Master theorem. Students also learn about amortized time and space complexity analysis for randomized/probabilistic algorithms. Various algorithmic design strategies are introduced via real-world examples and problems. Students learn when, where, and how to optimally use divide-and-conquer, dynamic programming (top-down and bottom-up), greedy, backtracking, and randomization strategies, with examples. The module uses practical examples from array manipulation, sorting, searching, string manipulation, tree and graph traversals, graph path-finding, spanning trees, etc., to show these algorithmic strategies in action. Students implement many of these algorithmic design methods from scratch as part of the assignments. The module also shows how some of these popular algorithms are readily available via popular libraries in various programming languages.
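To make the divide-and-conquer analysis concrete, here is a small, hypothetical merge sort in Python; its recurrence T(n) = 2T(n/2) + O(n) resolves to O(n log n) by the Master theorem:

```python
def merge_sort(values):
    """Divide and conquer: T(n) = 2*T(n/2) + O(n), i.e. O(n log n) overall."""
    if len(values) <= 1:
        return values
    mid = len(values) // 2
    left = merge_sort(values[:mid])     # solve the two halves recursively
    right = merge_sort(values[mid:])
    return merge(left, right)           # combine in linear time

def merge(left, right):
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```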
This advanced graduate class addresses a unique topic on a rotating basis in order to keep the program at the forefront of scholarly research and industry practice.
Every year, the academic staff member will approve a new topic to be covered.
The bibliography will contain not less than 8 peer-reviewed articles or scholarly publications reflecting the current topic. Though the exact topic will vary, the emphasis of this module is practical, domain-specific issues in data science. Topics might include data handling, big data management systems, optimization, sparse signal recovery, principal component analysis, or deeper explorations of text mining, natural language processing, computer vision, or other topics introduced in other modules.
Often, Further Studies in Data Science and Data Analytics will extend, complicate, or otherwise deepen the topic taken on in its predecessor course, Studies in Data Science and Data Analytics, giving students who elect this sequence the opportunity to develop genuine expertise in a specific domain.
This course is designed to prepare Master’s students in Computer Science for successful transition into the professional world. Emphasizing both personal and professional development, it focuses on industry trends, job market demands, and the cultivation of soft and hard skills relevant in today's dynamic tech environment. Through hands-on experiences, interactive workshops, and expert guest lectures, students will gain insights into the expectations and challenges they'll face as Computer Science professionals. They will also develop a deeper understanding of potential career pathways, while honing skills to make them stand out in the competitive job market.
In this module we will discuss general approaches to the construction of efficient solutions to problems.
Such methods are of interest because:
● They provide templates suited to solving a broad range of diverse problems.
● They can be translated into common control and data structures provided by most high-level languages.
● The temporal and spatial requirements of the algorithms which result can be precisely analyzed.
This course will provide a solid foundation and background to design and analysis of algorithms. In particular, upon successful completion of this course, students will be able to understand, explain and apply key algorithmic concepts and principles, which might include:
● Greedy algorithms (Activity Selection, 0-1 Knapsack Problem, Fractional Knapsack Problem)
● Dynamic programming (Longest Common Subsequence, 0-1 Knapsack Problem)
● Minimum Spanning Trees (Prim’s Algorithm, Kruskal’s Algorithm)
● Graph Algorithms (Dijkstra’s Shortest Path Algorithm, Bipartite Graphs, Minimum Vertex Cover)
Although more than one technique may be applicable to a specific problem, it is often the case that an algorithm constructed by one approach is clearly superior to equivalent solutions built using alternative techniques. This module will help students assess these choices.
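As one example from the list of graph algorithms above, a compact Dijkstra's shortest-path implementation in Python (the graph here is invented for illustration) might look like this:

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest paths on a graph with non-negative edge weights.

    graph: dict mapping node -> list of (neighbour, weight) pairs.
    """
    distances = {node: float("inf") for node in graph}
    distances[source] = 0
    heap = [(0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > distances[node]:
            continue                      # stale heap entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < distances[neighbour]:
                distances[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return distances


graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 3)],
    "d": [],
}
print(dijkstra(graph, "a"))   # {'a': 0, 'b': 1, 'c': 3, 'd': 6}
```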
This is a project-based course, with the aim of building the required skills for creating web-based software systems. The course covers the entire lifecycle of building software projects, from requirement gathering and scope definition from a product document, to designing the architecture of the system, and all the way to delivery and maintenance of the software system.
The course covers both the frontend, that is, building browser-based interfaces for users with frontend web frameworks, and the backend, that is, the server running an API that serves information to the frontend, backed by an SQL or similar database management system for storage.
All aspects of delivering a software project, including security, user authentication and authorisation, monitoring and analytics, and maintaining the project are covered. The course also covers the aspects of project maintenance, like using a version control system, setting up continuous integration and deployment pipelines and bug trackers.
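By way of illustration, a minimal backend endpoint of the kind built in this course could be sketched in Python with Flask (an illustrative choice; the course may use other frameworks, and the route and data here are invented):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory stand-in for the SQL database used in real projects.
tasks = [{"id": 1, "title": "Write project proposal", "done": False}]

@app.route("/api/tasks", methods=["GET"])
def list_tasks():
    return jsonify(tasks)

@app.route("/api/tasks", methods=["POST"])
def create_task():
    payload = request.get_json()
    task = {"id": len(tasks) + 1, "title": payload["title"], "done": False}
    tasks.append(task)
    return jsonify(task), 201

if __name__ == "__main__":
    app.run(debug=True)
```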
This course helps students translate advanced mathematical/statistical/scientific concepts into code. This is a module for writing code to solve real-world problems. It introduces programming concepts (such as control structures, recursion, classes and objects) assuming no prior programming knowledge, to make this course accessible to advanced professionals from scientific fields like Biology, Physics,
Medicine, Chemistry, Civil & Mechanical Engineering etc. After building a strong foundation for converting scientific knowledge into programming concepts, the course advances to dive deeply into Object-Oriented Programming and its methodologies. It also covers when and how to use inbuilt-data structures like 1- Dimensional and 2-Dimensional Arrays before introducing the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods.
The module allows students to learn these concepts using a
modern programming language such as Java or Python. The course offers students the ability to identify and solve computer programming problems in scientific fields at a graduate level. The course prepares students to handle advanced data structures and algorithm design methods in the separate module, ‘Data Structures’
The ability to solve problems is a skill, and just like any other skill, the more one practices, the better one gets. So how exactly does one practice problem solving? Learning about different problem-solving strategies and when to use them will give a good start. Problem solving is a process. Most strategies provide steps that help you identify the problem and choose the best solution.
Building a toolbox of problem-solving strategies will improve problem solving skills. With practice, students will be able to recognize and choose among multiple strategies to find the most appropriate one to solve complex problems. The course will focus on developing problem-solving strategies such as abstraction, modularity, recursion, iteration, bisection, and exhaustive enumeration.
The course will also introduce arrays and some of their real-world applications, such as prefix sum, carry forward, subarrays, and 2-dimensional matrices. Examples will include industry-relevant problems and dive deeply into building their solutions with various approaches, recognizing each’s limitations (i.e when to use a data structure and when not to use a data structure).
By the end of this course a student can come up with the best strategy which can optimize both time and space complexities by choosing the best data structure suitable for a given problem.
Mathematics and computer science are closely related fields. Problems in computer science are often formalized and solved with mathematical methods. It is likely that many important problems currently facing computer scientists will be solved by researchers skilled in algebra, analysis, combinatorics, logic and/or probability theory, as well as computer science.
This course covers elementary discrete mathematics for computer science and engineering. Topics may include asymptotic notation and growth of functions; permutations and combinations; counting principles; discrete probability. Further selected topics may also be covered, such as recursive definition and structural induction; state machines and invariants; recurrences; generating functions.
Students will be able to explain and apply the basic methods of discrete (noncontinuous) mathematics in computer science. They will be able to use these methods in subsequent courses in the design and analysis of algorithms, computability theory, software engineering, and computer systems.
This course is aimed to build a strong foundational knowledge of data structures (DS) used extensively in computing. The module starts with introducing time and space complexity notations and estimation for code snippets. This helps students be able to make trade-offs between various Data Structures while solving real world computational problems. The module introduces most widely used basic data structures like Dynamic arrays, multi-dimensional arrays, Lists, Strings, Hash Tables, Binary Trees, Balanced Binary Trees, Priority Queues and Graphs. The module discusses multiple implementation variations for each of the above data-structures along with trade-offs in space and time for each implementation. In this course, students implement these data-structures from scratch to gain a solid understanding of their inner workings. Students are also introduced to how to use the built-in data-structures available in various programming languages/libraries like Python/NumPy/C++ STL/Java/JavaScript. Students solve real-world problems where they must use an optimal DS to solve a computational problem at hand.
Spreadsheets for Data Understanding introduces students to the principles and techniques of data cleaning, handling data sets of varying sizes, and visualizing data/data storytelling. Students will also learn the basics of predictive modelling from data sets. These are all introduced through the means of Microsoft Excel, the industry-standard spreadsheet program. Students will learn how to use inbuilt functions, as well as techniques such as creating and modifying pivot tables.
Structured Query Language (SQL) is key to working with data in relational databases, a task at the core of data science and analytics. In this course, students will learn all the major keywords and clauses used to extract data, best practices for formatting SQL queries, and how to generate meaningful insights from the results.
The focus is at all times on real-world uses of SQL queries, syntax, and expression, to allow students to begin professional-level work as quickly as possible.
This course is aimed to build a strong foundational knowledge of Data Analytics used extensively in the Data Science field. Tableau is a powerful data visualisation tool used in the business analytics industry to process and visualise raw business data in a very presentable and understandable format. Tableau is used by all data analytics departments of companies and in data analytics companies in various fields for its ease of use and efficiency. Tableau uses relational databases, Online Analytical Processing Cubes, Spreadsheets, cloud databases to generate graphical type visualisations. Course starts with visualisations and moves to an in-depth look at the different chart and graph functions, calculations, mapping and other functionality. Students will be taught quick table calculations, reference lines, different types of visualisations, bands and distributions, parameters, motion chart, trends and forecasting, formatting, stories, performance recording and advanced mapping.
At the end of this course, students will be prepared, if they desire, to earn industry desktop certifications as a Tableau Desktop Specialist, a Tableau Certified Associate, or a Tableau Certified Professional.
A business case study is a course designed for the learner to identify a business real world problem and its objective is to help students rigorously solve a real-world, technically-challenging business problem where they would apply all of the concepts, techniques and tools learnt in the program. Students typically pick a problem from a known business problem or identify business cases where data analytics can be used to solve a problem. The choosing of a topic can be done after discussing it with the course instructor(s). Students also have an option of choosing a business problem in their professional organization but the external supervisors should be approved by the instructor(s). Students start by identifying a business problem and proposing a methodology to solve the said business problem. Students then decide what technical and business tools will be used for the solution methodology. Students will first work on the real-world data, clean and process it using techniques learnt in this program. Students then will use algorithms and approach with a coding language and tool they think will get the best results. At the end of the case study student should be able to present the business problem and solution either via Jupyter notebooks or via a blog.
This course provides a strong mathematical and applicative introduction to Deep Learning. The module starts with the perceptron model as an over simplified approximation to a biological neuron. We motivate the need for a network of neurons and how they can be connected to form a Multi Layered Perceptron (MLPs). This is followed by a rigorous understanding of back-propagation algorithms and its limitations from the 1980s. Students study how modern deep learning took off with improved computational tools and data sets. We teach more modern activation units (like ReLU and SeLU) and how they overcome problems with the more classical Sigmoid and Tanh units. Students learn weight initialization methods, regularization by dropouts, batch normalization etc., to ensure that deep MLPs can be successfully trained. The module teaches variants of Gradient Descent that have been specifically designed to work well for deep learning systems like ADAM, AdaGrad, RMSProp etc. Students also learn AutoEncoders, VAEs and Word2Vec as unsupervised, encoding deep-learning architectures. We apply all of the foundational theory learned to various real world problems using TensorFlow 2 and Keras. Students also understand how TensorFlow 2 works internally with specific focus on computational graph processing.
This course teaches students how to analyse the ways users engage with a service. This method, called product analytics, helps businesses track and analyse user data. Students will learn more deeply what is required to move a product from idea to implementation, through to launch, and then on to iterative improvements. The course teaches how to measure progress, validate or update product hypotheses, and present product learnings.
Also, students will gain experience in making informed decisions, as well as how to present findings and make an analytics-informed business case to win support for a product.
This course aims to build the core competency of building real world end-to-end ML systems and deploy them into production for a variety of problems and scenarios. Students would learn a variety of ML systems ranging from high throughput and low latency internet scale systems to low compute power and energy constrained IoT devices like smart watches. Students will study the ML lifecycle and various components in detail. We also use real world ML platforms like Google’s KubeFlow, TensorFlow Lite, and Amazon’s SageMaker to implement real world systems and understand the engineering trade-offs and challenges. Students
also learn relevant technologies and tools like Containerization (Docker) and Container Orchestration (Kubernetes) and Git which are often used extensively in real world scalable ML systems. This course is a hands-on course where we solve multiple real world cases and discuss solutions built by various companies and organizations to provide the students a comprehensive understanding of varied systems and design choices
This course helps students translate mathematical/statistical/scientific concepts into code. This is a foundational course for writing code to solve Data Science ML & AI problems. It introduces basic programming concepts (like control structures, recursion, classes and objects) from scratch, assuming no prerequisites, to make this course accessible to students from non-computational scientific fields like Biology, Physics, Medicine, Chemistry, Civil & Mechanical Engineering etc. After building a strong foundation, the course advances to dive deep into core Mathematical libraries like NumPy, Scipy and Pandas. Students also learn when and how to use inbuilt-data structures like Lists, Dicts, Sets and Tuples. The module introduces the concepts of computational complexity to help students write optimized code using appropriate data structures and algorithmic design methods. The module does not dive deep into the data structures and algorithm design methods in this course - that is available in the ‘Data Structures and Algorithms’ module. This course is valuabe for all students specializing in mathematical sub-areas of CS like ML, Data Science, Scientific Computing etc.
This course focuses on building basic classification and regression models and understanding these models rigorously both with a mathematical and an applicative focus. The module starts with a basic introduction to high dimensional geometry of points, distance-metrics, hyperplanes and hyperspheres. We build on top this to introduce the mathematical formulation of logistic regression to find a separating hyperplane. Students learn to solve the optimization problem using vector calculus and gradient descent (GD) based algorithms. The module introduces computational variations of GD like mini-batch and stochastic gradient descent. Students also learn other popular classification and regression methods like k-Nearest Neighbours, Naive Bayes, Decision Trees, Linear Regression etc. Students also learn how each of these techniques under various real world situations like the presence of outliers, imbalanced data, multi class classification etc. Students learn bias and variance trade-off and various techniques to avoid overfitting and underfitting. Students also study these algorithms from a Bayesian viewpoint along with geometric intuition. This module is hands-on and students apply all these classical techniques to real world problems.
This course introduces more advanced ML techniques like ensembles: bagging, boosting, cascading and stacking classifiers and regressors. It covers both the theoretical foundations and applicative details of these techniques along with popular implementations of boosting like LightGBM, CatBoost and XGBoost. Students also delve into kernel methods with specific focus on SVMs for classification and regression. Students will study state of the art model agnostic feature importance and model-interpretability techniques like LIME and SHAP. Students also study classical NLP based text encoding methods like Bag-of-words, TF-IDF etc. The module teaches various classical methods in time series analysis and forecasting like ARMA, ARIMA etc. Students also learn how to pose time series forecasting problems as regression and classification problems to leverage well studied ML techniques. This is followed by various domain and problem specific Feature engineering techniques that are often helpful in real world problem solving. Students will study methods like error analysis, ablative analysis etc., to debug and understand why and where a model is performing well and where it is not performing well. This will further help us in designing appropriate features. Students study model calibration techniques like Platt Scaling, Isotonic Regression etc. Later in this course, we cover how to build recommender systems using content-based and collaborative filtering methods. The module also teaches the detailed solution of the Netflix prize (2009) and various recent advances in RecSys.
This course provides a comprehensive overview of Computer vision problems and how they can be tackled using various Convolutional Neural networks (CNNs). Students start with classical image processing operations like edge detection, convolution, shape detectors and colour space conversions. This is followed by a foundational understanding of Deep-Convolutional Neural networks and how their training and evaluation works. We introduce various CNN specific layers like pooling-layers and upsampling layers. We also introduce various Data Augmentation techniques that are very helpful for image-related problems. This is followed by a dive deep into the internals of popular CNN architectures like: AlexNet, VGGNet, ResNet etc. Students also learn how to use these methods practically for transfer learning. Students will study how various computer-vision related tasks like image segmentation, image-generation, object detection and localization, contrastive learning etc., can be performed using state of the art algorithms for each of these tasks. Most of these techniques would be studied directly from the original research papers and open-source code provided by the authors. Students would also implement some of these algorithms from scratch in this course.
This course focuses on modelling sequences (text, music, time-series, genes) using deep-learning models. We start with a simple Recurrent Neural Network and its limitations with long-sequences. Students learn LSTMs and GRUs which can handle significantly longer sequences to model sequence data like text, music, gene-sequences and time-series data. We study variations of LSTM like bi-directional LSTMs and encoder-decoder architectures. This is followed by a detailed study of attention mechanism and Transformer based models which are currently the state-of-the-art for NLP and sequence modelling. The module teaches encoder-decoder Transformers, BERT, BERT-variations, GPT-1,2 &3 models from both the architectural and mathematical viewpoints and also a practical viewpoint. Studnets learn to implement many of these complex models from scratch (using TensorFlow 2 and Keras) to gain a deeper understanding of how they work internally. Students will study popular applications of deep-learning in NLP like parts-of-speech tagging, question-answering systems, conversational engines (chatbots), Semantic search with low-latency etc. For each of these problems, Students will study cutting edge deep-learning models along with code implementations.
This course is aimed to help learners understand various techniques and algorithms to visualize, analyse and understand high dimensional data which is very common in Data Science and ML. The module starts with linear algebraic methods like Principal Component Analysis (PCA) and SVD (Singular Value Decomposition) for obtaining linear projection of high dimensional data. This is followed by more advanced nonlinear and state of the art techniques like t-SNE and UMAP for visualizing high dimensional data. Each of these techniques would be covered in full mathematical detail from first principles along with applying them to real world datasets in NLP, Genomics and internet-datasets. Students will also study how PCA and SVD are related to general Matrix Factorization techniques. To analyse and understand high dimensional un-labelled data, students learn clustering techniques like K-Means, Gaussian Mixture models, Hierarchical Clustering and DBSCAN. The modules shows how some of the techniques are mathematically related to Matrix Factorization. Students study various outlier detection techniques based on density, proximity, factorization and cluster analysis.
This course is a follow-up to Introduction to Problem-Solving Techniques: Part 1, and as part of their academic planning process with Woolf staff, students will ordinarily take that course first.
Part 2 deepens the approach to data structures by including such topics as stacks, queues, linked lists, and trees, and we will discuss in detail real world applications of each approach and their comparative strengths and limitations (i.e when to use a data structure and when not to use a data structure). This course will also include hashing techniques along with recursion and subset problems. This course will have rigorous homework and assignments as we introduce more than 4 data structures.
By the end of this course, students will be able to choose the data structure best suited to a given problem, optimizing both time and space complexity.
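As a simple illustration of that idea (hypothetical example, not an assignment), the sketch below contrasts a list with a hash set for duplicate detection: the list version is O(n^2), while the hash-based version is O(n) on average.

# Choosing the right data structure changes the complexity of the same task.
def has_duplicates_list(items):
    seen = []                      # membership test on a list is O(n) per lookup
    for x in items:
        if x in seen:
            return True
        seen.append(x)
    return False

def has_duplicates_set(items):
    seen = set()                   # hash-based lookup is O(1) on average
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False

print(has_duplicates_set([3, 1, 4, 1, 5]))   # True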
This core course equips the student with knowledge of database management systems, operating systems and computer networks. At the end of the course, students will have a critical understanding of the architecture of computers and networks, as well as how programs interact with them. Students begin by mapping data storage problems (as they had done in Relational Databases) to understand how data is stored in a distributed network, and related issues such as concurrency. Subsequently, students cover operating systems with an overview of process scheduling, process synchronisation, and memory management techniques along with disk scheduling. The module concludes with computer networks, where we discuss all of the network layers and their protocols in detail.
This course builds upon the introductory JavaScript course to acquaint students with popular, modern frameworks for building the front end. We focus on three widely used frameworks/libraries: React.js, jQuery and AngularJS. We start with React.js, the most popular and advanced of the three. Students learn about components and data flow in order to architect real-world front ends with React.js, supported by multiple code examples and code walkthroughs from scratch. We also dive into React Native, a cross-platform framework for building native mobile and smart-TV apps using JavaScript, which lets students build applications for various platforms using only JavaScript. jQuery, one of the oldest and most widely used JavaScript libraries, is covered in detail, with a specific focus on how it simplifies event handling, AJAX, HTML DOM tree manipulation and CSS animations. We also provide a hands-on introduction to AngularJS for architecting model-view-controller (MVC) based dynamic web pages.
This is a foundational course on building server-side (or backend) applications using popular JavaScript runtime environments like Node.js. Students learn event-driven programming for building scalable backends for web applications. The module teaches various aspects of Node.js such as setup, the package manager, client-server programming, and connecting to various databases and REST APIs. Most of these concepts are covered hands-on, with real-world examples and applications built from scratch using Node.js on Linux servers. The course also introduces Linux server administration and scripting, with a special focus on web development and networking, and students learn to use Linux monitoring tools (like Monit) to track the health of their servers. The module also introduces Express.js, a popular lightweight framework for Node.js applications. Given the practical nature of this course, students build actual website backends via assignments/projects for e-commerce, online learning and/or photo-sharing.
A distributed system is an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate to perform a single task or a small set of related tasks.
Goals of a Distributed System:
● Transparency -> the end user does not need to know what lies behind the system or how it works internally.
● Scalability -> the system's ability to handle growth in users, data, or load.
● Availability -> the system's uptime, i.e., the fraction of time it is able to serve requests.
The module will carefully examine three case studies, with attention to such topics as:
● Basics of High-Level System Design and Consistent Hashing (see the sketch after this list)
● Caching
● CAP Theorem
● Replication and Master-Slave
● NoSQL
● Differences between SQL and NoSQL
● Multi Master
● Apache Zookeeper & Apache Kafka
● Case Study on ElasticSearch
● AWS S3 and Quad Trees
● Design Distributed Crawler
● Microservices and Containerisation
● Hotstar & IRCTC System design
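To illustrate the consistent-hashing topic referenced in the list above, here is a minimal, purely illustrative Python sketch (the node names are hypothetical); keys and nodes are hashed onto a ring, and each key is served by the first node clockwise from its position.

# Minimal consistent-hashing ring sketch; "cache-a/b/c" are hypothetical nodes.
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes=()):
        self.ring = []                              # sorted (hash, node) points
        for node in nodes:
            self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        bisect.insort(self.ring, (self._hash(node), node))

    def get_node(self, key):
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]   # wrap around the ring

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
print(ring.get_node("user:42"))   # the same key always maps to the same node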
Data is the fuel driving all major organisations, and in this course we help students understand how to process data at scale: from the fundamentals of distributed processing, to designing data warehouses, to writing ETL (Extract, Transform, Load) pipelines that process batch and streaming data. The course provides a comprehensive view of the complete data engineering lifecycle.
Every organisation builds products to solve the pain points of its customers. Product managers are a critical part of an organisation: they make sure that evolving customer needs and market trends are observed and converted into delightful solutions that help the business achieve its outcomes. In this course, students will gain a fundamental understanding of product management practices, giving them a comprehensive view of the complete product management life cycle.
This course introduces learners to the advanced concepts and practical applications of Generative AI. Students will explore the underlying theories, architectures, and methodologies that drive state-of-the-art generative models. By integrating hands-on projects and case studies, participants will apply these models in real-world scenarios, enhancing both their theoretical understanding and practical proficiency. The course aims to equip learners with the necessary skills to develop, fine-tune, and evaluate generative models, fostering creativity and innovation in various domains including art, music, and content creation.
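Purely as an illustration of the kind of hands-on work implied here (the model name and prompt are arbitrary choices, not course requirements), text generation with an open-source generative model might look like the sketch below, assuming the Hugging Face transformers library is installed.

# Illustrative text-generation sketch with the transformers pipeline;
# the model name and prompt are arbitrary assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Generative models can be used to", max_length=30,
                   num_return_sequences=1)
print(result[0]["generated_text"])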
This course focuses both on architectural design and on practical, hands-on learning of the most widely used cloud services. The module extensively uses Amazon Web Services (AWS) to show real-world code examples of various cloud services, while covering the core concepts and architectures in a platform-agnostic manner so that students can easily translate these learnings to other cloud platforms (like Azure and GCP). The module starts with virtualization and how virtualized compute instances are created and configured. Students also learn how to auto-scale applications using load balancers and build fault-tolerant applications across a geographically distributed cloud. As relational databases are widely used in most enterprises, students learn how to migrate and scale these databases (both vertically and horizontally) on the cloud while ensuring enterprise-grade security. Virtual private clouds (VPCs) enable logically isolated virtual networks of compute resources; students learn to set up a VPC using virtualized compute servers on AWS and cover the basics of networking along the way. Finally, students learn the architecture and practical aspects of distributed object storage and how it enables low-latency, highly available data storage on the cloud.
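To give a flavour of the object-storage topic (illustrative only; the bucket name and file are hypothetical, and AWS credentials are assumed to be configured), a minimal boto3 sketch for Amazon S3 might look like this.

# Minimal S3 object-storage sketch with boto3; bucket and key are hypothetical.
import boto3

s3 = boto3.client("s3")
s3.upload_file("report.csv", "example-course-bucket", "reports/report.csv")

obj = s3.get_object(Bucket="example-course-bucket", Key="reports/report.csv")
print(obj["Body"].read()[:100])   # first 100 bytes of the stored object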
This foundational, mandatory course aims to build students' ability to apply various algorithmic design methods to find optimal solutions to computational problems. The course starts with time and space complexity analysis of divide-and-conquer algorithms using recursion-tree methods and the Master theorem. Students also learn amortized time and space complexity analysis for randomized/probabilistic algorithms. Various algorithmic design strategies are introduced via real-world examples and problems: students learn when, where and how to optimally use divide-and-conquer, dynamic programming (top-down and bottom-up), greedy, backtracking and randomization strategies. The module uses practical examples from array manipulation, sorting, searching, string manipulation, tree and graph traversals, graph path-finding, spanning trees, etc., to show these algorithmic strategies in action. Students implement many of these algorithmic design methods from scratch as part of the assignments, and the module also shows how some of these popular algorithms are readily available via standard libraries in various programming languages.
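As a small worked illustration of two of the strategies named above (not an assignment), the sketch below computes Fibonacci numbers with top-down (memoised) and bottom-up dynamic programming.

# Top-down (memoised recursion) vs bottom-up (iterative) dynamic programming.
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_top_down(n):
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)

def fib_bottom_up(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert fib_top_down(30) == fib_bottom_up(30) == 832040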
This advanced graduate class addresses a unique topic on a rotating basis in order to keep the program at the forefront of scholarly research and industry practice.
Every year the academic staff member will approve a new topic to be covered.
The bibliography will contain not less than 8 peer-reviewed articles or scholarly publications reflecting the current topic. Though the exact topic will vary, the emphasis of this module is practical, domain-specific issues in data science. Topics might include data handling, big data management systems, optimization, sparse signal recovery, principal component analysis, or deeper explorations of text mining, natural language processing, computer vision, or other topics introduced in other modules.
Often, Further Studies in Data Science and Data Analytics will extend, complicate, or otherwise deepen the topic taken on in its predecessor course, Studies in Data Science and Data Analytics, allowing students who elect this sequence to develop genuine expertise in a specific domain.
This course is designed to prepare Master’s students in Computer Science for successful transition into the professional world. Emphasizing both personal and professional development, it focuses on industry trends, job market demands, and the cultivation of soft and hard skills relevant in today's dynamic tech environment. Through hands-on experiences, interactive workshops, and expert guest lectures, students will gain insights into the expectations and challenges they'll face as Computer Science professionals. They will also develop a deeper understanding of potential career pathways, while honing skills to make them stand out in the competitive job market.
In this module we will discuss general approaches to the construction of efficient solutions to problems.
Such methods are of interest because:
● They provide templates suited to solving a broad range of diverse problems.
● They can be translated into common control and data structures provided by most high-level languages.
● The temporal and spatial requirements of the resulting algorithms can be precisely analyzed.
This course will provide a solid foundation in the design and analysis of algorithms. In particular, upon successful completion of this course, students will be able to understand, explain and apply key algorithmic concepts and principles, which might include:
● Greedy algorithms (Activity Selection, 0-1 Knapsack Problem, Fractional Knapsack Problem)
● Dynamic programming (Longest Common Subsequence, 0-1 Knapsack Problem)
● Minimum Spanning Trees (Prim’s Algorithm, Kruskal’s Algorithm)
● Graph Algorithms (Dijkstra’s Shortest Path Algorithm, Bipartite Graphs, Minimum Vertex Cover)
Although more than one technique may be applicable to a specific problem, it is often the case that an algorithm constructed by one approach is clearly superior to equivalent solutions built using alternative techniques. This module will help students assess these choices.
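For a concrete taste of one of the graph algorithms listed above (the example graph is an illustrative assumption), a compact Dijkstra shortest-path sketch using a binary heap follows.

# Dijkstra's shortest-path algorithm with a binary heap (heapq).
import heapq

def dijkstra(graph, source):
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                            # stale heap entry, skip it
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

graph = {"A": [("B", 4), ("C", 1)], "C": [("B", 2)], "B": []}
print(dijkstra(graph, "A"))   # shortest distances from A: A=0, B=3, C=1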
Spreadsheets for Data Understanding introduces students to the principles and techniques of data cleaning, handling data sets of varying sizes, and visualizing data and data storytelling. Students will also learn the basics of predictive modelling from data sets. These topics are introduced using Microsoft Excel, the industry-standard spreadsheet program. Students will learn how to use built-in functions, as well as techniques such as creating and modifying pivot tables.
Structured Query Language (SQL) is key to working with data in relational databases, a task at the core of data science and analytics. In this course, students will learn all the major keywords and clauses used to extract data, best practices for formatting SQL queries, and how to generate meaningful insights from the results.
The focus is at all times on real-world uses of SQL queries, syntax, and expression, to allow students to begin professional-level work as quickly as possible.
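To illustrate the kind of query covered (illustrative only; the table and rows are hypothetical), the sketch below runs a grouped aggregation with Python's built-in sqlite3 module.

# Running SQL from Python with sqlite3; the orders table is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("alice", 30.0), ("bob", 12.5), ("alice", 7.5)])

query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
"""
for customer, total in conn.execute(query):
    print(customer, total)    # alice 37.5, then bob 12.5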
This course aims to build a strong foundational knowledge of data analytics as used extensively in the Data Science field. Tableau is a powerful data visualisation tool used in the business analytics industry to process and visualise raw business data in a presentable and understandable format; it is widely used by data analytics teams across industries for its ease of use and efficiency. Tableau can connect to relational databases, Online Analytical Processing (OLAP) cubes, spreadsheets and cloud databases to generate graphical visualisations. The course starts with basic visualisations and moves to an in-depth look at the different chart and graph functions, calculations, mapping and other functionality. Students will be taught quick table calculations, reference lines, different types of visualisations, bands and distributions, parameters, motion charts, trends and forecasting, formatting, stories, performance recording and advanced mapping.
At the end of this course, students will be prepared, if they desire, to earn industry desktop certifications as a Tableau Desktop Specialist, a Tableau Certified Associate, or a Tableau Certified Professional.
The business case study is a course in which learners identify a real-world business problem and rigorously solve it, applying the concepts, techniques and tools learnt in the program. Students typically pick a known business problem or identify a business case where data analytics can be used, choosing the topic in consultation with the course instructor(s). Students may also choose a business problem from their own professional organization, provided the external supervisors are approved by the instructor(s). Students start by identifying a business problem and proposing a methodology to solve it, then decide which technical and business tools to use. They first work on the real-world data, cleaning and processing it using techniques learnt in this program, and then apply the algorithms, programming language and tools they believe will give the best results. At the end of the case study, students present the business problem and solution either via Jupyter notebooks or via a blog post.
This course provides a strong mathematical and applied introduction to Deep Learning. The module starts with the perceptron model as an oversimplified approximation of a biological neuron. We motivate the need for a network of neurons and show how they can be connected to form a Multi-Layer Perceptron (MLP). This is followed by a rigorous treatment of the back-propagation algorithm and its limitations as understood since the 1980s. Students study how modern deep learning took off with improved computational tools and data sets. We teach modern activation units (like ReLU and SELU) and how they overcome problems with the classical sigmoid and tanh units. Students learn weight initialization methods, regularization by dropout, batch normalization, etc., to ensure that deep MLPs can be trained successfully. The module teaches variants of gradient descent designed specifically for deep learning systems, such as Adam, AdaGrad and RMSProp. Students also learn AutoEncoders, VAEs and Word2Vec as unsupervised, encoding deep-learning architectures. We apply all of this foundational theory to various real-world problems using TensorFlow 2 and Keras, and students also learn how TensorFlow 2 works internally, with a specific focus on computational graph processing.
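A minimal Keras sketch ties several of these ideas together (ReLU activations, dropout regularisation, the Adam optimiser); the layer sizes and the 784-dimensional input are illustrative assumptions.

# Small MLP sketch (TensorFlow 2 / Keras); sizes are illustrative assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dropout(0.2),                    # regularisation by dropout
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])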
This course teaches students how to analyse the ways users engage with a service. This method, called product analytics, helps businesses track and analyse user data. Students will learn more deeply what is required to move a product from idea to implementation, through to launch, and then on to iterative improvements. The course teaches how to measure progress, validate or update product hypotheses, and present product learnings.
Students will also gain experience in making informed, data-driven decisions, presenting findings, and making an analytics-informed business case to win support for a product.
This course aims to build the core competency of building real-world, end-to-end ML systems and deploying them into production for a variety of problems and scenarios. Students learn about a variety of ML systems, ranging from high-throughput, low-latency internet-scale systems to compute- and energy-constrained IoT devices like smart watches. Students study the ML lifecycle and its components in detail. We also use real-world ML platforms like Google’s KubeFlow, TensorFlow Lite, and Amazon’s SageMaker to implement real systems and understand the engineering trade-offs and challenges. Students also learn relevant technologies and tools such as containerization (Docker), container orchestration (Kubernetes) and Git, which are used extensively in real-world scalable ML systems. This is a hands-on course in which we solve multiple real-world cases and discuss solutions built by various companies and organizations to give students a comprehensive understanding of varied systems and design choices.
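As one hedged illustration of serving a model in production (the model file, feature format and port are hypothetical, and Flask is only one of many possible serving frameworks), an HTTP prediction endpoint might look like this.

# Toy model-serving endpoint with Flask; "model.pkl" and the feature format
# are hypothetical, not course-provided assets.
import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)             # any scikit-learn-style estimator

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]        # e.g. [[5.1, 3.5, 1.4, 0.2]]
    return jsonify({"prediction": model.predict(features).tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)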
This course introduces learners to the advanced concepts and practical applications of Generative AI. Students will explore the underlying theories, architectures, and methodologies that drive state-of-the-art generative models. By integrating hands-on projects and case studies, participants will apply these models in real-world scenarios, enhancing both their theoretical understanding and practical proficiency. The course aims to equip learners with the necessary skills to develop, fine-tune, and evaluate generative models, fostering creativity and innovation in various domains including art, music, and content creation.
This advanced graduate class addresses a unique topic on a rotating basis in order to keep the program at the forefront of scholarly research and industry practice.
Every year the academic staff member will approve a new topic to be covered.
The bibliography will contain not less than 8 peer-reviewed articles or scholarly publications reflecting the current topic. Though the exact topic will vary, the emphasis of this module is practical, domain-specific issues in data science. Topics might include data handling, big data management systems, optimization, sparse signal recovery, principal component analysis, or deeper explorations of text mining, natural language processing, computer vision, or other topics introduced in other modules.
Often, Further Studies in Data Science and Data Analytics will extend, complicate, or otherwise deepen the topic taken on in its predecessor course, Studies in Data Science and Data Analytics, allowing students who elect this sequence to develop genuine expertise in a specific domain.
This course is designed to prepare Master’s students in Computer Science for successful transition into the professional world. Emphasizing both personal and professional development, it focuses on industry trends, job market demands, and the cultivation of soft and hard skills relevant in today's dynamic tech environment. Through hands-on experiences, interactive workshops, and expert guest lectures, students will gain insights into the expectations and challenges they'll face as Computer Science professionals. They will also develop a deeper understanding of potential career pathways, while honing skills to make them stand out in the competitive job market.
This is a project-based course, with the aim of building the required skills for creating web-based software systems. The course covers the entire lifecycle of building software projects, from requirement gathering and scope definition from a product document, to designing the architecture of the system, and all the way to delivery and maintenance of the software system.
The course covers both the frontend, i.e., building browser-based user interfaces with frontend web frameworks, and the backend, i.e., the server running an API that serves information to the frontend, backed by SQL or a similar database management system for storage.
All aspects of delivering a software project are covered, including security, user authentication and authorisation, monitoring and analytics, and project maintenance, such as using a version control system and setting up continuous integration and deployment pipelines and bug trackers.
This course provides a strong mathematical and applied introduction to Deep Learning. The module starts with the perceptron model as an oversimplified approximation of a biological neuron. We motivate the need for a network of neurons and show how they can be connected to form a Multi-Layer Perceptron (MLP). This is followed by a rigorous treatment of the back-propagation algorithm and its limitations as understood since the 1980s. Students study how modern deep learning took off with improved computational tools and data sets. We teach modern activation units (like ReLU and SELU) and how they overcome problems with the classical sigmoid and tanh units. Students learn weight initialization methods, regularization by dropout, batch normalization, etc., to ensure that deep MLPs can be trained successfully. The module teaches variants of gradient descent designed specifically for deep learning systems, such as Adam, AdaGrad and RMSProp. Students also learn AutoEncoders, VAEs and Word2Vec as unsupervised, encoding deep-learning architectures. We apply all of this foundational theory to various real-world problems using TensorFlow 2 and Keras, and students also learn how TensorFlow 2 works internally, with a specific focus on computational graph processing.
This course provides a comprehensive overview of computer vision problems and how they can be tackled using Convolutional Neural Networks (CNNs). Students start with classical image processing operations like edge detection, convolution, shape detectors and colour space conversions. This is followed by a foundational understanding of deep convolutional neural networks and how their training and evaluation work. We introduce CNN-specific layers like pooling and upsampling layers, as well as data augmentation techniques that are very helpful for image-related problems. This is followed by a deep dive into the internals of popular CNN architectures like AlexNet, VGGNet and ResNet. Students also learn how to use these architectures practically for transfer learning. Students will study how various computer-vision tasks like image segmentation, image generation, object detection and localization, and contrastive learning can be performed using state-of-the-art algorithms for each task. Most of these techniques are studied directly from the original research papers and the open-source code provided by the authors, and students implement some of these algorithms from scratch in this course.
This course focuses on modelling sequences (text, music, time-series, genes) using deep-learning models. We start with the simple Recurrent Neural Network and its limitations with long sequences. Students then learn LSTMs and GRUs, which can handle significantly longer sequences of text, music, gene and time-series data, and study variations such as bi-directional LSTMs and encoder-decoder architectures. This is followed by a detailed study of the attention mechanism and Transformer-based models, which are currently the state of the art for NLP and sequence modelling. The module teaches encoder-decoder Transformers, BERT and its variants, and the GPT-1, GPT-2 and GPT-3 models from architectural, mathematical and practical viewpoints. Students learn to implement many of these complex models from scratch (using TensorFlow 2 and Keras) to gain a deeper understanding of how they work internally. Students will also study popular deep-learning applications in NLP such as parts-of-speech tagging, question-answering systems, conversational engines (chatbots) and low-latency semantic search, along with cutting-edge models and code implementations for each of these problems.
This course helps learners understand techniques and algorithms to visualize, analyse and understand high-dimensional data, which is very common in Data Science and ML. The module starts with linear algebraic methods like Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) for obtaining linear projections of high-dimensional data. This is followed by more advanced, nonlinear, state-of-the-art techniques like t-SNE and UMAP for visualizing high-dimensional data. Each technique is covered in full mathematical detail from first principles and applied to real-world datasets from NLP, genomics and the internet. Students will also study how PCA and SVD relate to general matrix factorization techniques. To analyse and understand high-dimensional unlabelled data, students learn clustering techniques like K-Means, Gaussian Mixture Models, Hierarchical Clustering and DBSCAN. The module shows how some of these techniques are mathematically related to matrix factorization. Students also study outlier detection techniques based on density, proximity, factorization and cluster analysis.
This course introduces learners to the advanced concepts and practical applications of Generative AI. Students will explore the underlying theories, architectures, and methodologies that drive state-of-the-art generative models. By integrating hands-on projects and case studies, participants will apply these models in real-world scenarios, enhancing both their theoretical understanding and practical proficiency. The course aims to equip learners with the necessary skills to develop, fine-tune, and evaluate generative models, fostering creativity and innovation in various domains including art, music, and content creation.
This course aims to build the core competency of building real-world, end-to-end ML systems and deploying them into production for a variety of problems and scenarios. Students learn about a variety of ML systems, ranging from high-throughput, low-latency internet-scale systems to compute- and energy-constrained IoT devices like smart watches. Students study the ML lifecycle and its components in detail. We also use real-world ML platforms like Google’s KubeFlow, TensorFlow Lite, and Amazon’s SageMaker to implement real systems and understand the engineering trade-offs and challenges. Students also learn relevant technologies and tools such as containerization (Docker), container orchestration (Kubernetes) and Git, which are used extensively in real-world scalable ML systems. This is a hands-on course in which we solve multiple real-world cases and discuss solutions built by various companies and organizations to give students a comprehensive understanding of varied systems and design choices.
This course aims to build a strong foundational knowledge of the data structures (DS) used extensively in computing. The module starts by introducing time and space complexity notation and estimation for code snippets, which helps students make trade-offs between data structures while solving real-world computational problems. The module introduces the most widely used basic data structures: dynamic arrays, multi-dimensional arrays, lists, strings, hash tables, binary trees, balanced binary trees, priority queues and graphs. It discusses multiple implementation variations for each of these data structures along with the space and time trade-offs of each implementation. Students implement these data structures from scratch to gain a solid understanding of their inner workings, and are also introduced to the built-in data structures available in various programming languages and libraries such as Python/NumPy, C++ STL, Java and JavaScript. Students solve real-world problems in which they must choose an optimal data structure for the computational problem at hand.
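As a small illustration of using a library-provided structure (the tasks are hypothetical, not an assignment), the sketch below uses Python's built-in heapq module as a priority queue.

# Priority queue via Python's heapq; push is O(log n), pop returns the smallest priority first.
import heapq

tasks = []
heapq.heappush(tasks, (2, "write report"))        # (priority, task)
heapq.heappush(tasks, (1, "fix production bug"))
heapq.heappush(tasks, (3, "refactor tests"))

while tasks:
    priority, task = heapq.heappop(tasks)
    print(priority, task)
# 1 fix production bug / 2 write report / 3 refactor tests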
This course is a follow-up to Introduction to Problem-Solving Techniques: Part 1, and as part of their academic planning process with Woolf staff, students will ordinarily take that course first.
Part 2 deepens the approach to data structures with topics such as stacks, queues, linked lists, and trees, and discusses in detail the real-world applications of each structure and its comparative strengths and limitations (i.e., when to use a given data structure and when not to). The course also covers hashing techniques along with recursion and subset problems, and includes rigorous homework and assignments as it introduces more than four data structures.
By the end of this course, students will be able to choose the data structure best suited to a given problem, optimizing both time and space complexity.
This advanced graduate class addresses a unique topic on a rotating basis in order to keep the program at the forefront of scholarly research and industry practice.
Every year the academic staff member will approve a new topic to be covered.
The bibliography will contain not less than 8 peer-reviewed articles or scholarly publications reflecting the current topic. Though the exact topic will vary, the emphasis of this module is practical, domain-specific issues in data science. Topics might include data handling, big data management systems, optimization, sparse signal recovery, principal component analysis, or deeper explorations of text mining, natural language processing, computer vision, or other topics introduced in other modules.
Often, Further Studies in Data Science and Data Analytics will extend, complicate, or otherwise deepen the topic taken on in its predecessor course, Studies in Data Science and Data Analytics, allowing students who elect this sequence to develop genuine expertise in a specific domain.
This course is designed to prepare Master’s students in Computer Science for successful transition into the professional world. Emphasizing both personal and professional development, it focuses on industry trends, job market demands, and the cultivation of soft and hard skills relevant in today's dynamic tech environment. Through hands-on experiences, interactive workshops, and expert guest lectures, students will gain insights into the expectations and challenges they'll face as Computer Science professionals. They will also develop a deeper understanding of potential career pathways, while honing skills to make them stand out in the competitive job market.
This is a project-based course, with the aim of building the required skills for creating web-based software systems. The course covers the entire lifecycle of building software projects, from requirement gathering and scope definition from a product document, to designing the architecture of the system, and all the way to delivery and maintenance of the software system.
The course covers both the frontend, i.e., building browser-based user interfaces with frontend web frameworks, and the backend, i.e., the server running an API that serves information to the frontend, backed by SQL or a similar database management system for storage.
All aspects of delivering a software project are covered, including security, user authentication and authorisation, monitoring and analytics, and project maintenance, such as using a version control system and setting up continuous integration and deployment pipelines and bug trackers.
This course builds upon the introductory JavaScript course to acquaint students with popular, modern frameworks for building the front end. We focus on three widely used frameworks/libraries: React.js, jQuery and AngularJS. We start with React.js, the most popular and advanced of the three. Students learn about components and data flow in order to architect real-world front ends with React.js, supported by multiple code examples and code walkthroughs from scratch. We also dive into React Native, a cross-platform framework for building native mobile and smart-TV apps using JavaScript, which lets students build applications for various platforms using only JavaScript. jQuery, one of the oldest and most widely used JavaScript libraries, is covered in detail, with a specific focus on how it simplifies event handling, AJAX, HTML DOM tree manipulation and CSS animations. We also provide a hands-on introduction to AngularJS for architecting model-view-controller (MVC) based dynamic web pages.
This is a foundational course on building server-side (or backend) applications using popular JavaScript runtime environments like Node.js. Students learn event-driven programming for building scalable backends for web applications. The module teaches various aspects of Node.js such as setup, the package manager, client-server programming, and connecting to various databases and REST APIs. Most of these concepts are covered hands-on, with real-world examples and applications built from scratch using Node.js on Linux servers. The course also introduces Linux server administration and scripting, with a special focus on web development and networking, and students learn to use Linux monitoring tools (like Monit) to track the health of their servers. The module also introduces Express.js, a popular lightweight framework for Node.js applications. Given the practical nature of this course, students build actual website backends via assignments/projects for e-commerce, online learning and/or photo-sharing.
A distributed system is an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate to perform a single task or a small set of related tasks.
Goals of a Distributed System:
● Transparency -> the end user does not need to know what lies behind the system or how it works internally.
● Scalability -> the system's ability to handle growth in users, data, or load.
● Availability -> the system's uptime, i.e., the fraction of time it is able to serve requests.
The module will carefully examine three case studies, with attention to such topics as:
● Basics of High-Level System Design and Consistent Hashing
● Caching
● CAP Theorem
● Replication and Master-Slave
● NoSQL
● Differences between SQL and NoSQL
● Multi Master
● Apache Zookeeper & Apache Kafka
● Case Study on ElasticSearch
● AWS S3 and Quad Trees
● Design Distributed Crawler
● Microservices and Containerisation
● Hotstar & IRCTC System design
Data is the fuel driving all major organisations, and in this course we help students understand how to process data at scale: from the fundamentals of distributed processing, to designing data warehouses, to writing ETL (Extract, Transform, Load) pipelines that process batch and streaming data. The course provides a comprehensive view of the complete data engineering lifecycle.
This course introduces learners to the advanced concepts and practical applications of Generative AI. Students will explore the underlying theories, architectures, and methodologies that drive state-of-the-art generative models. By integrating hands-on projects and case studies, participants will apply these models in real-world scenarios, enhancing both their theoretical understanding and practical proficiency. The course aims to equip learners with the necessary skills to develop, fine-tune, and evaluate generative models, fostering creativity and innovation in various domains including art, music, and content creation.
Every organisation builds products to solve the pain points of its customers. Product managers are a critical part of an organisation: they make sure that evolving customer needs and market trends are observed and converted into delightful solutions that help the business achieve its outcomes. In this course, students will gain a fundamental understanding of product management practices, giving them a comprehensive view of the complete product management life cycle.
This course is designed to prepare Master’s students in Computer Science for successful transition into the professional world. Emphasizing both personal and professional development, it focuses on industry trends, job market demands, and the cultivation of soft and hard skills relevant in today's dynamic tech environment. Through hands-on experiences, interactive workshops, and expert guest lectures, students will gain insights into the expectations and challenges they'll face as Computer Science professionals. They will also develop a deeper understanding of potential career pathways, while honing skills to make them stand out in the competitive job market.
This is a project-based course, with the aim of building the required skills for creating web-based software systems. The course covers the entire lifecycle of building software projects, from requirement gathering and scope definition from a product document, to designing the architecture of the system, and all the way to delivery and maintenance of the software system.
The course covers both the frontend, i.e., building browser-based user interfaces with frontend web frameworks, and the backend, i.e., the server running an API that serves information to the frontend, backed by SQL or a similar database management system for storage.
All aspects of delivering a software project are covered, including security, user authentication and authorisation, monitoring and analytics, and project maintenance, such as using a version control system and setting up continuous integration and deployment pipelines and bug trackers.
This course aims to build the core competency of building real-world, end-to-end ML systems and deploying them into production for a variety of problems and scenarios. Students learn about a variety of ML systems, ranging from high-throughput, low-latency internet-scale systems to compute- and energy-constrained IoT devices like smart watches. Students study the ML lifecycle and its components in detail. We also use real-world ML platforms like Google’s KubeFlow, TensorFlow Lite, and Amazon’s SageMaker to implement real systems and understand the engineering trade-offs and challenges. Students also learn relevant technologies and tools such as containerization (Docker), container orchestration (Kubernetes) and Git, which are used extensively in real-world scalable ML systems. This is a hands-on course in which we solve multiple real-world cases and discuss solutions built by various companies and organizations to give students a comprehensive understanding of varied systems and design choices.
This course focuses both on architectural design and on practical, hands-on learning of the most widely used cloud services. The module extensively uses Amazon Web Services (AWS) to show real-world code examples of various cloud services, while covering the core concepts and architectures in a platform-agnostic manner so that students can easily translate these learnings to other cloud platforms (like Azure and GCP). The module starts with virtualization and how virtualized compute instances are created and configured. Students also learn how to auto-scale applications using load balancers and build fault-tolerant applications across a geographically distributed cloud. As relational databases are widely used in most enterprises, students learn how to migrate and scale these databases (both vertically and horizontally) on the cloud while ensuring enterprise-grade security. Virtual private clouds (VPCs) enable logically isolated virtual networks of compute resources; students learn to set up a VPC using virtualized compute servers on AWS and cover the basics of networking along the way. Finally, students learn the architecture and practical aspects of distributed object storage and how it enables low-latency, highly available data storage on the cloud.
In this module we will discuss general approaches to the construction of efficient solutions to problems.
Such methods are of interest because:
● They provide templates suited to solving a broad range of diverse problems.
● They can be translated into common control and data structures provided by most high-level languages.
● The temporal and spatial requirements of the resulting algorithms can be precisely analyzed.
This course will provide a solid foundation in the design and analysis of algorithms. In particular, upon successful completion of this course, students will be able to understand, explain and apply key algorithmic concepts and principles, which might include:
● Greedy algorithms (Activity Selection, 0-1 Knapsack Problem, Fractional Knapsack Problem)
● Dynamic programming (Longest Common Subsequence, 0-1 Knapsack Problem)
● Minimum Spanning Trees (Prim’s Algorithm, Kruskal’s Algorithm)
● Graph Algorithms (Dijkstra’s Shortest Path Algorithm, Bipartite Graphs, Minimum Vertex Cover)
Although more than one technique may be applicable to a specific problem, it is often the case that an algorithm constructed by one approach is clearly superior to equivalent solutions built using alternative techniques. This module will help students assess these choices.