Data Science – Working with Data

Course Level 3: Advanced

Estimated Study Time: 2-3 hours

This course will introduce you to methods for preparing data, how to differentiate between continuous and categorical variables, and what quantization and scaling involve. The course begins by introducing you to the data flow in Azure ML. You will learn about batch and real-time processing and the different types of joins you can use on your data. You will also learn about the R and Python programming languages and how they can be used in a data science project.

Next, you will be introduced to data sampling and preparation. You will learn about continuous and categorical variables, and what quantization can do for your data. The course will teach you about data munging, which is the process of manually converting or mapping data from one “raw” form into another format, and why it is the most time-consuming part of a data science project. You will also learn about handling errors and outliers in your project. Finally, you will learn about scaling using Python, R, or the Azure ML module for scaling.
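As a rough illustration of two of the ideas named above, the following Python sketch shows what quantization and scaling can look like in practice. It is not taken from the course materials: the function names (`quantize`, `min_max_scale`, `z_score_scale`), the bin edges, and the sample ages are all hypothetical, and only the Python standard library is used.

```python
from statistics import mean, pstdev


def quantize(values, edges, labels):
    """Turn a continuous variable into a categorical one by binning.

    `edges` are ascending bin boundaries; `labels` needs one more entry
    than `edges`. Both are illustrative choices, not course-defined values.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    result = []
    for v in values:
        # The bin index is the number of edges the value has reached.
        index = sum(1 for e in edges if v >= e)
        result.append(labels[index])
    return result


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Centre values on their mean and divide by the population std. deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical sample data: a continuous "age" variable.
ages = [18, 25, 40, 67]
age_groups = quantize(ages, edges=[30, 60], labels=["young", "middle", "senior"])
scaled_ages = min_max_scale(ages)
standardized_ages = z_score_scale(ages)
```

Here `age_groups` holds a categorical version of the continuous `ages` variable, while `scaled_ages` and `standardized_ages` put the same values on comparable numeric ranges. In R or Azure ML the equivalent steps would use those tools' own functions or modules.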

Prerequisites: To complete this course successfully, you need a basic knowledge of mathematics, including linear algebra. Some programming experience, ideally in R or Python, is also assumed, and you will need to have completed the previous course, Introduction to Data Science.

You must be a registered member of our website to access this course.

Course Content