Data Science with R and Python Course Details
With artificial intelligence (AI) continuing to be a prominent buzzword in 2019, the race to implement AI and machine learning in products and services across every industry has caused a job boom in the field. Most organizations across the globe have already begun to see the capabilities of AI, using it to enhance human intelligence and gain real value from their data.
The global machine learning (ML) market is estimated to grow from $1.4 billion in 2017 to $8.8 billion by 2022, and AI is projected to create 2.3 million related jobs by 2020. AI will define the next generation of software solutions: human-like capabilities such as understanding natural language, speech, and vision, and making inferences from knowledge, will extend software beyond the app.

Course Curriculum
Basic Probability and Terms
Events and their Probabilities | Rules of Probability | Conditional Probability and Independence | Permutations and Combinations | Bayes' Theorem | Descriptive Statistics | Compound Probability
Probability Distributions
Types of Distributions | Functions of Random Variables | Probability Distribution Graphs | Confidence Intervals.
Data Transformations and Quality Analysis
Merge, Rollup, Transpose and Append | Missing Value Analysis and Treatment | Outlier Analysis and Treatment
Exploratory Data Analysis
Summarizing and Visualizing the Important Characteristics of Data | Hypothesis Testing | Visualizations | Univariate and Bivariate Analysis | Crosstabs, Correlation
Linear Regression
Implementing Simple & Multiple Linear Regression with Python | Making Sense of Result Parameters | Model Validation | Handling Other Issues/Assumptions in Linear Regression: Handling Outliers, Categorical Variables, Autocorrelation, Multicollinearity, Heteroskedasticity | Prediction and Confidence Intervals | Use Cases
Logistic Regression
Implementing Logistic Regression with Python | Making Sense of Result Parameters: Wald Test, Likelihood Ratio Test Statistic, Chi-Square Test | Goodness of Fit Measures | Model Validation: Cross Validation, ROC Curve, Confusion Matrix | Use Cases
Decision Trees
Implementing Decision Trees using Python | Homogeneity | Entropy and Information Gain | Gini Index | Standard Deviation Reduction | Visualizing & Pruning a Tree | Implementing Random Forests using Python | Random Forest Algorithm | Important Hyperparameters of Random Forest for Tuning the Model | Variable Importance | Out-of-Bag Errors
Pandas
Introduction to Pandas | IO Tools | Basics of NumPy | NumPy Functions | Pandas – Series and DataFrames
Scikit-learn
Introduction to Scikit-learn | Load Data into Scikit-learn | Run Machine Learning Algorithms for Both Supervised and Unsupervised Learning | Supervised Methods: Classification & Regression | Unsupervised Methods: Clustering, Gaussian Mixture Models | Decide What’s the Best Model for Every Scenario
3 Projects
Linear Regression + Logistic Regression + Decision Trees
Keras
Keras for Classification and Regression in Typical Data Science Problems | Setting up Keras | Different Layers in Keras | Creating a Neural Network | Training Models and Monitoring | Artificial Neural Network
TensorFlow
Introducing TensorFlow | Neural Networks using TensorFlow | Debugging and Monitoring | Convolutional Neural Networks | Unsupervised Learning
2 Projects
ANN + CNN
Introduction to R
What is R? | What is Open Source? | Capabilities of R | GUI for R | R IDE – RStudio | Using R
Programming in R
Data Types | Operators in R | Data Input and Output | R Data Frames | R Statistics – Mean, Median, Mode etc. | Data Manipulation in R – Counting, Merging, Append, Sort, Subset, Filter, New Variable Creation etc. | R Logical Statements – If/Else, Loops etc. | Plotting – Graphs and Charts | Packages in R – Details of the Most Commonly Used Packages | Functions in R (High Level) | R Best Practices
Introduction to Statistics
What is Statistics? | Data Types | Qualitative vs. Quantitative | Basic Operations Based on Data Type | Variables | Measurement Scales | Measures of Variance | Measures of Central Tendency | Correlation vs. Causation (Correlational vs. Experimental Research) | Sampling – Usage of Sampling | Distributions | Central Limit Theorem | Hypothesis Testing | Types of Hypothesis Testing | Introduction to ANOVA and Basics of Regression/Classification
Linear Regression
Implementing Simple & Multiple Linear Regression with R | Making Sense of Result Parameters | Model Validation | Handling Other Issues/Assumptions in Linear Regression: Handling Outliers, Categorical Variables, Autocorrelation, Multicollinearity, Heteroskedasticity | Prediction and Confidence Intervals | Use Cases.
Logistic Regression
Implementing Logistic Regression with R | Making Sense of Result Parameters: Wald Test, Likelihood Ratio Test Statistic, Chi-Square Test | Goodness of Fit Measures | Model Validation: Cross Validation, ROC Curve, Confusion Matrix | Use Cases.
Decision Trees
Implementing Decision Trees using R | Homogeneity | Entropy and Information Gain | Gini Index | Standard Deviation Reduction | Visualizing & Pruning a Tree | Implementing Random Forests using R | Random Forest Algorithm | Important Hyperparameters of Random Forest for Tuning the Model | Variable Importance | Out-of-Bag Errors
3 Projects
Linear Regression + Logistic Regression + Decision Trees