Module 1: Introduction to Python:
- Concepts of Python programming
- Configuration of Development Environment
- Variables and Strings
- Functions, Control Flow and Loops
- Tuples, Lists and Dictionaries
- Standard Libraries
Module 2: Data Science Fundamentals:
- Introduction to Data Science
- Real-world use cases of Data Science
- Walkthrough of data types
- Data Science project lifecycle
Module 3: Introduction to NumPy:
- Basics of NumPy Arrays
- Mathematical operations in NumPy
- NumPy Array manipulation
- NumPy Array broadcasting
Module 4: Data Manipulation with Pandas:
- Data structures in Pandas: Series and DataFrames
- Data cleaning in Pandas
- Data manipulation in Pandas
- Handling missing values in datasets
- Hands-on: Implement NumPy arrays and Pandas DataFrames
Module 5: Data Visualization in Python:
- Plotting basic charts in Python
- Data visualization with Matplotlib
- Statistical data visualization with Seaborn
- Hands-on: Coding sessions using the Matplotlib and Seaborn packages
Module 6: Exploratory Data Analysis:
- Introduction to Exploratory Data Analysis (EDA) steps
- Plots to explore the relationship between two variables
- Histograms and box plots to explore a single variable
- Heat maps and pair plots to explore correlations
- Perform EDA to explore survival using the Titanic dataset
Module 7: Introduction to Machine Learning:
- What is Machine Learning?
- Use Cases of Machine Learning
- Types of Machine Learning: supervised and unsupervised methods
- Machine Learning workflow
Module 8: Linear Regression:
- Introduction to Linear Regression
- Use cases of Linear Regression
- How to fit a Linear Regression model
- Evaluating and interpreting results from Linear Regression models
- Predict bike-sharing demand
Module 9: Logistic Regression:
- Introduction to Logistic Regression
- Logistic Regression use cases
- Understand the use of odds and the logit function to perform logistic regression
- Predicting credit card default cases
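
To illustrate the odds and logit topic in Module 9, here is a minimal sketch using only the Python standard library. The function names `odds`, `logit` and `sigmoid` are illustrative choices, not part of the course material.

```python
import math

def odds(p):
    """Odds of an event with probability p, where 0 < p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds: the natural logarithm of the odds."""
    return math.log(odds(p))

def sigmoid(z):
    """Inverse of logit: maps a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8                       # illustrative probability
print(odds(p))                # about 4: the event is roughly four times as likely as not
print(logit(p))               # the log-odds that a logistic regression models linearly
print(sigmoid(logit(p)))      # recovers approximately 0.8
```

Logistic regression fits a linear model to the log-odds, and the sigmoid converts that linear output back into a probability.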
Module 10: Decision Trees & Random Forest:
- Introduction to Decision Trees & Random Forest
- Understanding the criteria (Entropy & Information Gain) used in Decision Trees
- Using Ensemble methods in Decision Trees
- Applications of Random Forest
- Predict passenger survival using the Titanic dataset
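
As a rough illustration of the entropy and information gain criteria listed in Module 10, the sketch below computes both for a toy split. The helper names and the four-label sample are hypothetical and chosen only for demonstration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical toy data: a perfectly mixed parent split into two pure children.
parent = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]

print(entropy(parent))                  # 1 bit for a 50/50 class mix
print(information_gain(parent, split))  # the split removes all uncertainty
```

A decision tree built on this criterion would prefer the candidate split with the highest information gain at each node.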
Module 11: Model Evaluation Techniques:
- Introduction to evaluation metrics and model selection in Machine Learning
- Importance of the confusion matrix for predictions
- Measures of model evaluation: sensitivity, specificity, precision, recall & F-score
- Use the AUC-ROC curve to select the best model
- Applying model evaluation techniques to Titanic dataset
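
The evaluation measures named in Module 11 all derive from the four cells of a binary confusion matrix. The sketch below shows one way to compute them. The function name and the counts are made up for illustration, and the F-score shown is the F1 variant.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # also called recall or true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # share of positive predictions that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
    }

# Hypothetical counts, not taken from any course dataset.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and recall are the same quantity. They are listed separately because different fields use different names.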
Module 12: Dimensionality Reduction using PCA:
- Unsupervised Learning: Introduction to Curse of Dimensionality
- What is dimensionality reduction?
- Technique used in PCA to reduce dimensions
- Applications of Principal Component Analysis (PCA)
- Optimize model performance using PCA on SPECTF heart data
Module 13: K-Nearest Neighbours (KNN):
- Introduction to KNN
- Calculate neighbours using distance measures
- Find the optimal value of K in the KNN method
- Advantages & disadvantages of KNN
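
The distance-based neighbour search in Module 13 can be sketched in a few lines. This version assumes Euclidean distance and a simple majority vote. The function names and the toy training points are hypothetical, and the course may use other distance measures.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours.

    `train` is a list of (point, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

print(knn_predict(train, (1.1, 1.0), k=3))
print(knn_predict(train, (5.1, 5.0), k=3))
```

Choosing K then amounts to trying several values and comparing accuracy on held-out data.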
Module 14: Naive Bayes Classifier:
- Introduction to Naive Bayes Classification
- Refresher on Probability theory
- Applications of Naive Bayes Algorithm in Machine Learning
- Classify spam emails based on probability
Module 15: K-means Clustering:
- Introduction to K-means clustering
- Decide clusters by adjusting centroids
- Find the optimal value of k in K-means
- Understand applications of clustering in Machine Learning
- Segment hands in Poker data and segment flower species in Iris flower data
Module 16: Support Vector Machines:
- Introduction to SVM
- Determine decision boundaries using support vectors
- Identify the hyperplane in SVM
- Applications of SVM in Machine Learning
- Predicting wine quality using SVM