DATAMARCOS

Data Science & Analytics

The Data Science and Analytics program is delivered in approximately 100 hours over sixteen weeks.
In this course, Upgradata focuses not only on the technical aspects but also on the behavioural aspects, giving a holistic approach to upskilling for self-development.

Course Content

• What is Analytics and Data Science?
• Overview of Data Science and Analytics
• Why is Analytics becoming popular now?
• Application of Analytics in business
• Analytics vs. Data Warehousing and MIS Reporting
• Various Terminologies in Analytics
• Various Analytics Methodologies
• How businesses are using the power of Analytics?
• Various Analytics tools and their usage 

• Define Python
• Overview of Python
• Understand why Python is Popular
• Setup Python Environment
• Python File I/O Functions
• Numbers
• Strings and related operations
• Tuple - properties, related operations, compared with list
• List - properties, related operations
• Dictionary - properties, related operations
• Set - properties, related operations

• Understand Python Standard Libraries
• Packages and Modules - Modules, Import Options, sys.path
• Functions - Syntax, Arguments, Keyword Arguments, Return Values
• Function Parameters
• Global Variables
• Variable Scope and Returning Values
• Lambda - Features, Syntax, Options, Compared with the Functions
• Sorting - Sequences, Dictionaries, Limitations of Sorting
• Errors and Exceptions - Types of Issues, Remediation
• Object Oriented Concepts
• Modules Used in Python
• The Import Statements
• Module Search Path
• Package Installation Ways
• Errors and Exception Handling
• Handling Multiple Exceptions
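
As a brief illustration of two topics in this module, sorting with a lambda key and handling multiple exceptions, the sketch below may help. The function name `safe_ratio` and the sample data are hypothetical and are not taken from the course material.

```python
# Hypothetical sample data: (name, score) pairs.
scores = [("asha", 82), ("ravi", 95), ("meena", 78)]

# Sort by score, highest first, using a lambda as the sort key.
ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None if the division is not possible."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Raised when denominator is 0.
        return None
    except TypeError:
        # Raised when an operand is not a number, e.g. a string.
        return None
```

Each `except` clause handles one exception type, so each failure mode can be given its own remediation.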

• What is Numpy?
• Importing Numpy
• Numpy overview
• Numpy Array creation and basic operation
• Numpy universal function
• Selecting and retrieving data
• Data slicing
• Iterating Numpy Data
• Shape Manipulation
• Stacking and Splitting Arrays
• Copies and Views: no copy, shallow copy, deep copy
• Indexing: Arrays of Indices, Boolean Arrays
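
The difference between "no copy", "shallow copy" (view) and "deep copy" can be sketched as follows. This is a minimal example assuming NumPy is installed; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(6)          # array with the values 0..5

alias = a                 # no copy: a second name for the same array object
view = a[1:4]             # view (shallow copy): new array object, shared data
deep = a.copy()           # deep copy: independent data

view[0] = 100             # writes through to a[1]
deep[0] = -1              # does not affect a

mask = a > 3              # Boolean array used as an index
selected = a[mask]        # Boolean indexing returns a copy, not a view
picked = a[[0, 2]]        # indexing with a list of indices also returns a copy
```

Basic slicing shares memory with the original array, whereas indexing with an array of indices or a Boolean array produces new data.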

• Selecting data from Pandas DataFrame
• Slicing and dicing using Pandas
• GroupBy/Aggregate
• Strings with Pandas
• Cleaning up messy data with Pandas
• Dropping Entries
• Selecting Entries
• Read & write data from text/CSV files into arrays and vice versa
• Create Series and DataFrames in Pandas
• Data structures & index operations in Pandas
• Importing and exporting data
• Indexing and slicing of data structures in Pandas
• Reading and Writing data from Excel/CSV formats into Pandas

• Basic Functionalities of a data object
• Merging of Data objects
• Concatenation of data objects
• Types of Joins on data objects
• Exploring a Dataset
• Analysing a dataset

• Anatomy of a Matplotlib Plot
• Matplotlib basic plots and their containers
• A Matplotlib figure, its components and properties
• Axes and other graphical objects
• Pylab and Pyplot
• Data for Matplotlib Plots
• What is a Subplot?
• Modifying size of figures
• Plotting routines with pyplot
• Customizing your pyplot
• Deleting an Axes
• Setting up Plot Title, Axes Labels, Legend, Layout
• Showing, Saving and Closing your Plot
• Save a Plot to an image file and pdf file
• Using cla(), clf() or close()

• Events and their Probabilities
• Rules of Probability
• Conditional Probability and Independence
• Distribution of a Random Variable
• Moment Generating Functions
• Central Limit Theorem
• Expectation
• Variance

• Measures of Central tendency
• Measures of Dispersion
• Skewness and Kurtosis
• Sample and Population
• Formulate the Hypothesis
• Select an Appropriate Test
• Choose level of Significance
• Calculate Test Statistics
• Determine the Probability
• Compare the Probability and Make Decision
• One Sample T-Test
• Two Independent Samples Tests
• Paired T-test
• Proportional Test
• Non-Parametric One Sample Test
• Chi Square Test
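
The hypothesis-testing steps listed above (formulate, select a test, choose a significance level, calculate, determine the probability, decide) can be sketched with an exact binomial test for a proportion. This test is chosen here only because it needs nothing beyond the standard library; the sample counts and the helper name `binom_pmf` are hypothetical.

```python
from math import comb

# Step 1: hypotheses. H0: success probability p = 0.5; H1: p != 0.5.
p0 = 0.5
# Step 2: selected test (an assumption for this sketch): exact binomial test.
# Hypothetical sample: 16 successes out of 20 trials.
n, successes = 20, 16
# Step 3: level of significance.
alpha = 0.05


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Steps 4-5: the test statistic is the success count; the two-sided p-value
# sums the probabilities of all outcomes no more likely than the observed one.
observed = binom_pmf(successes, n, p0)
p_value = sum(
    binom_pmf(k, n, p0)
    for k in range(n + 1)
    if binom_pmf(k, n, p0) <= observed * (1 + 1e-9)
)

# Step 6: compare the probability with alpha and decide.
reject_h0 = p_value < alpha
```

With these hypothetical counts the p-value is about 0.012, which is below 0.05, so H0 would be rejected.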

• Necessary Machine Learning Python libraries
• What is Machine Learning?
• Machine Learning Use-Cases
• Machine Learning Process Flow
• Machine Learning Categories
• List the categories of Machine Learning
• Illustrate Supervised Learning Algorithms
• Identify and recognize machine learning algorithms around us
• Linear regression
• Logistic Regression

• Understand what Supervised Learning is
• Define Classification
• What is Classification and its use cases?
• What is a Decision Tree?
• Algorithm for Decision Tree Induction
• Creating a Perfect Decision Tree
• Confusion Matrix
• What is Random Forest?
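
For the Confusion Matrix topic, a minimal pure-Python sketch for a binary classifier might look like this. The labels are made-up data, and the layout (rows = actual class, columns = predicted class) is one common convention assumed for the example.

```python
# Hypothetical true and predicted labels for a binary classifier (1 = positive).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))

# Count each cell of the 2x2 confusion matrix.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

# Rows are the actual class (0, 1); columns are the predicted class (0, 1).
matrix = [[tn, fp], [fn, tp]]

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
```

Accuracy, precision and recall are all derived from the four cell counts, which is why the confusion matrix is usually the first thing to inspect when evaluating a classifier.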

• Introduction to Dimensionality
• Why Dimensionality Reduction
• PCA
• Scaling
• Factor Analysis
• Scaling dimensional model
• LDA

• What is Naïve Bayes?
• How does Naïve Bayes work?
• Implementing Naïve Bayes Classifier
• What is a Support Vector Machine?
• Illustrate how a Support Vector Machine works
• Hyperparameter Optimization
• Grid Search vs Random Search
• Implementation of Support Vector Machine for Classification 

• What is Clustering & its Use Cases?
• What is K-means Clustering?
• How does the K-means algorithm work?
• How to do optimal clustering
• What is C-means Clustering?
• What is Hierarchical Clustering?
• How does Hierarchical Clustering work?
• Types of Boosting Algorithms
• Adaptive Boosting
• Cross Validation
• AdaBoost
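
To illustrate how the K-means algorithm listed above works, here is a minimal sketch of its assign-then-update loop on one-dimensional data. The function name `kmeans_1d`, the data and the initial centroids are hypothetical; real projects would normally use a library implementation.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Run the K-means loop on 1-D data; return (centroids, labels)."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: label each point with its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
            for x in points
        ]
        # Update step: move each centroid to the mean of its points.
        new_centroids = []
        for j in range(len(centroids)):
            members = [x for x, lab in zip(points, labels) if lab == j]
            # Keep an empty cluster's centroid where it is.
            new_centroids.append(
                sum(members) / len(members) if members else centroids[j]
            )
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels


# Hypothetical data with two visible groups, and two initial centroids.
data = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, labels = kmeans_1d(data, centroids=[1.0, 2.0])
```

The result depends on the initial centroids, which is one reason the choice of starting points and of the number of clusters matters for optimal clustering.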
