Five Day FDP on Data Analytics in R
Indian Institute of Information Technology, Vadodara
14-18 December, 2020
AICTE Training and Learning (ATAL) Academy
Data analytics refers to the set of quantitative and qualitative approaches for deriving insights from data. The use of statistical methods in manufacturing, food product development, computer software, energy sources, pharmaceuticals, and many other areas involves gathering information or scientific data. Gathering data is nothing new; it has been done for well over a thousand years. Data have been collected, summarized, reported, and stored for perusal. However, there is a profound distinction between the collection of scientific information and inferential statistics, and it is the latter that has rightly received attention in recent decades. This program aims to equip participants with the tools needed to prepare, visualize, and model data, and then to carry out analysis, inference, and prediction.
About the Institute
Indian Institute of Information Technology Vadodara was established in 2013 as a public-private partnership among the Government of India, the Government of Gujarat, Tata Consultancy Services, Gujarat State Fertilizer Company and Gujarat Energy Research and Management Institute. The institute has been declared an Institute of National Importance by an Act of Parliament. The main objective of its establishment is to set up a model of education that produces best-in-class human resources in IT and harnesses the multidimensional facets of IT in various domains.
In this course, participants will learn data analytics techniques for analyzing datasets and drawing statistically valid conclusions. On completing the course, participants are expected to be able to: (a) write scripts in R for data analytics and understand the principles from which the statistics are derived; (b) apply descriptive analytics to extract patterns and visualize data; and (c) apply predictive analytics methods and use R for analysis and forecasting.
The course includes hands-on laboratory sessions. Participants will use R and Python to solve the analytics tasks assigned in each laboratory session. The laboratory exercises are designed to complement the theory sessions, and most will be presented as case studies. Participants will have a better learning experience if they bring their own laptops.
Introduction to R: Getting started with R; Language features: functions, assignment, arguments and types; Language features: binding and arrays; Error handling; Numeric, statistical and character functions; Data frames and I/O; Lists
Introduction to Statistics for Data Analytics: Probability, Combinatorics, Expectations, Descriptive statistics, Statistical distributions, Scatter plot, Stem-and-leaf plot, Histogram, Box plot and Five-point summary, Hypothesis testing and Estimation, Goodness of fit
Graphical Models: Bayesian Networks, Hidden Markov Model, Markov Network (Markov Random Field)
Linear Regression, Multiple Linear Regression and Logistic Regression: Fitting a line to data; Outliers, influence, and robust regression; Standard error and confidence interval; Interpretation of model coefficients; Multiple regression assumptions, diagnostics, and efficacy measures
Time Series Analysis: Characteristics of time series, Autocorrelation, Autoregressive Integrated Moving Average (ARIMA) models, Forecasting and Estimation
Prof. Sarat Kumar Patra
Dr. Pratik Shah
Experts from Industry and Academia
Prof. Suman Kumar Mitra
Dr. Pratik Shah
Dr. Ratnik Gandhi
Asim Rama Praveen