CSE DS 4 1 SEM CS Syllabus - UG - R20
2020 – 21
Note:
1. Two NPTEL courses of eight-week duration, covering a total of 4 credits (offered by the CSE
Department only). Students can register at any time after the completion of II B.Tech. I Sem.
IV B Tech I Sem L T P C
3 0 0 3
Reinforcement Learning
(Professional Elective-III)
Course Objective:
• Learn various approaches to solving decision problems, with functional models and algorithms for task
formulation, tabular solutions, function-approximation solutions, policy gradients and model-based
reinforcement learning.
Course Outcomes:
By completing the course the students will be able to:
• Understand basic concepts of Reinforcement learning
• Identify appropriate learning tasks for Reinforcement learning techniques
• Understand various methods and applications of reinforcement learning
UNIT I:
Introduction: Reinforcement Learning, Examples, Elements of Reinforcement Learning, Limitations and
Scope, An Extended Example: Tic-Tac-Toe
Multi-armed Bandits: A k-armed Bandit Problem, Action-value methods, The 10-armed Testbed,
Incremental Implementation, Tracking a Nonstationary Problem, Optimistic Initial Values,
Upper-Confidence-Bound Action Selection, Gradient Bandit Algorithm
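Illustrative sketch (not part of the prescribed syllabus text): the action-value methods and incremental implementation listed above can be tried out on a small simulated k-armed bandit. The function name `run_bandit`, the Gaussian reward model with unit variance and the parameter defaults are assumptions chosen only for illustration.

```python
import random


def run_bandit(true_means, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection with incremental sample-average updates."""
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k  # estimated action values Q(a)
    n = [0] * k    # number of times each action has been selected
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                   # explore: random action
        else:
            a = max(range(k), key=lambda i: q[i])  # exploit: greedy action
        reward = rng.gauss(true_means[a], 1.0)     # assumed noisy reward model
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]             # Q <- Q + (R - Q) / N
    return q, n
```

Calling `run_bandit([0.2, 0.8, 0.5])` returns the value estimates and the selection counts; the counts always sum to the number of steps.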
UNIT II:
Finite Markov Decision Process: The Agent-Environment Interface, Goals and Rewards, Returns and
Episodes, Unified Notation for Episodic and Continuing Tasks, Policies and Value Functions,
Dynamic Programming: Policy Evaluation, Policy Improvement, Policy Iteration, Value Iteration,
Asynchronous Dynamic Programming, Generalized Policy Iteration, Efficiency of Dynamic Programming
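Illustrative sketch (not part of the prescribed syllabus text): value iteration, listed above, can be demonstrated on a very small finite MDP. The dictionary layout `transitions[s][a] = [(prob, next_state, reward), ...]` and the two-state example `MDP` are hypothetical choices made for this sketch; the sweep updates values in place.

```python
def value_iteration(transitions, gamma=0.9, theta=1e-8):
    """Return state values for a finite MDP.

    transitions[s][a] is a list of (prob, next_state, reward) tuples;
    a state with no actions is treated as terminal (value 0).
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state
                continue
            # Bellman optimality backup: best expected one-step return.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update
        if delta < theta:
            return values


# Hypothetical example: from "A", "go" ends the episode with reward 1,
# while "stay" loops back to "A" with reward 0.
MDP = {
    "A": {"go": [(1.0, "T", 1.0)], "stay": [(1.0, "A", 0.0)]},
    "T": {},
}
```

For this example the optimal value of state "A" is 1, obtained by choosing "go" immediately.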
UNIT III:
Monte Carlo Methods: Monte Carlo Prediction, Monte Carlo Estimation of Action Values, Monte Carlo
Control, Monte Carlo Control without Exploring Starts, Off-policy Prediction via Importance Sampling,
Incremental Implementation, Discounting-aware Importance Sampling, Per-decision Importance Sampling
n-step Bootstrapping: n-step TD Prediction, n-step Sarsa, n-step Off-policy Learning, Per-decision
Methods with Control Variates, A Unifying Algorithm: n-step Q(σ)
UNIT IV:
Off-policy Methods with Approximation: Semi-gradient Methods, Examples of Off-policy Divergence,
The Deadly Triad, Linear Value-function Geometry, Gradient Descent in the Bellman Error, The Bellman
Error is not Learnable, Gradient-TD methods, Emphatic-TD methods, Reducing Variance
Eligibility Traces: The λ-return, TD(λ), n-step Truncated λ-return methods, Online λ-return Algorithm,
True Online TD(λ), Dutch Traces in Monte Carlo Learning, Sarsa(λ), Variable λ and γ, Off-policy Traces
with Control Variates, Watkins’s Q(λ) to Tree-Backup(λ)
UNIT V:
Policy Gradient Methods: Policy Approximation and its Advantages, The Policy Gradient Theorem,
REINFORCE: Monte Carlo Policy Gradient, REINFORCE with Baseline, Actor-Critic Methods, Policy
Gradient for Continuing Problems, Policy Parameterization for Continuous Actions
R-20 Syllabus for CSE-DS, JNTUK w. e. f. 2020 – 21
Text Books:
1. R. S. Sutton and A. G. Barto, “Reinforcement Learning: An Introduction,” MIT Press, 2018.
References:
1. Szepesvári, Csaba, “Algorithms for Reinforcement Learning,” United States: Morgan &
Claypool, 2010.
2. Puterman, Martin L., “Markov Decision Processes: Discrete Stochastic Dynamic
Programming,” Germany: Wiley, 2014.
Web References:
1. https://onlinecourses.nptel.ac.in/noc20_cs74/preview
2. https://www.coursera.org/learn/fundamentals-of-reinforcement-learning
Course Objective:
• Learn the theoretical foundations of Nature Inspired Computing techniques, how they can be used to
solve problems, and in which areas they are most useful and effective.
Course Outcomes:
By completing the course the students will be able to:
• Understand the strengths, weaknesses and appropriateness of nature-inspired algorithms.
• Apply nature-inspired algorithms to optimization, design and learning problems.
UNIT I:
Analysis of Algorithms: Analysis of Optimization Algorithms, Nature Inspired Algorithms, Parameter
Tuning and Parameter Control: Parameter Tuning, Hyperoptimization, Multi-Objective View, Parameter
Control, Simulated Annealing: Algorithm, Basic Convergence Properties, Stochastic Tunneling
UNIT II:
Genetic Algorithms: Introduction, Role of Genetic Operators, Choice of Parameters, GA Variants,
Differential Evolution: Introduction, Differential Evolution, Variants, Choice of Parameters, Convergence
Analysis, Particle Swarm Optimization: Swarm Intelligence, PSO Algorithm, Accelerated PSO, Binary
PSO
UNIT III:
Firefly Algorithms: Firefly Behavior, Standard Firefly Algorithm, Variations of Light Intensity and
Attractiveness, Controlling Randomization, Firefly Algorithms in Applications
Cuckoo Search: Cuckoo Breeding Behavior, Lévy Flights, Cuckoo Search: Special Cases of Cuckoo
Search, Variants of Cuckoo Search, Global Convergence, Applications
UNIT IV:
Bat Algorithms: Echolocation of Bats: Behavior of Microbats, Acoustics of Echolocation, Bat Algorithms:
Movement of Virtual Bats, Loudness and Pulse Emission, Binary Bat Algorithm, Variants of the Bat
Algorithm, Convergence Analysis, Applications: Continuous Optimization, Combinatorial Optimization and
Scheduling, Inverse Problems and Parameter Estimation, Classifications, Clustering and Data Mining,
Image Processing, Fuzzy Logic and Other Applications
UNIT V:
Flower Pollination Algorithms: Introduction, Characteristics of Flower Pollination, Flower Pollination
Algorithms, Multi-Objective Flower Pollination Algorithms, Validation and Numerical Experiments:
Single-Objective Test Functions, Multi-Objective Test Functions, Applications: Single-Objective Design
Benchmarks, Multi-Objective Design Benchmarks
Text Books:
1. “Nature-Inspired Optimization Algorithms”, Yang, Xin-She, Elsevier Science, 2014.
References:
Course Objective:
• Understand and deal with any social media network, strategy, or campaign.
Course Outcomes:
By completing the course the students will be able to:
• Understand social media categories and types of social media analytics
• Understand how social media analytics integrates with and affects other areas of business.
UNIT I:
Introduction: Foundation for Analytics, Evolution of Data and the Digital Gap, Social Media Data Sources:
Offline and Online, Definition of Social Media, Data Sources in Social Media Channels, Estimated vs.
Factual Data Sources, Public and Private Data, Data Gathering in Social Media Analytics, Social Media
Network Support of Data Collection, API: Application Programming Interface, Web Crawling or Scraping
UNIT II:
From Data to Insights: Example of a Single Metric Giving Actionable Insight, An Example of a Metric
Leading to New Questions, Creating a Plan to Shape Data into Insights, The Planning Stage: Projecting
Possible Insights, Analysis of a Social Media Post, The process of Comparison, Data Aggregation,
Calculations and Display, Data Display, Social Media and Big Data, Potential Challenges
UNIT III:
Analytics in Social Media: Types of Analytics in Social Media, Analytics or Channel Analytics, Social
Media Listening: Keyword and Mention-Based Analysis, Demographics, Interests and Sentiment,
Advertising Analytics: Focus on Conversions and ROI of Paid Social Media Campaigns, Conversions: The
Key to Digital and Social Advertising, CMS Analytics: Measuring the Performance of the Content
Management Team, CRM Analytics: Customer Support and Sales via Social Media
UNIT IV:
Dedicated vs. Hybrid Tools: Common to all Tools, Dedicated Tools, Advantages of Dedicated Tools,
Disadvantages of Dedicated tools, Hybrid Tools, Dedicated Tools with Hybrid Features, Advantages of
Hybrid Tools, Disadvantages of Hybrid Tools, Data Integration Tools, Advantages of Data Integration
Tools, Disadvantages of Data Integration Tools.
UNIT V:
Social Network Landscape: Concept and UX on Social Networks, Features and Their Strategic Value,
Interactivity: How Social is the Network, Content Flow on Social Network
The Analytics Process: Analysis is Comparison, Investigation beyond Social Analytics, Shaping a Method:
The End Game for an Analyst, The Analysis Circle, Dynamic Cycles, The Analyst Mindset: Making the
Right Questions and Running the Right Experiments
Text Books:
1. Alex Goncalves, “Social Media Analytics Strategy-Using Data to Optimize Business Performance,”
Apress, 2017.
References:
Web References:
1. https://www.coursera.org/learn/social-media-analytics-introduction
Course Objectives:
To understand how blockchain technology and cryptocurrency work
Course Outcomes:
After the completion of the course, student will be able to
• Demonstrate the basics of blockchain and cryptocurrency
• Compare and contrast private and public blockchains and their use cases
• Design an innovative Bitcoin blockchain and scripts, and blockchain science on various coins
• Classify permissioned blockchains and use cases: Hyperledger, Corda
• Make use of blockchain in E-Governance, Land Registration, Medical Information Systems and
others
UNIT I:
Introduction: Introduction, basic ideas behind blockchain, how it is changing the landscape
of digitalization, introduction to cryptographic concepts required, Blockchain or distributed trust, Currency,
Cryptocurrency, How a Cryptocurrency works, Financial services, Bitcoin prediction markets.
UNIT II:
Hashing, public key cryptosystems, private vs public blockchain and use cases, Hash Puzzles, Extensibility
of Blockchain concepts, Digital Identity verification, Blockchain Neutrality, Digital art, Blockchain
Environment
UNIT III:
Introduction to Bitcoin: Bitcoin Blockchain and scripts, Use cases of Bitcoin Blockchain scripting
language in micropayment, escrow etc., Downside of Bitcoin mining, Blockchain Science: Gridcoin,
Foldingcoin, Blockchain Genomics, Bitcoin MOOCs.
UNIT IV:
Ethereum continued, IOTA, The real need for mining, consensus, Byzantine Generals Problem, and
Consensus as a distributed coordination problem, Coming to private or permissioned blockchains,
Introduction to Hyperledger, Currency, Token, Campus coin, Coin drop as a strategy for Public
adoption, Currency Multiplicity, Demurrage currency
UNIT V:
Technical challenges, Business model challenges, Scandals and Public perception, Government Regulations,
Uses of Blockchain in E-Governance, Land Registration, Medical Information Systems.
Text Books:
1. Blockchain: Blueprint for a New Economy, by Melanie Swan
Reference Books:
1. Blockchain Basics: A Non-Technical Introduction in 25 Steps 1st Edition, by Daniel Drescher
IV B Tech I Sem L T P C
3 0 0 3
Snowflake Cloud Analytics
(Professional Elective-IV)
Course Objectives:
The main objective of the course is to master data warehousing on cloud using Snowflake
Course Outcomes:
At the end of the course, the student will be able to
• Load & transform data in Snowflake
• Scale virtual warehouses for performance and concurrency
• Share data and work with semi-structured data
• Gain a thorough knowledge of query constructs, DDL & DML operations, managing and monitoring
Snowflake accounts and Snowflake's continuous data protection methods.
UNIT I:
Snowflake Architecture - Unlocking Business Value, Business Agility Is More Important Than Ever, All
Hail the Cloud!, Snowflake Architecture, Database Storage, Micro Partitions, Benefit of Micro Partitioning,
Data Clustering, Virtual Warehouses, Caching, Result Cache, Local Disk Cache (Text Book 1)
Getting Started with Cloud Analytics - Key Cloud Computing Concepts (Text Book 2)
Getting Started with Snowflake – Planning, Deciding on a Snowflake Edition, Choosing a Cloud Provider
and Region, Examining Snowflake’s Pricing Model, Other Pricing Considerations, Examining Types
of Snowflake Tools, Creating a Snowflake Account, Connecting to Snowflake (Text Book 2)
UNIT II:
Building a Virtual Warehouse - Overview of Snowflake Virtual Warehouses, Warehouse Sizes
and Features, Multicluster Virtual Warehouses, Virtual Warehouse Considerations, Building a Snowflake
Virtual Warehouse (Text Book 2)
Getting Started with SnowSQL - Installing SnowSQL, Configuring SnowSQL, SnowSQL Commands,
Multiple Connection Names (Text Book 2)
UNIT III:
Data Movement – Stages, External Stages, External Tables and Data Lakes, Internal Stages (Text Book 1)
Loading Bulk Data into Snowflake - Overview of Bulk Data Loading, Bulk Data Loading
Recommendations, Bulk Loading with the Snowflake Web Interface, Data Loading with SnowSQL (Text
Book 2)
Continuous Data Loading with Snowpipe - Loading Data Continuously, Snowpipe Auto-Ingest, Building
a Data Pipeline Using the Snowpipe Auto-Ingest Option (Text Book 2)
UNIT IV:
Snowflake Administration - Administering Roles and Users, Administering Resource Consumption,
Administering Databases and Warehouses, Administering Account Parameters, Administering Database
Objects, Administering Data Shares, Administering Clustered Tables, Snowflake Materialized Views (Text
UNIT V:
Working with Semistructured Data - Supported File Formats, Advanced Data Types, Working with XML,
Working with JSON, Working with AVRO, Working with Parquet (Text Book 2)
Secure Data Sharing - Secure Data Sharing, Secure Table Sharing, Data Sharing Using a Secure View (Text
Book 2)
Time Travel (Text Book 2)
Advanced Performance Tuning - Designing Tables for High Performance, Designing High-Performance
Queries, Optimizing Queries, Optimizing Warehouse Utilization, Monitoring Resources and Account Usage,
Resource Monitors (Text Book 1)
Text Books:
1. Mastering Snowflake Solutions: Supporting Analytics and Data Sharing, Apress
2. Jumpstart Snowflake: A Step-by-Step Guide to Modern Cloud Analytics, Apress
Reference Books:
1. Snowflake Essentials: Getting Started with Big Data in the Cloud, Apress
2. Snowflake Cookbook: Techniques for building modern cloud data warehousing solutions
3. Snowflake: The Definitive Guide: Architecting, Designing, and Deploying on the Snowflake Data
Cloud, O’Reilly
4. https://docs.snowflake.com/en/
Course Objectives:
• To explain the evolving computing model called cloud computing.
• To introduce the various levels of services that can be achieved by cloud.
• To describe the security aspects in cloud.
• To motivate students to do programming and experiment with the various cloud computing
environments.
UNIT I:
Systems Modeling, Clustering and Virtualization: Scalable Computing over the Internet-The Age of
Internet Computing, Scalable computing over the internet, Technologies for Network Based Systems,
System models for Distributed and Cloud Computing, Performance, Security and Energy Efficiency
UNIT II:
Virtual Machines and Virtualization of Clusters and Data Centers: Implementation Levels of
Virtualization, Virtualization Structures/ Tools and Mechanisms, Virtualization of CPU, Memory and I/O
Devices, Virtual Clusters and Resource Management, Virtualization for Data-Center Automation.
UNIT III:
Cloud Platform Architecture: Cloud Computing and Service Models, Public Cloud Platforms, Service
Oriented Architecture, Programming on Amazon AWS and Microsoft Azure
UNIT IV:
Cloud Resource Management and Scheduling: Policies and Mechanisms for Resource Management,
Applications of Control Theory to Task Scheduling on a Cloud, Stability of a Two Level Resource
Allocation Architecture, Feedback Control Based on Dynamic Thresholds. Coordination of Specialized
Autonomic Performance Managers, Resource Bundling, Scheduling Algorithms for Computing Clouds-Fair
Queuing, Start Time Fair Queuing.
UNIT V:
Storage Systems: Evolution of storage technology, storage models, file systems and database, distributed
file systems, General Parallel File System, Google File System.
Text Books:
1. Distributed and Cloud Computing, Kai Hwang, Geoffrey C. Fox, Jack J. Dongarra, MK Elsevier.
2. Cloud Computing, Theory and Practice, Dan C Marinescu, MK Elsevier.
Reference Books:
1. Cloud Computing, A Hands on approach, Arshdeep Bahga, Vijay Madisetti, University Press
2. Cloud Computing, A Practical Approach, Anthony T Velte, Toby J Velte, Robert Elsenpeter, TMH
3. Mastering Cloud Computing, Foundations and Application Programming, Rajkumar Buyya,
Christian Vecchiola, S Thamarai Selvi, TMH
Course Objectives:
● To provide the foundation knowledge in information retrieval.
● To equip students with sound skills to solve computational search problems.
● To appreciate how to evaluate search engines.
● To appreciate the different applications of information retrieval techniques in the Internet or Web
environment.
● To provide hands-on experience in building search engines and/or hands-on experience in evaluating
search engines.
Course Outcomes:
By the end of the course, student will be able to
• Identify basic theories in information retrieval systems
• Classify the analysis tools as they apply to information retrieval systems.
• Illustrate the problems solved in current IR systems.
• Discuss the advantages of current IR systems
• Summarize the difficulty of representing and retrieving documents.
• Translate the latest technologies for linking, describing and searching the web
UNIT-I:
Introduction to Information Storage and Retrieval System: Introduction, Domain Analysis of IR systems
and other types of Information Systems, IR System Evaluation. Introduction to Data Structures and
Algorithms related to Information Retrieval: Basic Concepts, Data structures, Algorithms
UNIT-II:
Inverted Files: Introduction, Structures used in Inverted Files, Building Inverted file using a sorted array,
Modifications to Basic Techniques.
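Illustrative sketch (not part of the prescribed syllabus text): the idea of building an inverted file from a sorted array, listed above, can be shown in a few lines. Tokenising by lowercasing and splitting on whitespace, and the function name `build_inverted_index`, are simplifying assumptions made for this sketch.

```python
def build_inverted_index(docs):
    """Build an inverted file from a dict mapping document id -> text.

    Returns a dict mapping each term to a sorted list of document ids.
    """
    # Step 1: emit one (term, doc_id) pair per word occurrence.
    pairs = [
        (term, doc_id)
        for doc_id, text in docs.items()
        for term in text.lower().split()
    ]
    # Step 2: sort the array of pairs by term, then by document id.
    pairs.sort()
    # Step 3: collapse runs of equal terms into posting lists,
    # skipping repeated document ids within a run.
    index = {}
    for term, doc_id in pairs:
        postings = index.setdefault(term, [])
        if not postings or postings[-1] != doc_id:
            postings.append(doc_id)
    return index
```

For example, indexing `{1: "to be or not to be", 2: "to do"}` maps "to" to documents 1 and 2, and "be" to document 1 only.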
UNIT-III:
Signature Files: Introduction, Concepts of Signature Files, Compression, Vertical Partitioning, Horizontal
Partitioning.
UNIT-IV:
New Indices for Text: PAT Trees and PAT Arrays: Introduction, PAT Tree structure, algorithms on the
PAT Trees, Building PAT trees as PATRICIA Trees, PAT representation as arrays.
UNIT-V:
Stemming Algorithms: Introduction, Types of Stemming Algorithms, Experimental Evaluations of
Stemming to Compress Inverted Files
Thesaurus Construction: Introduction, Features of Thesauri, Thesaurus Construction, Thesaurus
construction from Texts, Merging existing Thesauri
Text Books:
1. Frakes, W. B., Ricardo Baeza-Yates: Information Retrieval: Data Structures and Algorithms, Prentice Hall,
1992.
Reference Books:
1. Kowalski, Gerald, Mark T Maybury: Information Retrieval Systems: Theory and Implementation,
Kluwer Academic Press, 1997.
2. Information Retrieval: Algorithms and Heuristics, 2nd ed., Springer
UNIT I:
Why NoSQL, The Value of Relational Databases, Getting at Persistent Data, Concurrency, Integration, A
(Mostly) Standard Model, Impedance Mismatch, Application and Integration Databases, Attack of the
Clusters, The Emergence of NoSQL, Aggregate Data Models; Aggregates, Example of Relations and
Aggregates, Consequences of Aggregate Orientation, Key-Value and Document Data Models, Column-
Family Stores, Summarizing Aggregate-Oriented Databases. More Details on Data Models; Relationships,
Graph Databases, Schemaless Databases, Materialized Views, Modelling for Data Access
UNIT II:
Distribution Models: Single Server, Sharding, Master-Slave Replication, Peer-to-Peer Replication,
Combining Sharding and Replication. Consistency, Update Consistency, Read Consistency, Relaxing
Consistency, The CAP Theorem, Relaxing Durability, Quorums. Version Stamps, Business and System
Transactions, Version Stamps on Multiple Nodes
UNIT III:
What Is a Key-Value Store, Key-Value Store Features, Consistency, Transactions, Query Features, Structure
of Data, Scaling, Suitable Use Cases, Storing Session Information, User Profiles, Preference, Shopping Cart
Data, When Not to Use, Relationships among Data, Multioperation Transactions, Query by Data,
Operations by Sets.
UNIT IV:
Document Databases, What Is a Document Database?, Features, Consistency, Transactions, Availability,
Query Features, Scaling, Suitable Use Cases, Event Logging, Content Management Systems, Blogging
Platforms, Web Analytics or Real-Time Analytics, Ecommerce Applications, When Not to Use, Complex
Transactions Spanning Different Operations, Queries against Varying Aggregate Structure
UNIT V:
Graph Databases, What Is a Graph Database?, Features, Consistency, Transactions, Availability, Query
Features, Scaling, Suitable Use Cases, Connected Data, Routing, Dispatch and Location-Based Services,
Text Books:
1. Sadalage, P. & Fowler, M., NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot
Persistence, Pearson Addison-Wesley, 2012
Reference Books:
1. Dan Sullivan, "NoSQL for Mere Mortals", 1st Edition, Pearson Education India, 2015. (ISBN-13:
978-9332557338)
2. Dan McCreary and Ann Kelly, "Making Sense of NoSQL: A guide for Managers and the Rest of
us", 1st Edition, Manning Publication/Dreamtech Press, 2013. (ISBN-13: 978-9351192022)
3. Kristina Chodorow, "MongoDB: The Definitive Guide - Powerful and Scalable Data Storage", 2nd
Edition, O'Reilly Publications, 2013. (ISBN-13: 978-9351102694)
Course Objectives:
• Formalize different types of entities and relationships as nodes and edges and represent this information as
relational data
• Plan and execute network analytical computations
• Use advanced network analysis software to generate visualizations and perform empirical investigations
of network data
• Interpret and synthesize the meaning of the results with respect to a question, goal or task
• Collect network data in different ways and from different sources while adhering to legal standards and
ethics standards
Course Outcomes:
After completing the course, the student should:
• Know basic notation and terminology used in network science
• Be able to visualize, summarize and compare networks
• Illustrate basic principles behind network analysis algorithms
• Develop practical skills of network analysis in R programming language
• Be capable of analyzing real-world networks
UNIT I:
Social Network Analysis: Preliminaries and definitions, Erdős Number Project, Centrality
measures, Balance and Homophily.
UNIT II:
Random graph models: Random graphs and alternative models, Models of network growth,
Navigation in social Networks, Cohesive subgroups, Multidimensional Scaling, Structural
equivalence, roles and positions.
UNIT III:
Network topology and diffusion, Contagion in Networks, Complex contagion, Percolation and
information, Navigation in Networks Revisited.
UNIT IV:
Small world experiments, small world models, origins of small world, Heavy tails, Small
Diameter, Clustering of connectivity, The Erdős-Rényi Model, Clustering Models.
UNIT V:
Network structure - Important vertices and page rank algorithm, towards rational dynamics in
networks, basics of game theory, Coloring and consensus, biased voting, network formation
games, network structure and equilibrium, behavioral experiments, Spatial and agent-based models.
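Illustrative sketch (not part of the prescribed syllabus text): the page rank algorithm mentioned above can be approximated by power iteration on a small directed graph. The adjacency-dictionary input format, the damping factor of 0.85 and the uniform handling of dangling nodes are assumptions made for this sketch; every link target is assumed to appear as a key in the graph.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Approximate PageRank scores by power iteration.

    graph maps each node to the list of nodes it links to.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from a uniform distribution
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # A node without out-links spreads its rank over all nodes.
            targets = graph[v] or nodes
            share = damping * rank[v] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank
```

On the three-node graph `{"a": ["b"], "b": ["a", "c"], "c": ["a"]}`, node "a" receives links from both other nodes and ends up with the highest score; the scores sum to 1.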
Text Books:
1. S. Wasserman and K. Faust. “Social Network Analysis: Methods and Applications”, Cambridge University
Press.
2. D. Easley and J. Kleinberg, “Networks, Crowds and Markets: Reasoning about a highly connected world”, C
Reference Books:
1. Maarten van Steen. “Graph Theory and Complex Networks. An Introduction”, 2010.
2. Reza Zafarani, Mohammed Ali Abbasi, Huan Liu. “Social Media Mining: An Introduction”. Cambridge
University Press 2014.
3. Maksim Tsvetovat and Alexander Kouznetsov. “Social Network Analysis for Startups”.
O’Reilly Media, 2011.
e-Resources:
1) https://www.classcentral.com/course/edx-social-network-analysis-sna-9134
2) https://www.coursera.org/learn/social-network-analysis
IV B Tech I Sem L T P C
3 0 0 3
Recommender Systems
(Professional Elective-V)
Course Objective:
To develop state-of-the-art recommender systems that automate a variety of choice-making strategies with
the goal of providing affordable, personal, and high-quality recommendations
Course Outcomes:
By completing the course the students will be able to:
• Understand the basic concepts of recommender systems
• Carry out performance evaluation of recommender systems based on various metrics
• Implement machine-learning and data-mining algorithms in recommender systems data sets.
• Design and implement a simple recommender system.
UNIT I:
An Introduction to Recommender Systems: Goals of Recommender Systems, Basic Models of
Recommender Systems, Collaborative Filtering Models, Content-Based Recommender Systems,
Knowledge-Based Recommender Systems, Domain-Specific Challenges in Recommender Systems,
Advanced Topics and Applications.
UNIT II:
Neighborhood-Based Collaborative Filtering: Key Properties of Ratings Matrices, Predicting Ratings with
Neighborhood-Based Methods, Clustering and Neighborhood-Based Methods, Dimensionality Reduction
and Neighborhood Methods, A Regression Modeling View of Neighborhood Methods, Graph Models for
Neighborhood-Based Methods
UNIT III:
Model-Based Collaborative Filtering: Decision and Regression Trees, Rule-Based Collaborative Filtering,
Naïve Bayes Collaborative Filtering, Latent Factor Models, Integrating Factorization and Neighborhood
Models
UNIT IV:
Content-Based Recommender Systems: Basic Components of Content-Based Systems, Preprocessing and
Feature Extraction, Learning User Profiles and Filtering, Content-Based Versus Collaborative
Recommendations
Knowledge-Based Recommender Systems: Constraint-Based Recommender Systems, Case-Based
Recommenders, Persistent Personalization in Knowledge-Based Systems.
UNIT V:
Evaluating Recommender Systems: Evaluation Paradigms, General Goals of Evaluation Design, Design
Issues in Offline Recommender Evaluation, Accuracy Metrics in Offline Evaluation, Limitations of
Evaluation Measures
Text Books:
1. Charu C. Aggarwal, Recommender Systems: The Textbook, Springer, 2016.
Course Objectives:
• Learn how artificial intelligence powers chatbots, get an overview of the bot ecosystem and bot
anatomy, and study different types of bots and use cases.
• Identify best practices for defining a chatbot use case, and use a rapid prototyping framework to
develop a use case for a personalized chatbot.
Course Outcomes:
• Develop an in-depth understanding of conversation design, including onboarding, flows, utterances,
entities, and personality.
• Design, build, test, and iterate a fully-functional, interactive chatbot using a commercial platform.
• Deploy the finished chatbot for public use and interaction.
UNIT I:
Introduction: Benefits from Chatbots for a Business, A Customer-Centric Approach in Financial Services,
Chatbots in the Insurance Industry, Conversational Chatbot Landscape,
Identifying the Sources of Data: Chatbot Conversations, Training Chatbots for Conversations, Personal Data
in Chatbots, Introduction to the General Data Protection Regulation (GDPR)
UNIT II:
Chatbot Development Essentials: Customer Service-Centric Chatbots, Chatbot Development Approaches,
Rules-Based Approach, AI-Based Approach, Conversational Flow, Key Terms in Chatbots, Utterance,
Intent, Entity, Channel, Human Takeover, Use Case: 24x7 Insurance Agent
UNIT III:
Building a Chatbot Solution: Business Considerations, Chatbots vs. Apps, Growth of Messenger
Applications, Direct Contact vs. Chat, Business Benefits of Chatbots, Success Metrics, Customer
Satisfaction Index, Completion Rate, Bounce Rate, Managing Risks in Chatbots Service, Generic Solution
Architecture for Private Chatbots
UNIT IV:
Natural Language Processing, Understanding, and Generation: Chatbot Architecture, Popular Open Source
NLP and NLU Tools, Natural Language Processing, Natural Language Understanding, Natural Language
Generation, Applications.
UNIT V:
Introduction to Microsoft Bot, RASA, and Google Dialogflow: Microsoft Bot Framework, Introduction to
QnA Maker, Introduction to LUIS, Introduction to RASA, RASA Core, RASA NLU, Introduction to
Dialogflow
Chatbot Integration Mechanism: Integration with Third-Party APIs, Connecting to an Enterprise Data Store,
Integration Module
Text Books:
Reference Books:
1. Janarthanam, Srini, Hands-on chatbots and conversational UI development: Build chatbots and
voice user interfaces with C (1 ed.), Packt Publishing Ltd, 2017. ISBN 978-1788294669.
2. Galitsky, Boris., Developing Enterprise Chatbots (1 ed.), Springer International Publishing, 2019.
ISBN 978-303004298
3. Kelly III, John E. and Steve Hamm, Smart machines: IBM's Watson and the era of cognitive
computing (1 ed.), Columbia University Press, 2013. ISBN 978-0231168564.
4. Abhishek Singh, Karthik Ramasubramanian and Shrey Shivam, Building an Enterprise Chatbot (1
ed.), Springer, 2019. ISBN 978-1484250334.
Course Outcomes:
Upon completion of this course, the students will be able to
• Understand basics of Data Visualization
• Implement visualization of distributions
• Write programs on visualization of time series, proportions & associations
• Apply visualization on Trends and uncertainty
• Explain principles of proportions
UNIT I:
INTRODUCTION TO VISUALIZATION: Visualizing Data-Mapping Data onto Aesthetics, Aesthetics and
Types of Data, Scales Map Data Values onto Aesthetics, Coordinate Systems and Axes- Cartesian
Coordinates, Nonlinear Axes, Coordinate Systems with Curved Axes, Color Scales-Color as a Tool to
Distinguish, Color to Represent Data Values, Color as a Tool to Highlight, Directory of Visualizations-
Amounts, Distributions, Proportions, x–y relationships, Geospatial Data
UNIT II:
VISUALIZING DISTRIBUTIONS: Visualizing Amounts-Bar Plots, Grouped and Stacked Bars, Dot Plots
and Heatmaps, Visualizing Distributions: Histograms and Density Plots- Visualizing a Single Distribution,
Visualizing Multiple Distributions at the Same Time, Visualizing Distributions: Empirical Cumulative
Distribution Functions and Q-Q Plots-Empirical Cumulative Distribution Functions, Highly Skewed
Distributions, Quantile Plots, Visualizing Many Distributions at Once-Visualizing Distributions Along the
Vertical Axis, Visualizing Distributions Along the Horizontal Axis
UNIT III:
VISUALIZING ASSOCIATIONS & TIME SERIES: Visualizing Proportions-A Case for Pie Charts, A
Case for Side-by-Side Bars, A Case for Stacked Bars and Stacked Densities, Visualizing Proportions
Separately as Parts of the Total, Visualizing Nested Proportions- Nested Proportions Gone Wrong, Mosaic
Plots and Treemaps, Nested Pies, Parallel Sets. Visualizing Associations Among Two or More Quantitative
Variables-Scatterplots, Correlograms, Dimension Reduction, Paired Data. Visualizing Time Series and
Other Functions of an Independent Variable-Individual Time Series, Multiple Time Series and
Dose–Response Curves, Time Series of Two or More Response Variables
UNIT IV:
VISUALIZING UNCERTAINTY: Visualizing Trends-Smoothing, Showing Trends with a Defined
Functional Form, Detrending and Time-Series Decomposition, Visualizing Geospatial Data-Projections,
Layers, Choropleth Mapping, Cartograms, Visualizing Uncertainty-Framing Probabilities as Frequencies,
Visualizing the Uncertainty of Point Estimates, Visualizing the Uncertainty of Curve Fits, Hypothetical
Outcome Plots
UNIT V:
PRINCIPLE OF PROPORTIONAL INK: The Principle of Proportional Ink-Visualizations Along Linear
Axes, Visualizations Along Logarithmic Axes, Direct Area Visualizations, Handling Overlapping Points-
Partial Transparency and Jittering, 2D Histograms, Contour Lines, Common Pitfalls of Color Use-Encoding
Too Much or Irrelevant Information, Using Nonmonotonic Color Scales to Encode Data Values, Not
R-20 Syllabus for CSE-DS, JNTUK w. e. f. 2020 – 21
Text Books:
1. Claus Wilke, “Fundamentals of Data Visualization: A Primer on Making Informative and
Compelling Figures”, 1st edition, O’Reilly Media Inc, 2019.
2. Ossama Embarak, “Data Analysis and Visualization Using Python: Analyze Data to Create
Visualizations for BI Systems”, Apress, 2018.
Reference Books:
1. Tony Fischetti, Brett Lantz, R: Data Analysis and Visualization, O’Reilly, 2016
Course Outcomes:
At the end of this course, the student will be able to
• Develop a Spring Data JPA application with Spring Boot
• Implement CRUD operations using Spring Data JPA
• Implement pagination and sorting mechanism using Spring Data JPA
• Implement query methods for querying the database using Spring Data JPA
• Implement a custom repository to customize a querying mechanism using Spring Data JPA
• Understand update operation using query approaches in Spring Data JPA
• Implement Spring Transaction using Spring Data JPA
• Develop RESTful endpoints using Spring REST
• Process URI parameters
• Write RESTful services using Spring REST that consume and produce data in different formats
• Handle exceptions and errors in Spring REST endpoints
• Write Spring based REST clients to consume RESTful services programmatically
• Create secure RESTful endpoints using Spring Security
• Document and version the Spring REST endpoints
• Implement CORS in a Spring REST application
UNIT I:
Spring 5 Basics: Why Spring, What is Spring Framework, Spring Framework - Modules, Configuring IoC
container using Java-based configuration, Introduction To Dependency Injection, Constructor Injection,
Setter Injection, What is AutoScanning
UNIT II:
Spring Boot: Creating a Spring Boot Application, Spring Boot Application Annotation, What is Autowiring,
Scope of a bean, Logger, Introduction to Spring AOP, Implementing AOP advices, Best Practices: Spring
Boot Application
UNIT III:
Spring Data JPA with Boot: Limitations of JDBC API, Why Spring Data JPA, Spring Data JPA with
Spring Boot, Spring Data JPA Configuration, Pagination and Sorting, Query Approaches, Named Queries
and Query, Why Spring Transaction, Spring Declarative Transaction, Update Operation in Spring Data JPA,
Custom Repository Implementation, Best Practices - Spring Data JPA
UNIT IV:
Web Services: Why Web services, SOA - Service Oriented Architecture, What are Web Services, Types of
Web Services, SOAP based Web Services, RESTful Web Services, How to create RESTful Services
UNIT V:
Spring REST: Spring REST - An Introduction, Creating a Spring REST Controller, @RequestBody and
ResponseEntity, Parameter Injection, Usage of @PathVariable, @RequestParam and @MatrixVariable,
Exception Handling, Data Validation, Creating a REST Client, Versioning a Spring REST endpoint,
Enabling CORS in Spring REST, Securing Spring REST endpoints
Text Books:
1. Spring in Action, 5th Edition, Craig Walls, Ryan Breidenbach, Manning
Web references:
IV B Tech I Sem L T P C
3 0 0 3
Secure Coding Techniques
(Job Oriented Course)
Course Outcomes:
At the end of the Course, student will be able to:
• Differentiate the objectives of information security
• Understand the trend, reasons and impact of the recent cyber attacks
• Understand OWASP design principles while designing a web application
• Understand threat modelling
• Understand the importance of security in all phases of SDLC
• Write secure code using some of the practices in C/C++/Java and Python programming languages
UNIT I:
Network and Information security Fundamentals: Network Basics, Network Components, Network
Types, Network Communication Types, Introduction to Networking Models, Cyber Security Objectives and
Services, Other Terms of Cyber Security, Myths Around Cyber Security,
Recent Cyber Attacks, Generic Conclusion about Attacks, Why and What is Cyber Security, Categories of
Attack
UNIT II:
Introduction to Cyber security: Introduction to OWASP Top 10, A1 Injection, A1 Injection Risks Root
Causes and its Mitigation, A1 Injection, A2 Broken Authentication and Session Management, A7 Cross Site
Scripting XSS, A3 Sensitive Data Exposure, A5 Broken Access Control, A4 XML External Entity (XXE),
A6 Security Misconfiguration, A7 Missing Function Level Access Control, A8 Cross Site Request Forgery
CSRF, A8 Insecure Deserialization, A9 Using Components With Known Vulnerabilities, A10 Unvalidated
Redirects and Forwards, A10 Insufficient Logging and Monitoring, Secure Coding Practices, Secure Design
Principles, Threat Modelling, Microsoft SDL Tool
UNIT III:
Secure coding practices and OWASP Top 10: Declarative Security, Programmatic Security, Concurrency,
Configuration, Cryptography, Input and Output Sanitization, Error Handling, Input Validation, Logging and
auditing, Session Management, Exception Management, Safe APIs, Type Safety, Memory Management,
Tokenizing, Sandboxing, Static and dynamic testing, vulnerability scanning and penetration testing
UNIT IV:
Secure coding practices in C/C++ and Java: Potential Software Risks in C/C++, Defensive coding,
Preventative Planning, Clean Code, Iterative Design, Assertions, Pre Post Conditions, Low level design
inspections, Unit Tests
Java- Managing Denial of Service, Securing Information, Data Integrity, Accessibility and Extensibility,
Securing Objects, Serialization Security
UNIT V:
Secure coding in Python: Interactive Python Scripting, Python Variables, Conditionals, Loops, Functions,
External Modules, File operations, Web requests
Web Links:
Infosys Springboard courses
1. https://infyspringboard.onwingspan.com/en/app/toc/lex_auth_012683751296065536354_shared/contents
[Network Fundamentals]
2. https://infyspringboard.onwingspan.com/en/app/toc/lex_3388902307073574000_shared/overview
[Introduction to cybersecurity]
3. https://infyspringboard.onwingspan.com/en/viewer/html/lex_auth_0135015696571596809160
[Certified Secure Software Lifecycle Professional (CSSLP) 2019: Secure Coding Practices]
4. https://infyspringboard.onwingspan.com/en/viewer/html/lex_auth_0135015689927557129660
[OWASP Top 10: Web Application Security]
5. https://infyspringboard.onwingspan.com/en/viewer/html/lex_auth_01350159304097792013093
[Defensive coding fundamentals in C and C++]
6. https://infyspringboard.onwingspan.com/en/viewer/html/lex_auth_01350159172969267213125 [Java
SE 11 Programmer II: Secure Coding in Java SE 11 Applications]
7. https://infyspringboard.onwingspan.com/en/app/toc/lex_auth_01350158164493107211192/overview
[Security Programming: Python Scripting Essentials]
Web references:
1. https://www.stealthlabs.com/blog/infographic-top-15-cybersecurity-myths-vs-reality/
2. https://microage.ca/cybersecurity-layering-approach/
3. https://www.synopsys.com/glossary/what-is-threat-modeling.html#:~:text=Threat%20modeling%20is%20a%20structured,An%20abstraction%20of%20the%20system
4. https://www.microsoft.com/en-us/securityengineering/sdl/threatmodeling
5. https://www.checkpoint.com/cyber-hub/threat-prevention/what-is-sandboxing/
6. https://www.skillsoft.com/course/defensive-coding-fundamentals-for-cc-f44c02f9-1bcc-11e7-b15b-0242c0a80b07#:~:text=Defensive%20Programming%20is%20a%20methodology,%2C%20testing%2C%20and%20input%20validation.
7. https://www.oracle.com/java/technologies/javase/seccodeguide.html
8. https://www.skillsoft.com/course/security-programming-python-scripting-essentials-be99adad-1f65-47a8-a4b5-6b5346072b8e
IV B Tech I Sem L T P C
3 0 0 3
Universal Human Values 2: Understanding Harmony
1. Objective:
The objective of the course is fourfold:
1. Development of a holistic perspective based on self-exploration about themselves (human being), family,
society and nature/existence.
2. Understanding (or developing clarity) of the harmony in the human being, family, society and
nature/existence
3. Strengthening of self-reflection.
4. Development of commitment and courage to act.
2. Course Topics:
The course has 28 lectures and 14 practice sessions in 5 modules:
Module 1: Course Introduction - Need, Basic Guidelines, Content and Process for Value Education
5. Purpose and motivation for the course, recapitulation from Universal Human Values-I
6. Self-Exploration–what is it? - Its content and process; ‘Natural Acceptance’ and Experiential Validation-
as the process for self-exploration
7. Continuous Happiness and Prosperity- A look at basic Human Aspirations
8. Right understanding, Relationship and Physical Facility- the basic requirements for fulfilment of
aspirations of every human being with their correct priority
9. Understanding Happiness and Prosperity correctly- A critical appraisal of the current scenario
10. Method to fulfill the above human aspirations: understanding and living in harmony at various levels.
Include practice sessions to discuss natural acceptance in human being as the innate acceptance for living
with responsibility (living in relationship, harmony and co-existence) rather than as arbitrariness in choice
based on liking-disliking
Include practice sessions to discuss the role others have played in making material goods available to me.
Identifying from one’s own life. Differentiate between prosperity and accumulation. Discuss program for
ensuring health vs dealing with disease
17. Understanding values in human-human relationship; meaning of Justice (nine universal values in
relationships) and program for its fulfilment to ensure mutual happiness; Trust and Respect as the
foundational values of relationship
18. Understanding the meaning of Trust; Difference between intention and competence
19. Understanding the meaning of Respect, Difference between respect and differentiation; the other
salient values in relationship
20. Understanding the harmony in the society (society being an extension of family): Resolution,
Prosperity, fearlessness (trust) and co-existence as comprehensive Human Goals
21. Visualizing a universal harmonious order in society- Undivided Society, Universal Order- from
family to world family.
Include practice sessions to reflect on relationships in family, hostel and institute as extended family, real
life examples, teacher-student relationship, goal of education etc. Gratitude as a universal value in
relationships. Discuss with scenarios. Elicit examples from students’ lives
Module 4: Understanding Harmony in the Nature and Existence - Whole existence as Coexistence
22. Understanding the harmony in the Nature
23. Interconnectedness and mutual fulfillment among the four orders of nature- recyclability and self-
regulation in nature
24. Understanding Existence as Co-existence of mutually interacting units in all-pervasive space
25. Holistic perception of harmony at all levels of existence.
Include practice sessions to discuss human being as cause of imbalance in nature (film “Home” can be
used), pollution, depletion of resources and role of technology etc.
3. READINGS:
In the discussions, particularly during practice sessions (tutorials), the mentor encourages the student to
connect with one’s own self and do self-observation, self-reflection and self-exploration. Scenarios may be
used to initiate discussion. The student is encouraged to take up “ordinary” situations rather than
“extraordinary” situations. Such observations and their analyses are shared and discussed with other students and
faculty mentor, in a group sitting.
Tutorials (experiments or practicals) are important for the course. The difference is that the laboratory is
everyday life, and practicals are how you behave and work in real life. Depending on the nature of topics,
worksheets, home assignments and/or activities are included. The practice sessions (tutorials)
would also provide support to a student in performing actions commensurate to his/her beliefs. It is intended
that this would lead to development of commitment, namely behaving and working based on basic human
values.
It is recommended that this content be placed before the student as it is, in the form of a basic foundation
course, without including anything else or excluding any part of this content. Additional content may be
offered in separate, higher courses.
This course is to be taught by faculty from every teaching department, including HSS faculty. Teacher
preparation with a minimum exposure to at least one 8-day FDP on Universal Human Values is deemed
essential.
5. ASSESSMENT:
This is a compulsory credit course. The assessment is to provide a fair state of development of the student,
so participation in classroom discussions, self-assessment, peer assessment etc. will be used in evaluation.
Example:
Assessment by faculty mentor: 10 marks
Self-assessment: 10 marks
Assessment by peers: 10 marks
Socially relevant project/Group Activities/Assignments: 20 marks
Semester End Examination: 50 marks
The overall pass percentage is 40%. In case the student fails, he/she must repeat the course.
IV B Tech I Sem L T P C
0 0 4 2
Machine Learning with Go
(Skill Oriented Course)
Course Objectives:
• To turn students into productive, innovative data analysts who can leverage Go to build robust
and valuable applications.
• To introduce the technical aspects of building predictive models in Go, and to help students
understand how machine learning workflows are applied in real-world scenarios.
• To understand how to gather, organize, and parse real-world data from a variety of sources.
• To develop a solid statistical toolkit that allows students to quickly understand and gain intuition about
the content of a dataset.
• To implement essential machine learning techniques (regression, classification, clustering, and so
on) with the relevant Go packages.
Prerequisites:
1. Bash Shell
2. Go and an editor
List of Experiments:
1. a) Write a Go program to read CSV file and find the maximum value in a particular column.
b) Write a Go program to read iris dataset which is in csv format and demonstrate handling of
unexpected fields, types and manipulating CSV data.
2. a) Demonstrate how JSON data can be parsed using Go.
b) Demonstrate how to connect to and query SQL-like databases (Postgres, MySQL, SQLite) using
Go
3. Demonstrate how to cache data in memory using Go
4. a) Demonstrate how to represent matrices and vectors in Go
b) Write a Go program to get statistical measures like mean, median, standard deviation and so on for
any dataset.
c) Write a Go program to visualize data distributions using Histogram, Box Plots.
5. a) Write a Go program to demonstrate Mean Squared Error (MSE), Mean Absolute Error (MAE),
R2 (R Squared).
b) Write a Go program to compute Accuracy, Precision, Recall, AUC (Area Under the Curve)
6. a) Demonstrate how to build a linear regression model using Go.
b) Demonstrate how to build a multiple linear regression model using Go.
7. Demonstrate how to build a logistic regression model using Go
8. Apply k-nearest neighbor classifier on iris dataset using Go
9. Build a decision tree on iris dataset using Go.
10. Demonstrate K-Means clustering method using Go.
11. Build auto regressive models for time series data using Go
12. Demonstrate how to build a simple neural network using Go
References:
1. https://infyspringboard.onwingspan.com/web/en/app/toc/lex_auth_0130944292286873602383_shared/overview
IV B Tech I Sem L T P C
0 0 4 2
MEAN Stack Technologies-Module II-Angular JS and MongoDB
(Skill Oriented Course)
Course Outcomes:
• Build a component-based application using Angular components and enhance their functionality using
directives.
• Utilize data binding for developing Angular forms and bind them with model data.
• Apply Angular built-in or custom pipes to format the rendered data.
• Develop a single page application by using synchronous or asynchronous Angular routing.
• Make use of MongoDB queries to perform CRUD operations on document database.
List of Exercises:
Text Books:
1. Programming the World Wide Web, 7th Edition, Robert W. Sebesta, Pearson.
2. Pro MEAN Stack Development, 1st Edition, Elad Elrom, Apress O’Reilly.
3. Full Stack JavaScript Development with MEAN, Colin J Ihrig, Adam Bretz, 1st edition, SitePoint,
SitePoint Pty. Ltd., O'Reilly Media.
4. MongoDB – The Definitive Guide, 2nd Edition, Kristina Chodorow, O’Reilly
Web Links:
1. https://infyspringboard.onwingspan.com/en/app/toc/lex_20858515543254600000_shared/overview
(Angular JS)
2. https://infyspringboard.onwingspan.com/en/app/toc/lex_auth_013177169294712832113_shared/overview
(MongoDB)
Course outcomes:
Upon completion of this course, the students will be able to
• Identify and work with the basic data formats.
• Perform computations with Excel and PDF files
• Understand the concepts of data cleanup
• Explore and analyze image and video data
• Understand the concepts of web scraping
UNIT I:
INTRODUCTION TO DATA WRANGLING: Data Wrangling, Importance of Data Wrangling, How is
Data Wrangling performed, Tasks of Data Wrangling, Data Wrangling Tools, Introduction to Python,
Python Basics, Data Meant to be Read by Machines, CSV Data, JSON Data, XML Data.
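As an illustration of the machine-readable formats listed in this unit, the following is a minimal sketch that parses the same records from CSV and from JSON using only the Python standard library. The sample data, the column names (`name`, `score`) and the helper names are hypothetical and not part of the syllabus.

```python
import csv
import io
import json

# Hypothetical sample data, used only for illustration.
CSV_TEXT = "name,score\nasha,82\nravi,91\n"
JSON_TEXT = '[{"name": "asha", "score": 82}, {"name": "ravi", "score": 91}]'


def read_csv_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_rows(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def max_score(rows):
    """Return the largest 'score' value; CSV values arrive as strings, so cast."""
    return max(int(row["score"]) for row in rows)


csv_rows = read_csv_rows(CSV_TEXT)
json_rows = read_json_rows(JSON_TEXT)
print(max_score(csv_rows), max_score(json_rows))
```

Note that `csv.DictReader` yields every field as a string, while `json.loads` preserves numeric types; the explicit `int(...)` cast lets one function handle both sources.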
UNIT II:
WORKING WITH EXCEL FILES AND PDFS: Installing Python Packages, Parsing Excel Files,
Getting Started with Parsing, PDFs and Problem Solving in Python, Programmatic Approaches
to PDF Parsing, Converting PDF to Text, Parsing PDFs Using pdfminer, Acquiring and Storing Data-
Databases: A Brief Introduction, Relational Databases: MySQL and PostgreSQL, Non-Relational
Databases: NoSQL, When to Use a Simple File, Alternative Data Storage.
UNIT III:
DATA CLEANUP: Why Clean Data, Data Cleanup Basics, Identifying Values for Data Cleanup,
Formatting Data, Finding Outliers and Bad Data, Finding Duplicates, Fuzzy Matching, RegEx Matching,
Normalizing and Standardizing the Data, Saving the Data, Determining suitable Data Cleanup, Scripting the
Cleanup, Testing with New Data
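Several of the cleanup topics in this unit (formatting data, finding duplicates, fuzzy matching, RegEx matching) can be sketched with the Python standard library alone. The example below is one possible approach, not a prescribed method: the function names, the 0.8 similarity threshold and the date pattern are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumed pattern for ISO-style dates (YYYY-MM-DD); adjust to the data at hand.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def normalize(value):
    """Lowercase, trim, and collapse runs of whitespace to a single space."""
    return re.sub(r"\s+", " ", value.strip().lower())


def find_duplicates(values):
    """Return normalized values that occur more than once, in first-seen order."""
    seen, dupes = set(), []
    for v in map(normalize, values):
        if v in seen and v not in dupes:
            dupes.append(v)
        seen.add(v)
    return dupes


def is_fuzzy_match(a, b, threshold=0.8):
    """Treat two strings as the same entity if their similarity ratio is high."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def looks_like_iso_date(value):
    """RegEx check that a field has the shape YYYY-MM-DD (shape only, not validity)."""
    return bool(DATE_PATTERN.match(value))


print(find_duplicates(["Delhi", " delhi", "Mumbai"]))
print(is_fuzzy_match("Hyderabad", "Hyderbad"))
```

Normalizing before comparing keeps the duplicate check and the fuzzy match consistent with each other; the threshold would normally be tuned against known matches in the actual dataset.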
UNIT IV:
DATA EXPLORATION AND ANALYSIS: Exploring Data, Importing Data, Exploring Table Functions,
Joining Numerous Datasets, Identifying Correlations, Identifying Outliers, Creating Groupings, Analyzing
Data, Separating and Focusing the Data, Presenting Data, Visualizing the Data, Charts-Time-Related Data,
Maps, Interactives, Words-Images, Video, and Illustrations, Presentation Tools, Publishing the Data, Open
Source Platforms
UNIT V:
WEB SCRAPING: What to Scrape and How, Analyzing a Web Page, Network/Timeline, Interacting with
JavaScript, In-Depth Analysis of a Page, Getting Pages, Reading a Web Page, Reading a Web Page with
LXML, XPath-Advanced Web Scraping, Browser-Based Parsing, Screen Reading with Selenium, Screen
Reading with Ghost.Py, Spidering the Web, Building a Spider with Scrapy, Crawling Whole Websites with
Scrapy.
Text Books:
1. Data Wrangling with Python, Jacqueline Kazil & Katharine Jarmul, O’Reilly Media, Inc, 2016
2. Data Wrangling with Python: Creating actionable data from raw sources, Dr. Tirthajyoti Sarkar,
Shubhadeep, Packt Publishing Ltd, 2019
Reference Books:
1. Hands-On Data Analysis with Pandas, Stefanie Molin, Packt Publishing Ltd, 2019
2. Practical Data Wrangling, Allan Visochek, Packt Publishing Ltd, 2017
3. Principles of Data Wrangling: Practical Techniques for Data Preparation, Tye Rattenbury, Joseph M.
Hellerstein, Jeffrey Heer, Sean Kandel, Connor Carreras, O’Reilly Media, Inc, 2017