Dashrath Nandan BDA (Unit-2)Notes
Dashrath Nandan BDA (Unit-2)Notes
clAsSMAte
Date
Page
Narrdam.
Diadantges
Eis tault toleanel
-i) Hrly aualloble
i) Stabilit iuus:
Secwit (oncens
y Cemblexityy lakneys limiled
autpot for Skavi~kored data
tAboche Sbauk bewCl, Uni<id analic
Abache Shank s an Shon- andl
buscesing- dt sustot beh bafh
engine fen big data in 009 in UC Gerkely
steah bbcesing Shank" stanled
RD Lab and in boit became Gsen seue wnder BsDliceñce
XOmbenints tf Aache Shask: SPark CO1e:
SPank
SPak SQL_Shearing Henenal Executve engne
Fundamental lnit?
SPork CDAe
R "In-memory Combutaien
APIScala 'ython Jawa Baste /o funckonaliby.
dt ç genenal exeuhon engine for
D Abache Spak oHe : API that defines 'RoDe
spark plakom Home "to knd
his tompenent sheratu uith many
i SbakSÑL:Structuneod data.
Trldent
6-Event-tne Limike Comprehonsie
walemaks.
uith
|Suyport
*| KeyFeatuxesiPertomance
dnleqraBienPrBesingsdntenactine
wuith Biq Data Platonå, Dashboand
tlgh
Ease sf lse, Adianced Arnalyis 7Scalhbilty,alabonatens
¢ Sharing"
Lae Cases ;-Busines dntaltignetRitostinqsCuomes Analyia,
Heathcane Datk Anaysi, IdT Data Monifong.
Chalenge:-Cost, leaning Cunve ,Jesendeney ot Data Prepavahon
Pouon BI
data viaualizatien toel that inabes Aen to ©nalyge big
data and reate tinsiphtul, intratiue aashbedid an
* Key FeakoresiWide Range 8 Data nectiity, saalaliltby
RT Misualgatien, Avanced Aalyts andAI Tntepaten
"Seamlet Colaboahon. rle- leuel Seciry.
clasSate
Date
Page
*fowey BI Axchitectwe
Powen BI Seice
Power BI
Data Serce peliverg
4Powes BI Rebort Sewen
Abache Zephelin
Apache Zebbelin is an bben-Sewne web based nptthock
Hhat sushote interactiue'data analys Nismaligatos
and collabonation fo bip data ueitouo JVM
* AchiBecte-JVM- spank Inerpren rOub
zeesetin Sower fSpark SFanKSEA
*NoSOL DatabaAes
NaSOL databases ane non- Helational and designed fo
lenibility, scalabitt,anad handlng nstiuchuud
Examþles > MongoD8, Cassanda, Redis, and Couchbase .
*Key Fatwesi
Flexible Data Models : Cam Stove dataas key-vale faiy dlocumendea
Scalability : Horzortal Soaling andEastie sealing.
Big Data Handling "Fault Tolrane High Petomane
Donammic Schena oCat effechie keat Tine licatins
* Limitations:-olack of AcID Tonsactens "limibd uengirg
"ata Redundaney and Sntegiy " limitd uypost for ote
" lack ef buill -in Reportng Tlol " Dshibuted heiiectue
**Use Cassi-Big Data dRT. Analytes Social Mudaf Recoamndt
Enginw IóTPSenso Datl "Commere "CMS
*Types ef NoSCQL Databases
NOSOL
|Key,Value DB Doupent DE Colupn-DE
Redis 'Ceuch DB "Cassanda D8 Neot
Dynano DB "HBQse
MongoD8
1 Keu Vale Stoves : Stores data as key-value Pains
2-Deument Stores: Sores data as JSON- ke douuments.
Stores data in ceumn-Oninteo
3.Columm- Family Sores i tomat for tast fetrieual
4. Grebh DB: Store data s NEde Dnd elatienshs, seful
foe soal metusks and recomnendateng
clAssMAte
Date
Page
CouchDB
Abache CouchDB is an oben-sawre, NosQL doument
datobase thot tocusts on dcalability cay fuse and
data reblicatien t ses a schema-ce, JSON-bad
stoage model and RESTul HTIP APL foN data
intokacen
clssMste
Date
Page
Docmant
Raplia Document
HTTP Couch DB
REQUEST Engine Reblica +Document
DB2 DOcment
RD83
* Key fëokunes:
Storage format: tstores data ay JsoN document , Sehema- free
Document Model: Similar to MongaDB.
Oueey Language: t uses MabReduce to cHeate uiew.
: ReplicaHon Jt ecces in Multi- master ssliaten
Scalabilit i Support horizonia scaling ttosugh Cstering.
*UBe Cases i- 0Hline- Fist Abbs, Mobile Sync .
equining fst, Hesitle
Noe Mon go DB is beter sutted fe2 Qsslen Conpleo
guevying scalábilty and agpephen Capab
CoLchDB ercels in enuironment that need subpont fer
slicaken s dstrikuted yen and fline capablitet-
Ceumn Family StoHes(Cassandwa , HBase)
*Cassomdya No SQL
Abache Casanda isahighly scalable,dsbibuted muiple
database deuigned to handle bi¡ data acres
tolenance
Souers with high avaiabilithy Qnd faut
canamda
Data Models Similar to into regiong.
Regions and Rogibn Sewer : Dta in HBase bbaationed
Data Acce fouides ou (ateney rad-wnite Obenatong
Consisteney : Pouide Shong- Consstency at tsw-leuel
tnkeasaheA ILih Hadoep. "Tnteqyaked uel wuit h Hadsep
clssmste
Date
Page
Redlis
Redis (Remste Dictonay Sexver) ua high-Porformamce,
in-memny NoSQL dadbase buimarily used for caching,
yeal-time anayties,
UMain
and mesage brekoing
Reicated
Instance
Rebliakon Read Clet
Read Cint
Kecp data in memors fo fast acces lCan bersist dta
en dik Ghenally)
DataTpe Suppouhed- String, hashes, list, Set,Sorted Sek
*Advantngl:- Extrerny Fast, Subpont atomic obenaion on ds.
Digodu - Memory band, Not auitable fo Canjlex Quwying
XDunamo DB bcalable
Aullymanaged, highly aniable, and
Ko-alueand deuhent atabase seuie bueuided
b Aws.
toes data en SsD-backend torage uith ephonal
in-mernly Laking (DAX)
classMste
Date
Page
Pimoy key
Amazon
DAX
Dynamo Table
Clnt AWS Amazon Pk SK AHihety
Cloud Shell Elasfc Cache Dynamo
OB
*AI in Ba Data
Jhe integiaon G¢ AI and ML into the domain &t Bio Data
enhanis dlata puocesng, dlecision- making, fuomatin
+AI and ML þouide the'conputahonal peuer Qnd algo
þaedictions
to analyeBip Data, neeuer fattekns f make
One y he nmast bignicant abpliahien i NLP.
*NLPNawal longuage PDCM0n)
NLPisa branch f AI that nables mmachnes te
endeutmd, inlepet, andqemesat human languge
clssmate
Date
Page
*NP Technigues
J Text Poesing and Preboceslng in NLPi
Tökemization: Diuidirg text into smaller Unit, eq hlard or Semloncs.
Stemming t lemmatizaton : Reducing word to hein bau fom(kk
Stopuoxdkemoval: KRernouing (omen uords (like'ard", "s"
Text nonalizakon: Stardamdizing text, punchuahong
2- Sentinent and EmoHon Analyso in NLPtmgoaHig: speling
Emotion Delecion : Talentikying amd (alkgosizing embton expyesed in text
Cpiniern Nintig: Analyzing'pinions to undenstand þuttie
publie sentinent
5.Spezch Rosng: Kpch Recogniton, Text-to-Speuch (Trs).
4- D'aloguUe Syste' Chatbots anil VirtualAsiutants
Predtckve Analtte : tt ses AIAML tachnigues to forcat
tutne euteme. Data Collecion, DaBa freproceking,
Featre Engneesing Statstical tralysit ,MLmodes.
Tianahon , TextSummani
5. Lamouageenena Son : Machine
-Zation , Tect G1eneHon
inalyycs
Seanch ixt in mullle lang fu inise.
LNes Cognthe Voice aistants
2 Conyesálional AT: Chatbots dndConvents &peach-to-text f ua-lea
3-Sech Recogrthon emexahion
4.A1- weed AutomaHon : Reducts manual etorts
5Tndusty-Specife Solutons
hatson for Oncblegy hels in toeatment blanning.
"Heathcane: Hisk asesment, busonalized bankina
Frnance: Faud delechon ,
stomer Mecommed', Inwentry OpHnization
Retail : Porsonalied Leaning.
Legal t Educaen: Contract analyi,
*Challenges i- Data Puivacy Cempliance
"Technical Expertie.
Cetly lstoiFatin Niede.
alssmate
Date
Page
*ateon Seuices
Key hlatkon 6erucs aHe
Aoson DeovKY: A1-foweNed beanch and analyts brgin
ddealtaA exbatting Inatghts fyom shuchuned 4 Unluchuttd
data useflis bedmcane ,legal f Customer seuics:
key FeatweL b> NLP, Document clasihiatbn QlA,
mocdels.
Pre-haned nodlls
Aanced Search Catabilitier
2Watsen Studlo: lonþsehensiue enuinenment foe duýning
raining, and desleying ML modls uakáing
Peyect fo dah scentikl1, analyst and deueloteuele.
npnane , retail,
n custon pedthie models Lyeycle , Brigamming lang Sufert
key Feaheres t> End-to- End AI rteton
GutoAI Collaboratien Fahres , Coud
3.Waton Assisant Conuersatenal AI platssm for bildng
ntelligent litual agnt ancl chatbat.
imþroving us¿n laperitne Qnd autmatng
deal fer velail ftelecomunicl.
ustoner beruihin baikg, TempatU, Conkxd Manage,
Key Featoes:> NLU,Pre buil!
Backenddnegyatn. Liu Bqent Escalatiens
in Bia Data Decision -Mating
*Naten struclsed fUnsbikadda
doka tfa
* Role :>"Processld vast like a human
"llsesNLPfML to Hestond deliuers kIinsight.
ddentfispattern,tends, Qnd
Snteqyalon Methods
+RESTl APTs "IBM hlatson Studio "lutam Connechers
IBM Cloud fak tor DaBa. "Evend Sbeaming- kafea
Use Case-Data Ennichment, bedietie nalytids
Sentimet rasis, Chatbot and Vahual Astat.
Healhcar Analyte
*alalson Corobonends Wason Assistant, Watson DjsCovexy
Watson Skudio, Walson NZU, Wason Vual Rec9nihon,
Text- to-Sßeech Sheech- to- Teot, ete
Secnity anPrluay
d Data )
loneaithtegratin Data D
*Challenges:
koperitnes:
Jmprovedcustomer
SeruLies Lghod
Seruices - batson
otson baed leud thaughScalabilty
tine-
to- Jastn
utomaHon. making decision 4Enhaced
Anteguatien: Bemfit *
t. habenAsitant
delie
AI do and MengoDB
intenactien
data
in Customer StMorgoDB:
one Wasont m)
nstuchued
data amoundt
of Vast Store Hadoop 1)Natson+
i
analyss cogniiue fohlatson
r amd
dishibuted
daa for Spvk spak :+Apache )ason
Page
Date
classmate