Data Ecosystem and Life Cycle
3. Wrangling
• Data wrangling is a set of processes designed to transform raw data
into a more usable format.
• It may involve merging multiple datasets, identifying and filling gaps
in data, deleting unnecessary or incorrect data, and “cleaning” and
structuring data for future analysis.
• Data wrangling tools include OpenRefine and DataWrangler, among others.
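The wrangling steps listed above (merging datasets, filling gaps, deleting incorrect data) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the `customers` and `orders` records, the `wrangle` helper, and the rules "a negative amount is incorrect" and "fill gaps with the mean" are all assumptions made for the example, not part of the slides or of any particular tool.

```python
from statistics import mean

# Hypothetical raw data: two small datasets that share an "id" key.
customers = [
    {"id": 1, "name": "Ana"},
    {"id": 2, "name": "Ben"},
    {"id": 3, "name": "Chi"},
    {"id": 4, "name": "Dee"},
]
orders = [
    {"id": 1, "amount": 20.0},
    {"id": 2, "amount": None},   # gap: the value is missing
    {"id": 3, "amount": -5.0},   # assumed incorrect: negative amount
    {"id": 4, "amount": 40.0},
]


def wrangle(customers, orders):
    # Merge: join the two datasets on the shared "id" key.
    amounts = {row["id"]: row["amount"] for row in orders}
    merged = [{**c, "amount": amounts.get(c["id"])} for c in customers]

    # Delete incorrect data: drop rows whose amount is negative.
    merged = [r for r in merged if r["amount"] is None or r["amount"] >= 0]

    # Fill gaps: replace missing amounts with the mean of the known ones.
    known = [r["amount"] for r in merged if r["amount"] is not None]
    fill = mean(known) if known else 0.0
    return [
        {**r, "amount": fill if r["amount"] is None else r["amount"]}
        for r in merged
    ]


clean = wrangle(customers, orders)
for row in clean:
    print(row)
```

Filling with the mean is just one possible choice; depending on the data, dropping incomplete rows or using a different estimate may be more appropriate.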
Module Code & Module Title Slide Title SLIDE 8
4. Analysis
• After raw data has been inspected and transformed into a readily
usable state, it can be analyzed.
• Analysis can be done using algorithms, statistical models, and
visualization tools.
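As one small illustration of a statistical model applied to cleaned data, the sketch below fits a straight line by ordinary least squares. The `fit_line` helper and the `spend` and `units` figures are hypothetical values invented for the example; a real analysis would normally rely on a statistics library and real data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of the x-y cross deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical cleaned data: advertising spend vs. units sold.
spend = [1, 2, 3, 4, 5]
units = [3, 5, 7, 9, 11]

slope, intercept = fit_line(spend, units)
print(f"units = {slope:.2f} * spend + {intercept:.2f}")
```

The fitted slope and intercept could then feed a visualization or the later interpretation stage.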
5. Storage
• Throughout all of the data life cycle stages, data must be stored in a
way that’s both secure and accessible.
• Cloud-based storage solutions: store data off-site and access it remotely.
• On-site servers: give organizations a greater sense of control over how data is stored and used.
• Other storage media: includes hard drives, USB devices, CD-ROMs, and floppy disks.
DATA LIFE CYCLE
• All data projects follow the same basic life cycle from start to finish.
This life cycle can be split into eight common stages, steps, or
phases:
8. Interpretation