Life Cycle of Data Analytics
The data analytics lifecycle was designed for Big Data problems and data
science projects. The process is iterative, which reflects how real projects
unfold. To meet the specific demands of analysing Big Data, a step-by-step
methodology is required to plan the tasks associated with the acquisition,
processing, analysis, and reuse of data.
Phase 1: Discovery -
o The team explores the data to learn about the relationships between variables,
and then selects the most significant variables and the most suitable models.
o In this phase, the data science team creates data sets that can be used for
training, testing, and production.
o The team builds and executes models based on the work completed in the
model planning phase.
o Tools commonly used for this stage include MATLAB and STATISTICA.
o The team creates datasets for training, testing, and production use.
o The team also evaluates whether its current tools are sufficient to run the
models or whether a more robust environment is required.
o Free or open-source tools - R and PL/R, Octave, WEKA.
o Commercial tools - MATLAB, STATISTICA.
o After the model has been executed, the team evaluates its outcomes against
defined criteria to judge whether the model succeeded or failed.
o The team considers how best to present findings and outcomes to team
members and other stakeholders, taking caveats and assumptions into
account.
o The team should identify the most important findings, quantify their value to
the business, and develop a narrative that summarizes and conveys them to
all stakeholders.
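
The dataset creation and model evaluation steps listed above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a method prescribed by the lifecycle: the 60/20/20 split, the synthetic data, the least-squares line used as the "model", and the function names (split_dataset, fit_line, meets_success_criterion) are all hypothetical choices made for the example.

```python
import random


def split_dataset(rows, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle rows and split them into training, testing, and production sets.

    The fractions are illustrative assumptions; whatever remains after the
    training and testing slices is treated as the production set.
    """
    if not 0 < train_frac + test_frac <= 1:
        raise ValueError("train_frac + test_frac must be in (0, 1]")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    production = shuffled[n_train + n_test:]
    return train, test, production


def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(model, points):
    """Average absolute difference between predictions and observed values."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)


def meets_success_criterion(error, max_error):
    """The model counts as a success if its test error is within the agreed limit."""
    return error <= max_error


# Synthetic data (assumption): points that lie exactly on y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(50)]
train, test, production = split_dataset(data)
model = fit_line(train)                    # model building uses the training set
error = mean_absolute_error(model, test)   # evaluation uses the held-out test set
print(len(train), len(test), len(production))
print(meets_success_criterion(error, max_error=0.5))
```

Keeping the test set separate from the training set is what makes the success-or-failure judgement meaningful: the model is scored on data it has not seen.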
Phase 6: Operationalize -
o The team communicates the benefits of the project to a wider audience. It sets
up a pilot project that deploys the work in a controlled manner before
expanding it to the entire enterprise or user base.
o This technique allows the team to gain insight into the performance and
constraints related to the model within a production setting at a small scale and
then make necessary adjustments before full deployment.
o The team delivers the final reports, presentations, and code.
o Free or open-source tools used in this phase include WEKA, SQL, MADlib, and Octave.
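
One possible way to run a pilot in a controlled manner is to route only a fixed share of users to the new model and widen that share later. The sketch below assumes users are identified by a string ID and uses a hash to assign each one to a stable bucket; the function name in_pilot, the user IDs, and the 10 percent figure are hypothetical and serve only to illustrate the idea.

```python
import hashlib


def in_pilot(user_id, pilot_percent):
    """Return True if this user falls inside the pilot group.

    Each user ID is hashed into one of 100 buckets, so the same user always
    gets the same answer, and raising pilot_percent only ever adds users.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < pilot_percent


# Hypothetical user base of 1000 users, with roughly 10 percent in the pilot.
users = [f"user-{i}" for i in range(1000)]
pilot_users = [u for u in users if in_pilot(u, 10)]
print(len(pilot_users), "of", len(users), "users are in the pilot")
```

Because the assignment is deterministic, performance observed during the pilot can be traced to a fixed group of users, and the rollout can be expanded by raising the percentage without moving anyone out of the pilot.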