Bana1 Visualization
PERSPECTIVE OF BUSINESS ANALYTICS
STATISTICS
1. Descriptive statistics
Descriptive statistics is what organizations use to summarize their data. It typically involves
summary charts, graphs and tables that make the data easier to comprehend than raw,
unorganized figures. Useful measures that come from descriptive statistics include the mode,
median and mean, as well as the range, variance and standard deviation. Descriptive statistics are not,
however, meant to draw conclusions beyond the data at hand.
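The measures named above can be computed with Python's standard statistics module. This is a minimal sketch; the daily_orders figures are invented for illustration and do not come from any real data set.

```python
import statistics

# Hypothetical daily order counts, invented for illustration only.
daily_orders = [12, 15, 15, 18, 20, 22, 15, 30]

summary = {
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "mode": statistics.mode(daily_orders),
    "range": max(daily_orders) - min(daily_orders),
    "variance": statistics.variance(daily_orders),  # sample variance
    "std_dev": statistics.stdev(daily_orders),      # sample standard deviation
}

# Print each measure rounded to two decimal places.
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that variance and stdev here are the sample versions; statistics.pvariance and statistics.pstdev would apply if the list were the whole population.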
2. Inferential statistics
Inferential statistics offers a way to take the data from a representative sample and use it to draw broader
conclusions. It allows organizations to extrapolate beyond the data set, going a step further than descriptive
statistics. Statistical inference relies heavily on finding as representative a sample as possible from which to
draw conclusions about a wider population. Because there is always uncertainty in extrapolating from a
limited set of data to a wider population, statistical inference also relies on estimating the uncertainty in its
predictions.
Key takeaway: Descriptive statistics are used to describe data, while inferential statistics are
used to infer conclusions and test hypotheses about the wider population the data came from.
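One common way to express the uncertainty in an inference is a confidence interval around a sample mean. The sketch below is an illustration under stated assumptions: the function name mean_confidence_interval and the sample_spend values are hypothetical, and the interval uses a normal approximation, which is reasonable only for fairly large samples (a t-distribution would be more appropriate for small ones).

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumes a normal approximation; illustrative helper, not a library API.
    """
    n = len(sample)
    sample_mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical spend per customer for a sample of 30 customers.
sample_spend = [
    42.5, 38.0, 51.2, 47.8, 39.9, 44.1, 55.3, 41.0, 36.7, 49.5,
    43.2, 46.8, 40.4, 52.9, 45.0, 37.5, 48.3, 50.1, 42.0, 44.7,
    39.2, 53.6, 41.8, 46.0, 43.9, 47.1, 38.8, 49.0, 45.6, 42.7,
]

low, high = mean_confidence_interval(sample_spend)
print(f"95% interval for average spend: {low:.2f} to {high:.2f}")
```

The sample mean alone is a descriptive statistic; the interval around it is the inferential step, since it makes a claim about customers who were not sampled.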
What are the benefits of statistical analysis?
4. Improve decision-making.
What is statistical analysis software?
This software can deliver the specific analysis an organization needs to better
its business.
Such software can quickly and easily generate charts and graphs when
conducting descriptive statistics while at the same time running the more
sophisticated computations that are required when conducting inferential
statistics.
The more popular statistical analysis software services include IBM’s SPSS,
SAS, R (an open-source language, for which Revolution Analytics offered a
commercial distribution), Minitab, Stata and Tableau, which is now part of
Salesforce.
Software features
Typical analytical functions include standard
modeling, confidence intervals and probability
calculations. They provide the core value of
statistical software and are the primary reason
to invest in such systems in the first place.
Despite that, analytical features should not be
your primary concern when shopping for
statistical analysis software.
KEY TAKEAWAYS
Data mining is the process of analyzing a large batch of information to discern trends
and patterns.
Data mining can be used by corporations for everything from learning about what
customers are interested in or want to buy to fraud detection and spam filtering.
Data mining programs break down patterns and connections in data based on what
information users request or provide.
Social media companies use data mining techniques to commodify their users in order
to generate profit.
This use of data mining has come under criticism lately as users are often unaware of
the data mining happening with their personal information, especially when it is used
to influence preferences.
How Data Mining Works
Data mining involves exploring and analyzing large blocks of information to glean
meaningful patterns and trends. It can be used in a variety of ways, such as database
marketing, credit risk management, fraud detection, spam email filtering, or even to
discern the sentiment or opinion of users.
First, organizations collect data and load it into their data warehouses. Next, they store and
manage the data, either on in-house servers or the cloud. Business analysts, management
teams, and information technology professionals access the data and determine how they
want to organize it. Then, application software sorts the data based on the user's results,
and finally, the end-user presents the data in an easy-to-share format, such as a graph or
table.
Data Warehousing
Data Mining Techniques
Data mining uses algorithms and various techniques to convert large collections of data into useful
output. The most popular types of data mining techniques include:
Association rules, also referred to as market basket analysis, search for relationships between
variables. These relationships create additional value within the data set by linking pieces of
data. For example, association rules would search a company’s sales history to see which products are
most often purchased together; with this information, stores can plan, promote, and forecast accordingly.
Clustering is similar to classification. However, clustering identifies similarities between objects, then
groups those items based on what makes them different from other items. While classification may
result in groups such as "shampoo", "conditioner", "soap", and "toothpaste", clustering may identify
groups such as "hair care" and "dental health".
Decision trees are used to classify or predict an outcome based on a set list of criteria or decisions. A
decision tree is used to ask for input of a series of cascading questions that sort the dataset based on
responses given. Sometimes depicted as a tree-like visual, a decision tree allows for specific direction
and user input when drilling deeper into the data.
K-Nearest Neighbor (KNN) is an algorithm that classifies data based on its proximity to other data.
The basis for KNN is the assumption that data points that are close to each other are more similar
to each other than to other bits of data. This non-parametric, supervised technique is used to predict
features of a group based on individual data points.
Neural networks process data through the use of nodes. Each node is made up of inputs,
weights, and an output. Data is mapped through supervised learning (similar to how the human brain
is interconnected). The model can be fitted with threshold values to determine its accuracy.
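The K-Nearest Neighbor idea described above can be sketched in a few lines of plain Python. This is an illustration, not a production implementation: the function knn_predict, the customer features (monthly visits, average basket value) and the segment labels are all hypothetical.

```python
import math
from collections import Counter

def knn_predict(training_points, query, k=3):
    """Classify `query` by majority vote among its k nearest labelled points."""
    # Sort labelled points by Euclidean distance to the query point.
    by_distance = sorted(training_points, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    # The most common label among the k nearest neighbours wins.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical customers: (monthly visits, average basket value) -> segment.
training_points = [
    ((2, 15.0), "occasional"),
    ((3, 20.0), "occasional"),
    ((1, 10.0), "occasional"),
    ((12, 80.0), "loyal"),
    ((10, 95.0), "loyal"),
    ((11, 70.0), "loyal"),
]

print(knn_predict(training_points, (9, 85.0)))  # nearest neighbours are all "loyal"
print(knn_predict(training_points, (2, 12.0)))  # nearest neighbours are all "occasional"
```

In practice the features would usually be scaled first, because a feature with a larger numeric range (here, basket value) otherwise dominates the distance calculation.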
The Data Mining Process
Before any data is touched, extracted, cleaned, or analyzed, it is important to understand the underlying entity and the
project at hand. What are the goals the company is trying to achieve by mining data? What is their current business
situation? What are the findings of a SWOT analysis? Before looking at any data, the mining process starts by
understanding what will define success at the end of the process.
Once the business problem has been clearly defined, it's time to start thinking about data. This includes what sources
are available, how the data will be secured and stored, how information will be gathered, and what the final outcome or analysis
may look like. This step also considers critically what limits there are to data, storage, security, and collection and
assesses how these constraints will impact the data mining process.
It's now time to get our hands on information. Data is gathered, uploaded, extracted, or calculated. It is then cleaned,
standardized, scrubbed for outliers, assessed for mistakes, and checked for reasonableness. During this stage of data
mining, the data may also be checked for size, as an overly large collection of information may unnecessarily slow
computations and analysis.
With our clean data set in hand, it's time to crunch the numbers. Data scientists use the types of data mining above to
search for relationships, trends, associations, or sequential patterns. The data may also be fed into predictive models
to assess how previous bits of information may translate into future outcomes.
The data-centered aspect of data mining concludes by assessing the findings of the data model(s). The outcomes from
the analysis may be aggregated, interpreted, and presented to decision-makers who have largely been excluded from the
data mining process to this point. In this step, organizations can choose to make decisions based on the findings.
The data mining process concludes with management taking steps in response to the findings of the analysis. The
company may decide the information was not strong enough or the findings were not relevant enough to change course.
Alternatively, the company may strategically pivot based on the findings. In either case, management reviews the ultimate
impacts on the business and starts future data mining loops by identifying new business problems or
opportunities.
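The data preparation stage of the process above (cleaning the data and scrubbing it for outliers) can be sketched as follows. The helper split_outliers and the raw_order_values figures are hypothetical, and the 1.5 x IQR rule is only one common convention for flagging outliers, not a method prescribed by the slides.

```python
import statistics

def split_outliers(values, k=1.5):
    """Separate values inside the IQR fences from those flagged as outliers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    flagged = [v for v in values if v < low or v > high]
    return kept, flagged

# Hypothetical order values; None marks a missing entry, 560 looks like a typo.
raw_order_values = [102, 98, None, 105, 97, 101, 99, 103, 100, 560, 104]

# Step 1: drop missing entries. Step 2: flag values outside the fences.
present = [v for v in raw_order_values if v is not None]
clean, suspicious = split_outliers(present)

print("kept:", clean)
print("flagged for review:", suspicious)
```

Flagged values are set aside for review rather than silently deleted, since an apparent outlier may be a genuine and important observation.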
Applications of Data Mining
Sales
Marketing
Manufacturing
Fraud Detection
Human Resources
Customer Service
Benefits of Data Mining
Data mining ensures a company is collecting and analyzing reliable data. It is often a more rigid,
structured process that formally identifies a problem, gathers data related to the problem, and
strives to formulate a solution. Therefore, data mining helps a business become more profitable,
efficient, or operationally stronger.
Data mining can look very different across applications, but the overall process can be used with
almost any new or legacy application. Essentially any type of data can be gathered and analyzed, and
almost every business problem that relies on quantifiable evidence can be tackled using data mining.
The end goal of data mining is to take raw bits of information and determine if there is cohesion or
correlation among the data. This benefit of data mining allows a company to create value with the
information they have on hand that would otherwise not be overly apparent. Though data models
can be complex, they can also yield fascinating results, unearth hidden trends, and suggest unique
strategies.
Limitations of Data Mining
1. Complexity - The complexity of data mining is one of the largest disadvantages of the
process. Data analytics often requires technical skill sets and certain software tools. Some
smaller companies may find this barrier to entry too difficult to overcome.
2. No guaranteed results - A company may perform statistical analysis,
draw conclusions based on strong data, implement changes, and still not reap any benefits.
Because of inaccurate findings, market changes, model errors, or inappropriate data populations,
data mining can only guide decisions and not ensure outcomes.
3. Cost - There is also a cost component to data mining. Data tools may require costly ongoing
subscriptions, and some bits of data may be expensive to obtain. Security and privacy
concerns can be addressed, though the additional IT infrastructure may be costly as well. Data
mining may also be most effective when using huge data sets; however, these data sets must
be stored and require heavy computational power to analyze.
DATA VISUALIZATION
Good data visualization is essential for analyzing data and making decisions
based on that data. It allows people to quickly and easily see and
understand patterns and relationships and spot emerging trends that might
go unnoticed with just a table or spreadsheet of raw numbers. And in most
cases, no specialized training is required to interpret what’s presented in
the graphics, enabling universal understanding.
A well-designed graphic can not only provide information, but also heighten
the impact of that information with a strong presentation, attracting
attention and holding interest as no table or spreadsheet can.
How data visualization works
A graphic should always take into consideration the data type and
purpose. Some information is better suited to one type of graphic over
another: for example, a bar graph instead of a pie chart. But with most
tools, the user has a wide choice of visual analytics options, from
common charts such as line graphs and bar charts to timelines, maps,
plots, histograms, and custom designs.
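To illustrate how even a simple graphic exposes a pattern that a column of numbers hides, the sketch below draws a text bar chart using only the Python standard library. The function text_bar_chart and the monthly_sales figures are hypothetical; in practice a plotting library or a visualization tool would normally be used instead.

```python
def text_bar_chart(data, width=40):
    """Return one line per category, with bar length proportional to the value."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Scale each bar so the largest value fills the full width.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)

# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = {"Jan": 120, "Feb": 95, "Mar": 180, "Apr": 240}

print(text_bar_chart(monthly_sales))
```

A bar chart suits this data because the categories are discrete and the goal is to compare magnitudes; a line graph would be the more natural choice for a long, continuous time series.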
References
https://www.investopedia.com/terms/d/datamining.asp
https://www.businessnewsdaily.com/6000-statistical-analysis.html
https://www.oracle.com/ph/business-analytics/what-is-data-visualization/#:~:text=Data%20visualization%20is%20part%20of,another%20type%20of%20visual%20presentation