The document outlines the data processing chain essential for business intelligence, emphasizing the importance of data modeling, warehousing, and mining to derive insights for decision-making. It discusses various data types, challenges, and techniques such as regression, decision trees, and cluster analysis, as well as the applications of BI in sectors like retail, healthcare, and marketing. Additionally, it highlights the necessity of data visualization and BI tools for effective communication and strategic advantage.
Business Intelligence
Data Processing Chain
• Data lies at the heart of business intelligence.
• Data can be modelled and stored in a database.
• Relevant data can be extracted from the operational data stores for specific reporting and analysis purposes, and stored in a data warehouse.
• The data from the warehouse can be combined with other sources of data, and mined using data mining techniques to generate new insights.
• The insights need to be visualized and communicated to the right audience in real time for competitive advantage.
Data
• Anything that is recorded is data. Observations and facts are data. Anecdotes and opinions are also data, of a different kind.
• Data can be numbers, such as the record of daily weather or daily sales.
• Data can be alphanumeric, such as the names of employees and customers.
Data
• Data is the source of insights and decisions. Analysis is the essential step that transforms data into information and knowledge. The process is often complex, because it has to deal with different types of data and many kinds of data use problems.
• Different types of data
  • Numeric vs. textual
  • Structured vs. unstructured
  • Standard format vs. proprietary format
  • Internal vs. external data, system stored vs. file-based data
  • Raw fact data vs. simulated/forecast/estimated data
  • Simple fact data vs. calculated metrics data
• Common data use challenges
  • Structured, unstructured, semi-structured
  • Information and knowledge management is the management of both structured data (15% of information) and unstructured data (85% of information), according to the Butler Group.
  • 80 percent of business is conducted on unstructured information (Gartner Group).
Database
• A database is a modeled collection of data that is accessible in many ways.
• A data model can be designed to integrate the operational data of the organization.
• Example of a sales organization:
A data model for managing customer orders will involve data about customers, orders, products, and their interrelationships.
Data Warehouse
• A data warehouse is an organized store of data from all over the organization, specially designed to help make management decisions.
• Data can be extracted from the operational database to answer a particular set of queries.
• This data, combined with other data, can be rolled up to a consistent granularity and uploaded to a separate data store called the data warehouse.
• The data warehouse is a simpler version of the operational database, with the purpose of addressing reporting and decision-making needs only.
• The data in the warehouse grows cumulatively as more operational data becomes available and is extracted and appended to the data warehouse.
• Unlike in the operational database, the data values in the warehouse are not updated.
Example of a Data Warehouse
• To create a simple data warehouse for the movies sales data, assume a simple objective of tracking sales of movies and making decisions about managing inventory.
• In creating this data warehouse, all the sales transaction data will be extracted from the operational data files.
• The data will be rolled up for all combinations of time period and product number. Thus, there will be one row for every combination of time period and product. The resulting data warehouse will look like the table.
• The data in the data warehouse is held at a much lower level of detail than in the transaction database.
Data Mining
• Data mining is the art and science of discovering useful, innovative patterns from data. Data mining should be done to solve high-priority, high-value problems.
• Much effort is required to gather data, clean and organize it, mine it with many techniques, interpret the results, and find the right insight. One should select the right data.
• A retail company may use data mining techniques to determine which new product categories to add to which of its stores; how to increase sales of existing products; which new locations to open stores in; how to segment customers for more effective communication; and so on.
BI data mining process
• The first important step is business understanding, that is, asking the right business questions.
• A question is a good one if answering it would lead to large payoffs for the organization, financially and otherwise.
• There should be strong executive support for the data mining project, which means that the project aligns well with the business strategy.
Data understanding
• A second important step is to be creative and open in proposing imaginative hypotheses for the solution.
• Thinking outside the box is important, both in terms of the proposed model and in terms of the data sets available and required.
Data preparation
• The data should be clean and of high quality. It is important to assemble a team with a mix of technical and business skills, who understand the domain and the data.
• It may be desirable to add new data elements from external sources that could help improve predictive accuracy.
Modeling
• A host of modeling tools and algorithms should be used.
• A tool could be tried with different options, such as running different decision tree algorithms.
Evaluation
• Evaluate the model’s predictive accuracy with more test data.
Deployment
• The dissemination and rollout of the solution is the key to project success. Otherwise the project will be a waste of time and a setback for establishing and supporting a data-based decision-making culture in the organization. The model should be embedded in the organization’s business processes.
Data mining techniques
• Decision trees: They help classify populations into classes. Decision trees are among the most popular and important data mining techniques.
• There are many popular algorithms to make decision trees.
• They differ in their mechanisms, and each works well in different situations. It is possible to try multiple algorithms on a data set and compare the predictive accuracy of each tree.
Regression:
• The goal is to find a best-fitting curve through the many data points.
• The best-fitting curve is the one that minimizes the (error) distance between the actual data points and the values predicted by the curve.
• Regression models can be projected into the future for prediction and forecasting purposes.
Artificial neural networks (ANNs):
• Originating in the field of artificial intelligence and machine learning, ANNs are multilayer nonlinear information processing models that learn from past data and predict future values. These models predict well, leading to their popularity. The model’s parameters may not be very intuitive. Thus, neural networks are opaque like a black box. These systems also require a large amount of past data to adequately train the system.
Cluster analysis
• This is an important data mining technique for dividing and conquering large data sets. The data set is divided into a certain number of clusters, by discerning similarities and dissimilarities within the data. Unlike decision trees and regression, there is no one right answer for the number of clusters in the data. The user needs to make a decision by looking at how well the chosen number of clusters fits the data. Cluster analysis is most commonly used for market segmentation.
Association rule mining
• Also called market basket analysis when used in the retail industry, these techniques look for associations between data values.
• An analysis of items frequently found together in a market basket can help cross-sell products and also create product bundles.
Data Visualization
• As data and insights grow in number, a new requirement is the ability of executives and decision makers to absorb this information in real time.
• There is a limit to human comprehension and visualization capacity. That is a good reason to prioritize and manage with fewer but key variables that relate directly to the key result areas of a role.
BI Applications
• BI tools are required in almost all industries and functions.
• The nature of the information and the speed of action may differ across businesses, but every manager today needs access to BI tools to have up-to-date metrics about business performance.
• The following are some areas of application of BI and data mining.
Customer Relationship Management
• A business exists to serve a customer. A happy customer becomes a repeat customer.
• A business should understand the needs and sentiments of the customer, sell more of its offerings to existing customers, and also expand the pool of customers it serves.
• BI applications can impact many aspects of marketing.
Examples of CRM
• Maximize the return on marketing campaigns
• Improve customer retention (churn analysis)
• Maximize customer value (cross-selling, upselling)
• Identify and delight highly valued customers
• Manage brand image
Health Care and Wellness
• BI applications can help apply the most effective diagnoses and prescriptions for various ailments.
• They can also help manage public health issues, and reduce waste and fraud.
Diagnose disease
• Diagnose disease in patients: Systems such as IBM Watson absorb all the medical research to date and make probabilistic diagnoses in the form of a decision tree, along with a full explanation for their recommendations.
• Decision trees can help doctors learn about and prescribe more effective treatments.
Thus, patients could recover their health faster, with a lower risk of complications and at lower cost.
Wellness management:
• This includes keeping track of patient health records, analyzing customer health trends, and proactively advising them to take any needed precautions.
Education
• As higher education becomes more expensive and competitive, it is a great user of data-based decision-making. There is a strong need for efficiency, increasing revenue, and improving the quality of student experience at all levels of education.
• Student enrolment (recruitment and retention): Marketing to new potential students requires schools to develop profiles of the students who are most likely to attend.
Retail
• Retail organizations grow by meeting customer needs with quality products, in a convenient, timely, and cost-effective manner. Understanding emerging customer shopping patterns can help retailers organize their products, inventory, store layout, and web presence in order to delight their customers, which in turn would help increase revenue and profits. Retailers generate a lot of transaction and logistics data that can be used to solve problems.
Text Mining Applications
• Text mining is a useful tool in the hands of chief knowledge officers to extract knowledge relevant to an organization. Text mining can be used across industry sectors and application areas, including decision support, sentiment analysis, fraud detection, survey analysis, and many more.
Marketing:
• The voice of the customer can be captured in its native and raw format and then analyzed for customer preferences and complaints.
• Social personas are a clustering technique to develop customer segments of interest.
• Consumer input from social media sources, such as reviews, blogs, and tweets, contains numerous leading indicators that can be used to anticipate and predict consumer behavior.
Business operations:
• Social network analysis and text mining can be applied to e-mails, blogs, social media, and other data to measure the emotional states and the mood of employee populations.
• Sentiment analysis can reveal early signs of employee dissatisfaction, which can then be proactively managed.
Legal:
• In legal applications, lawyers and paralegals can more easily search case histories and laws for relevant documents in a particular case to improve their chances of winning.
• Text mining is also embedded in e-discovery platforms that help in the process of sharing legally mandated documents.
• Case histories, testimonies, and client meeting notes can reveal additional information, such as comorbidities in a health care situation, that can help better predict high-cost injuries and prevent costs.
Governance and politics:
• Governments can be overturned based on a tweet from a self-immolating fruit vendor in Tunisia.
• Social network analysis and text mining of large-scale social media data can be used for measuring the emotional states and the mood of constituent populations. Microtargeting constituents with specific messages gleaned from social media analysis can be a more efficient use of resources.
• In geopolitical security, Internet chatter can be processed for real-time information and to connect the dots on any emerging threats.
• In academia, research streams could be meta-analyzed for underlying research trends.
Web mining
• Web mining is the art and science of discovering patterns and insights from the World Wide Web so as to improve it.
• The World Wide Web is at the heart of the digital revolution. More data is posted on the Web every day than was there on the whole Web just 20 years ago.
• Billions of users are using it every day for a variety of purposes.
• The Web is used for e-commerce, business communication, and many other applications.
• Web mining analyzes data from the Web and helps find insights that could optimize web content and improve the user experience.
• Data for web mining is collected via web crawlers, web logs, and other means.
Here are some characteristics of optimized websites:
• 1. Appearance: Aesthetic design; well-formatted content, easy to scan and navigate; and good color contrasts.
• 2. Content: Well-planned information architecture with useful content; fresh content; search-engine optimized; and links to other good sites.
• 3. Functionality: Accessible to all authorized users; fast loading times; usable forms; and mobile enabled.
BI tools and applications
• With superior BI tools, employees can now easily apply their business knowledge through analytical intelligence to solve many business issues, such as increasing response rates from direct mail, telephone, e-mail, and Internet-delivered marketing campaigns.
• With BI, firms can identify their most profitable customers and the underlying reasons for those customers’ loyalty, as well as identify future customers with comparable if not greater potential.
• Analyze click-stream data to improve e-commerce strategies.
• Quickly detect warranty-reported problems to minimize the impact of product design deficiencies.
• Discover money-laundering criminal activities.
• Analyze the profitability of potential growth customers and reduce risk exposure through more accurate financial credit scoring of customers.
• Determine what combinations of products and service lines customers are likely to purchase and when.
• Analyze clinical trials for experimental drugs.
• Set more profitable rates for insurance premiums.
• Reduce equipment downtime by applying predictive maintenance.
• Determine, with attrition and churn analysis, why customers leave for competitors and/or become customers.
• Detect and deter fraudulent behavior, such as from usage spikes when credit or phone cards are stolen.
• Identify promising new molecular drug compounds.
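Determining which combinations of products customers are likely to purchase is the market basket analysis described under association rule mining. The Python sketch below is one minimal way to illustrate the idea, not a production method. It counts how often pairs of items appear together and reports two standard measures: support (the share of all baskets containing both items) and confidence (the share of baskets containing the first item that also contain the second). The basket data, the function name pair_rules, and the min_support threshold are all assumptions made for illustration; none of them come from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets (illustrative data, not from the slides).
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    min_support is an assumed cut-off: pairs seen in a smaller share of
    baskets are ignored.
    """
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Strongest rules (highest confidence) first.
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


RULES = pair_rules(BASKETS)

if __name__ == "__main__":
    for a, b, support, confidence in RULES:
        print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

With this made-up data, the strongest rule is butter -> bread: every basket that contains butter also contains bread, which is the kind of pattern that could suggest a cross-sell or a product bundle. Real basket data would normally call for a dedicated algorithm such as Apriori and for rules over larger item sets.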