Open Source Business Intelligence Tools

This document provides an overview of open source business intelligence (BI) tools. It discusses key BI concepts like data warehousing, data integration, reporting, visualization, and predictive analytics. It then reviews several popular open source tools for each category, including Talend and Pentaho for ETL, BIRT and Pentaho for reporting, Pentaho CDE/CDF for dashboards, and R, Weka, and RapidMiner for statistics and predictive analytics. The document aims to help readers understand the different types of BI questions that can be answered and the phases of a typical BI implementation.

Uploaded by: ansana
Copyright: © Attribution Non-Commercial (BY-NC)
Open Source Business Intelligence Tools

Alex Meadows TriLUG, January 2012

Agenda

Business Intelligence Overview
Review of OSBI Tools
    Data Warehousing
    Data Integration
    Reporting/OLAP
    Visualization
    Statistical Analysis/Predictive Analytics

What Is Business Intelligence?

Utilizing technology to identify and analyze trends in data to make better business decisions.

Overlapping Fields

Source: Back In Business, Klimberg, Miori (www.informs.org)

Competing On Analytics

Source: Competing on Analytics; Thomas Davenport, Jeanne Harris

Phases of Growth

The Three Types of Questions

What happened?
    How was performance last week?
What is currently happening?
    How is performance right now?
What will happen?
    What can I do to reach our goals?

Data Warehousing

Store data outside of the application/normal business environment (e.g. ERP systems)
Specific for reporting/analytics
Modeling Styles
    3NF (normal database modeling)
    Data Marts (aka star schemas)
    Data Vault (hybrid 3NF/Data Mart)
    Anchor Modeling (6NF)
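As a rough illustration of the data mart (star schema) style listed above, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database and runs a typical reporting query against them. The table names, column names, and sample rows are hypothetical and are not taken from the slides.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny star schema: one fact table keyed to two dimensions.

    All table and column names here are illustrative assumptions.
    """
    conn.executescript(
        """
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, week TEXT);
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            amount REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?)",
        [(1, "2012-W01"), (2, "2012-W02")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(1, "widget"), (2, "gadget")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.5)],
    )


def sales_by_week(conn):
    """Join the fact table to the date dimension and total sales per week."""
    return conn.execute(
        """
        SELECT d.week, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON f.date_key = d.date_key
        GROUP BY d.week
        ORDER BY d.week
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for week, total in sales_by_week(connection):
        print(week, total)
```

The fact table holds only keys and measures, so most reporting queries reduce to a join against one or two dimensions followed by an aggregate.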

Data Warehousing

Databases
    MySQL, Postgres, etc.
Columnar Data Stores
    Infobright*, LucidDB, InfiniDB*, etc.
Hybrid Data Warehouse Databases
    Greenplum* (both RDBMS and Columnar)
NoSQL
    Hadoop, CouchDB, MongoDB, etc.

*Hardware and/or software limitations in community editions

RDBMS vs Columnar

Source: http://www.calpont.com/column-oriented-database-bi
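To make the row-versus-column distinction concrete, here is a toy sketch that holds the same records in a row-oriented layout (a list of records) and a column-oriented layout (one list per field). It only illustrates the general idea and does not reflect how any of the databases named in this deck store data; the field names and values are made up.

```python
# Row-oriented layout: each record is stored together (hypothetical data).
ROWS = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": 5.0},
    {"id": 3, "region": "east", "amount": 7.5},
]


def to_columns(rows):
    """Pivot a list of records into one list of values per field."""
    columns = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def total_row_store(rows, field):
    """Aggregate by visiting every record, even though one field is needed."""
    return sum(row[field] for row in rows)


def total_column_store(columns, field):
    """Aggregate by reading only the single column that is needed."""
    return sum(columns[field])


if __name__ == "__main__":
    cols = to_columns(ROWS)
    print(total_row_store(ROWS, "amount"), total_column_store(cols, "amount"))
```

Both functions return the same total. The difference is in how much data has to be touched: an analytic query that aggregates one field over many records only needs one column in the columnar layout.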

NoSQL?

Not Only SQL
Unstructured/semi-structured data
Huge (multi-terabyte to petabyte+) data sets

Source: http://www.information-management.com/specialreports/20040622/1005301-1.html

Data Integration

Syncing data across systems
Includes:
    ETL (Extract, Transform, Load)
    MDM (Master Data Management)
    EAI (Enterprise Application Integration)
    EII (Enterprise Information Integration)
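The ETL pattern in the list above can be sketched in a few lines of standard-library Python. This is a minimal illustration under assumed inputs, not how Talend or Kettle work internally: the CSV layout, the cleaning rules, and the `sales` table are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with untidy values.
SOURCE_CSV = "customer,amount\n alice ,10.50\nBOB,4\n,3.00\n"


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: normalize names, convert amounts, drop rows with no name."""
    cleaned = []
    for record in records:
        name = record["customer"].strip().lower()
        if not name:
            continue  # assumed rule: a row without a customer is rejected
        cleaned.append((name, float(record["amount"])))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def run_etl(conn, text):
    """Run the three steps in order and return the number of rows loaded."""
    rows = transform(extract(text))
    load(conn, rows)
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_etl(connection, SOURCE_CSV), "rows loaded")
```

Keeping the three steps as separate functions mirrors how ETL tools let each stage be tested and swapped independently.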

Talend

Data Management Tool Suite
    ETL
    MDM
    Data Profiling
    Data Quality
Code generator
Eclipse based
Extensible plugin architecture

Pentaho K.E.T.T.L.E.

Kettle Extraction, Transport, Transformation, and Loading Environment
Focus on ETL
Extensible plugin architecture
Engine based

Reporting

Focus: Historical Analysis

Reporting Options

Tools: BIRT, Pentaho, JasperReports, SQL Power Wabit, Saiku
Features compared: MDX, Pivot Table, Charting, SQL, Other Sources*, Drill Through, Parameterized

*Flat Files, NoSQL, etc.
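For readers unfamiliar with the pivot table feature named above, the following sketch shows the underlying operation: grouping records by one field for rows, another for columns, and summing a measure in each cell. It is a plain-Python illustration with invented sample data, not the API of any reporting tool in this deck.

```python
from collections import defaultdict


def pivot(records, row_key, col_key, value_key):
    """Sum value_key into a nested dict indexed by row_key, then col_key."""
    table = defaultdict(lambda: defaultdict(float))
    for record in records:
        table[record[row_key]][record[col_key]] += record[value_key]
    # Convert to plain dicts so the result is easy to print and compare.
    return {row: dict(cols) for row, cols in table.items()}


if __name__ == "__main__":
    # Hypothetical sales records.
    sales = [
        {"region": "east", "quarter": "Q1", "amount": 10.0},
        {"region": "east", "quarter": "Q2", "amount": 5.0},
        {"region": "west", "quarter": "Q1", "amount": 7.5},
        {"region": "east", "quarter": "Q1", "amount": 2.5},
    ]
    for region, cells in pivot(sales, "region", "quarter", "amount").items():
        print(region, cells)
```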

BIRT Example

Visualization

Focus: Trending and Present

Pentaho CDE/CDF

Dashboard framework and editor built into Pentaho BI Server
Community developed
Uses open web languages (JavaScript, HTML, etc.)

Statistics/Predictive Analytics

Focus: All relevant data used to predict outcomes

Statistics/Predictive Analytics

R: stats oriented
Weka: machine learning oriented
RapidMiner: mixed
    Originally YALE
    Weka and R plugins
    Like SAS Enterprise Miner
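As a very small example of the "what will happen?" style of analysis these tools support, the sketch below fits a straight line to past observations with ordinary least squares and extrapolates one step ahead. It is written in plain Python for illustration only and does not use R, Weka, or RapidMiner; the weekly figures are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    weeks = [1, 2, 3, 4]   # hypothetical week numbers
    sales = [3, 5, 7, 9]   # hypothetical weekly sales
    model = fit_line(weeks, sales)
    print("forecast for week 5:", predict(model, 5))
```

A real predictive model would also need validation against held-out data, which this sketch leaves out.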

BI From Reporting to Statistical Analysis


Suites: Jaspersoft*, Pentaho, SpagoBI*
Capabilities: ETL, Metadata, Reporting, Dashboards, OLAP***, Statistics**, Automated Decisions

*Utilizes Talend ETL
**Utilizes Weka Data Mining
***All use Mondrian for OLAP, with different front ends

Shameless Plug

RTP Pentaho User Group


On LinkedIn (soon to be also on Meetup)
Meets quarterly
