Complete Pentaho Installation on Ubuntu, Part 14

By now you know the tools to get data from any data source in your company, clean, transform them and add appropiate performance inditactors (KPI), present them into reports, excel files or dashboards that is used by including menues in the PUC or automatically distributing them to your users that act on them or explore the data.

For those users specially statistically oriented ones there are additional set of tools that allows machine learning and data mining. In its own words “Its broad suite of classification, regression, association rules and clustering algorithms can be used to help you understand the business better and also be exploited to improve future performance through predictive analytics.”

Here is a nice history Weka Presentation.

How to Install

  • Download a stable version here.
  • Unzip its content into our Pentaho folder.
  • Open a terminal navigate to Pentaho/weka-3-6-5/
  • Type:
    java -jar weka.jar
  • Now you can click on the explorer application button.

    WEKA Explorer

    WEKA Explorer

  • Note that these tools work on flat files that you prepare on your BI suite.

There are executable windows version and mac notes on this page.


Article on forecasting using time series here.

Project documentation page: Tutorials (command line ang GUI interface), Manuals, FAQ, Docs, API, Wiki. The Pentaho forum. Pentaho Weka Flyer.

Youtube videos 1classifier.

Other tools,,

Complete Pentaho Installation on Ubuntu, Part 13

Additional Software

There are lots of plugins that enrich the pentaho BI suite. There is a plugin page for the BI Server PUC and for the PDI). I still haven’t tried them all but here are some interesting ones anyway:

BI Open Flash Charts

You can now add the Open Flash graphs in your dashborads in addition to the Open Flash Chart and JFreeChart. This plugin was started from the Pentaho framework so building with it should be familiar. But the stunning graphs come with a v3 which is not OS.

Here are links to the FusionCharts Blog, the Open source version, and the Pentaho plugin.

You can download the 0.02 version and its samples.

To install

  1. Extract the zip file into the /Pentaho/biserver-ce/pentaho-solutions/system.
  2. Extract the samples zip file into the /Pentaho/biserver-ce/pentaho-solutions/bi-developers.
  3. Change the file system/pentaho.xml to include xfusion on the acl-files list:
  4. Open the Pentaho User Console (PUC) and refresh the solution repository.

Here you can see installation, demo and usage in a Youtube video.

Ruby Step for PDI

Slawomir Chodnicki released on march 2011 the Ruby plugin step for Kettle 4 here.

There are examples included in the file you should download in the github page. Click on downloads and the select You already know thay it should be unziped on the plugins folder.

In a forum post some caracteristics are mentioned.

Excel Writer Output PDI Plugin

It is no longer a plugin as it is included in the 4.2 Kettle release, but post about its usage are still labeled as that 🙂

It let’s you set more options on formating and range.

Release notes and usability post.

BI iPhone plugin

Edit: The infomation below is no longer acurate:

  • Since 3.8 the BI server includes the iPad code, as stated by richad3 on the BI forum. But for 4.0 improvements were made on a week of fun.
    It seems the Enterprise plugin works great with the iPad, check the video. On the CE Edition the PAT/jPivot should work too, I’ll let you know what I find.
  • An alternate option for mobile devices is made on this OS project: PentaGoMo
  • The new site redesign has made the original code (the one that needed fixing) unavailable redirecting everything here.
    That’s bad if you still want to play with your Blackberry, Android or iPhone devices. I’ll let you know what I find.

The original information will remain here until new links and information is found:

The original article about a plugin for the iPhone is as old as this device, it was made available by the now vicepresident of engineering in Pentaho.

The BI PUC detects the browser and present a special menu. When you select an action (for report, dashboard, etc) a special program makes the parameter selection easier and presents one by one. Then the report is shown, you can see this video.

Unfortunatelly some corrections have to be made to the code to work with the new Pentaho BI version. Here are the download and correction instructions from Will Gorman on 2008 and additional ones on 2011 by Herwin Rayen. And a tech-tip so you can modify it further in the BI version 3.0.

There is also an Android app in the marketplace, here is the forum post that mentions it and the link to the 3.5 version.

In the Pentaho Blog an anouncement was made on summer 2011 about a iPad prototype. If you can’t wait “expand the plug in to cover iPad, just modify the Java source code, recompiled its class, then updated the JAR accordingly. This method can be used to expand the plug in to cover Windows Mobile, BlackBerry, and Android” – Paul Pambudi.

If you dont have all these devices you can check the emulators in Firefox’: User Agent Switcher add-on or web-based iPhone browser emulator.

BI BIRT Report ‘Plugin’

You can use the BIRT report view engine in the BI Pentaho User Console (PUC) which is newer and different than the plugin that is used in pentaho.

You have to download an eclipse runtime and extract a directory to your pentaho tomcat/webapss/pentaho folder and also download the samples to your pentaho-sulutions folder. Check the description, instructions and samples here.

Or you can use the plugin for PDI/Kettle to just run and burst your BIRT specifications. Check this link.

Data Cleaning [future] Plugin for PDI

Data validation needs coding in the actual PDI, but a nice open source utility for doing validation and correction exist and it seems it will become part of the PDI soon.

Data Cleaner can analyze, profile, transform and clean data on its own. But Matt Casters is working on a plugin so Kettle can use it. Here is the link that briefly shows its functionality and mentions the plugin here.

GeoSpatial Analisys on Kettle

On July 2011 version 2.0 of GeoKettle was announced by here. It is a step add in for Kettle 4.0 that allows “spatial analysis functions such as buffer calculations, overlays, metric operators, etc” from and to different file formats. It even reads sensors. Sounds like fun.

Weka Plugin

You can use the Knowledge Flow Plugin, that lets you use a weka predictive model as a step in a PDI transformation. Install an usage here.