Complete Pentaho Installation on Ubuntu, Part 14

By now you know the tools to get data from any data source in your company, clean and transform it, add appropriate key performance indicators (KPIs), and present the results in reports, Excel files or dashboards. These reach your users either through menus in the PUC or by automatic distribution, so they can act on them or explore the data.

For users who are statistically oriented, there is an additional set of tools, Weka, that provides machine learning and data mining. In its own words: “Its broad suite of classification, regression, association rules and clustering algorithms can be used to help you understand the business better and also be exploited to improve future performance through predictive analytics.”

Here is a nice presentation on the history of Weka.

How to Install

  • Download a stable version here.
  • Unzip its content into our Pentaho folder.
  • Open a terminal and navigate to Pentaho/weka-3-6-5/
  • Type:
    java -jar weka.jar
  • Now you can click on the Explorer application button.

    WEKA Explorer

  • Note that these tools work on flat files that you prepare with your BI suite.
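
To make the flat-file point concrete, here is a minimal Python sketch that turns a few rows into ARFF, the plain-text format Weka reads alongside CSV. Everything in it is an assumption for illustration: the `rows_to_arff` helper, the `sales` relation, and the sample columns and values are hypothetical, and in practice the rows would come from your own transformations in the BI suite.

```python
def rows_to_arff(relation, attributes, rows):
    """Render rows as ARFF text (hypothetical helper, not part of Weka).

    attributes: list of (name, kind) pairs, where kind is the string
    "numeric" or a list of allowed nominal values.
    rows: list of tuples; None is written as "?", the ARFF missing value.
    Values containing spaces or commas would need quoting, which this
    sketch does not handle.
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            # Nominal attribute: list the allowed values in braces.
            lines.append("@attribute " + name + " {" + ",".join(kind) + "}")
        else:
            lines.append("@attribute " + name + " " + kind)
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


# Made-up sample data, only to show the shape of the file.
attributes = [
    ("amount", "numeric"),
    ("region", ["north", "south"]),
    ("repeat_customer", ["yes", "no"]),
]
rows = [
    (120.5, "north", "yes"),
    (80, "south", "no"),
    (None, "north", "no"),
]

arff_text = rows_to_arff("sales", attributes, rows)
print(arff_text)
```

Saving `arff_text` to a file such as `sales.arff` (name is an assumption) should give you something you can open from the Explorer to start experimenting.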

There is an executable Windows version, along with notes for Mac, on this page.

Documentation

Article on forecasting using time series here.

Project documentation page: Tutorials (command line and GUI interface), Manuals, FAQ, Docs, API, Wiki. The Pentaho forum. Pentaho Weka Flyer.

YouTube videos: 1, classifier.

Other tools

http://www.r-project.org/, http://rattle.togaware.com/, http://www.knime.org/