Complete Pentaho Installation on Ubuntu, Part 1

[Note: There is an update with install instructions for version 5.x.]

Objective

Install a complete Pentaho working environment on an Ubuntu system, using MySQL as the database.

Pentaho is the result of merging several open source projects and some additional tools into a reliable and productive information system, also known as a Business Intelligence suite.

Audience

These notes are about setting up a system from scratch on Linux. They don’t take you by the hand; they assume basic computer expertise (if you know your way around Windows, that’s enough).

Maybe you’re a power user and have outgrown your Excel system, and are looking for something more robust and automated. Here you can browse and see how much work is needed to install Pentaho and make your work/life a little easier.

Maybe you use these tools on MS Windows and want quick notes for making them work on Ubuntu Linux. Or you have already set up some Pentaho tools but want to add functionality.

And especially if you’re in a company that cannot afford a BI suite or hasn’t decided on one, you can use these notes to start a project and begin discovering relevant information about your company and clients.

Scope

I started using the ETL tool to extract Oracle application data from the main system and merge it with data from other systems, even Excel reports. Some time later I added filtering and classification in ETL and developed some reports. Then the CFO asked for aggregated data to look for business trends (multidimensional analysis), and that led to a web server with automated jobs and reports.

Several articles/posts will be needed to show you how to set it all up. If you’re a small group of people, I recommend following this order, one step at a time. That gives you time to get used to each tool (needed for open source projects) and use its functionality to get results; then you can build on it. On the other hand, if you can establish a company project, you can use agile methods and frequent deployments, start parallel tasks, and install everything:

  1. Install the Pentaho BI Server, version 4.5 (since 3.8) [this document]
  2. Modify databases to MySQL. Add your Database too.
  3. Set Up New Add-Ins Like Saiku
  4. Add your Users
  5. Prepare your ‘Production’ Server
  6. Install PDI/Kettle and Agile PDI in a Development Environment
  7. Install Pentaho Design Studio
  8. Install Mondrian Schema Editor
  9. Install Pentaho Metadata Editor
  10. Install Report Designer
  11. Install the Dashboard Editor
  12. Install Pentaho/Mondrian Aggregation Designer
  13. Additional tools
  14. Install Weka

Suggestions are welcome.

What you’ll need

Standard equipment and software:

  • Set up a standard PC.
    Any PC on special deals these days will do. I’m using a laptop with 3 GB RAM, a 230 GB hard disk, and two cores, and it’s comfortable.
    Remember that you can upgrade this installation for high availability and reliability just by adding more and better servers and using open source software.
  • Set up the Java Development Kit (JDK). You can use this easy Synaptic guide or this one.
  • Install MySQL. Check this guide, which configures the root password and a tool to browse your database. Now that you have installed the Query Browser, use the same method to install MySQL Administrator.

Install the Pentaho BI Server, version 4.5 (3.8, 3.9~4.0)

Get the Software

Create a Pentaho folder under your home directory, that’s where everything will end up.

On the Pentaho page at sourceforge.net, click on ‘Business Intelligence Server’, then click on the latest stable version to download it:

May 7th, 2012: biserver-ce-4.5.0-stable.tar.gz [343MB].
Note: 7-Zip reports ‘data after end of file’, but I have found no errors yet, not even the ones I mentioned for the earlier releases.

If you need an older version:

  • October 2011: biserver-ce-3.10-stable [217MB]. It’s the 4.1 version, really!
    This one fixes the save query button bug but breaks navigation in jpivot. After installing it you need to follow dduenas’s guide on the Pentaho forum:
    1. Stop the server.
    2. In /biserver-ce/tomcat/webapps/pentaho/WEB-INF/lib/ rename
    jpivot-1.8.0-100420.jar to something like jpivot-1.8.0-100420.bak
    3. Download jpivot-1.8.0-100420.rar [link] and copy it into that folder.
    4. Restart the server.
  • September 2011: biserver-ce-3.9-stable (the release candidate came out in July). It corresponds to the community edition of the Pentaho BI Server 4.0 Enterprise, released as explained here. (The CE is nice; it just lacks the ‘agile’ functionality in the BI and browser.)
  • April 2011: biserver-ce-3.8.0-stable.tar.gz

Extract ‘biserver-ce’ and ‘administration-console’ to your Pentaho folder (you can double-click on the file to open it).
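
If you prefer the terminal to double-clicking, the extraction can be sketched as below. This is only a sketch: extract_biserver is a made-up helper name, and the download location in the usage line is an assumption, so adjust it to wherever your browser saved the file.

```shell
#!/bin/sh
# Sketch: unpack a BI Server archive into a target folder.
# extract_biserver is a hypothetical helper name, not part of Pentaho.
extract_biserver() {
    archive="$1"                    # e.g. biserver-ce-4.5.0-stable.tar.gz
    target="${2:-$HOME/Pentaho}"    # destination folder, defaults to ~/Pentaho
    mkdir -p "$target" || return 1  # create the folder if it does not exist
    # -x extract, -z gunzip, -f archive file, -C switch to the target folder first
    tar -xzf "$archive" -C "$target"
}
```

Usage, assuming the archive sits in ~/Downloads: extract_biserver ~/Downloads/biserver-ce-4.5.0-stable.tar.gz. Afterwards the ‘biserver-ce’ and ‘administration-console’ folders should appear under ~/Pentaho.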

Set it Up

You’ll need to start a terminal window (Applications->Accessories->Terminal) and, from your home directory, make sure your shell files are executable (This explains the command in detail).

$ chmod +x Pentaho/biserver-ce/*.sh Pentaho/administration-console/*.sh

Now we’ll add the JAVA_HOME variable to the start-pentaho.sh and stop-pentaho.sh files.

$ cd Pentaho/biserver-ce
$ gedit ./start-pentaho.sh

Add the line: export JAVA_HOME="/usr/lib/jvm/java-6-openjdk"

(or export JAVA_HOME="/usr/lib/jvm/java-6-sun" if you used the second guide) after line 13:

cd -

like this image:

Edit JAVA_HOME in start file

Save the file and exit the editor, then add the same line to stop-pentaho.sh.
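
A wrong JAVA_HOME path is an easy mistake at this step. As a minimal sketch, the check below prints the export line only for a directory that really contains a Java runtime. valid_java_home is a made-up helper name, and the two paths are simply the ones mentioned above; your system may use a different one.

```shell
#!/bin/sh
# Sketch: confirm a JAVA_HOME candidate holds a Java runtime
# before writing it into start-pentaho.sh and stop-pentaho.sh.
# valid_java_home is a hypothetical helper name.
valid_java_home() {
    # A usable JAVA_HOME has an executable bin/java inside it
    [ -x "$1/bin/java" ]
}

# Try the two locations named in this post and print a ready-to-paste
# export line for each one that exists on this machine.
for dir in /usr/lib/jvm/java-6-openjdk /usr/lib/jvm/java-6-sun; do
    if valid_java_home "$dir"; then
        echo "export JAVA_HOME=\"$dir\""
    fi
done
```

If nothing is printed, list /usr/lib/jvm to see which Java folders are actually installed.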

Now you can start Pentaho BI Server in the terminal:

[ in the home/Pentaho/biserver-ce/ folder ]
$ ./start-pentaho.sh

Start your browser and go to http://localhost:8080/pentaho. Use joe / password to log in and check the examples on the left navigation bar. They are really interesting.
On 3.9 through 4.5 there is no select box like in previous versions, but you can click on the ‘evaluation login’ link to see the users and passwords to use.

Pentaho BI Server Console

When you’re done, close the browser and shut down the server in the terminal.

[ in the home/Pentaho/biserver-ce/ folder ]
$ ./stop-pentaho.sh

P.S. If something goes wrong, please check the log files (or post the problem on the forums). The logs are in:

  • ~/Pentaho/biserver-ce/tomcat/bin/pentaho.log
  • ~/Pentaho/biserver-ce/tomcat/logs/
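
These logs get long, so a quick filter helps before posting on a forum. The sketch below is only an illustration: recent_log_errors is a made-up helper name, and the search words (error, exception, severe) are my guess at what is worth looking for, not an official list.

```shell
#!/bin/sh
# Sketch: show the most recent suspicious lines from a log file.
# recent_log_errors is a hypothetical helper name.
recent_log_errors() {
    logfile="$1"
    count="${2:-20}"   # how many matching lines to show, default 20
    # -n prefix each match with its line number, -i ignore case,
    # -E extended regex; tail keeps only the last $count matches
    grep -n -i -E 'error|exception|severe' "$logfile" | tail -n "$count"
}
```

Usage: recent_log_errors ~/Pentaho/biserver-ce/tomcat/bin/pentaho.log 10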

Edits

  • June 2011:
    I found a well-edited video on installing the BI Server on a remote Ubuntu machine here.
  • Important: September 2011. About Jpivot & PAT:
    The 3.9 stable release announced the official replacement of jpivot by PAT. This tool works great for interactive report and OLAP query building on the enterprise version.
    On the community edition a message is displayed: “no longer be enhanced or officially supported”
    Don’t worry, this has been known for a while, and jpivot works as always. Check this post (No.8) on how to remove the warnings.
  • October 2011, Added some resources:
    – A brief description of the software projects in the BI suite.
    – Introduction to the Pentaho BI suite [pdf]

Web Page Testing with Firefox 4.0 (and HTML 5 Dojo Toolkit Data Tags)

We should verify that our new dojo-tagged form example (Demo_Form_v16.html) still works in our automated environment. But Firefox 4.0 no longer has the jssh plugin, so firewatir is no longer an option. The watir installation page suggests using watir-webdriver, which has a runtime dependency on selenium (I’ll have to check this version 2.0 option).

So we’ll uninstall firewatir and then install watir-webdriver (see detailed install instructions) using the command shell:

$ sudo gem uninstall firewatir
$ sudo apt-get install ruby1.8-dev
$ sudo gem1.8 install watir-webdriver

Now the ruby definition code needs two modifications, both shown in the following code: 1) the require and 2) the instantiation of the browser (ff):

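# includes
require 'rubygems' # optional
require 'watir-webdriver' # change 1: watir-webdriver instead of firewatir
require 'test/unit'

class TC_recorded < Test::Unit::TestCase
  def test_webpage
    ff = Watir::Browser.new :firefox # change 2: new browser instantiation
    # ... the rest of the recorded test continues here ...
  end
end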

Our code won’t run yet; check this for other differences. In our case, the file only needs two more method changes:

  1. ‘.clearSelection’ is now ‘.clear’ in multiselect
  2. ‘.contains_text(,)’ in assertion now drops the error text and becomes .text.include?()

Check the final code ‘demo_form.rb‘ that can be run with:

$ ruby demo_form.rb
Form web page test result

watir-webdriver seems faster than firewatir, and you can test against Chrome as well. I like the new tool. Documentation: class list.

Note: Thanks to Al Hoang for a post fixing the ‘mkmf error’ install problem with Ruby on Ubuntu, XPlayer for its test article, and Alister Scott for maintaining the watir site.


[Edit May 12th, 2011]

On my Ubuntu 10.04, I just uninstalled Ruby 1.8.7 using Synaptic, installed Ruby 1.9.1 and rubygems, reinstalled rails, and clicked on apply. Result: the watir test works! So go ahead, upgrade!

DojoToolkit 1.6 and HTML 5

This is our first step toward making our form example an HTML 5 compliant web page: we’ll need to make small modifications to our dojo toolkit widgets’ tags. These will not be necessary until 2.0, when the current notation will be deprecated, but we can start our migration now.

First, download the full 1.6 version (3.7MB) of dojo and replace our old 1.5 files in /var/www/dojo (in Ubuntu). Then, according to the data-attribute documentation (you may want to check the 1.6 release too), we need to:

  1. Change the doc type (<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">) to:
    <!DOCTYPE html>
  2. Change dojo config (src="/dojo/dojo/dojo.js" djconfig="isDebug: true, parseOnLoad: true") to:
    src="/dojo/dojo/dojo.js" data-dojo-config="parseOnLoad: true, isDebug: true"
  3. Do a global replace from ‘dojoType=’ to:
    ‘data-dojo-type=’
  4. For each widget it is necessary to check the attributes and move them inside the data-dojo-props attribute.
    In our form, the date value was set to "value='2010-08-31'". The name must also be repeated, as I found out thanks to this conversation between Jérôme Despatis and Peter Higgins, so it is now:
    data-dojo-props="name:'dtText', value:'2010-08-31'"
    In the Checkbox and Radio Button the checked state and value are also included.
  5. The events also need modification, as they are evaluated, so in the button the onclick="getValues()" is now:
    data-dojo-props="onClick:function(){getValues();}"  Notice that the event is now in camelCase form.
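
The global replace in step 3 can be done from the terminal instead of an editor. The following is a minimal sketch, not the method I actually used: migrate_dojo_type is a made-up helper name, and it only handles the attribute rename. Moving attributes into data-dojo-props (steps 4 and 5) still has to be done by hand.

```shell
#!/bin/sh
# Sketch: rename every dojoType= attribute to data-dojo-type= in one file.
# migrate_dojo_type is a hypothetical helper name.
migrate_dojo_type() {
    file="$1"
    # Keep the original next to the edited file as <name>.bak
    cp "$file" "$file.bak" || return 1
    # s/old/new/g replaces every occurrence on every line
    sed 's/dojoType=/data-dojo-type=/g' "$file.bak" > "$file"
}
```

Usage: migrate_dojo_type Demo_Form_v16.html. Running it a second time changes nothing more in the page, because no ‘dojoType=’ is left to replace, but it does overwrite the .bak copy.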

This is the final code: Demo_Form_v16.html. [this is a pdf file which contains the code]

And this is how it runs in Firefox 4.0, Opera 11.10 and Chromium 10.0.648:

The actual code running in three browsers

Firefox, Opera, Chrome