Complete Pentaho Installation on Ubuntu, Part 1

[Note: There is an update on version 5.x install instructions]

Objective

Install a complete Pentaho working environment on an Ubuntu system, using MySQL as the database.

Pentaho is the result of merging several open source projects and some additional tools into a reliable and productive information system, also known as a Business Intelligence suite.

Audience

These notes are about setting up a system from scratch on Linux. They don’t take you by the hand; they assume basic computer expertise (if you know your way around Windows, that’s enough).

Maybe you’re a power user and have outgrown your Excel system, and are looking for something more robust and automated. Here you can browse and see how much work is needed to install Pentaho and make your work/life a little easier.

Maybe you use these tools on MS Windows and want quick notes for making them work on Ubuntu Linux. Or you have already set up some Pentaho tools but want to add functionality.

And especially if you’re in a company that cannot afford a BI suite or hasn’t decided on one, you can use these notes to start a project and begin discovering relevant information about your company and clients.

Scope

I started using the ETL tool to extract Oracle application data from the main system and merge it with data from other systems, even Excel reports. Some time later I added filtering and classification in ETL and developed some reports. Then the CFO asked for aggregated data to look for business trends (multidimensional analysis), and that led to a web server with automated jobs and reports.

Several articles/posts will be needed to show you how to set it all up. If you’re a small group of people, I recommend following this order, one step at a time. That will give you time to get used to each tool (needed for open source projects) and use its functionality to get results; then you can build on it. On the other hand, if you can establish a company project, you can use agile and frequent deployments, initiate parallel tasks, and install everything:

  1. Install the Pentaho BI Server, version 4.5 (since 3.8) [this document]
  2. Modify databases to MySQL. Add your Database too.
  3. Set Up New Add-Ins Like Saiku
  4. Add your Users
  5. Prepare your ‘Production’ Server
  6. Install PDI/Kettle and Agile PDI in a Development Environment
  7. Install Pentaho Design Studio
  8. Install Mondrian Schema Editor
  9. Install Pentaho Metadata Editor
  10. Install Report Designer
  11. Install the Dashboard Editor
  12. Install Pentaho/Mondrian Aggregation Designer
  13. Additional tools
  14. Install Weka

Suggestions are welcome.

What you’ll need

Standard equipment and software:

  • Set up a standard PC.
    Any PC on special deals these days will do. I’m using a laptop with 3GB RAM, a 230GB hard disk and two cores, and it’s comfortable.
    Remember that you can upgrade this installation to high availability and reliability just by adding more and better servers and using open source software.
  • Set up Java development (SDK). You can use this easy Synaptic guide or this one.
  • Install MySQL. Check this guide, which configures the root password and a tool to browse your database. Now that you have installed the query browser, use the same method to install SQL Administrator.

Install the Pentaho BI Server, version 4.5 (3.8, 3.9~4.0)

Get the Software

Create a Pentaho folder under your home directory, that’s where everything will end up.

On Pentaho’s sourceforge.net page, click on ‘Business Intelligence Server’, then click on the latest stable version to download it:

May 7th, 2012: biserver-ce-4.5.0-stable.tar.gz [343MB].
Note: 7-zip reports ‘data after end of file’, but I have found no errors yet, not even the ones I mentioned for the earlier releases.

If you need an older version:

  • October 2011: biserver-ce-3.10-stable [217MB]. It’s really the 4.1 version!
    This one fixes the save query button bug, but it breaks navigation in jpivot. After installing it you need to follow dduenas’ guide on the Pentaho forum:
    1. Stop the server.
    2. In /biserver-ce/tomcat/webapps/pentaho/WEB-INF/lib/ rename
    jpivot-1.8.0-100420.jar to something like jpivot-1.8.0-100420.bak
    3. Download jpivot-1.8.0-100420.rar [link] and copy it into that folder.
    4. Restart the server.
  • September 2011: biserver-ce-3.9-stable (the release candidate came out in July). It corresponds to the community edition of the Pentaho BI Server 4.0 Enterprise, released as explained here (the CE is nice, it just lacks the ‘agile’ functionality in the BI and browser).
  • April 2011: biserver-ce-3.8.0-stable.tar.gz

Extract ‘biserver-ce’ and ‘administration-console’ to your Pentaho folder (you can double-click on the file).
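If you prefer the terminal to double-clicking, the extraction can be sketched as a small shell function. This is only an illustration: the function name extract_biserver is made up for this example, and it assumes the downloaded file is a regular gzip-compressed tarball.

```shell
# Hypothetical helper (not part of Pentaho): unpack a BI Server tarball
# into a target folder, ~/Pentaho by default.
extract_biserver() {
    archive="$1"
    dest="${2:-$HOME/Pentaho}"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # Create the Pentaho folder if it does not exist yet.
    mkdir -p "$dest"
    # -x extract, -z gunzip, -f archive file, -C switch to the target folder first
    tar -xzf "$archive" -C "$dest"
}
```

For example, if your browser saved the file in ~/Downloads (an assumption, adjust to your own path), you would run: extract_biserver ~/Downloads/biserver-ce-4.5.0-stable.tar.gz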

Set it Up

You’ll need to start a terminal window (Applications->Accessories->Terminal) and navigate to your home folder to make sure your shell files are executable (this explains the command in detail).

$ chmod +x Pentaho/biserver-ce/*.sh Pentaho/biserver-ce/tomcat/bin/*.sh
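Since the shell scripts are spread over several subfolders (biserver-ce, its tomcat/bin folder, administration-console), a recursive variant may be handy. The following is a minimal sketch; the function name make_scripts_executable is an invention for this example, not something that ships with Pentaho.

```shell
# Hypothetical helper: mark every *.sh file below a folder as executable.
make_scripts_executable() {
    root="${1:-$HOME/Pentaho}"
    if [ ! -d "$root" ]; then
        echo "folder not found: $root" >&2
        return 1
    fi
    # Only regular files ending in .sh are touched; other files keep their mode.
    find "$root" -type f -name '*.sh' -exec chmod +x {} +
}
```

Calling make_scripts_executable without arguments works on ~/Pentaho. No sudo should be needed as long as the files live in your own home directory.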

Now we’ll add the JAVA_HOME variable to the start-pentaho.sh and stop-pentaho.sh files.

$ cd Pentaho/biserver-ce
$ gedit ./start-pentaho.sh

Add the line: export JAVA_HOME="/usr/lib/jvm/java-6-openjdk"

(or export JAVA_HOME=/usr/lib/jvm/java-6-sun if you used the second guide) after line 13:

cd -

like this image:

Edit JAVA_HOME in start file

Save the file and exit the editor
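The same edit can also be scripted instead of done in gedit. The sketch below is an assumption-laden illustration: add_java_home is a hypothetical helper, and it relies on the script containing a line that starts with "cd -", as described above. It keeps a .bak copy of the original file.

```shell
# Hypothetical helper: insert an export JAVA_HOME line after the first
# "cd -" line of a start/stop script, unless one is already present.
add_java_home() {
    script="$1"
    java_home="$2"
    # Do nothing if the script already exports JAVA_HOME.
    if grep -q '^export JAVA_HOME=' "$script"; then
        return 0
    fi
    # Keep a backup, then rewrite the script from it.
    cp "$script" "$script.bak"
    awk -v line="export JAVA_HOME=\"$java_home\"" '
        { print }
        /^cd -/ && !done { print line; done = 1 }
    ' "$script.bak" > "$script"
    # Warn when no "cd -" line was found and nothing was inserted.
    grep -q '^export JAVA_HOME=' "$script" ||
        echo "no 'cd -' line found, edit $script by hand" >&2
}
```

From the biserver-ce folder you could then run add_java_home ./start-pentaho.sh /usr/lib/jvm/java-6-openjdk and repeat it for ./stop-pentaho.sh. Note that the quotes written into the file are plain straight quotes; curly quotes copied from a web page would end up as part of the path.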

Now you can start Pentaho BI Server in the terminal:

[ in the home/Pentaho/biserver-ce/ folder ]
$ ./start-pentaho.sh

Start your browser and go to http://localhost:8080/pentaho. Use joe / password to log in and check the examples on the left navigation bar. They are really interesting.
On 3.9 (through 4.5) there is no select box like in previous versions, but you can click on the ‘evaluation login’ link to see the users and passwords to use.

Pentaho BI Server Console

When you’re done close the browser, and in the terminal, shut down the server.

[ in the home/Pentaho/biserver-ce/ folder ]
$ ./stop-pentaho.sh

P.S. If something goes wrong, please check the log files (or post the problem on the forums). They are in:

  • /Pentaho/biserver-ce/tomcat/bin/pentaho.log
  • /Pentaho/biserver-ce/tomcat/logs/
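A quick way to scan one of those logs is sketched below. It is only an illustration: check_pentaho_log is a hypothetical helper, and the "Server startup in" marker is the line I see in catalina.out after a successful start (see the replies in the comments); other versions may word it differently.

```shell
# Hypothetical helper: list problem lines in a Tomcat/Pentaho log and
# report whether the startup marker is present.
check_pentaho_log() {
    log="$1"
    if [ ! -f "$log" ]; then
        echo "log not found: $log" >&2
        return 2
    fi
    # Show ERROR and SEVERE lines with their line numbers (none is fine).
    grep -n -E 'ERROR|SEVERE' "$log" || true
    # Exit status 0 only if Tomcat reported that startup finished.
    grep -q 'Server startup in' "$log"
}
```

For example: check_pentaho_log ~/Pentaho/biserver-ce/tomcat/logs/catalina.out && echo "server finished starting". The lines it prints are the ones worth posting on the forums.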

Edits

  • June 2011:
    I found a well-edited video on installing BI Server on a remote Ubuntu machine here.
  • Important: September 2011. About Jpivot & PAT:
    In the 3.9 stable release, the official replacement of jpivot by PAT was announced. This tool works great for interactive report and OLAP query building in the enterprise version.
    In the community edition a message is displayed: “no longer be enhanced or officially supported”
    Don’t worry, this has been known for a while, and jpivot works as always. Check this post (No. 8) on how to remove the warnings.
  • October 2011, Added some resources:
    – A brief description of the software projects in the BI suite.
    – Introduction to the Pentaho BI suite [pdf]

42 thoughts on “Complete Pentaho Installation on Ubuntu, Part 1”

  1. Thank you for your article, but for me Pentaho doesn’t start when I run this command:
    ./start-pentaho.sh
    I get this:

    /home/mahdi/Pentaho/biserver-ce
    /home/mahdi/Pentaho/biserver-ce
    DEBUG: Using JAVA_HOME
    DEBUG: _PENTAHO_JAVA_HOME=”/usr/lib/jvm/java-6-openjdk”
    DEBUG: _PENTAHO_JAVA=”/usr/lib/jvm/java-6-openjdk”/bin/java
    Using CATALINA_BASE: /home/mahdi/Pentaho/biserver-ce/tomcat
    Using CATALINA_HOME: /home/mahdi/Pentaho/biserver-ce/tomcat
    Using CATALINA_TMPDIR: /home/mahdi/Pentaho/biserver-ce/tomcat/temp
    Using JRE_HOME: ”/usr/lib/jvm/java-6-openjdk”
    Using CLASSPATH: /home/mahdi/Pentaho/biserver-ce/tomcat/bin/bootstrap.jar

    and nothing after that. Then when I enter the URL to launch Pentaho I get nothing.

    • Mahdi,
      The strings you posted are the same ones I have on my installation. I think Apache Tomcat is starting but something goes wrong.
      Check the files in /home/mahdi/Pentaho/biserver-ce/tomcat/logs/
      (it may help to delete all of them and try to start the server; the new files will be smaller)
      catalina.out will contain the date, module and text message on each line.
      There will be detail messages, warnings, etc., but look at the lines with ERROR, and post them.
      In a successful startup you should see in the log:
      Pentaho BI server ready.
      INFO: Server startup in 34843 ms
      [Edit: June 25th, 2011]
      On the other hand, those are the messages you get if you’re starting a second instance of Tomcat. Why don’t you try http://localhost:8080/ before starting the BI server? If that succeeds, try changing the Pentaho port as explained in post 5 of this series.

  2. Hi. I would like to give you my contribution for this work. It helped me. Maybe it will help you too.
    How can I send it?

  3. Hi, thanks for creating this. I found this after installing BI Server 3.8 on Debian 6. My next step was to either use Mysql or Postgres for the repository. Not sure which one I should use.

    Cheers!

  4. This post really helped me a lot to install my Pentaho BI CE, even though it is on a SuSE server. By the way, this question may be off topic: what plugin did you use for the comment form, the one that lets users log in with their Twitter and Facebook accounts?

    Thanks!

  5. Pingback: Complete Pentaho Installation « splittingelectrons

  6. hi…when i check the Pentaho/biserver-ce/tomcat/logs/ log file this is the error message i get

    usr/lib/jvm/java-6-sun/bin/java: not found….basically when i type localhost in the web page its not opening up

  7. Just a simple question :

    How important it is to migrate from HSQLDB to MySQL (for Quartz and Hibernate) ? Do we get better performance using MySQL (or any DB other than HSQLDB) ? If we keep the BI Server untouched configuration-wise (but removing all the samples, etc), it is good enough for production ?

    • The HSQLDB database in the Pentaho BI demo is loaded into memory each time, so its performance is fine for presentations.
      In your production environment you’ll need your own permanent database (tuning, backups, maintenance, size), either on hard disk or SSD, so it was natural for me to move everything to MySQL, but you can live with both engines if you like.
      I like to keep the demo programs in new installations for a few months, as many users need to be shown what can be built.
      P.S. Quartz and Hibernate control schedules and access, so yes, you should migrate those.

  8. Hi! I just like to ask what I should do when I have installed biserver-ce on ubuntu server too, but when I open it on a browser, it just tells me that “It Works!”, which is what I am not expecting to see…what should I be editing? please help…thank you very much…i need to have the login page for pentaho…thanks in advance

  9. Hi,

    Are these warnings normal?

    14:45:16,333 WARN [PackageManager] Unresolved dependency for package: org.pentaho.reporting.engine.classic.extensions.datasources.cda.CdaModule
    14:45:16,392 WARN [PackageSorter] A dependent module was not found in the list of known modules.
    14:45:28,526 WARN [AxisService] Unable to generate EPR for the transport : http
    Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
    14:45:29,492 WARN [AxisService] Unable to generate EPR for the transport : http
    Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
    14:45:29,801 WARN [AxisService] Unable to generate EPR for the transport : http
    Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
    14:45:30,712 WARN [AxisService] Unable to generate EPR for the transport : http
    Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
    14:45:31,992 WARN [AxisService] Unable to generate EPR for the transport : http
    Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
    14:45:32,632 WARN [AxisService] Unable to generate EPR for the transport : http
    Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
    14:45:33,074 WARN [AxisService] Unable to generate EPR for the transport : http
    Warning: Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
    14:45:33,698 WARN [DefaultSchemaGenerator] We don’t support method overloading. Ignoring [public java.lang.String serializeModels(org.pentaho.metadata.model.Domain,java.lang.String,boolean) throws java.lang.Exception]
    14:45:33,770 WARN [DefaultSchemaGenerator] We don’t support method overloading. Ignoring [public java.lang.String serializeModels(org.pentaho.metadata.model.Domain,java.lang.String,boolean) throws java.lang.Exception]

    • First:
      About the stylesheets and serializing: yes, there is no problem.
      There are lots of warnings, even SEVERE ones, but nothing that makes the system unusable.
      Second:
      The CDA writes lots of notes to the logs, but not the one that you show. I would reinstall the whole ccc tools.

      • I’ve just noticed that if I install the entire ccc tools, my server doesn’t stop when I do the shutdown. I have to do a kill -9 to kill it.
        I’ve figured out that this only happens when I install CDB; if I choose not to install it, the server shuts down normally. Has anyone experienced this problem?

        Another problem I found is that Fusion Charts don’t show up as a component in the CDE. Is this normal? And if I run the CDE Fusion Chart sample, it just shows a blank page.

      • There is definitely something wrong in your installation.
        That refusal to shut down doesn’t even happen to me on Windows 🙂
        All charts work for me (I had to modify some of them in previous versions, but they worked OK in 4.5).
        I have no clue, sorry.
        I think it’s better to start from zero.

      • What do you mean by reinstall the ccc tools? I am trying to install biserver-ce on Linux Ubuntu 12.10 and I get the same problem with Pentaho BI Server 4.8.0 and 4.5.0…
        Same as Ricardo above..
        14:45:16,333 WARN [PackageManager] Unresolved dependency for package: org.pentaho.reporting.engine.classic.extensions.datasources.cda.CdaModule
        14:45:16,392 WARN [PackageSorter] A dependent module was not found in the list of known modules.

        Sadly it works fine on Windows.

  10. I just installed from zero, didn’t change the DB to MySQL, just installed the server and the cctools and fusion charts plugins.

    Regarding the problem of the server not stopping, i just confirmed installing only one component of ctools at a time, instead of installing all, the problem only happens if i install CDV or CDB.

    With the fresh installation the Fusion Charts still don’t appear in the Charts components section of CDE 😦

    I forgot to mention earlier that i’m using CentOS 6.3 (64 bit), dunno if has any influence.

  11. Hello
    I have the following problem.
    Pentaho does not start. The catalina.out log file contains the following message.
    It may not be wrong.

    Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/catalina/startup/Bootstrap
    Caused by: java.lang.ClassNotFoundException: org.apache.catalina.startup.Bootstrap
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    Could not find the main class: org.apache.catalina.startup.Bootstrap. Program will exit.

  12. Pingback: Pentaho CE 4.8 – BI Server Update | Interesting IT Tip's

  13. Pingback: Installation complète de Pentaho sur Ubuntu, Partie 1 | Eric Vernichon's Blog

  14. Francisco ,

    Great Post. Thank for the information. it was very useful.
    I have a problem.
    1) I need to access the transformations and jobs created with the community edition of Pentaho Data Integration 4.4.0 in BI Server 3.8.
    I use Ubuntu 12.4.0 .
    2) Whenever I try to create a new data source through BI Server 3.8 community edition, the submit button doesn’t work.

    Any help will be appreciated.

    Thanks,
    Sripada

  15. Hi,

    Any chance you will be updating this information for the new BI Server 5.0? This was a great help to me when I started using Pentaho. BI server storage is very different now and some configs are different.

  16. Hi,
    This is the output of $JAVA_HOME from my PC. Please note that I am using the RHEL operating system.

    [root@vertica-srv1 home]# echo $JAVA_HOME
    /usr/lib/jvm/jdk1.7.0_79/

    As per the instruction which you have mentioned above this is what i gave inside start-pentaho.sh

    [pentaho@vertica-srv1 data-integration-server]$ vi start-pentaho.sh
    ### ====================================================================== ###

    DIR_REL=`dirname $0`
    cd $DIR_REL
    DIR=`pwd`
    #cd -

    export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_79/
    . "$DIR/set-pentaho-env.sh"
    ========================================================================

    But after setting the JAVA_HOME path inside the start-pentaho.sh file, I still get the following DEBUG output:

    [pentaho@vertica-srv1 data-integration-server]$ ./start-pentaho.sh
    DEBUG: Found JAVA two folders up
    DEBUG: _PENTAHO_JAVA_HOME=/home/pentaho/business_process/server/data-integration-server/../../java
    DEBUG: _PENTAHO_JAVA=/home/pentaho/business_process/server/data-integration-server/../../java/bin/java
    Using CATALINA_BASE: /home/pentaho/business_process/server/data-integration-server/tomcat
    Using CATALINA_HOME: /home/pentaho/business_process/server/data-integration-server/tomcat
    Using CATALINA_TMPDIR: /home/pentaho/business_process/server/data-integration-server/tomcat/temp
    Using JRE_HOME: /home/pentaho/business_process/server/data-integration-server/../../java
    Using CLASSPATH: /home/pentaho/business_process/server/data-integration-server/tomcat/bin/bootstrap.jar

    Please kindly suggest how I could solve this DEBUG issue:

    DEBUG: Found JAVA two folders up
    DEBUG: _PENTAHO_JAVA_HOME=/home/pentaho/business_process/server/data-integration-server/../../java
    DEBUG: _PENTAHO_JAVA=/home/pentaho/business_process/server/data-integration-server/../../java/bin/java

    JUST INCASE IF YOU WANT TO SEE MORE JAVA OUTPUT INFORMATION THEN HERE IT IS :-

    [pentaho@vertica-srv1 data-integration-server]$ which java
    /usr/bin/java

    [pentaho@vertica-srv1 data-integration-server]$ java -version
    java version “1.7.0_79”
    Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
    Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

    [pentaho@vertica-srv1 data-integration-server]$ javac -version
    javac 1.7.0_79
    [pentaho@vertica-srv1 data-integration-server]$

    I have been researching the above issue for many days but still could not solve it. It would be a great help if you could help me solve it.

    Looking forward to your kind response.

    Thank You

    Ujjwal Rana

    • Hi Ujjwal Rana,
      Your java path seems ok.

      In the debug messages there is a strange folder “/home/pentaho/business_process/”. There is nothing on that level but “biserver-ce” and that is the tomcat-pentaho install.
      I suspect that there is an issue with the untar/unzip process.
