Complete Pentaho Installation on Ubuntu, Part 2

[Edit]

This post was written with Pentaho version < 4.8 in mind. Most of it is still usefull specially the database setup. Some things change a little, like point nine, the datasource [re]definition that now must be done on Pentaho console itself or the examples section as they no longer exist.

Please check this series for version 5.x.


Modify Pentaho Demo Databases to MySQL Add your Database too

The Pentaho BI Server that you configured in last post uses HSQLDB engine. It creates and loads the database in memory each time you start the demo.

That’s fine for a demo, but we prefer a more permanent storage, so we’ll create six MySql schemas, their users, tables and corresponding ‘datasource’ definitions for our Pentaho system:

  • Two system databases: hibernate and quarz, the first one is used for configuration and the other one is for scheduling.
  • Another one for steelwheels data in the sampledata database, this is the one used in the Pentaho/HSQLDB demo.
  • Add a more complex data set: foodmart. This was the original database for mondrian examples (multidimensional project in Pentaho).
  • An empty databse for us to load and clean up data: loadig_area.
  • And the last one called datamart for our tables to report from.

Setting them up takes a few administration activities for which we’ll use the administrative MySql graphical tools. Then edit seven text files -more or less- to configure Pentaho to look for MySql insted of HSQLDB. We’ll be up and running in a couple of hours.

What you need Two things are for migration and one for your a special DB source of data:

  1. MySQL ‘database engine’ tested. MySQL root user with password. Tools: MySql Administrator and MySql Query Browser.
  2. SQL Scripts that create databases and load data content. We’ll use the scripts from Analysethis, PHI-Integration.com and OSBI sites. [The Pentaho wiki says you shouldn’t link to them until their migration is compleated, so if it is broken there are another links in the comments]
  3. Optional: Your IP, user, password, schema name and software driver for your particular database. You’ll get this from your database administrator. It we’ll be easy for him/her if you remind them that the access is only read-only no excecution allowed.

If you did the install MySql step in last post, you have a working DB engine now, if not, check the recomended guide or search the web for your favorite one.

Download two files with the scripts for SteelWheels & MySql (130KB) and Foodmart (7.9MB). Extract the scripts to something like /Pentaho/sql_scripts.

For the drivers, download:

  • MySql 3.14 connector (mysql-connector-java-3.1.14-bin.jar) or 3.17 from here. The 5.1 driver sometimes has issues on some systems.
  • Optional: If you have an Oracle system you’ll need the ojdbc14.jar and orai18n.jar. Edit june 2011: You can get them from de PDI. Download the file -we are going to install it on part 6– and copy them from /Pentaho/data-integration/libext/JDBC .

Steps to Do

# Activity HSQLDB ->MySql New Datasource
1 Create your databases  X
2 Load scripts to sample databases X
3 Create system/database users and grant them permisions X
4 Modify system connections to MySql: hibernate and quartz X
5 Redirect Hibernate itself to MySql X
6 Set the spring-security to be on MySQL database. X
7 Modify the list with drivers to use for each jndi connection  X X
8 Place the drivers in the tomcat\common\lib folder X X
9 Edit the scripts to avoid HSQLDB to start X
10 Start the console administraton for datasource setup X X
11 Test your system X X

1. Create your Databases

If you haven’t use MySql tools, then here you’ll get your first acquaintance with them. They will be very usefull for administration/backups and data querying.

Start MySQL Query Browser (Ubuntu menu: Application->Programming) and login as root (use localhost or empty for ‘Server Hostname’ if it reports an error)

Login MySql Query Browser

Login MySql Query Browser

and execute two commands:

create database loading_area;
create database datamart;

Exit the MySql Query Browser.

2. Load the scripts to sample databases

Start MySQL Administrator (Ubuntu menu: Application->Programming) and login as root (use localhost or empty for ‘Server Hostname’ if it reports an error):

Login to MySql Administrator
Login dialog for MySql Administrator

Restore the SQL script that creates the hibernate database, hibuser user and DATASOURCE table:

  • Select ‘Restore Backup‘ from the left panel.
  • Click the ‘change path‘ button at the left bottom and navigate to the folder you left them (/Pentaho/sql_scripts), select ‘OK‘ Note: Pentaho has this same script on ‘/home/[your_user]/Pentaho/biserver-ce/data/mysql‘.
  • Select ‘1_create_repository_mysql.sql‘ and the ‘utf-8‘ options.
  • Optionally you can review and select which the operations will be excecuted from the ‘selection’ tab at the window top.
  • Click ‘Restore Backup’ button on the bottom right.

To create the quartz database use:

  • Use ‘Restore Backup‘, using the same folder.
  • Select ‘2_create_quartz_mysql.sql‘ and the ‘utf-8‘ option.
  • Click ‘Restore Backup’ button on the bottom right.

Do the same to create the ‘sampledata’ database, but use the ‘3_create_sample_datasource_mysql.sql‘ script.

The users of Pentaho system (not the database) will be loaded when you restore the next scritp: ‘4_load_sample_users_mysql.sql‘.

With the next two we’ll load the actual data of the samples, these are: ‘5_sampledata_mysql.sql‘ and ‘foodmart_mysql.sql‘.

2. Create system/database users and grant them permisions

Continue in MySql administrator, but change the left panel option, click on ‘user administration‘ and then click the left bottom ‘New User‘ button. Now fill the ‘MyQSL User’ field with pentaho_user. In the passwords fields type password. like this:

My SQL New UserMy SQL New User

Note: You can see in the image the @ (at symbol) indented under the left pentaho_admin user. This is where you can give different permisisons to each IP/Equipment.

Lets continue and create with the same steps the pentaho_user.

Now, click on the ‘Schema Privileges‘ in the top tab. In the ‘User Accounts’ list  in the left bottm sould be still selected the ‘pentaho_user’ user. Select the ‘datamart’ schema and move from the right list to the left the correct privileges. Click ‘Apply Changes‘. It will end up like this:

My Sql Privileges

My Sql Privileges for ‘pentaho_user’ on ‘datamart’ schema

Privileges for ‘pentaho_user‘ on datamart, foodmart, hibernate, quartz, sampledata, loading_area schemas will be SELECT, INSERT, UPDATE, DELETE (omit DELETE for hibernate and quartz).

Give ‘pentaho_admin‘ full privileges on those tables. You can close the administrator now.

Check that ‘hibuser‘ has full access to hibernate database (given them by the script). Note: this step was reported several times so check it twice.

3. Modify system connections to MySql: Hibernate and Quartz

Use the file file navigator ‘nautilus’ (on the desktop menu: Places->Personal folder) to the following folder:

/Pentaho/bi-server/tomcat/webapps/pentaho/META-INF/

Note that this are the tomcat (web server) files. Edit (right click->open with->gedit):

context.xml

And change four lines with the classes, url strings and the SQL to verify the connection to:

<?xml version="1.0" encoding="UTF-8"?> <Context path="/pentaho" docbase="webapps/pentaho/">
    <Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
        factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
        maxWait="10000" username="hibuser" password="password"
        driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hibernate"
        validationQuery="select 1" /> 

    <Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
        factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
        maxWait="10000" username="pentaho_user" password="password"
        driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/quartz"
        validationQuery="select 1"/>
</Context>

Save the file and exit gedit.

4. Redirect Hibernate to MySql

Navigate to:

/Pentaho/bi-server/pentaho-solutions/system/hibernate/

This are the Pentaho config files. Open with gedit:

hibernate-settings.xml

Change it to:

<config-file>system/hibernate/mysql5.hibernate.cfg.xml</config-file>

Save the file. In gedit open in the same folder:

mysql5.hibernate.cfg.xml

Edit june 2011: This file should be ok, no need to edit. But you had to in previus versions, just to keep in the checklist.

Change three property tags in:

<!--  MySQL Configuration -->
<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<property name="connection.url">jdbc:mysql://localhost:3306/hibernate</property>
<property name="dialect">org.hibernate.dialect.MySQL5InnoDBDialect</property>
<property name="connection.username">hibuser</property>
<property name="connection.password">password</property>
<property name="connection.pool_size">10</property>
<property name="show_sql">false</property>
<property name="hibernate.jdbc.use_streams_for_binary">true</property>
<!-- replaces DefinitionVersionManager -->

Save the file, exit gedit.

5. Set the spring-security to the MySQL database

We’ll change the spring security files (until 3.5 the technology was ‘acegi’, but on 3.6 they rename it).  Navigate up one folder level, and edit the file:

/Pentaho/biserver-ce/Pentaho-solutions/system/ applicationContext-spring-security-hibernate.properties

Change it to:

jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.MySQL5Dialect

Save the file, close it and open on the same folder:

applicationContext-spring-security-jdbc.xml

Edit, change the header and two lines:

<!--  This is only for MySQL. Please update this section for any other database you are using -->
<bean id="dataSource"
class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="com.mysql.jdbc.Driver" />
<property name="url"
value="jdbc:mysql://localhost:3306/hibernate" />
<property name="username" value="hibuser" />
<property name="password" value="password" />
</bean>

6. Modify the file that list the jndi (datasources) connections

Edit:

Pentaho/biserver-ce/pentaho-solutions/system/simple-jndi/jdbc.properties

Change the driver definitions to:

SampleData/type=javax.sql.DataSource
SampleData/driver=com.mysql.jdbc.Driver
SampleData/url=jdbc:mysql://localhost:3306/sampledata
SampleData/user=pentaho_user
SampleData/password=password
Hibernate/type=javax.sql.DataSource
Hibernate/driver=com.mysql.jdbc.Driver
Hibernate/url=jdbc:mysql://localhost:3306/hibernate
Hibernate/user=hibuser
Hibernate/password=password
Quartz/type=javax.sql.DataSource
Quartz/driver=com.mysql.jdbc.Driver
Quartz/url=jdbc:mysql://localhost:3306/quartz
Quartz/user=pentaho_user
Quartz/password=password
SampleDataAdmin/type=javax.sql.DataSource
SampleDataAdmin/driver=com.mysql.jdbc.Driver
SampleDataAdmin/url=jdbc:mysql://localhost:3306/sampledata
SampleDataAdmin/user=pentaho_admin
SampleDataAdmin/password=password

Important: For new datasources, add at the bottom

FoodMart/type=javax.sql.DataSource
FoodMart/driver=com.mysql.jdbc.Driver
FoodMart/url=jdbc:mysql://localhost:3306/foodmart
FoodMart/user=pentaho_user
FoodMart/password=password
LoadingArea/type=javax.sql.DataSource
LoadingArea/driver=com.mysql.jdbc.Driver
LoadingArea/url=jdbc:mysql://localhost:3306/loading_area
LoadingArea/user=pentaho_user
LoadingArea/password=password
Datamart/type=javax.sql.DataSource
Datamart/driver=com.mysql.jdbc.Driver
Datamart/url=jdbc:mysql://localhost:3306/datamart
Datamart/user=pentaho_user
Datamart/password=password

You can delete the five line shark statements as this workflow is no longer used.

7. Place the drivers in the tomcat\common\lib folder

Note: Some time ago I had trouble with the 5.0.7 driver for MySQL so I  moved it to another location and used de 3.17 version in my windows system, but now I’m working fine in my Ubuntu box with 5.0.7 so test your system and use the latest version you can use.

Move the downloaded drivers to the

/Pentaho/biserver-ce/tomcat/lib/

This is important with Oracle if your ETL or reports use this driver.

8. Edit the scripts to avoid HSQLDB to start

Now we must stop HSQLDB from starting, edit:

/Pentaho/biserver-ce/tomcat/webapps/pentaho/WEB-INF/web.xml

This is a very important file, so it has a lot of parameters. Move to line 87 (or search for HSQLDB) and comment the hsqldb startup lines -or delete them- like this:

<!-- [BEGIN HSQLDB DATABASES]
<context-param> <param-name>hsqldb-databases</param-name> <param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value> </context-param>
[END HSQLDB DATABASES] -->

and in line 215 (or next find):

<!-- [BEGIN HSQLDB STARTER]
<listener> <listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class> </listener>
[END HSQLDB STARTER] -->

Edit june 2011: For 3.8 on windows and for 3.9 4.0 in linux and windows you need to edit another file to avoid the following error:

In the log -or console- it says “java.sql.SQLException Could not retrieve datasource via JNDI” and the server won’t start. In the browser you get a listener error.

Some people try to edit the systemListeners.mxl file and comment the quartzSystemListener from starting. That works, but you loose the scheduling capabilities of the BI server. Some others try to edit the Quartz.properties file as it reports a MyDS problem. As I did for a day fumbling with the WEB-INF/web.xml and META-INF/Context.xml files.

The solution is to edit a new file at:

/Pentaho/biserver-ce/tomcat/conf/Catalina/localhost/pentaho.xml

Just edit the data as in our previus example of context.xml.

The tip to solve it came from Luis Benavides from http://dssintegration.com/ in a comment in this post.

9. Start the console administration for datasource setup

We need to update de Hibernate database with datasource connection information. We can insert the SQL string using the query browser but we have been using the graphical tools, so we now get the chance to use the Pentaho Administration Console (PAC), but need to edit it’s shell script , and add the JAVA_HOME variable in:

/Pentaho/bi-server/administration-console/start-pac.sh

DIR_REL=`dirname $0`
cd $DIR_REL
DIR=`pwd`
cd -

export JAVA_HOME="/usr/lib/jvm/java-6-openjdk"

. "$DIR/set-pentaho-env.sh"
setPentahoEnv "$DIR/../biserver-ce/jre"

After saving and in the comand prompt at:

/Pentaho/administrator-console/

type:

./start-pac.sh

Use you’re internet browser and go to http://localhost:8099, login with admin/password. You should see a connection error as it tries to connect with the Pentaho Server. But close it, for adding datasources its ok.

Click on Administration on the left, and select the ‘Database Connections’ tab. You can check the parameters for the SampleData datasource. We’ll be adding similar ones.

Pentaho Administration Console (PAC)

Pentaho Administration Console (PAC)

Click on the plus (1) icon, then fill the form with the values in the following table (2) -you know the user and password-, and test your connection (3).

Name Driver Class URL
FoodMart com.mysql.jdbc.Driver jdbc:mysql://localhost:3306/foodmart
Datamart com.mysql.jdbc.Driver jdbc:mysql://localhost:3306/datamart
LoadingArea com.mysql.jdbc.Driver jdbc:mysql://localhost:3306/loading_area

The advanced settings are: maxactcon=20, idleconn=5, wait=1000, query=SELECT 1

Exit the browser and turn off the PAC with  ./stop-pac.sh Note: We’ll use this console again to add new users. 10. Test your system To start Pentaho, in a command prompt at

/Pentaho/biserver-ce/

type:

./start-pentaho.sh

You should see if an error prevents Tomcat to show it’s startup in the command window. If somethings goes wrong you’ll see lots of text, remember that is easier to check the logs (/Pentaho/biserver-ce/tomcat/bin/Pentaho.log) they will tell you if you have made typing errors, for example if it says something like ‘SampleData being out of context’, then something in the datasorce scripts went wrong. There are many posible causes and the log is the best way to track errors.

To close the BI server application use:

./stop-pentaho.sh


[Edit] There is another thing that breaks when we change the database. The examples on bi-developers->’ChartBeans Examples’. We need to edit the files on:

/Pentaho/biserver-ce/pentaho-solutions/bi-developers/chartbeans/*.xaction

These files contains ‘actions’ to be carried out by our server, well review them on a future post. For now replace Sql text:

CAST(SUM(ORDERFACT.TOTALPRICE) AS INTEGER) AS “PRICE” to: CAST(SUM(ORDERFACT.TOTALPRICE) AS UNSIGNED) AS “PRICE”

And In file chartbeans_scatter.xaction, change the sql:

WHERE DEPT=’Sales’ OR DEPT=’Finance’ for: WHERE DEPARTMENT IN (‘Sales’,’Finance’)

Also edit the ‘Flash Chart List’ SQL and make a search and replace for several INTEGER to UNSIGNED in:

/Pentaho/biserver-ce/pentaho-solutions/steel-wheels/charts/pentahoxml_picker.xaction

Finally erase the file ‘chartbeans_simple_dial.xaction’ as it will not work on the CE (check this).

To make the simple dial example work (OpenFlash doesn’t work on the CE edition) we use the JFree drawing engine, make a change in:

/Pentaho/biserver-ce/pentaho-solutions/bi-developers/charts/chartbeans_dialchart.xml

from:
<chartModel chartEngine="OpenFlashChart"
To:
<chartModel chartEngine="JFreeChart"

One last note for windows users:

I had MySQL Workbench 5.2.34 CE working fine, but the following versions refuse to start reporting just: Err: MSVCR100.dll.

So, if you want to use this tool (5.2.37 is the latest one), you have to get two files that are no longer included (msvcp100.dll, msvcr100.dll). I copied them from the 2.34 version to their corresponding folder. Thats it.

Advertisements