Pentaho 5
This post is about installing the tools that were published on SourceForge on November 19th, 2013, that is, version 5.0 and above. It covers installing a database that can hold your data permanently, the Pentaho desktop tools and the BI web server.
Future posts will cover creating a replica of the demo data on a permanent database, customizing the appearance of the web server, notes on the new repository and articles about each Pentaho tool. They will be added through 2014, and the index of the series will be updated as work is done.
Open Source Software
You will download and install:
- Java Development Kit
- MySQL Database and MySQL Workbench
- Pentaho Tools:
  - The ETL tool, Data Integration: a standalone app that can be used to access data in different formats and systems, process it and distribute it to the appropriate people. You can schedule its execution with the web server or use it as a data source for the Report Designer.
  - Report Designer: a desktop banded report builder that lets you ask for parameters and present your reports on the web, in Excel or as PDF.
  - Metadata Editor: a desktop modeling tool that lets you build a meta-model of your data, making it easier for your users to navigate it and for you to control their access to it.
  - BI Server: a Tomcat web server preconfigured with users, demo data and the integrated Pentaho projects.
Java SDK Installation
Use the 32-bit or 64-bit Java version that corresponds to your operating system. To find out which OS version is on your computer:
- On Linux: open a terminal (Launcher -> type Terminal, click on it) and type at the command prompt:
uname -m
If the answer is i686 you have 32 bits; x86_64 means 64 bits.
- On Windows: click on Start -> right-click on Computer. Select Properties. Look up the OS information in the window.
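The Linux check above can be wrapped in a small script. This is only a sketch: bits_for_arch is a hypothetical helper name, and the list of machine strings is an assumption that covers common PCs, not every architecture.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` machine string to 32 or 64.
# The patterns below are an assumed, non-exhaustive list for common PCs.
bits_for_arch() {
  case "$1" in
    x86_64|amd64) echo 64 ;;
    i386|i486|i586|i686) echo 32 ;;
    *) echo unknown ;;
  esac
}

# Report what this machine says and how the helper classifies it.
echo "uname -m reports $(uname -m): $(bits_for_arch "$(uname -m)")"
```

If the helper prints unknown, fall back to reading the uname -m answer yourself before choosing a Java download.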
Linux
If you know about user administration, create a pentaho user and a pentaho group and use them to install the software; if not, your current user will do.
- On Ubuntu:
Open the ‘Ubuntu Software Center’ (type it in the Unity search). In the search box type:
Executable java OpenJDK 7
and click on its Install button.
- From the command line (apt-based distributions such as Ubuntu):
In a terminal, type these three commands:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer
In version 5.0 the scripts (bat and sh) look for a PENTAHO_JAVA_HOME variable, so add it to your profile. Please be careful: this is a configuration file.
- You need to know the location of the Java files.
If you used the Ubuntu Software Center, it should be in
/usr/lib/jvm/java-7-openjdk-i386
on a 32-bit system (on 64 bits the folder name usually ends in amd64). Confirm this before continuing.
- In a command prompt, edit your user .profile or the system-wide environment file:
sudo gedit /etc/environment
Add at the bottom the variable pointing to the Java folder (the one above the ‘bin’ folder):
export PENTAHO_JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
Save and exit.
You can check your java installation by typing on a terminal:
java -version
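Before launching any Pentaho script it may help to confirm that the variable points at a real JDK folder. The sketch below assumes a POSIX shell; check_java_home is a hypothetical helper, not something shipped with Pentaho.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the folder passed as $1
# contains an executable bin/java, which is what the folder
# "above bin" is expected to hold.
check_java_home() {
  if [ -n "$1" ] && [ -x "$1/bin/java" ]; then
    echo "OK: $1/bin/java found"
  else
    echo "Problem: no executable at '$1/bin/java'"
    return 1
  fi
}

# Check whatever PENTAHO_JAVA_HOME currently holds (it may be empty
# until you log out and back in after editing the profile).
check_java_home "${PENTAHO_JAVA_HOME:-}" || true
```

A "Problem" line usually means the path has a typo or points at the ‘bin’ folder itself instead of the one above it.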
Windows
Open a web browser and go to:
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
Download either the 32-bit version, jdk-7u51-windows-i586.exe, or the 64-bit file, jdk-7u51-windows-x64.exe, and execute it using administrator privileges (right-click on it).
To configure the Windows environment variable:
- Click the Start button, right-click Computer and select Properties.
- On your left there should be an option called Advanced system settings; click on it.
- Click on the Environment Variables button.
- A dialog will open with two lists. In the bottom one, click New and type as the variable name:
PENTAHO_JAVA_HOME
and then, as its value, the path to the Java files, which should be something like:
C:\Program Files\Java\jdk1.7.0_51
You can check java by typing on a console: java -version
MySQL Database
Linux
To install MySQL:
- On Ubuntu:
Open the ‘Ubuntu Software Center’, type ‘MySQL’ to search for it, click on ‘MySQL Server’ and then on the Install button.
Do the same with ‘MySQL Workbench’ so you can interact with the database through a graphical tool.
- Or you can choose one of the other recommended ways of installing the software:
http://dev.mysql.com/doc/refman/5.6/en/linux-installation.html
Windows
The main Oracle/MySQL page offers different methods to install it on Windows. I prefer the MSI installer, as it installs the database server and configures it with the appropriate access permissions. Depending on your system, it can be as simple as running the MSI with administrator access, or as exhaustive as a detailed configuration of ports, processes and file ownership.
General instructions to download and install are on that page. The download page is http://dev.mysql.com/downloads/; from there, install the MySQL Community Server and MySQL Workbench. Remember to select your Windows version, 32-bit or 64-bit, and the MSI download button, not the zip one. You will have standard users, a default configuration and lots of literature to catch up on, but you are ready to go now.
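If the installer did not register the server as a Windows service, you can start it by running <Your path>\bin\mysqld.exe (mysql.exe, in the same folder, is the command-line client). With the Workbench (or the command window) you can run SQL commands or edit tables.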
ETL, aka Pentaho Data Integration (PDI)
Now the easy part: installing Pentaho.
Go to the Pentaho project on SourceForge and click on the Data Integration folder. The latest version is 5.0.1-stable (as of this writing). Download the .zip file pdi-ce-5.0.1.A-stable.zip.
On your computer, it is easier to keep all the tools of the Pentaho suite in one folder, so create a Pentaho5 directory under your home directory and unzip the downloaded file in it (right-click and select ‘unzip here’). You should now have a /Pentaho5/data-integration directory with the ETL files.
To test the application:
- On Linux:
Open a terminal window and change to the app folder, something like
cd Pentaho5/data-integration
and then type
./spoon.sh
- On Windows:
Use the file explorer to get to your ETL files, in a folder like C:\Pentaho5\data-integration, and double-click on spoon.bat.
There are a lot of ETL samples in the directory Pentaho5/data-integration/samples/transformations/, covering tasks from reading text files to sorting, grouping and writing to database tables. For example, ‘Fixed Input…’ reads the file ‘Textfile input – fixed length sample data.txt’. To see it working, select the first icon (Fixed..) and hit the F10 key. You will see the file’s contents if you click on Launch.
You can also browse and install plugins: click on Help in the menu bar and select Marketplace.
Exit the app.
Pentaho Report Designer
The process is similar for the Report Designer, which has its own folder in the SourceForge project. The latest version is 5.0.1-stable; the file to download is prd-ce-5.0.1-stable.zip.
Download the file and unzip it in your Pentaho5 folder (right-click and unzip it). You should now have a /Pentaho5/report-designer directory.
To test the application:
- On Linux:
Open a terminal window and change to the app folder, something like
cd Pentaho5/report-designer
and then type
./report-designer.sh
- On Windows:
Double-click on report-designer.bat.
You will notice the designer element tools on your left and, all cramped on your right, the Structure tab of your report, with its elements at the bottom, and the Data tab to add your data sources and parameters.
There are a lot of samples in the directory Pentaho5/report-designer/samples, from invoice status and sales summaries to charts and advanced HTML. The inventory.prpt file in the operational reports folder is a nice example. You can execute them by clicking on the green ‘play’ icon in the toolbar.
Exit the app.
Pentaho Metadata
Installation works the same way as in the previous examples; the Metadata Editor has its own folder in the SourceForge project. The latest version is 5.0.1-stable; the file to download is pme-ce-5.0.1-stable.zip.
Download the file and unzip it in your Pentaho5 folder (right-click and unzip it). You should now have a /Pentaho5/metadata-editor directory.
To test the application:
- On Linux:
Open a terminal window and change to the app folder, something like
cd Pentaho5/metadata-editor
and then type
./metadata-editor.sh
- On Windows:
Double-click on metadata-editor.bat.
You can browse the Steel Wheels physical (database) model and see how it is then used in the business model for browsing from tools like WAQR (now deprecated) or the Report Designer.
Exit the app.
Pentaho BI Server
The BI Server also has its own folder in the SourceForge project. The latest version is 5.0.1-stable; the file to download is biserver-ce-5.0.1-stable.zip.
Download the file and unzip it in your Pentaho5 folder (right-click and unzip it). You should now have a /Pentaho5/biserver-ce directory. The ‘ce’ stands for community edition, the open source or ‘basic’ version, without the much-hyped capabilities shown in the new marketing and videos. Don’t worry, it’s pretty capable software.
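With the last zip extracted, a quick look at the folder layout can catch an archive that was unzipped in the wrong place. The following is a sketch that assumes the four directory names used in this post and a Pentaho5 folder under your home directory; missing_pentaho_dirs is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the expected tool folders that are
# missing under the base directory given as $1.
missing_pentaho_dirs() {
  for d in data-integration report-designer metadata-editor biserver-ce; do
    [ -d "$1/$d" ] || echo "$d"
  done
}

# Base folder assumed from this post; adjust it to your own path.
missing="$(missing_pentaho_dirs "$HOME/Pentaho5")"
if [ -z "$missing" ]; then
  echo "All four Pentaho folders are in place"
else
  echo "Missing under $HOME/Pentaho5:"
  echo "$missing"
fi
```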
You must run the Tomcat server to allow it to ‘deploy’ its web applications. It will uncompress folders and directories and set the server to a ‘localhost’ configuration.
- On Linux:
Open a terminal window and change to the app folder, something like
cd Pentaho5/biserver-ce
and then type
./start-pentaho.sh
- On Windows:
Double-click on start-pentaho.bat.
The startup will take a few minutes. Remember that the server bundles versions of all the software installed, the web server and a database created and running in memory so the demo can run, so be patient.
To access the login page, open your browser (any modern browser is reported to work OK) and type the URL:
http://localhost:8080/pentaho
Browse the files; in the public section you’ll find the standard report, OLAP report and dashboard samples of the Pentaho suite.
You can also add very valuable plugins like Saiku, the CTools dashboard web tools (CDF, CDE, CDA) or WAQR using the Marketplace option.
To ‘close’ the web server you will need to execute the stop-pentaho script.
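Because startup takes a few minutes, a small polling loop can tell you when the login page is ready. This sketch assumes curl is installed and that the server listens on the default localhost:8080; wait_until is a hypothetical helper that simply retries any command.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2
# seconds between attempts; succeed as soon as the command does.
wait_until() {
  tries="$1"
  delay="$2"
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    if [ "$tries" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (polls for about five minutes); uncomment to use:
# wait_until 30 10 curl -fsS -o /dev/null http://localhost:8080/pentaho/Login \
#   && echo "BI server is up"
```

Keeping the retry logic separate from the curl call means the same helper can be pointed at a different port or URL if you change the Tomcat configuration.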
Ending comments
That’s it; those are the tools updated by Pentaho in 2013. Open source tools like Saiku Reporting are not being ported. The Aggregation and Mondrian Schema editors are being updated. Tools like the Studio Designer are deprecated.
Remember: “Change is good. You go first.” 🙂
Installing the database and the server as services/startup daemons is a nice option; it will be addressed in a future post (you can find old articles on this blog that do that). The next post will be on customization of the BI server, then one on database (MySQL) redirection, notes on the new repository and then a post on program migration, so stay tuned.
You can visit the main Index.
When I did the biserver-ce extraction on the server and configured the Java environment variable, it showed directory listings instead of the login page 😦
Do you get the welcome page from Tomcat at http://localhost:8080 ?
If not, check the logs (/biserver-ce/tomcat/logs) for errors that prevented the web server from starting.
Hey @francisco, I was able to make it work, but the weird thing is that the images, css and js folders I uploaded are not visible 😦 although they do exist on the server.
Hi MS,
Since version 5.0, they have been moving things from the filesystem to the Jackrabbit repository.
Maybe it’s cooler for enterprise administrators, but for small teams, power users or consultants it’s a nightmare.
I prefer to keep my sources on the filesystem like we did in pre-5.0 versions and upload changes. That’s a lot of work on upgrades but there is no tool to keep it simple on the CE edition.
You have to browse your ‘solution’ using the ‘Pentaho User Console’ and check the links; if they are in ‘local’ notation you’ll have to upload your files.
Actually they were hidden. I went to View in the top menu and selected Show Hidden Files, and that did the trick. Nothing fancy. I need to know if I can use Data Integration with a dashboard?
It’s better to right-click on the files and set them visible, and then switch the hidden-files option back.
Of course you can; check some examples in previous posts.
There are two things you should be aware of: how you pass parameters from the dashboard to PDI, and changing, if you must, the paths to physical files/resources when you finally upload the transformations to the BI Server.
Yes you are right, is this the only platform to communicate with you?
I used this and it is working: localhost:8080/pentaho/Login
🙂
ohhh man 🙂 I have finally started working with Kettle, using it for database querying and displaying data on a dashboard. 3 things I wanna ask:
1. Data on the dashboard is not updated even when refreshed. I noticed it when an order is placed and we refresh the dashboard: it fetches the data from cache. How can I solve this? I am using Kettle queries.
2. I need to design my own custom bootstrapped dashboard, where a side menu always exists and we call different reports over it. I don’t wanna place the left menu in 40 different dashboards 😦 How can I do it so that only one side menu works for the dashboard and all the reports etc.?
3. How can I make the URL of the dashboard SEF?
My first thought was that you could use the Report Designer. That would let you put your options (buttons, selects) on the left and refresh subreports in another area.
The subreports could hide/unhide sections. The datasets should include where clauses for the queries that should not run.
Then I thought that you may need another metaphor for your dashboard.
Check the Webdetails samples: http://www.webdetails.pt/showcase/
PS. I don’t know how to get Search Engine Friendly (SEF) URLs on the BI server. Why do you need them?
SEF means search engine friendly URLs, like if I could turn the URL below:
http://www.webdetails.pt/pentaho/api/repos/:public:PublicDemo:TieWars:index.wcdf/generatedContent?userid=pentaho&password=demo
into
http://www.webdetails.pt/pentaho/dashboard/?userid=pentaho&password=demo
Ah I see.
I used to use something like that before 5. I haven’t tried whether the repository changed the URLs; I guess they still work.
The problem I recall is that for security reasons they changed authentication privileges.
I don’t know if they fixed that.
Check this dashboard:
URL: http://demo.ivy-is.co.uk/
Username: ivydemouser
Password: G01vy+P3n+ah0
But I don’t want redirections like this; it should act as an AJAX dashboard.
Wow. Those are beautiful dashboards!
And IVY’s components seem really nice.
As you can see in older posts I use Webdetails components. If you select your aggregates well, you can impress some people. But with a big dataset the slow response time is annoying.
You should try to set up a layout with a selector and two components with parameters that update on changes (from the selector and from the clicked element in the first one), connected to SQL and ETL datasources. Then you could ‘taste’ the ‘AJAX’ experience with Pentaho BI.
I must confess that most of my users are financial guys and they like data tables; they can see trends in them even better than with any visualization trick. They just like summarization and KPIs, and if they need to check data they prefer an Excel file.
I will make a note to start playing with the IVY set, though.
Need any help? I’ll be checking more frequently.
Okay Frank 🙂 I am back. I have made a new dashboard template with custom files and layouts 🙂 The issue is that I had to fetch complex values calculated at run time (multiplications etc.) to generate aggregate values, so I did that using Pentaho Data Integration. Now a few of the queries are very slow, due to the complexity of the calculations being performed on a big data set.
I came to a solution: I will generate the aggregate values, save them in a table and then query that aggregate table. But the issue is the everyday transactions against new and old entries in the database, so if someone on the admin side makes changes to the values I will not have updated aggregates in the new table.
How can I cope with this situation? What is the best way to do it in Pentaho? I am totally confused here.
Well, you were very fast implementing that.
First of all, make sure it is the numerical data that is slowing down your ETL.
If it is, then try changing your steps (Calculator is a lot faster than Modified Java Script).
Second, maybe it’s the database; check the indexes.
Third, if it’s both, then try to multitask: maybe run one process on old summaries and another on new ones. I’ve seen that Big Data works with this and hardware (Netezza servers).
If that doesn’t work, then try OLAP cubes and Mondrian aggregation.
Good luck!