Pentaho 8.0

On SourceForge, Pentaho has been renamed “Hitachi Vantara | Pentaho”, and the binaries for all projects in this version have been uploaded for the community. Pedro Alves has summarized the main features here.

The links to the Web server and the Desktop tools are:

Pentaho 8.0

  1. Server Page [url]
    • BI Server [1.2GB]
      Tomcat server to schedule jobs, grant users access to run reports and jobs, and design dashboards.
  2. Tools Page [url]
    • PDI 8.0 [979.8 MB]
      Pentaho Data Integration – Best ETL you’ll find.
    • PRD 8.0 [666.0 MB]
      Pentaho Report Designer – Reporting tool for different databases.
    • PME 8.0 [836.5 MB]
      Pentaho Metadata Editor – Grant access and model how to query your data.
    • PAD 8.0 [25.6 MB]
      Pentaho Aggregation Designer – Specify Mondrian cube aggregations.
    • PSW 8.0 [84.4 MB]
      Pentaho Schema Workbench – Edit your Mondrian cube.

Sounds like it’s time for discovery.


Hitachi Vantara

Today, September 19, 2017, Hitachi Vantara was announced. It is a new business entity that will unify the operations of Hitachi Data Systems, Hitachi Insight Group, and Pentaho.

So we’ll be saying both Pentaho and Vantara for a while to refer to this BI suite.  🙂

This is the press release:

The new main site:

And the new Twitter account to follow:

The Community Edition at SourceForge: with the Hitachi logo and the title “Hitachi Vantara | Pentaho”. The wiki in

Data integration is still “PDI” or “Kettle” in the community area:

It seems (#HitachiNEXT) they’ve got a nice business and consulting strategy. Let’s see what they do with the original Open Source philosophy.

Pentaho 7.1 – Demo Install

The newest Pentaho release, in its community edition, has been available for download at SourceForge (link) since the 22nd of May, 2017.

Pedro Alves announced its availability the same day with an excellent post in his blog (link). He mentions new visualizations, scalability with big data engines, repository improvements (someday they’ll deliver something you can work comfortably with, maybe) and a new web theme. We’ll see how they are supported in the CE. There is also new mobile access for the EE. The PDI team is excited about its ‘metadata injection’ improvements.

You’ll find it at

  • Web application: Business Intelligence Server [1.1 GB] with 58 closed reports.
  • The best ETL application: PDI [904 MB] with 142 closed reports.
  • Report Designer [606 MB] with 19 closed reports in their JIRA system.
  • Pentaho Metadata [787 MB] with 5 closed reports.
  • And Big Data Shims folder.

The steps for backup, install and upgrade to a local MySQL DB are the same ones you are already familiar with; they have been posted previously in this blog (link) and are well documented at the Pentaho Wiki.

Knowage Suite 6.0 CE

I discovered SpagoBI at version 5.1, about two years ago. It was a complete Open Source BI solution, the only one, they said. They were proud of it! I was waiting for 5.3 to start a series of posts, as they had announced amazing changes.

On May 4th @SpagoBI announced they were starting a #spagobirevolution. The same company (Engineering Group) gave the user experience (UX) an extreme makeover and released a Community Edition (CE) and an Enterprise Edition (EE) under the Knowage brand.

From what I have seen, the CE is an interesting product, though somewhat crippled: important functionality such as scheduling, MDX calculated fields and more is excluded. Let’s hope its cockpit designer, metadata, models and widgets deliver a promising alternative for data exploration.


Check their overview

Visit their site:

Install notes

[Updated on December 14, 2017]
Knowage has tweeted about this install guide, which has instructions for Windows and Linux and DB migration scripts from 5.x.

[Updated on June 20, 2017]
Prerequisites: You’ll need Java.
CE Manual: pdf

  1. Download from ow2: [1.078 GB] [798 MB]
  2. Open a terminal, create a folder and unzip the download into it:
    mkdir knowage6
    cd knowage6
    cp ../dowloads/
    unzip knowage
    chmod 777 ./
  3. An installer will:
    a. Open a welcome dialog. Click Next.
    b. Ask whether you accept the license agreement. Accept it and click Next.
    c. Ask for your preferred charting library. Accept and click Next.
    d. Ask which of six modules to install. With all of them selected, click Next.
    e. Ask for a destination folder. I selected $HOME/knowage6; it will create a Knowage-Server-CE folder there.
    f. Ask for your MySQL credentials (jdbc:mysql://localhost:3306/, user and password).
       It says that two schemas, knowage-ce and foodmart-demo, will be created.
  4. The installer will extract the server and the .war files and update the database.
  5. Unselect all and click Finish to end the installer.
  6. I had to modify the startup and shutdown scripts to set the correct path to Java:
    cd $HOME/knowage6/Knowage-Server-CE/bin
    [add your JRE path to the first line like:]
    export JRE_HOME="/usr/lib/jvm/java-8-openjdk-amd64"
    [save your script and repeat for the other one]
  7. Start your Tomcat server:
    cd $HOME/knowage6/Knowage-Server-CE/bin/
  8. Start your browser and point it to:
    http://localhost:8080/knowage
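
If you would rather not edit the startup and shutdown scripts by hand, the JRE_HOME change can be scripted. This is only a minimal sketch: `set_jre_home` is a helper name I made up, the script to patch is passed as an argument, and I am assuming the scripts may start with a `#!` line, in which case the export goes on the second line so the shebang stays first.

```shell
#!/bin/sh
# set_jre_home SCRIPT JRE_PATH
# Hypothetical helper: adds an `export JRE_HOME="..."` line at the top of SCRIPT.
# If SCRIPT starts with a shebang (#!), the export is placed on line 2 so the
# shebang stays first; otherwise the export becomes line 1.
set_jre_home() {
    script=$1
    jre_path=$2
    export_line="export JRE_HOME=\"$jre_path\""
    tmp=$(mktemp) || return 1

    if head -n 1 "$script" | grep -q '^#!'; then
        {
            head -n 1 "$script"
            printf '%s\n' "$export_line"
            tail -n +2 "$script"
        } > "$tmp"
    else
        {
            printf '%s\n' "$export_line"
            cat "$script"
        } > "$tmp"
    fi

    # Write back with cat so the original file keeps its permissions.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}
```

Call it once per script from the bin folder, for example `set_jre_home ./your-script.sh "/usr/lib/jvm/java-8-openjdk-amd64"`, where `your-script.sh` is a placeholder for the actual file. Make a copy of the scripts first if you want an easy way back.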

You will find examples, enjoy.

If you have problems with the server you can check your ./logs folder.
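
A quick way to scan that folder is to grep for error lines. This is a sketch under assumptions: `recent_log_errors` is a name I invented, and I am assuming problems are logged with the usual `ERROR` or `SEVERE` level markers; adjust the pattern if your logs look different.

```shell
#!/bin/sh
# recent_log_errors [LOG_DIR] [COUNT]
# Hypothetical helper: prints the last COUNT lines (default 20) under LOG_DIR
# (default ./logs) that contain ERROR or SEVERE, prefixed with file:line.
recent_log_errors() {
    log_dir=${1:-./logs}
    count=${2:-20}
    grep -rnE 'ERROR|SEVERE' "$log_dir" | tail -n "$count"
}
```

Run `recent_log_errors` from the server folder, or pass the path to the logs folder as the first argument.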

Visiting Pentaho. Waiting for 7.0

[Update November 14th]

OK, it’s here. It has been up since November 9th.

Go and get it at the wonderful site: files.

[Update October 18th]

Pentaho Business Analytics 7.0 will be available for download in mid-November. These are the announcement and overview.

[Original Post]

In October 2016 we should be getting the next version of Pentaho. According to its Jira records it will be a stability-oriented release: I count 58 fixes for the BI Server, 139 in PDI, 10 in PRD and 6 in CDE/CTools (you can check each project’s 7.0.0 version in Jira), and few improvements.

Having a reliable suite benefits everyone, especially in this market, where FOSS community editions are being abandoned and enterprise editions are developed for corporate needs rather than for small or medium companies.

On that topic, I browsed the forums to see what people are asking. This is what I found from the 1st to the 22nd of September:

Project | Questions | Unanswered | Notes
BI Platform | 22 | 8 | Installation and simple questions. 2 people even asked if anyone has any clues, they were lost.
Pentaho Reporting [PRD] | 30 | 15 | Advanced questions. 2 moved as they placed it in wrong forum.
Pentaho Data Integration [PDI] | 127 | 23 | Advanced questions. Some can be solved with the modified java step or mail step examples.
CTools | 32 | 11 | They ask how to modify graphs in specific ways, some of them answered in CTools web site, but need clarification on frameworks used.
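
To compare the forums it helps to turn those counts into unanswered rates. A small sketch of the arithmetic, using only the numbers from the table above; `unanswered_rate` is just a name I picked, and the project names are shortened to fit.

```shell
#!/bin/sh
# unanswered_rate UNANSWERED TOTAL
# Prints the share of unanswered questions as a whole-number percentage.
unanswered_rate() {
    awk -v u="$1" -v t="$2" 'BEGIN { printf "%.0f\n", 100 * u / t }'
}

# project|questions|unanswered, copied from the forum table.
while IFS='|' read -r project questions unanswered; do
    printf '%-12s %s%%\n' "$project" "$(unanswered_rate "$unanswered" "$questions")"
done <<'EOF'
BI Platform|22|8
PRD|30|15
PDI|127|23
CTools|32|11
EOF
```

That works out to roughly 36% for the BI Platform, 50% for PRD, 18% for PDI and 34% for CTools, so PDI has by far the most questions and also the best answer rate.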

Of these projects, the BI Server is the one with the most novices asking questions that have already been answered. A FAQ should help these people. For PDI and PRD there are some senior members giving excellent advice; the existence of these people made me feel good about humanity.

I guess there will be no changes to the install or upgrade instructions. Maybe I should clean up the old ones on this site.

Well that’s my 2 cents on the CE application and support for the upcoming 7.0.0 version.

Pentaho 6.1

As announced by Pentaho-Hitachi, version 6.1.01 was released on the 9th of April, 2016, in EE and CE editions: Data Integration, Business Intelligence Server, Report Designer, Pentaho Metadata and Big Data Shims.

There are lots of fixes: 179 of 180 issues have been resolved in PDI, 59 of 59 issues for the BI Server and 14 of 14 issues for the PRD.

These are the main improvements according to Pedro Alves: services, ODBC, metadata injection, improvements to several steps, and enabling tests in the BI Server from PDI, something that had been lost since the studio went down the tubes.

The steps for upgrading should be the same as previously posted. If not, I will get back to you and annotate this post. [Which is very unlikely.]


Pentaho 6.0.x Install


After reading the changelog, I thought that the jump in numbering from 5.4 to 6.0 was not justified. But then I realized that this is the first version of Pentaho under the Hitachi brand, and setting a round number marks a milestone as completed. A good sign is that they kept the release day of the community version (CE) the same as that of the commercial one.

Recommendation: Check out the ETL, it is a wonderful tool. Spend time with it; it will let you clean and process data from several sources (text, NoSQL, DB, Excel, SAP ERP) and send it on its way [to services or users] in different formats. Of course it can be used with other suites. Then proceed with the Web server, which is primarily a client-side tool that shows processes according to user privileges and schedules them. To build dashboards you can choose between two sets of building blocks: CTools or Ivy. They are rudimentary, but they let you set parameters, retrieve data from the ETL or Report Designer, and let users click on graphs so you can run queries. The Report Designer has its own [complex] way of doing things but will let you create fixed-format reports, mainly in PDF or HTML. The Metadata and Schema tools will help you with business/OLAP models and data governance.

Demo Install

First, download each file and install it as stated in the guide; play with it, and then proceed with the next one.

Download links

Pentaho files: sourceforge zip files 

  1. ETL Tool [>810MB]: v6.0, v6.0.1
  2. Tomcat Web Server [>900MB]: v6.0, v6.0.1
  3. Report Designer [>530MB]: v6.0, v6.0.1
  4. Metadata Editor [>500MB]: v6.0, v6.0.1
  5. OLAP Schema Workbench [~30MB]: v3.11
  6. OLAP Aggregation Designer [~30MB]: v6.0


The install steps for the Pentaho demo applications have been similar since 4.8, so the old 5.x guide works with the new files. Please follow this post using the new files.

You can skip the Java SDK installation if you already have it on your system. On a new box you’re better off with Java SDK 1.8.0, as Pentaho 6.0 now works with it. Warning: If you’re on 5.x you’re probably on Java SDK 1.7.0 and you can keep working with it; just remember to back up your development files before upgrading to Java 1.8.0.
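
If you are not sure which Java you have, a quick check from a terminal helps before deciding. This is a sketch, not part of the Pentaho guide: `java_major` and `installed_java_version` are names I made up, and I am assuming `java -version` prints a line containing `version "x.y.z"` (it writes to stderr, hence the redirect).

```shell
#!/bin/sh
# java_major VERSION_STRING
# Maps the old 1.x numbering to a major version: 1.7.0_80 -> 7, 1.8.0_144 -> 8.
java_major() {
    printf '%s\n' "$1" | awk -F. '{ if ($1 == 1) print $2; else print $1 }'
}

# installed_java_version
# Prints the quoted version reported by `java -version`, or nothing when
# java is not on the PATH.
installed_java_version() {
    command -v java >/dev/null 2>&1 || return 0
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

version=$(installed_java_version)
if [ -z "$version" ]; then
    echo "No java found on the PATH: install Java SDK 1.8.0"
else
    case "$(java_major "$version")" in
        8) echo "Java $version: the 1.8.0 line that Pentaho 6.0 works with" ;;
        7) echo "Java $version: fine for 5.x; back up your files before upgrading" ;;
        *) echo "Java $version: not one of the versions covered in this post" ;;
    esac
fi
```

On Windows the same `java -version` command works from a command prompt; only the wrapper above needs a Unix-like shell.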

In that post you’ll also find instructions to install MySQL, which is optional, in case you want to use your own data. BTW, I now use MariaDB and it’s working fine.

Memory adjustments:
I tested the apps on a 32-bit Windows system and had to edit the startup batch files to lower the memory limits:
DATA INTEGRATION 32 bits: spoon.bat
FROM  -Xms1024m -Xmx2048m
TO    -Xms768m -Xmx1024m
REPORT-DESIGNER 32 bits: report-designer.bat
FROM  -Xms1024m -Xmx2048m
TO    -Xms512m -Xmx1024m
BI SERVER 32 bits: start-pentaho.bat
FROM  -Xms2048m -Xmx6144m
TO    -Xms768m -Xmx1024m
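
The same edit can be done with sed instead of a text editor. A minimal sketch, assuming you run it from a Unix-like shell (Git Bash or similar on Windows, or against the .sh versions of the scripts on Linux); `set_heap` is a hypothetical helper name, and it keeps a `.bak` copy of the file it changes.

```shell
#!/bin/sh
# set_heap FILE XMS XMX
# Hypothetical helper: rewrites every -Xms<size> and -Xmx<size> option in FILE
# to the given values, keeping the untouched original as FILE.bak.
# Example (values from the list above): set_heap spoon.bat 768m 1024m
set_heap() {
    file=$1
    xms=$2
    xmx=$3
    cp "$file" "$file.bak" || return 1
    sed -e "s/-Xms[0-9][0-9]*[mMgG]/-Xms$xms/g" \
        -e "s/-Xmx[0-9][0-9]*[mMgG]/-Xmx$xmx/g" \
        "$file.bak" > "$file"
}
```

Other JVM options on the same line are left alone, and if the server refuses to start you can copy the `.bak` file back.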