Pentaho Visualizations

[EDIT: So sorry for not checking this out first, but this is not for us.
  Community Edition is not allowed to play. So skip this one.
  I will not delete it. It will be a reminder to look for an alternative in 2013.]

This is just a reminder for you to visit the event page where Pentaho is giving away visualization plug-ins for your Pentaho BI Server.

So far they have a ‘sunburst’ tool, which is a double-level pie graph, and a timeline zoom graph.

Pentaho

(click on the image or open the link)

All they ask for is feedback about improvement or additional ideas.

The next one will be on December 19th, 2012.

Pentaho CE 4.8 – BI Server Update

[This is the third edit (15th January 2013) of this post. I think this is the final version, so everything works.]

On November 29, Pentaho (dmoran specifically) published the new version of the Pentaho BI Server, 4.8, on SourceForge. That is just a few days behind the Enterprise Edition announcement, which is great news for us.

The Data Integration, Pentaho Metadata and Report Designer applications were also made available; you should be able to update them by following the Tools Update post.

The most notable improvement in the PUC is a toolbar button to install, upgrade or delete the Saiku or C*Tools plug-ins; it’s called ‘The Pentaho Marketplace’. If you click on it you will see this dialog:

marketplace

If your server is connected to the internet, you just need to click on the buttons to upgrade or install plugins! This is the success dialog:

Dialog

The following instructions should work for upgrading a 3.5-3.10 server if you installed it according to this series (that is, the installation and MySql-DB posts). If you made further modifications or want to do things the ‘best practice way’, then you should do a ‘diff’ of the configuration files, at least the ones mentioned.

If you need details about any instruction, refer to the post where it is explained in detail.

Part I: BI Server Demo Install

Just like in the first post:

  1. Download biserver-ce-4.8.0-stable.tar.gz [445.6.5 MB]
  2. Stop your server.
  3. Make a backup of your /Pentaho/biserver-ce folder; a zip or gzip file should do.
  4. Rename the folder to /Pentaho/biserver-ce-old.
  5. Move the downloaded file into /Pentaho
    and extract the two folders:  tar -zxvf biserver-ce-4.8.0-stable.tar.gz
  6. Make the .sh files executable [I didn’t have to do this, but in case you need it].
    Antonio suggested (in the first post’s comments) using:
    find Pentaho/ -type f -name '*.sh' -exec chmod +x {} \;
  7. Add the Java variable to the start and stop .sh files, including the ones in the administration console folders:
    export JAVA_HOME="/usr/lib/jvm/java-6-openjdk"
    [or java-7-openjdk]
  8. Start the server and log in with joe/password to check the demo at:
    http://localhost:8080/pentaho
    Be patient, it may take a few minutes.
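
Steps 3 to 6 above can be sketched as one shell function. This is only a rough sketch under my own assumptions: the function name upgrade_biserver and the backup file name biserver-ce-backup.tar.gz are made up for the example, and it does not stop or start the server for you (steps 2 and 8 stay manual).

```shell
# Sketch of steps 3-6. upgrade_biserver and the backup file name are
# illustrative names, not something shipped with Pentaho.
upgrade_biserver() {
    base=$1        # e.g. /Pentaho
    tarball=$2     # e.g. /Pentaho/biserver-ce-4.8.0-stable.tar.gz

    [ -d "$base/biserver-ce" ] || { echo "no biserver-ce in $base" >&2; return 1; }
    [ ! -e "$base/biserver-ce-old" ] || { echo "biserver-ce-old already exists" >&2; return 1; }

    # Step 3: back up the current install
    tar -czf "$base/biserver-ce-backup.tar.gz" -C "$base" biserver-ce || return 1
    # Step 4: keep the old tree next to the new one
    mv "$base/biserver-ce" "$base/biserver-ce-old" || return 1
    # Step 5: unpack the new release into the base folder
    tar -zxf "$tarball" -C "$base" || return 1
    # Step 6: make sure the scripts are executable
    find "$base" -type f -name '*.sh' -exec chmod +x {} \;
}
```

You would call it as upgrade_biserver /Pentaho /Pentaho/biserver-ce-4.8.0-stable.tar.gz after stopping the server. It refuses to run if a biserver-ce-old folder is already there, so a second run cannot overwrite your only copy of the old install.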

Part II: Update Plugins

[Edit: JP -in the comments- posted that this step avoids an error on dropdown parameters that happens on some systems. As it works, I suggest you follow his advice.]

If your server has no internet access, follow the C*Tools and Saiku manual install instructions.

  1. Click on the marketplace icon on the toolbar.
    Locate the Community Dashboard Framework line and click on the view details button. Click on uninstall. Close the dialog.
  2. Restart your server.
  3. Reload the settings in the browser using the menu -> Tools -> Refresh and execute all the options. Clear your browser’s cache.
  4. Click again on the view details button of the CDF option, but this time install the ‘stable’ version (not the TRUNK).
  5. Go ahead and click on upgrade for Saiku Analytics, CDA and CDE.
  6. You need to install the Saiku Reporting add-in manually (as described in Post 3): download it and unpack it into Pentaho/biserver-ce/pentaho-solutions/system.
  7. Reboot the server.
    I hope that in the next version of the marketplace they use checkboxes and let the thing iterate, so you don’t have to click and wait for each one.
  8. You can check that, after the procedure, the plugin-samples->CDF->CDF-> Samples->Charts samples->Chart Samples works!

Part III: Upgrade your DB and Connections

Let’s update the database config files (the same ones that we modified in Post 2). Open a file explorer with biserver-ce-old and biserver-ce side by side, as we will be copying files from the first one to the second. Here is the list. Be careful not to overwrite the wrong ones.

  1. Shut down the server.
  2. Copy the Hibernate and Quartz DB config files:
    tomcat/webapps/pentaho/META-INF/context.xml
    tomcat/conf/Catalina/localhost/pentaho.xml
  3. Copy the Hibernate security files:
    pentaho-solutions/system/hibernate/hibernate-settings.xml
    pentaho-solutions/system/applicationContext-spring-security-hibernate.properties
    pentaho-solutions/system/applicationContext-spring-security-jdbc.xml
  4. Copy the Datasources:
    pentaho-solutions/system/simple-jndi/jdbc.properties
    pentaho-solutions/system/olap/datasources.xml
  5. Copy your drivers (files not in the new lib folder) from:
    tomcat/lib/
    In my case: mysql-connector-java-5.1.17.jar, ojdbc14.jar, orai18n.jar
  6. Disable the in-memory database from the demo. Edit:
    tomcat/webapps/pentaho/WEB-INF/web.xml
    Search for BEGIN HSQLDB (it appears in two places) and move the comment markers so that each block is commented out, like this:
    <!-- [BEGIN HSQLDB DATABASES]
    <context-param> <param-name>hsqldb-databases</param-name> <param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value> </context-param>
    [END HSQLDB DATABASES] -->
    and:
    <!-- [BEGIN HSQLDB STARTER]
    <listener> <listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class> </listener>
    [END HSQLDB STARTER] -->
  7. If you have generated metadata using tools or by downloading samples, you should check
    biserver-ce-old/pentaho-solutions/admin/resources/metadata/
    and copy your ‘agile’ file to your new folder.
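
If you prefer the terminal to a file explorer, steps 2 to 5 could look roughly like the sketch below. The function names copy_old_config and list_missing_jars and the .dist suffix are my own choices for the example. The second function only lists candidate jars instead of copying them, because some of the "missing" files may just be older versions of libraries that the new server already ships under another name.

```shell
# Sketch of steps 2-4: copy the listed config files from the old tree.
# copy_old_config and the ".dist" suffix are illustrative names.
copy_old_config() {
    old=$1   # e.g. /Pentaho/biserver-ce-old
    new=$2   # e.g. /Pentaho/biserver-ce
    for rel in \
        tomcat/webapps/pentaho/META-INF/context.xml \
        tomcat/conf/Catalina/localhost/pentaho.xml \
        pentaho-solutions/system/hibernate/hibernate-settings.xml \
        pentaho-solutions/system/applicationContext-spring-security-hibernate.properties \
        pentaho-solutions/system/applicationContext-spring-security-jdbc.xml \
        pentaho-solutions/system/simple-jndi/jdbc.properties \
        pentaho-solutions/system/olap/datasources.xml
    do
        if [ -f "$old/$rel" ]; then
            mkdir -p "$(dirname "$new/$rel")"
            # keep the file shipped with the new release for a later diff
            if [ -f "$new/$rel" ]; then cp -p "$new/$rel" "$new/$rel.dist"; fi
            cp -p "$old/$rel" "$new/$rel" && echo "copied  $rel"
        else
            echo "missing $rel" >&2
        fi
    done
}

# Sketch of step 5: print jars present in the old tomcat/lib but not in
# the new one, so you can pick your drivers by hand.
list_missing_jars() {
    old=$1; new=$2
    for jar in "$old"/tomcat/lib/*.jar; do
        [ -f "$jar" ] || continue
        name=$(basename "$jar")
        [ -e "$new/tomcat/lib/$name" ] || echo "$name"
    done
}
```

Keeping the .dist copies makes the ‘diff’ mentioned at the top of this post easy: compare each copied file with its .dist neighbour to see what changed between releases.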

The following modifications may vary and are optional; they were mentioned in Post 5 – Your Server:

  1. Copy your company logo: replace these files with the ones from your biserver-ce-old folder:
    biserver-ce/tomcat/webapps/pentaho/mantle/themes/onyx/images/logo.png
    biserver-ce/tomcat/webapps/pentaho-style/images/login/logo.png
  2. Hide the ‘demo users’ tip on the login screen. Edit:
    biserver-ce/pentaho-solutions/system/pentaho.xml
    Set these properties to false:
    <login-show-users-list>false</login-show-users-list>
    <login-show-sample-users-hint>false</login-show-sample-users-hint>
  3. Increase the timeout. Edit two files:
    biserver-ce/tomcat/conf/web.xml
    biserver-ce/tomcat/webapps/pentaho/WEB-INF/web.xml
    Search for session-timeout and change it from 30 to the number of minutes you prefer:
    <session-timeout>180</session-timeout>
    While you are editing, you can add this at the bottom, just after:
    <!-- insert additional resource-refs -->
    Add:
    <security-role>
       <description>security role</description>
       <role-name>PENTAHO_ADMIN</role-name>
    </security-role>
    It removes a warning from the logs at startup, but there are too many of them now.
  4. Add your publishing password. Edit:
    biserver-ce/pentaho-solutions/system/publisher_config.xml
    Set the property to your password
    <publisher-password>[your-password]</publisher-password>
  5. Remove the JPivot and WAQR warnings.
    Those tools are deprecated and you should be using the Saiku tools, but if you have reports running on them, the warnings are not nice for your clients. Go to:
    biserver-ce/tomcat/webapps/pentaho/adhoc/styles/
    and add at the bottom of these files, jpivot.css and jpivotIE6.css:
    #deprecatedWarning { display: none; }
    And add to adhoc.css (in the same folder):
    #waqrDeprecatedAlert { display: none; }
  6. On a production system, I hide the .CDA files; it’s no use for the end user to see the datasource in the tree view. Edit:
    biserver-ce/pentaho-solutions/system/cda/plugin.xml
    And comment out the lines between:
    <!--
    <content-types>

    </content-types>
    //-->
  7. If you need advanced guidelines on changing the HTTP server name and port, connection pools, configuration for a database on another machine, modifying the messages, email account setup, automatic startup, etc., check Post 5 for config notes.
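
Since you will probably repeat step 5 on every upgrade, here is a small sketch that appends the two CSS rules only when they are not already there. The helper names append_once and hide_deprecation_warnings are mine; the file names and rules are the ones from step 5, and the styles folder is passed in as an argument.

```shell
# Append a line to a file unless an identical line already exists.
# append_once is an illustrative helper, not part of Pentaho.
append_once() {
    file=$1; line=$2
    [ -f "$file" ] || { echo "not found: $file" >&2; return 1; }
    grep -qxF -- "$line" "$file" && return 0      # already there
    # add a newline first if the file does not end with one
    [ -z "$(tail -c 1 "$file")" ] || printf '\n' >> "$file"
    printf '%s\n' "$line" >> "$file"
}

# Apply the rules from step 5 to the three stylesheets.
hide_deprecation_warnings() {
    styles=$1   # e.g. biserver-ce/tomcat/webapps/pentaho/adhoc/styles
    append_once "$styles/jpivot.css"    '#deprecatedWarning { display: none; }'
    append_once "$styles/jpivotIE6.css" '#deprecatedWarning { display: none; }'
    append_once "$styles/adhoc.css"     '#waqrDeprecatedAlert { display: none; }'
}
```

Running hide_deprecation_warnings twice on the same folder leaves each file with a single copy of its rule, so it is safe to keep in an upgrade script.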

Part IV: Copy your Solution Folders

  1. Check for your work (solutions) in the old
    pentaho-solutions/
    and copy them to your new folder. I have two solutions and a samples folder from old BI server installations.
  2. If you have kept your .xmi and other configuration files in each solution folder, you can upgrade this easily.

That’s it: start your server, you’re on 4.8 now. From now on you’ll need to use your own user and password.

P.S. Remember to reload the settings in the browser with the menu -> Tools -> Refresh and execute all the options, and clear your browser cache.

Complete Pentaho Installation on Ubuntu, Part 14

By now you know the tools to get data from any data source in your company, clean and transform it, add appropriate key performance indicators (KPIs), and present it in reports, Excel files or dashboards. Those are used through menus in the PUC or distributed automatically to your users, who act on them or explore the data.

For users who are especially statistically oriented, there is an additional set of tools, Weka, that allows machine learning and data mining. In its own words: “Its broad suite of classification, regression, association rules and clustering algorithms can be used to help you understand the business better and also be exploited to improve future performance through predictive analytics.”

Here is a nice presentation on the history of Weka.

How to Install

  • Download a stable version here.
  • Unzip its content into our Pentaho folder.
  • Open a terminal and navigate to Pentaho/weka-3-6-5/
  • Type:
    java -jar weka.jar
  • Now you can click on the explorer application button.

    WEKA Explorer


  • Note that these tools work on flat files that you prepare on your BI suite.

There is an executable Windows version, and notes for Mac, on this page.

Documentation

Article on forecasting using time series here.

Project documentation page: Tutorials (command line and GUI interface), Manuals, FAQ, Docs, API, Wiki. The Pentaho forum. Pentaho Weka Flyer.

YouTube videos: 1, classifier.

Other tools

http://www.r-project.org/, http://rattle.togaware.com/, http://www.knime.org/

Complete Pentaho Installation on Ubuntu, Part 13

Additional Software

There are lots of plugins that enrich the Pentaho BI suite. There is a plugin page for the BI Server PUC and another for the PDI. I still haven’t tried them all, but here are some interesting ones anyway:

BI Open Flash Charts

You can now add the Open Flash graphs to your dashboards, in addition to the Open Flash Chart and JFreeChart. This plugin was started from the Pentaho framework, so building with it should be familiar. But the stunning graphs come with v3, which is not open source.

Here are links to the FusionCharts Blog, the Open source version, and the Pentaho plugin.

You can download the 0.02 version and its samples.

To install

  1. Extract the zip file into the /Pentaho/biserver-ce/pentaho-solutions/system.
  2. Extract the samples zip file into the /Pentaho/biserver-ce/pentaho-solutions/bi-developers.
  3. Change the file system/pentaho.xml to include xfusion on the acl-files list:
    <acl-files>...,xfusion</acl-files>
  4. Open the Pentaho User Console (PUC) and refresh the solution repository.
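
Step 3 can also be done from the terminal. The sketch below assumes that the acl-files element sits on a single line and already holds at least one extension; add_acl_extension is a made-up helper name. It leaves a .bak copy of the file next to the edited one.

```shell
# Append an extension to the <acl-files> list if it is not listed yet.
# add_acl_extension is an illustrative helper; it assumes the element is
# on one line and is not empty.
add_acl_extension() {
    file=$1; ext=$2
    # already present as a whole list item? then do nothing
    if grep -Eq "<acl-files>([^<]*,)?$ext(,[^<]*)?</acl-files>" "$file"; then
        return 0
    fi
    sed -i.bak "s|</acl-files>|,$ext</acl-files>|" "$file"
}
```

For this plugin the call would be add_acl_extension /Pentaho/biserver-ce/pentaho-solutions/system/pentaho.xml xfusion, followed by the repository refresh from step 4.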

Here you can see installation, demo and usage in a Youtube video.

Ruby Step for PDI

Slawomir Chodnicki released the Ruby plugin step for Kettle 4 in March 2011, here.

There are examples included in the file you should download from the GitHub page. Click on downloads and then select RubyPlugin_1.1_Kettle_4.2.zip. You already know that it should be unzipped into the plugins folder.

Some of its characteristics are mentioned in a forum post.

Excel Writer Output PDI Plugin

It is no longer a plugin, as it is included in the 4.2 Kettle release, but posts about its usage are still labeled as that 🙂

It lets you set more options for formatting and range.

Release notes and usability post.

BI iPhone plugin

Edit: The information below is no longer accurate:

  • Since 3.8 the BI server includes the iPad code, as stated by richad3 on the BI forum. But for 4.0 improvements were made on a week of fun.
    It seems the Enterprise plugin works great with the iPad, check the video. On the CE Edition the PAT/jPivot should work too, I’ll let you know what I find.
  • An alternate option for mobile devices is made on this OS project: PentaGoMo
  • The new site redesign has made the original code (the one that needed fixing) unavailable redirecting everything here.
    That’s bad if you still want to play with your Blackberry, Android or iPhone devices. I’ll let you know what I find.

The original information will remain here until new links and information is found:


The original article about a plugin for the iPhone is as old as the device itself; it was made available by the now vice president of engineering at Pentaho.

The BI PUC detects the browser and presents a special menu. When you select an action (a report, dashboard, etc.), a special program makes the parameter selection easier and presents the parameters one by one. Then the report is shown, as you can see in this video.

Unfortunately, some corrections have to be made to the code for it to work with the new Pentaho BI version. Here are the download and correction instructions from Will Gorman in 2008, and additional ones from 2011 by Herwin Rayen. And a tech-tip so you can modify it further in BI version 3.0.

There is also an Android app in the marketplace; here is the forum post that mentions it and the link to the 3.5 version.

In the Pentaho Blog an announcement was made in summer 2011 about an iPad prototype. If you can’t wait “expand the plug in to cover iPad, just modify the Java source code, recompiled its class, then updated the JAR accordingly. This method can be used to expand the plug in to cover Windows Mobile, BlackBerry, and Android” – Paul Pambudi.

If you don’t have all these devices, you can use emulators: the User Agent Switcher add-on for Firefox, or the web-based iPhone browser emulator at testiphone.com.

BI BIRT Report ‘Plugin’

You can use the BIRT report view engine in the BI Pentaho User Console (PUC) which is newer and different than the plugin that is used in pentaho.

You have to download an Eclipse runtime and extract a directory to your Pentaho tomcat/webapps/pentaho folder, and also download the samples to your pentaho-solutions folder. Check the description, instructions and samples here.

Or you can use the plugin for PDI/Kettle to just run and burst your BIRT specifications. Check this link.

Data Cleaning [future] Plugin for PDI

Data validation needs coding in the current PDI, but a nice open source utility for validation and correction exists, and it seems it will become part of the PDI soon.

Data Cleaner can analyze, profile, transform and clean data on its own. But Matt Casters is working on a plugin so Kettle can use it. Here is the link that briefly shows its functionality and mentions the plugin here.

GeoSpatial Analysis on Kettle

In July 2011, version 2.0 of GeoKettle was announced by Spatialytics.com here. It is a step add-in for Kettle 4.0 that allows “spatial analysis functions such as buffer calculations, overlays, metric operators, etc” from and to different file formats. It even reads sensors. Sounds like fun.

Weka Plugin

You can use the Knowledge Flow Plugin, which lets you use a Weka predictive model as a step in a PDI transformation. Installation and usage here.

Complete Pentaho Installation on Ubuntu, Part 10

Install Pentaho Report Designer (PRD)

If you have installed the BI Server, PDI (ETL), PDS (sequence of actions) and the modeling tools (query metadata and OLAP workbench), you have all you need to provide access to your data and automate extractions, even mail distribution, using tables or Excel-formatted files.

Now, with the report application, you can create specially formatted output presentations with areas, colors, tables and graphs. Using subreports you can mash them up, even from different datasources: JDBC, metadata, ETL.

I prefer to publish the reports to the BI server but you can use the PRD (Pentaho Report Designer) as a stand-alone application, just like the PDI.

Installation

  1. Download a stable version from the Pentaho project on SourceForge.
    Or you can download the 3.8 release candidate [67MB].
  2. Extract its content in the /Pentaho folder so you end up with a:
    /Pentaho/report-designer/
  3. Make sure the *.sh files are executable.
  4. Edit the
    Pentaho/report-designer/configuration-template/simple-jndi/default.properties
    change the database strings to MySql like in
    Pentaho/biserver-ce/pentaho-solutions/system/simple-jndi/jdbc.properties
    to use the JNDI connection options.
  5. Start the app with:
    $ ./report-designer.sh
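
For step 4, one way to avoid retyping the connection strings is to copy the entries of a given JNDI name from the server file into the PRD file. This is a sketch under two assumptions: both files use one name/key=value entry per line (as in SampleData/url=...), and sync_jndi_entry is just a name I made up. It leaves a .bak copy of the PRD file.

```shell
# Replace all "NAME/..." entries in the PRD properties file with the
# ones from the BI server's jdbc.properties.
# sync_jndi_entry is an illustrative helper, not part of Pentaho.
sync_jndi_entry() {
    name=$1     # JNDI name, e.g. SampleData
    src=$2      # .../biserver-ce/pentaho-solutions/system/simple-jndi/jdbc.properties
    dst=$3      # .../report-designer/configuration-template/simple-jndi/default.properties
    grep -q "^$name/" "$src" || { echo "no $name/ entries in $src" >&2; return 1; }
    cp -p "$dst" "$dst.bak" || return 1
    # drop the old definition, then append the server's one
    grep -v "^$name/" "$dst.bak" > "$dst" || true
    grep "^$name/" "$src" >> "$dst"
}
```

Entries for other JNDI names in the PRD file are left untouched, so you can run it once per connection you want to share with the server.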

In the startup dialog you can check the samples, the wizard, or start from zero. Now you have a standard reporting tool, where you define a datasource and then drag and drop fields into header, grouping or detail bands, then publish your reports.

Report-Designer Sample


Tutorial

Michael Tarallo from Pentaho has made an excellent tutorial about using the Pentaho Report Designer in a series of short videos. I think this is the place to start.

And a step by step guide in the Pentaho Wiki for creating your first report.

Example from the Beginning

Before I found the tutorial, these were my guidance notes.

  1. Click the New Button.
  2. In right panel, select the Data tab
  3. Right click on Data Set
    choose one of the options: JDBC (/JNDI), metadata, ETL (PDI), OLAP, XML or Table.

    Report DataSource

  4. If you choose Metadata:
    1. Browse to:
      Pentaho/biserver-ce/pentaho-solutions/steel-wheels/metadata.xmi
    2. In the dialog, type:
      SteelWheels for Domain
      Click on the Query plus sign and set a name for the query name
    3. And build a query by clicking on the pencil button.
      Like this:

      Report-Designer Medatdata


    4. If you click ok, you’ll see all your queries and parameter datasources. Use preview to check your connection and data.

      report-designer Query


  5. If you choose JDBC:
    1. Select SampleData (our MySql datasource)
    2. Click on the plus button and define a query by selecting tables and fields; the joining fields are assumed when the names match.
    3. The designer is easy to learn or you can build your query on the MySql Browser and type it in the query field.
  6. After you close the datasource dialog you can drag columns to your report area and change the display properties on the Structure Tab.

Now

  1. Check the samples.
  2. Check this great post that uses sparklines: Report Parameter & Sparkline.
    and while you’re at it, read about Edward Tufte’s data visualization  history and articles.

Good luck!

[Edit] Resources

The PRD can replace the .xaction files for the initial parameter information. Then it can get data from OLAP, databases or ETL jobs and mash them into a page or file, which makes it a very interesting tool. But each function has its details, so here are some of the articles about them:

  • Creating Parameters with Pentaho Report Designer: prashantaju.com
    Shows how to build a query, ask for a parameter, then modify the query to use it.
  • Several parameter tips on diethardsteiner.com: using it with queries, metadata, olap, single and multiple values, formulas and more.
  • PRD parameter type definition in pentaho.wiki
    Defines each type, shows the difference between ‘date’ and ‘date (sql)’.
    An illustration of each type and how it appears in the PUC at prashantaju.com
  • Showing multivalue list and SQL query on prashantaju.com
  • How to ask for a SQL query parameter ‘* = all’ option: diethardsteiner.com
  • Calculate previous date with functions: bizcubed.com
  • Using ‘date picker’ and format date bizcubed.com
  • Cascade parameters (the results of one parameter depend on the previous one): prashantraju.com
  • Overview of PRD 3.6 in five interesting videos.
  • Wiki articles.
    Including this introductory guide.
  • PRD documentation.
  • Subreports:
    passing parameters and working with subreports on pentaho.wiki.
    Subreports, multiple reports, passing parameters on bizcubed.com.
  • Limit output type (PDF, HTML) parameters on sherito.org
    Set the default with attribute on setting options. Check the tip on the next section.
  • Tutorial and tips on PRD classic (that’s 3.5 and lower) at bizcubed.com, useful for maintenance. The tutorial for current version is also on bizcubed.com
  • Explanation on parameters and GWT fundamentals at sherito.org
  • Article on building interactive reports.
  • Note on How to call stored procedure/kettle files in Pentaho designer.
  • Pentaho Reporting 3.5 for Java Developers, Book Sample: booksonline.
  • Change row background color at guru4us.
  • Control page break in style sub section also at guru4us.
  • Using a chart object to display a value against results on prashantahu.
  • Link a report in html. In the example a top ten report lets you click on a single item.
  • Crosstabs.
  • An article about BIRT vs Jasper vs PRD.

Eight Tips

  1. Remember to use unique names for your parameters.
    If you use the same name as one of your data columns, even in subreports, it will confuse the PRD and the value will be substituted by null.
  2. To ask for a date:
    • Add a parameter:
      Right click on right panel, option data, at the bottom, like
      – name: DATEGIVEN
      – type: Date (SQL)
      – format: MM/dd/yyyy
      – timezone: use server timezone
      this is important, or it will add or subtract additional hours
      – a default date, for example January 31: 01/31/2011
      – Mandatory
      – Display type: Date Picker
    • Add another parameter that will be the one in our query:
      – name: DATESTR
      – type: String
      – formula: =MESSAGE("{0,date,yyyyMMdd}";[DATEGIVEN])
      – Hidden
      – display type : None

      Parameter Date


    • You can use ${DATESTR} in a query or
      DATESTR in PDI ‘getsystem info’ step as argument #.
  3. To change the order in which the parameters appear in the bottom-right panel, right click on them and select bring them forward (= up) or back (= down) on the menu.
  4. To add a subreport you use the icon in the left toolbar at the bottom.
    Drag the icon from the left toolbar and you will be prompted to use it as an area or as a band (this one uses the horizontal area). Each one can use its own datasource, graphs, etc.
  5. If you use subreports, you need to define first the datasources on the master reports.
    When you open the subreport you need to define the parameters in the data panel, in a confusing two-list import/export dialog. Use the same names and order as in the first list.
  6. Using the attributes panel, you can remove the auto-submit button parameter, or set the default output type, as seen in the picture.
    PRD attributes


    To remove the auto-submit button permanently for all reports:
    Check the section “How to turn off auto-submit button in Pentaho?” in this wiki that says you need to edit the file:
    /Pentaho/biserver-ce/pentaho-solutions/system/reporting/plugin.xml
    And change:

    <id>RUN</id>
    <command>content/reporting/reportviewer/report.html?solution={solution}&amp;path={path}&amp;name={name}&amp;locale={locale}</command>
    </operation>

    Add “&amp;autoSubmit=false” like this:

    <id>RUN</id>
    <command>content/reporting/reportviewer/report.html?solution={solution}&amp;path={path}&amp;name={name}&amp;locale={locale}&amp;autoSubmit=false&amp;layout=flow</command>
    </operation>

    You can also add “&amp;layout=flow” to make all parameters appear ‘inline’ instead of each one in its own row, or change the default option for the output type. Either way, you need to restart the server to see the changes.

  7. You can use the same report and give different levels of summarization. You only need a parameter that gives the options and define in the report the hide functions in the detail or header/footer bands. [Bizcubed original article]
    • Define parameter SHOW-TOTALS.
      String value on a drop down table with Yes and no options.
    • In the report structure, in Details level (under Details Body), find in the top Format menu ‘conditional Hide’ and type:
      =IF([SHOW-TOTALS]="NO";"True";"False")
    • If there are more levels than detail and summary, you’ll need to use a more complex OR function:
      =IF(OR([SHOW-TOTALS]="T";[SHOW-TOTALS]="D");"False";"True")

    You can check the style tab on the bottom right panel: in the ‘size & position’ group you’ll see the function in the visible attribute, third column = function.

  8. Another function created by the Format menu is row banding, or alternate color banding, where you can choose from some colors in a dialog.
    But as is explained on this forum thread, after using the menu you can go to the data tab on the top right panel and search for the function ‘row banding’, click on it and in the bottom panel choose the color you really want.
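
Going back to tip 2: the DATESTR formula only reformats the picked date into a yyyyMMdd string. If you want to check the value your query or PDI job should receive, the same conversion can be sketched in the terminal. This is not PRD code, just an illustration; to_datestr is a made-up name and it expects the MM/dd/yyyy format from the tip.

```shell
# Convert MM/dd/yyyy (the date picker format in tip 2) to yyyyMMdd,
# which is what the DATESTR formula produces.
# to_datestr is an illustrative helper, not part of PRD.
to_datestr() {
    in=$1
    mm=${in%%/*}; rest=${in#*/}
    dd=${rest%%/*}; yyyy=${rest#*/}
    # strip one leading zero so printf does not read 08 or 09 as octal
    printf '%04d%02d%02d\n' "$yyyy" "${mm#0}" "${dd#0}"
}
```

For the default date used in the tip, to_datestr 01/31/2011 prints 20110131.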