Pentaho Released Version 5.2 CE: This Week (october 6)!

Pentaho & Webdetails dit it again, as promised, on October, the new GA release, I can’t wait to see whats new (new theme & marketplace, improved PRD COSSTABS?) or check the release notes. Still no integrated Saiku on CE but you cant have it all 🙂

These are direct links:

BI Server 5.2

http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/5.2/biserver-ce-5.2.0.0-209.zip/download

Data Integration 5.2
http://sourceforge.net/projects/pentaho/files/Data%20Integration/5.2/pdi-ce-5.2.0.0-209.zip/download

Report Designer 5.2
http://sourceforge.net/projects/pentaho/files/Report%20Designer/5.2/prd-ce-5.2.0.0-209.zip/download

Pentaho Metadada 5.2
http://sourceforge.net/projects/pentaho/files/Pentaho%20Metadata/5.2/pme-ce-5.2.0.0-209.zip/download Mondrian

Mondrian Schema Workbench 3.8
http://sourceforge.net/projects/mondrian/files/schema%20workbench/3.8.0/psw-ce-3.8.0.0-209.zip/download

Mondrian Agregation Designer 5.2
http://sourceforge.net/projects/mondrian/files/aggregation%20designer/5.2/pad-ce-5.2.0.0-209.zip/download

The steps I took for the upgrade are on a new post.

Yeeeeiiii!

Bullet Graph Dashboard: PDI-MDX-SQL DataSource + BI Server + CDE

To Do

The idea to build this dashboard comes from joining Stephen Few’s spec [pdf], an excelent implementation from Slawomir Chodnicki and finding out that the C*Tools (Webdetails dashboard project) has published a Bullet Graph widget from the Protovis Library that is easier to use and customize.

Dashboard PDI Final

Final Dashboard

Components

We’ll need:

  1. CDE Editor. Install or update to the latest version [Instructions]
    Used to create and edit the parameter, datasource and mappings to the graph.
  2. Pentaho PDI/Kettle ETL tool [Download].
    Used for data retrieval by selected parameters, normalization and PKI process.
  3. The Pentaho BI Sever
    To excecute the dashboard via its menu and see the html result.

Building The Layout

Lets start a new CDE dashboard by a click on the CDE icon in the top toolbar: CDE icon

Header Rows

The CDE Editor will open with ‘Layout’ selected. This is the asociated toolbar:
cde layout toolbar

  • Click on the row icon (=)
    Select Row
    Set:

    • Name=header
    • BackgroundColor=99a5de
      (in the bottom text field, close clicking on the color ball)
    • In the ‘Corners’ attribute. Use the down-arrow to scroll to Round
  • Click on the row icon (=) three more times to have a total of 4 rows.
    Be carefull, the toolbar is dynamic and the icons will change.
  • On the first and second rows add one ‘Html’ element (<>)
  • On the third row, set Name=ParamRow
  • Type in the first html type a title:
    <h2>Sales Dashboard By Product Line</h2>
  • On the second Html type:
    <div class="topRow1">Sales YTD</div>
    <div class="topRow2">Actual amount (horizontal bar) and Previuos Year (blue mark)</div>
    <hr class="topRowHR"/>

It should look like this.

Dash 00

Save in your development folder with a meaningfull ttitle (used by the BI menu system)  and click Preview to see the headers. Close the dialog.

Parameter Row

  • The third row will need four columns.
  • The first and the third column will have also an Html element.
  • Put in the first one the text Year, and in next Month.
  • Select the second column and look for the name attribute set it to nYear (node for year select).
    Name the forth one nMonth.

Result Row

  • Add two columns
  • Name the first one nResultGraph
    and the other nResultTable
  • Set for each column span size=12 (there’s a total of 24 columns in the template)

Save the final layout, that should look like this.

Dash n

Important: If the message “Dashboard saved successfully” doesn’t appear, check your mofifications for errors. If you can’t fix it, exit and resume from the latest point saved.
To edit a dashboard, select it on the browse panel in the BI Server. Right click on it and then select Edit in the context menu.

Building Datasources and its Widgets

Three types of datasources:

  1. Two SQL queries are needed for Year and Month.
  2. The table will be  a simple MDX.
  3. The Graph will need additional columns for ranges, we’ll use an ETL transformation for this one.

SQL

Click on Data Sources ‘button’ in your top-right. The left side of the CDE will list all the connections the CDE can use to get data.

  • Click on SQL Queries, then on sql over sqlJndi
    The group SQL Queries will be added
  • Set:
    name=dsYear
    Jndi=SampleData
    query=
SELECT DISTINCT YEAR_ID FROM DIM_TIME ORDER BY YEAR_ID DESC

Click on Components ‘button’ in your top-right.

  • We need a parameter to hold the Year selection
    • Click on the Generic button on the left list
    • Click now on Simple parameter
    • Set Name=pYear
      Set de default to Property value=2005
    • Save
  • Drop Down
    • Click on Selects button on the left list
    • Click now on Select Component
    • Set Name=csYear
      Parameter=pYear (the one defined above)
      HtmlObject=nYear (attach point, that is the column we named that way in the layout)
      Datasource=dsYear
    • Save
  • Click Preview. Check the select box. Close the preview dialog.

For the Month SQL, the set up is a little diferent as we need to set up a listener for changes on the Year select box and then pass that parameter to the Component and then the SQL datasource.

  • Click on Data Sources
    • Click SQL Queries, then sql over sqlJndi.
    • Set:
      name=dsMonth
      Jndi=SampleData
      Click on Parameters, click on the Add Button.
      in the dialog: Name=pYear, value=2005, Type=String
      query=

      SELECT DISTINCT MONTH_NAME
       FROM DIM_TIME
       WHERE YEAR_ID=${pYear}  ORDER BY MONTH_ID
  • Click on Components
    • Click on the Generic button
    • Click on Simple parameter
    • Set Name=pMonth
      Set de default to Property value=May
    • Click on Selects button on the left list
    • Click now on Select Component
    • Set Name=csMonth
      Parameter=pMonth
      HtmlObject=nMonth
      Click on Listeners, on the select arrow choose pYear
      Click on Parameters, click on the Add button,
      in the dialog set Arg=pYear, Value=pYear (you can select with the “…” button)
      Datasource=dsMonth
    • Save

If you click Preview you should see something like this.

Preview 01

MDX

Let’s define the query against the ‘Steel Wheels Sales’ cube as our Datasource. There are some links at the bottom of the article about MDX and sumary functions.

  • Click on
    • Data Sources
    • MDX Queries
    • mdx over mondrianJndi.
  • Set:
    • name=dsTable
    • Mondrian squema=SteelWheels
      (jndi will be set to SampleData)
    • Query=
      with
      member [Measures].[Sales_Curr] as 'Sum(Ytd([Time].CurrentMember), [Measures].[Sales])'
      member [Measures].[Sales_Prev] as 'Sum(Ytd(ParallelPeriod([Time].CurrentMember,1)), [Measures].[Sales])'
      select
      {[Measures].[Sales_Prev], [Measures].[Sales_Curr]} ON COLUMNS,
      {[Product].Children} ON ROWS
      from [SteelWheelsSales]
      where [Time].[2005].[QTR2].[May]

And now the grid element to display our data.

  • Click on:
    • Components
    • Others
    • Table Component
  • Set:
    • Name=ctTable
    • HtmlObject=nResultTable
    • Datasource=dsTable
  • Save & Preview
  • Now click on Advanced Properties (below the Component button)
  • Click on:
    • Column Headers
      add button for: Sales, Previous, Actual
    • Column Formats
      add button for: %s, $%.2f , $%.2f
      Those are for: string (%s) and currency with two decimals ($%.2f)
  • Set:
    • Paginate=False
    • Show Filter=False
    • Sort Data=False
  • Save & Preview

For testing and format purposes the used query doesn’t change when the month or year are modified. Now we need to add listeners, the parameters in the component and the datasource to calculate the ‘quarter’ depending on the month.

Click on Components (you may be already there)

  • Click on:
    • (left) Generic button
    • Simple parameter
  • Set:
    • Name=pQtr
    • Property value=QTR2

Click on Data Sources

  • Click on:
    • (group triangle) MDX Queries
      select dsTable
    • Parameters
  • Add three lines
    • Name=pYear, Value=2005, Type=String
    • Name=pQtr, Value=QTR2, Type=String
    • Name=pMonth, Value=May, Type=String
  • Click on Query
    change last line from:

    where [Time].[2005].[QTR2].[May]

    To:

    where [Time].[${pYear}].[${pQtr}].[${pMonth}]
  • Save & Preview.
    You should see the same image than before. Close the dialog.

Click on Components

  • Click on:
    • (group triangle) Others
      select ctTable
    • Listeners
      Select in the drop down: pYear, pMonth. Click OK
    • Parameters
  • Add three lines:
    • Arg=pYear, Value=pYear
      (you can use the three dots (…) button to select  value)
    • Arg=pQtr, Value=pQr
    • Arg=pMonth, Value=pMonth
  • Click on:
    • Advanced Properties (at the top link on your right)
    • preExecution
  • Paste this javascript function to set Quarter based on month selected:
    • function Q (){
      var vQtr='';
      switch (pMonth) {
      case 'Jan':
      case 'Feb':
      case 'Mar':
      vQtr='QTR1'; break;
      case 'Apr':
      case 'May':
      case 'Jun':
      vQtr='QTR2'; break;
      case 'Jul':
      case 'Ago':
      case 'Sep':
      vQtr='QTR3'; break;
      case 'Oct':
      case 'Nov':
      case 'Dec':
      vQtr='QTR4'; break;
      }
      pQtr=vQtr;
      }
  • Save & Preview.
    Try changing the month to ‘Jan’. You should see something like this.Table Preview

Close the preview dialog.

ETL

In he last part we’ll use an ETL process and the Bullet Graph widget.

Start the PDI. Create a new transformation and save it in the demo/development folder with the name SalesDashboard.ktr.

This is the PDI transformation

Dashboard PDI

Dashboard PDI Transformation

  1. Prepare the default parameters for testing.
    On the menu click Edit ->Parameters
    Select Parameters Tab, Define pYear, pQtr, pMonth.
    Dash PDI 00
  2. Set the MDX
    Add Input->Mondrian Input
    Dashboard PDI 01
    Check the ‘Replace variable’  checkbox so the parameters are used.
    The query is:
with
member [Measures].[Sales_Curr] as 'Sum(Ytd([Time].CurrentMember), [Measures].[Sales])/1000'
member [Measures].[Sales_Prev] as 'Sum(Ytd(ParallelPeriod([Time].CurrentMember,1)), [Measures].[Sales])/1000'
select
{[Measures].[Sales_Prev], [Measures].[Sales_Curr]} ON COLUMNS,
{[Product].Children} ON ROWS
from [SteelWheelsSales]
where [Time].[${pYear}].[${pQtr}].[${pMonth}]
  • Rename columns
    Here we copy columns to new ones but set the new names. This is necesary if you want to remove the dynamic nature of column names in MDX results, for example when you use time in columns then you may get [Time].[2005].[QTR2].[May] or [Time].[2005].[QTR1].[Jan] then you better use ‘CurrentMonth’ to use in PDI steps.
    Add Scrpting->Modified Java Script
    Dashboard PDI 02
    The script is:

    var rowOut = getOutputRowMeta();
    newRow = createRowCopy(rowOut.size());
    var ProductLine = rowMeta.getString(newRow, 0);
    var Actual   = rowMeta.getNumber(newRow, 1);
    var Previous = rowMeta.getNumber(newRow, 2);
  • Get the maximum value for each column.
    Here max value for Actual is renamed to Max1 and Previous o Max2.
    Add Statistics->Group by
    Dashboard PDI 03
  • Calculate max values and two ranges.
    Add Scrpting->Modified Java Script
    This is an oversimplified ETL, with ranges set on maximum values for all rows. In any case you could use KPIs from tables and a lookup step.
    – maximum value from both columns
    – round it up
    – I set two marks one at 70% other at 90%, for grade of color in the graph bar.
    Dashboard PDI 04
    – The sample script use is:

    var Maximum=0.0;
    var Mark1=0.0;
    var Mark2=0.0;
    if ( (Max1!=null) || (Max2!=null) )
    if ( Max1 > Max2 )
    Maximum = Max1;
    else
    Maximum = Max2;
    if (Maximum!=0.0) {
    if (Maximum>=1.0) {
    Maximum = Maximum / 10.0;
    Maximum = ceil(Maximum) * 10.0;
    } else {
    Maximum = Maximum / 1.0;
    Maximum = ceil(Maximum) * 1.0;
    }
    Mark1=Maximum*0.7;
    Mark2=Maximum*0.9;
    }
  • Process measures
    Each row may need processing, if KPI is set for each measure then it should be calculated on this thread. Here we just add a subtitle string (k=thousand).
    Add Transform->Add constants
    Dashboard PDI 05
  • Remove excesive text from product line description
    The first colum has text like “[Product].[Ships]”, we can remove the fixed part with a regular expression to replace the unwanted and fixed text
    Add Transform->.Replace in String Step.
    Type in Search:  ^\[Product\]\.\[(.*)\]$
    Notice that no output field is specified so ProductLine will be used, that the dot and squared brackets are ‘escaped’ with backslash and that the replace string is (.*)=$1
    Dash PDI Regex
    For using this step: Reference1, Reference2
  • Join the data and its ranges.
    Add Joins->Join Rows (cartesian product)
    Dashboard PDI 06
    The important thing is to state the step from where is the data coming from, if you don’t do this you end up missing data or even complete rows: Add constants.
  • Sort as you like.
    Add Transform->Sort rows
    Dashboard PDI 07
  • Output
    Make available only the relevant columns:
    – Measure Title
    – Measure Subtitle
    – Value of Bar
    – Value of Mark
    – Min Range (darker shade in bar)
    – Max Range (lighter shade in bar)
    – Max value
    Add Transform->Select values
    Dashboard PDI 08

Selecting this last step and previewing the output you get.

Dashboard PDI Result

You can close the PDI.

Bullet Graph

Back to the CDE Editor.

Click on Data Sources

  • Click on:
    • MDXKettle Queries button
    • ketle over kettleTransFormFile
    • Select the ketle over kettleTransFormFile Component
  • Set
    • Name=dsBulletGraph
    • Kettle Transformation File=SalesDashboard.ktr
    • Click on Variables, a dialog will open
      add three lines for:
      – Arg=pYear
      – Arg=pQtr
      – Arg=pMonth
    • Click on Parameters, in the dialog add three lines:
      – Name=pYear, Value=2005, Type=String
      – Name=pQtr, Value=QTR2, Type=String
      – Name=pMonth, Value=May, Type=String
    • Click on Output Options, in the dialog add six lines:
      – Index=0, Index=1, Index=2, Index=3, Index=4
      Index=5, Index=6
    • Click on Columns, in the dialog add six lines:
      – Index=0, Name=Product Line
      – Index=1, Name=Subtitle
      – Index=2, Name=Actual
      – Index=3, Name=Previous
      – Index=4, Name=Mark1
      – Index=5, Name=Mark2
      – Index=6, Name=Maximum
    • Click on Query=OutputBullet
      (last activity)
    • Save CDE

Click on Components

  • Click on:
    • Charts
    • CCC Bullet Chart
      a group will be added
    • Select the new CCC Bullet Chart Component.
  • Set
    • Name=cgBulletGraph
    • Width=420
    • Height=220
    • Datasource=dsBulletGraph
    • Clear Title and Subtitle
    • Click Bullet Ranges
      click on the minus button (-) to delete each line
      Close the dialog
    • Click on Parameters, in the dialog add three lines:
      – Arg=pYear, value=pYear
      – Arg=pQtr, value=pQtr
      – Arg=pMonth, value=pMonth
    • Set HtmlObject=nResultGraph
    • Click on Listeners
      Select pYear, pMonth

Save and preview. We are almost finished

Dashboard PDI Almost

Fine Formatting

There are three formating options in the CDE

  1. Format options in CDE, either in the component advanced properties or html or column layout
  2. CSS class from the web page
  3. Some properties of the components

To work with the CSS or DOM elements we add a resource file:

  • Create a text file and type:
    That is mainly header fonts and colors, parameter style and table formatting.

    #header { margin-top:1em; }
    h2 { font-size:1.4em; padding-top:7px; padding-left:1em; height:1.6em; }
    .topRow1 { font-size:1.2em; font-weight:700; margin-left:1em; }
    .topRow2 { font-size:1em; margin-left:1em; }
    hr.topRowHR { height:2px; width:900px; color:red; }
    #ParamRow { font-size:0.9em; vertical-align:bottom; margin-bottom:2em; }
    #ParamRow select { font-size:0.9em; vertical-align:top; }
    #ParamRow input { font-size:0.9em; vertical-align:top; }
    #nResultTable { margin-top:2em; }
    table tr.even, table tr.odd{ background-color: white; }
    table thead th { border-bottom: 1px solid #CCC; background-color: #fff; }
    table tbody td.number, th.number{ text-align: right; }
    table tbody td.string, th.string{ text-align: left; }
    table tfooter td { background-color: white; }
    .ui-widget-header { background: white; color: white; border-bottom:0px none; border-top:1px grey solid; border-left:0px none; border-right:0px none;}

    Save it in the demo/development folder as Dashboard.css

  • On the CDE, click on Layout option
  • Add a css resource file, select from the options:
    css
    external
  • Set:
    name=cssDashboard
    Resource file=Dasboard.css

This will be the result

Dash Final 1

We can modify the Graph if we access its properties, in this case the ‘Extension Points’. Lets change the Marker shape and color.

Click on Components

  • Click on the triangle of the Group Charts
  • CCC Bullet Chart
  • click on Advanced Properties.
  • Click on Extension points a dialog will open.
    Add Three lines. Fill with:

    • Arg:bulletMarker_shape  Value:bar
    • Arg:bulletMarker_lineWidth  Value:3
    • Arg:bulletMarker_strokeStyle  Value:#2e559c

Note:
Check posible values for Extension Points at CCC (Bullet Chart) and Protovis sites.
In the CCC (Bullet Chart) page check the keyword to access: base, mark, measure, title, rule, range. In the rest of the document and examples look for the second part: font, shape. Join them with an undescore: bulletMarker_lineWidth.
Or if you want to change the color of the inner bar, you could try:
Arg: bulletMeasure_fillStyle, Color: green or #0F0

You can use other properties to change the appearence, for example the tooltip to show the value of a value can ve modified.

Click in the Advanced Properties of the Graph, in Value format:

function f(value){return ('$'+sprintf("%.1f", value)+'m')}

The final product should be the initial dashboard image.

.

Additional Links

  • Bullet Chart on webdetails: here.
  • Class, Field, Method and Extension Points on webdetails: here.
  • MDX function YTD youtube video by phiintegration.
  • MDX Time Series Functions Part I. Series by William Pearson.
  • Another fine example about building CDE dasboards OSBI.
    Its in French, but the code and diagrams are easy to understand.

Complete Pentaho Installation on Ubuntu, Part 13

Additional Software

There are lots of plugins that enrich the pentaho BI suite. There is a plugin page for the BI Server PUC and for the PDI). I still haven’t tried them all but here are some interesting ones anyway:

BI Open Flash Charts

You can now add the Open Flash graphs in your dashborads in addition to the Open Flash Chart and JFreeChart. This plugin was started from the Pentaho framework so building with it should be familiar. But the stunning graphs come with a v3 which is not OS.

Here are links to the FusionCharts Blog, the Open source version, and the Pentaho plugin.

You can download the 0.02 version and its samples.

To install

  1. Extract the zip file into the /Pentaho/biserver-ce/pentaho-solutions/system.
  2. Extract the samples zip file into the /Pentaho/biserver-ce/pentaho-solutions/bi-developers.
  3. Change the file system/pentaho.xml to include xfusion on the acl-files list:
    <acl-files>...,xfusion</acl-files>
  4. Open the Pentaho User Console (PUC) and refresh the solution repository.

Here you can see installation, demo and usage in a Youtube video.

Ruby Step for PDI

Slawomir Chodnicki released on march 2011 the Ruby plugin step for Kettle 4 here.

There are examples included in the file you should download in the github page. Click on downloads and the select RubyPlugin_1.1_Kettle_4.2.zip. You already know thay it should be unziped on the plugins folder.

In a forum post some caracteristics are mentioned.

Excel Writer Output PDI Plugin

It is no longer a plugin as it is included in the 4.2 Kettle release, but post about its usage are still labeled as that 🙂

It let’s you set more options on formating and range.

Release notes and usability post.

BI iPhone plugin

Edit: The infomation below is no longer acurate:

  • Since 3.8 the BI server includes the iPad code, as stated by richad3 on the BI forum. But for 4.0 improvements were made on a week of fun.
    It seems the Enterprise plugin works great with the iPad, check the video. On the CE Edition the PAT/jPivot should work too, I’ll let you know what I find.
  • An alternate option for mobile devices is made on this OS project: PentaGoMo
  • The new site redesign has made the original code (the one that needed fixing) unavailable redirecting everything here.
    That’s bad if you still want to play with your Blackberry, Android or iPhone devices. I’ll let you know what I find.

The original information will remain here until new links and information is found:


The original article about a plugin for the iPhone is as old as this device, it was made available by the now vicepresident of engineering in Pentaho.

The BI PUC detects the browser and present a special menu. When you select an action (for report, dashboard, etc) a special program makes the parameter selection easier and presents one by one. Then the report is shown, you can see this video.

Unfortunatelly some corrections have to be made to the code to work with the new Pentaho BI version. Here are the download and correction instructions from Will Gorman on 2008 and additional ones on 2011 by Herwin Rayen. And a tech-tip so you can modify it further in the BI version 3.0.

There is also an Android app in the marketplace, here is the forum post that mentions it and the link to the 3.5 version.

In the Pentaho Blog an anouncement was made on summer 2011 about a iPad prototype. If you can’t wait “expand the plug in to cover iPad, just modify the Java source code, recompiled its class, then updated the JAR accordingly. This method can be used to expand the plug in to cover Windows Mobile, BlackBerry, and Android” – Paul Pambudi.

If you dont have all these devices you can check the emulators in Firefox’: User Agent Switcher add-on or testiphone.com web-based iPhone browser emulator.

BI BIRT Report ‘Plugin’

You can use the BIRT report view engine in the BI Pentaho User Console (PUC) which is newer and different than the plugin that is used in pentaho.

You have to download an eclipse runtime and extract a directory to your pentaho tomcat/webapss/pentaho folder and also download the samples to your pentaho-sulutions folder. Check the description, instructions and samples here.

Or you can use the plugin for PDI/Kettle to just run and burst your BIRT specifications. Check this link.

Data Cleaning [future] Plugin for PDI

Data validation needs coding in the actual PDI, but a nice open source utility for doing validation and correction exist and it seems it will become part of the PDI soon.

Data Cleaner can analyze, profile, transform and clean data on its own. But Matt Casters is working on a plugin so Kettle can use it. Here is the link that briefly shows its functionality and mentions the plugin here.

GeoSpatial Analisys on Kettle

On July 2011 version 2.0 of GeoKettle was announced by Spatialytics.com here. It is a step add in for Kettle 4.0 that allows “spatial analysis functions such as buffer calculations, overlays, metric operators, etc” from and to different file formats. It even reads sensors. Sounds like fun.

Weka Plugin

You can use the Knowledge Flow Plugin, that lets you use a weka predictive model as a step in a PDI transformation. Install an usage here.