Complete Pentaho Installation on Ubuntu, Part 8

Install the Schema Editor (OLAP).

The modeling tool for creating star schemas a.k.a. dimensional models is the Workbench Editor, courtesy the mondrian project developers.

Videos shown by Pentaho with the 4.0 version using Agile BI, suggests that you will be able to avoid using this editor for starting and simple OLAP models. On them you start with a datasource, explore interactivelly to create a pivot table in the BI web interface and finally save the query and models. This is very powerfull as usually you discover valuable relations exploring and adding data to your models.

In the community edition we can do:

  • In the ‘good-old’ way  <g>. First build a basic OLAP model, load it into the server, test it, explore the information and add more al over again.
    The Editor helps a lot and its usability is fine. Also the XML file is easy to read with a normal editor -you should do it to learn more on how mondrian will join the tables-. The tool documentation is clear enough to start building models, the hard part is learning the MDX syntax and how you query your models.
  • Using the new model perspective in PDI [development environment].
    Select the model button (1), click the select datasource button (2) and your connection and table (I guess one view o table with all our measures and select fields), then OK and click on the auto-populate button (4), you can trim the model and save it [.xmi], then press the go button (5) to start playing with your [snapshot] data.

    Madel view in PDI

    Madel view in PDI

    I could not make it work from a transformation-step, right clicking and selecting model or visualize, that would be awesome. In this pdf it seems to work on the enterprise edition.

Installation

Steps to install the workbench editor:

  1. Download the latest stable version from the sourceforge project page.
  2. Unpack it into the /Pentaho folder.
  3. Copy MySql driver (mysql-connector-java-3.1.14-bin.jar or mysql-connector-java-5.0.7.jar) from the PDI (/Pentaho/data-integration/libext/JDBC) or BI server (/Pentaho/biserver-ce/tomcat/lib) folder into:
    /Pentaho/schema-workbench/drivers

Steps to use it:

  1. Check if you have set up a BI Server publishing password (view post 5).
  2. Make shure the set-pentaho-env.sh and workbench.sh files are executables in the schema-workbench folder (use the nautilus file browser or sudo chmod +x *.sh command). You can delete the .bat files.
  3. Start the app from the terminal:
    /Pentaho/schema-workbench$ ./workbench.sh
  4. Define the connection in the menu: options->connections.
    You should be familiar with this dialog by now. Use a native (JDBC) and the pentaho_user to the datamart database in the localhost server.
  5. I suggest you open the example from:
    /Pentaho/biserver-ce/pentaho-solutions/steel-wheels/analysis/steelwheels.mondrian.xml

    Schema editor

  6. When you need to update your model select File->Publish
    Fill the publish password information, as well as user and password (joe/password). The next dialog lets you select the folder and confirm the datasource (SampleData or Datamart in our examples). The next dialog confirms if the update went ok or not.
  7. Remember to reload the model in the BI Server cache, use your web browser and in the pentaho console: Tools->Update->Empty the Mondrian Schema Cache.

Interesting modeling information:

  1. Mondrian documentation at pentaho.
  2. Mondrian technical guide: here.
  3. Using OLAP: Jpivot and Mondrian: here.
  4. Mondrian Schema and JPivot presentation: here.
    Skip slides 11 to 34 that are about installation.
  5. Online demo of jpivot: here
    You can browse a demographic cube and get ideas on usability in your models.
  6. Videoblog in http://openbi.ning.com.
  7. A simple date dimension for Mondrian cubes: here.
    Explains the mondrian schema to work with dates.
  8. This demo show how to use MDX YTD function in Mondrian’s JPivot: video.
  9. How to do a MDX percent of totals in mondrian.
  10. How to do trend lines in mondrian: here.
  11. Using a mondrian cube in an .xaction file: here.
  12. Modeling information from Ralph Kimball: articles and papers.
  13. Webminar from informatica.
  14. Tutorial on MDX at database Journal [William Pearson]
  15. Articles in spanish:  in [DMQY]TD, Closing period , parallelperiod.
    In indonesian: prevmember (google translates it, but the code is self explanatory).
  16. Lots of articles on MDX/OLAP in the web refer to propietary tools. Use the ideas and queries, just avoid  their special functions. Check this blog for example.

Additional notes and definitions [from mondrian docs]:

  • member is a point within a dimension determined by a particular set of attribute values.
    That can be a gender hierarchy ‘M’ and ‘F’. Or a city ‘San Francisco’, ‘California’ and ‘USA’ in the store hierarchy.
    [Although you don’t believe it this definition has generateted over 300 coments on a linked-in thread].
  • hierarchy is a set of members organized into a structure for convenient analysis. For example, the store hierarchy consists of the store name, city, state, and nation.
  • level is a collection of members which have the same distance from the root of the hierarchy.
  • dimension is a collection of hierarchies which discriminate on the same fact table attribute (for example, the day that a sale occurred).

Some MDX code to keep at hand:

  • Hot to use time on two axes (use cero insted of row or columns):
    SELECT VisualTotals
    ({[Date].[Calendar].[Calendar Quarter].&[2001]&[3]
    ,[Date].[Calendar].[Month].&[2001]&[7]}) ON 0
    FROM [Adventure Works]
  • How to add a member:
    WITH
    MEMBER [Measures].[Last Period Unit Sales] AS
    ([Measures].[Unit Sales]
    ,[Time].PrevMember)
    MEMBER [Measures].[Unit Sales Growth] AS
    [Measures].[Unit Sales] – [Measures].[Last Period Unit Sales]
    SELECT
    {[Measures].[Unit Sales],[Measures].[Last Period Unit Sales],
    [Measures].[Unit Sales Growth]}
    ON COLUMNS,
    {[Time]} ON ROWS
    FROM [Sales]
  • Format strings and time hierarchies:
    WITH
    MEMBER [Measures].[Sales Previous Period Growth] AS
    ‘(([Measures].[Sales]) – ([Measures].[Sales], [Time].CurrentMember.PrevMember))/
    ([Measures].[Sales], [Time].CurrentMember.PrevMember)’, FORMAT_STRING=’0.00%’
    SELECT
    {[Measures].[Sales],[Measures].[Sales Previous Period Growth]}
    ON COLUMNS,
    [Time].Members ON ROWS
    FROM [SteelWheelsSales]
  • How to change color of a cell:
    Using the MDX documentation, you can check how to Format Strings, Order of sets, or use Function List.
    You can use a MDX to set green background on cell less than $100,000, or a red background if they are greater than $100,000:

    WITH MEMBER [Measures].[Profit] AS
       '([Measures].[Store Sales] - [Measures].[Store Cost])',
      FORMAT_STRING = Iif([Measures].[Profit] < 100000, '|#|style=green', '|#|style=red')
    SELECT {[Measures].[Store Sales], [Measures].[Profit]} ON COLUMNS,
      {[Product].CurrentMember.Children} ON ROWS
    FROM [Sales]

    Or in the model like:

    <CalculatedMemberProperty name="FORMAT_STRING" expression="Iif(Value < 0, '|($#,##0.00)|style=red', |$#,##0.00|style=green')"/>
Advertisements

Complete Pentaho Installation on Ubuntu, Part 7

Install Pentaho Design Studio

The Pentaho BI suite started too ambitious as it would have the shark open source workflow engine to coordinate activities like asking for parameters, excecuting jobs and showing the results on the web application. But a compromise was reached and we now have the .XACTION files.

Sequencing of activities are a key factor for usability and scheduling of tasks on your site. Pentaho give us a tool based on Eclipse to edit XML files but building the sequences with it’s components is complex and difficult task, for me, its the hardest to make it work in the  Pentaho BI suite.

Installation

You need to find out if you have a 32bits or 64bits operating system. So open a Terminal and type:

/Pentaho/design-studio$ uname -m

The output could be:

  • i686 or i386 for 32 bits.
  • x86_64 for 64 bits.

Now you know what version to install so:

  1. Browse to sourceforge in the Pentaho Design Studio (PDS) files and download the latest stable version. The equivalent for the 4.0 enterprise edition in the open source version is 3.7. We’ll not enjoy the Agile BI wizardry, but thats the way it will be, get: pds-ce-linux-64-3.7.0-stable.tar.gz or pds-ce-linux-32-3.7.0-stable.tar.gz.
  2. Open it and extract it to our HOME/Pentaho folder. You should have a new design-studio folder.
  3. In the Terminal type:
    /Pentaho/design-studio$ gedit eclipse.ini
    add at the top:
    -vm
    /usr/lib/jvm/java-6-openjdk/bin/java
    save and exit.
  4. Start the editor:
    /Pentaho/design-studio$ eclipse
  5. Accept the default workspace or choose a different folder. Check the do not ask again option.
  6. Now configure your project (as recommended in the wiki):
    – Select File -> New -> Project.
    – Choose General -> Project from the New Project wizard
    – Press the Next> button.
    – Enter Pentaho Solutions or any other name as the project name.
    – Uncheck the Use default check box
    – Browse to the /Pentaho/biserver-ce/pentaho-solutions folder.
    – Select Finish.
    – In the navigator panel you could see the files.
    Open any .action file to test your PDS.
  7. You can exit the editor

Its time to browse the reference material.

References

  1. Pentaho Design Studio User Guide, ver 1.2 2006 [PDF 1.2MB].
    Skip to the Quick Tour section and creating your first action sequence.
  2. Create Action Sequences [pdf]

Notes

Tip #1: The Secure/Filter Prompt componet has a terrible name, as it only makes shure that you chose a valid option. You can use a Sql Query Lookup component to get a result with values and descriptions and then pass the ‘array’ to the prompt. Or you can type the options in a list if you define an INPUT first and pass it to the prompt as input. Here is the XML Sql lookup for a pair (City and City name). Notice the datamart jndi datasource:

<!-- Ask for a City and its Name -->
    <action-definition>
    <component-name>SQLLookupRule</component-name>
    <action-type>Get Citys</action-type>
    <action-outputs>
    <query-result type="result-set" mapping="city_result"/>
    <city type="string"/>
   <city_name type="string"/>
    </action-outputs>
    <component-definition>
    <jndi><![CDATA[Datamart]]></jndi>
    <query>
    <![CDATA[SELECT DISTINCT `City`, CONCAT_WS('-', City, CityName) AS city_name
      FROM `dim_cities`ORDER BY `City`]]>
    </query>
    <live><![CDATA[false]]></live>
    </component-definition>
 </action-definition>

Use the values to ask for a choice:

<!-- Aks for a City -->
  <action-definition>
    <component-name>SecureFilterComponent</component-name>
    <action-type>Ask for Report Parameters</action-type>
    <action-inputs>
   <city_opt type="string"/>
    <city_result type="result-set"/>
    </action-inputs>
    <action-outputs/>
    <component-definition>
    <selections>
    <city_opt style="select" prompt-if-one-value="true">
    <title>Choose a City:</title>
    <filter value-col-name="city" display-col-name="city_name">city_result</filter>
    </city_opt>
    </selections>
    </component-definition>
 </action-definition>

Tip #2: It is very difficult to trace parameters between actions. I use the hello word component to check if I’m building them right (paste at the botom of your .xaction files and add your variables):

<action-definition>
  <component-name>HelloWorldComponent</component-name>
  <action-type>Values</action-type>
  <action-inputs>
      <VARIABLE1 type="string-list"/>
      <VARIABLE2 type="string-list"/>
  </action-inputs>
  <component-definition>
    <quote>
 <![CDATA[Result:- Var#1: {VARIABLE1}. Var#1: {VARIABLE2}.]]>
   </quote>
  </component-definition>
 </action-definition>

What you get

When you finish your .xaction and reload the solutions (In the tools menu)in your browser  you will see the file, and you can schedule it by selecting it, right click on it and set the options to run at a specified hour or run weekly, et. You can also use the administration console (PAC) to do this.

If you want a more descriptive title than the file-name.xaction copy and rename the file with sufix .properties to change the title string. For localization use the file-name_xx.properties file where xx is your language code, for example es for spanish.

Workarounds

There are on the net several articles on how to improve the parameter window experience. Here are some resources:

  1. Create a Custom parameter Template: Bizcube Page.
    Shows how to use a .html template with the Prompt/Filter Component.
  2. Use AJAX for filling dependent select fields: Pentaho Wiki Page.
    This example also uses an .html file and shows how to fill master-slave fields.
  3. Use a Calendar Date Picker: Pentaho Forum Thread.
    Uses AJAX to pup up a calendar so you can choose a date.
  4. Advanced code: Secure/Filter source code.

Firefox Fix

If you use the BI Suite and the Prompt/Filter component with Firefox you need to make a code modification to avoid the error: “Please select something for undefined” when you click the  OK button.

Head over and edit:

/Pentaho/biserver-ce/tomcat/webapps/pentaho/js/parameters.js

Search for if( lastName != null ) {, its on the ‘function getParameters’ and change it to:

if( elements[i].name != lastName ) {
  if ( (lastName != null) && (lastName != "") ) {
    var ckRtn = checkParams( form, element.type, lastName, gotOne );
    if( ckRtn == 0 ) {
      return null;