Complete Pentaho Installation on Ubuntu, Part 8

Install the Schema Editor (OLAP).

The modeling tool for creating star schemas a.k.a. dimensional models is the Workbench Editor, courtesy the mondrian project developers.

Videos shown by Pentaho with the 4.0 version using Agile BI, suggests that you will be able to avoid using this editor for starting and simple OLAP models. On them you start with a datasource, explore interactivelly to create a pivot table in the BI web interface and finally save the query and models. This is very powerfull as usually you discover valuable relations exploring and adding data to your models.

In the community edition we can do:

  • In the ‘good-old’ way  <g>. First build a basic OLAP model, load it into the server, test it, explore the information and add more al over again.
    The Editor helps a lot and its usability is fine. Also the XML file is easy to read with a normal editor -you should do it to learn more on how mondrian will join the tables-. The tool documentation is clear enough to start building models, the hard part is learning the MDX syntax and how you query your models.
  • Using the new model perspective in PDI [development environment].
    Select the model button (1), click the select datasource button (2) and your connection and table (I guess one view o table with all our measures and select fields), then OK and click on the auto-populate button (4), you can trim the model and save it [.xmi], then press the go button (5) to start playing with your [snapshot] data.

    Madel view in PDI

    Madel view in PDI

    I could not make it work from a transformation-step, right clicking and selecting model or visualize, that would be awesome. In this pdf it seems to work on the enterprise edition.

Installation

Steps to install the workbench editor:

  1. Download the latest stable version from the sourceforge project page.
  2. Unpack it into the /Pentaho folder.
  3. Copy MySql driver (mysql-connector-java-3.1.14-bin.jar or mysql-connector-java-5.0.7.jar) from the PDI (/Pentaho/data-integration/libext/JDBC) or BI server (/Pentaho/biserver-ce/tomcat/lib) folder into:
    /Pentaho/schema-workbench/drivers

Steps to use it:

  1. Check if you have set up a BI Server publishing password (view post 5).
  2. Make shure the set-pentaho-env.sh and workbench.sh files are executables in the schema-workbench folder (use the nautilus file browser or sudo chmod +x *.sh command). You can delete the .bat files.
  3. Start the app from the terminal:
    /Pentaho/schema-workbench$ ./workbench.sh
  4. Define the connection in the menu: options->connections.
    You should be familiar with this dialog by now. Use a native (JDBC) and the pentaho_user to the datamart database in the localhost server.
  5. I suggest you open the example from:
    /Pentaho/biserver-ce/pentaho-solutions/steel-wheels/analysis/steelwheels.mondrian.xml

    Schema editor

  6. When you need to update your model select File->Publish
    Fill the publish password information, as well as user and password (joe/password). The next dialog lets you select the folder and confirm the datasource (SampleData or Datamart in our examples). The next dialog confirms if the update went ok or not.
  7. Remember to reload the model in the BI Server cache, use your web browser and in the pentaho console: Tools->Update->Empty the Mondrian Schema Cache.

Interesting modeling information:

  1. Mondrian documentation at pentaho.
  2. Mondrian technical guide: here.
  3. Using OLAP: Jpivot and Mondrian: here.
  4. Mondrian Schema and JPivot presentation: here.
    Skip slides 11 to 34 that are about installation.
  5. Online demo of jpivot: here
    You can browse a demographic cube and get ideas on usability in your models.
  6. Videoblog in http://openbi.ning.com.
  7. A simple date dimension for Mondrian cubes: here.
    Explains the mondrian schema to work with dates.
  8. This demo show how to use MDX YTD function in Mondrian’s JPivot: video.
  9. How to do a MDX percent of totals in mondrian.
  10. How to do trend lines in mondrian: here.
  11. Using a mondrian cube in an .xaction file: here.
  12. Modeling information from Ralph Kimball: articles and papers.
  13. Webminar from informatica.
  14. Tutorial on MDX at database Journal [William Pearson]
  15. Articles in spanish:  in [DMQY]TD, Closing period , parallelperiod.
    In indonesian: prevmember (google translates it, but the code is self explanatory).
  16. Lots of articles on MDX/OLAP in the web refer to propietary tools. Use the ideas and queries, just avoid  their special functions. Check this blog for example.

Additional notes and definitions [from mondrian docs]:

  • member is a point within a dimension determined by a particular set of attribute values.
    That can be a gender hierarchy ‘M’ and ‘F’. Or a city ‘San Francisco’, ‘California’ and ‘USA’ in the store hierarchy.
    [Although you don’t believe it this definition has generateted over 300 coments on a linked-in thread].
  • hierarchy is a set of members organized into a structure for convenient analysis. For example, the store hierarchy consists of the store name, city, state, and nation.
  • level is a collection of members which have the same distance from the root of the hierarchy.
  • dimension is a collection of hierarchies which discriminate on the same fact table attribute (for example, the day that a sale occurred).

Some MDX code to keep at hand:

  • Hot to use time on two axes (use cero insted of row or columns):
    SELECT VisualTotals
    ({[Date].[Calendar].[Calendar Quarter].&[2001]&[3]
    ,[Date].[Calendar].[Month].&[2001]&[7]}) ON 0
    FROM [Adventure Works]
  • How to add a member:
    WITH
    MEMBER [Measures].[Last Period Unit Sales] AS
    ([Measures].[Unit Sales]
    ,[Time].PrevMember)
    MEMBER [Measures].[Unit Sales Growth] AS
    [Measures].[Unit Sales] – [Measures].[Last Period Unit Sales]
    SELECT
    {[Measures].[Unit Sales],[Measures].[Last Period Unit Sales],
    [Measures].[Unit Sales Growth]}
    ON COLUMNS,
    {[Time]} ON ROWS
    FROM [Sales]
  • Format strings and time hierarchies:
    WITH
    MEMBER [Measures].[Sales Previous Period Growth] AS
    ‘(([Measures].[Sales]) – ([Measures].[Sales], [Time].CurrentMember.PrevMember))/
    ([Measures].[Sales], [Time].CurrentMember.PrevMember)’, FORMAT_STRING=’0.00%’
    SELECT
    {[Measures].[Sales],[Measures].[Sales Previous Period Growth]}
    ON COLUMNS,
    [Time].Members ON ROWS
    FROM [SteelWheelsSales]
  • How to change color of a cell:
    Using the MDX documentation, you can check how to Format Strings, Order of sets, or use Function List.
    You can use a MDX to set green background on cell less than $100,000, or a red background if they are greater than $100,000:

    WITH MEMBER [Measures].[Profit] AS
       '([Measures].[Store Sales] - [Measures].[Store Cost])',
      FORMAT_STRING = Iif([Measures].[Profit] < 100000, '|#|style=green', '|#|style=red')
    SELECT {[Measures].[Store Sales], [Measures].[Profit]} ON COLUMNS,
      {[Product].CurrentMember.Children} ON ROWS
    FROM [Sales]

    Or in the model like:

    <CalculatedMemberProperty name="FORMAT_STRING" expression="Iif(Value < 0, '|($#,##0.00)|style=red', |$#,##0.00|style=green')"/>