Pentaho Metadata and its Editor

This week I split Pentaho Metadata and its Editor into two separate projects.  Before this split, we built a single metadata jar file that contained both UI code and core library code.  With this split, we now have a separate jar for the core library, which will be deployed to the Pentaho BI Server and other dependent projects.  The original jar file was close to 1.5MB; the library-only jar is now around 374K.

This split also resolves a circular project dependency that was created when we separated out the Metadata Query Language (MQL) Editor.  In order to properly build Pentaho Metadata, one would have to first build the main project, deploy the jar to the MQL Editor, and then check in the new MQL Editor jar back into the main Metadata project and build again.  Now we no longer have this circular dependency.  First the core library is built, followed by the MQL Editor, and then finally the Metadata Editor is built.
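
To make the dependency problem concrete, here is a minimal sketch in Python of the build order before and after the split. The project names are labels made up for illustration, and the dependency sets are assumptions based on the build order described above, not something read from the actual build files.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical project labels; each maps to the projects it depends on.
# After the split: library first, then MQL Editor, then Metadata Editor.
after_split = {
    "metadata-library": set(),
    "mql-editor": {"metadata-library"},
    "metadata-editor": {"metadata-library", "mql-editor"},
}

# Before the split: the single Metadata project and the MQL Editor
# each needed the other's jar, which is a cycle.
before_split = {
    "metadata": {"mql-editor"},
    "mql-editor": {"metadata"},
}


def build_order(deps):
    """Return a valid build order, or None if the dependencies are circular."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


if __name__ == "__main__":
    print(build_order(after_split))
    print(build_order(before_split))
```

With the cycle in place there is no valid order at all, which is why the old build needed a manual build, deploy, check-in, and rebuild round trip.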

You can access the latest Pentaho Metadata Library, Pentaho Metadata Editor, and Pentaho MQL Editor via SVN following these three links:




MySQL Conference Highlights

I just got back from spending the week in Santa Clara, California at the MySQL Conference. I really enjoyed spreading the word about Open Source Business Intelligence. Many folks attending the conference were new to Business Intelligence and Pentaho, so I enjoyed demoing our products, showing off ad hoc querying with Metadata and building Transformations with Kettle.

I was very impressed with Marten Mickos and Jonathan Schwartz’s keynotes. Both focused on the value and benefits of Open Source technology. Marten celebrated the MySQL community by recognizing some of the top community contributors of MySQL, and Jonathan discussed the big picture of how Sun powered by Open Source can change the world.

Other great highlights included Lance Walter giving a talk on Operational Business Intelligence, and Julian Hyde discussing Interactive OLAP, which included a demo of Halogen. To cap off the week, I visited my good friend Dan Morrill at the Googleplex and got a tour of their wonderfully creative environment, including pinball machines and space ships!

Hudson: A Continuous Build and Test Platform

Aaron Phillips, one of our newest Senior Engineers at Pentaho, introduced the team to Hudson this week. Hudson’s main purpose is to build and test software projects continuously.  It’s easy to install, and from the web UI it’s possible to configure new projects very easily.  Aaron demonstrated this by quickly setting up Hudson on a virtual machine, and then showing off its capabilities by building Pentaho Metadata, showing successful JUnit test percentages and code coverage reports from Cobertura.

If you have a need to build a Java project continuously, and want to monitor unit test and code coverage changes over time, you should definitely take a look at Hudson.  I’m very impressed.

BarCamp Orlando

Michael Tarallo, Doug Moran, and I attended and presented at BarCamp Orlando today. Despite a bit of a downpour, we all had a great time and got to meet a lot of developers from here in Orlando and all across the continent. Some of the interesting talks I attended included a demo of Izea’s SocialSpark from Peter Wright, a rant on Web 2.0 from Sunir Shah of FreshBooks, and of course Doug’s and Michael’s great intro to Pentaho. Robert Dempsey of Atlantic Dominion Solutions gave an after-lunch talk on Scrum, and later I followed it up with a talk on Scrum at Pentaho and Open Scrum. Check out the slides from my presentation in PDF.

Sprinting At Pentaho - Mondrian Platform Sprint

For the past couple of months, Pentaho’s Orlando developers have been using the Scrum methodology for managing our projects.  For the past two weeks, I’ve been working part time on the Mondrian Platform Sprint.  We’re delivering some exciting new features into open source.  The first story we’re tackling is the ability to create new analysis views within Pentaho’s BI Platform.  What I’ve enjoyed most about this story is working closely with Mat Lowery and Nick Baker.  We’re all working together on the various components, and things are coming together nicely.

The second story we’re delivering on this sprint is the ability to publish Mondrian Schemas to the BI Platform from Mondrian’s Schema Workbench.  I’ve taken the publishing dialog source code from Pentaho’s Report Designer and bent it to fit within the Workbench.  It’s been a while since I’ve had the pleasure of writing Swing UI code, and I’m enjoying every minute.

Both of these stories will make Mondrian more usable within Pentaho’s BI Platform.  What is great about sprinting is that we know we’re focused on high priority items that make our products easier to use.

Google’s Summer of Code

Google has selected Pentaho to be one of the many Summer of Code mentoring organizations for 2008.  We have a long list of fun projects that folks can participate in.  I’ve volunteered to be one of our project mentors, and have come up with a few projects that might interest developers who are new to the BI world and want to get their feet wet.  Here is a high level list of project ideas that I’d love to see accomplished, all within the scope of Pentaho Metadata:

Add Publishing of OLAP Models to Pentaho’s BI Platform - In Pentaho’s Metadata Editor, there is a lesser known OLAP Schema Designer, which Matt Casters wrote a while back.  Eventually, we see this as the primary place where warehouse designers will build and manage Mondrian ROLAP schemas.  Being able to easily publish to Pentaho’s BI Platform, like we already do with Metadata XMI files, will make it easier for designers to deploy their OLAP schemas to a runtime environment.

Add String Functions to Metadata’s open formula API - We currently have a limited set of open formula functions implemented across the various database dialects that are supported within Metadata.   Adding string functions will expand the capabilities of MQL.  This is an interesting problem, because the major database implementations handle string functions differently.  Expanding Metadata’s open formula API will allow more people to implement more robust cross-database metadata solutions.

Enhance MQL Query Editor - The MQL Query Editor is used in the Metadata Editor, Report Design Wizard, and Report Designer today for generating metadata based queries.  To expand the ease of embedding MQL Query Editor in various environments, one task I’ve proposed is porting the editor over to Pentaho’s new XUL Framework.  Another feature that is missing today in the MQL Query Editor is the ability to write free form constraints.  The lower level XML API supports this, but the MQL Query Editor currently does not.
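
To illustrate why the string function idea above is an interesting problem, here is a minimal sketch in Python of rendering one function, string concatenation, for a few dialects. The dialect keys, the lookup table, and `render_concat` are hypothetical names for illustration only; they are not part of Pentaho Metadata’s API, and a real implementation would also have to deal with quoting, NULL handling, and type coercion.

```python
# Hypothetical dialect table: each entry turns a list of SQL expressions
# into that dialect's concatenation syntax.
CONCAT_RENDERERS = {
    "mysql": lambda args: "CONCAT(" + ", ".join(args) + ")",
    "oracle": lambda args: " || ".join(args),
    "postgresql": lambda args: " || ".join(args),
    "mssql": lambda args: " + ".join(args),
}


def render_concat(dialect, *args):
    """Render a concatenation of SQL expressions for the given dialect."""
    try:
        renderer = CONCAT_RENDERERS[dialect]
    except KeyError:
        raise ValueError(f"unsupported dialect: {dialect}") from None
    return renderer(list(args))


if __name__ == "__main__":
    for name in CONCAT_RENDERERS:
        print(name, "->", render_concat(name, "first_name", "' '", "last_name"))
```

The same logical formula ends up as three different SQL fragments, so every new function has to be implemented and tested once per supported dialect.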

There are plenty of contribution ideas that Pentaho and our community have come up with; we could easily employ the entire Summer of Code student base!  I hope some of the students participating find that our projects are technically challenging and generally fun to implement.

Pentaho Metadata Update…

This past month, Alex Silva, one of Pentaho’s Senior Engineers, has been working on upgrading Pentaho Metadata to work with Kettle 3.0 and also to remove a lot of the new JDK 1.5 warnings that appear when running in Eclipse.  The new code is now available in the trunk.  In this upgrade, Pentaho Metadata now only has a couple of dependencies on Kettle 3.0.  The first dependency is the use of Kettle’s DatabaseMeta object which helps in the generation of dialect specific SQL.  The second dependency is within the UI code inside the Metadata Editor.  The Metadata Editor uses Kettle’s PropsUI, DatabaseDialog, and other UI related classes.  I can imagine eventually that all of these components might end up in a more common form, available for developers.  I’m especially interested in seeing a combination of Kettle’s DatabaseMeta and Pentaho Metadata’s dialect package becoming more generic and useful for third party developers.

Mondrian 3.0 Milestone 1

Over the past few weeks, we’ve been working hard to get Mondrian 3.0 feature complete and relatively bug free. This is a major release, with large architectural and functional changes. I spent the last part of 2007 implementing the new shared dimensions architecture for Mondrian. This change cleaned up a lot of code in Mondrian, along with enabling new features like sharing the same dimension multiple times in a single cube, and also better management of shared dimensions within Virtual Cubes.

In January, I worked with Julian on squashing bugs. Earlier in the month we triaged all the bugs that have been logged since 2006. Now that we’ve assigned and prioritized them, tracking down bugs has been a great way for me to become more familiar with the Mondrian code base.

Some of the other major features that Julian has been working on this round include better role support, olap4j API support, and a much richer set of VBA and Excel functions.

Read about all the changes in the change log, and of course download this milestone release, kick the tires, and let us know what you think!


Bill Seyler has been working on an open source Web 2.0 GWT pivot viewer based on the new olap4j API.  He recently posted the first cut over on this Google Code project.  Julian Hyde has a few words to say about the release over on his blog.  I’ve been working with Bill and Julian on the olap4j query implementation behind Halogen.  Separating the UI from concepts like pivoting makes it much easier for other UI developers to incorporate multi-dimensional queries into their own applications.  Hopefully olap4j will encourage new OLAP UI development in the open source community!

Orlando Dev Summit Thoughts…

Last week, the architects of Pentaho’s products convened in Orlando to sync up and discuss plans for 2008.  Matt Casters of Kettle, Julian Hyde of Mondrian, Thomas Morgner of JFreeReport, and Mark Hall of Weka attended the festivities.  Some of the highlights included roadmap discussions on Pentaho Metadata, a great Weka brown bag, and lots of discussion on improving communication and process as Pentaho scales its engineering department.

One aspect of the discussion included having Mondrian, our OLAP server, work more closely with Pentaho Metadata.  Over a year ago, Matt Casters wrote a prototype of a Mondrian Schema Generator based on Pentaho Metadata.  This functionality currently exists as an easter egg within Pentaho Metadata Editor, reachable by pressing “CTRL-ALT-o”.  While brainstorming, we thought of many different ways that Metadata, Mondrian, and the Pentaho Platform can work together to make a better user experience.
