Archive for the 'metadata' Category

Pentaho Metadata Dialect SPI

Pentaho Metadata allows business users to build reports and dashboards without the need to learn SQL.  Pentaho’s Metadata system takes business user queries and translates them into SQL based on a model described by an administrator.  Today, Pentaho Metadata supports the most popular backend SQL database dialects, generating the correct SQL for Oracle, MySQL, Microsoft SQL Server, and more.

Previously, contributing a new SQL dialect for another database to Pentaho Metadata meant submitting a patch to the project.  Now, all you need to do is follow the simple instructions here, which use Java’s Service Loader API, and plug in your own dialect.  This is a great enhancement!
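
To give a feel for how a Service Loader based plug-in point works, here is a minimal sketch.  The interface SqlDialectSketch, the class names, and the "EXAMPLEDB" type are hypothetical and are not Pentaho Metadata’s actual SPI; the linked instructions describe the real interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class DialectSpiSketch {

    /** Hypothetical service interface; the real Pentaho Metadata SPI will differ. */
    public interface SqlDialectSketch {
        String databaseType();
        String quoteIdentifier(String name);
    }

    /** Example provider that a third party could ship in its own jar. */
    public static class DoubleQuoteDialect implements SqlDialectSketch {
        // ServiceLoader instantiates providers through a public no-arg constructor.
        public DoubleQuoteDialect() {}

        @Override
        public String databaseType() {
            return "EXAMPLEDB"; // assumed name, for illustration only
        }

        @Override
        public String quoteIdentifier(String name) {
            // SQL-standard quoting: wrap in double quotes, double any embedded quote.
            return "\"" + name.replace("\"", "\"\"") + "\"";
        }
    }

    /** Collects every provider registered on the classpath under META-INF/services. */
    public static List<SqlDialectSketch> discover() {
        List<SqlDialectSketch> found = new ArrayList<>();
        for (SqlDialectSketch dialect : ServiceLoader.load(SqlDialectSketch.class)) {
            found.add(dialect);
        }
        return found;
    }

    public static void main(String[] args) {
        for (SqlDialectSketch dialect : discover()) {
            System.out.println(dialect.databaseType() + " -> " + dialect.quoteIdentifier("order"));
        }
    }
}
```

For discover() to return anything, the jar containing the provider must also include a text file under META-INF/services named after the interface’s binary class name, listing the provider’s binary class name on its own line.  Without that file the loop simply finds no dialects, which is why no patch to the host project is needed to add or remove one.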

Agile BI Milestone 1 is now available

We’ve been working hard at Pentaho to deliver the first milestone of Agile BI.  A little over a month ago, James Dixon, our CTO, presented the initial concept to the community, which includes integrating dimensional modeling and visualization within Spoon, our ETL environment.  Since then, James and the engineering team at Pentaho have been sprinting towards this release, making the source code available in the open and adding capabilities such as the ability to persist models and visualizations.

You can download the first milestone and begin experimenting with the beginnings of what we are considering phase 1 of the Agile BI initiative, which includes the ability to quickly model and visualize a single-fact-table metadata and OLAP model.  Feedback is always welcome; check out the Agile BI forum for more discussion.

Data Access

Over the next few weeks, part of the engineering team at Pentaho will be working towards making it easier to get access to your data.  The two use cases we’re addressing in the short term are accessing your SQL data and uploading a flat file (CSV, Excel) to drive a report or chart from within Pentaho’s user console.  Our general approach for both of these scenarios is to use Pentaho’s Metadata layer to abstract the querying of data sources.  This allows us to use a common interface and common widgets in our client apps.  To do this, we’ll be extending Pentaho Metadata to include new physical model implementations.  The team has started to prototype some of these capabilities.  We’ve added a web services layer to Pentaho Metadata, and have also started work on the physical models as well as our common widgets, which use our Java / GWT XUL UI framework.

We’re also spending a lot of time thinking about the long term direction of our metadata layer.  I’ve created a community project page for the Metadata project, with links to documentation, binaries and source to make it easier to get involved in the project. Doug will be hosting a live community webex in the next couple of weeks to have a general conversation about where we should take the metadata layer.  We want to make it as easy as possible for folks to start using Pentaho, and we’re going to make that possible through our rich and easy to use metadata layer.

Pentaho BI Server Community Edition, 2.0 RC1

This week, the Pentaho team released our first release candidate of our 2.0 Business Intelligence Server.  The entire Orlando team worked many months on the brand new features incorporated into this release.  I’m going to highlight some of the most exciting features that we’re all proud of at Pentaho:

A New User Console
In this release, we built an entirely new user interface from the ground up, designed to greatly enhance the user experience of our Business Intelligence Server.  Taking advantage of Google Web Toolkit, we were able to deliver a complete, dynamic Web 2.0 experience to our business users.  We’ve modernized the look and feel of our analysis views and adhoc reporting tool, and greatly simplified permissions management.

Data Source and User Management
For the first time, our Community Edition Business Intelligence Server also ships with an Administration console for managing Users and Relational Data Sources.  In addition to our default support for direct JDBC and JNDI connections, we also have the ability to define a Pentaho Managed Data Source.  This makes it easy for business users to add their own data sources without having to know the ins and outs of the Java Enterprise Container hosting Pentaho’s BI Server.

Metadata Row Level Security
Another important enhancement to our product offering is the ability to manage row level security from within our Metadata Layer.  In our first release of Pentaho Metadata, we implemented Model, View and Column Level Security.  In this release we extend our Security functionality to also include Row Level Constraints.  Check out the documentation!

There are countless other features that went into our Community Edition release.  At the same time, we’re also releasing our Enterprise Edition software, which includes features such as enhanced ETL Administration, along with many additional BI Server Administration and Configuration capabilities.  These are exciting times!

Row Level Security in Pentaho Metadata

Last Thursday, our Scrum Team kicked off a new Sprint around Pentaho Metadata’s Row Level Security.  In April, Jake Cornelius, one of our Product Managers here at Pentaho, started a Pentaho Metadata Row Level Security discussion in Pentaho’s forum calling for community feedback.  We received lots of great input, and we’re now entering the implementation stage of development.  If you are interested, keep a close eye on Pentaho Metadata’s SVN trunk, where we’ll be implementing this new feature.

We’re tackling this feature from two perspectives, and we have two Scrum stories describing what we hope to accomplish for a first version.  To make sure we have all the plumbing in place, we’ve defined a story focused on expert users, who’ll be able to describe a global security constraint that applies to an entire Metadata Model.  In parallel, we’ve also described a story that focuses on a simple user interface.  The second story will include a lot of up-front UI design to make sure we deliver an easy-to-use interface that covers a large percentage of the row level security needs of our customers and community.
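
As a rough illustration of the first story, the sketch below shows one way a global constraint could be applied to every query against a model.  Everything here is an assumption made for illustration: the class, the role names, and the strategy of wrapping the generated SQL in an outer filter are not taken from the Pentaho Metadata implementation, which had not been written at the time of this post.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RowLevelSecuritySketch {

    // Hypothetical mapping from a role name to the SQL condition that role may see.
    private final Map<String, String> constraintsByRole = new LinkedHashMap<>();

    /** Registers the rows a role is allowed to see, expressed as a SQL condition. */
    public void grant(String role, String sqlCondition) {
        constraintsByRole.put(role, sqlCondition);
    }

    /**
     * Wraps a generated query so that only rows matching at least one of the
     * user's role constraints are returned. A user with no matching role sees
     * no rows at all (deny by default).
     */
    public String secure(String generatedSql, Set<String> userRoles) {
        List<String> allowed = new ArrayList<>();
        for (Map.Entry<String, String> entry : constraintsByRole.entrySet()) {
            if (userRoles.contains(entry.getKey())) {
                allowed.add("(" + entry.getValue() + ")");
            }
        }
        String filter = allowed.isEmpty() ? "1 = 0" : String.join(" OR ", allowed);
        return "SELECT * FROM (" + generatedSql + ") secured WHERE " + filter;
    }

    public static void main(String[] args) {
        RowLevelSecuritySketch model = new RowLevelSecuritySketch();
        model.grant("east_sales", "region = 'EAST'");
        model.grant("west_sales", "region = 'WEST'");
        System.out.println(model.secure("SELECT region, total FROM sales", Set.of("east_sales")));
    }
}
```

The conditions are trusted strings written by an administrator, not end-user input; a real implementation would also need to validate them and resolve column names against the model.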

Pentaho Metadata and its Editor

This week I split Pentaho Metadata and its Editor into two separate projects.  Before this split, we built a single metadata jar file that contained both UI code and core library code.  With this split, we now have a separate jar for the core library, which will be deployed to the Pentaho BI Server and other dependent projects.  The original jar file was close to 1.5 MB; the library-only jar is now around 374 KB.

This split also resolves a circular project dependency that was created when we separated out the Metadata Query Language (MQL) Editor.  To properly build Pentaho Metadata, one had to first build the main project, deploy the jar to the MQL Editor, then check the new MQL Editor jar back into the main Metadata project and build again.  Now we no longer have this circular dependency.  The core library is built first, followed by the MQL Editor, and finally the Metadata Editor.
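
The new build order is what a dependency-ordered traversal of the three projects produces.  The sketch below is only an illustration: the project identifiers and the dependency map are my shorthand for the relationships described above, not names from the actual build scripts.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BuildOrderSketch {

    /** Returns the projects ordered so that each one follows everything it depends on. */
    public static List<String> buildOrder(Map<String, List<String>> dependsOn) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String project : dependsOn.keySet()) {
            visit(project, dependsOn, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String project, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(project)) {
            return;
        }
        // Meeting a project that is still being visited means the graph has a cycle.
        if (!inProgress.add(project)) {
            throw new IllegalStateException("Circular dependency involving " + project);
        }
        for (String dependency : dependsOn.getOrDefault(project, List.of())) {
            visit(dependency, dependsOn, done, inProgress, order);
        }
        inProgress.remove(project);
        done.add(project);
        order.add(project);
    }

    /** Hypothetical identifiers for the three projects after the split. */
    public static Map<String, List<String>> projectsAfterSplit() {
        Map<String, List<String>> dependsOn = new LinkedHashMap<>();
        dependsOn.put("metadata-editor", List.of("metadata-library", "mql-editor"));
        dependsOn.put("mql-editor", List.of("metadata-library"));
        dependsOn.put("metadata-library", List.of());
        return dependsOn;
    }

    public static void main(String[] args) {
        System.out.println(buildOrder(projectsAfterSplit()));
    }
}
```

Feeding the old layout into the same method, where the main project needs the MQL Editor jar and the MQL Editor needs the main project’s jar, raises the IllegalStateException instead of returning an order.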

You can access the latest Pentaho Metadata Library, Pentaho Metadata Editor, and Pentaho MQL Editor via SVN following these three links:




MySQL Conference Highlights

I just got back from spending the week in Santa Clara, California at the MySQL Conference. I really enjoyed spreading the word about Open Source Business Intelligence. Many folks attending the conference were new to Business Intelligence and Pentaho, so I enjoyed demoing our products, showing off Adhoc Querying with Metadata and building Transformations with Kettle.

I was very impressed with Marten Mickos and Jonathan Schwartz’s keynotes. Both focused on the value and benefits of Open Source technology. Marten celebrated the MySQL community by recognizing some of the top community contributors of MySQL, and Jonathan discussed the big picture of how Sun powered by Open Source can change the world.

Other great highlights included Lance Walter giving a talk on Operational Business Intelligence, and Julian Hyde discussing Interactive OLAP, which included a demo of Halogen. To cap off the week, I visited my good friend Dan Morrill at the Googleplex and got a tour of their wonderfully creative environment, including pinball machines and spaceships!

Hudson: A Continuous Build and Test Platform

Aaron Phillips, one of our newest Senior Engineers at Pentaho, introduced the team to Hudson this week. Hudson’s main purpose is to build and test software projects continuously.  It’s easy to install, and from the web UI it’s possible to configure new projects very easily.  Aaron demonstrated this by quickly setting up Hudson on a virtual machine, and then showing off its capabilities by building Pentaho Metadata, showing successful JUnit test percentages and code coverage reports from Cobertura.

If you have a need to build a Java project continuously, and want to monitor unit test and code coverage changes over time, you should definitely take a look at Hudson.  I’m very impressed.

Google’s Summer of Code

Google has selected Pentaho to be one of the many Summer of Code mentoring organizations for 2008.  We have a long list of fun projects that folks can participate in.  I’ve volunteered to be one of our project mentors, and have come up with a few projects that might interest developers who are new to the BI world and want to get their feet wet.  Here is a high level list of project ideas that I’d love to see accomplished, all within the scope of Pentaho Metadata:

Add Publishing of OLAP Models to Pentaho’s BI Platform - In Pentaho’s Metadata Editor, there is a lesser known OLAP Schema Designer, which Matt Casters wrote a while back.  Eventually, we see this as the primary place where warehouse designers will build and manage Mondrian ROLAP schemas.  Being able to easily publish to Pentaho’s BI Platform, like we already do with Metadata XMI files, will make it easier for designers to deploy their OLAP schemas to a runtime environment.

Add String Functions to Metadata’s open formula API - We currently have a limited set of open formula functions implemented across the various database dialects that are supported within Metadata.  Adding string functions will expand the capabilities of MQL.  This is an interesting problem, because the major database implementations handle string functions differently.  Expanding Metadata’s open formula API will allow more people to build more robust cross-database metadata solutions.
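
String concatenation is a good example of why this is interesting: MySQL uses the CONCAT() function, Oracle uses the || operator, and SQL Server uses the + operator.  The sketch below shows the kind of per-dialect translation involved.  The class, enum, and method names are hypothetical and are not part of Pentaho Metadata’s formula API.

```java
public class ConcatDialectSketch {

    /** Hypothetical dialect identifiers, for illustration only. */
    public enum Dialect { MYSQL, ORACLE, SQLSERVER }

    /** Renders a concatenation of already-formed SQL expressions for one dialect. */
    public static String concat(Dialect dialect, String... operands) {
        switch (dialect) {
            case MYSQL:
                // MySQL: function call syntax.
                return "CONCAT(" + String.join(", ", operands) + ")";
            case ORACLE:
                // Oracle: the || operator.
                return String.join(" || ", operands);
            case SQLSERVER:
                // SQL Server: the + operator on character types.
                return String.join(" + ", operands);
            default:
                throw new IllegalArgumentException("Unsupported dialect: " + dialect);
        }
    }

    public static void main(String[] args) {
        for (Dialect dialect : Dialect.values()) {
            System.out.println(dialect + ": " + concat(dialect, "first_name", "' '", "last_name"));
        }
    }
}
```

A single formula function can then be written once in a query and rendered correctly for whichever database sits behind the model.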

Enhance MQL Query Editor - The MQL Query Editor is used in the Metadata Editor, Report Design Wizard, and Report Designer today for generating metadata based queries.  To expand the ease of embedding MQL Query Editor in various environments, one task I’ve proposed is porting the editor over to Pentaho’s new XUL Framework.  Another feature that is missing today in the MQL Query Editor is the ability to write free form constraints.  The lower level XML API supports this, but the MQL Query Editor currently does not.

There are plenty of contribution ideas that Pentaho and our community have come up with; we could easily employ the entire Summer of Code student base!  I hope the students participating find our projects technically challenging and generally fun to implement.

Pentaho Metadata Update…

This past month, Alex Silva, one of Pentaho’s Senior Engineers, has been working on upgrading Pentaho Metadata to work with Kettle 3.0 and on removing a lot of the new JDK 1.5 warnings that appear when running in Eclipse.  The new code is now available in the trunk.  With this upgrade, Pentaho Metadata has only a couple of dependencies on Kettle 3.0.  The first is the use of Kettle’s DatabaseMeta object, which helps in the generation of dialect-specific SQL.  The second is within the UI code inside the Metadata Editor, which uses Kettle’s PropsUI, DatabaseDialog, and other UI-related classes.  I can imagine that all of these components might eventually end up in a more common form, available for developers.  I’m especially interested in seeing a combination of Kettle’s DatabaseMeta and Pentaho Metadata’s dialect package become more generic and useful for third party developers.
