Some thoughts on going beyond data abstraction in PostNuke

Contributed by on Mar 06, 2004 - 05:47 AM

For what it's worth, here's one of the major shortcomings that our multi-million dollar CMS shared with PostNuke:

A data abstraction layer (like ADODB, PEAR DB, or JDBC for that matter) isn't the same as an abstract persistence layer. In a data abstraction layer, the complexity of the specific database implementation is hidden from the developer. A data abstraction layer, for example, lets you do things like this:

$myItem = new Article();
$myItem->setName("someName");
$myItem->setAuthor("Michael");
$myItem->setBody("This is an article body");
$myItem->save();


Obviously, there's nothing wrong with that. Nothing, that is, as long as you don't want to change your Article content type. If you ever need to add a new field or modify an existing field in the Article class (and you probably will), you will need to hunt down all the client code that gets and sets the existing fields, as well as any hard-coded SQL, HTML form fields, display pages, etc., and manually add the new field. In a real-world system, making and testing even relatively simple content item changes can eat up a fair amount of time and energy.

An abstract persistence layer, on the other hand, is something that sits on top of your data abstraction layer and allows you to persist, update, delete, and retrieve objects without having to know the details of the specific content item implementation. Saving a new Article with an abstract persistence layer might look like so:

$myItem = ContentItemFactory::getContentType("Article");
$myItem->mapData($_POST);
$myItem->save();

Retrieving a specific Article might look like this:

$myItem = ContentItemFactory::getContentType("Article");
$myItem->getByID((int) $_GET['id']); // assumes the ID arrives as an "id" query parameter
$myItem->render();

To make something like this work is pretty easy in Java, and will probably be fairly easy in PHP 5. In PHP 4, you can do it without interfaces and abstract classes, as long as you're in full control of your API.
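As a rough illustration, here is a minimal sketch of what could sit behind the calls above. It assumes PHP 5 or later, and everything in it is hypothetical: ContentItem, its get() method, and the in-memory type registry are invented for the example and are not part of PostNuke. Only getContentType() and mapData() come from the snippets above.

```php
<?php
// Hypothetical sketch; none of these classes exist in PostNuke.

// A generic content item: it knows how to copy input into fields,
// but the list of fields is handed to it rather than hard-coded.
class ContentItem
{
    protected $fields = array(); // field names for this content type
    protected $data   = array(); // current field values

    public function __construct(array $fields)
    {
        $this->fields = $fields;
    }

    // Copy only the known fields out of an input array such as $_POST.
    public function mapData(array $input)
    {
        foreach ($this->fields as $name) {
            if (array_key_exists($name, $input)) {
                $this->data[$name] = $input[$name];
            }
        }
    }

    // Return a field value, or null if it was never set.
    public function get($name)
    {
        return isset($this->data[$name]) ? $this->data[$name] : null;
    }
}

class ContentItemFactory
{
    // Assumed in-memory registry; a real system would load this
    // from XML configuration files or a database table.
    private static $types = array(
        'Article' => array('name', 'author', 'body'),
    );

    public static function getContentType($type)
    {
        if (!isset(self::$types[$type])) {
            throw new InvalidArgumentException("Unknown content type: $type");
        }
        return new ContentItem(self::$types[$type]);
    }
}

$item = ContentItemFactory::getContentType('Article');
$item->mapData(array('name' => 'someName', 'author' => 'Michael', 'bogus' => 'x'));
```

Because mapData() only copies fields the content type declares, unexpected keys in the input (like 'bogus' here) are silently dropped, and adding a field to Article means changing the registry rather than the client code.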

The key elements are:

1. Look at your content model from the perspective that all persistable content types have basically the same persistence requirements (that is - if it's going to be in one or more DB tables, you have to Create/Retrieve/Update/Delete content items - regardless of whether the type is Articles, Banners, Downloads, or Comments, etc.)

2. Abstract out the things that make one content type different from another. In terms of a CMS, the things that tend to differ from one content type to another are the table they're stored in, what the field names and types are, etc. You can easily abstract this "content type metadata", and store it in flat XML configuration files or in a db table. The choice there is pretty arbitrary. Note that once you do this, you open the door for your CMS to be able to manage a new content type called "Content Type". This also allows you to decouple your content management and display applications from the implementation details of the objects they're managing and displaying.
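To make "content type metadata" a bit more concrete, here is one possible shape for it, written as a plain PHP array. The table names, field names, attribute keys, and the fieldNamesFor() helper are all assumptions made up for this sketch; the same information could just as well live in XML or in a db table.

```php
<?php
// Hypothetical metadata for two content types; all names are invented.
$contentTypes = array(
    'Article' => array(
        'table'  => 'articles',
        'fields' => array(
            'name'   => array('type' => 'string', 'maxlength' => 255),
            'author' => array('type' => 'string', 'maxlength' => 100),
            'body'   => array('type' => 'text'),
        ),
    ),
    // The "Content Type" content type: its items describe other types,
    // so the CMS can manage its own type definitions.
    'ContentType' => array(
        'table'  => 'content_types',
        'fields' => array(
            'type_name'  => array('type' => 'string', 'maxlength' => 64),
            'table_name' => array('type' => 'string', 'maxlength' => 64),
        ),
    ),
);

// Generic code can ask what a type looks like instead of hard-coding it.
function fieldNamesFor(array $contentTypes, $type)
{
    return array_keys($contentTypes[$type]['fields']);
}

$articleFields = fieldNamesFor($contentTypes, 'Article');
```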

3. Define the rules for mapping metadata into the various parts of the CMS. There are plenty of places that metadata can be used, and one of the most useful is in defining how SQL queries get generated for specific content types at runtime. That's the whole point behind the abstract persistence layer. In PostNuke, our modules and blocks shouldn't have to know how to save, delete, update, or retrieve objects from the underlying database (which they don't - the ADODB layer takes care of that). They also shouldn't care what fields are in a particular table or object (which they currently do). The CMS, however, can't persist or retrieve objects without knowing how to do so. Giving this knowledge to the CMS via easily configurable metadata would actually make PostNuke worth millions.
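Here is a small sketch of the SQL generation idea, assuming metadata that holds just a table name and a list of field names. buildInsert() and the 'articles' table are hypothetical; the function returns a parameterized statement plus its values, which could then be handed to whatever data abstraction layer is in use (ADODB, for instance, accepts bound parameters).

```php
<?php
// Hypothetical helper: builds a parameterized INSERT from content type
// metadata, so no module needs to know the field list.
function buildInsert(array $meta, array $input)
{
    $columns = array();
    $values  = array();
    // Only fields declared in the metadata make it into the query.
    foreach ($meta['fields'] as $field) {
        if (array_key_exists($field, $input)) {
            $columns[] = $field;
            $values[]  = $input[$field];
        }
    }
    if (count($columns) === 0) {
        throw new InvalidArgumentException('No known fields in input');
    }
    $placeholders = implode(', ', array_fill(0, count($columns), '?'));
    $sql = 'INSERT INTO ' . $meta['table']
         . ' (' . implode(', ', $columns) . ')'
         . ' VALUES (' . $placeholders . ')';
    return array($sql, $values);
}

// Invented metadata for the Article type.
$articleMeta = array(
    'table'  => 'articles',
    'fields' => array('name', 'author', 'body'),
);

list($sql, $params) = buildInsert($articleMeta, array(
    'name'   => 'someName',
    'author' => 'Michael',
    'body'   => 'This is an article body',
    'extra'  => 'not in the metadata, so it is ignored',
));
// $sql is: INSERT INTO articles (name, author, body) VALUES (?, ?, ?)
```

Adding a field to Article then comes down to adding one entry to the metadata; the query follows automatically.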

If the PN community were to implement a metadata-driven abstract persistence layer as part of the PN architecture, it would go a long way toward insulating future developers from some of the most tedious types of changes they have to make - verifying that they "connected all the dots" from the input HTML form - through the system - into the database - and back again. It would also go a long way toward making it very easy to add new content types or modify data fields in existing content types.

As an aside - you could also apply this metadata-driven abstract persistence layer pattern to:

  • Define the fields and their attributes to display on content entry forms (fieldname, type, maxlength, entrylabel, validationRule, etc)
  • Define what fields to display on content display pages and lists
  • Define the rules for content management workflow and role based security, etc.
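As a sketch of the first bullet, the same kind of metadata could drive a content entry form. The attribute keys follow the list above; the values and the renderFormFields() function are invented for illustration, and validationRule is left out to keep it short.

```php
<?php
// Hypothetical form metadata for an Article entry form.
$articleForm = array(
    array('fieldname' => 'name',   'type' => 'text', 'maxlength' => 255, 'entrylabel' => 'Title'),
    array('fieldname' => 'author', 'type' => 'text', 'maxlength' => 100, 'entrylabel' => 'Author'),
);

// Render one label/input pair per field described in the metadata.
function renderFormFields(array $fields)
{
    $html = '';
    foreach ($fields as $f) {
        // Escape everything that ends up inside the markup.
        $name  = htmlspecialchars($f['fieldname'], ENT_QUOTES);
        $type  = htmlspecialchars($f['type'], ENT_QUOTES);
        $label = htmlspecialchars($f['entrylabel'], ENT_QUOTES);
        $html .= sprintf(
            '<label for="%s">%s</label><input type="%s" name="%s" id="%s" maxlength="%d">' . "\n",
            $name, $label, $type, $name, $name, $f['maxlength']
        );
    }
    return $html;
}

$formHtml = renderFormFields($articleForm);
```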

    Anyway - that's my $0.02 for now.

    Cheers, Michael