-
Moving on: Better PostNuke ShortURLs
(News)
-
PostNuke URLs have consisted of an index file in the site's root followed by a long, convoluted query string that is unfriendly to both users and search engines: the search engines choke on such URLs and so don't index PostNuke sites well, and the URLs are hard to pass on to people in email or in forums. For instance, a news link looks like this:
/index.php?name=News&file=article&sid=123&mode=thread&order=0&thold=0
For some time now, PostNuke users have cried out for better Search-Engine Friendly URLs, and for the past few years, the only thing available has been a theme hack first detailed by Karateka (possibly E. Soysal before that; the links in the article are dead) way back in 2002, since worked on by ColdRolledSteel (Craig Saunders), and subsequently by me.
The advent of the ShortURL hack has seen sites hosted on Apache servers with the URL Rewriting module (mod_rewrite) enabled get URLs like
/Article123.html
for the above link, where certain assumptions have been made about the default settings for mode, order and threshold. A big improvement, but not very descriptive, and it comes at the cost of heavy post-processing of the site's content for links. Also, search engines use link keyword relevance in their rankings, and Article123 doesn't say much about the link, except that it's an article with the id 123.
As Karateka pointed out at the time in his article, a problem in implementing friendlier URLs with virtual directories is that all paths in PostNuke are relative, i.e. relative to the site root folder where index.php is located, and fixing it then would have required extensive changes in the core. That is, a URL like /Example/view.html would result in the browser looking for all links relative to its present location, i.e. in the nonexistent subfolder called Example, and subsequently it would fail to find the linked stylesheets, images etc., and all links from the page would similarly fail.
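The failure can be illustrated with a small sketch. The function resolve_link() below is hypothetical (it is not part of PostNuke) and only imitates, in simplified form, how a browser resolves a link against the path of the current page. It assumes the page path starts with a slash and ignores ".." segments and query strings.

```php
<?php
// Hypothetical helper: imitates how a browser resolves a link found on a page.
// Simplified: handles only plain relative and root-relative paths.
function resolve_link(string $pagePath, string $link): string
{
    if ($link !== '' && $link[0] === '/') {
        return $link; // root-relative: independent of where the page lives
    }
    // Relative: keep the page's directory and append the link to it.
    $dir = substr($pagePath, 0, strrpos($pagePath, '/') + 1);
    return $dir . $link;
}

// From the site root the relative image path works ...
echo resolve_link('/index.php', 'images/logo.gif'), "\n";
// ... but from a virtual directory it points into a folder that does not exist.
echo resolve_link('/Example/view.html', 'images/logo.gif'), "\n";
// A root-relative path resolves the same way from anywhere.
echo resolve_link('/Example/view.html', '/images/logo.gif'), "\n";
```

The second call yields /Example/images/logo.gif, which is exactly the broken lookup described above.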
Unfortunately this situation has not changed in the intervening years, but as PostNuke modules are becoming API-compliant, they reference the same system function to build their URLs, so fixing this function and other associated functions to use root-relative links(1) will fix all compliant module URLs. But that leaves all other links, like images, JavaScript, and stylesheets. The move to templating with Xanthia (for themes) and pnRender (for modules) is also making it easier, since Xanthia templates use a Xanthia variable to reference the theme's image directory path. So fixing Xanthia and pnRender will fix most paths in Xanthia themes. The exceptions are stylesheet and JavaScript link paths and any links in the theme header, for which new path variables need to be introduced, so some updating of Xanthia themes is required. This makes the transition period to PN 0.8 an ideal time to introduce these changes, since few Xanthia themes have been released so far, and core modules are only just being converted to pnRender.
I stopped work on ShortURLs some time ago (before pn0.75) on the advice that a core module was being developed; however I have seen no evidence of this to date, and there is no indication in the upcoming PN 0.76 or CVS that there is anything coming. I got curious a month or so ago, and was somewhat dismayed at what I found.
Since then no progress seems to have been made on PostNuke ShortURLs. In fact, the current Xanthia filter hack has regressed, becoming bloated with complex and wholly unnecessary Regular Expression rules, many badly written with duplication and a number of bugs, especially in the accompanying htaccess file, going from the 15 rules proposed by Karateka to a massive 89. So, I set out to try and fix it, but ended up revisiting the idea of a core implementation using virtual directories to more logically structure the URLs in a way that is not only Search-Engine Friendly, but more User-Friendly.
Along the way, I've also been sidetracked and made a sorely needed new themable tab system for the Administration area based on AlistApart.com's Sliding Doors technique and consequently overhauled most of the Admin templates and a few User templates too, partly out of necessity due to the new Adminpanel, partly because they badly needed it. Those of you who have tried the pn0.76 Release Candidates will know that the templated output in them leaves something to be desired: it is drab and somewhat unprofessional-looking due to all the styling and CSS classes having been ripped out, leaving a basic grey and white look with overly large headings and no theme tables for backgrounds. Hardly what you would call Release Candidate quality. So pnRender and its plugins have been fixed to allow the use of Xanthia-like theme-colour tags as well as a tag for root-relative paths needed for ShortURLs, and the opentable functions have been fixed so that proper themed borders can be used. In fact most of the changes are in fixed templates, plugins, and module files.
My proposed implementation still retains the Xanthia filter for backwards compatibility with older themes, modules and blocks, but the filter has been wholly rewritten and pared down to 24 rules, including a rule to fix all links to be root-relative. As PostNuke is in transition to be fully pnAPI-compliant by PostNuke 0.8, the remaining rules can gradually be removed altogether as themes, modules and blocks are updated. There's also a version for AutoTheme.
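As a rough idea of what a "make all links root-relative" rule can look like, here is a minimal sketch. It is not the rule from the package: the function name make_root_relative(), the pattern and the $base parameter are assumptions made for illustration, and it only handles double-quoted href and src attributes.

```php
<?php
// Hypothetical sketch of a "make links root-relative" output-filter rule.
// $base is the site's path on the server, e.g. '/' or '/nuke/'.
function make_root_relative(string $html, string $base = '/'): string
{
    // Match href="..." or src="..." whose value does not already start with
    // a slash, a fragment (#), or a scheme such as http: or mailto:.
    $pattern = '~\b(href|src)="(?![/#]|[a-z][a-z0-9+.-]*:)([^"]+)"~i';
    return preg_replace($pattern, '$1="' . $base . '$2"', $html);
}

$page = '<a href="index.php?name=News">News</a> <img src="images/a.gif">';
echo make_root_relative($page), "\n";
```

Links that are already root-relative, absolute, or pure fragments are left untouched, so the rule can be run over output that mixes old and updated modules.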
This particular scheme is experimental and may be tweaked or improved upon. It seeks to reduce the reliance on the Regular Expression(2) post-processing for links and introduce more user-friendly URLs that have more relevance for people and search engines alike by using virtual directories to visually distinguish sections of the site by module and function, such as
/Example/View.html
and for the News articles introduce Category, Topic, and Title information in the link:
/Category/Topic/ArticleXXX-title-of-story.html
For instance for a news story in the category Computers and the topic Postnuke called "PostNuke Shorturls", you'd have the URL
/Computers/Postnuke/Article123-PostNuke-Shorturls.html
This is a clear, concise and informative link that tells the user and search engine alike something about the link before going there, while retaining backwards compatibility with links of the old ShortURL scheme. It more closely emulates the way we think and organise information, using the folder analogy where we have a clearly-labelled Computer category folder, under which we have the various sub-categories - Topics - with various articles. In this case, we're using a virtual file anchored by the word "Article", clearly identifying it as such, followed by the article number and title. There is backwards compatibility, so that older links for Article123.html will still work.
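A minimal sketch of how such a News link could be built and read back follows. The function names (slugify, news_shorturl, news_sid_from_path) are hypothetical and not taken from the package; the sketch assumes the article id is the only part the server really needs, which is what allows old Article123.html links to keep working.

```php
<?php
// Hypothetical helpers illustrating the /Category/Topic/ArticleXXX-title.html scheme.

// Turn free text into a URL-safe fragment: runs of other characters become one hyphen.
function slugify(string $text): string
{
    $slug = preg_replace('~[^A-Za-z0-9]+~', '-', $text);
    return trim($slug, '-');
}

// Build the descriptive link from the story's category, topic, id and title.
function news_shorturl(string $category, string $topic, int $sid, string $title): string
{
    return '/' . slugify($category) . '/' . slugify($topic)
         . '/Article' . $sid . '-' . slugify($title) . '.html';
}

// Read the article id back; category, topic and title are decoration only,
// so both the new form and the old /Article123.html form are accepted.
function news_sid_from_path(string $path): ?int
{
    if (preg_match('~/Article(\d+)(?:-[^/]*)?\.html$~', $path, $m)) {
        return (int) $m[1];
    }
    return null;
}

echo news_shorturl('Computers', 'Postnuke', 123, 'PostNuke Shorturls'), "\n";
// prints: /Computers/Postnuke/Article123-PostNuke-Shorturls.html
```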
In this instance I've excluded the News keyword altogether for brevity in favour of the Category and Topic keywords, which imply News anyway, though there is nothing against being consistent with all the other ShortURLs and having the Module appear first, as in
/News/Computers/Postnuke/Article123:-PostNuke-Shorturls.html
This is for the special case of the core News module, though; a more generic method is needed overall for URLs with various unknown parameters passed in the query string. This implementation uses the scheme:
/Module/Function-Param1:Value1-Param2:Value2... -ParamN:ValueN.(p)htm(l)
where the query string parameters are tagged onto the virtual filename, with each name and value paired by a colon and the pairs separated by hyphens, the idea being to use characters we might normally use in a list to make it look as natural and readable as possible. It may be that a less commonly used character than the hyphen is needed, like the tilde (~), since some parameter values may contain a hyphen, in particular usernames. This is not a problem if such a value is passed as the last parameter, where it may contain any character, so if the module developer keeps this in mind, it need not be an issue. I'm not aware of it being one so far. The PostCalendar ShortURL plugin deliberately places uname, if present, last.
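To make the scheme concrete, here is a minimal sketch of decoding such a virtual filename back into a query string. It is one possible approach, not the code from the package: the function name shorturl_to_query() is hypothetical, and splitting only on hyphens that are followed by "name:" is my own assumption for letting a trailing value (such as a username) contain hyphens.

```php
<?php
// Hypothetical decoder for /Module/Function-Param1:Value1-Param2:Value2.html
function shorturl_to_query(string $path): string
{
    // Split the path into module and virtual filename, dropping the extension.
    if (!preg_match('~^/([^/]+)/([^/]+?)(?:\.(?:html|htm|phtml))?$~', $path, $m)) {
        return ''; // not a ShortURL of this form
    }
    [, $module, $file] = $m;

    // Split on hyphens only where a "name:" follows, so values may contain hyphens.
    $parts  = preg_split('~-(?=[A-Za-z_][A-Za-z0-9_]*:)~', $file);
    $params = ['module' => $module, 'func' => array_shift($parts)];

    foreach ($parts as $pair) {
        [$name, $value] = explode(':', $pair, 2);
        $params[$name] = $value;
    }
    return '/index.php?' . http_build_query($params, '', '&');
}

echo shorturl_to_query('/PostCalendar/view-viewtype:day-Date:20050405.html'), "\n";
// prints: /index.php?module=PostCalendar&func=view&viewtype=day&Date=20050405
```

In the real implementation this translation is done by the server's rewrite rules rather than in PHP; the sketch only shows the mapping itself.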
The extension is not necessary, but is used for convenience. The 3 types supported are html, htm, and phtml, the latter being useful when you want to distinguish ShortURLs from links to real HTML files on the site. The extension, as well as the option to use ShortURLs or not, is set in the Settings panel, though I've only offered the options of html and phtml, since frankly the MS DOS-holdover extension htm annoys me.
Older URLs are marked with a + before the Function name, as in
/PNphpBB2/+profile-mode:editprofile.html
so that the server can translate it correctly. If the directory doesn't actually exist, entering
/Example/
will redirect to the Example module main page (Apache only)
/Example/main.phtml
which in turn gets rewritten invisibly to
/index.php?module=Example&func=main
Otherwise, if it does exist, the index file of the relevant directory will be opened.
Similarly, with
/HTML/filename.html
if the file exists, it will be opened, else PN will look for
/index.php?module=HTML&func=filename
It is still possible to tag on query strings. Both
/ModName/main.phtml?theme=seabreeze
and
/ModName/main-theme:seabreeze.phtml
will be translated to
/index.php?module=ModName&func=main&theme=seabreeze
There are any number of possible ShortURL systems, the simplest being to chop the URL into virtual directories, like /News/123/ from the above News example, as some do. Xaraya uses a variant of this for news, though it doesn't use mod_rewrite, so the URL appears as
/index.php/news/123
Again, this is concise, but contains few meaningful keywords other than the module name News. You can combine the two methods for News and have
/News/Category/Topic/123/title-of-article
which works very well, but loses some of the elegance of the above philosophy, since the latter part breaks up the virtual file into 3 with no anchor words, which is not how we organise information.
For generic URLs, there are a number of methods; for instance Mambo, another CMS, uses generic ShortURLs like
/component/option,com_newsfeeds/catid,5/Itemid,7/
for a News URL like
/index.php?option=com_newsfeeds&catid=5&Itemid=7
where the query string values are paired with their names by commas and the pairs separated by forward slashes (virtual directories). It is a ShortURL, though in this case not a shorter one; it doesn't have any useful keywords other than "newsfeed", and is not very human-readable. For a generic URL this is somewhat unavoidable, but it can be done better than that.
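For comparison, the comma-and-slash form can be decoded with a few lines. This is only an illustration of the URL format shown above, not Mambo's actual code; the function name is hypothetical.

```php
<?php
// Hypothetical decoder for the comma/slash style, e.g.
// /component/option,com_newsfeeds/catid,5/Itemid,7/
function comma_style_to_query(string $path): string
{
    $params = [];
    foreach (explode('/', trim($path, '/')) as $segment) {
        if (strpos($segment, ',') === false) {
            continue; // e.g. the leading "component" marker carries no parameter
        }
        [$name, $value] = explode(',', $segment, 2);
        $params[$name] = $value;
    }
    return '/index.php?' . http_build_query($params, '', '&');
}

echo comma_style_to_query('/component/option,com_newsfeeds/catid,5/Itemid,7/'), "\n";
// prints: /index.php?option=com_newsfeeds&catid=5&Itemid=7
```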
This implementation also contains a way to customise ShortURLs on a per-module basis through a file called shorturls.php placed in the module folder (see the Example module), such as for the News URLs, or for 3rd party modules like PostCalendar, which instead of the full URL like
/index.php?module=PostCalendar&func=view&tplview=&viewtype=day&Date=20050405&pc_username=&pc_category=&pc_topic=&print=
with the above generic ShortURLs would be rendered as
/PostCalendar/view-viewtype:day-Date:20050405.html
but with customised URLs become
/Calendar/05-04-2005/day.html
The beauty is, though, once we've created the groundwork in the core of PostNuke, any implementation will be fairly easy.
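To give a feel for what a per-module customisation might do, here is a sketch of turning PostCalendar's parameters into the customised form shown above. The function name, its signature and the fallback behaviour are all assumptions for illustration; the actual interface expected in shorturls.php is documented in the Example module, not here.

```php
<?php
// Hypothetical per-module encoder: maps PostCalendar's view parameters to
// /Calendar/DD-MM-YYYY/viewtype.html. Not the package's real shorturls.php API.
function postcalendar_encode_shorturl(array $args): string
{
    $date = $args['Date'] ?? '';
    // Date arrives as YYYYMMDD, e.g. 20050405.
    if (!preg_match('~^(\d{4})(\d{2})(\d{2})$~', $date, $m)) {
        return ''; // signal "no custom URL", so a generic ShortURL can be used instead
    }
    $viewtype = $args['viewtype'] ?? 'day';
    return sprintf('/Calendar/%s-%s-%s/%s.html', $m[3], $m[2], $m[1], $viewtype);
}

echo postcalendar_encode_shorturl(['viewtype' => 'day', 'Date' => '20050405']), "\n";
// prints: /Calendar/05-04-2005/day.html
```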
1) Root-relative links: Links relative to the server site root (eg /nuke/filename.html), which stays static, as opposed to relative to the present file (eg filename.html).
2) Regular Expression (RegEx): A complex pattern-matching language that can look a bit like a mathematical formula, used in the Xanthia ShortURL filter at /modules/Xanthia/plugins/outputfilter.shorturls.php.
----------------------------------------------------------------
If this were Mambo, I'd charge you 80 Euros for all this (the price for SEF Advance), but because you're all such nice people (except that guy up the back, you know who you are :) ), I'll let you have it for free.
A PDF of the ReadMe included in the package, but with additional screenshots, is found here (570kb).
I've also written a more technical ReadMe on installing ShortURLs, included in the package under the docs folder, and also found here.
Here's a test of the tab system using the Aqua theme. It also comes with an XP-styled theme and the default CSS-based one. I hope you like it, because it took a lot of work to perfect.
OK, screenshots: Well, no point having screenshots of URLs, so here's some of the tab system and modified SeaBreeze and PostNukeBlue themes' Admin templates instead:
1. The main adminpanel in PostNukeBlue with the Aqua-themed tabs, hovering over the Settings panel.
2. Same as above, but with the Theme Override set under Modify Config and with a tabs.css stylesheet in the theme's style folder. The rounded corners are only visible in Mozilla/Firefox.
3. The Luna tab theme in SeaBreeze, hovering over the 3rd Party tab.
4. The Xanthia Admin tabs using Aqua tabs in PostNukeBlue, hovering on Theme Settings.
And finally, the downloads:
I started out fixing PN0.75, so there are 2 downloads: One for PN0.75, and one for PN0.76rc4. I'll update it once the PN0.76 final is released.
Please back up your site before installing these patches, since a lot of system files are replaced. The PostNuke 0.76rc4 ShortURL package is rather large, consisting of some 400 files in a 1Mb zip file. The PN0.75 package has some 170 files and is around 800kb. Most of the changes are drop-in changes that don't necessitate updating modules, but there are some exceptions in the PN0.76 package, in particular the Settings and Polls modules, where you need to first go to the Module list, regenerate, and update. Specific patches for popular 3rd party templated modules like AutoTheme and PNphpBB2 are included, but only a limited number of 3rd party modules have been tested with this package. No changes are made to the database, but it is still a good idea to back that up as well. You have been warned.
PostNuke 0.75 ShortURL package (833kb)
PostNuke 0.76rc4 ShortURL package (1Mb)
Two of the updated core themes:
PostNukeBlue (249kb)
SeaBreeze (120kb)
Feel free to discuss this proposal in the forums.
Enjoy!
Martin Andersen 8/7/200
Generated on July 9, 2005.
-
Open source solution saves e-commerce website
(News)
-
Website to help computer value added resellers (VARs) improve their technical know-how, consulting, marketing and management skills.
“An Internet portal, DoctorVAR.com (http://www.doctorvar.com) would aggregate up-to-date information technology news and some of the best articles and resources for VARs,” said Linda Christie, company president. “After talking to several Web developers, we chose a custom designed php solution with SQL databases to store thousands of stories and favorite links.”
However, two months into the project, it became evident that the developer didn’t have adequate programming staff to launch the site within the promised three-month schedule. Unfortunately, Christie felt she had few alternatives. “I’d already spent hundreds of hours working on the site design and adding thousands of favorite links and articles to the database, work that could be lost if I changed vendors.”
After returning from a two month assignment in Europe, during which no progress was made, Christie spent a couple of days with the programmer to iron out the final details. “At last we were making progress. I updated some of the content and began writing press releases for the big day.”
One night the site went offline. The next day it was still down, even the backend admin area. Then the dreaded call came: hackers had broken into the server hosting facility. “What about the backup? I asked.”
“The last backup file was corrupted,” was the answer. A two-month old zip file didn’t match the current software version, making site restoration almost impossible, but they said they would try. “They’d lost dozens of other sites and had no backups for them either,” Christie said. “So at this point, I lost all confidence in the developer, not to mention over ten-thousand records I’d uploaded.”
Christie wasn’t sure what to do. “Our e-commerce project, a major commitment in time and resources, was already four months past due. And I couldn’t afford the time or money to start coding the site from scratch.”
Christie began searching online for a new developer. Soon, one of the people she contacted emailed her a slew of probing questions. “I felt like I was taking a test,” Christie said. “But the quality of his inquiries gave me confidence this person wanted to clearly understand the scope of the project, as well as my level of expertise to manage the site.”
Soon Christie scheduled a meeting with Scott Kroeger, owner of Hudson Avenue Technologies in Omaha NE, to discuss the challenges of launching such a complex site on a limited budget. Scott recommended utilizing a proven and supported open source content manager: PostNuke. “Not only is this software free, but the friendly user interface would allow me to perform all of the daily administration, even make page layout changes,” Christie said. “Scott said his goal was to make me as independent of him as possible by the end of the project.”
One of the primary reasons Christie contracted with Kroeger was that he wanted to work himself out of a job instead of creating a customized program that would require his ongoing support. “I’d been burned once already,” Christie said. “So I was excited about integrating supported public domain software that could be maintained by a multitude of providers, should Scott and I part for whatever reason. Plus there would be no software debugging needed.”
After resolving some technical difficulties with the PostNuke implementation, Kroeger proceeded to deliver the site on schedule and within the company’s limited budget. “Within two weeks, I was able to start laying out pages and uploading data. And by the end of two months the site was up. Scott integrated free PostNuke modules to provide an ezine, forum, job bank, and banner/ad management, as well as an html-oriented content manager, Content Express, that simplifies adding html pages, uploading content, and searching the entire site.”
These two companies have shown that partnering with a developer that understands your business needs and integrates off-the-shelf solutions can help you quickly ramp up the right solution for the right price, without having to invest in custom software development and personnel.
For additional information about DoctorVAR.com visit their Web site at http://www.doctorvar.com.
Linda Freeman is a freelance writer based in Omaha NE.
Copyright 2003, Write Solutions, Inc., Tulsa OK. Reprinted by permission.
Generated on February 22, 2003.
-
The friendly URL case
(News)
-
BUT - what I would like to propose is that we give this fantastic piece of code its life on the web. That is, giving the "PostNuke-:mod_rewrite" solution its own homepage where we can have a list of supported scripts. The implementation by Sascha only supports the basic core modules. And even the Web_Link solution is not completely finalized. (It does not provide friendly URLs when you have a category with more than 10 web links...)
It would be great if users could submit their own working scripts for modules like XForum, PostCalendar, Gallery and more.
I am using all these modules already and I am not a strong enough programmer to implement my own scripts for the different modules.
That is why a web page like this could help me and my fellow nukers a lot.
Well, if any of you out there knows how to fix this, then we can just simply change this thread into this webpage, with solutions to all the missing scripts, like: Web_Links, XForum, PostCalendar and Gallery.
- - -
What we are talking about is basically the three following categories. (taken from Sascha's script)
1. The $in var in the array. Example:
"'(?
http://domain/modules.php?op=modload&name=Web_Links&file=index&req=viewlink&cid=2&min=50&orderby=titleA&show=10
http://domain/modules.php?op=modload&name=XForum&file=viewthread&tid=1
http://domain/index.php?module=PostCalendar&func=view&viewtype=details&eid=1
http://www.odinn.net/modules.php?set_albumName=odinn_privat&id=aaa&op=modload&name=NS-Gallery&file=index&include=view_photo.php
All the best to all the Post Nukers out there !
- Zonik, zonik@strik.is
Generated on April 24, 2002.
-
Search engine friendly URLs revisited.
(News)
-
Why common methods to enable search engine friendly dynamic websites do not apply for PostNuke
Search engine friendly URLs for dynamic websites are always a hot subject.
There are several nice solutions: PHPBuilder dealt with this subject and the force type solution several times [1 2]. However, PostNuke can't benefit from Tim's solution, because all paths in PostNuke are relative. So, if you have a URL such as http://www.postnuke.com/article/1/, PostNuke will try to find the topic image in http://www.postnuke.com/article/images/topics/some_fancy_topic.gif, but the picture is located in http://www.postnuke.com/images/topics/some_fancy_topic.gif
So, without a core rewrite that would change all relative to absolute paths, this method does not apply for us.
Prerequisites for my proposed method
E. Soysal has covered this subject twice for Nuke-Sites [1 2].
A lot has changed in PostNuke and the way URLs are being built since then, so it might be a good idea to revisit this subject.
To successfully use this method, you need Apache as a webserver and mod_rewrite enabled. Now, if you don't know if you have this on your host, please ask your system administrator. If not, it might be an idea to collect hosts that have mod_rewrite enabled in the comments to this article. Don't forget: you can always ask and talk to your server admins and maybe even convince them. :)
What we want to achieve
Did you ever try to tell your friends about a Web_Links directory on a PostNuke site? Chances are high that you gave up on it, because the URL was too long.
Did you look up your site on Google? How much of your site was really indexed by Google? It's very likely that if you are using 0.7+, only the mainpage was indexed by Google.
So, our goal is to make your site more user friendly and search engine friendly at the same time, without touching the PostNuke core or slowing down your system's performance too much.
So, we want to have a URL like http://www.postnuke.com/Web_Links.html rather than http://www.postnuke.com/modules.php?op=modload&name=Web_Links&file=index.
Time to dive into the code
You may wonder how to achieve our goal without touching the core. Well, themes are not considered as core files, so themes/yourtheme/theme.php is our candidate for a hack. :)
There are 4 steps that you need to take:
1) We will first start to buffer your output. We do this by adding the following line in your function themeheader:
ob_start();
2) Now, go to the end of your themefooter function and add these lines:
$contents = ob_get_contents(); // store buffer in $contents
ob_end_clean(); // delete output buffer and stop buffering
echo replace_for_mod_rewrite($contents); //display modified buffer to screen
3) replace_for_mod_rewrite() needs to be explained as well:
Add the following function above your first function in your theme:
function replace_for_mod_rewrite(&$s)
{
$in = array("'(?
In $in, we have an array of patterns that we want to replace. In $out, we have the array of new patterns.
So, I'll try to explain what some of these patterns do, so that you can start from there to do your own changes or extensions to this.
"'(?
- sid=([0-9]*) indicates that sid= may be followed by any digit from 0 to 9, occurring zero or more times.
- mode=([a-zA-Z]*) indicates that mode= may be followed by any alphabetical character, occurring zero or more times.
So, we have managed to convert the links on the fly; what is missing now are the appropriate Apache directives.
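Steps 1 to 3 can be tried in isolation with the following sketch. The function replace_for_mod_rewrite_sketch() and its single pattern are stand-ins written for this illustration, not the article's original $in and $out arrays; the sketch also assumes the links in the page use &amp;amp; as the parameter separator.

```php
<?php
// Illustrative stand-in for replace_for_mod_rewrite(); the single pattern below
// is an assumption for this sketch, not one of the article's original patterns.
function replace_for_mod_rewrite_sketch(string $s): string
{
    $in = [
        // sid=([0-9]*) captures the article id, mode=([a-zA-Z]*) the display mode;
        // [^"]* swallows any remaining parameters up to the closing quote.
        '~"modules\.php\?op=modload&amp;name=News&amp;file=article&amp;sid=([0-9]*)&amp;mode=([a-zA-Z]*)[^"]*"~',
    ];
    $out = [
        '"article$1.html"', // $1 is the captured sid; the mode is dropped
    ];
    return preg_replace($in, $out, $s);
}

// Steps 1 and 2 in miniature: buffer the page, then rewrite it before sending.
ob_start();
echo '<a href="modules.php?op=modload&amp;name=News&amp;file=article&amp;sid=123&amp;mode=thread&amp;order=0">Story</a>';
$contents = ob_get_contents(); // store buffer in $contents
ob_end_clean();                // delete output buffer and stop buffering
echo replace_for_mod_rewrite_sketch($contents), "\n";
// prints: <a href="article123.html">Story</a>
```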
4) Basically all converted URL calls need to be reverted by Apache with mod_rewrite. You will only need a .htaccess file for this. For your convenience, just copy and paste the code below into a .htaccess file and upload it to the webroot of your PostNuke installation.
RewriteEngine On
#Articles
RewriteRule ^article([1-9][0-9]*).* modules.php?op=modload&name=News&file=article&sid=$1
#Topics
RewriteRule ^Topic([1-9][0-9]*)-all.* modules.php?op=modload&name=News&file=index&catid=&topic=$1&allstories=1
RewriteRule ^Topic([1-9][0-9]*).* modules.php?op=modload&name=News&file=index&catid=&topic=$1
#FAQ
RewriteRule ^FAQ([1-9][0-9]*)-([0-9]*).* modules.php?op=modload&name=FAQ&file=index&myfaq=yes&id_cat=$1
#Sections
RewriteRule ^Sections([1-9][0-9]*).* modules.php?op=modload&name=Sections&file=index&req=listarticles&secid=$1
RewriteRule ^Sections-article([1-9][0-9]*)-page([1-9][0-9]*).* modules.php?op=modload&name=Sections&file=index&req=viewarticle&artid=$1&page=$2
RewriteRule ^Sections-print-article([1-9][0-9]*).* modules.php?op=modload&name=Sections&file=index&req=printpage&artid=$1
#NS-type modules
RewriteRule ^NS-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).* modules.php?op=modload&name=NS-$1&file=index&$2=$3&$4=$5
RewriteRule ^NS-Polls-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).html modules.php?op=modload&name=NS-Polls&file=index&req=$1&pollID=$2
RewriteRule ^NS-Polls-([a-zA-Z0-9_]*).html modules.php?op=modload&name=NS-Polls&file=index&pollID=$1
RewriteRule ^NS-([a-zA-Z0-9_]*).html modules.php?op=modload&name=NS-$1&file=index
#General Stuff
RewriteRule ^([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).* modules.php?op=modload&name=$1&file=index&$2=$3&$4=$5&$6=$7&$8=$9
RewriteRule ^([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).* modules.php?op=modload&name=$1&file=index&$2=$3&$4=$5&$6=$7
RewriteRule ^([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).* modules.php?op=modload&name=$1&file=index&$2=$3&$4=$5
RewriteRule ^([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).* modules.php?op=modload&name=$1&file=index&$2=$3
RewriteRule ^([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).html modules.php?op=modload&name=$1&file=$2
RewriteRule ^([a-zA-Z0-9_]*).html modules.php?op=modload&name=$1&file=index
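To see what these directives do to an incoming request, a few of them can be imitated in PHP. This is only a rough simulation for experimenting with the patterns: simulate_rewrite() is a hypothetical helper, it contains just four of the rules above, and it stops at the first matching rule, whereas Apache keeps evaluating later rules unless a rule carries the [L] flag.

```php
<?php
// Rough simulation of a subset of the RewriteRule lines above.
// Simplification: the first matching rule wins.
function simulate_rewrite(string $path): string
{
    $rules = [
        '~^article([1-9][0-9]*).*~'               => 'modules.php?op=modload&name=News&file=article&sid=$1',
        '~^Topic([1-9][0-9]*).*~'                 => 'modules.php?op=modload&name=News&file=index&catid=&topic=$1',
        '~^([a-zA-Z0-9_]*)-([a-zA-Z0-9_]*).html~' => 'modules.php?op=modload&name=$1&file=$2',
        '~^([a-zA-Z0-9_]*).html~'                 => 'modules.php?op=modload&name=$1&file=index',
    ];
    foreach ($rules as $pattern => $target) {
        if (preg_match($pattern, $path, $m)) {
            // Substitute $1..$9 in the target with the captured groups.
            return preg_replace_callback(
                '~\$([1-9])~',
                function (array $ref) use ($m): string {
                    return $m[(int) $ref[1]] ?? '';
                },
                $target
            );
        }
    }
    return $path; // no rule matched: the URL is left alone
}

echo simulate_rewrite('article123.html'), "\n";
// prints: modules.php?op=modload&name=News&file=article&sid=123
echo simulate_rewrite('Web_Links.html'), "\n";
// prints: modules.php?op=modload&name=Web_Links&file=index
```

Paths are given without a leading slash because, in a .htaccess file, mod_rewrite matches against the path relative to the directory containing the file.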
Finally, you can see my proposed solution in action on the following (test) sites so far:
http://www.mountainwatersspa.com/
http://www.lobosoft.com/
I'm not claiming that my code is the best solution, maybe even not for this method, but it works.
So, there you have an example and some basic instructions, now go and conquer Google et al and don't forget to mention other
Generated on April 4, 2002.