PhpWebChecksum to be released soon

The first version of my PHP script for monitoring changes to your website will be released soon. I have already set up a SourceForge account for PhpWebChecksum, which for now will mainly be used for bug tracking, and maybe for source code management (CVS) and file storage for releases. The project’s homepage will be hosted on techbits.de under /projects/phpwebchecksum, which is already available as well. At this point I have to make sure I can use Keith Devens’ PHP XML Library, which I included in my PHP script, and figure out how to solve the pass-by-reference issue when switching between PHP4 and PHP5.

Here is another teaser screenshot of the main form with header, footer and some design improvements:

main form with some design improvements

Checking websites for intrusions

When I recently installed this blog, I thought about how you could monitor a website for intrusions. Almost all sites use some kind of content management system, blog or other portal software. Unfortunately, we all know that software has flaws and that there are script kiddies out there who do not hesitate to exploit them as soon as they are found. Most small sites and blogs are hosted on simple PHP/MySQL webspace, so it is not easy to monitor the integrity of your site when the web application has hundreds of files buried deep in a directory hierarchy and you only have FTP access to browse through it.

I googled for tools that create checksums for websites but didn’t find much, so I started creating a PHP application for that purpose. My prototype has the following functionality:

  • generating an XML list with checksums (SHA1) and file dates for a complete directory tree
  • the XML list can be downloaded to be stored locally
  • the XML checksum list can later be uploaded to be compared against the current state of the website
  • a comparison is computed and displayed, showing all modified, new and missing files along with what (date, size, checksum) has changed.
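
The generate-and-compare idea from the list above can be sketched in a few lines. This is a minimal illustration in Python, not the PHP code of PhpWebChecksum: the function names `build_manifest` and `compare_manifests` are hypothetical, and the checksum list is kept as a plain dictionary instead of being written to XML.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-1, size and mtime."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sha1 = hashlib.sha1()
            with open(full, "rb") as handle:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: handle.read(65536), b""):
                    sha1.update(chunk)
            info = os.stat(full)
            # Use forward slashes so manifests are comparable across systems.
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = {
                "sha1": sha1.hexdigest(),
                "size": info.st_size,
                "mtime": int(info.st_mtime),
            }
    return manifest


def compare_manifests(old, new):
    """Report new, missing and modified files between two manifests."""
    report = {
        "new": sorted(set(new) - set(old)),
        "missing": sorted(set(old) - set(new)),
        "modified": {},
    }
    for path in sorted(set(old) & set(new)):
        # Record which attributes differ (date, size, checksum).
        changed = [
            key for key in ("mtime", "size", "sha1")
            if old[path][key] != new[path][key]
        ]
        if changed:
            report["modified"][path] = changed
    return report
```

In use, you would store the result of `build_manifest` somewhere off the server and later pass it, together with a freshly built manifest, to `compare_manifests`. Keeping the reference copy locally matters: an intruder who can modify the site's files could also modify a checksum list stored next to them.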

Here are two screenshots that show the current development version:

Main Form - generate and compare checksums
Checksum Comparison View

I will continue working on this tool and make it available as open source when it’s fairly stable.

Search engine friendly URLs without mod_rewrite

When I set up my WordPress blog yesterday I wanted to use search engine friendly URLs, which WordPress usually supports by means of Apache’s mod_rewrite. Unfortunately my hoster doesn’t support .htaccess files in the small web package I purchased, which I found rather disappointing. Generally, there might be a couple of reasons why the default way of rewriting URLs does not work: your hoster disabled .htaccess files (AllowOverride None), mod_rewrite is not loaded or not available on the server, or your site runs on IIS, which naturally doesn’t support Apache’s rewriting. Luckily there are two ways around this limitation that are supported by WordPress out of the box.

The first way, suggested in the Using Permalinks section of the WordPress Codex, is to use URLs like index.php/some/path/. All you have to do is specify the custom permalink structure in Options>Permalinks. Apparently this type of permalink, which needs no mod_rewrite, has worked in WordPress at least since v1.2, but for me the ugly index.php/ path isn’t something I want to have permanently in my URLs. I favour the following solution.

You can set WordPress’s index.php as the 404 error page for your website. As a result, WordPress is called for all virtual URLs that do not exist as actual files on the web server. I’ve tested this with version 2.0 and so far it works pretty well. According to experiments with the Textpattern weblog, there may be problems with HTTP POST operations. If it turns out to work properly, it should be added to the WordPress documentation, since it produces the same clean URLs as mod_rewrite, though with the bitter aftertaste of being an ugly hack.

Internally both methods rely on index.php to analyze the URL. Actually, in WordPress v2.0 this has become the default way of URL rewriting anyway. If you look at the .htaccess file of a WordPress v1.5 installation you’ll see about 30 rules for all the different URLs (search/, category/, author/, …). In WordPress v2.0 the .htaccess looks much cleaner: all requests are forwarded to index.php, pretty much the same way a redirection of the 404 error page would do it.
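
To illustrate what "index.php analyzes the URL" means, here is a minimal sketch of such a front controller in Python. It is not WordPress's actual code: the rule table, the function names `virtual_path` and `resolve`, and the returned variable names are assumptions made for the example. It also assumes the server passes the originally requested URI to the script, both for index.php/some/path/ style URLs and when the script runs as the 404 handler.

```python
import re

# Hypothetical rules mapping a virtual path to query variables.
RULES = [
    re.compile(r"^search/(?P<s>[^/]+)/?$"),
    re.compile(r"^category/(?P<category_name>[^/]+)/?$"),
    re.compile(r"^author/(?P<author_name>[^/]+)/?$"),
    re.compile(r"^(?P<year>\d{4})/(?P<monthnum>\d{2})/(?P<name>[^/]+)/?$"),
]


def virtual_path(request_uri, script="/index.php"):
    """Strip the query string and an optional index.php prefix."""
    path = request_uri.split("?", 1)[0]
    if path == script or path.startswith(script + "/"):
        path = path[len(script):]
    return path.strip("/")


def resolve(request_uri):
    """Return query variables for a URI, or None if no rule matches."""
    path = virtual_path(request_uri)
    if not path:
        return {}  # front page
    for pattern in RULES:
        match = pattern.match(path)
        if match:
            return match.groupdict()
    return None  # nothing matched: a genuine 404


print(resolve("/index.php/category/php/"))  # index.php/ style permalink
print(resolve("/category/php/"))            # clean URL via the 404 handler
```

Both calls resolve to the same query variables, which is why the two approaches are interchangeable from the application's point of view. A real 404-based setup would additionally have to send a 200 status for matched URLs, since the server would otherwise deliver the page with a 404 status.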