Mikkel Høgh

Coding the web since 1999

A tip for using PostgreSQL with Drupal 6

If you are using PostgreSQL for hosting your Drupal sites, you might have noticed a lot of warnings in your logs like these:

Aug  8 18:41:05 s002 postgres[90076]: [5-1] WARNING:  nonstandard use of \\ in a string literal at character 32
Aug  8 18:41:05 s002 postgres[90076]: [5-2] HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
Aug  8 18:41:05 s002 postgres[90076]: [6-1] WARNING:  nonstandard use of \\ in a string literal at character 122
Aug  8 18:41:05 s002 postgres[90076]: [6-2] HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.

The immediate cause for this is bug #426008 in Drupal core, but the issue stems from the fact that PostgreSQL does not conform exactly to the SQL standard with regard to backslashes in strings. The reasoning behind this, and why it’s going away (as a default setting) in PostgreSQL 9.1, can be read in this excellent blog post by Bruce Momjian.

But how do I fix it?

The good news is that this behavior is configurable. You can set standard_conforming_strings = on in your postgresql.conf and be done with it. This will be the default setting from PostgreSQL 9.1, and hopefully the other applications using your database do not depend on the legacy behavior (if they do, they need fixing).
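To make the difference concrete, here is a deliberately simplified Python model of how the body of an ordinary '...' literal is read under each setting. It is only a sketch: the function name is made up for illustration, it handles nothing but "backslash escapes the next character", and it ignores quote doubling and sequences such as \n that the legacy mode also understands.

```python
def literal_value(body: str, standard_conforming_strings: bool) -> str:
    """Return the value of the ordinary literal '<body>' (simplified model)."""
    if standard_conforming_strings:
        # SQL-standard behaviour: a backslash is just a backslash.
        return body
    # Legacy behaviour: a backslash escapes the character that follows it.
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])  # keep the escaped character only
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


# The literal body as an application might send it: C:\\temp
sql_body = r"C:\\temp"
print(literal_value(sql_body, standard_conforming_strings=False))  # C:\temp
print(literal_value(sql_body, standard_conforming_strings=True))   # C:\\temp
```

An application that doubles its backslashes gets the value it intended under the legacy setting, but stores both backslashes once the standard behaviour is switched on, which is why such applications need fixing.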

If that’s not suitable for your setup, there are a few other suggestions in this forum thread.

Presenting Django Password Required

Have you ever wanted to password-protect your Django site without requiring user registration? Do you find HTTP Basic Auth to be a very blunt instrument for protecting sites? Or do you want to do Stack Overflow-style beta testing?

Then Django Password Required is for you. It provides a simple @password_required decorator for your views, and lets you configure a password in your settings.py file. The authentication is stored in the user’s session data, using Django’s own session system. This means that Django Password Required can co-exist with django.contrib.auth, so you can allow users to log in after they’ve provided the password to access the site.
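The idea can be sketched without Django at all. The following is not the app’s actual code, just a minimal illustration of a session-backed decorator; SITE_PASSWORD, SESSION_KEY, submit_password and the returned strings are all hypothetical, and a SimpleNamespace stands in for a Django request.

```python
from functools import wraps
from types import SimpleNamespace

SITE_PASSWORD = "letmein"    # hypothetical; the real app reads it from settings.py
SESSION_KEY = "password_ok"  # hypothetical session key


def password_required(view):
    """Only run the view when the session is flagged as authenticated."""
    @wraps(view)
    def wrapper(request, *args, **kwargs):
        if request.session.get(SESSION_KEY):
            return view(request, *args, **kwargs)
        # A real view would redirect to a password form here.
        return "redirect to password form"
    return wrapper


def submit_password(request, attempt):
    """Flag the session once the shared password has been entered."""
    if attempt == SITE_PASSWORD:
        request.session[SESSION_KEY] = True
    return request.session.get(SESSION_KEY, False)


@password_required
def front_page(request):
    return "secret front page"


request = SimpleNamespace(session={})  # stand-in for a Django request
print(front_page(request))             # not authenticated yet
submit_password(request, "letmein")
print(front_page(request))             # now the view runs
```

Because the flag lives in the session rather than in a user account, it does not get in the way of a separate login system using the same session.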

I use it for a little skunkworks project that does not have user logins per se, but since it is not open to the public yet, I need to protect it, at least from webspiders and random visitors. I don’t mind if the password is spread by word-of-mouth, since the site contains nothing sensitive or private.

Initially I used HTTP Basic Auth, but setting that up with Apache is an all-or-nothing deal, requires you to enter the password quite often on iPhone/iPad, and interferes with AJAX requests/API calls. So I created this lightweight app, which asks for the password once and then remembers that the user is logged in via a cookie bound to a server-side session, with a long lifetime so you won’t get nagged for the password very often.

Bug reports/suggestions, documentation, source code, etc.: it all happens on GitHub. Enjoy.

Introducing Herd Fire

If you, like me, are an avid Firefox user, you will likely have felt the burden of using the same Firefox profile for a variety of tasks. Having NoScript or ImgLikeOpera installed is handy when surfing, but just annoying when developing websites. Having Firebug installed will slow down JavaScript execution on all pages, unless you disable some of its features, regardless of whether you’re using it or not. Every extension you install slows down Firefox ever so slightly.

And that’s just extensions. The same goes for many aspects of Firefox configuration: language, about:config, etc. Would it not be better to have several Firefox profiles, one for each task? If you ask me, it would.

One problem though – even if you find the hidden Firefox profile manager, Firefox will not let you launch multiple instances of it without a bit of coercion. Previously, I resorted to all kinds of commandline trickery to manage my profiles until I found a script somewhere on the web (I’ve been unable to find it again. If you know it, please post a comment – I’d like to give proper attribution) that helped me set up copies of Firefox.app for each profile, but it had its limits. It did not work for Firefox.app itself, only for its named copies. It also renamed the Firefox binary, causing trouble for other scripts. So I’ve rewritten it in Python, improving a few key things:

  1. It modifies Info.plist to use a launching script instead of renaming firefox-bin.
  2. It sets the normal Firefox.app to use the profile named “default”.
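As a rough illustration of the first point (not Herd Fire’s actual code), an Info.plist can be edited with Python’s standard plistlib module. CFBundleExecutable is the key that names a bundle’s main executable; the function name and the launcher file name below are hypothetical, and the demonstration runs against a throwaway plist rather than a real Firefox.app.

```python
import plistlib
import tempfile
from pathlib import Path


def use_launcher(info_plist: Path, launcher_name: str) -> str:
    """Point CFBundleExecutable at launcher_name; return the previous value."""
    with info_plist.open("rb") as fh:
        info = plistlib.load(fh)
    previous = info.get("CFBundleExecutable", "")
    info["CFBundleExecutable"] = launcher_name
    with info_plist.open("wb") as fh:
        plistlib.dump(info, fh)
    return previous


# Demonstration on a temporary file, so no application bundle is touched.
with tempfile.TemporaryDirectory() as tmp:
    plist_path = Path(tmp) / "Info.plist"
    with plist_path.open("wb") as fh:
        plistlib.dump({"CFBundleExecutable": "firefox-bin"}, fh)
    print(use_launcher(plist_path, "herdfire-launcher.sh"))  # firefox-bin
    with plist_path.open("rb") as fh:
        print(plistlib.load(fh)["CFBundleExecutable"])        # herdfire-launcher.sh
```

Leaving the original binary under its own name is what keeps other scripts that look for firefox-bin working.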

Instructions for use

  1. Download Herd Fire.
  2. Copy your Firefox.app to create a named copy (I’m using the name “example” here):
    [Screenshot: Copying Firefox.app]
  3. Run Herd Fire (run it from the folder it’s located in, or stick it in a folder on your path):
    [Screenshot: Running herdfire]
  4. Launch your new Firefox copy.
    [Screenshot: Launching Firefox]
  5. If there is not a Firefox profile with the extra name you gave your Firefox.app copy (in this case “example”), the profile manager will appear. In that case, use it to create a new profile with the correct name.
    [Screenshot: Profile manager]
    [Screenshot: Pick a name]
  6. Firefox-example.app will now always start with the “example” profile activated. Firefox’s auto-updater might break this. In that case, all you need to do is to run herdfire again.
    [Screenshot: New Firefox profile]

The code is in a GitHub repository, so please don’t hesitate to fork, file bugs, etc.

Attention all Drupal Git-mirror users

A long-standing issue with the Git mirrors of Drupal’s CVS has been fixed thanks to Damien Tournoud.

The problem is that CVS outputs dates in RCS tags in the somewhat nonstandard format 2009/10/19 (ISO 8601 specifies dashes, not slashes, as separators). The git-cvsimport tool used for creating the mirrors, however, relies on cvsps, which rewrites the RCS tags to use the correct format (2009-10-19). Adhering to standards is generally a good thing, but in this case it was causing merge conflicts when trying to merge patches created with Git into Drupal (or vice versa).

Damien found a way to resolve the issue, however:

Adding DateFormat=old to the CVSROOT/config file fixes the problem.

Changing this, however, required a reimport of the entire repository. Due to the way Git works with commit-ids being a cryptographic hash of their contents, changing the contents (even if just the RCS tags) means a rewrite of Git history.
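To see why even a date separator matters, here is a small sketch of how Git derives the id of a single file (a blob): it is the SHA-1 of a short header plus the file’s bytes. Commit ids are built on top of these, so a changed blob id ripples up through the tree into every commit that contains it. The $Id$ line below is a made-up example, not taken from Drupal.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object id Git assigns to a file (blob) with this exact content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical RCS tag line, once in each date format.
old_style = b"// $Id: example.module,v 1.1 2009/10/19 12:00:00 someone Exp $\n"
new_style = old_style.replace(b"2009/10/19", b"2009-10-19")

print(git_blob_id(old_style))
print(git_blob_id(new_style))
print(git_blob_id(old_style) == git_blob_id(new_style))  # False
```

The two ids differ although only two characters changed, which is why the reimported repository shares no history with the old one.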

So while the new repository contains the same code, you will not be able to merge new changes from it into your current checkouts. Damien will continue both imports for a while, but updates for the old repository with the incompatible date format will be discontinued at a future date.

What is the bottom line then?

The executive summary

  1. The Git mirror at git://github.com/drupal/drupal.git has been rewritten with its RCS tag date format compatible with CVS defaults. Please use this mirror for all your future projects.
  2. The Git mirror at git://github.com/mikl/drupal.git will continue to have the CVS-incompatible format, and will, for a time, continue to be updated, so you will be able to use it for a little while longer.
  3. There is now no excuse for not using Git for your Drupal core development work. Enjoy.

Finally, I’d like to thank Damien for doing all the hard work. I was maintaining the git-cvsimport process myself for a while, and I do not miss it.

New blog, same as the old one…

So, I finally did it. I’ve long wanted to do something about this blog, to try and push a better design on it and generally trim everything.

I wanted to try something new and challenging, so now I’ve rebuilt my blog with Django Mingus.

Building stuff with Django tends to be a lot of fun. I have quite a few ideas that I’d like to try out, so you may see some of my work moving into Mingus.

Rotating Apache httpd logfiles on FreeBSD

With the disk space available on modern servers, you tend to notice some things a lot less. Like the boring fact that without log rotation, an Apache access log can grow to gigabyte size in no time.

FreeBSD’s Apache HTTPD port does not ship with configuration for the FreeBSD log rotation utility, newsyslog, so your logs won’t be rotated by default.

That, however, is fairly easy to fix by tweaking /etc/newsyslog.conf a bit.

Here’s how I did it:

/var/log/httpd-access.log www:www 440 9 * $W1D4 J /var/run/httpd.pid 30
/var/log/httpd-error.log www:www 440 9 * $W1D4 J /var/run/httpd.pid 30

Broken up, this means:

  1. /var/log/httpd-access.log – name of the log file we’re rotating.
  2. www:www – set user and group ownership of the archived logs to www, so sysadmins can read them.
  3. 440 – set the archive files to be read-only for the www user and group, with no access for anyone else.
  4. 9 – keep nine archived log files, excluding the current one. This way, we should always have the latest 10 weeks of log data available.
  5. * – don’t rotate based on log file size.
  6. $W1D4 – rotate logs every Monday at 4 in the morning.
  7. J – compress the archived logs with bzip2.
  8. /var/run/httpd.pid – get the process ID for the httpd server here.
  9. 30 – send SIGUSR1 to cause a graceful restart of httpd.
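The field breakdown above can also be written down as a tiny parser. This is only a sketch with invented field names: real newsyslog.conf lines may leave out several of these fields, whereas this version only accepts the full nine-field form used here.

```python
from typing import NamedTuple


class NewsyslogEntry(NamedTuple):
    logfile: str
    owner_group: str
    mode: str
    count: int
    size: str
    when: str
    flags: str
    pid_file: str
    signal: int


def parse_entry(line: str) -> NewsyslogEntry:
    """Split a nine-field newsyslog.conf line like the two shown above."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError("expected 9 whitespace-separated fields")
    logfile, owner_group, mode, count, size, when, flags, pid_file, sig = fields
    return NewsyslogEntry(
        logfile, owner_group, mode, int(count), size, when, flags, pid_file, int(sig)
    )


entry = parse_entry(
    "/var/log/httpd-access.log www:www 440 9 * $W1D4 J /var/run/httpd.pid 30"
)
print(entry.when, entry.flags, entry.signal)  # $W1D4 J 30
```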

I set this up last week, and it has since done its work, turning my 1GiB /var/log/httpd-access.log into a 46MiB httpd-access.log.0.bz2. Log files are some of the best use cases for compression. Enjoy.

How to get your Disqus API keys

I’m working on importing my comments into the otherwise excellent Disqus commenting system, but getting ahold of your API keys can be rather difficult, so I’ll just document the process here for later reference.

To call the API functions, I’m using the Java-based REST Client – which is free and very handy for this kind of thing.

  1. Log in to Disqus.com with a user that has access to the forum you want API keys for.
  2. Visit http://disqus.com/api/get_my_key/ with your browser to get the user API key (since it uses the active session to give you the API key).
  3. Call http://disqus.com/api/get_forum_list/?user_api_key=_USER_API_KEY_ to get the list of available forums, since you’ll need the numeric identifier for the forum. Look through the JSON response and find the id number for the forum you want an API key for (for my blog, it’s "id": "180233").
  4. Call http://disqus.com/api/get_forum_api_key/?user_api_key=_USER_API_KEY_&forum_id=180233 where 180233 is your forum id. The message field in the JSON response should contain your API key.
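If you would rather script steps 3 and 4 than paste URLs into a REST client, the URLs can be assembled like this. It is a sketch that only builds the request URLs from the endpoints listed above; the function names are my own, and fetching and reading the JSON responses is left out.

```python
from urllib.parse import urlencode

API_BASE = "http://disqus.com/api"


def forum_list_url(user_api_key: str) -> str:
    """URL for step 3: list the forums the user has access to."""
    return f"{API_BASE}/get_forum_list/?" + urlencode({"user_api_key": user_api_key})


def forum_api_key_url(user_api_key: str, forum_id: str) -> str:
    """URL for step 4: request the API key of a single forum."""
    query = urlencode({"user_api_key": user_api_key, "forum_id": forum_id})
    return f"{API_BASE}/get_forum_api_key/?{query}"


# _USER_API_KEY_ is the placeholder for the key obtained in step 2.
print(forum_list_url("_USER_API_KEY_"))
print(forum_api_key_url("_USER_API_KEY_", "180233"))
```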

That’s quite a bit of manual work, but it does not seem like there’s currently any better method. If you happen to find one, please let me know.

Going to the edge with Drupal 7…

So, my fellow Drupallers, we are only inches away from the code freeze. Are we afraid yet?

A common trend amongst Drupal developers is that we’re all mostly on last year’s version. Many Drupal programmer blogs have only recently been upgraded to Drupal 6, or are even still running Drupal 5. Not picking on anyone in particular.

I think that’s a good indicator of a problem with Drupal. Upgrading is hard, and when the very people that do Drupal 24/7 are not upgrading, how can we expect anyone else to? And yes, I took a long time upgrading as well.

I think that’s part of the explanation for the long lag before modules were ready for Drupal 6. Everyone was still on Drupal 5, and there was no need or demand to upgrade your module.

This is bad for the community in many ways, and therefore, I’d like to challenge you, fellow module and core developers, to upgrade your blogs to Drupal 7. In September, or October at the latest, when the first release candidate comes out. Yes, before the final release.

This is uncommon in the Drupal world, but if we look at other open source projects, like Linux or the BSDs, the developers there are mostly eating their own dog food, running their main working computer right out of CVS/SVN/whatever HEAD. For inspiration, check out this video about the OpenBSD release process. Websites are public facing, so it’s a bit worse when something is broken, but shouldn’t we at the very least be able to run HEAD on our blogs when in the code freeze?

Yes, this is likely to be a painful experience. You will need to make patches for the modules you’re using. You will probably have to live without Views or Panels for a while. But you can also help Drupal 7 become our most successful release ever.

If enough developers do this, we will have a much better tested Drupal 7 core when it is released. Not just unit tests, but real, live sites. We will have many more modules available from day 1. Users looking to try out Drupal will be able to hit the ground running, instead of having to choose between the new release (Drupal 7) and the widely supported release (Drupal 6).

I know this is a lot of work, but is that not what we do as developers? Are you mice or are you men? Dare you come along on a perilous journey to the bleeding edge?

It is fraught with struggle and danger, but the rewards will be enormous.

P.S.: I forgot to mention it when first posting this, but this idea is highly inspired by Moshe Weitzman’s #D7CX – this is basically an extension of that, with more people pitching in.

P.P.S.: Edit: After talking a bit with Heine Deelstra on #drupal, who kindly explained to me that Drupal 7 HEAD contains several security vulnerabilities at the moment, I have opted to postpone the public-facing part of this challenge a few months. I will continue running a Drupal 7 version of my blog on my development setup, but the real, online copy will stay Drupal 6 for now :)

How to create and maintain your own cache table in Drupal

There’s a lot of good documentation for how to use the caching system already set up, in particular a very nice write up by Jeff Eaton that, even though it is written for Drupal 5, I find myself looking at rather often.

If you want to set up your own caching table, however, documentation is kinda scarce – I haven’t been able to find anything that covered it, but that may be due to my lack of Google skills.

In the end, I figured it out by reading the code of a lot of different modules, but I thought I’d take the time to do a write up of my own about how it can be accomplished.

First of all, you’ll need to set up your module. Once that is done, you’ll need these bits in your .install file to set up the new cache table. We’ll use the name “cachedemo” for this module, but it could really be anything.

<?php
/**
 * Implementation of hook_install().
 */
function cachedemo_install() {
  drupal_install_schema('cachedemo');
}

/**
 * Implementation of hook_uninstall().
 */
function cachedemo_uninstall() {
  drupal_uninstall_schema('cachedemo');
}

/**
 * Implementation of hook_schema().
 */
function cachedemo_schema() {
  $schema = array();

  $schema['cache_cachedemo'] = drupal_get_schema_unprocessed('system', 'cache');

  return $schema;
}

The trick here is using drupal_get_schema_unprocessed to make a 1-to-1 copy of the standard cache table. That will work fine in most cases, and saves you from typing out the entire schema.

The only other really required thing is to implement hook_flush_caches, so our cache table can get flushed automatically:

<?php
/**
 * Implementation of hook_flush_caches().
 *
 * This tells Drupal's cache handling system the name of our caching
 * table, so expired items will be purged automatically and this table
 * is also affected by the empty all caches function.
 */
function cachedemo_flush_caches() {
  return array('cache_cachedemo');
}

And that’s it. You’ll still have to do something with your new cache table. For that, check out Eaton’s write up mentioned at the start of this post: most of it still applies to Drupal 6, although the argument order has changed in some places and it is no longer necessary to manually serialize and unserialize.

In the creation of this write up, I threw together a small demo module, just to check that the code I was posting was actually working. You can download it if you want.

Drupal debugging tip – use the logging console

I recently ran across a feature of Drupal’s devel.module that might not be all that well known, namely that it has a facility for debug logging as well as the dpm() I’ve advocated to my fellow developers for a long time.

That is the dd()-command which instead of logging to screen simply outputs a print_r() to a file called drupal_debug.txt in your temporary files folder (where that is depends on your site configuration, but /tmp might be a good place to look).

That alleviates the many problems of simply logging to screen (doesn’t work with AJAX, redirects, causes browser slowness for large datasets, lots of junk in your HTML source, etc.). You can then simply run tail -f /tmp/drupal_debug.txt to see what’s going on – and for those of you that use Mac OS X, I have an additional trick.

OS X ships with a fairly good log viewer, namely Console.app – by default, it’s installed in Applications/Utilities. That can be used to view all the logs in standard locations, but with a little trick, it can be used to view this log file also.

Console has a section for ~/Library/Logs, i.e. the Logs folder in the Library folder in your home folder. You can set up a symbolic link to the log file there. I have a subfolder in ~/Library/Logs called apache2, where I keep symlinks to my Apache log files as well.

Anyways, the procedure to set it up looks like this:

cd ~/Library/Logs/  
ln -s /tmp/drupal_debug.txt