I've done some informal thinking on how my new weblog package will work. The overriding goal is to simplify. As such, I am making each weblog post a separate file instead of storing the data in an RDBMS. They will probably be numbered sequentially, as I want to guarantee uniqueness and that's as good a way as any. To make a post, I will run a script that will open an editor on a temporary file. The format of the file will be very simple, modelled after emails and the HTTP protocol. There will be a headers section followed by a single, empty line, the subject (if any), another empty line, and the body of the text. In the headers will be appropriate metadata, which at this time will be a list of categories and perhaps a reference. Here's a sample:
Category: linkage, freaky
Cite: Fark

scariest mugshot ever

From The Smoking Gun, I give you the Scariest. Mugshot. EVER.
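As a rough illustration of how little code this format needs, here is a minimal parsing sketch. It is written in Python purely for illustration; the function name `parse_post` and the returned dictionary keys are assumptions, not part of any existing script. It assumes the layout described above: header lines, an empty line, the subject (possibly empty), another empty line, then the body.

```python
def parse_post(text):
    """Split a post into headers, categories, subject, and body.

    Assumed layout: "Name: value" header lines, a blank line, the
    subject line (may be empty), another blank line, then the body.
    """
    lines = text.splitlines()
    headers = {}
    i = 0
    # Header section: read "Name: value" lines up to the first blank line.
    while i < len(lines) and lines[i].strip():
        name, _, value = lines[i].partition(":")
        headers[name.strip().lower()] = value.strip()
        i += 1
    i += 1  # skip the blank line that ends the headers

    # The next line is the subject; it is empty when there is no subject.
    subject = lines[i].strip() if i < len(lines) else ""
    i += 1
    # Skip the blank line separating the subject from the body, if present.
    if i < len(lines) and not lines[i].strip():
        i += 1

    body = "\n".join(lines[i:]).strip()
    # Categories are a comma-separated list in the "Category" header.
    categories = [c.strip() for c in headers.get("category", "").split(",") if c.strip()]
    return {"headers": headers, "categories": categories, "subject": subject, "body": body}


sample = """Category: linkage, freaky
Cite: Fark

scariest mugshot ever

From The Smoking Gun, I give you the Scariest. Mugshot. EVER.
"""

post = parse_post(sample)
print(post["categories"], post["subject"])
```

Header names are lowercased here so that `Category` and `category` are treated the same; that is a design choice of this sketch, not a requirement of the format.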
I don't see any point in having comments or similar dynamic behavior. I won't be generating each page on each request like I (sort of) do now. Instead, I'll regenerate the site every time I post. So after I exit the editor, the weblog script will pick up the post and rebuild the site from it. It will do the regular housekeeping like generating archives, making sure the index page has exactly 20 (or whatever) posts on it, indexing by date, and so forth. One advantage of doing it this way is that it makes it really easy (as opposed to just kind of easy) to do more useful URLs. http://ketan.org/692 would be the direct, permanent link to this post. I could have a URL like http://ketan.org/politics/ that would contain all of my posts on politics. I could also have URLs like http://ketan.org/2003 for, well, 2003, http://ketan.org/2003/04/ for April of 2003, and so forth.
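The rebuild step mostly comes down to deciding which posts belong on which page. The sketch below, in Python for illustration only, maps site paths to lists of post numbers following the URL shapes mentioned above. The function name `build_url_map`, the tuple layout, and the dates in the example are all made up for the sketch. Where each post's date comes from (a header, the file's timestamp, or something else) is left open here.

```python
from collections import defaultdict
from datetime import date

INDEX_SIZE = 20  # "20 (or whatever)" posts on the index page


def build_url_map(posts):
    """Map each site path to the post numbers that page should list.

    `posts` is an iterable of (post_number, posted_date, categories)
    tuples; this shape is an assumption of the sketch.
    """
    pages = defaultdict(list)
    # Posts are numbered sequentially, so a higher number means a newer post.
    ordered = sorted(posts, key=lambda p: p[0], reverse=True)

    # Index page: only the most recent posts.
    pages["/"] = [number for number, _, _ in ordered[:INDEX_SIZE]]

    for number, posted, categories in ordered:
        pages[f"/{number}"].append(number)                            # permalink
        pages[f"/{posted.year}"].append(number)                       # yearly archive
        pages[f"/{posted.year}/{posted.month:02d}/"].append(number)   # monthly archive
        for category in categories:
            pages[f"/{category}/"].append(number)                     # category page
    return dict(pages)


# Example data; the dates are invented for illustration.
example_posts = [
    (691, date(2003, 3, 30), ["politics"]),
    (692, date(2003, 4, 2), ["linkage", "freaky"]),
]
pages = build_url_map(example_posts)
print(pages["/2003/04/"], pages["/politics/"])
```

A real rebuild would then render one static page per key in this map; the rendering itself is not shown.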
Now, one thing that I would seem to be giving up with this methodology is a web-based posting system. That's not actually the case, however. It would be a simple matter to write a script that would allow me to do the exact same thing from a web browser. I'm never actually directly writing the files; I'm writing to a temporary file that the command-line weblog script then picks up and files appropriately. That makes it easy to have an arbitrary input mechanism. One other thing I've thought of is having a simple email gateway. I would basically just send an email to a magic address that triggers the script. The body of the email would look just like the text file outlined above. I'm leaning toward the email method, as wherever I have web access I'll have email access, but it's not necessarily true that I will have web access wherever I have email access.
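An email gateway along these lines could be quite small. The sketch below, in Python for illustration only, pulls the plain-text body out of a raw message using the standard library's `email` package; that body would then be handed to the same script that handles posts from the editor. The function name `post_text_from_email` and the example addresses are hypothetical, and how the mail system would deliver the message to the script is not covered.

```python
from email import message_from_string


def post_text_from_email(raw_message):
    """Return the first text/plain body of an email as a string.

    The returned text is expected to be in the same format as a post
    file (headers, blank line, subject, blank line, body).
    """
    message = message_from_string(raw_message)
    # walk() yields the message itself for single-part mail and every
    # sub-part for multipart mail, so one loop covers both cases.
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, transfer-decoded
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    raise ValueError("no text/plain part found in message")


# Example message; the addresses are placeholders.
raw = (
    "From: me@example.org\n"
    "To: weblog@example.org\n"
    "Subject: ignored by this sketch\n"
    "\n"
    "Category: linkage, freaky\n"
    "Cite: Fark\n"
    "\n"
    "scariest mugshot ever\n"
    "\n"
    "From The Smoking Gun, I give you the Scariest. Mugshot. EVER.\n"
)
text = post_text_from_email(raw)
print(text.splitlines()[0])
```

The sketch ignores the email's own Subject header, since the post's subject lives inside the body; anything reachable by email would also need some check on who is allowed to post, which is left out here.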
Making each post relatively self-contained makes backups easy as well. The script would file each post in its own file in some directory. To back up, I'd just copy the directory somewhere. If I wanted to change this post, I would just edit ~ketan/weblog/692.txt and rebuild the site. I can also easily integrate a spell checker, since I'll just be using a standard UNIX text editor. I'll also be able to easily integrate arbitrary HTML, which I cannot do right now (for example, I can't show the raw HTML in the mock post above because I have an overzealous HTML entity encoder, i.e., a bug).
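The filing step could look something like the following sketch, in Python for illustration only. It assumes posts are stored as `<number>.txt` in a single directory, as the `692.txt` example suggests, and that the next number is simply one more than the highest existing one. The function names `next_post_number` and `file_post` are made up for the sketch.

```python
import tempfile
from pathlib import Path


def next_post_number(posts_dir):
    """Return one more than the highest numbered .txt file in posts_dir."""
    numbers = [
        int(path.stem)
        for path in Path(posts_dir).glob("*.txt")
        if path.stem.isdigit()
    ]
    return max(numbers, default=0) + 1


def file_post(text, posts_dir):
    """Write text to the next sequentially numbered file and return its path."""
    posts_dir = Path(posts_dir)
    posts_dir.mkdir(parents=True, exist_ok=True)
    path = posts_dir / f"{next_post_number(posts_dir)}.txt"
    # Mode "x" raises an error instead of overwriting an existing post.
    with open(path, "x", encoding="utf-8") as handle:
        handle.write(text)
    return path


# Demonstration in a scratch directory rather than a real weblog directory.
with tempfile.TemporaryDirectory() as scratch:
    first = file_post("first post\n", scratch)
    second = file_post("second post\n", scratch)
    print(first.name, second.name)  # 1.txt 2.txt
```

With this layout, a backup really is just a copy of the directory (for instance with `shutil.copytree`), and editing an old post means editing its numbered file and rebuilding.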
Basically, I didn't spend too much time thinking about how much effort my current system would be on a day-to-day basis. Now that I've had 18 months to get frustrated by it, I've come up with a way that is far more streamlined for daily use and will be a lot easier to maintain. I don't intend to do any fancy templating at this point; I expect the script code will be intermingled with the logic code. I went overboard with the idea last time around, and I'm still dealing with the bother that caused. The new idea is to default to the simplest way of doing things and not to overthink what I might want to do 4 years down the road.
As a way to make the project both more interesting and more intimidating, I'm thinking of using this as an opportunity to dabble in the Ruby scripting language. I've read a lot of good things about it, and this seems like just the right-sized project to give it a whirl.
Now, when will this actually happen? I dunno. Depends on when I get the time. Usually, by the end of the work week, I want to rest from programming for a little while. It doesn't really matter, though, as to you, the site should look basically the same. URLs might be a little neater, and the pages will load much faster, but other than that, it'll be invisible. One thing that I haven't really thought of yet is what I will do with my image gallery code. One of the reasons I don't have much up there is that it's unnecessarily clunky in the same way the weblog itself is (another reason is that I don't take many pictures). I'll probably end up doing something similar in concept to clean that up, but that remains to be seen.