All posts by John Goerzen

Apache vs. lighttpd

I’ve had a somewhat recurring problem that Apache on my server is a memory hog. The machine has a full GB of RAM now, but even so, heavy activity from spidering bots or reddit can bring it to its knees.

I’ve been trying to figure out what to do about it. I run Apache on there, and here’s an idea of what it hosts:

  • This blog, and two others like it, using WordPress
  • Several sites that are nothing but static files
  • gitweb
  • A redmine (ruby on rails) site
  • Several sites running MoinMoin
  • A set of tens of thousands of redirects from mailing list archives I used to host over to gmane (this had negligible impact on resource use when I added it)
  • A few other smaller PHP apps

Due to PHP limitations, mod_php will only work with the Apache prefork or itk MPMs. That constrains the entire rest of the system to those particular MPMs. Other than PHP, most of the rest of these sites are running using FastCGI — that includes redmine and MoinMoin, although gitweb is a plain CGI. Many of these sites, such as this blog, have changed underlying software over the years, and I use mod_rewrite to issue permanent redirects from old URLs to their corresponding new ones as much as possible.

I do have two IPs allocated to the server, and so I can run multiple webservers if desired.

lighttpd is a lightweight webserver. I’ve browsed their documentation some, and I’m having a lot of trouble figuring out whether it would help me any. The way I see it, I have four options:

  1. Keep everything as it is now
  2. Stick with Apache, discard mod_php and use FastCGI for PHP, and pick some better MPM (whatever that might be)
  3. Keep Apache for PHP and move the rest to lighttpd
  4. Move everything to lighttpd, with FastCGI for PHP
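
Option 2, for reference, would look something like this with mod_fcgid (a sketch only, untested; the directive spellings and the php-cgi path are assumptions and vary by mod_fcgid version and distro):

```apache
# Serve PHP through FastCGI instead of mod_php, freeing Apache to use
# a threaded MPM such as worker.
<IfModule mod_fcgid.c>
    AddHandler fcgid-script .php
    # Hand .php requests to the php-cgi binary (path varies by distro):
    FCGIWrapper /usr/bin/php-cgi .php
</IfModule>
```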

I know Apache pretty well, and lighttpd not at all, so if all else is equal, I’d want to stick with Apache. But I’m certainly not above trying something else.

One other wrinkle is that right now everything runs as www-data — PHP, MoinMoin, static sites, everything. That makes me nervous, and I’d like to run some sites under different users for security. I’m not sure if Apache or lighttpd is better for that.

If anybody has thoughts, I’m all ears.

Ahh, updates…

It’s been a while since I’ve made a blog post, and it will probably soon be evident why.

For myself, I am taking two college classes — philosophy and gerontology — this semester, and still working full time. That’s busy right there. I’m really enjoying them, and in particular enjoying the philosophy class. I will, at long last, graduate this December.

Oliver is about 3 months old now. He’s starting to smile at us more, make his cute little baby noises more, and is very interested in taking in all of his surroundings. He has his opinions about things, but isn’t expressing them too loudly just yet.

Jacob, on the other hand, is sometimes.

Our person from Parents As Teachers was talking to him for his 3-year-old evaluation. She said, “Jacob, are you a boy or a girl?” “I a kitty.” “Are you a boy kitty or a girl kitty?” “I just a PLAIN kitty. Meow.”

He seems to delight in catching someone saying or doing something in a manner he considers wrong. “No, not like THAT!” is heard a lot in our house these days. Or perhaps, “No, not RAILROAD tracks. They TRAIN tracks, dad!”

Jacob’s imagination is very active. Sometimes if it is time to go use the potty, he will insist that we all stop because a freight train is going through the kitchen, the crossing guard lights are flashing, so we have to STOP. He pretends to be kitties, runaway bunnies (he has a book called Runaway Bunny), and occasionally other things.

MoinMoin as a Personal Wiki, Zen To Done, And A Bit of Ikiwiki

Since I evaluated and complained about wikis last year, I’ve been using MoinMoin for two sites: a public one and a personal one.

The personal site has notes on various projects, and my task lists. I’ve been starting out with the Zen To Done (ebook, PDF, paper) idea. It sounds great, by the way; a nice improvement on the better-known GTD.

My To Do Page

Anyhow, in MoinMoin, I have a ToDos page. At the top are links to pages with different tasks: personal, work, yard, etc. Below that are the three “big rocks” (as ZTD calls them): the three main goals for the day. I edit that section every day.

The Calendar

And below that, I use MoinMoin’s excellent MonthCalendar macro. I have three calendars in a row: this month, next month, and last month. Each day on the calendar is a link to a wiki page; for instance, ToDos/Calendar/2009-10-01. The day has a red background if the wiki page exists, and white otherwise. So when I need to do something on or by a specific day, I click on the link, click my TaskTemplate, and create a simple wiki page. When I complete all the tasks for that day, I delete that day’s wiki page (and can note what I did as the log message if I like). Very slick.
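
The calendar row is just three MonthCalendar calls anchored at the same base page: this month, next month, and last month, in that order (argument order here is from memory; check your MoinMoin version’s MonthCalendar help page for the exact parameters):

```
<<MonthCalendar(ToDos/Calendar)>>
<<MonthCalendar(ToDos/Calendar,,,+1)>>
<<MonthCalendar(ToDos/Calendar,,,-1)>>
```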

The Task Lists

My task pages are similar. They look like this:


= Personal =

<<NewPage(TaskTemplate,Create new task,@SELF)>>

<<Navigation(children,1)>>
<<BR>>

So, my personal task page has a heading, then an input form with a text box and a button that says “Create new task.” Type something in, and that becomes the name for a wiki page, and you’re taken to the editor to describe it. Below the button is a list of all the sub-pages under the Personal page, which represent the tasks. When a task is done, I delete the page and off the list it goes. I can move items from one list to another by renaming the page. It works very, very nicely.

Collecting

Part of both ZTD and GTD is that it must be very easy to get your thoughts down. The idea is that if you have to think, “I’ve got to remember this,” then you’ll be stressed and worried about the things you might be forgetting. I have a “Collecting” page, like the Personal or Work pages, that new items appear on when I’m not editing my wiki. They get there by email.

MoinMoin has a nice email system. I’ve set up a secret email address. Mail sent there goes directly into MoinMoin. It does some checks on it, then looks at a combination of the From and Subject lines to decide what to do with it. If I name an existing page, it will append my message to the end. If it’s a new page, it’ll create it. I have it set up so that it takes the subject line as a page name to create/append to under ToDos/Collecting/$subject (by putting that as the “name” on the To line).

So, on my computers, I have a “newtodo” script that invokes mail(1), asks for a subject, and optionally lets me supply a body. Quick and painless.
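
A minimal sketch of such a script (the function name and the address are my own placeholders to illustrate; substitute your wiki’s secret address):

```shell
#!/bin/sh
# newtodo: mail a quick thought to the wiki's collecting address.
# The address is a placeholder; use the secret one your wiki accepts.
TODO_ADDR="todo-collect@example.com"

# Usage: newtodo SUBJECT [BODY...]
# The subject becomes the new page name under ToDos/Collecting;
# any remaining arguments become the body of the note.
newtodo() {
    subject="$1"
    shift
    printf '%s\n' "$*" | mail -s "$subject" "$TODO_ADDR"
}
```

For example, `newtodo "Call the plumber" "before Friday"` files a new note without opening a browser.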

Also, I’ve added the address to my mobile phone’s address book. That way I don’t have to carry around pen and paper. Need to get down some thought? No problem. Hit send email, pull up the last address sent to, give it a subject and maybe a body. Very slick.

Wiki Software

As a way of updating my posts from last year: I’ve been very happy with MoinMoin overall. It has some oddities, and the biggest one that concerns me is its attachment support. It doesn’t let you specify a maximum upload size, and doesn’t make it easy to restrict attachment operations to only certain people. But the biggest problem is that it doesn’t track history on attachments. If a vandal deletes the attachment on a page, it’s GONE. They expect to have that fixed in 2.0, coming out in approximately November 2010.

I also looked at Ikiwiki carefully over the past few days. Several things impressed me. First, everything can be in git. This makes for a very nice offline mode, better than Moin’s offline sync. The comment module is nicer than anything in Moin, and the tagging system is as well. Ikiwiki truly could make a nice blog, and Moin just couldn’t. It also puts backlinks at the bottom of each page automatically, a nice feature. And it’s by Joey Hess, who writes very solid software.

There are also some drawbacks. Chief among them is that ikiwiki has no built-in page history feature. Click History and it expects to take you to gitweb or ViewVC or some such tool. That means that reverting a page requires either git access or cutting and pasting. That’s fine for me, but suddenly throwing newbies at gitweb might not be the easiest introduction. Since ikiwiki is a (very smart) wiki compiler, its permission system is a lot less powerful than Moin’s, and notably can’t control read access to pages at all. If you need to do that, you’d have to do it at the webserver level. It does have a calendar, but not one that works like Moin’s, though I could probably write one easily enough based on what’s there.

A few other minor nits: the email receiving feature is not as versatile as Moin’s, you can’t subscribe to get email notifications on certain pages (RSS feeds only, which would have to be manually tweaked later), and you can’t easily modify the links at the top of each page or create personal bookmarks.

Ikiwiki looks like an excellent tool, but just not quite the right fit for my needs right at the moment. I’ve also started to look at DokuWiki a bit. I was initially scared off by all the plugins I’d have to use, but it does look like nice software.

I also re-visited MediaWiki, and once again concluded that it is way too complicated for its own good. There are something like a dozen calendar plugins for it, some of which are even thought to work. The one that looked like the one I’d use had a 7-step (2-page) installation process that involved manually running SQL commands and cutting and pasting some obscure HTML code with macros in it. No thanks.

How To Record High-Definition MythTV Files to DVD or Blu-Ray

I’ve long had a problem. Back on January 20, I took the day off work to watch the inauguration of Barack Obama. I saved the HD video recordings MythTV made of the day (off the ATSC broadcast), intending to eventually save them somehow. I hadn’t quite figured out how until recently, so there they sat: 9 hours of video taking up about 60GB of space on my disk.

MythTV includes a program called mytharchive that will helpfully transcode your files and burn a DVD from them. But it helpfully will transcode the beautiful 1920x1080i down to DVD resolution of — on a good day — 720×480. I couldn’t have that.

Background

My playback devices might be a PC, the MythTV box, or the PlayStation 3. Of these, I figured the PS3 was going to be hardest to accommodate.

ATSC (HD) broadcasts in the United States are an MPEG Transport Stream (TS). Things are a bit complicated, because there may be errors in the TS due to reception problems, and the resolution and aspect ratio may change multiple times (for instance, down to SD for certain commercials). And, I learned that some ATSC broadcasts are actually 1920×1088, because the vertical resolution has to be a multiple of 16 (the MPEG-2 macroblock size), and those bottom 8 pixels shouldn’t be displayed.

Adding to the complexity, one file was 7 hours of video and about 50GB by itself. I was going to have to do quite a bit of splitting to get it onto 4.7GB DVD+Rs. I also didn’t want to re-encode any video, both for quality and for time reasons.

Attempts

So, I set out to try and figure out how to do this. My first approach was the sledgehammer one: split(1). split takes a large file and splits it on byte or line boundaries. It has no knowledge of MPEG files, so it may split them in the middle of frames. I figured that, worst case, I could always use cat to reassemble the file later.

Surprisingly, both mplayer and xine could play back any of these files, but the PS3 would only play back the first part. I remembered this as an option if all else failed.
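
The sledgehammer approach, shrunk to toy sizes for illustration (for DVD+R-sized pieces you would use something like `split -b 4400m`):

```shell
# Make a 3 MiB stand-in for a recording, split it into 1 MiB chunks
# with no regard for MPEG frame boundaries, then verify that cat
# reassembles it byte-for-byte.
dd if=/dev/urandom of=recording.ts bs=1M count=3 2>/dev/null
split -b 1M recording.ts part-
cat part-* > rejoined.ts
cmp recording.ts rejoined.ts && echo "files are identical"
```

split names the pieces part-aa, part-ab, and so on, so the shell glob hands them back to cat in the right order.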

Next, I tried avidemux. Quite the capable program — and I thought I could use it to cut my file into DVD-sized bits. But I couldn’t get it to let me copy the valid MPEG TS into another MPEG TS — it kept complaining of incompatible formats, but wouldn’t tell me in what way they were incompatible. I could get it to transcode to MPEG4, and produce a result that worked on the PS3, but that wasn’t really what I was after.

Then, I tried mpgsplit. Didn’t recognize the MPEG TS as a valid file, and even when I used a different tool to convert to MPEG PS, acted all suspicious as if it bought the MPEG from a shady character on a grungy street corner.

dvbcut

I eventually wound up using dvbcut to split up the ATSC (DVB) recordings. It understood the files natively and did exactly what I wanted. Well, *almost* exactly what I wanted. It has no command-line interface and didn’t have a way to split by filesize, but I calculated that about 35 minutes of the NBC broadcast and 56 minutes of the PBS broadcast could fit on a single DVD+R.
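
Those running-time figures are just arithmetic on the stream’s average bitrate. A back-of-envelope version, assuming a roughly 19 Mbit/s full-HD ATSC stream (an assumption on my part; the PBS stream averaged lower, which is why more of it fit):

```shell
BYTES=4700000000   # nominal DVD+R capacity in bytes
MBITS=19           # assumed average stream bitrate, Mbit/s
SECS=$(( BYTES / (MBITS * 1000000 / 8) ))
echo "about $(( SECS / 60 )) minutes per disc"
```

That comes out to about 32 minutes per disc, in the neighborhood of the 35-minute figure for the NBC broadcast.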

It worked very, very nicely. The resulting files tested out well on both the PS3 and the Linux box.

So after that, I wrote up an index.txt file to add to each disc; a little shell scripting later, I had a directory for each disc. I started burning them with growisofs. Discs 1 and 2 burned well, but then I got an error like this:


File disc06/VIDEO/0930-inaug-ksnw-06.mpg is larger than 4GiB-1.
-allow-limited-size was not specified. There is no way do represent this file size. Aborting.

Eeeeepp. So apparently the ISO 9660 filesystem can’t represent files bigger than 4GB. My files on disc 1 had represented multiple different programs, and stayed under that limit; and disc 2’s file was surprisingly just a few KB short. But the rest of them weren’t. I didn’t want to have to go back and re-split the data to be under 4GB. I also didn’t want to waste 700MB per disc, or to have to make someone change video files every 15 minutes.
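
Catching this before burning is easy: ISO 9660’s single-file limit is 4GiB-1, or 4294967295 bytes, so each staging directory can be checked with find (GNU find and coreutils assumed; demonstrated here on sparse stand-in files):

```shell
# Build a stand-in staging directory: one file over the limit, one under.
mkdir -p disc06
truncate -s 5G disc06/too-big.mpg     # sparse; takes no real disk space
truncate -s 1M disc06/index.txt
# List every file larger than 4GiB-1, the ISO 9660 single-file limit:
find disc06 -type f -size +4294967295c
```

Only too-big.mpg is listed; anything that shows up would need re-splitting or a different filesystem.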

So I decided to investigate UDF, the filesystem behind Blu-Ray discs. mkisofs couldn’t make a pure UDF filesystem, only a hybrid 9660/UDF disc that risked compatibility issues with big files. There is mkudffs, but it can’t populate the filesystem from a directory on its own. So I wrote a script to do it. Note that this script may fail with dotfiles or files with spaces in them:

#!/bin/bash

set -e

if [ ! -d "$1" -o -e "$2" -o -z "$2" ]; then
   echo "Syntax: $0 srcdir destimage [volid]"
   echo "destimage must not exist"
   exit 5
fi

if [ "`id -u`" != "0" ]; then
   echo "This program must run as root."
   exit 1
fi

EXTRAARGS=""
if [ ! -z "$3" ]; then
   EXTRAARGS="--vid=$3"
fi

# Get capacities at http://en.wikipedia.org/wiki/DVD+R as of 9/27/2009

SECSIZE=2048

# I'm going to set it a few lower than the capacity of 2295104, just in case.
# Must be at least one lower than the actual size for dd to do its thing.
# SECTORS=2295103
SECTORS=2295000

echo "Allocating image..."

dd if=/dev/zero "of=$2" bs=$SECSIZE "seek=$SECTORS" count=1

echo "Creating filesystem..."

mkudffs --blocksize=2048 $EXTRAARGS "$2"
mkdir "$2.mnt"

echo "Populating..."
mount -o rw,loop -t udf "$2" "$2.mnt"
cp -rvi "$1/"* "$2.mnt/"
echo "Unmounting..."
umount "$2.mnt"
rmdir "$2.mnt"

echo "Done."

That was loosely based on a script I found on the Arch Linux site. But I didn’t like the original script, because it tried to do too much, wasted tons of time writing out 4.7GB of NULLs when it could have created a sparse file in an instant, and was interactive.
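
The sparse-file trick is worth a closer look: seeking past the end and writing a single sector produces a full-sized image instantly, with almost no disk space actually allocated until real data is copied in. A standalone demonstration (GNU coreutils assumed):

```shell
SECTORS=2295000
# Seek past the end and write one 2048-byte sector; the result is a
# ~4.7 GB sparse file created in a fraction of a second.
dd if=/dev/zero of=disc.img bs=2048 seek=$SECTORS count=1 2>/dev/null
ls -lh disc.img   # apparent size: ~4.7 GB
du -h disc.img    # blocks actually allocated: a few KB at most
```

Compare the ls and du output to see the difference between the file's apparent size and the space it really occupies.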

So there you have it. HD broadcast to PS3-playable discs, losslessly. Note that this info will work equally well if you have a Blu-ray drive.

Town Hall Questions

Sen. Brownback (R-KS) will be at my employer Monday, and will have a short town hall session. I’m debating whether to go or not, and whether to say anything or not. I don’t agree with him on much, and highly doubt that I’d change his mind on anything. Should I go? If so, what should I say?

Here are some random facts I’m considering mentioning:

  • When Jacob was born 3 years ago, it cost us $250. When Oliver was born this summer, it cost us around $3000, and Oliver’s wasn’t a more complicated pregnancy.
  • Most of the past 8 years, my insurance premiums have gone up and my benefits have gone down.
  • If I lost my coverage, I couldn’t afford to buy it on my own, even if I still had my job. Insuring our family on the individual market would cost more each month than our mortgage.
  • Almost as much (80%) of my paycheck goes to health care as to federal taxes. I’d gladly take a tax increase if it slowed growth in health care costs.
  • Brownback says it’s too expensive to do comprehensive reform right now. The estimated cost of reform is $900 billion over 10 years, and it will be paid for.
  • Brownback voted for the Bush tax cuts, which cost $2.5 trillion over 10 years; yes on attacking Iraq ($700 billion so far); and yes on Medicare prescription drug coverage ($400 billion). He didn’t vote for a way to pay for any of these.

Suggestions?

One final note: I will not be doing anything disruptive or disrespectful.

Late Summer

It’s that time of the year again. Everything is changing, and maybe for the better.

The days are getting shorter. When I left for work on my bicycle yesterday morning, it was still dark outside, and a little nippy. There’s nothing quite like riding a bicycle down a deserted country road at night, a cool breeze at your back, and having the sun come up as you ride.

And with the cooler weather, we can open our windows at night instead of running the air conditioner. It’s also nice to have a pleasant cool breeze flowing through the house, and hear the frogs, crickets, owls, coyotes, and other wildlife at night. Out here, we certainly don’t hear sounds of traffic, or loud car radios, though on a really clear night we might hear the rumble and whistle of a train from a few miles away.

This is also sunflower season in Kansas. The wild sunflowers, with their smaller-than-most-people-think flowers, grow everywhere. Some ditches turn into a sea of person-height yellow. Sunflowers are on the sides of bridges, around people’s mailboxes — just about anywhere that isn’t mowed or farmed. Then you also pass the sunflower fields, with their larger flowers, even more sea-like.

The beans are getting tall in the fields this time of year, and it won’t be long before the milo starts to turn its deep, dark reddish brown.

But, you know, we’re Kansans. We can’t really let ourselves enjoy it all that much. Just today, I heard a conversation — apparently the Farmer’s Almanac is predicting a brutally frigid winter this year. Gotta keep our sense of pessimism about weather alive now…

Google Groups Fail

Last month, I wrote that I was looking for mailing list hosting. It looks like some of the lists I host will move to Alioth, and some to Google Groups.

Google Groups has a nice newbie-friendly interface with the ability to read a group as a forum or as a mailing list. Also, they don’t have criteria about the subject matter of a group, so the Linux users’ group list I host could be there but not at Alioth.

So I set up a Google Group for the LUG. I grabbed the subscriber list from Ecartis, and went to “Directly Add” the members. This was roughly August 12.

After doing so, I got a message saying that Google needs to review these requests to make sure they’re legit, and will get back to me in 1-2 days. OK, that’s reasonable.

Problem is, nobody appears to be reviewing them. It’s now three weeks later, and there has been no action.

So I decided I would ask someone at Google about it. The only way they offer to do that is to post in the Google Groups Help Forum. So I did. Guess what? They ignore that, too.

Let me say: relying on this sort of service for something important really makes me think twice. It makes me nervous about using Google Voice (what if my Google Voice number goes down?) It certainly makes me think twice about ever relying on Gmail or the other Apps for anything important.

My own mail server may not have the features that theirs does, but if it breaks, I don’t have to worry about whether anybody even cares to fix it.

Looking for a Linux-compatible scanner

I own two scanners: a Fujitsu ScanSnap S510 and an Epson Perfection 4180 Photo. The S510 is a sheetfed document scanner, and works great at that. It has perfect SANE support, duplex mode, excellent sheet feeder, and is a generally good document scanner. It’s passable for photos, but only that. The color isn’t great, and the precision of a sheetfed scanner just isn’t up to a flatbed.

The Epson has never been supported by SANE directly. There was an epkowa driver, but the 4180 epkowa driver hasn’t been updated in ages and isn’t compatible with any modern Linux distro.

So, I’m looking for a flatbed scanner for photos and documents (no need for negatives; that’s the THIRD scanner on my desk.) The most important requirement is that it work well with Linux. The next most important requirement is that it have as good scanning quality as possible for an under-$200 scanner.

If it’s an all-in-one (with a printer), I’m fine with that, as long as it meets the above requirements. Any suggestions?

Preschool Update

Tuesday was open house at the preschool Jacob will start attending when he turns 3. It’s the same preschool I went to, and the only one in the small town closest to us. It’s been owned and operated by the same person for all of the 30 years it’s existed.

We got out of the car when we got there, and Jacob went into full explore mode. He walked a few feet, and happily said “Ooo! There a ditch here! It has water sometimes.” We went inside, and he looked around for about 5 seconds, then was off. It didn’t take him long to find the toy train, the bulldozer, and the school’s bird. He was so busy, in fact, that he didn’t even notice the snacks that were set out until I pointed them out to him.

It’s hard to describe the building. You walk in and you immediately understand that the owner knows a lot about kids, and about providing them a creative, interesting, and educational environment. In one corner, there’s a record player that’s probably older than the preschool, and a CD player right above it. Then there’s a little reading nook with some books, headphones, and a cassette player. I’m sure Jacob will enjoy learning about cassettes and records, which we don’t really have out at home anymore.

The owner’s husband built most of the furniture for the preschool himself 30 years ago. He figured it would last maybe 10 years, but it is still holding up well and gets a new coat of paint every few years. So Jacob will be sitting on the same chairs I used to sit on.

Most importantly, there is no TV to be seen anywhere in the preschool. No computer in there, either. But there is a large outdoor playground — the fun type that I used to enjoy as a child, not the boring plastic type that I see so much these days.

It reminds me a lot of Mr. Rogers. There isn’t expensive technology or a fancy building, but good old-fashioned creative play. Just what children need. I’ll be very happy that Jacob will be there, and I think he will be too.

Resurrecting Old VHS Videos (and Panasonic DMR-EZ38VK Review)

I have a problem that I’m sure is pretty common. My parents used to rent a VHS camcorder from time to time. Not only that, but various school plays, musicals, etc. are on VHS tapes. As a result, they and I have a library of family memories on VHS. And it appears those tapes go as far back as 1987.

You might imagine there are several problems here. One is that VHS tapes degrade over time. Those that were recorded in EP mode (6 hours on a T-120 tape) are especially prone to this. I’ve been worried about how well those 22-year-old tapes will perform even now.

Another problem is that VHS tapes are getting hard to watch these days. We own a VCR, but it’s probably been 7 years since it was hooked up to anything on a regular basis.

So I have meant for some time to convert these old VHS recordings to DVD format. My initial plan was to use the PVR-250 hardware MPEG-2 encoder card that is used with MythTV to do that. But it’s in the basement, used with MythTV, and would generally be a hassle. As a result, I’ve been “meaning to do” this project for about 5 years, and haven’t.

Last night, I found that tape from 1987. It has a few priceless seconds of my grandpa Klassen on it — he passed away in 1990.

The Panasonic DMR-EZ38VK

I initially set out looking for a dedicated DVD recorder with an S-video input, but wound up buying one with an integrated VHS deck as well: the Panasonic DMR-EZ38VK.

I started with a DVD recorder review on CNet. I was primarily interested in video quality. Surprisingly, it seems there is significant difference in video quality among DVD recorders, which was what led me to the Panasonic line.

I was initially planning on a DMR-EA18K or DMR-EZ18K (the difference is whether or not they include a TV tuner). I was having trouble finding them in stock at the vendors I normally use, and wound up with the DMR-EZ38VK instead. B&H had an open-box demo unit at a special discount, so I snapped it up.

Video Quality

I’ve been recording most items to DVD in “SP” mode, which stores 2h per single-layer DVD. I’d concur with CNet: this produces spectacular results. I don’t think I’ve noticed any MPEG compression artifacts at all in this mode.

Some items, such as TV programs or home recordings with little motion, I’ve recorded in “LP” mode. This mode stores 4 hours on a single-layer DVD. It’s also surprisingly good, considering the amount of compression needed. I have noticed MPEG artifacts in that mode, though not to an extremely annoying degree.

The copying process

I start by popping an empty disc in the drive. Then I’ll put in the VHS tape and position it to the place where I want it to start copying. Then I hit Functions -> Copy -> VHS to DVD -> without finalizing, and away it goes. It automatically detects end-of-tape and helpfully won’t copy 6 hours of static.

When a tape is done copying, you can copy from more tapes to the disc, eject it and finalize it later, or work with it.

When I’m ready to finish a disc, I’ll go and change the “disc name”, which is what shows up at the top of the disc menu that the unit generates. If I feel ambitious, I might change the names of individual titles as well. But all of this has to be done with an on-screen keyboard, and thus takes a while, so I usually don’t. Finalizing commits the menu to disc and fixates it, and takes about a minute.

Track Detection

This feature is both a blessing and a curse.

The Panasonic recorder can often detect the breaks between recordings on a VHS tape. Newer VCRs mark these explicitly, but it can detect them with reasonable accuracy even on tapes from older camcorders.

When it detects this, it creates a new title on the DVD. This takes a few seconds, so it also rewinds the VHS tape a few seconds, then starts copying again.

Unfortunately, if you’re just wanting to watch one long recording all the way through, this results in a few seconds being duplicated right before each scene transition, which is rather jarring. There is no way to disable this feature, either. The only workaround is to read from an external VCR. But if you do that, you lose the end-of-tape detection.

Generally I’ve decided to just live with it for now. It’s a cheap price to pay for an otherwise pretty good workflow.

Other annoyances

While copying, you can’t access the position indicators for either the VHS deck or the DVD recorder. So you don’t know how far along on the tape you are, or how much space is left on the DVD, until copying stops.

Also, it would be very nice to be able to tell it “copy 23 minutes and 15 seconds from VHS to DVD” when you know you don’t want to copy the whole tape.

The unit also has SD and USB ports for reading digital video. Frustratingly, a USB keyboard can’t be used to edit disc or track titles. That seems like an obvious and cheap feature to have.

Overall

Overall I am happy with the unit. It produces very good quality results, and is pretty easy to use. I don’t think I’d pick a different one if I had to do it again. But it could be made better for people who are copying large numbers of VHS tapes to DVD.

Generally, though, I can just start the copy and let it sit for a couple of hours, trusting it to do the reasonable thing with a tape. That’s convenient enough that I can get other things done while it’s copying, and takes little enough of my time that I’m actually working through stacks of tapes now.

Update 8/27 I have now tried playing some discs from this unit back on my PS3 connected to a 1080p HDTV. On that setup, compression artifacts are noticeable at the 2hr setting, and more so at the 4hr setting. I don’t think they are necessarily any more noticeable than on any other home-produced DVD, though, especially on the SP setting. They had not been very visible on SD equipment.