I've always loathed captchas, and they've gotten progressively worse. But I've noticed some blogs seem to do reasonably well with just a single static word, presumably because most spammers won't customize things for a single blog unless it gets a lot of traffic.
I also have a soft spot for an idea that I believe originated with various schemes to reduce e-mail spam:
Make the client pay
"Pay" doesn't have to mean money. Paying by carrying out a computation is just as good. The overall idea is to either make spamming you "too expensive" or at least sufficiently more expensive to make you a less attractive (and hopefully a money losing) target.
So since I've had to delete 50-100 comment spams every day for the last few weeks, I got fed up and did the following as a first step. It doesn't yet add a computational cost for the spammers, but it makes it trivial for me to make it more complex, and if I start seeing comment spam again, I'll ramp it up immediately:
First I added a "script" tag to the comment form. All it contains is this: document.write("");
Then I added a check for the "captcha" in my controller.
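A server-side check along these lines could be sketched as follows. This is a minimal Python illustration, not the controller code from this blog: the field name `captcha` comes from the description above, while `EXPECTED_VALUE` and the function name `passes_captcha` are made up for the example.

```python
import hmac

# Hypothetical placeholders: not the actual field value used on this blog.
CAPTCHA_FIELD = "captcha"
EXPECTED_VALUE = "some-static-word"


def passes_captcha(form: dict) -> bool:
    """Return True if the form carries the value the script was meant to write."""
    # A client that never ran the script will not have the field at all.
    submitted = form.get(CAPTCHA_FIELD, "")
    # compare_digest does a constant-time comparison of the two byte strings.
    return hmac.compare_digest(submitted.encode(), EXPECTED_VALUE.encode())
```

A comment would only be saved when `passes_captcha(form)` is true; anything else can be dropped silently.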
That's it. So far it's stopped all but one comment spammer, who from the looks of things was an inept manual one.
But this won't stop people for long, especially not if more people do it and it gets worthwhile to circumvent, so here's how I'll escalate things if I start seeing comment spam again:
Consider that for an average poster, waiting an additional second or two for a comment to successfully submit isn't a problem. Even a bit more is probably acceptable - you probably spend more than that trying to figure out captchas or even being forced to register on a regular basis.
For a comment spammer, though, if 2 seconds of CPU time is wasted per comment posted, that means a real cost. I don't know what kind of throughput these guys manage, but I know that when I worked at Edgeio we had no problems doing several million HTTP requests from a single dual or quad CPU box (can't remember which) to retrieve feeds. I'd be surprised if you couldn't churn out a million comment spams per 24 hours if nothing is stopping you.
"Costing" a comment spammer that can otherwise handle 1 million a day per core 2 seconds of time per comment translates to a reduction in rate from around 1m to about 43k per day per core, or a factor of 23 times. That means the yield for each spam needs to be 23 times as high for the spammer to break even or make any money.
So how to go about it?
Why not just use a single hash function? Simple: it can be optimized on the client side. You want to generate something randomly to force them to take the full computational cost for every comment.
In other words: If you're careful, the spammers can't know in advance whether they've been fed a "genuine" function or one aimed at keeping them stuck forever.
Of course you need to be careful not to hit genuine posters this way, or they'll quickly learn to stay away.
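To make the "pay by computation" idea concrete, here is one well-known way to do it: a hashcash-style puzzle, where the server hands out a fresh random challenge with each comment form and the client has to find a number that makes a hash come out with enough leading zero bits. This is a swapped-in technique for illustration, not the randomly generated functions discussed above, and it is written in Python only to show the logic; in practice the `solve` step would have to run as script in the browser. All names and the difficulty setting are assumptions.

```python
import hashlib
import secrets


def new_challenge() -> str:
    """Server side: a fresh random challenge for each comment form."""
    return secrets.token_hex(16)


def is_valid(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash verifies the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    value = int.from_bytes(digest, "big")
    # Valid when the top `difficulty_bits` bits of the 256-bit digest are zero.
    return value >> (256 - difficulty_bits) == 0


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force search, about 2**difficulty_bits hashes on average."""
    nonce = 0
    while not is_valid(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce
```

Because the challenge is random per form, nothing can be precomputed, and each extra bit of difficulty doubles the expected work for the client while the server still only computes one hash. The server would also need to remember which challenges it issued and accept each one only once, or a single solution could be replayed.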