July 19th, 2006



Several years ago, I came up with a way to spam-proof my email address for USENET posts. I altered the header so that the From line read "mackys@rhinokaosol.net" - then added another header that warned people to "remove any African animals."

The reasoning here is that it takes human intelligence to know the connection between the extra word in my posted email address and Africa. Hence, no automated system would be able to figure it out, because we do not have human-level AI yet. However, I apparently made the test too hard, because since then one or two actual human intelligences haven't been able to figure it out either. ;]

But the general idea of using semantics (i.e., meaning or reasoning) to foil bots is a good one. And it's now being applied to captchas - those weird number-letter images that websites use to foil automated account-creating bots (which the bad guys use for everything from automated identity theft to spam).

I present to you: KITTENAUTH!

Of course it's always possible to brute-force any finite set of images. Pay a human to go over them once, labeling each as "kitten/not kitten" (or whatever). There are two ways to make life harder for such schemes. One is to simply use more kitten and non-kitten images. It should be possible to fit ten thousand thumbnail images into about a hundred megs of space, assuming they're each 10k in size. Assuming the captcha-generator is smart enough to copy the image to a random filename and use that in the <img> tag, there will be no way for the attacker to automate the attack without doing something like computing an MD5 hash on the images presented, and comparing them with a database of hashes built by a human being. This quickly becomes cost-prohibitive for the notoriously cheapskate spammers, who loathe having to do honest work to make their money. (Which is why they've turned to spamming in the first place.)
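To make the hash-lookup attack concrete, here's a rough Python sketch. Everything in it is made up for illustration: the function names, the fake image bytes, and the two-entry "library" are my assumptions, not part of any real KittenAuth code. It shows why a random filename alone doesn't stop an attacker who has already paid a human to label the images: the filename changes on every request, but the bytes (and so the MD5 hash) don't.

```python
import hashlib
import secrets


def random_filename(ext="jpg"):
    # Hypothetical helper: a fresh, unguessable name for each served copy.
    return f"{secrets.token_hex(8)}.{ext}"


def md5_of(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def build_label_db(labeled_images):
    # The expensive part for the attacker: a human labels every image once.
    return {md5_of(data): label for data, label in labeled_images}


def attacker_guess(label_db, served_bytes):
    # The filename is ignored entirely; only the image content is hashed.
    return label_db.get(md5_of(served_bytes))


# Stand-in byte strings instead of real thumbnails (assumption for the demo).
library = [
    (b"fake-kitten-bytes-1", "kitten"),
    (b"fake-puppy-bytes-1", "not kitten"),
]
label_db = build_label_db(library)

served_name = random_filename()   # different on every request
served_bytes = library[0][0]      # same content every time
guess = attacker_guess(label_db, served_bytes)
print(served_name, "->", guess)
```

The lookup itself is cheap. The cost is all in the human labeling pass, which is exactly the part that grows with the size of the image library.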

If you want to make life even harder, you can create multiple image libraries on different subjects. Imagine the poor bot writer who faces a database of ten million distinct images from a hundred different categories (kittens, cars, i-beams, wood, bridges, etc.), and a text string telling him to only click on pictures of "cars or i-beams." As the number of combinations goes up, the amount of work to reliably crack the system increases exponentially, and the odds of any given brute-force attack being successful dwindle to insignificance. Small enough, in fact, that we can complete the loop and pay human beings to catch the last few that slip through the cracks.
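For a feel of the odds, here's a back-of-the-envelope Python sketch. The setup is my assumption, not something specified above: a grid of images where exactly some known number are "right," and a bot that picks that many at random with no idea what's in them. The function names and the 9-image, 3-target grid are just for illustration.

```python
from fractions import Fraction
from math import comb


def odds_of_blind_guess(grid_size, targets):
    # Only one of the C(grid_size, targets) possible picks is correct.
    return Fraction(1, comb(grid_size, targets))


def odds_over_rounds(grid_size, targets, rounds):
    # Independent rounds multiply, so the odds shrink fast.
    return odds_of_blind_guess(grid_size, targets) ** rounds


one_round = odds_of_blind_guess(9, 3)
two_rounds = odds_over_rounds(9, 3, 2)
print(one_round, two_rounds)  # prints "1/84 1/7056"
```

A bigger grid or an extra round makes blind guessing hopeless quickly, which leaves the attacker with the expensive human-labeling route.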

The spammers lose big, the rest of us win big. What could be better?

[Digg] Top 10 dumbest business ideas that made millions.

Create goggles for dogs and sell them online? Boy, this IS the dumbest idea for a business. How in the world did they manage to become millionaires and have shops all over the world with that one? Beyond me.


Maybe I can put up an AJAX^H^H^H^H WEB 2.0 ENHANCED site to sell bling-bling for cats and retire at 31. ;]
  • Current Music
    PT Barnum