Anti-bot Test: CAPTCHA!

Perm URL with updates: http://xahlee.org/js/captcha.html

Xah Lee, 2010-01-29

You've probably seen CAPTCHAs like this on the web:

[image: captcha]

It is a test designed to stop bots that spam websites. Software can be written to fill in web forms automatically, which means it can leave blog comments or create new web accounts. So, spammers use such software to create hundreds of comments or accounts in seconds, and leave their advertisements or malware.

To prevent that, one needs something that computers cannot do: some sort of bot test. So, you have the distorted image, which computers cannot yet recognize well.

The name CAPTCHA is supposed to be: Completely Automated Public Turing test to tell Computers and Humans Apart.

Google's reCAPTCHA

There is a free captcha service called reCAPTCHA, now owned by Google, at http://recaptcha.net/. It lets webmasters put captchas on their sites.

There is one interesting aspect of reCAPTCHA. The distorted text actually comes from the process of digitizing books. When OCR cannot read a word, that word becomes the source of a reCAPTCHA image. When humans give their answers, the data is used to statistically determine the correct one. So, reCAPTCHA serves as an anti-bot test and also helps digitize books. (OCR means Optical Character Recognition. It is software designed to recognize text in image form, kind of the opposite of a captcha.)
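The "statistically determine the correct answer" step can be pictured as a simple vote over what different people typed for the same word. The sketch below is only an illustration of that idea, not how reCAPTCHA actually does it; the function name `consensus` and the thresholds `min_votes` and `min_share` are made up for this example.

```python
from collections import Counter

def consensus(answers, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if there is no clear winner yet.

    min_votes and min_share are hypothetical thresholds chosen for illustration.
    """
    # Normalize so that "Morning " and "morning" count as the same answer.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough human answers collected yet
    word, count = Counter(cleaned).most_common(1)[0]
    # Accept the most common answer only if enough people agree on it.
    return word if count / len(cleaned) >= min_share else None

print(consensus(["morning", "Morning ", "morning", "mourning"]))
print(consensus(["upon", "upan"]))
```

With four answers where three agree, the word is accepted; with only two answers, the function waits for more people before deciding.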

Google has a blog post explaining their captcha service, at: Source. The post also features a video of Luis von Ahn explaining reCAPTCHA. According to Wikipedia, Luis turns out to be a well-recognized genius with many awards.

Though, I must say, my experience with reCAPTCHA is that it is often hard to read. Often, after several tries, I still cannot pass. Apparently many people feel the same, as you can see from their comments on Google's blog. This is a serious usability problem.

For more detail, see the Wikipedia article on reCAPTCHA.

Artificial Intelligence

Captchas are interesting in several respects. They pose a simple artificial intelligence problem: devising a scheme by which a machine can tell whether its user is human. They are also a problem in image recognition, and an interesting problem in website security.

Note that the Wikipedia article cites many research projects that have broken captchas, and also mentions alternatives such as image captchas: for example, showing you several images of animals and asking you to pick the one that's a cat. Also, captcha breakers have several methods for defeating captchas, including farms of cheap human labor.
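As a rough picture of the "pick the cat" kind of image captcha, here is a minimal sketch. Everything in it is assumed for illustration: the `IMAGES` pool, the file names, and the functions `make_challenge` and `check` are hypothetical, and a real site would need a large, changing pool of images and would keep the answer on the server.

```python
import random

# Hypothetical image pool: file name -> label (made up for this example).
IMAGES = {
    "a1.jpg": "cat", "a2.jpg": "cat",
    "b1.jpg": "dog", "b2.jpg": "horse",
    "b3.jpg": "bird", "b4.jpg": "fish",
}

def make_challenge(target="cat", choices=4, rng=random):
    """Pick one image of the target animal and mix it with other animals."""
    targets = [f for f, label in IMAGES.items() if label == target]
    others = [f for f, label in IMAGES.items() if label != target]
    answer = rng.choice(targets)
    shown = rng.sample(others, choices - 1) + [answer]
    rng.shuffle(shown)  # so the answer is not always in the same position
    return shown, answer

def check(answer, picked):
    """The server compares the visitor's pick with the stored answer."""
    return picked == answer

shown, answer = make_challenge()
print("Pick the cat:", shown)
```

A bot guessing at random passes one time in four here, which is one reason a scheme this small would be weak in practice.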

The history of the cops-and-robbers game in the computing realm is itself a fascinating story.

Overall, I really don't think captchas are a good solution to the web spam problem, at all. They are frustrating to use, waste time, and aren't that effective at preventing spam. In fact, I'm quite surprised that spam has just kept increasing, everywhere, over the past 20 years I've used the web: in my Yahoo email account, my Gmail account, my several instant message chats, my web logs, and in spam blogs that use randomized snippets of text from my website. Today I even get spam on Skype, a few times a week now. Captchas and spam are a case of tech geekers trying to solve a human problem with technology. (See: Tech Geekers vs Spammers)
