CAPTCHA cracked?


A software company called Vicarious claims to have created a computer algorithm that can solve CAPTCHA with greater than 90% accuracy.

What is CAPTCHA and why should I care?

You’ve already encountered CAPTCHAs if you’ve ever created an e-mail account with Google, set up a PayPal account, or commented on some WordPress blogs. CAPTCHAs are those wavy, distorted letters that you have to type into a box. The purpose is to prove that you are human rather than a computer-controlled “bot” making mischief on the Internet.

You should care for at least two reasons. First, CAPTCHA is a security measure used across much of the Internet to help prevent automated abuse of websites. So if it has been broken, those sites should probably start transitioning to a new security system.

But more exciting, this might be a major breakthrough in computer science. Creating machines that can see the world and make sense of images as humans do is one of the “hard problems” in artificial intelligence. Breaking CAPTCHA is a milestone on that road—if Vicarious has pulled it off.

So is it a breakthrough or not?

That depends on how they broke CAPTCHA. Previous attempts have used brittle solutions that relied on quirks of how different CAPTCHAs are implemented. For example, a CAPTCHA that just slants the letters and peppers them with dots can be solved by removing dots and then looking for recognizable letters when the image is bent in various directions. But those attacks have been easily squelched by tweaking the way that CAPTCHAs are generated. If Vicarious has merely created a new set of brittle solutions, then it is not a breakthrough. If the new algorithm does indeed solve a deeper problem in machine vision, and is indeed as good as human vision at solving any CAPTCHA-like problem, then this is breakthrough territory. That is exactly what is being claimed. In fact, Vicarious’s researchers go on to claim that their algorithm works in an analogous way to the human brain.
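To make the "brittle" style of attack concrete, here is a minimal Python sketch of the example described above: strip the isolated dots, then try un-slanting the image by several amounts and keep whichever result best matches a known letter shape. This is not Vicarious's method, whose details have not been published. The function names, the tiny 0/1 grid representation, and the pixel-overlap score are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # hypothetical representation: 1 = ink, 0 = background


def remove_specks(img: Image) -> Image:
    """Clear ink pixels that have no ink among their 8 neighbours."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            has_neighbour = any(
                img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                out[y][x] = 0
    return out


def shear(img: Image, shift_per_row: int) -> Image:
    """Slant the image: row y moves right by shift_per_row * (h - 1 - y).

    Ink pushed past the left or right edge is dropped.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        dx = shift_per_row * (h - 1 - y)
        for x in range(w):
            if img[y][x] and 0 <= x + dx < w:
                out[y][x + dx] = 1
    return out


def match_score(img: Image, template: Image) -> int:
    """Count the pixels on which the image and the template agree."""
    return sum(a == b for ri, rt in zip(img, template) for a, b in zip(ri, rt))


def best_unslant(
    img: Image, template: Image, shifts: Sequence[int] = (-1, 0, 1)
) -> Tuple[int, int]:
    """Remove specks, try each shear, and return (best score, best shift)."""
    cleaned = remove_specks(img)
    return max((match_score(shear(cleaned, s), template), s) for s in shifts)


if __name__ == "__main__":
    # A vertical bar stands in for a letter; slant it and add two stray dots.
    template = [[1 if x == 4 else 0 for x in range(9)] for _ in range(5)]
    noisy = shear(template, 1)
    noisy[0][0] = noisy[4][0] = 1
    print(best_unslant(noisy, template))
```

The sketch also shows why such attacks are easy to squelch: it only works while the noise is made of isolated dots and the distortion is a simple slant. Connect the dots into lines, or warp the letters along a curve, and both steps fail.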

Do they offer any proof?

Ah, there’s the rub. Vicarious has credibility, given the scientists working there, but its current offer of proof is little more than a press release sent out to journalists and a video. The company has released no software code and no technical explanation, and as Vicarious co-founder Dileep George said in an e-mail, “There are no current plans to write a paper, but things could change in the future.”

To be fair, you wouldn’t want Vicarious to share the code. Unleashing a CAPTCHA hack before the world has time to adopt a new security system would be disastrous. Still, the science-by-press-release has annoyed many computer scientists with whom Science talked. As one bluntly put it, “the material provided is not sufficient to back up their claims.”

And CAPTCHA creator Luis von Ahn, a computer scientist at Carnegie Mellon University in Pittsburgh, Pennsylvania, is not convinced. He sent ScienceNOW this defiant message:

This is the 50th time somebody claims this. I don’t really get how they think this is news 🙂

If their program is actually a break, we can simply add more distortion or switch to image-based CAPTCHAs.

In an e-mailed response, Vicarious co-founder George defends his company’s algorithm:

Our approach gives a general way to solve text-based captchas, because we solve the segmentation problem in a very general way. Our system is extremely distortion tolerant, the systems that you saw were trained on just a handful of images per character. He can add more distortions, but we can simply add a few more training data that captures that distortion, if it is not already captured by the existing training examples.

Yes, there are problems in image recognition we haven’t solved yet, and there can be captchas based on that. All we are claiming is that our approach fundamentally breaks all text-based captchas. You cannot hope to patch up the text-based captchas by adding more distortions or clutter. Our approach is immune to all such transformations.

What does all this have to do with the human brain?

Vicarious calls its algorithm the Recursive Cortical Network™. The reference to the human brain is built right into the name, as well as the commercial nature of this research. Whether it really has anything to do with how cortical neurons process information remains to be seen.

Breaking CAPTCHA wasn’t the goal, says Vicarious co-founder Scott Phoenix. “It was just a sanity check. We believe that higher level intelligences are all built on the somatosensory system. So that’s why we started with vision.” The company plans to hook up this visual system to robots. The benchmark then will be, for example, “Preparing a meal in an arbitrary kitchen.”

So does it really work?

Vicarious was concerned when I sent the company an e-mail describing its claim as “unsubstantiated,” so Phoenix and George offered to do a demonstration over Skype. I sent them CAPTCHAs off the Internet. They were able to solve the first two, a reCAPTCHA from Google and a CAPTCHA from a PayPal website, immediately. But the algorithm was stumped by two others. One had Cyrillic characters. “We haven’t trained our system on other languages yet,” Phoenix said. And it also failed on a CAPTCHA that used alternating patches of black and white like a chess board. In a follow-up e-mail, George gave this explanation:

So why didn’t our demo work on the checkerboard pattern? Before the image is presented to our algorithm, it has to pass through a retina+LGN kind of processing (it is a basic, common-to-all-CAPTCHAS pre-processing). In the demo, we had a specialized retina which was faster and gave a few percentage points more accuracy on reCAPTCHA; the downside being it not working well on the checkerboard pattern. (Or any pattern where some portions of the letter are black and some portions of the letter are white). It's just like putting on sunglasses being specialization we add to our eyes for going out in the sun, which causes us to see some other patterns not as well. It is not a fundamental flaw and we have tested that our system works with the checkerboard pattern as well.

(This article originally published at


Mohit Bansal (23) holds a B.Tech in Electronics and Communication Engineering from the Indian School of Mines, Dhanbad, India. He is interested in business and entrepreneurship and has published a couple of research articles. He is also associated with various NGOs. He has been with Techaloo since it was just a concept, before the site even existed. Mohit currently works at Mu Sigma as a Business Analyst.
