reCAPTCHA Roulette

To remind you, reCAPTCHA asks you to decipher two smeared words before you can register for, say, a Gmail account. One of the words is being used to test whether you are a human and not a computer. The reCAPTCHA system knows the right answer for that word and checks whether you get it right. The reCAPTCHA system doesn’t know the other word and is hoping you will help figure it out. If you get the test word right, then your answer on the unknown word is assumed to be correct and used in a massive parallel process of digitizing books. The words are randomly ordered so you cannot know which is the test word.

Once you know this, you may wonder whether you can save yourself time by just filling in the first word and hoping that one is the test word. You will be right with 50% probability. And if so, you will cut your time in half. If you are unlucky, you try again, and you keep on guessing one word until you get lucky. What is the expected time from using this strategy?

Let’s assume it takes 1 second to type in one word. If you answer both words you are sure to get through at the cost of 2 seconds of your time. If you answer one word each time, then with probability 1/2 you will pass in 1 second, with probability 1/4 you will pass in 2 seconds, with probability 1/8 you will pass in 3 seconds, etc. Then your expected time to pass is

$\sum_{t=1}^\infty \frac{t}{2^t}$

Is this more or less than 2? Answer after the jump.

D’oh!

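$\sum_{t=1}^\infty \frac{t}{2^t} =2$

This is just the mean of a geometric distribution with success probability 1/2. So the two strategies tie, and if you add in the small delay before the next pair of words appears, you are better off taking the sure thing (unless you have declining marginal opportunity cost of time).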
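For anyone who would rather check the arithmetic numerically, here is a minimal Python sketch. The function names and the `delay` parameter are made up for illustration, and it assumes what the post assumes: each try takes 1 second and hits the test word with probability 1/2, plus a fixed wait after each failed try.

```python
import random


def partial_sum(n):
    """Sum of t / 2**t for t = 1..n: the expected time truncated at n tries."""
    return sum(t / 2**t for t in range(1, n + 1))


def expected_time(delay=0.0):
    """Expected seconds for the one-word strategy.

    The expected number of tries is 2, so the expected number of failed
    tries is 1, and each failed try adds `delay` seconds of waiting.
    """
    return 2.0 + delay


def simulate(trials=100_000, delay=0.0, seed=0):
    """Monte Carlo estimate of the same quantity (seeded for repeatability)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        elapsed = 1.0                   # first try always costs 1 second
        while rng.random() >= 0.5:      # typed the unknown word: fail
            elapsed += delay + 1.0      # wait for a new pair, then try again
        total += elapsed
    return total / trials
```

Calling `partial_sum(60)` should come out at 2 to within floating-point error, and `simulate(delay=0.5)` should land close to `expected_time(0.5)`, that is, any positive delay pushes the one-word strategy above the 2 seconds of the sure thing.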

6 comments


July 16, 2009 at 2:35 pm

Alex

The folks at 4chan had exactly this problem when they were hacking the Time Magazine “Time 100” poll to spell out “MARBLECAKEALSOTHEGAME” and make the 4chan founder the #1 most influential person. They couldn’t hack recaptcha, but they figured out a really clever way to do a lot better than guessing just one of the words every time.

moot wins, Time Inc. loses

July 16, 2009 at 2:52 pm

Andrew

Or, you can just enter both words because you want the world’s books and back-issues of the New York Times to be digitized, and save yourself all of this math. I love reCAPTCHA.

July 17, 2009 at 4:23 am

Morten

Your maths are wrong. As people have finite lives, you cannot sum to infinity. Hence, the better strategy is to fill in only one word (assuming no delay between tries). Admittedly a small saving as you would sum to a pretty large number …

July 17, 2009 at 8:47 pm

Divya

As an aside, here is evidence for why captchas don’t help website owners: http://www.seomoz.org/blog/captchas-affect-on-conversion-rates

July 19, 2009 at 10:18 pm

I didn’t know this yesterday « Meanderings

[…] 17, 2009 · Leave a Comment That Ticketmaster scalper protection word scramble annoyance has a name. And is kind of cool. […]

July 28, 2009 at 2:56 pm

Anthony

Divya – are the website owners possibly better off without people who lack the patience to deal with the captcha? If all you want is eyeballs, in the short run, no. But if you want something other than that, or perhaps you want to create an environment which has certain qualities, losing some potential customers might make it attractive to others.

Of course, if you really want to monetize your website and maintain a certain level of quality, you’ll make anyone who refuses the captcha go to one of your advertiser links, so that you get paid for the clickthrough.
