CAPTCHAs are everywhere on the web now.  They are the distorted text that you are asked to identify before being allowed to register for an account.  The purpose is to prevent computer programs from gaining quick access to many accounts for nefarious purposes (spam, for example).

reCAPTCHA piggy-backs on CAPTCHA.  You are asked to identify two words.  The first is a standard CAPTCHA.  If you enter the correct word you identify yourself as a human.  The second is a word that has been optically scanned from a book that is being digitized.  It has found its way into this reCAPTCHA because the computer doing the optical character recognition was not able to identify it.  If you have identified yourself as a human via the first CAPTCHA, your answer to the second word is assumed to be correct and used in the digital transcription.  You are digitizing the book.

According to Wikipedia, 20 years of the New York Times archive have been digitized with the help of reCAPTCHA, which “provides about the equivalent of 160 books per day, or 12,000 manhours per day of free labor.”

The first reaction to this is obvious.  The labor is not free.  In fact it costs exactly 12,000 man-hours.  Lots of things can be produced with 12,000 man-hours.  Lots of leisure can be consumed in 12,000 hours.  Is digitizing the New York Times the best use of this people-time?  On top of that, the reCAPTCHA is a tax that reduces the quantity of online accounts transacted, and that is a deadweight loss.

But it is just a few seconds of your time, right?  Something about that seems to change the calculation.  I bet most people would say that they don’t mind giving away two seconds of their time.  Part of this is due to an illusion of marginal vs. total.  People are tempted to treat the act as a gift of two seconds of their time in return for a whole digitized library.  But in fact they are giving away two seconds of their time for one digitized word.

A second part of this is due to a scale illusion.  You may successfully convince said reCAPTCHAer that she is just getting a tiny fraction of the book for her two seconds, but she will probably still say that she is happy with that.  But if you ask her whether she is willing to contribute 1000 seconds for 500 words, probably not.  And, to take increasing marginal costs out of the question, if you asked her whether she thought digitizing the New York Times is worth however many thousands of woman-hours of (dispersed) uncompensated labor it takes, she again might start to see the point.
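
To put rough numbers on the marginal-vs-total and scale points, here is a back-of-the-envelope sketch.  It takes the two Wikipedia figures quoted above at face value and assumes, as this post does informally, that one deciphered word costs about two seconds.  The constant and function names are mine, purely for illustration, and the two-second figure is a guess rather than a measurement.

```python
# Back-of-the-envelope arithmetic for the reCAPTCHA "free labor" figures.
# Assumption: one deciphered word costs about two seconds (the post's rough guess).
SECONDS_PER_WORD = 2
# Figures quoted from Wikipedia in the post, taken at face value.
HOURS_PER_DAY = 12_000
BOOKS_PER_DAY = 160


def words_from_seconds(seconds, seconds_per_word=SECONDS_PER_WORD):
    """Words deciphered in a given number of seconds."""
    return seconds / seconds_per_word


def hours_per_book(hours_per_day=HOURS_PER_DAY, books_per_day=BOOKS_PER_DAY):
    """Dispersed labor, in hours, that goes into one digitized book."""
    return hours_per_day / books_per_day


def words_per_book():
    """Words per book implied by the figures and the two-second assumption."""
    words_per_day = words_from_seconds(HOURS_PER_DAY * 3600)
    return words_per_day / BOOKS_PER_DAY


if __name__ == "__main__":
    print("words for one 2-second contribution:", words_from_seconds(2))
    print("words for 1000 seconds:", words_from_seconds(1000))
    print("hours of labor per book:", hours_per_book())
    print("implied words per book:", words_per_book())
```

Under these assumptions the two seconds buy exactly one word, the 1000 seconds buy 500, and each book absorbs 75 hours of other people's time.  The implied 135,000 words per book is only as good as the two-second guess.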

But still, not everybody.  And I think there must be some sound rationale underneath this.  I would not argue that digitizing books is necessarily the highest-priority public good, but the mechanism is inherently linked to deciphering words.  True, we could require everyone who signs up at Facebook to donate 1 penny to fight global warming, but A) it is never possible to know exactly what “1 penny toward fighting global warming” means, whereas there is no way to redirect my contribution if I decipher a word.  That is not a liquid asset.  And B) two seconds of most people’s time is worth less than 1 penny (we are talking about Facebook users, remember) and we don’t have a micro-payments system in place to go down to fractions of pennies.

Perhaps what we have here is a unique opportunity to utilize a public-goods contribution mechanism that is transparent and non-manipulable and that guarantees to each contributor that he will not be free-ridden on:  everyone else is committed to the same contribution.