Wednesday, September 16, 2009
Google has announced that it is acquiring reCAPTCHA, which provides an online CAPTCHA service that thousands of websites use to prevent spam on their sign-in and registration pages.
Google says that this acquisition will give it two benefits.
First, it will improve spam prevention on many Google websites by implementing reCAPTCHA.
Second, reCAPTCHA will be used to improve the scanning process of the Google Books project, i.e., it will help improve the effectiveness of OCR (Optical Character Recognition).
After reading about the second benefit, I wondered how a CAPTCHA could help improve the scanning process.
Searching the internet gave me the answer.
I learned that the text images displayed by reCAPTCHA on sign-in and registration pages are actually scanned words that OCR could not recognize.
So reCAPTCHA collects what users type for those words and uses it to improve transcription accuracy.
According to reCAPTCHA, with "answers from millions of humans on the internet, reCAPTCHA is able to achieve over 99.5% transcription accuracy at the word level".
In other words, reCAPTCHA lets users complete a registration page only after they do a small piece of data entry work, and that work helps improve transcription accuracy.
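To make the idea concrete, here is a minimal Python sketch of how answers typed by many users could be combined into one transcription. It is only an illustration, not reCAPTCHA's actual code. It assumes each challenge pairs a control word whose text is already known with a word that OCR could not read, and that an answer counts as a vote only when the control word was typed correctly. The function names (normalize, collect_votes, transcribe) and the thresholds (min_votes, min_share) are hypothetical and chosen just for this example.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase the answer and collapse whitespace so 'Upon ' equals 'upon'."""
    return " ".join(answer.lower().split())


def collect_votes(submissions, control_word):
    """Count answers for the unknown word.

    Each submission is a (control_answer, unknown_answer) pair. Only users who
    typed the known control word correctly get a vote (an assumption of this
    sketch).
    """
    votes = Counter()
    for control_answer, unknown_answer in submissions:
        if normalize(control_answer) == normalize(control_word):
            votes[normalize(unknown_answer)] += 1
    return votes


def transcribe(votes, min_votes=3, min_share=0.6):
    """Return the most common answer if enough users agree, otherwise None.

    min_votes and min_share are illustrative thresholds, not real values.
    """
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None


# Made-up answers from five users for one scanned word.
submissions = [
    ("morning", "upon"),
    ("morning", "upon"),
    ("mornin", "open"),   # control word is wrong, so this answer is ignored
    ("Morning", "Upon"),
    ("morning", "upan"),
]

votes = collect_votes(submissions, control_word="morning")
result = transcribe(votes)
print(votes, result)
```

In this made-up data, three of the four accepted answers agree, so the word would be accepted as the transcription. A word with too few answers, or with no clear majority, stays unresolved and could be shown to more users.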
This article about transcription will help you understand the role of reCAPTCHA in the transcription process.