Sooner or later a new generation of spam protection methods will emerge to block all unwanted site visitors. The recently launched Google No CAPTCHA reCaptcha could be just such a method. This new “behaviour analysis” tool is getting more and more attention, both from site owners and from scraping engines trying to break it. Since Google does not reveal the details of its operation, we want to share what we know about the techniques this smart-analysis CAPTCHA uses to distinguish bots from humans. Let’s look inside.
How does “No CAPTCHA reCaptcha” work
- The supplied JavaScript captcha api code accumulates the cues of human activities (or its absence) on a web page even before a user (client) approaches the reCaptcha itself.
- When a user moves to and ticks the “I’m not a robot” checkbox, that behaviour triggers even more browser events. The same script catches them and sends a request with an encoded payload to the Google server, where the user’s fingerprint is recorded and cookies are stored.
- The behaviour analysis system on the Google server analyses the data provided and returns an encoded value to the client page. This value is user and time dependent.
- In case of confusion (or bot-like behavior) Google’s server will ask the client to complete an additional image-check CAPTCHA (see picture below) to further verify if the user is a bot or not.
- The encoded value carries the hidden information on whether the user was verified. Your site still needs to find out whether Google has verified that user on that page. To check, send a POST request (from your server, since the secret key must stay private) with the following parameters: the returned encoded value, the secret key and the end user’s IP (the last one is optional). Read the details on how to fetch and verify the user’s response.
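The final verification step can be sketched in Python as below. This is a minimal sketch, not an official client: the function names and the injectable `opener` argument are our own assumptions for illustration, while the endpoint and the `secret`, `response` and `remoteip` parameters follow Google’s published verification API.

```python
import json
import urllib.parse
import urllib.request

# Google's documented verification endpoint for reCaptcha.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_payload(secret_key, response_token, remote_ip=None):
    """Encode the verification parameters as a form-encoded POST body."""
    params = {"secret": secret_key, "response": response_token}
    if remote_ip:  # the end user's IP is optional
        params["remoteip"] = remote_ip
    return urllib.parse.urlencode(params).encode("ascii")


def verify_recaptcha(secret_key, response_token, remote_ip=None,
                     opener=urllib.request.urlopen):
    """Return True if Google reports the encoded value as valid.

    `opener` is a hypothetical hook (our assumption) so the function can be
    exercised without network access; by default it performs a real request.
    """
    body = build_verify_payload(secret_key, response_token, remote_ip)
    request = urllib.request.Request(VERIFY_URL, data=body)  # data => POST
    with opener(request, timeout=10) as reply:
        result = json.loads(reply.read().decode("utf-8"))
    return bool(result.get("success"))
```

Call `verify_recaptcha(secret_key, token)` from server-side code only, so that the secret key never reaches the browser.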
Cases in which a second image-check is required
- Suspicious (bot-like) behavior in the initial test. “In cases when the risk analysis engine can’t confidently predict whether a user is a human or an abusive agent, it will prompt a CAPTCHA to elicit more cues, increasing the number of security checkpoints to confirm the user is valid.” – from Google reCaptcha page.
- Timeouts are also handled by the new reCaptcha. If there is no response from the client for a while, the reCaptcha pops up an additional image-check puzzle.
- ReCaptcha application on mobile devices. The website will show you images for comparison/selection and you will be verified upon single or multiple tap(s).
Criteria of engine verification analysis
- mouse movement, its slightness and straightness
- page scrolls
- time intervals between browser events
- keystrokes
- click location history tied to user fingerprint
All these cues are stored in the browser’s cookie and processed by Google’s server to discern bots from humans; it is pretty hard for bots to mimic the browser behavior of humans. This technique is far more advanced than the old CAPTCHA spam protection methods, most of which can be solved using today’s technology.
by Google research
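To make cues such as mouse-path straightness and event timing more concrete, here is a toy illustration in Python. It is purely our own assumption of how such cues could be quantified; Google does not publish its analysis, and the function names and both metrics are hypothetical.

```python
import math
import statistics


def path_straightness(points):
    """Ratio of straight-line distance to travelled distance, from 0 to 1.

    A value of 1.0 means the pointer moved in a perfectly straight line,
    which a human hand rarely manages.
    """
    if len(points) < 2:
        return 0.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 0.0
    return math.dist(points[0], points[-1]) / travelled


def interval_spread(timestamps_ms):
    """Population standard deviation of the gaps between consecutive events.

    Perfectly regular gaps (spread 0.0) suggest scripted input.
    """
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if len(gaps) < 2:
        return 0.0
    return statistics.pstdev(gaps)
```

A real system would presumably combine many such signals rather than apply a threshold to any single one.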
Some more on the behavior captcha
Some readers are perplexed: “If the software is capable of differentiating between bots and humans before presenting CAPTCHAs, then what is the point of the CAPTCHA?”
ReCaptcha is smart. Really smart. How much CAPTCHA work users are asked to do depends on how human-like their behaviour is. If the risk assessment engine does not have enough evidence that a user is human, it adds an extra check (an image CAPTCHA) for final verification. This method should remove the usual frustration we humans feel when confronted with the traditional, heavily distorted text CAPTCHAs.
Want it? Register with Google to integrate it
At this point, I believe, many readers are eager to get this new generation CAPTCHA on their sites. Before using it, you need to register your site (proving your site ownership) with the Google reCaptcha service. On success you’ll be issued the reCaptcha credentials (a site key and a secret key). The site key is later integrated into the form with reCaptcha (follow the steps in the reCaptcha management page after signup), while the secret key is needed for the final verification by your server. A PHP library is available for integrating reCaptcha into a website.
In the following post we’ve described how to integrate it on site and make it work.
The simplest form with reCaptcha code
<script src="https://www.google.com/recaptcha/api.js"></script>
<form method="post">
  <div class="g-recaptcha" data-sitekey="[site key issued by google]"></div>
  <input value="submit" type="submit" />
</form>
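When the form above is submitted, the reCaptcha widget adds the encoded value to the POST body in a field named `g-recaptcha-response`. The sketch below shows one way to pull that value out of a URL-encoded body on the server before verifying it; the helper name is hypothetical, and a web framework would normally do this parsing for you.

```python
from urllib.parse import parse_qs

# Name of the form field the reCaptcha widget adds on submission.
RESPONSE_FIELD = "g-recaptcha-response"


def extract_recaptcha_token(form_body):
    """Return the reCaptcha value from a URL-encoded form body.

    Returns None when the field is missing or empty, which means the
    visitor submitted the form without completing the reCaptcha.
    """
    fields = parse_qs(form_body, keep_blank_values=True)
    values = fields.get(RESPONSE_FIELD, [])
    token = values[0].strip() if values else ""
    return token or None
```

A missing value can be rejected immediately, without spending a verification request on it.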
Need to break it?
In the following posts, we’ll explore some software and services that might be able to break this new CAPTCHA. So, stay tuned! If you want to help us test drive these methods, please let me know in the comments.
Conclusion
The new reCaptcha is no doubt a nice and powerful tool for protection against spam and web scraping. Google has finally created a good user experience for sites which rely on CAPTCHA. Yet, I believe, both human-labour CAPTCHA solving services and programmatic CAPTCHA solving systems will continue to fight and break this new invention in the endless human-bot competition.
Carbon SInk
My site has a login form whose button is inserted by Javascript and whose result is submitted by Javascript to an API, not directly to an http request.
Seems to me that this makes the form robot proof unless the robot can interpret Javascript.
Comments?
Igor Savinkin
The form will not be robot-proof for that reason alone. The reCaptcha takes the cues, evaluates them at the Google server and inserts the result into the form as an encoded value field. A robot can’t forge this encoded value.
Anonymous
It’s a shame, since I block javascript at most places. I especially block traffic to google. I don’t trust them, and I specifically do not trust them with data indicating my quirks involving how human I am and what I do in a browser. Forget that.
Yopi
Nice idea, but can’t a Slammer record a single successful human session and send the cookie and payload to Google many times as a bot?
Igor Savinkin
Yopi, your suggestion is not bad. If you can, you may try to code it and share it with us. Yet, as far as I know, Google creates a time-dependent cookie, so the interaction might be “valid” only for some time period.
Akshaya
I used Google’s No CAPTCHA reCaptcha plugin in Contact Form 7 for WordPress. It worked fine only once. After that, it asks for image verification each time. But I don’t want that image verification challenge. Please help me to modify the code of the plugin. I want the answer as soon as possible. Please help.
Igor Savinkin
Google might suspect your behaviour when solving the reCaptcha to be bot-like. What’s the code of the plugin? Any link to it? For such a task you will probably have to pay someone to do it.
Rahul
CAn u haCk cAtpcha in eaCh sites then emaiL me 🙂
Yogesh Jagtap
It really Sucks.
inzam
im automation tester how can i bypass the reception technique using selenium code
Igor Savinkin
What do you mean “reception technique” ?