visipisi


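var visipisi = {
  // Calls cb(true) if the image at url loads within about 10 ms, cb(false) otherwise.
  mozilla: function(url, cb) {
    var tries = 1;
    // cbWrap is declared here so it does not leak into the global scope.
    var start, intervaller, cbWrap;
    var runtest = function () {
      var count = 0; // only the first outcome (load or timeout) is reported
      var img = new Image();
      img.onload = function () {
        // Loaded in time: stop polling so window.stop() is not called needlessly.
        window.clearInterval(intervaller);
        if (++count == 1)
          cbWrap(true);
      };
      start = new Date().getTime();
      img.src = url;
      var timeoutCB = function () {
        var now = new Date().getTime();
        if (now - start > 10) {
          // Not loaded within 10 ms: abort the pending download.
          window.stop();
          window.stop();
          window.clearInterval(intervaller);
          if (++count == 1) {
            cbWrap(false);
          }
        }
      };
      // Poll every 2 ms rather than relying on a single timer firing on time.
      intervaller = window.setInterval(timeoutCB, 2);
    };
    cbWrap = function (value) {
      if (--tries == 0)
        cb(value);
      else
        window.setTimeout(runtest, 10);
    };
    runtest();
  }
};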

I came across this post on HN and thought I'd give it a try and see if I could make it work across more browsers and more reliably. The goal is to determine which websites the user has visited through their browser. Mine works a bit differently. The code is shown at the top of this page. The approach simply downloads two images from one site: one that may or may not have been cached, and another that is certainly not cached because a randomly generated query string is appended to the base URL. It then compares the loading times, and if there is a significant difference, it concludes that the site has been visited before. The function calls the callback given to it with true if the URL has previously been visited, false otherwise. I have tested this on FF 5 and a recent version of Chrome. Except for IE, about which I have no idea, the code should run fine in other browsers.

What I've observed is that appending a random query string to the end of the URL sometimes causes up to a 10x increase in latency, so there may be false positives. But if you have something cached, this should pick it up almost perfectly.
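The two-image comparison described above could be sketched roughly as follows. This is not the original first version of the code, just a minimal illustration: the function names, the `nocache` parameter name and the speed-up ratio of 3 are all assumptions made for the example.

```javascript
// Sketch of the two-image timing comparison. All names and the default
// ratio of 3 are assumptions for illustration, not values from the post.

// Append a unique query parameter so the browser cannot serve the URL from cache.
function cacheBustedUrl(url) {
  var sep = url.indexOf('?') === -1 ? '?' : '&';
  var token = new Date().getTime() + '_' + Math.floor(Math.random() * 1e9);
  return url + sep + 'nocache=' + token;
}

// Treat the resource as cached when the plain load is at least `ratio`
// times faster than the cache-busted load.
function looksCached(plainMs, bustedMs, ratio) {
  ratio = ratio || 3;
  return plainMs * ratio < bustedMs;
}

// Browser only: time one image load and call done(elapsedMs) on load or error.
function timeImage(url, done) {
  var img = new Image();
  var start = new Date().getTime();
  img.onload = img.onerror = function () {
    done(new Date().getTime() - start);
  };
  img.src = url;
}

// Browser only: load the plain URL, then the cache-busted one, and compare.
function wasVisited(url, cb) {
  timeImage(url, function (plainMs) {
    timeImage(cacheBustedUrl(url), function (bustedMs) {
      cb(looksCached(plainMs, bustedMs));
    });
  });
}
```

A larger ratio would make the check more conservative, which may help with the latency spikes the random query string can cause, at the cost of missing some cached resources.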

You can try it out below. Note that after the first run, all the images from the test sites will be cached, so any subsequent tries will show all sites as visited. You need to clear your browser's cache, visit some sites and try again.

All the code, including this page, is released under the GPL version 3 license. If you have questions or any feedback, please send me an email at mansour [at] oxplot [dot] com

Update Dec 4, 2011 18:40 GMT+1100

Turns out Chrome (and probably all WebKit-based browsers) doesn't talk to the server when it's sure the resource is not modified (based on the cache expiry date) - FF, for instance, makes a request every time. This makes it very easy to know if something is loaded from cache. If the load time is super fast (0-10ms), it's most certainly loaded from cache, unless the resource itself is quite beefy.

So now, for Chrome users, the test should run faster and more reliably, and it shouldn't pollute your cache, so multiple runs will yield consistent results.
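As a rough illustration of the Chrome check, a single load can be timed and compared against a threshold. The 10 ms default below follows the 0-10ms range mentioned above; the function names are hypothetical, and the sketch does not try to reproduce how this page avoids polluting the cache.

```javascript
// Sketch of the single-load check. Names are hypothetical; the 10 ms
// default comes from the 0-10ms range mentioned in the post.

// Classify a measured load time as "probably served from cache".
function isLikelyCached(elapsedMs, thresholdMs) {
  if (thresholdMs === undefined) thresholdMs = 10;
  return elapsedMs <= thresholdMs;
}

// Browser only: time a single image load and report the classification.
function checkCachedByLoadTime(url, cb) {
  var img = new Image();
  var start = new Date().getTime();
  img.onload = function () {
    cb(isLikelyCached(new Date().getTime() - start));
  };
  img.onerror = function () {
    cb(false); // a failed load gives no timing signal, so report "not cached"
  };
  img.src = url;
}
```

For large resources the threshold would probably need raising, since even a cached image can take longer than 10 ms to decode and fire its load event.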

Update Dec 4, 2011 23:49 GMT+1100

I have devised a different method that seems to work across Safari, Chrome and Firefox, not 100% but 95% of the time. It's non-destructive to both the cache and the browser (page) history. It uses image loading as before, except that it calls stop on the page after 10 ms. Firefox seems to be very unpredictable in honoring the timeout value of the setTimeout function. Safari on Mac OS X is dead on. Give this one a try. It should be more accurate than the two previous versions.
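To check several sites with the `visipisi.mozilla` function listed at the top of this page, something like the following helper could be used. It is only a sketch: `probeAll` is a hypothetical name, and the URLs in the usage comment are placeholders. Since `window.stop()` aborts loading for the whole page, it seems safest to run the probes one after another rather than in parallel.

```javascript
// Run a probe of the shape probe(url, cb), where cb receives true or false,
// over a list of URLs strictly one at a time. `probeAll` is a hypothetical
// helper, not part of the original code.
function probeAll(urls, probe, done) {
  var results = {};
  var i = 0;
  function next() {
    if (i >= urls.length) {
      done(results); // every URL has reported, hand back the url -> boolean map
      return;
    }
    var url = urls[i++];
    probe(url, function (visited) {
      results[url] = visited;
      next(); // start the next probe only after this one has finished
    });
  }
  next();
}

// Browser usage (placeholder URLs, not the ones this page tests):
// probeAll(['http://example.com/logo.png', 'http://example.org/favicon.ico'],
//          visipisi.mozilla,
//          function (results) { console.log(results); });
```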

  
facebook: slashdot: cnn: abebooks:
twitter: myspace: bbc: msy:
digg: engadget: reuters: techbuy:
reddit: last.fm: wikipedia: borders (au):
HN: pandora: amazon: mozilla:
stumbleupon: youtube: ebay (au): anandtech:
wired: yahoo: newegg: tomshardware:
xkcd: google: bestbuy: shopbot (au):
linkedin: hotmail: walmart: staticice:
pornhub: redtube: perfectgirls: youporn: