Search Engine Spam Detector

By Nathan Weinberg

Reader George let me know about this interesting tool which acts as a search engine spam detector, analyzing your HTML and letting you know which of your practices might be interpreted by a search engine spider as black hat SEO. The tool tries to detect keyword stuffing, doorway pages and hidden text.

On my site, it flagged my blogroll as hidden text, because of the nifty tool I use to collapse 250+ links into a tiny space. It also didn’t like a list I made of the languages Google Talk supports, since it looked like keyword stuffing. Other than that, I was completely clean. Very nice.
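To see why a collapsed blogroll can trip a hidden-text check, here is a minimal Python sketch of one way such a check might work. It is not the tool's actual method, which isn't described here. It assumes the only signal is an inline `style` attribute containing `display: none` or `visibility: hidden`, and the names `HiddenTextFinder` and `find_hidden_text` are made up for illustration.

```python
from html.parser import HTMLParser
import re

# Assumed heuristic: inline styles that hide an element from view.
HIDING_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)
# Elements that never have a closing tag, so they are kept off the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collects text that sits inside an element hidden by an inline style."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one flag per open element: hidden (itself or via an ancestor)?
        self.hidden_text = []  # text fragments found inside hidden elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self.stack[-1] if self.stack else False
        self.stack.append(parent_hidden or bool(HIDING_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html):
    """Return the text fragments that an inline style hides from the reader."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text


sample = ('<p>Visible</p>'
          '<ul style="display: none"><li><a href="#">Blog A</a></li><li>Blog B</li></ul>')
print(find_hidden_text(sample))
```

A check like this cannot tell a collapsed menu from text hidden on purpose, which is consistent with the blogroll being flagged. The sketch also ignores external stylesheets, scripts, and markup with unclosed tags, so treat it as an illustration only.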

How does your site do?

For fun, I’ve entered some sites and checked them out:

  • Google Blogoscoped - perfectly clean
  • Miel’s blog - also clean
  • Matt Cutts’ blog - indeed, clean
  • - invisible links
  • - four instances of invisible links
  • - two instances of invisible links
  • Jason’s blog - 12 links to doorway pages
  • Boing Boing - their writing style is too similar to keyword stuffing. Hilarious!
  • The official Google blog - lots of invisible text that reads “< whitespace >” and the same keyword stuffing it detected on my site, since I was just quoting them.

If you find anything interesting, post it below. It seems like the majority of the problems detected are honest mistakes. My advice: if it detects nothing that could be considered obviously sinister (like invisible links to online poker/phentermine sites), don’t worry, but if something looks suspicious, think about tweaking your HTML to remove that code.

May 23, 2006 by Nathan Weinberg

6 Responses to “Search Engine Spam Detector”

  1. Jason Schramm Says:

    I just ran my blog through that site and it doesn’t show any doorway pages.

  2. GamingFox Says:

    It found hidden texts in my expandable menus, but other than that, my site is pretty clean.

  3. Nathan Weinberg Says:

    Odd, they seem to have disappeared. Still, it’s not like you aren’t a black hat search engine spammer…

  4. wunderkid Says:

    Just ran some of my spam through there and got a clean bill of health. That made me feel good.

  5. pentapenguin Says:

    Thanks for the link. My home page got a clean bill of health. :)

  6. LowLevel Says:

    Well, I hope to release the next version of the spam detector in the coming weeks. It should report fewer false positives and it will find new kinds of spam (cloaking, for example). :-)
