The Strange World of Blogspot Spam Blogs

Posted by Geektronica on June 30th, 2005

I’ve been cruising the Blogspot world lately looking for cool stuff that the bigger geeklogs might have missed (and I found some cool knitting sites [1, 2] as a result last time I did this). What I’ve found, though, is that a large percentage (maybe up to a third) of all Blogspot blogs are spam-logs - sites created to increase the Google ranking of some other site (which is itself usually a Google-spamming site). The ultimate purpose of these spamlogs is usually to drive traffic to a commission-paying pharmacy, pr0n, or casino site.

Some of the spamlogs hosted at Blogspot, which apparently does not have a policy against them, are obvious in their intent (for example). It requires a human to start a new Blogspot-hosted site, but after the initial setup (which can be partially aided by scripts, I’m sure), bots can post like crazy. Usually the posts are strings of highly searched terms (like the names of celebrities, TV shows, or something Google Adsense pays a lot for, like asbestos litigation), with a link to the external site that the spammer is trying to bump up in the Google rankings.

None of this is new; Matt Mullenweg posted on it in March, as did SpamBlogging. But a few things have changed since their posts on the topic.

First, Blogspot now requires CAPTCHA authentication to start a new account, which many people said would fix the problem. It hasn’t. Entering a CAPTCHA sequence takes a human about five seconds, and you only have to do it once.

Second, spammers are becoming less obvious by creating posts that link to actual news articles (complete with excerpts); by all appearances, these blogs are just like scores of real blogs. But if you look at the code of the page, there are tons of external spam links, cleverly hidden by CSS. Here is an example: Mario’s News Archive Posts (to which I’m avoiding giving Google-props by using the rel="nofollow" attribute). With this additional layer of subterfuge, it’s remotely possible that someone will even link to “Mario’s” blog from their highly-ranked site. Check it out - it’s quite slick. It even auto-reloads every few seconds (though I’m not sure why).

A peek under the hood of Mario’s spamlog reveals something like this at the end of every post:
<style>.lin {visibility:hidden;}</style><div class="lin" style="position:absolute;top:-50;left:-3000;"><font size=1>Links of Interest:<BR><a href="http://www.treadmills100.info/York-Treadmill/Where-To-Get-A-Cheap-Treadmill.cfm">Where To Get A Cheap Treadmill</a>
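
To make the trick concrete, here is a minimal Python sketch of how links hidden this way could be flagged automatically. It is only a heuristic, not anything Blogger or Google is known to use: the names HiddenLinkFinder and find_hidden_links and the regular expressions are made up for illustration. It only catches the two tricks in the snippet above, namely a class hidden by an embedded style rule and an inline style that hides an element or pushes it far off-screen.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns that suggest an element is deliberately hidden
# (assumed thresholds: an offset of -100 or beyond counts as "off-screen").
HIDDEN_STYLE = re.compile(
    r"visibility\s*:\s*hidden|display\s*:\s*none|(?:left|top)\s*:\s*-\d{3,}",
    re.IGNORECASE,
)
# Matches rules such as ".lin {visibility:hidden;}" inside a <style> element.
HIDDEN_CLASS_RULE = re.compile(
    r"\.([\w-]+)\s*\{[^}]*(?:visibility\s*:\s*hidden|display\s*:\s*none)[^}]*\}",
    re.IGNORECASE,
)
# Tags that never have a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HiddenLinkFinder(HTMLParser):
    """Collects href values of <a> tags that sit inside hidden elements."""

    def __init__(self):
        super().__init__()
        self.hidden_classes = set()  # class names hidden by a <style> rule
        self.stack = []              # (tag, is_hidden) for each open element
        self.in_style = False
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if tag == "style":
            self.in_style = True
        classes = set((attrs.get("class") or "").split())
        hidden = (
            (bool(self.stack) and self.stack[-1][1])  # inherited from parent
            or bool(classes & self.hidden_classes)
            or bool(HIDDEN_STYLE.search(attrs.get("style") or ""))
        )
        self.stack.append((tag, hidden))
        if tag == "a" and hidden and attrs.get("href"):
            self.hidden_links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False
        # Close the nearest matching open element, dropping any unclosed
        # children (spam markup like the snippet above rarely closes its tags).
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if self.in_style:
            self.hidden_classes.update(HIDDEN_CLASS_RULE.findall(data))


def find_hidden_links(html):
    """Return the hrefs of links that appear to be hidden by CSS."""
    finder = HiddenLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_links
```

Fed the snippet above, find_hidden_links should return the treadmill URL, while an ordinary visible link is left alone. The parser makes a single pass, so a style rule that appears after the element it hides is missed, and external stylesheets or styles applied by JavaScript are out of scope for a sketch like this.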

I realized this when I tried to leave a comment on a news item that I found interesting, and clicked the Blogger “show original post” link, which uses some JavaScript to show the text of the post. However, this breaks the embedded CSS, letting the secret out.

The only thing that will stop BlogSpot from becoming 99% spamlogs is for posting to require CAPTCHA each time. This would be a pain, and it shouldn’t be necessary for people who host their own sites, but BlogSpot users should have to prove they’re human each time. Other thoughts on how to stop this, or whether it should be stopped?

48 Responses to “The Strange World of Blogspot Spam Blogs”

  1. Radical Congruency » On BoingBoing Says:

    […] Rings and Free Amazon Things

    BoingBoing has picked up my post at Geektronica on Blogspot spam blogs: Spam blogs are phony weblogs designed to game […]

  2. Kerno Says:

    I often go browsing through Blogger blogs nad I agree with the statement that up to a thrid of them are spam sites. It seems logicval to me to allow you to report this to the Blogger groups as spam, much like you can do with email. They serve no real purpose other than to check the PageRank system and it would be in Google’s best interest’s to have community monitoring of this in fariness to everyone. This system can be abused, yes, but do we want real blogs or spam everywhere you turn?

  3. Kerno Says:

    Sorry for my incredibly poor spelling. Typed in a hurry - it’s beer o’clock in Australia.

  4. sesblog Says:

    Spamblog

    No, not blog spam; we already wrote about that a few months ago anyway. Spamblog: a thing that looks like a blog, is updated by robots, and serves purely (search-engine) spamming purposes. News-site links and quotes, spam links hidden with CSS; the Geektronica article nicely…

  5. Mark J Says:

    Other than faster action by BlogSpot to shut down those blogs, I don’t know if there is a solution.

  6. BH Says:

    If Google can effectively determine which blogs are spamlogs and drop them from their index, then the incentive to create them will be removed. If spamlogs are like e-mail spam, a few people are creating 90% of them, so they’re probably not that diverse and spotting their “signature” should be fairly easy.

  7. djuggler Says:

    I posted about Blogger’s use of captcha and basically found it frustrating to the point that I almost gave up two blogs. I post TN’s lottery results at http://www.tnlotteryresults.com but out of curiosity thought I’d try blogging them also. I had always been curious about whether or not posting directly to blogger, as in http://tnlotteryresults.blogspot.com/, or having it post the blog to your own server, as in http://www.tnlotteryresults.com/blog/, would impact search engine ranking or visitors. I never had the ranking question answered but viewing the log files I have found that each of the 3 methods attracts a hugely different audience to the point that I cannot end my experiment.

    I continue posting by hand but the day Blogger turned on Captcha I considered giving up the two blogs. I have considered using the Blogger API to automate the posting process which I think would be legitimate. I think Captcha would stop the spam blogs (and I would like to see them stopped) but I think legitimate blogs like mine would be viewed as spam blogs and also stopped and that would be unfortunate.

  8. Eamonn's Home Says:

    Oh, no…

    Google, listen up! I would be quite happy to prove I’m a human every time I post if it makes it easier to stop this practice. And you should change your user agreement to specifically prohibit it.

  9. Origomi Says:

    That “Mario’s news archive posts” site is pretty devious! I think the fact that the poster seems to never sleep, and the references to super-high-paying adsense words kind of give it away- but I might not have even noticed if you hadn’t tipped me off.

    The hidden links are really sneaky, too. Someone spent some time doing that. I wonder if it pays off for them?

    I actually have a blog hosted at blogspot, and I honestly don’t think that having to jump through a captcha (or something like that) for every post would really be that bad. It’s certainly not a bad tradeoff for ad-free hosting. That, and the weblog spam really bugs me- I find it makes things like technorati almost useless, at least for my rather arcane interests.

    Thanks for the great info!

  10. David Orban Says:

    The evolution of spam is utterly fascinating. It is hard not to wonder how much clever reasoning goes into it, for apparently unproductive ends.

  11. codeman38 Says:

    If they’re going to require a CAPTCHA, they’d *better* provide an alternative for those of us who are human but can’t see the blasted things well enough.

  12. Brenda Says:

    Sorry… offtopic. But another new knitting blog you should check out is http://youknitwhat.blogspot.com . Like Go Fug Yourself for knitters.

  13. » BlogSpot Spam Blogs: Blog Tips - ProBlogger Says:

    […] und some cool knitting sites as a result last time I did this). What I’ve found, though, is that a large percentage (maybe up to a third) of all Blogspot blogs […]

  14. Cary Says:

    Wow, that Mario is one prolific blogger ; )

    I wish I had that kind of blogging stamina! Too funny…

    Unfortunately, as annoying as spam-blogs are, unlike traditional spam e-mail, we aren’t being force-fed them by having them sent to our e-mail addresses, so it seems a little different in my eyes. I can just avoid Mario’s site : ) Thank God!

    Also, his site seems like it would just get him in trouble with Google…don’t they penalize for invisible links? I thought I understood that. It seems like trying to trick Google would just hurt your ranking in the long run.

    I’m personally a little worried about Technorati now, since it relies on people essentially categorizing their own posts, which is just ripe for spamming. I’m already seeing it on some searches. I mean just anybody can tag their post as anything at all…seems like trouble!

  15. Tinus Says:

    There is no CAPTCHA possible for using the Blogger API and mail2blog gateway so that wouldn’t be the ultimate solution anyway…

  16. Is there a PC Doctor in the house? Says:

    Will spam collapse blogs in the same way that it collapsed USENET?

    Interesting post over on geektronica.com :
    “What I’ve found, though, is that a large percentage (maybe up to a third) of all Blogspot blogs are spam-logs - sites created to increase the Google ranking of some other site (which is itself …

  17. Krag Says:

    Can you prove that you’re not a bot? Perhaps this site is being run by a bot specifically programmed to try to throw real people off the track…

  18. Geektronica Says:

    After reading all the comments and trackbacks, I’m realizing that it’s much more difficult than saying that Google should do something.

    Google can’t monitor content, but they could have a button on the floating “next blog” bar that lets you report suspected spam blogs. It would require manpower from Google to process these submissions, though, so I doubt they’ll do it. And false positives are a near-certainty.

    But perhaps Google itself has stumbled upon the solution: the rel="nofollow" attribute for external links. How about this: Blogger stays as-is, but if you use Blogspot, all your links get the rel="nofollow" attribute, so it’s impossible for you to contribute to Google rankings.

    It may seem harsh to block out a huge number of bloggers from contributing to the PageRank system, but consider where we are now. Legitimate Blogspot users are already at a disadvantage in terms of influence on PageRank. Doing away with all PageRank influence from Blogspot sites would level the playing field.

    This problem would be replicated anywhere that free hosting can be taken advantage of. I assume that the same problem exists on message boards running popular software such as PHPBB2; surely scripts can be created to mass-post to these sites. All you have to do is Google “Powered by PHPBB2” and start posting like mad.

    That’s why Blogger has a throttle on posting speed, just as djuggler said above. Most forum software has this feature too. But if you’re a script, you can just wait 30 seconds or whatever, and post to another site in another window.
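
The rel="nofollow" proposal in this comment could be prototyped roughly as below. This is a sketch under assumptions, not a description of how Blogger works: it supposes the host rewrites each page's HTML on the server at publish time, and the function name add_nofollow is invented for illustration. A regular expression is used to keep the example short; it ignores unquoted rel values and tags with a ">" inside an attribute, so a real implementation would want a proper HTML parser.

```python
import re

# Matches an opening <a ...> tag; group 1 is the raw attribute text.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# An href attribute (the lookbehind avoids matching e.g. data-href).
HREF_ATTR = re.compile(r"(?<![\w-])href\s*=", re.IGNORECASE)
# A quoted rel attribute; group 1 is the quote character, group 2 the value.
REL_ATTR = re.compile(
    r"""(?<![\w-])rel\s*=\s*(["'])(.*?)\1""", re.IGNORECASE | re.DOTALL
)


def add_nofollow(html):
    """Return html with rel="nofollow" on every <a> tag that has an href."""

    def fix_tag(match):
        attrs = match.group(1)
        if not HREF_ATTR.search(attrs):
            return match.group(0)  # e.g. <a name="top">: not a link
        rel = REL_ATTR.search(attrs)
        if rel is None:
            return f'<a{attrs} rel="nofollow">'
        quote = rel.group(1)
        values = rel.group(2).split()
        if any(value.lower() == "nofollow" for value in values):
            return match.group(0)  # already marked, leave untouched
        merged = " ".join(values + ["nofollow"])
        new_rel = f"rel={quote}{merged}{quote}"
        return f"<a{attrs[:rel.start()]}{new_rel}{attrs[rel.end():]}>"

    return A_TAG.sub(fix_tag, html)
```

Existing rel values are kept and nofollow is appended, so running the function twice gives the same result as running it once.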

  19. Geektronica Says:

    Krag-
    The Turing test is difficult! Am I a human if I’m using XMLRPC to post? If I’m using a search-based RSS feed and a keyboard macro to find and post what interests me? It still takes human intervention and decision making, but I do want my blogging to be as efficient as possible.

    I think OCR-proof CAPTCHA for every posting would solve 99% of the problem, but it needs an audio option so as not to exclude the visually impaired. Other sites already do this, so I don’t know why big fat Google couldn’t. Surely they must care about polluting their own search results.

  20. HART Says:

    Sorry .. I’m quite of the opposite viewpoint here. If the sole purpose of these types of blogs is to increase traffic to their other websites, good for them! There has to be better and cheaper ways to advertise our websites besides paying for it. As far as the internet should be (to me) .. I should only have to pay for access to the internet. I don’t want to pay access to find people or have people find me.

    I am a bloggie newbie, but that’s my understanding of what a blog can do for my business. I am trying to develop an information and community website, but I still want referrals to my website. Is that wrong?

    The only thing these blogging sites did, that caught your attention in the first place, was not disable the Comment option, in my opinion.

    At least it’s not coming to your email box or stealing your email address. How can it be spam? Because you don’t approve?

    I am more concerned with ‘professional blogging sites’ like this one, who believe they are the majority of the blogging world and should dictate what is right and wrong before the rest of humanity get a chance to experience it.

    But - I do enjoy reading your blogs and value your opinions and views - just this time, I’m on the other side of the fence.

    Take care
    HART

  21. Nick Says:

    I’m not sure even that would help… since that wouldn’t stop spam from being posted using the Blogger API from a simple external program that the spammers could write, and probably already use.

  22. Geektronica Says:

    Nick - I agree. Blogger’s not the problem, though, nor is its excellent API. The problem is free hosting, which ultimately isn’t free - someone has to pay for it.

    Hart-
    It’s not that having a Blogspot blog to promote your external website is wrong (I do it on Blogspot too). What’s wrong is using deceptive, cheating tactics to manipulate search engine rankings unfairly. The use of bots and fake blogs doesn’t give humans such as yourself a fair chance to make themselves known through blogging, since the bots will always be better than lone humans at tricking Google.

  23. Frank Frick Says:

    Thought you all would like to know that there’s this program called Article Bot that is even more insidious in how it games Page Rank by automating the updating of the new content.

  24. Kate Says:

    I’m a blogspot user, and I would hate to see people avoid blogspot blogs altogether to avoid spam. To that end, as a legitimate user I wouldn’t mind CAPTCHA authenticating all my posts. I mean how long does it take to type a few letters into a box?

  25. Wohnung mieten wohung Says:

    I hate blog spam every time I see it.
    But I would like to honestly disagree with people saying there
    should be not log-links (from a blog poster).
    Stop blog spam but allow genuine posters to enter their website.
    The net, including this blog, needs recommendations from others
    in terms of PR too! By implementing measures like rel="nofollow"
    you are basically killing the open internet community!
    Why? Because it’s the small one-website guy that needs PR.
    Amazon, Ebay etc have enough money to spam the WHOLE internet.
    They don’t get punished or removed from the index - coz they
    have “good” content.
    Nobody complains about this and I find it very terrible.

    Wrap It Up: forbid URL log in your blog if you want to commercialize the net!
    The big players (Ebay, Amazon..) will be very happy!
    Your views?
    Wohnung mieten

  26. Man from Acapulco Says:

    The people at blogspot/blogger are quite fast at removing spamblogs. I use the support form to report the spam blogs and most of the time they are removed in 1 or 2 days. (example of spamblogs: http://www.blogger.com/profile/10109614)

  27. Blog Party Says:

    Punishing good posters for the wrongdoing of spammers is not a solution. We should encourage those who have something to say. If we penalize the ones who have something constructive to add, we are hurting the overall quality of the net, just like the spammers do.

  28. AnotherChanceToSee Says:

    How about a simple image based Captcha system?

    eg Display a page of animal drawings in random places, and then ask the user to click on three of them in order. cat-dog-elephant
    With different pictures, in different places, it would be impossible for an auto-blog program to process it.
    I would have no problem dealing with that two or three times a day when I wanted to post.

    Some of those text based Captchas are just TOO HARD. The Hotmail ones are HORRIBLE. Fortunately I don’t use Hotmail that often, but I groan every time one pops up.

  29. The Basement Says:

    Blog spam is responsible for the decline of god

    (or Dissecting BlogPulse: Part2)
    In Dissecting
    BlogPulse: Part 1 I discussed a downward trend in the number of
    blog postings that contain at least one of 69 common English words
    but stopped short of discussing the possible root cause(s) of this
    trend….

  30. The Basement Says:

    […] it. The Rise of Blog Spam Blog spam has been the topic of several recent articles as seen here, […]

  31. Dave Says:

    I recently posted about this on my baseball blog. I’ve gotten a bunch of link-exchange requests from sites that are - without a doubt - NOT created by serious sports fans, and perhaps not even created by humans.

    More here: http://genuinelove.crookednumber.com/archives/post171.php

  32. Crooked Number Blog » Spammer Continue to Innovate Says:

    […] ’s a recent post about my firsthand experience with some spam blogs. And here’s another post about the subject, from Geektronica. […]

  33. The Republic of Geektronica » Blogger Props Says:

    […] el.icio.usOrigami TesselationsWifi Isn’t All Mauritius Offers Passion”>Kill Bill -> Passion […]

  34. miller Says:

    Mario is dead

  35. Guignol Says:

    I just came across this post linked from Blogger, as I was looking actually for their status page because they were down a few minutes ago. Anyway, I think they could easily employ a spam-detection algorithm and turn off suspected blogs, posting a page instead that says “contact support if this was not a spam blog” so it can get turned back on if it was legit. There ARE good blogs on Blogger as the author indicates, here’s another I like: http://thegrumpyhacker.blogspot.com/

  36. Noelle Williamson Says:

    Hi, I’m not advocating for blogspot here, I use my own host. But, I’m against using captcha images!! I use a screen reader, I’m not sure how many blogspot users use one, (I know there are quite a few livejournal users.) Using a captcha every time someone wants to publish a post, (without an accessible solution, which google doesn’t have yet,) would forbid anyone using screen readers from posting!

  37. Geektronica Says:

    As others have pointed out, CAPTCHA would also disable XMLRPC posting, which many users take advantage of. And I suspect that the spammers will always be able to bypass whatever clever CAPTCHAs people come up with. I came across one story about how spammers were redirecting the CAPTCHA challenges to porn sites, where people looking for porn are asked to solve the challenge, and the answers are then relayed back to Blogger or Hotmail or whatever. Pretty ingenious.

    After thinking about it some more, I think the best solution is to make it so the Blogger nav bar that floats at the top has a “report as spamblog” button. This could queue the blog for examination by a Google employee, who could then deactivate it and send a message to the owner, requiring further action to reactivate it.

  38. The Grumpy Hacker Says:

    I disagree. First, that nav bar can be disabled by the blogger. Second, if the nav bar or any other such button like you suggest is *required* on every blog, even if just the ones they host, they’ll lose a lot of users because that’s the kind of thing that attracted them in the first place (not having to put up with that crap).

    I’d propose the counter-suggestion of having a script at some URL that takes as input the referer field, so anytime you land on a spam blog, you just go to that bookmarked URL and it sees the last page you visited and logs it as a potential spam blog. The script could easily either automatically send you back to that page, or provide you with a button to go back if you want, or a “next blog” button too (I use that a lot actually, it’s cool).
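
A bare-bones version of the reporting script proposed in this comment might look like the following. It is a sketch only: it is written as a plain WSGI application, the name report_app and the in-memory reported list are stand-ins invented for illustration, and a real service would need storage, rate limiting, and human review, since the Referer header is easy to forge.

```python
from html import escape
from urllib.parse import urlparse

# Stand-in for a real log or database of suspected spam blogs.
reported = []


def report_app(environ, start_response):
    """WSGI app that logs the referring page if it is hosted on Blogspot."""
    referer = environ.get("HTTP_REFERER", "")
    try:
        host = (urlparse(referer).hostname or "").lower()
    except ValueError:
        host = ""  # malformed Referer value

    if host.endswith(".blogspot.com"):
        reported.append(referer)
        status = "200 OK"
        body = (
            "<p>Logged as a possible spam blog. "
            f'<a href="{escape(referer, quote=True)}">Go back</a></p>'
        )
    else:
        status = "400 Bad Request"
        body = "<p>No Blogspot referer was sent, so nothing was logged.</p>"

    payload = body.encode("utf-8")
    start_response(
        status,
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(payload))),
        ],
    )
    return [payload]
```

It can be tried locally with the standard library's wsgiref.simple_server.make_server("", 8000, report_app).serve_forever(). One caveat: browsers generally do not send a Referer header when a page is opened from an ordinary bookmark, so in practice the idea would probably need a bookmarklet or link that passes the current URL explicitly.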

  39. tim Says:

    i actually know someone who does this
    they reckon their blogger accounts get deleted by blogger every 2 weeks or so
    they are looking to do their own hosting on a server as a solution
    i think its something that google adsense will clamp down on soon

  40. tim Says:

    oh and they are making $300 plus a day from google adsense on 200 websites- that is why they are doing it

  41. Nick Says:

    Excellent work — keep it up, we need watch dogs to bury the dead. (sites that is)
    monergism.blog.com

  42. Hello Sunshine Says:

    […] ermalink, not on the “write a comment” page. 2. The other line of defense, as Geektronica mentioned, is CAPTCHA, though it’s now used not […]

  43. Hello Sunshine Says:

    […] logspot now has an option to delete a comment entirely. 2. The other line of defense, as Geektronica mentioned, is CAPTCHA, though it’s now used not […]

  44. The Republic of Geektronica » The Ethics of Hosted Spamlogs (Splogs) Says:

    […] ggregation splog and a crappy, mostly-links, human-generated blog. A lot of internet users can’t tell the difference in some cases. Again, if you’ […]

  45. Arleen Says:

    Cool site!

  46. Troy Cassidy Says:

    No one has mentioned the expired blogger factor. Many deleted blogs are getting reregistered by spammers and posting 1 or 2 posts with links to the pron, or casino sites. They take advantage of the backlinks the previous owner of the blog acquired to spam search engines.
    Here are some examples:
    charlesmurtaugh.blogspot.com
    secondpersonsingular.blogspot.com
    rougeramblings.blogspot.com
    nosanity4me.blogspot.com
    www.alexanderthegreatest.blogspot.com
    towardssupernova.blogspot.com
    nosoylaresurreccion.blogspot.com
    www.zhonguocai.blogspot.com
    greatscat.blogspot.com
    nosoylaresurreccion.blogspot.com
    bearpolitics.blogspot.com
    sandraisevil.blogspot.com
    waikikiweekly.blogspot.com
    clairitys-journal.blogspot.com

  47. peter.smoothouse.com » Blog Archive » Detecting Blogspot splogs the Bayesian way Says:

    […] Jun 30: Geektronica and PSFK discover loads of Blogspot spam blogs. […]

  48. Find in Forums Says:

    Well it looks like blogspot spam is not a big problem any more, after all the security measures by Google, including CAPTCHAs on comments and when creating a blog. All spammer blogs are now deleted very quickly by Google.

