The Strange World of Blogspot Spam Blogs
Posted by Geektronica on June 30th, 2005
I’ve been cruising the Blogspot world lately looking for cool stuff that the bigger geeklogs might have missed (and I found some cool knitting sites [1, 2] as a result last time I did this). What I’ve found, though, is that a large percentage (maybe up to a third) of all Blogspot blogs are spamlogs - sites created to increase the Google ranking of some other site (which is itself usually a Google-spamming site). The ultimate purpose of these spamlogs is usually to drive traffic to a commission-paying pharmacy, pr0n, or casino site.
Some of the spamlogs hosted at Blogspot, which apparently does not have a policy against them, are obvious in their intent (for example). It requires a human to start a new Blogspot-hosted site, but after the initial setup (which can be partially aided by scripts, I’m sure), bots can post like crazy. Usually the posts are strings of highly searched terms (like the names of celebrities, TV shows, or something Google Adsense pays a lot for, like asbestos litigation), with a link to the external site that the spammer is trying to bump up in the Google rankings.
None of this is new; Matt Mullenweg posted on it in March, as did SpamBlogging. But a few things have changed since their posts on the topic.
First, Blogspot now requires CAPTCHA authentication to start a new account, which many people said would fix the problem. It hasn’t. Entering a CAPTCHA sequence takes a human about five seconds, and you only have to do it once.
Second, spammers are becoming less obvious by creating posts that link to actual news articles (complete with excerpts); by all appearances, these blogs are just like scores of real blogs. But if you look at the code of the page, there are tons of external spam links, cleverly hidden by CSS. Here is an example: Mario’s News Archive Posts (to which I’m avoiding giving Google-props by using the rel="nofollow" attribute). With this additional layer of subterfuge, it’s remotely possible that someone will even link to “Mario’s” blog from their highly-ranked site. Check it out - it’s quite slick. It even auto-reloads every few seconds (though I’m not sure why).
A peek under the hood of Mario’s spamlog reveals something like this at the end of every post:
<style>.lin {visibility:hidden;}</style><div class="lin" style="position:absolute;top:-50;left:-3000;"><font size=1>Links of Interest:<BR><a href="http://www.treadmills100.info/York-Treadmill/Where-To-Get-A-Cheap-Treadmill.cfm">Where To Get A Cheap Treadmill</a>
I realized this when I tried to leave a comment on a news item that I found interesting, and clicked the Blogger “show original post” link, which uses some JavaScript to show the text of the post. However, this breaks the embedded CSS, letting the secret out.
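For anyone who wants to check a page for this trick without reading the source by hand, here is a minimal sketch in Python. It assumes the hiding is done the way the snippet above does it: an inline style or a simple class rule using visibility:hidden, display:none, or a large negative offset. The names `find_hidden_links` and `HiddenLinkFinder` are made up for this example, and the nesting logic is only approximate on badly broken markup.

```python
from html.parser import HTMLParser
import re

# Heuristic patterns for inline styles that hide an element from readers.
HIDDEN_STYLE = re.compile(
    r"visibility\s*:\s*hidden|display\s*:\s*none|(?:left|top)\s*:\s*-\d{3,}",
    re.IGNORECASE,
)
# Matches simple rules such as ".lin {visibility:hidden;}" in a <style> block.
HIDDEN_CLASS_RULE = re.compile(
    r"\.([\w-]+)\s*\{[^}]*(?:visibility\s*:\s*hidden|display\s*:\s*none)[^}]*\}",
    re.IGNORECASE,
)
# Tags that never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HiddenLinkFinder(HTMLParser):
    """Collects hrefs of <a> tags nested inside elements that look hidden."""

    def __init__(self, hidden_classes):
        super().__init__()
        self.hidden_classes = hidden_classes
        self.stack = []  # one bool per open element: is it hidden?
        self.hidden_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        hidden = bool(
            (self.stack and self.stack[-1])  # inherited from the parent
            or classes & self.hidden_classes
            or HIDDEN_STYLE.search(attrs.get("style") or "")
        )
        if tag == "a" and hidden and attrs.get("href"):
            self.hidden_links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append(hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


def find_hidden_links(html):
    """Return the hrefs of links that sit inside hidden-looking elements."""
    hidden_classes = set(HIDDEN_CLASS_RULE.findall(html))
    finder = HiddenLinkFinder(hidden_classes)
    finder.feed(html)
    return finder.hidden_links
```

Feeding it the snippet above should flag the treadmill link, while an ordinary visible link is left alone. A spammer who hides links some other way (an external stylesheet, JavaScript) would slip past this.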
The only thing that will stop BlogSpot from becoming 99% spamlogs is for posting to require CAPTCHA each time. This would be a pain, and it shouldn’t be necessary for people who host their own sites, but BlogSpot users should have to prove they’re human each time. Other thoughts on how to stop this, or whether it should be stopped?
June 30th, 2005 at 9:33 pm
[…] Rings and Free Amazon Things
BoingBoing has picked up my post at Geektronica on Blogspot spam blogs: Spam blogs are phony weblogs designed to game […]
June 30th, 2005 at 11:30 pm
I often go browsing through Blogger blogs nad I agree with the statement that up to a thrid of them are spam sites. It seems logicval to me to allow you to report this to the Blogger groups as spam, much like you can do with email. They serve no real purpose other than to check the PageRank system and it would be in Google’s best interest’s to have community monitoring of this in fariness to everyone. This system can be abused, yes, but do we want real blogs or spam everywhere you turn?
June 30th, 2005 at 11:36 pm
Sorry for my incredibly poor spelling. Typed in a hurry - it’s beer o’clock in Australia.
June 30th, 2005 at 11:49 pm
Spamblog
No, not blog spam; we already wrote about that a few months ago anyway. A spamblog, that is, a thing that looks like a blog, is updated by robots, and serves only (search engine) spamming purposes. News-site links and quotes, spam links hidden with CSS; the Geektronica article nicely…
July 1st, 2005 at 1:25 am
Other than faster action by BlogSpot to shut down those blogs, I don’t know if there is a solution.
July 1st, 2005 at 4:13 am
If Google can effectively determine which blogs are spamlogs and drop them from their index, then the incentive to create them will be removed. If spamlogs are like e-mail spam, a few people are creating 90% of them, so they’re probably not that diverse and spotting their “signature” should be fairly easy.
July 1st, 2005 at 4:20 am
I posted about Blogger’s use of captcha and basically found it frustrating to the point that I almost gave up two blogs. I post TN’s lottery results at http://www.tnlotteryresults.com but out of curiosity thought I’d try blogging them also. I had always been curious about whether or not posting directly to blogger, as in http://tnlotteryresults.blogspot.com/, or having it post the blog to your own server, as in http://www.tnlotteryresults.com/blog/, would impact search engine ranking or visitors. I never had the ranking question answered but viewing the log files I have found that each of the 3 methods attracts a hugely different audience to the point that I cannot end my experiment.
I continue posting by hand but the day Blogger turned on Captcha I considered giving up the two blogs. I have considered using the Blogger API to automate the posting process which I think would be legitimate. I think Captcha would stop the spam blogs (and I would like to see them stopped) but I think legitimate blogs like mine would be viewed as spam blogs and also stopped and that would be unfortunate.
July 1st, 2005 at 4:45 am
Oh, no…
Google, listen up! I would be quite happy to prove I’m a human every time I post if it makes it easier to stop this practice. And you should change your user agreement to specifically prohibit it.
July 1st, 2005 at 5:15 am
That “Mario’s news archive posts” site is pretty devious! I think the fact that the poster seems to never sleep, and the references to super-high-paying adsense words kind of give it away- but I might not have even noticed if you hadn’t tipped me off.
The hidden links are really sneaky, too. Someone spent some time doing that. I wonder if it pays off for them?
I actually have a blog hosted at blogspot, and I honestly don’t think that having to jump through a captcha (or something like that) for every post would really be that bad. It’s certainly not a bad tradeoff for ad-free hosting. That, and the weblog spam really bugs me- I find it makes things like technorati almost useless, at least for my rather arcane interests.
Thanks for the great info!
July 1st, 2005 at 5:27 am
The evolution of spam is utterly fascinating. It is hard not to wonder how much clever reasoning goes into it, for apparently unproductive ends.
July 1st, 2005 at 5:41 am
If they’re going to require a CAPTCHA, they’d *better* provide an alternative for those of us who are human but can’t see the blasted things well enough.
July 1st, 2005 at 5:51 am
Sorry… offtopic. But another new knitting blog you should check out is http://youknitwhat.blogspot.com . Like Go Fug Yourself for knitters.
July 1st, 2005 at 9:17 am
Wow, that Mario is one prolific blogger ; )
I wish I had that kind of blogging stamina! Too funny…
Unfortunately, as annoying as spam-blogs are, unlike traditional spam e-mail, we aren’t being force-fed them by having them sent to our e-mail addresses, so it seems a little different in my eyes. I can just avoid Mario’s site : ) Thank God!
Also, his site seems like it would just get him in trouble with Google…don’t they penalize for invisible links? I thought I understood that. It seems like trying to trick Google would just hurt your ranking in the long run.
I’m personally a little worried about Technorati now, since it relies on people essentially categorizing their own posts, which is just ripe for spamming. I’m already seeing it on some searches. I mean just anybody can tag their post as anything at all…seems like trouble!
July 1st, 2005 at 9:39 am
There is no CAPTCHA possible for using the Blogger API and mail2blog gateway so that wouldn’t be the ultimate solution anyway…
July 1st, 2005 at 9:44 am
Will spam collapse blogs in the same way that it collapsed USENET?
Interesting post over on geektronica.com :
“What I’ve found, though, is that a large percentage (maybe up to a third) of all Blogspot blogs are spam-logs - sites created to increase the Google ranking of some other site (which is itself …
July 1st, 2005 at 9:58 am
Can you prove that you’re not a bot? Perhaps this site is being run by a bot specifically programmed to try to throw real people off the track…
July 1st, 2005 at 11:03 am
After reading all the comments and trackbacks, I’m realizing that it’s much more difficult than saying that Google should do something.
Google can’t monitor content, but they could have a button on the floating “next blog” bar that lets you report suspected spam blogs. It would require manpower from Google to process these submissions, though, so I doubt they’ll do it. And false positives are a near-certainty.
But perhaps Google itself has stumbled upon the solution: the rel="nofollow" attribute for external links. How about this: Blogger stays as-is, but if you use Blogspot, all your links get the rel="nofollow" attribute, so it’s impossible for you to contribute to Google rankings.
It may seem harsh to block out a huge number of bloggers from contributing to the PageRank system, but consider where we are now. Legitimate Blogspot users are already at a disadvantage in terms of influence on PageRank. Doing away with all PageRank influence from Blogspot sites would level the playing field.
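To make the nofollow idea concrete, here is a rough Python sketch of what a host could do to a post’s HTML before publishing it. This is not how Blogger works, just an illustration; the function name `add_nofollow` is invented, and the regex approach assumes reasonably tidy markup (an attribute value containing a literal ">" would confuse it).

```python
import re

# Matches an opening <a ...> tag and captures its attribute text.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches a quoted rel attribute and captures its value.
REL_ATTR = re.compile(r"""\brel\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def add_nofollow(html):
    """Return html with rel="nofollow" present on every opening <a> tag."""

    def fix_tag(match):
        attrs = match.group(1)
        rel = REL_ATTR.search(attrs)
        if rel is None:
            # No rel attribute yet: append one.
            return '<a%s rel="nofollow">' % attrs
        values = rel.group(2).split()
        if "nofollow" in (v.lower() for v in values):
            # Already marked: leave the tag exactly as it was.
            return match.group(0)
        # Keep the existing rel values and add nofollow to them.
        values.append("nofollow")
        new_rel = 'rel="%s"' % " ".join(values)
        return "<a" + attrs[:rel.start()] + new_rel + attrs[rel.end():] + ">"

    return A_TAG.sub(fix_tag, html)
```

So `add_nofollow('<a href="http://example.com/">x</a>')` gives back the same link with `rel="nofollow"` appended, and a link that already carries the attribute passes through unchanged.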
This problem would be replicated anywhere that free hosting can be taken advantage of. I assume that the same problem exists on message boards running popular software such as PHPBB2; surely scripts can be created to mass-post to these sites. All you have to do is Google "Powered by PHPBB2" and start posting like mad.
That’s why Blogger has a throttle on posting speed, just as djuggler said above. Most forum software has this feature too. But if you’re a script, you can just wait 30 seconds or whatever, and post to another site in another window.
July 1st, 2005 at 11:06 am
Krag-
The Turing test is difficult! Am I a human if I’m using XMLRPC to post? If I’m using a search-based RSS feed and a keyboard macro to find and post what interests me? It still takes human intervention and decision making, but I do want my blogging to be as efficient as possible.
I think OCR-proof CAPTCHA for every posting would solve 99% of the problem, but it needs an audio option so as not to exclude the visually impaired. Other sites already do this, so I don’t know why big fat Google couldn’t. Surely they must care about polluting their own search results.
July 1st, 2005 at 1:26 pm
Sorry .. I’m quite of the opposite viewpoint here. If the sole purpose of these types of blogs is to increase traffic to their other websites, good for them! There has to be better and cheaper ways to advertise our websites besides paying for it. As far as the internet should be (to me) .. I should only have to pay for access to the internet. I don’t want to pay access to find people or have people find me.
I am a bloggie newbie, but that’s my understanding of what a blog can do for my business. I am trying to develop an information and community website, but I still want referrals to my website. Is that wrong?
The only thing these blogging sites did, that caught your attention in the first place, was not disable the Comment option, in my opinion.
At least it’s not coming to your email box or stealing your email address. How can it be spam? Because you don’t approve?
I am more concerned with ‘professional blogging sites’ like this one, who believe they are the majority of the blogging world and should dictate what is right and wrong before the rest of humanity get a chance to experience it.
But - I do enjoy reading your blogs and value your opinions and views - just this time, I’m on the other side of the fence.
Take care
HART
July 1st, 2005 at 1:47 pm
I’m not sure even that would help… since that wouldn’t stop spam from being posted using the Blogger API from a simple external program that the spammers could write, and probably already use.
July 1st, 2005 at 2:29 pm
Nick - I agree. Blogger’s not the problem, though, nor is its excellent API. The problem is free hosting, which ultimately isn’t free - someone has to pay for it.
Hart-
It’s not that having a Blogspot blog to promote your external website is wrong (I do it on Blogspot too). What’s wrong is using deceptive, cheating tactics to manipulate search engine rankings unfairly. The use of bots and fake blogs doesn’t give humans such as yourself a fair chance to make themselves known through blogging, since the bots will always be better than lone humans at tricking Google.
July 2nd, 2005 at 12:35 pm
Thought you all would like to know that there’s this program called Article Bot that is even more insidious in how it games Page Rank by automating the updating of the new content.
July 3rd, 2005 at 12:48 pm
I’m a blogspot user, and I would hate to see people avoid blogspot blogs altogether to avoid spam. To that end, as a legitimate user I wouldn’t mind CAPTCHA authenticating all my posts. I mean how long does it take to type a few letters into a box?
July 4th, 2005 at 1:40 am
I hate blog spam every time I see it.
But I would honestly like to disagree with people saying there
should be no log-links (from a blog poster).
Stop blog spam but allow genuine posters to enter their website.
The net, including this blog, needs recommendations from others
in terms of PR too! By implementing measures like rel="nofollow"
you are basically killing the open internet community!
Why? Because it’s the small one-website guy that needs PR.
Amazon, Ebay etc. have enough money to spam the WHOLE internet.
They don’t get punished or removed from the index - coz they
have “good” content.
Nobody complains about this and I find it very terrible.
Wrap It Up: forbid URL log in your blog if you want to commercialize the net!
The big players (Ebay, Amazon..) will be very happy!
Your views?
Wohnung mieten
July 4th, 2005 at 5:04 am
The people at blogspot/blogger are quite fast at removing spamblogs. I use the support form to report the spam blogs and most of the time they are removed in 1 or 2 days. (example of spamblogs: http://www.blogger.com/profile/10109614)
July 4th, 2005 at 10:11 pm
Punishing good posters for the wrongdoing of spammers is not a solution. We should encourage those who have something to say. If we penalize the ones who have something constructive to add, we are hurting the overall quality of the net, just like the spammers do.
July 5th, 2005 at 6:34 am
How about a simple image based Captcha system?
E.g., display a page of animal drawings in random places, and then ask the user to click on three of them in order: cat, dog, elephant.
With different pictures, in different places, it would be impossible for an auto-blog program to process it.
I would have no problem dealing with that two or three times a day when I wanted to post.
Some of those text based Captchas are just TOO HARD. The Hotmail ones are HORRIBLE. Fortunately I don’t use Hotmail that often, but I groan every time one pops up.
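The click-the-animals idea above could be sketched on the server side roughly like this, in Python. Everything here is an assumption made for illustration: the animal list, the 400-pixel canvas, the 30-pixel tolerance, and the function names `make_challenge` and `check_clicks`. It only shows how a challenge might be generated and checked; it says nothing about how well such a scheme would hold up against a determined bot, and it does not address screen-reader users.

```python
import random

# Placeholder set of drawings; a real system would need many more.
ANIMALS = ["cat", "dog", "elephant", "horse", "rabbit", "owl"]


def make_challenge(rng, grid_size=400, picks=3):
    """Place every animal at a random spot and pick an ordered target list."""
    positions = {
        name: (rng.randrange(grid_size), rng.randrange(grid_size))
        for name in ANIMALS
    }
    targets = rng.sample(ANIMALS, picks)  # e.g. the user is asked for cat-dog-elephant
    return positions, targets


def check_clicks(positions, targets, clicks, radius=30):
    """True if each click lands within `radius` pixels of the expected animal, in order."""
    if len(clicks) != len(targets):
        return False
    for name, (cx, cy) in zip(targets, clicks):
        x, y = positions[name]
        if (cx - x) ** 2 + (cy - y) ** 2 > radius ** 2:
            return False
    return True
```

A production version would also have to keep the drawings from overlapping and vary the artwork itself, since fixed images at known coordinates are easy for a script to recognize.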
July 5th, 2005 at 6:32 pm
Blog spam is responsible for the decline of god
(or Dissecting BlogPulse: Part2)
In Dissecting BlogPulse: Part 1 I discussed a downward trend in the number of blog postings that contain at least one of 69 common English words but stopped short of discussing the possible root cause(s) of this trend….
July 5th, 2005 at 6:36 pm
[…] it. The Rise of Blog Spam Blog spam has been the topic of several recent articles as seen here, […]
July 6th, 2005 at 9:09 pm
I recently posted about this on my baseball blog. I’ve gotten a bunch of link-exchange requests from sites that are - without a doubt - NOT created by serious sports fans, and perhaps not even created by humans.
More here: http://genuinelove.crookednumber.com/archives/post171.php
July 6th, 2005 at 9:14 pm
[…] ’s a recent post about my firsthand experience with some spam blogs. And here’s another post about the subject, from Geektronica. […]
July 13th, 2005 at 3:59 am
Mario is dead
July 13th, 2005 at 11:43 am
I just came across this post linked from Blogger, as I was looking actually for their status page because they were down a few minutes ago. Anyway, I think they could easily employ a spam-detection algorithm and turn off suspected blogs, posting a page instead that says “contact support if this was not a spam blog” so it can get turned back on if it was legit. There ARE good blogs on Blogger as the author indicates, here’s another I like: http://thegrumpyhacker.blogspot.com/
July 14th, 2005 at 12:20 pm
Hi, I’m not advocating for blogspot here, I use my own host. But, I’m against using captcha images!! I use a screen reader, I’m not sure how many blogspot users use one, (I know there are quite a few livejournal users.) Using a captcha every time someone wants to publish a post, (without an accessible solution, which google doesn’t have yet,) would forbid anyone using screen readers from posting!
July 14th, 2005 at 12:26 pm
As others have pointed out, CAPTCHA would also disable XMLRPC posting, which many users take advantage of. And I suspect that the spammers will always be able to bypass whatever clever CAPTCHAs people come up with. I came across one story about how spammers were redirecting the CAPTCHA challenges to porn sites, where people looking for porn are asked to solve the challenge, and the answers are then relayed back to Blogger or Hotmail or whatever. Pretty ingenious.
After thinking about it some more, I think the best solution is to make it so the Blogger nav bar that floats at the top has a “report as spamblog” button. This could queue the blog for examination by a Google employee, who could then deactivate it and send a message to the owner, requiring further action to reactivate it.
July 14th, 2005 at 5:55 pm
I disagree. First, that nav bar can be disabled by the blogger. Second, if the nav bar or any other such button like you suggest is *required* on every blog, even if just the ones they host, they’ll lose a lot of users because that’s the kind of thing that attracted them in the first place (not having to put up with that crap).
I’d propose the counter-suggestion of having a script at some URL that takes as input the referer field, so anytime you land on a spam blog, you just go to that bookmarked URL and it sees the last page you visited and logs it as a potential spam blog. The script could easily either automatically send you back to that page, or provide you with a button to go back if you want, or a “next blog” button too (I use that a lot actually, it’s cool).
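A bare-bones version of the referer-logging script proposed above might look like the following Python sketch. The names `record_report`, `ReportHandler`, and `serve` are invented for this example, and the in-memory counter stands in for whatever storage a real service would use. One caveat worth stating: browsers typically send the Referer header only when you follow a link from a page, so a visit from a plain bookmark may arrive with no referer at all. The sketch answers that case with an error instead of guessing.

```python
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse

reports = Counter()  # host name -> number of times it was reported


def record_report(referer, counts):
    """Count one report for the host named in `referer`; return the host or None."""
    host = urlparse(referer or "").hostname
    if not host:
        return None
    counts[host] += 1
    return host


class ReportHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        referer = self.headers.get("Referer")
        host = record_report(referer, reports)
        if host is None:
            # Nothing usable was sent, so there is nothing to log.
            self.send_response(400)
            self.end_headers()
            self.wfile.write(b"No referring page was sent, so nothing was logged.\n")
            return
        # Log the report, then send the visitor back where they came from.
        self.send_response(303)
        self.send_header("Location", referer)
        self.end_headers()


def serve(port=8000):
    """Run the reporting endpoint until interrupted."""
    HTTPServer(("", port), ReportHandler).serve_forever()
```

Calling `serve()` starts the endpoint. Since anyone can forge a Referer header, the counts would be a list of candidates for a human to review and not proof of anything.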
July 21st, 2005 at 5:12 pm
i actually know someone who does this
they reckon their blogger accounts get deleted by blogger every 2 weeks or so
they are looking to do their own hosting on a server as a solution
i think its something that google adsense will clamp down on soon
July 21st, 2005 at 5:19 pm
oh and they are making $300 plus a day from google adsense on 200 websites- that is why they are doing it
July 23rd, 2005 at 12:20 pm
Excellent work — keep it up, we need watch dogs to bury the dead. (sites that is)
monergism.blog.com
August 20th, 2005 at 11:57 am
[…] ermalink, not on the “write a comment” page. 2. The other line of defense, as Geektronica mentioned, is CAPTCHA, though it’s now used not […]
August 21st, 2005 at 12:00 pm
[…] logspot now has an option to delete a comment entirely. 2. The other line of defense, as Geektronica mentioned, is CAPTCHA, though it’s now used not […]
October 21st, 2005 at 10:25 pm
[…] ggregation splog and a crappy, mostly-links, human-generated blog. A lot of internet users can’t tell the difference in some cases. Again, if you’ […]
November 8th, 2005 at 9:47 am
Cool site!
January 13th, 2006 at 6:45 pm
No one has mentioned the expired blogger factor. Many deleted blogs are getting reregistered by spammers, who put up 1 or 2 posts with links to the pron or casino sites. They take advantage of the backlinks the previous owner of the blog acquired to spam search engines.
here are some examples.
charlesmurtaugh.blogspot.com
secondpersonsingular.blogspot.com
rougeramblings.blogspot.com
nosanity4me.blogspot.com
www.alexanderthegreatest.blogspot.com
towardssupernova.blogspot.com
nosoylaresurreccion.blogspot.com
www.zhonguocai.blogspot.com
greatscat.blogspot.com
bearpolitics.blogspot.com
sandraisevil.blogspot.com
waikikiweekly.blogspot.com
clairitys-journal.blogspot.com
March 9th, 2006 at 4:36 pm
[…] Jun 30: Geektronica and PSFK discover loads of Blogspot spam blogs. […]
June 22nd, 2006 at 3:31 am
Well it looks like blogspot spam is not a big problem any more, after all the security measures by Google, including CAPTCHAs on comments and when creating a blog. All spammer blogs are now deleted very quickly by Google.