Social experiment on Slashdot moderation
Posted 24 Jan 2000 at 12:31 UTC by Raphael 
I had a look at Slashdot this morning and I
saw that someone had succeeded in fooling the moderators. He posted a
comment
containing only fake information, and managed to get moderated up to "+5
Informative" within an hour. This made me think about moderation,
democracy and trust metrics.
This comment
was posted in reply to a scientific story about a new explosive molecule
(octanitrocubane) that has recently been synthesized. Although the main
story is true, the comment was made up by an Anonymous Coward who
claimed that an obscure Swiss/German chemist had invented the
molecule in 1958 and published some papers about it, but his work
was ignored until recently. Of course, the chemist never existed, but
since the comment seemed to contain scientific facts (although
anyone with a scientific background would have laughed at "alternate
hydrofusion technology proposal"), the article was quickly moderated
up. At the time of this writing, it is rated as: Interesting=3,
Informative=2, Total=5.
The comment will probably be moderated down now that the author
of the prank has posted a comment explaining what he did (note: I am
not related to this guy in any way), but that was an interesting social
experiment. He posted the main comment as an AC, followed by some
other comments giving more (fake) details about that obscure chemist,
with links to some TV programs talking about him (the links actually
point to porn sites) and excerpts from German books quoting the
guy (the German sentences are completely unrelated, use bad grammar
that probably came from Babelfish, and are rather rude, as anyone with
even a limited knowledge of German can tell). All these replies were
posted anonymously by the same person within half an hour. He could
also have used one or more throwaway accounts, although succeeding as
an AC helps prove his point against moderation.
Although I do not approve of his childish attitude and some of his rude
comments, he did prove that things can easily go wrong with that kind
of moderation system. Basically, the Slashdot system can work as long
as the moderators rate only the articles related to something they know
about. But if some of them start moderating something that looks
interesting even though they know nothing about that topic,
then things can and will go wrong. It will take a significant concerted
effort by knowledgeable moderators to bring this comment down to a
more reasonable rating, assuming that no other uninformed moderator
pushes the comment back up (or some well-meaning person moderates
it up as "Funny"). Even the meta-moderation will probably not help,
because the meta-moderators could easily fall into the same trap since
they usually do not see the context of the moderated comment.
If you believe that "in any sufficiently large crowd, the majority are
idiots", then this applies to Slashdot moderators too. All
moderators have equal powers and the system is supposed to work as a
kind of democracy. But if the majority does not think very much about
what they are doing (because of lack of time, lack of interest, lack of
intelligence or many other reasons), then it becomes easy to abuse the
system.
A similar abuse is performed by the so-called "karma whores", i.e.
the people who quickly post a comment copying more or less the same
information as the main article (or the pages it refers to). Many
lazy moderators who start rating the comments before reading the
main article will moderate such comments up without realizing that
they are redundant.
I hope that something similar to the trust metrics implemented on
Advogato could help. It would probably not solve everything, but at
least it does not seem to suffer from one of the flaws of Slashdot: if
you manage to create multiple accounts on Slashdot and one of them
gains moderator status (not very hard if you use the various
techniques described above), then you can use that account to push
the other ones up. Although meta-moderation limits this to
some extent, it can easily be abused too.
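To make the difference concrete, here is a rough sketch (in Python) of a
flow-style trust computation in the spirit of what Advogato does. As I
understand it, the real metric assigns node capacities that shrink with the
distance from a seed and computes a maximum network flow; the certification
graph, the capacities and the account names below are all invented just to
illustrate the idea:

    # Illustrative only: a node-capacitated max-flow "trust" computation.
    # Names, capacities and the certification graph are made up for the example.
    from collections import defaultdict, deque

    def max_flow(edges, source, sink):
        """Edmonds-Karp maximum flow; edges is a dict {(u, v): capacity}."""
        cap = defaultdict(int)
        adj = defaultdict(set)
        for (u, v), c in edges.items():
            cap[(u, v)] += c
            adj[u].add(v)
            adj[v].add(u)                   # residual edge
        flow = 0
        while True:
            parent = {source: None}         # BFS for an augmenting path
            queue = deque([source])
            while queue and sink not in parent:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in parent and cap[(u, v)] > 0:
                        parent[v] = u
                        queue.append(v)
            if sink not in parent:
                return flow
            bottleneck, v = float("inf"), sink
            while parent[v] is not None:    # find the bottleneck capacity
                bottleneck = min(bottleneck, cap[(parent[v], v)])
                v = parent[v]
            v = sink
            while parent[v] is not None:    # push flow along the path
                cap[(parent[v], v)] -= bottleneck
                cap[(v, parent[v])] += bottleneck
                v = parent[v]
            flow += bottleneck

    def trust_to_group(certs, node_cap, seed, group):
        """Trust that can flow from the seed into a group of accounts.
        Each account X is split into X_in/X_out so that node_cap[X] limits
        the trust passing *through* X; every group member feeds a super-sink
        through a unit edge."""
        edges = {}
        for node, c in node_cap.items():
            edges[(node + "_in", node + "_out")] = c
        for certifier, certified in certs:
            edges[(certifier + "_out", certified + "_in")] = node_cap[certifier]
        for member in group:
            edges[(member + "_in", "SINK")] = 1
        return max_flow(edges, seed + "_out", "SINK")

    # 'seed' certifies alice and mallory; mallory certifies three sock puppets
    # which also certify each other.
    certs = [("seed", "alice"), ("seed", "mallory"), ("alice", "bob"),
             ("mallory", "sock1"), ("mallory", "sock2"), ("mallory", "sock3"),
             ("sock1", "sock2"), ("sock2", "sock3")]
    node_cap = {"seed": 4, "alice": 2, "mallory": 2, "bob": 1,
                "sock1": 1, "sock2": 1, "sock3": 1}

    print(trust_to_group(certs, node_cap, "seed", ["alice", "bob"]))             # 2
    print(trust_to_group(certs, node_cap, "seed", ["sock1", "sock2", "sock3"]))  # 2, not 3

No matter how many sock-puppet accounts "mallory" creates, the whole group of
puppets can never receive more trust than mallory's own capacity allows, which
is exactly the property that should make pushing your own accounts up much
harder than on Slashdot.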
Another option would be to have several overlapping networks of
trust. On Advogato, the trust metrics are related to software
development. But this does not say much about how good each person is
at playing music, understanding nuclear physics, wrestling, or
cooking Chinese food. Just like in role-playing games that use
several "stats" for describing a character (strength, intelligence,
charisma, ...), there could be several criteria (programming skills,
physics, nanotech, humour, politics and many more) for rating a
person, and those ratings would in turn be used to rate articles. If
each criterion is governed by a network of trust as on Advogato, then
the system could be much more difficult to abuse. It would still not
be perfect: for example, defining the criteria would not be easy, and
defining how each article or comment relates to each criterion would
not be easy either. (E.g. "This article is about a free software
package that renders complex molecules in 3D." Is it 20% free software,
30% 3D viewing techniques and 50% chemistry? And what about that
comment arguing about the license?)
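As a rough illustration of how those pieces might fit together (this is my
own sketch, not anything Advogato implements): suppose each criterion
already has its own trust metric that gives every user a trust level between
0 and 1 for that criterion, and each article declares how much it belongs to
each criterion. A rater's vote could then be weighted by how much of their
trust overlaps with the article's topic mix. The names, weights and ratings
below are all invented:

    def article_score(topic_weights, ratings, trust):
        """topic_weights: {criterion: fraction of the article, summing to 1}
        ratings:       {user: score given to the article}
        trust:         {user: {criterion: trust level in [0, 1]}}
        Returns a trust-weighted average rating, or None if nobody qualified."""
        weighted_sum = 0.0
        total_weight = 0.0
        for user, score in ratings.items():
            # How much does this user's trust overlap with the article's topics?
            weight = sum(fraction * trust.get(user, {}).get(criterion, 0.0)
                         for criterion, fraction in topic_weights.items())
            weighted_sum += weight * score
            total_weight += weight
        return weighted_sum / total_weight if total_weight > 0 else None

    # "A free software package that renders complex molecules in 3D":
    topics = {"free_software": 0.2, "3d_graphics": 0.3, "chemistry": 0.5}

    trust = {
        "chemist":   {"chemistry": 0.9, "free_software": 0.1},
        "hacker":    {"free_software": 0.8, "3d_graphics": 0.6},
        "prankster": {"humour": 0.7},   # trusted, but not on these topics
    }
    ratings = {"chemist": 1, "hacker": 4, "prankster": 5}

    print(article_score(topics, ratings, trust))   # about 2.3

The prankster's enthusiastic rating carries no weight at all, because none of
his trust lies in the article's topic mix. Something like this would have made
the hoax described above much harder to pull off, since the moderators who
pushed it up presumably had little trust under the "chemistry" criterion.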
I don't know if it is possible to design a system that is sufficiently
robust while still being fair and not elitist. Suggestions are
welcome.
This reminded me a lot of the parody that Alan
Sokal did for Social Text. His point was to demonstrate that soft
scientists of the postmodern variety are very uncritical. His original
article contained a number of deliberate mistruths designed so that even
first-year physics students could refute them. However, it all sounded
very good.
I'm not sure exactly what the Social Text affair proved. I mean, this is
just content filtering. It seems to me perfectly acceptable that a few
things get through the screen when somebody puts some effort into
hoaxing the system. It just proves that you shouldn't believe everything
you read, even if it's been "peer reviewed," whether in a forum such as
Social Text or Slashdot.
The real problems with trust arise when there's a determined, organized
effort to prank the system. I start to worry when there's direct
financial gain to be made this way - that's why spam is such a problem.
A great many articles on Slashdot discuss consumer products, movies, and
so on. What if somebody figured out that it might be profitable to spam
Slashdot with cleverly written posts supporting their own product
and/or dissing their competitors?
The other issue, of course, is how well the content filtering works.
This is what I call the "usenet problem," in honor of usenet's high
signal content, but overwhelmingly high noise content. How do you pick
the good stuff out? I think trust metrics like Advogato's are an
interesting way to address the problem. Advogato itself is a very
imperfect experiment in this direction. Yes, it only addresses free
software. Yes, it can be elitist. Yes, it forces people to give rankings
to other people publicly, which might be delicate. But so far so good;
the quality of discussion here (if not the volume) has been very
gratifying.
I'm very interested in the idea of carrying the ideas in Advogato
forward to a broader audience. I've reserved the name
"cluedot.{com,net,org}" for this very purpose. I'm learning a lot from
the Advogato experience, and am very open to ideas. Thanks for the cool
post, Raphael.
There's a lot to be said for trust metrics and similar systems that take a lot of the manual work out of evaluating the quality of
content and content posters. However, the Slashdot situation mentioned here isn't really solvable by this type of technology, I
think.
What's needed here is good, old-fashioned clue, and also humility. Slashdot moderation worked fairly well in the initial
stages, when moderators were hand-picked by the site administrators. These 200 (or so) people moderated well, and a lot.
However, in a push to make Slashdot more "democratic", they decided to do the random moderators thing they have now, and
that hasn't worked out so well. I suspect one of the main faults is "You can't moderate and post to the same discussion". When
I read this, I was incredulous. If someone does know a lot about a topic, isn't it a boon to the community to have them both
post their opinions, and also moderate to make the rest of the content as useful as possible?
If they had kept hand-picked moderators, those moderators could be more humble and not moderate stuff they weren't competent to
judge the quality of, and the ones in the know could post useful follow-ups explaining things instead of just moderating down.
There's no substitute for the human touch, as old media figured out long ago, and that is why good journalism will
ultimately prevail over the sort of "anything goes" forum Slashdot has become.
Advogato has yet to endure the test of the unwashed masses that Slashdot has to put up with.
Wait until Advogato gets a few more 'killer' stories that attract a lot of attention to the site... Although its user base is
growing fast, it currently has a niche market that only attracts a particular type of quality audience.
I personally hope Advogato never gets so popular that it attracts the average Slashdot poster.
The Slashdot moderation system attempts to measure the 'quality' of posts,
but suffers from a number of weaknesses in the sampled data.
- Very low number of samples from each sampled person ("moderator").
A moderator normally gets 5 moderator points, and does not get them often
(for me, a "normal" Slashdot poster, perhaps once every two weeks).
- The resolution of the information gathered from the samples is low.
For each sample, you get a single value of +/-1 for one post, with no
information about how strongly the moderator feels the post deserves it.
- Self-limiting scale of moderation. Posts can only be scored from
-1 to 5, and the moderator sees the current score before moderating.
- Feedback in moderation, and an increased chance of moderation for
early posts. If something is moderated up, it is more likely to be seen,
and there is no mechanism for increasing the chance that a moderator
sees other articles. You also have the problem of karma feedback: if
somebody has high karma, their posts automatically get a higher score,
so they have a higher chance of getting moderated up again (the toy
simulation after this list makes this concrete).
- The system probably mixes opinion-based moderation (e.g., Funny and
Insightful) and quality-based moderation (e.g., Informative) more than
necessary.
- There is no way to moderate a post down as plain 'Incorrect',
challenging whether the substance of a comment actually matches
reality.
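To make the feedback problem a bit more concrete, here is a toy simulation
(this is not Slashdot's actual algorithm, and every parameter is invented):
posts get a hidden quality, moderators sample posts with a bias towards
already-visible ones, each moderation is a single +1/-1 clamped to the
-1..5 range, and every fifth post starts one point higher to stand in for
the karma bonus.

    import random

    def simulate(num_posts=200, moderations=2000, rng_seed=1):
        random.seed(rng_seed)
        quality = [random.random() for _ in range(num_posts)]       # hidden "true" quality
        start = [2 if i % 5 == 0 else 1 for i in range(num_posts)]  # karma bonus for every 5th post
        score = list(start)
        for _ in range(moderations):
            # Visibility bias: posts with higher scores are more likely to be
            # read at all, hence more likely to be moderated.
            weights = [s + 2 for s in score]                        # always >= 1
            i = random.choices(range(num_posts), weights=weights)[0]
            # A single +1/-1 vote, only loosely correlated with the hidden quality.
            vote = 1 if random.random() < 0.5 + 0.3 * (quality[i] - 0.5) else -1
            score[i] = max(-1, min(5, score[i] + vote))
        with_bonus = [s for i, s in enumerate(score) if start[i] == 2]
        without = [s for i, s in enumerate(score) if start[i] == 1]
        print("average final score, karma bonus:    %.2f" % (sum(with_bonus) / len(with_bonus)))
        print("average final score, no karma bonus: %.2f" % (sum(without) / len(without)))

    simulate()

Both groups have exactly the same quality distribution, so any systematic
gap between them in the output comes purely from the starting bonus and the
visibility bias, not from the posts themselves.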
All of the above problems can be compensated for through various
means. However, the most obvious way to improve the quality (using a
collaborative filtering system a la Firefly, EachMovie or GroupLens) has
a couple of problems that the Slashdot moderation system does not have:
- It causes opinion reinforcement for the users - users see messages
that agree with their opinions, represented by what they have moderated
as 'good' before.
- It causes opinion-based splits of the community into sub-communities
- different parts of the community see completely different discussions.
It may be possible to limit these effects by having the filtering
system look for 'master users' to function as main sources of opinion
data. The system would have to look for users whose opinions few
people disagree with and who, if a challenge against correctness is
allowed, seldom have their positive knowledge-based evaluations go
against correctness.
Users should still have their discussions displayed with a slant
towards things that match their own ratings - otherwise, there is little
incentive for them to keep rating, thus limiting the data collected.
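As a sketch of what that could look like (the data layout, the blend factor
and all the ratings below are my own assumptions, just to show the shape of
the computation): score each rater by how often other raters contradict
them, treat the least-contradicted raters as 'master users', and blend their
consensus with ratings from people whose past moderations agree with the
reader's own.

    # ratings[user][item] = +1 (moderated up) or -1 (moderated down); invented data.
    ratings = {
        "alice": {"post1": 1, "post2": -1, "post3": 1},
        "bob":   {"post1": 1, "post2": -1, "post4": 1},
        "carol": {"post1": -1, "post2": 1, "post3": -1},
        "dave":  {"post1": 1, "post3": 1, "post4": -1},
    }

    def disagreement(user):
        """Fraction of this user's ratings that other raters contradict."""
        clashes = total = 0
        for item, value in ratings[user].items():
            for other, theirs in ratings.items():
                if other != user and item in theirs:
                    total += 1
                    clashes += (theirs[item] != value)
        return clashes / total if total else 0.0

    def master_users(k=2):
        """The k raters whose opinions are contradicted least often."""
        return sorted(ratings, key=disagreement)[:k]

    def predict(user, item, masters, own_weight=0.5):
        """Blend the master users' consensus with ratings from raters who tend
        to agree with `user`, so readers still see a personal slant."""
        consensus = [ratings[m][item] for m in masters if item in ratings[m]]
        consensus_score = sum(consensus) / len(consensus) if consensus else 0.0
        slanted = []
        for other, theirs in ratings.items():
            if other == user or item not in theirs:
                continue
            shared = [i for i in ratings[user] if i in theirs and i != item]
            if shared:
                agreement = sum(ratings[user][i] == theirs[i] for i in shared) / len(shared)
                slanted.append(agreement * theirs[item])
        taste_score = sum(slanted) / len(slanted) if slanted else 0.0
        return (1 - own_weight) * consensus_score + own_weight * taste_score

    masters = master_users()
    print("master users:", masters)
    print("post4 for carol: %.2f" % predict("carol", "post4", masters))

The own_weight knob trades the master users' consensus off against the
reader's personal slant; turning it up keeps people rating (they see more of
what they like), turning it down limits the opinion-based splintering
described above.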
Eivind.
Slashdot is now (Feb 1) carrying a story that some company will
preinstall GNU/Hurd for you. The article posting claims that the Hurd
will be superior to Linux in the long run because it is "Object
Oriented". It's pretty obvious to anyone with any programming clue that
there is no meaningful way in which the Hurd is more "object oriented"
than Linux. It's not implemented in an OO language, it uses a non-OO
RPC mechanism, and it doesn't make particular use of any inheritance
model.
The real key difference between the two systems is that the Hurd is
microkernel-based, while Linux is not.
Currently, there are perhaps a dozen articles with scores of 4 or 5 that
either laud or vituperate the Hurd for being "object oriented", entirely
missing the point that it isn't. The highest score I saw on any article
that points out that the Hurd isn't particularly "object oriented" in
any meaningful way was 3.
It's disconcerting that counterfactual posts score so much higher than
ones that point out the error.