Make Saving Plain Text Passwords Illegal
Jeff Atwood has recently been discussing the subject of salting and hashing passwords in his blog Coding Horror. The subject is certainly a tricky one, with lots of room for mistakes, even by experienced coders.
Cryptography, obviously, is so difficult it sets the hair on fire. But you don’t have to be a cryptography expert to understand why it is absolutely essential that you always salt and then hash your passwords, and that’s what this post is going to be about.
Storing passwords in plain text should be illegal
I’m not kidding, not even exaggerating for dramatic effect. I really think it should be against the law to store users’ passwords in plain text - perhaps even more so than hacking the system to obtain those passwords, although now I may begin to exaggerate for dramatic effect…
Many comments on Jeff’s blog lamented the fact that sometimes your boss will decide for you that passwords should be stored in plaintext (or two-way encrypted using a secret key, which the hacker will of course be able to obtain as readily as your password list, meaning it’s as good as plaintext). One often suggested reason would be a requirement that the system must be able to mail back a user’s forgotten password.
The fact that there’s a better solution to that requirement (mail a random password instead, make the user enter a new password of their choice when logged in again) isn’t really the issue here.
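To make that alternative a little more concrete, here is a minimal sketch in Python of generating such a random temporary password. The function name, the default length of 16 and the alphabet are my own assumptions for illustration. Actually mailing it, and forcing the user to pick a new password at the next login, is not shown.

```python
import secrets
import string

# Hypothetical helper: name, length and alphabet are illustrative assumptions.
ALPHABET = string.ascii_letters + string.digits

def generate_temporary_password(length: int = 16) -> str:
    """Return a random temporary password drawn from a CSPRNG."""
    # secrets.choice uses the operating system's secure random source,
    # unlike random.choice, which is predictable.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

The site would then store only a salted hash of this temporary password, just as it would for a password the user chose.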
This is one of the very rare cases where I think the law should get involved, protecting the developer from having to compromise my security in order to keep his job. The developer should be able to say “No, boss, that would be against the law”.
Why should it be illegal?
Because of the simple fact that users reuse their passwords between systems. And that, in combination with an increasingly online life, means that online impersonation is going to become a very serious concern.
Hopefully your bank doesn’t even let you use a password (right? If it does, seriously consider switching banks…) so the commonly depicted scenario where the hacker gets access to your bank account using your password harvested from some other system should hopefully be a red herring these days. However, that’s not where the problem lies.
The attacker we should be concerned with in this case is not primarily someone gaining access to a highly sensitive system using a password harvested from a low sensitivity system (although that is indeed a concern too, especially with semi-sensitive systems allowing weak passwords to be used). The attacker we have to look at here is one who gains access to many low sensitivity systems using a password harvested from one low sensitivity system.
We’re not talking about a hacker who wants to compromise a system. We’re talking about a stalker who wants to compromise you.
Someone surfing the net will soon have created dozens of accounts on many different sites, probably using the same password for most of them and also probably forgetting about most sites soon, never to log in there again.
Say you realize the password you have been using everywhere has been compromised. Now what? Do you run around to all the sites where you have registered, updating your password? Sure, that’s exactly what you should do…but do you even remember half of the sites where you’ve registered?
Hmmm….if only you had some kind of tool that could run around to a big list of internet sites, trying your username and password out and see where it gets a match. That would help. Bet you could ask your new stalker where they downloaded their copy of such a tool…
What we have to take into account is that we’re dealing with a different type of adversary here than is often considered in the security discussions for a site. We’re not talking about a hacker trying to break into your site to mess around with your site. We’re not even talking about a hacker trying to impersonate a user in your system on your system.
We’re talking about a hacker essentially trying to hack a life, not a system…and by storing your user’s password in plaintext you’re making the life of that stalker incredibly much easier!
In order to see exactly why, we should begin by distinguishing clearly between two very different alternatives:
A) Your stalker gets access to your plaintext password
B) Your stalker gets access to your hashed password and is then able to find an input string that results in the same hash (using brute force, rainbow tables, what-have-you…)
One common mistake is to assume that B leads to A. That is, to think that if the hacker has cracked the hashed password, they now have access to the plaintext password. They don’t. What they have is a string that may or may not be your password – but with a good hash function the odds that they actually have your plaintext password are one in millions. More importantly, even if they happen to have your password (as a one in a million chance), there’s no way for them to know that.
In order to see why this is so, consider a very simple (too simple) candidate for a hashing algorithm to use when hashing your passwords:
Whatever the input, produce the output “Hello World”.
The downside with this function is obvious: It doesn’t matter which password you chose, so a hacker trying to gain access to your system wouldn’t even have to try – any password would work.
Indeed. But think about this from the perspective that this post takes. This isn’t about an attacker trying to gain access to your site; this is about an attacker trying to gain access to my life!
Yes, your site would be extremely easy to hack. Big deal. So someone can log in as me in your system. From my perspective, I would actually prefer that you used that function before storing my password in plaintext. Why?
Because if all the attacker sees when gaining access to your password table is a lot of “Hello World”, then it will be impossible for them to deduce the plaintext password that they could use to log in as me on other systems.
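As a toy illustration only, the “Hello World” function and the login check that would go with it could be sketched like this in Python. The names constant_hash and check_login are made up for this example.

```python
def constant_hash(password: str) -> str:
    """The deliberately useless hash: every input maps to the same output."""
    return "Hello World"

def check_login(stored_hash: str, attempt: str) -> bool:
    """Accept the attempt if it hashes to the stored value."""
    return constant_hash(attempt) == stored_hash

# What the password table would hold for my account.
stored = constant_hash("my real password")

# Any attempt at all is accepted, but the stored value says nothing
# about which password I actually chose.
any_attempt_works = check_login(stored, "something else entirely")
```

The stored value is identical for every user, which is exactly why the table gives the attacker nothing to use on other systems.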
The function above is of course ridiculous in that it destroys all the information in the original password. But consider the following function:
Return the sum of the character values in the input
Now we’re making some way towards a compromise. In this case, any password won’t work with any account anymore. However, it will be awfully easy to come up with alternative passwords that produce the same output. For example, “abc” and “cba” would give the same output. A brute force attack wouldn’t take long.
So the compromise improves the security of your site a bit (although not to a satisfactory level). It also lowers the security for me a bit, because with this scheme the attacker can in fact get a list of all possible strings that produce the correct output, knowing that my plaintext password in fact has to be in that list. However, there is still no way for the attacker to know for sure which one it is, and if the list is large enough we can feel pretty sure that it is not feasible to find.
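Here is a small Python sketch of that character-sum function, together with a brute-force search for every string that gives the same output. The names sum_hash and candidates, and the restriction to lowercase strings of a fixed length, are my own assumptions to keep the example tiny.

```python
from itertools import product
import string

def sum_hash(password: str) -> int:
    """Toy hash: the sum of the character values in the input."""
    return sum(ord(ch) for ch in password)

def candidates(target: int, length: int, alphabet: str = string.ascii_lowercase):
    """Brute force: all strings of the given length that hash to target."""
    matches = []
    for chars in product(alphabet, repeat=length):
        word = "".join(chars)
        if sum_hash(word) == target:
            matches.append(word)
    return matches

target = sum_hash("abc")          # 97 + 98 + 99 = 294
same_output = candidates(target, 3)
# same_output holds 10 three-letter strings, including "abc" and "cba".
# The attacker knows my password is in the list, but not which one it is.
```

With a longer password and a larger alphabet the list grows very quickly, which is the effect described above.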
At this point, real cryptography takes over. Coming up with real substitutes for the suggested functions is the work for real cryptographers. One of the differences with a real hash function is that it won’t be so easy to deduce a way to come up with valid inputs for a given output even if the attacker is allowed to inspect the hash function completely. But what will still hold true for the hash functions cryptographers come up with is that many inputs can give the same output.
Thus there is no way for the hacker who gains access to your hashed password to deduce your plaintext password. When someone has cracked your hashed password, what they have is one out of millions of (completely different) strings that will all produce the same hash output (out of billions of strings that will yield completely different outputs) allowing them to log into that system, but potentially not others.
This means that the difference between A and B above is dramatic.
If the attacker has my plaintext password they can log in as me to every system where I have used that password. If they have a string that produces the right hash, they can only log in as me to those systems where I have used that password and that also use the same hash function.
However, since we rely on only a few, popular hash functions such as MD5 and SHA, that may still mean that the impersonator can impersonate me in pretty much any system they try if they find a string that produces the correct hash for my password. So, what’s the big difference, then, you may ask?
Salts.
A salt is a small random number or string that you add to the password before it is hashed. Then you store the salt value together with the hashed password in the password table. This means that even if I use the same password on many different systems, the hashes that are stored in those systems won’t be the same (even if they use the same hash function) since different salts have been added to my password before it is hashed.
If all systems store only salted, hashed passwords, an attacker who finds a string that produces the correct hash output on one system won’t be able to reuse it to impersonate me on other systems.
But as soon as only one system gives the hacker access to my plaintext password, they will be able to impersonate me on all those other systems, even if they all use salted, hashed passwords.
So, assuming I’m a regular user who reuses his password: if a system hashes but forgets to salt, the stalker can impersonate me on all other sloppy, non-salted systems, but if a system forgets to hash, the stalker can impersonate me on every other system, salted or not!
That’s why forgetting to salt may be only nearly unforgivable, but forgetting to hash is most definitely, completely unforgivable.
It’s not about access to your system. Forget your system. It’s about being a Good Neighbor in an online world and treating your users’ secrets with respect. And that’s why, in my opinion, storing passwords in plain text ought to be against the law.
The Internet is quickly becoming an integral part of life for ordinary, non-nerd people. To ask aunt Tilly to assume the responsibility for properly managing her set of passwords in order not to expose herself to the risk of impersonation is, in my view, rich. If she didn’t manage it, I have a hard time blaming her. In my view, the blame lies squarely with the system designers who decided to value their own revenues higher than aunt Tilly’s integrity by storing her password in plaintext.

September 20th, 2007 at 5:43 pm
I too wrote about this subject a while ago
http://jtbworld.blogspot.com/2007/09/saving-passwords-in-database-no.html
Another thing I’ve seen is that some websites seem to send the password in plain text over the internet, not even using https.
I read in the news some weeks ago about some famous person (a singer or the like) who had his life almost destroyed because someone was able to read his email. Eventually I think the FBI was involved, and the woman they got worked at a high-security company.
September 20th, 2007 at 7:09 pm
Hi Jimmy,
We’ll probably only see more of that type of thing before it gets any better.
I think what’s needed is to put the issue on the table: there are conflicts of interest at work here. The interest of the site owner, who is more worried about the security and profitability of their own system, stands against that of the user, who is more worried about their cross-system (“real life”, you might say) integrity.
There are very real trade-offs to be made concerning this conflict of interest. As it stands, the site owners have pretty much all the control over all the trade-off choices, meaning that it is easy to figure out how the trade-offs will play out…
Consumers could, in theory, battle this by means of voting with their feet/dollars. But until the issue has been properly raised, that doesn’t work on account of users not even being aware of the issue. And I’m sceptical that educating users on this issue will even work - should granny really have to understand security to the same level as security experts just to watch the latest adventures of her grandson on YouTube? That’s why I think the law may need to get involved instead: the cards are stacked too heavily in favor of the site owners and educating users just isn’t realistic. Nonetheless, educating users as far as it makes sense is still a good thing, of course!
In fact, I’ll bet most users think that when it comes to security, “what’s good for me is good for the site, and vice versa” - that is, they assume that the companies behind the sites have their own reasons for taking security seriously (which happens to be a correct assumption) but they also assume that the measures those companies could take to increase security for themselves and to improve security for the user are the same measures - which isn’t always a correct assumption, even though it does happen to be correct for the most part.
I can’t actually think of any cases where improving security for the company can come at the expense of the security for the customer…but I don’t rule such examples out as logically impossible.
However, usually it is more a case of how some particular feature that would only raise security for the user but not so much for the site will go neglected, especially if the site perceives some trade-off where they would have to sacrifice some user-friendly feature (translating directly into profitability) for the sake of the user’s security. Often, however, just the fact that it would cost an extra penny to add a bit of security for the user but not for the site can be enough to skip that feature, as in the case of companies not hashing their passwords just out of sheer cheapness/laziness.
The problem is that users incorrectly think that if it were a security problem for them, it would be a security problem for the site and so the site would address it. Unfortunately, that’s just not true, and users need to be made aware of this in order to enable them to vote with their feet. Or perhaps lawmakers need to be made aware of this so they can help users out. Or perhaps both.
Mailing back forgotten passwords is a good example of this.
The old general rule is that security almost always comes at the expense of user friendliness in some way or another. When you can see that sacrificing user friendliness directly and dramatically improves the security for the site, you can actually expect to see that happening. If that also improves security for the user, you can expect that to be largely coincidental. An example of this is asking the user for longer passwords, which increases the security for both the site and the user. So you actually see this happening. But it is not really for your sake - although since you gain from it, the wording on the site trying to convince you to use a longer password will probably make it sound like it is all for you (which it is, also, but again, that’s not why they’re asking you for it).
However, storing passwords in plaintext (or encrypted, which is same-same) to enable password mailback really only compromises the user’s security, not so much the security for the site. Thus you can expect a lot of sites to gladly sacrifice their users’ security in order to draw more users who are unaware that this should be a major concern to them. The only sites that would refuse to do that would be really honest ones.
Great post of yours btw, excellent with some code examples to actually improve the odds of more people hashing and salting their passwords, instead of just bitching about it like me!
November 15th, 2007 at 12:05 pm
[…] I think that Mats Helander comes up with the best response to this, when he says that it should be illegal to store passwords in a database in plain text: Many comments on Jeff [Atwood]’s blog lamented the fact that sometimes your boss will decide for you that passwords should be stored in plaintext (or two-way encrypted using a secret key, which the hacker will of course be able to obtain as readily as your password list, meaning it’s as good as plaintext). One often suggested reason would be a requirement that the system must be able to mail back a user’s forgotten password. […]