Here is my own analysis. BTW, you can download a tool at http://www.robertgraham.com/tools/scanslam/ to scan for machines that are still vulnerable (but protected by firewalls).
I ran the worm on a (slow) machine with a gigabit interface. It produced over 100,000 packets/second and 300-mbps. It chooses target IP addresses at random. This means that a single machine with the right Internet connection can scan the entire Internet in 12 hours. If a hundred such machines get infected, the entire Internet would be scanned in 10 minutes. Most importantly: 100 such machines were infected. Consequently, infection of the entire Internet was nearly instantaneous.
The following is a weekly graph of packet loss on the Internet. We see that the worm infected the Internet in an instant.
The "state-of-the-art" is a large hosting center connected with a couple of OC12 (622-mbps) and several OC3 (155-mbps) links. It takes only a couple of machines within the hosting center to saturate those links with traffic.
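As a rough check, the sketch below divides each link rate by the roughly 300 mbps one infected machine produced in my test. It assumes every infected machine can push that full rate toward the link, which is a simplification; the names are illustrative.

```python
import math

# Per-machine output measured in the test described earlier: about 300 mbps.
WORM_OUTPUT_MBPS = 300

# Nominal link rates quoted in the text.
LINKS_MBPS = {"OC3": 155, "OC12": 622}

def machines_to_saturate(link_mbps: float,
                         per_machine_mbps: float = WORM_OUTPUT_MBPS) -> int:
    """Smallest number of machines whose combined output meets or exceeds the link rate."""
    return math.ceil(link_mbps / per_machine_mbps)

for name, rate in LINKS_MBPS.items():
    print(f"{name}: {machines_to_saturate(rate)} machine(s)")
```

Under these assumptions one machine is enough to fill an OC3, and two machines already offer about 96% of an OC12's capacity, with a third pushing it past full.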
For this reason we see lots of "square-waves", where only a single server at a hosting site (MFN) gets infected, saturating the entire link.
This graph shows a single machine behind a 100-mbps Ethernet link. When it got infected, traffic jumped to 100%; when admins rushed in to turn it off, traffic went back to normal. Note the blue line: the outgoing flood prevented responses from going out, and since no responses were being received, requests stopped coming in. Once responses were received again, traffic immediately started flowing in again.
Another graph, from NAC: instant infection, then instant remediation by adding firewall rules or turning off the infected machine.
Internet services are centrally located in many countries outside the United States. As a result, many see a binary effect. If their datacenters didn't get infected, users saw 0 problems (unlike the slowdowns in the U.S.). If one or two machines in their datacenters got infected, they saw a total shutdown of their Internet in that country.
Note that many have suspected that these graphs are explained by bugs in routers and firewalls that shut down after too many independent flows. My experience is that the flood was shut down mostly by people turning off machines in datacenters, followed by turning on port 1434 filters.
Every news article quotes an "expert" who says something about how we need to keep up with patches better.
If 100% of SQL Server 2000 systems had been patched by system administrators, the situation would not have changed one bit. I probed port 1433/tcp on attacking hosts and got a lot more RSTs than SYNACKs. This means that most hosts were infected through MSDE, not MSSQL. MSDE is the "Microsoft SQL Server Desktop Engine", and is embedded within desktop products like Visio, network infrastructure systems from companies like Cisco, and server applications such as McAfee's virus manager. These aren't unusual: MSDE is being included in thousands of desktop, infrastructure, and server software packages.
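A minimal sketch of that kind of probe follows. It assumes an ordinary TCP connect is an acceptable stand-in for sending a raw SYN: a completed handshake means the host answered with a SYNACK, and a refused connection typically means it answered with an RST. The function name and return labels are illustrative, and this is not the tool used for the measurements above.

```python
import socket

def probe_tcp(host: str, port: int = 1433, timeout: float = 3.0) -> str:
    """Classify a TCP port by how the host answers a connection attempt.

    "open"     -> handshake completed (the host answered the SYN with a SYNACK)
    "refused"  -> connection refused (typically an RST from the host)
    "no-reply" -> timed out or otherwise unreachable (e.g. filtered)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # Must be caught before OSError, which is its parent class.
        return "refused"
    except OSError:
        # Covers timeouts and network-unreachable errors.
        return "no-reply"
    finally:
        sock.close()
```

Calling probe_tcp on each attacking address and tallying "refused" against "open" gives the RST-versus-SYNACK comparison described above. Only probe hosts you are authorized to test.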
Patching every SQL Server installation would still miss MSDE.
Patching "almost all" makes no difference. The "binary" nature of the worm meant that it takes only a single infected system to take down an entire datacenter. A patch management system that covered 99.9% of all machines in a large datacenter would still have left one of them vulnerable.
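To put numbers on that, the sketch below assumes a hypothetical datacenter of 1,000 machines and treats each machine as independently patched with probability equal to the coverage. Both are assumptions made only for illustration.

```python
def expected_unpatched(machines: int, coverage: float) -> float:
    """Expected number of machines the patch process misses."""
    return machines * (1.0 - coverage)

def prob_at_least_one_unpatched(machines: int, coverage: float) -> float:
    """Chance that at least one machine is missed, assuming independence."""
    return 1.0 - coverage ** machines

# Hypothetical datacenter size; 99.9% is the coverage figure from the text.
MACHINES = 1000
COVERAGE = 0.999

print(f"expected unpatched: {expected_unpatched(MACHINES, COVERAGE):.1f}")
print(f"P(at least one):    {prob_at_least_one_unpatched(MACHINES, COVERAGE):.2f}")
```

With these inputs the expected number of unpatched machines is one, and the chance that at least one exists is about 63%. Given the worm's "binary" behavior, one is all it takes.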
Patches often break your applications: a large company trying to get 100% patched will experience more outages than this worm caused. Government spokesmen like Howard Schmidt are therefore, in effect, proposing that we experience more DoSes spread over time as we patch systems, rather than the occasional DoSes that happen all at once via worms like this.
You can't get 100% patched. Economists know of a concept called "decreasing marginal returns". It's cheap and easy to patch most systems; it's nearly impossible to reach 100%.
The main problem here is not patches but hardening. Port 1434 was unnecessary to almost everyone. When application vendors embedded MSDE, why didn't they close down port 1434? Most importantly, my FIRST and LAST step in hardening a system is looking at 'netstat' and closing down ports I don't need. My personal website http://www.robertgraham.com/ has been running on an unpatched Windows system for 5 years with no problems. I don't need to bother patching it because I have hardened it. Patches solve the "known" vulnerabilities; hardening solves the vulnerabilities that are there but haven't been discovered yet.
Because of the high rate of traffic, it is easy for an infected victim to find the machine and turn it off. Moreover, it was also obvious that they should do so. In CodeRed and Nimda, if services weren't failing, administrators would leave the machines running as they tried to fix them: they didn't want to take down valuable services. In this case, the infected machine was DoSed (and was DoSing neighbors), so there was no question that it should be turned off.
More importantly, this uses UDP port 1434, which isn't used for any other purpose on the Internet (except for the rare DNS response). Consequently, ISPs caught in the middle of machines blasting them with traffic could simply flip a switch and turn off a port. In contrast, they couldn't turn off port 80. Sure, some turned off port 80 going into dialup users and DSL networks, but they couldn't stop port 80 requests coming from their customers.
Note that everyone installed filters for both inbound and outbound traffic. They want to stop inbound traffic from flooding their network, and they also want to prevent still-infected servers from taking out their outbound connections.
From a news story: "At least five of the internet's 13 major hubs were targeted in Saturday's attack". This is because 5 of the root servers experienced more problems than the others.
However, this is simply due to the "binary" nature I described above. Everyone was attacked equally, but each victim responded differently. Just because one root server went down and another experienced no problems doesn't mean one was targeted more than another. It also doesn't mean that one server was designed to cope with attacks better. One root server with multiple OC12 (622-mbps) connections might get DoSed by a couple of machines sharing its links, whereas another root server with a single Fast Ethernet (100-mbps) connection saw few problems.
The volume of traffic spit out by infected machines exceeds all other DDoS, worms, viruses, and hacker attacks combined. This is a meaningless number, of course. Just because your home machine spits out 100-mbps doesn't mean your DSL line won't throttle that down to 128-kbps.
Most Internet users didn't notice. Sysadmins quickly responded. I've got 10 different ways to take down the Internet, and while I don't know how to quickly fix some of them, I doubt I could keep the Internet down long enough for most Internet users to care.
In other words: the worst attack ever wasn't a big deal.
I hear a lot I don't agree with, or which I think is stupid. An example of this is that every press story wants to analyze this in terms of which countries were hit first, but as we know, this is meaningless because the worm attacked the entire Internet all at once.
People have proposed that worms can be given a boost (to infect the Internet faster) if the hacker scans for victims first, then launches the worm against hundreds of victims all at once, rather than just one. People have talked about "mysterious" scans prior to the worm's launch.
However, the instantaneous infection is fully explained by the fact that the worm uses UDP vs. TCP. Once a single high-bandwidth server is infected, it takes only a few minutes for this thing to snowball to the entire Internet. Infecting one server on an OC12 link would infect the Internet faster than hundreds of servers on T1 links. The hacker may have started with hundreds of victims, but there is no evidence for it – there can be no evidence for it.
Most victims were infected through MSDE 2000, a lightweight version of SQL Server installed as part of many applications from Microsoft (e.g. Visio) as well as 3rd parties. You might have MSDE on your desktop right now. News articles comparing this to CodeRed have mentioned that most victims were corporate servers. This is wrong: CodeRed primarily infected desktops belonging to people who didn't know that the "personal" version of IIS was installed, and this worm primarily infected people who didn't know that MSDE was installed.
The problem had little to do with normal SQL Server 2000 installations.
The worm attacks the entire Internet all at once. The worm does not see physical locations, does not care about national boundaries, and you shouldn't either. When this hits the TV news, you are going to see animations showing a map with points of red, with the points growing into circles, and the circles merging to cover the globe. It doesn't work like that.
The United States has most of the Internet address space, so a randomly chosen address is likely located in the U.S. The United States has the majority of high-bandwidth hosting sites, so most of the attacks will appear to come from there. That just means no matter where the first infections started, random chance means that the earliest infections will be into/out of the United States. The worm doesn't care where an IP address is located, and when you look at the problem, neither should you.
They just don't get it. My security policy is to do the opposite of whatever the government implores me to do.
Think of it as a roof. It is very hard for a homeowner to keep it completely watertight, because water is so good at finding the smallest hole. Today's complex Internet networks cannot be made watertight. Implore all you want, it's not going to happen. A system administrator has to get everything right all the time; a hacker only has to find one small hole. A sysadmin has to be lucky all the time; a hacker only has to get lucky once. It is easier to destroy than to create.
Patching is useful, of course, but it has nothing to do with this problem.
It was nothing like CodeRed.
People miss the similarities.
Having done Internet security for 15 years, I've noticed the tendency to see conspiracies where none exist. What happened in South Korea is that a couple of MSDE machines sitting in a corner of the datacenter took down their redundant OC12 links. Since it is impossible for a TCP-based worm to cause that much damage from so few machines, people can't believe that a simple UDP-based worm could do it (since UDP is less advanced than TCP). Therefore, it must be a conspiracy of hackers going after South Korea. Note that most countries outside the United States are dominated by only a few large ISPs. This centralization means that while United States users see a slower Internet, most countries see either an Internet with no problems or total failure. This worm was "binary".
It's clear to Occam. The worm attacks everywhere simultaneously and there are no national boundaries, but as I describe above, it affected everyone differently. Nobody was targeted.
How seriously? Making sure that 99.9% of all patches are applied? That's what they are doing already.
It's like those spams for making money fast: they say the system only works if the punter is "serious" about making money. Obviously, if you fail to make money on the scam, it was because you weren't serious. I've talked to victims of the scams, and most seriously believe it was because they weren't "serious" enough about it.
How seriously? Solving security problems has decreasing marginal returns. Frankly, dealing with an outage like this every year is cheaper than the cost of installing patches.
Lastly, the problem isn't installing patches. My personal website (www.robertgraham.com) has been running for 5 years on an unpatched Windows system. Right now, most systems are patched, but most are not hardened well enough. The biggest bang-for-the-buck comes from hardening systems better, not from getting that last little bit of patches installed.
Research institutions have been funded by governments to have the highest bandwidth connections. This worm consumes all available bandwidth. Consequently, the source of packets measures who has the most bandwidth, not who is the most morally weak to let themselves be infected. Seriously, a student in her dorm room might have the same MSDE application as her father, but she pumps 30,000 attacks per second onto the Internet through her 100-mbps Ethernet, whereas her father can only manage 40 attacks per second from his home.
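Those two rates follow directly from link speed. The sketch below assumes each attack is a single UDP packet with 376 bytes of payload plus 28 bytes of IP and UDP headers (a packet size not stated above), ignores link-layer framing, and assumes the father's home line is the 128-kbps DSL rate mentioned earlier. The names are illustrative.

```python
# Assumption (not stated in the text): one attack = one UDP packet of
# 376 payload bytes plus 28 bytes of IP and UDP headers, 404 bytes in all.
# Link-layer framing overhead is ignored.
PACKET_BYTES = 376 + 28

def attacks_per_second(link_bits_per_second: float,
                       packet_bytes: int = PACKET_BYTES) -> float:
    """Packets per second a fully used link can carry at this packet size."""
    return link_bits_per_second / (packet_bytes * 8)

dorm = attacks_per_second(100_000_000)   # 100-mbps dorm Ethernet
home = attacks_per_second(128_000)       # 128-kbps home line (assumed rate)

print(f"dorm: {dorm:,.0f} attacks/second")
print(f"home: {home:,.0f} attacks/second")
```

Under these assumptions the dorm link carries about 31,000 attacks per second and the home line about 40, a ratio set entirely by bandwidth rather than by anything about the two machines.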
This is what they say every time a worm hits. It's hard to believe them when their next sentence demonstrates that they still don't get it. It is like everyone running around saying "choose the Red Pill and wake up" while they themselves grab the Blue Pill and stay in wonderland.