Network Outage: 6-21, 6-22-2007, and wrap-up [Resolved]


bruce
06-21-2007, 05:44 AM
Our network is under a massive DoS attack that has used up the capacity of the physical (Gigabit) cable to our router.

We are working with our upstream provider to identify the IP addresses that initiated the DoS and to terminate the incoming traffic from their network.

We will post updates here as soon as we make any significant progress.

bruce
06-21-2007, 06:05 AM
Our upstream provider has made some progress in blocking most of this offending traffic. We are seeing much improved performance now.

I'll keep you guys posted.

mjp
06-21-2007, 09:14 AM
Just to expand on this somewhat -

We experienced an outage this morning related to problems with our upstream provider and the networks at our data center and office. All services were affected, including web and email.

Three issues came into play that were apparently unrelated, but taken together amounted to what was pretty much a worst-case scenario that affected virtually every aspect of our operations.

- Early this morning our data center upstream provider experienced a power failure that caused an interruption, and they were operating for some time on generator power. This by itself was not too serious, and only resulted in a brief glitch around 4am Pacific Time.

- Shortly after that interruption, our internal network at the data center was saturated with traffic from an apparent distributed denial of service attack. While the servers were all up and running, the DDoS created so much traffic that no one could connect to web or email services.

- To add insult to injury, there was also a failure of our support office internet provider, so support staff was unable to answer helpdesk tickets or make forum posts (Bruce made the first post of this thread from home).

We have routed the DDoS traffic away from our network so traffic can flow normally, and are working on tracking down the source - and target - of the attack.

Our network capacity is many times greater than what is needed for typical operations, and we have measures in place to prevent most DoS attacks from ever being noticed. This particular attack, however, was so massive in scale that our connection from the upstream provider was saturated, which is what caused the outage. When the incoming line is monopolized in such a way, the anti-DDoS equipment doesn't get a chance to do its job on the traffic and everything grinds to a halt.

It is impossible to prevent DDoS attacks. As mentioned previously, we do have measures in place to mitigate them, so the vast majority of the time there is no effect on normal service. But if someone is determined to throw enough traffic at you, they can bring down almost any network. As it turns out, this was an attack on a very large scale, sending traffic hundreds of times what our network is capable of handling, so we could have had 100 times the bandwidth we normally have available and the result would have been the same. All we can do is react as best we can in an instance such as this.

We apologize for the outage, and want to assure you that we are prepared to handle the vast majority of issues such as this. But this particular attack was of a magnitude that would have crippled pretty much any host out there.

Eric
06-22-2007, 04:06 AM
We experienced another massive DDoS attack early this morning. Right now, it intensifies and subsides in waves. We are working with our upstream provider's security team.

We will post here when we can, but the best place to check for this particular outage is http://www.daspstatus.com

Eric
06-22-2007, 09:48 AM
We are under DDoS attack again. Our upstream provider attempted to put some filters in place to thwart the attack. Unfortunately, this did not work.

Eric
06-23-2007, 06:23 AM
We tried another solution, and it appears to be working to mitigate the DDoS attack. We are closely monitoring the systems and will be posting more information on the status page.

mjp
06-30-2007, 12:47 AM
We wanted to post a wrap-up of the DDoS (Distributed Denial of Service) experience in an effort to answer everyone's questions and let you know where we stand now.

As you are no doubt aware, the attack began on Thursday, June 21. Typically a DDoS attack is one long sustained barrage of traffic, and the way most people deal with them is simply to wait it out. This attack was different. When it stopped for the first time we assumed it had ended, but about 6 hours later, it started again. It went on to start and stop many times through Friday.

The attack was on an unusually large scale, saturating our gigabit internet connection. A gigabit connection can move as much data as roughly 650 T1 lines. Our average network traffic is 120 to 150 megabits per second, or 12% to 15% of the gigabit connection. The attack was at least 1 gigabit per second, possibly more (the only measurement we had to go by was the saturation of the gigabit line). That is an incredible amount of traffic.
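The bandwidth figures above can be checked with a little arithmetic. The following is a minimal sketch, assuming the standard T1 rate of 1.544 Mbit/s and treating the gigabit link as 1000 Mbit/s; the function names are made up for illustration.

```python
# Back-of-the-envelope check of the bandwidth figures quoted above.
# Assumptions (not from the original post): a T1 carries 1.544 Mbit/s,
# and "gigabit" means 1000 Mbit/s.

T1_MBPS = 1.544
GIGABIT_MBPS = 1000.0


def t1_equivalents(link_mbps: float) -> float:
    """Number of T1 lines needed to match a link of the given speed."""
    return link_mbps / T1_MBPS


def utilization_percent(traffic_mbps: float, link_mbps: float) -> float:
    """Share of the link used by the given traffic, as a percentage."""
    return 100.0 * traffic_mbps / link_mbps


if __name__ == "__main__":
    print(f"T1 lines per gigabit link: {t1_equivalents(GIGABIT_MBPS):.0f}")
    for traffic in (120.0, 150.0):
        pct = utilization_percent(traffic, GIGABIT_MBPS)
        print(f"{traffic:.0f} Mbit/s is {pct:.0f}% of the link")
```

Under those assumptions the link works out to about 648 T1 lines, and the average traffic uses 12% to 15% of it.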

Throughout Thursday and Friday we attempted to get help from our upstream internet connection provider, but the help they offered proved to be insufficient. They have a managed service that may have been able to mitigate the attack, but it would have taken up to a week to implement. Because the attack was on such a large scale, they wanted to perform a time-consuming risk analysis before taking us on (after which there was the possibility that they would have declined to help).

Once our provider was no longer an option, we considered a number of other solutions before we enlisted the services of a company that specializes in DDoS mitigation, Prolexic Technologies. We called them Friday night and within a couple of hours they were handling the traffic to our name servers and the effects of the DDoS were quickly neutralized. Shortly after that, the attack stopped (likely because the attacker saw that it was no longer effective).

There was a period of DNS propagation on Saturday, but since then we have had no ill effects from DDoS. Brief outages earlier in the week were caused by name server issues that we have addressed.

At this point we have much more extensive protections in place than any other host that is comparable to us in size (and many much larger hosts and ISPs as well). We believe we have always offered superior service, and the difficulties that were raised by this recent attack - and the measures that we took to overcome them - have served to strengthen our network overall, and have actually increased the level of service that we are able to provide.

---

During the DDoS attack we established a status page (http://daspstatus.com) that we posted updates to when we had news to report. That page is served from outside of our network. It will remain available, but will only be used to report status in the event that our entire network is unavailable and Control Panel and these forums are not an option.

We discussed many communication options, and have adopted the following:

- The emergency status page will be used in the event the network is unreachable.

- In the event of an outage or issue that is not global (i.e., affecting one server or service), a notice will be posted in the new "Outages and maintenance" forum.

- In the event of an outage that affects all customer sites or email, but does not affect Control Panel, a notice will be posted in the Control Panel news section.

- As email addresses using domains that we host may be unavailable during a global outage, we are recommending that you update your admin email contact to use a third-party address that is not hosted with us.

---

Read Bruce's first-person insider account of the DDoS and how we dealt with it here: http://community.discountasp.net/default.aspx?f=15&m=18352

---

What is a DDoS attack?

DDoS stands for Distributed Denial of Service. A DDoS is carried out by instructing hundreds or thousands (or in some cases hundreds of thousands) of computers to send traffic to a target site or IP address.

These computers are called zombies or bots, and they are mainly normal home or office computers that have been compromised via a trojan horse, worm, virus, etc. Most of the "bot" owners do not even realize their computers are being used, as each bot only sends a small amount of information. The power of the DDoS is many, many machines, each sending a little data. Since the traffic is coming from thousands of different sources, it is impossible to block it by IP address.

What effect did the DDoS have on DiscountASP.NET?

What happened, basically, was that our connection to the internet was inundated with more traffic than it could handle.

Traffic flows into our network from the upstream provider's router (the "upstream provider" is the company that provides our bandwidth). Their router funnels the traffic from all over the internet to our network at a data center here in California.

Normally our connection - the narrow part of the funnel, if you will - is more than adequate to handle the flow. In fact, it is roughly seven to eight times larger than our average traffic requires. But when the DDoS started, it was like someone aiming a fire hose at the funnel.

So when you attempted to connect to your site or email server, your request was like a drop of water stuck behind the flow of the fire hose. Usually the connection will time out, as the wait to get through would be too long.
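The funnel analogy can be put into rough numbers. This is a simplified sketch, assuming that a saturated link drops packets at random regardless of who sent them; real routers and real attacks behave in more complicated ways, and the attack sizes below are hypothetical, not measurements.

```python
# Toy model of the "funnel": a link with fixed capacity shared by
# legitimate and attack traffic.
# Assumption (not from the original post): once the link is full, packets
# are dropped at random, so every sender keeps the same share of its traffic.


def legit_fraction_delivered(link_mbps: float, legit_mbps: float,
                             attack_mbps: float) -> float:
    """Fraction of legitimate traffic that makes it through the link."""
    offered = legit_mbps + attack_mbps
    if offered <= link_mbps:
        return 1.0  # the funnel is wide enough; nothing is dropped
    return link_mbps / offered


if __name__ == "__main__":
    link = 1000.0   # gigabit connection
    legit = 150.0   # upper end of the average traffic quoted in this thread
    # Hypothetical attack sizes in Mbit/s; the real volume was never measured.
    for attack in (0.0, 1000.0, 5000.0, 20000.0):
        frac = legit_fraction_delivered(link, legit, attack)
        print(f"attack {attack:>7.0f} Mbit/s -> "
              f"{100 * frac:.0f}% of legitimate packets arrive")
```

Because a page load or mail session needs many packets to arrive in sequence, even a moderate loss rate can be enough to make connections time out.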

What can be done to prevent DDoS attacks?

We use an intrusion prevention device called Tipping Point. Tipping Point is able to analyze traffic and drop or reject suspect traffic such as that coming from a DoS or DDoS attack. In fact, Tipping Point has stopped many DoS attacks aimed at our network. But Tipping Point is on our end of the "funnel." So during an attack that creates a backup at the top of the funnel, most of the traffic never even gets to the Tipping Point device, so it cannot do its job.

In order to prevent our network from being overrun by a DDoS attack, we have engaged the services of Prolexic, a company that specializes in DDoS mitigation. Their system protects many large sites and networks, including foreign government sites, major news sites and very large web hosts. We believe this will be a very effective addition to the group of tools that we employ to counteract the malicious activities that commonly affect large networks.
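Why the position of the filter matters can be shown with a small model. This is only a sketch under stated assumptions: the saturated link drops traffic proportionally, the mitigation service filters attack traffic before it reaches our link, and the 99% filter efficiency and 5000 Mbit/s attack size are invented for illustration.

```python
# Compare a filter on our side of the link with one placed before the link.
# All numbers and names here are illustrative assumptions, not measurements.


def delivered_legit_mbps(link_mbps: float, legit_mbps: float,
                         attack_mbps: float, filter_efficiency: float,
                         filter_upstream: bool) -> float:
    """Legitimate traffic (Mbit/s) that reaches the servers.

    filter_efficiency is the fraction of attack traffic the filter removes.
    """
    if filter_upstream:
        # Attack traffic is scrubbed before it ever reaches the link.
        attack_mbps *= (1.0 - filter_efficiency)
    offered = legit_mbps + attack_mbps
    if offered <= link_mbps:
        return legit_mbps
    # A saturated link passes only link_mbps / offered of each flow.
    # A filter behind the link can discard attack packets that arrive,
    # but it cannot bring back legitimate packets already dropped.
    return legit_mbps * link_mbps / offered


if __name__ == "__main__":
    for upstream in (False, True):
        got = delivered_legit_mbps(1000.0, 150.0, 5000.0, 0.99, upstream)
        where = "before the link" if upstream else "behind the link"
        print(f"filter {where}: {got:.0f} of 150 Mbit/s legitimate traffic")
```

In this model a filter behind the link makes no difference to how much legitimate traffic arrives, no matter how good it is, while the same filter placed ahead of the link restores normal service.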