Accidental Denial-of-Service

When pushing a bug fix to the Endorse.me servers this morning, I noticed that the site was responding extraordinarily slowly. A quick check in the logs showed that we were being flooded with requests from a single IP address for a single resource (a Flash/SWF file that we use to allow users to copy text to the clipboard). Those requests were hogging almost an entire web process on Heroku.
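A rough sketch of how that kind of log check could be automated, in TypeScript. It assumes each log line carries the client address in a `fwd="..."` field and the requested resource in a `path="..."` field; the function names are hypothetical, and the patterns would need adjusting for any other log format.

```typescript
// Count requests per (client address, path) pair from raw log lines.
// Assumption: lines contain fwd="<address>" and path="<path>" fields.
function countByClientAndPath(lines: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    const ip = /\bfwd="([^"]+)"/.exec(line)?.[1];
    const path = /\bpath="([^"]+)"/.exec(line)?.[1];
    if (!ip || !path) continue; // skip lines without both fields
    const key = `${ip} ${path}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Return the n busiest (client, path) pairs, highest count first.
function topOffenders(counts: Map<string, number>, n: number): [string, number][] {
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, n);
}
```

Feeding a few thousand recent log lines through `countByClientAndPath` and printing `topOffenders(counts, 5)` should make a single address hammering a single file stand out immediately.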

I quickly spun up another dyno on Heroku (cloud computing FTW) and looked a little bit deeper into the issue. I emailed the user whose account seemed to be the source, but received no response. Some light googling didn’t reveal any easy ways to block specific IP addresses using Heroku/Node.js (I found some ways using Rack), and I started considering just how bad it would be to continue to throw new dynos at the problem.
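For reference, an application-level block in Node.js could look something like the TypeScript sketch below. It makes several assumptions: the app sits behind a proxy that appends the real client address as the last entry of the `X-Forwarded-For` header, the blocklist is a hard-coded set, and the `IncomingLike`/`ResponseLike` types are hypothetical stand-ins shaped like Node's request and response objects. The address `203.0.113.7` is a placeholder from the documentation range, not the real offender.

```typescript
// Hypothetical blocklist; 203.0.113.7 is a documentation-range placeholder.
const BLOCKED_IPS = new Set<string>(["203.0.113.7"]);

// Minimal structural types so the sketch does not depend on any framework.
interface IncomingLike {
  headers: Record<string, string | string[] | undefined>;
  socket?: { remoteAddress?: string };
}
interface ResponseLike {
  statusCode: number;
  end(body?: string): void;
}

// Behind a proxy, the socket address belongs to the proxy, so prefer the
// last X-Forwarded-For entry (assumed to be appended by the proxy itself).
function clientIp(req: IncomingLike): string | undefined {
  const raw = req.headers["x-forwarded-for"];
  const header = Array.isArray(raw) ? raw.join(",") : raw;
  if (header) {
    const parts = header
      .split(",")
      .map((p) => p.trim())
      .filter((p) => p.length > 0);
    if (parts.length > 0) return parts[parts.length - 1];
  }
  return req.socket?.remoteAddress;
}

// Sends a 403 and returns true when the client is on the blocklist;
// returns false so the caller can continue handling the request normally.
function rejectIfBlocked(req: IncomingLike, res: ResponseLike): boolean {
  const ip = clientIp(req);
  if (ip !== undefined && BLOCKED_IPS.has(ip)) {
    res.statusCode = 403;
    res.end("Forbidden");
    return true;
  }
  return false;
}
```

Calling `rejectIfBlocked(req, res)` at the top of a request handler and returning early when it yields `true` keeps the blocked client away from the expensive part of the app. The rejected request still reaches a dyno, though, so this only softens a flood rather than stopping it.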

I moved the SWF file to our CDN, where it should have been all along. For convenience in development, I had kept it local, thinking “what’s the worst that could happen with ONE static asset?” Sigh.

Eventually I signed us up for CloudFlare, which routes all of our traffic through their network, allowing me to block specific IPs. Their sign-up process was completely painless aside from the DNS propagation, which is unavoidable.

As soon as we were set up, I blocked the IP, and everything went back to normal. I was able to scale our Heroku setup back down again, and actually read my logs.

Now, I’m inclined to think that this whole thing was an accident for a few reasons:

  1. The User Agent string indicated the requests came from Chrome.
  2. The requests were for a SWF file, which seems like it could be a bug; it wasn’t even a big file.
  3. While there were a lot of requests, it wasn’t even close to actually crippling us or taking us down.

If anyone else has run into this issue and knows a better long-term fix that doesn’t involve blocking an IP, please let me know.