Prevent Malicious Requests - DoS Attacks - C#

I'm developing an ASP.NET MVC web application and the client has requested that we try our best to make it as resilient as possible to denial-of-service attacks. They are worried that the site may receive malicious high-volume requests with the intention of slowing or taking down the site.
I have discussed this with the product owner as really being outside the remit of the web application itself. I believe it falls to the hosting/network team to monitor traffic and respond to malicious requests.
However they are adamant that the application should have some precautions built into it. They do not want to implement CAPTCHA though.
It has been suggested that we restrict the number of requests that can be made per session within a given time frame. I was thinking of doing something like the approach in "Best way to implement request throttling in ASP.NET MVC?", but using the session ID rather than the client IP, since IP-based throttling would cause problems for users coming from behind a corporate firewall: their IPs would all be the same.
They have also suggested adding the ability to turn off certain areas of the site, suggesting that an admin user could turn off database-intensive areas. However, this would be controlled through the UI, and if the site were under a DoS attack, an admin user would surely not be able to reach it anyway.
My question is: is it really worth doing this? Surely a real DoS attack would be much more advanced?
Do you have any other suggestions?

A denial-of-service attack can be pretty much anything that would affect the stability of your service for other people. In this case you're talking about a network DoS and, as already stated, this generally wouldn't be handled at your application level.
Ideally, this kind of attack would be mitigated at the network level. There are dedicated firewalls built for this, such as the Cisco ASA 5500 series, which works its way up from basic protection through to high-throughput mitigation. They're pretty smart boxes and I can vouch for their effectiveness at blocking these types of attacks, as long as the correct model for the throughput you're getting is used.
Of course, if it's not possible to have access to a hardware firewall that does this for you, there are some stopgap measures you can put in place to assist with defence from these types of attacks. Please note that none of these are going to be even half as effective as a dedicated firewall would be.
One such example is the IIS module Dynamic IP Restrictions, which allows you to define a limit on maximum concurrent requests. In practice this has a downside, however: it may start blocking legitimate requests from browsers that open many concurrent connections to download scripts, images, and so on.
Finally, something you could do that is really crude, but also really effective, is something like a tool I had written previously. It was a small program that monitored log files for duplicate requests from the same IP - say, 10 requests to /Home within 2 seconds from 1.2.3.4. If this was detected, a firewall rule (added to Windows Advanced Firewall using shell commands) would block requests from that IP; the rule could then be removed 30 minutes or so later.
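If you did have to roll something like that yourself, the detection half might look like the sketch below. The log tuple shape, thresholds, and method names are my own assumptions, and the actual firewall call is only hinted at in a comment:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rough sketch of the detection half: count requests per (IP, path) in a
// sliding window and flag offenders. Adapt the entry shape to your real
// IIS log schema - this is illustrative, not the original tool.
public static class DosLogMonitor
{
    public static List<string> FindOffendingIps(
        IEnumerable<(DateTime Time, string Ip, string Path)> entries,
        int maxRequests, TimeSpan window)
    {
        var offenders = new HashSet<string>();
        foreach (var group in entries.GroupBy(e => (e.Ip, e.Path)))
        {
            var times = group.Select(e => e.Time).OrderBy(t => t).ToList();
            // If any run of maxRequests consecutive hits fits in 'window',
            // flag the IP.
            for (int i = 0; i + maxRequests - 1 < times.Count; i++)
            {
                if (times[i + maxRequests - 1] - times[i] <= window)
                {
                    offenders.Add(group.Key.Ip);
                    // The original tool then shelled out to add a block
                    // rule (e.g. "netsh advfirewall firewall add rule ...")
                    // and removed it again ~30 minutes later.
                    break;
                }
            }
        }
        return offenders.ToList();
    }
}
```

Scheduling the rule removal (the "unblock after 30 minutes" part) is the fiddly bit in practice; a timestamped rule name plus a periodic cleanup task is one simple way to do it.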
Like I say, it's very crude, but if you have to do it at the server level, you don't really have many sensible options since it's not where it should be done. You are exactly correct in that the responsibility somewhat lies with the hosting provider.
Finally, you're right about the CAPTCHA, too. If anything, it could assist a DoS by performing image generation (which can be resource-intensive) over and over again, starving your resources even further. The time a CAPTCHA would be effective is if your site were being spammed by automated registration bots, but I'm sure you knew that already.
If you really want to do something at the application level just to please the powers that be, implementing some IP- or session-based request restriction in your app is doable, albeit 90% ineffective (since you still have to process each request).
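For completeness, here is a minimal sliding-window limiter keyed on an arbitrary string, so it can use the session ID (as the question suggests) instead of the IP. The class name and limits are illustrative; in MVC you would wrap something like this in an action filter that reads HttpContext.Session.SessionID and returns HTTP 429 on rejection:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Sliding-window rate limiter. Keying on session ID avoids penalising
// many users behind one corporate NAT, though an attacker can of course
// just open new sessions - which is why this is only a token measure.
public class SessionRateLimiter
{
    private readonly int _maxRequests;
    private readonly TimeSpan _window;
    private readonly ConcurrentDictionary<string, Queue<DateTime>> _hits = new();

    public SessionRateLimiter(int maxRequests, TimeSpan window)
    {
        _maxRequests = maxRequests;
        _window = window;
    }

    // Returns false if the caller should be rejected.
    public bool TryRequest(string sessionId, DateTime now)
    {
        var queue = _hits.GetOrAdd(sessionId, _ => new Queue<DateTime>());
        lock (queue)
        {
            while (queue.Count > 0 && now - queue.Peek() > _window)
                queue.Dequeue();              // drop hits outside the window
            if (queue.Count >= _maxRequests)
                return false;                 // over the limit
            queue.Enqueue(now);
            return true;
        }
    }
}
```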

You could implement the solution in the cloud and scale servers if you absolutely had to stay up, but it could get expensive...
Another idea would be to log the IP addresses of registered users. In the event of a DoS, restrict all traffic to requests from 'good' users.

Preventing a true DoS attack at the application level is not really doable: the requests will most probably kill your web server before they kill your application, since your application is associated with an application pool, which in turn has a maximum number of concurrent requests defined by the server technology you are using.
This interesting article,
http://www.asp.net/web-forms/tutorials/aspnet-45/using-asynchronous-methods-in-aspnet-45
states that Windows 7, Windows Vista and Windows 8 have a maximum of 10 concurrent requests. It goes further with the statement that "You will need a Windows server operating system to see the benefits of asynchronous methods under high load".
You can increase the HTTP.sys queue limit of the application pool associated with your application in order to increase the number of requests that will be queued (for later processing when threads become available). This prevents the HTTP Protocol Stack (HTTP.sys) from returning HTTP error 503 when the limit is exceeded and no worker process is available to handle further requests.
You mention that the customer requires you to "try [your] best to make it as resilient as possible to Denial of Service attacks".
My suggestion might not be an applicable measure in your situation, but you could look into implementing the Task-based Asynchronous Pattern (TAP) mentioned in the article in order to accommodate the customer's requirement.
This pattern releases threads while long-running operations are performed, making them available for further requests (thus keeping your HTTP.sys queue shorter), while also giving your application the benefit of increased overall performance when multiple requests to third-party services or multiple intensive I/O computations are performed.
This measure will NOT make your application resilient to DoS attacks, but it will make your application as responsible as possible on the hardware that it is served on.
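To illustrate the TAP idea, here is a sketch with simulated I/O rather than a real controller (the service name, URL-less fetch, and delays are all made up). In MVC you would expose this as an async Task<ActionResult> action; while the awaited calls are in flight, the request thread is returned to the pool instead of blocking:

```csharp
using System;
using System.Threading.Tasks;

// TAP sketch: three "remote" calls issued concurrently. The thread is
// free while all of them are pending, so the server can accept other
// requests with the same thread pool.
public static class GizmoService
{
    // Stands in for e.g. HttpClient.GetStringAsync against a third party.
    private static async Task<string> FetchAsync(string name)
    {
        await Task.Delay(100);    // simulated network latency
        return $"data for {name}";
    }

    public static async Task<string[]> GetDashboardAsync()
    {
        // Start all three before awaiting any, so they overlap.
        Task<string> gizmos  = FetchAsync("gizmos");
        Task<string> widgets = FetchAsync("widgets");
        Task<string> prices  = FetchAsync("prices");
        return await Task.WhenAll(gizmos, widgets, prices);
    }
}
```

Because the three fetches overlap, the wall-clock time is roughly one delay rather than three, and no thread sits blocked during any of them.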

Related

What are the downsides to Request throttling using delay (C# .Net 4 Web Server)

We are running an HTTP API and want to be able to set a limit on the number of requests a user can make per time unit. When this limit has been reached, we don't want the users to receive errors such as HTTP 429. Instead we want to increase the response times. The result is that users can continue to work, but more slowly, and can then choose whether or not to upgrade their payment plan. This solution can quite easily be implemented using Thread.Sleep (or something similar) for x number of seconds, on all requests of a user that has passed their limit.
We think that in the worst case there might be a problem with the number of possible connections for a single server, since as long as we keep delaying the response, we keep a connection open, thereby limiting the number of other possible connections.
All requests to the API run asynchronously. The server itself is built to be scalable and runs behind a load balancer; we can start up additional servers if necessary.
When searching for this type of throttling, we found very few examples of this way of limiting users, and the examples we did find seemed not at all concerned about connections running out. So we wonder: is this not a problem?
Are there any downsides to this that we are missing, or is this a feasible solution? How many connections can we have open simultaneously without starting to get problems? Can our vision be solved in another way, that is without giving errors to the user?
Thread.Sleep() is pretty much the worst possible thing you can do on a web server. It doesn't matter that you are running things asynchronously, because that only applies to I/O-bound operations, where awaiting frees the thread to do more work.
By using a Sleep() command, you will effectively be taking that thread out of commission for the time it sleeps.
ASP.NET app pools have a limited number of threads available to them, so in the worst-case scenario you will max out the total number of connections to your server at 40-50 (or whatever the default is) if all of them are sleeping at once.
Secondly
This opens up a major attack vector for DoS. If I were an attacker, I could easily take out your entire server by spinning up 100 or 1000 connections, all using the same API key. Using this approach, the server will dutifully start putting all the threads to sleep, and then it's game over.
UPDATE
So you could use Task.Delay() to insert an arbitrary amount of latency into the response. Under the hood it uses a timer, which is much lighter-weight than blocking a thread.
await Task.Delay(numberOfMilliseconds);
However...
This only takes care of one side of the equation. You still have an open connection to your server for the duration of the delay. Because connections are a limited resource, this still leaves you vulnerable to a DoS attack that wouldn't normally have existed.
This may be an acceptable risk for you, but you should at least be aware of the possibility.
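A soft-throttle along those lines might look like the following sketch. The class, back-off curve, and cap are my own illustrative choices; a real version would also reset the counters at the end of each billing window (omitted here) and would sit in a message handler or middleware:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// "Slow down instead of 429": requests over quota are delayed with
// Task.Delay (timer-based, no thread blocked). Note the caveat above:
// the connection itself stays open for the whole delay.
public class SoftThrottle
{
    private readonly int _freeRequestsPerWindow;
    private readonly ConcurrentDictionary<string, int> _counts = new();

    public SoftThrottle(int freeRequestsPerWindow) =>
        _freeRequestsPerWindow = freeRequestsPerWindow;

    // Returns the delay that was applied to this request.
    public async Task<TimeSpan> HandleAsync(string apiKey)
    {
        int seen = _counts.AddOrUpdate(apiKey, 1, (_, n) => n + 1);
        int over = seen - _freeRequestsPerWindow;
        if (over <= 0)
            return TimeSpan.Zero;             // under quota: no delay

        // Back off progressively, capped so responses stay bounded.
        var delay = TimeSpan.FromMilliseconds(Math.Min(over * 50, 2000));
        await Task.Delay(delay);              // timer, not Thread.Sleep
        return delay;
    }
}
```

Capping the delay matters twice over: it bounds the user's worst-case latency, and it bounds how long each attacker-held connection can be kept open.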
Why not simply add a "Please wait..." message on the client to make it artificially look like the request is processing? Adding artificial delays on the server costs you: it leaves connections, and potentially threads, tied up unnecessarily.

Pushing OR Polling

I have a Silverlight client and a WCF service. The client polls the WCF service every 4 seconds, and I have almost 100 clients at a time.
The web server is an entry level server with 512 MB RAM.
I want to know whether polling depends on the server configuration: if I upgrade the server, will polling work better for the clients?
And second, would pushing (duplex) be better than polling? I have got some mixed response from the blogs I have been reading.
Moreover, what are the best practices for optimizing polling for quicker response at the client? My application needs real-time data.
Thanks
My guess would be that you have some kind of race condition that is showing up only with a larger number of clients. What concurrency and instancing modes are you using for your WCF service? (See MSDN: WCF Sessions, Instancing, and Concurrency at http://msdn.microsoft.com/en-us/library/ms731193.aspx)
If you're "losing" responses the first thing I would do is start logging or tracing what's happening at the server. For instance, when a client "doesn't see" a response, is the server ever getting a request? (If so, what happens to it, etc etc.)
I would also keep an eye on memory usage -- you don't say what OS you're using, but 512 MB is awfully skinny these days. If you ever get into a swap-to-disk situation, it's clearly not going to be a good thing.
Lastly, assuming that your service is CPU-bound (i.e. no heavy database & filesystem calls), the best way to raise your throughput is probably to reduce the message payload (wire size), use the most performant bindings (i.e. if client is .NET and you control it, NetTcp binding is much faster than HTTP), and, of course, multithread your service. IMHO, with the info you've provided -- and all other things equal -- polling is probably fine and pushing might just make things more complex. If it's important, you really want to bring a true engineering approach to the problem and identify/measure your bottlenecks.
Hope this helps!
"Push" notifications generally have a lower network overhead, since no traffic is sent when there's nothing to communicate. But "pull" notifications often have a lower application overhead, since you don't have to maintain state when the client is just idling waiting for a notification.
Push notifications also tend to be "faster", since clients are notified immediately when the event happens rather than waiting for the next polling interval. But pull notifications are more flexible -- you can use just about any server or protocol you want, and you can double your client capacity just by doubling your polling wait interval.

High availability & scalability for C#

I've got a C# service that currently runs single-instance on a PC. I'd like to split this component so that it runs on multiple PCs. Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine.
Data synchronization can be done by the DB, so that should not be much of an issue. My current idea is to use some kind of load balancer that splits and sends the incoming requests to the array of PCs and makes sure the work is actually processed.
How would I implement such a functionality? I'm not sure if I'm asking the right question. If my understanding of how this goal should be achieved is wrong, please give me a hint.
Edit:
I wonder if the idea given above (a load balancer splits work packages across PCs and checks for results) is feasible at all. If there is some kind of already-implemented solution to this seemingly common problem, I'd love to use it.
Availability is a critical requirement.
I'd recommend looking at a Pull model of load-sharing, rather than a Push model. When pushing work, the coordinating server(s)/load-balancer must be aware of all the servers that are currently running in your system so that it knows where to forward requests; this must either be set in config or dynamically set (such as in the Publisher-Subscriber model), then constantly checked to detect if any servers have gone offline. Whilst it's entirely feasible, it can complicate the scaling-out of your application.
With a pull architecture, you have a central work queue (hosted in MSMQ, SQL Server Service Broker or similar) and each processing service pulls work off that queue. Expose a WCF service to accept external requests and place work onto the queue, safe in the knowledge that some server will do the work, even though you don't know exactly which one. This has the added benefits that each server monitors its own workload and picks up work as and when it is ready, and that you can easily add or remove servers to/from this model without any change in config.
This architecture is supported by NServiceBus and the communication between Windows Azure Web & Worker roles.
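The pull model can be sketched in-process with a BlockingCollection standing in for the durable queue (MSMQ, Service Broker, or a service bus in production) - the class and naming below are illustrative only:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Pull model: any number of workers drain one shared queue. Adding or
// removing a worker needs no config change anywhere else - each worker
// simply pulls work as-and-when it is ready.
public static class PullModelDemo
{
    public static List<string> Run(int workerCount, IEnumerable<string> jobs)
    {
        using var queue = new BlockingCollection<string>();
        var results = new ConcurrentBag<string>();

        var workers = new List<Task>();
        for (int i = 0; i < workerCount; i++)
        {
            int id = i;
            workers.Add(Task.Run(() =>
            {
                // Blocks while the queue is empty; exits when drained
                // after CompleteAdding() has been called.
                foreach (var job in queue.GetConsumingEnumerable())
                    results.Add($"worker{id}:{job}");   // "process" the job
            }));
        }

        foreach (var job in jobs) queue.Add(job);   // producer side
        queue.CompleteAdding();                     // signal: no more work
        Task.WaitAll(workers.ToArray());

        var list = new List<string>(results);
        list.Sort();
        return list;
    }
}
```

With a durable queue in place of the in-memory collection, a crashed worker's unacknowledged messages simply return to the queue for another worker to pick up, which is where the failover in the question comes from.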
From what you said, each PC will require a full copy of your service -
"Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine"
- otherwise you won't be able to move its work to another PC.
I would be tempted to have a central server which farms out work to individual PCs. This means you would need some form of communication between each machine and the central server, and a record kept on the central server of what work has been assigned where.
You'll also need each machine to measure its CPU load and reject work if it is too busy.
A multi-threaded approach to the service would make good use of the multiple processor cores that are ubiquitous nowadays.
How about using a server and multi-threading your processing? Or even multi-threading on a PC as you can get many cores on a standard desktop now.
This obviously doesn't deal with the machine going down, but could give you much more performance for less investment.
You can check out Windows clustering; you'll have to handle a set of issues that depend on the behaviour of the service (if you add more details about the service itself, I can give a fuller answer).
This depends on how you want to split your workload. This is usually done in one of two ways:
Splitting the same workload across multiple services
The same service is installed on different servers and does the same job. Assume your service reads huge amounts of data from the DB servers, processes it to produce large client-specific data files, and finally sends those files to the clients. In this approach, all the services installed on the different servers do the same work, but they split it between them to increase performance.
Splitting parts of the workload across multiple services
Each service is assigned its own individual job and works towards a different goal. In the example above, one service is responsible for reading data from the DB and generating the large data files, and another service is configured only to read the data files and send them to clients.
I have implemented the second approach in one of my projects, because it let me isolate and debug errors in case of any failures.
The usual approach for a load balancer is to split service requests evenly between all service instances.
For each work item (request) you can store related information in a database. Each service should then also have at least one background thread checking the database for abandoned work items.
I would suggest that you publish your service through WCF (Windows Communication Foundation).
Then implement a "central" client application which can keep track of available providers of your service and dish out work. The central app will act as scheduler and load balancer of the tasks to be performed.
Check out Juval Löwy's book on WCF ("Programming WCF Services") for a good introduction to this topic.
You can have a look at NGrid: http://ngrid.sourceforge.net/
or Alchemi: http://www.gridbus.org/~alchemi/index.html
Both are grid-computing frameworks with load balancers that will get you started in no time.
Cheers,
Florian

Message Granularity for Message Queues and Service Buses

I'm working on an application that may generate thousands of messages in a fairly tight loop on a client, to be processed on a server. The chain of events is something like:
Client processes item, places in local queue.
Local queue processing picks up messages and calls web service.
Web service creates message in service bus on server.
Service bus processes message to database.
The idea being that all communications are asynchronous, as there will be many clients for the web service. I know that MSMQ can do this directly, but we don't always have that kind of admin capability on the clients to set things up like security etc.
My question is about the granularity of the messages at each stage. The simplest method would mean that each item processed on the client generates one client message / web service call / service bus message. That's fine, but I know it's better to batch up the web service calls if possible, except that there's a trade-off between large-granularity web service DTOs and short-running transactions on the database. This particular scenario does not require a "business transaction" where all or no items are processed; I'm just looking to achieve the best balance of message size vs. number of web service calls vs. database transactions.
Any advice?
Chatty interfaces (i.e. lots and lots of small messages) tend to have a high overhead from dispatching each incoming message (and, on the client, each reply) to the correct code to process it - a fixed cost per message - while large messages spend their resources on processing the message body itself.
Additionally, a lot of web service calls in progress means a lot of TCP/IP connections to manage, and concurrency problems (including locking in a database) might arise.
But without some details of the processing of the message it is hard to be specific, other than the general advice against chatty interfaces because of the fixed overheads.
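One common middle ground between the two extremes is batching by size or age: buffer items and flush them as a single call either when the batch is full or when the oldest item has waited too long. The class and thresholds below are illustrative (tune them by measurement, as the next answer says):

```csharp
using System;
using System.Collections.Generic;

// Size-or-age batching on the client side. Each flush stands in for one
// web service call carrying a batch DTO instead of one call per item.
public class Batcher<T>
{
    private readonly int _maxItems;
    private readonly TimeSpan _maxAge;
    private readonly Action<IReadOnlyList<T>> _flush;   // one WS call per batch
    private readonly List<T> _buffer = new();
    private DateTime _oldest;

    public Batcher(int maxItems, TimeSpan maxAge, Action<IReadOnlyList<T>> flush)
    {
        _maxItems = maxItems;
        _maxAge = maxAge;
        _flush = flush;
    }

    public void Add(T item, DateTime now)
    {
        if (_buffer.Count == 0) _oldest = now;
        _buffer.Add(item);
        // Flush when full, or when the oldest buffered item is too old.
        if (_buffer.Count >= _maxItems || now - _oldest >= _maxAge)
            Flush();
    }

    public void Flush()
    {
        if (_buffer.Count == 0) return;
        _flush(new List<T>(_buffer));   // copy, then hand off
        _buffer.Clear();
    }
}
```

The size cap bounds the database transaction length per batch, and the age cap bounds how stale an item can get, which addresses both sides of the trade-off in the question.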
Measure first, optimize later. Unless you can make a back-of-the-envelope estimate that shows that the simplest solution yields unacceptably high loads, try it, establish good supervisory measurements, see how it performs and scales. Then start thinking about how much to batch and where.
This approach, of course, requires you to be able to change the web service interface after deployment, so you need a versioning approach to deal with clients which may not have been redesigned, supporting several WS versions in parallel. But not thinking about versioning almost always traps you in suboptimal interfaces, anyway.
Abstract the message queue and have a swappable message-queue backend. This way you can test many backends and give yourself an easy bail-out should you pick the wrong one, or grow to like a new one that appears. The overhead of messaging is usually in packing and handling the request, and different systems are designed for different levels of traffic and different symmetries over time.
If you abstract out the basic features, you can swap the mechanics in and out as your needs change or are more accurately assessed.
You can also translate messages between differing queue types at various points along the message route, as the load on each recipient changes - for example, one part of the system handling 1000 messages per second versus another handling 10.
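The abstraction itself can be as small as an interface plus one trivial backend (names below are illustrative; production backends would wrap MSMQ, Service Broker, or a service-bus client behind the same contract):

```csharp
using System.Collections.Generic;

// The swappable seam: producers and consumers depend only on this,
// never on a concrete queue technology.
public interface IMessageQueue
{
    void Send(string message);
    bool TryReceive(out string message);
}

// Trivial in-memory backend - useful for unit tests and as the
// "bail-out" implementation while evaluating real backends.
public class InMemoryQueue : IMessageQueue
{
    private readonly Queue<string> _items = new();

    public void Send(string message) => _items.Enqueue(message);

    public bool TryReceive(out string message)
    {
        if (_items.Count > 0)
        {
            message = _items.Dequeue();
            return true;
        }
        message = null;
        return false;
    }
}
```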
Good Luck

Whose responsibility is it to throttle web requests?

I am working on a class library that retrieves information from a third-party web site. The web site being accessed will stop responding if too many requests are made within a set time period (~0.5 seconds).
The public methods of my library directly relate to resources (files) on the web server. In other words, each time a method is called, an HttpWebRequest is created and sent to the server. If all goes well, an XML file is returned to the caller. However, if this is the second web request in less than 0.5 s, the request will time out.
My dilemma lies in how I should handle request throttling (if at all). Obviously, I don't want the caller to sit around waiting for a response - especially if I'm completely certain that their request will time out.
Would it make more sense for my library to queue and throttle the web requests I create, or should my library simply throw an exception if a client does not wait long enough between API calls?
The concept of a library is to give its client code as little to worry about as possible. Therefore I would make it the library's job to queue requests and return results in a timely manner. In an ideal world, you would use a callback or delegate model so that the client code can operate asynchronously without blocking the UI. You could also offer the option of skipping the queue (and failing if a call comes too soon), and possibly even offer priorities within the queue model.
I also believe it is the responsibility of the library author to default to being a good citizen, and for the library's default operation to be to comply to the conditions of the data provider.
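The spacing logic at the heart of that "good citizen" default is small. Here is a sketch (class and method names are my own; time is passed in explicitly so the logic is testable, and a real version would run the queued work on a background task and deliver results via callbacks as suggested above):

```csharp
using System;

// Spaces outgoing requests at least minInterval apart, so callers never
// need to know about the provider's ~0.5 s rule.
public class ThrottledScheduler
{
    private readonly TimeSpan _minInterval;
    private DateTime _nextAllowed = DateTime.MinValue;

    public ThrottledScheduler(TimeSpan minInterval) =>
        _minInterval = minInterval;

    // Returns how long a request arriving at 'now' must wait before it
    // may be sent to the remote server.
    public TimeSpan ScheduleRequest(DateTime now)
    {
        var sendAt = now > _nextAllowed ? now : _nextAllowed;
        _nextAllowed = sendAt + _minInterval;   // reserve the next slot
        return sendAt - now;
    }
}
```

Back-to-back calls get steadily increasing waits (0, one interval, two intervals, ...), while a caller who naturally paces their requests never waits at all.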
I'd say both - you're dealing with two independent systems and both should take measures to defend themselves from excessive load. The web server should refuse incoming connections, and the client library should take steps to reduce the requests it makes to a slow or unresponsive external service. A common pattern for dealing with this on the client is 'circuit breaker' which wraps calls to an external service, and fails fast for a certain period after failure.
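A minimal circuit-breaker sketch looks like this (names and policy are illustrative; libraries such as Polly provide production-grade versions). After a threshold of consecutive failures the circuit "opens" and calls fail fast, with no network attempt, until a cool-down elapses:

```csharp
using System;

// Minimal circuit breaker: after 'failureThreshold' consecutive
// failures the circuit opens and calls fail fast until 'openDuration'
// has elapsed; a subsequent success closes it again.
public class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _openDuration;
    private int _failures;
    private DateTime _openedAt;

    public CircuitBreaker(int failureThreshold, TimeSpan openDuration)
    {
        _failureThreshold = failureThreshold;
        _openDuration = openDuration;
    }

    public bool IsOpen(DateTime now) =>
        _failures >= _failureThreshold && now - _openedAt < _openDuration;

    public T Execute<T>(Func<T> call, DateTime now)
    {
        if (IsOpen(now))
            throw new InvalidOperationException("circuit open: failing fast");
        try
        {
            T result = call();
            _failures = 0;                 // success closes the circuit
            return result;
        }
        catch
        {
            if (++_failures >= _failureThreshold)
                _openedAt = now;           // trip the breaker
            throw;                         // surface the original failure
        }
    }
}
```

The pay-off is exactly the "fail fast for a certain period" behaviour described above: callers get an immediate error instead of piling timed-out requests onto an already struggling external service.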
That's the web server's responsibility, IMO. Because the critical load depends on hardware, network bandwidth and many other things outside of your application's control, the application should not concern itself with trying to deal with it. IIS can throttle traffic based on various configuration options.
What kind of client is it? Is it an interactive client, e.g. a GUI-based app?
In that case, you can treat it like a web-browser scenario and let the timeout surface to the caller. Also, if you know for sure that the web server is throttling requests, you can tell the client that it has to wait for a given time period before retrying. That way, the client will not keep re-issuing requests, and will know from the first timeout that it is futile to issue requests too quickly.