WCF cluster for mid-length tasks - C#

We have a self-hosted WCF application which analyses text. The installation we're discussing involves processing batches of small fragments of text (social media posts) and longer ones (such as newspaper articles). The longer fragments take 5-6 seconds on average to process in one WCF instance, while the shorter ones take under 1 second. There are millions of items of each kind to process every day.
Several questions:
What is the recommended configuration? Windows Azure / some kind of IaaS like Amazon / a cluster managed by a load balancer?
Is there built-in support for load balancing in WCF that does not require writing a wrapper?
For some reason, when a long task is running and another task is submitted to an instance deployed on a multi-core machine, both run in parallel on the same core instead of the new task starting on another core that is free. Is this some kind of conservative allocation? Can it be managed more efficiently?

The easy answer is Azure (because it's a PaaS by Microsoft), but this isn't really a technical question. It depends on costs and growth predictions.
Not really. WCF supports being load balanced, but WCF itself runs in your process and can't load balance itself; that's usually a feature of your hosting platform.
If those are two different processes, then the OS schedules the CPU time, and I wouldn't recommend messing with that. If both ran on the same core, it's probably because they could (which makes sense, as WCF does a lot of I/O).
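For completeness, the standard WCF knobs for letting calls run in parallel are the instancing and concurrency modes; a minimal sketch (the service and contract names are made up for illustration):

```csharp
using System.ServiceModel;

[ServiceContract]
public interface ITextAnalyzer
{
    [OperationContract]
    string Analyze(string fragment);
}

// PerCall instancing keeps the service stateless (which also plays well with an
// external load balancer): each request gets its own service instance on its own
// thread-pool thread, and the OS is free to schedule those threads on any core.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class TextAnalysisService : ITextAnalyzer
{
    public string Analyze(string fragment)
    {
        // The CPU-bound analysis would run here.
        return fragment.Length > 500 ? "long" : "short";
    }
}
```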

Cache Dataset in session to fetch on demand

I'm considering WCF or mORMot as the framework for a RESTful service, where the business/legacy code that needs to be accessed is written in Delphi. Performance is a core requirement of the project.
The application must be prepared for load balancing. The clients of the REST service are Windows desktop applications. These desktop clients allow the user to view large volumes of data, with huge result sets from SQL statements. What is the best way to implement a service that caches a recordset and consumes it gradually through the REST service? Can you show a good example? The recordset must be cached in the session until the client completes the consultation or decides to do the full fetch. I'm looking for the right architecture.
Will load balancing still work in WCF? Since the recordset is cached on a single server, the row-fetch requests, if any, must land on that same server.
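One possible shape for such a service, sketched in C# - every type and member name here is hypothetical, and a production version would need an eviction policy:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of a per-session recordset cache with paged fetches.
// A sticky-session load balancer (or content-based routing, as in the answer
// below) must pin each sessionId to the server holding the cached rows.
public class RecordsetCache
{
    private readonly ConcurrentDictionary<Guid, IList<string[]>> _sessions =
        new ConcurrentDictionary<Guid, IList<string[]>>();

    // Run the SQL query once, cache the full recordset, return a session token.
    public Guid StartQuery(IList<string[]> rows)
    {
        var sessionId = Guid.NewGuid();
        _sessions[sessionId] = rows;
        return sessionId;
    }

    // Each REST call fetches one page from the cached recordset.
    public IList<string[]> FetchPage(Guid sessionId, int pageIndex, int pageSize)
    {
        IList<string[]> rows;
        if (!_sessions.TryGetValue(sessionId, out rows))
            throw new InvalidOperationException("Unknown or expired session.");
        return rows.Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    // Called when the client finishes the consultation or does the full fetch.
    public void EndQuery(Guid sessionId)
    {
        IList<string[]> removed;
        _sessions.TryRemove(sessionId, out removed);
    }
}
```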
Both WCF and mORMot share the same high-performance kernel-mode http.sys server. Both feature IOCP and multi-threading.
For performance, mORMot will be lighter, will allocate (much) less memory, won't be affected by Garbage Collector freezes, and is able to get JSON content directly from the database engine (bypassing most temporary data conversion and allocation) - so you can achieve amazing speed. In short, mORMot was designed from the ground up for performance in serving REST/JSON content - with a multi-threaded kernel (whereas e.g. node.js is mono-threaded). If your purpose is also to cache some data, mORMot works very well as a 64-bit native service, giving access to all your system RAM if needed, and has built-in real-time content compression.
WCF is a great general-purpose communication library, which can be RESTful, but is not RESTful from its (historical) roots. The main issue I saw with WCF is the difficulty of configuring it between applications (.exe.config tuning may be confusing), and that it is a big black box. For instance, it was not possible to implement Cross-Origin Resource Sharing with WCF when the server is hosted as a Windows service (WCF deletes the Access-Control-Allow-Origin HTTP headers!): you have to host it within IIS - and still can't fix the issue, whereas with a fully open source solution you can fix any issue.
Load balancing can be implemented in mORMot and WCF with the same algorithm. Instead of a round-robin algorithm, in your case a simple routing scheme based on the content may be enough.
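A minimal illustration of that kind of content-based routing - the session-token scheme is an assumption, not something either framework mandates:

```csharp
using System;

// Hypothetical content-based router: hash the session token so that every
// row-fetch request for the same cached recordset lands on the same backend.
public static class ContentRouter
{
    private static readonly string[] Backends =
    {
        "http://server1:8080",
        "http://server2:8080",
        "http://server3:8080"
    };

    public static string RouteFor(Guid sessionId)
    {
        // A stable hash of the token picks a fixed backend for the session
        // (the uint cast avoids the negative-modulo pitfall).
        int index = (int)((uint)sessionId.GetHashCode() % (uint)Backends.Length);
        return Backends[index];
    }
}
```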
Using WCF to serve business logic written in Delphi will be slow, error-prone and difficult to maintain. Mixing technologies induces unneeded complexity. I would not go in this direction.
If you have an existing Delphi code base and some Delphi skills, I guess mORMot may be the better choice. It has been reported, for instance, that a single production server is able to handle more than one million requests per day, serving thousands of concurrent clients, with a dedicated JavaScript process on the server side. One of the mORMot design goals was to ease working with existing code and legacy projects. But I'm not 100% impartial, since I'm the main maintainer of this open source project. :)

Clear MemoryCache across worker processes

I have an ASP.NET MVC application that runs on IIS 7. It is set up as a web garden, and the number of worker processes matches the number of my processors. I tend to experience some heavy load at times, and this setup has worked best.
I have implemented some caching using System.Web.Cache. I will occasionally need to invalidate some items in my cache; however, I cannot clear the cache across all worker processes.
Does .NET 4's System.Runtime.Caching make this any easier? Here is a similar question, but I'm hoping there is better advice with .NET 4.
Flush HttpRuntime.Cache objects across all worker processes on IIS webserver
System.Web.Cache and System.Runtime.Caching provide almost the same features; each is just a simple in-memory cache where items can have an expiration time, dependencies, etc.
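One concrete thing the .NET 4 API does add that can help in a web garden (where all worker processes share the machine's file system) is cache change monitors. A sketch using HostFileChangeMonitor - the file path is an assumption:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Caching;

// Sketch: each worker process caches items with a HostFileChangeMonitor on a
// shared "invalidation" file. Touching that file evicts the items in every
// process on the machine. The path below is illustrative.
public static class SharedCache
{
    private const string InvalidationFile = @"C:\AppData\cache-invalidate.txt";

    public static void Add(string key, object value)
    {
        // HostFileChangeMonitor requires an existing file with an absolute path.
        if (!File.Exists(InvalidationFile))
            File.WriteAllText(InvalidationFile, string.Empty);

        var policy = new CacheItemPolicy();
        policy.ChangeMonitors.Add(
            new HostFileChangeMonitor(new List<string> { InvalidationFile }));
        MemoryCache.Default.Set(key, value, policy);
    }

    // Any process can call this; all processes watching the file evict.
    public static void InvalidateAll()
    {
        File.SetLastWriteTimeUtc(InvalidationFile, DateTime.UtcNow);
    }
}
```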
If you want to run your site on multiple physical machines, or in your case as a web garden, caching data in any in-process cache doesn't make a lot of sense, because each process would cache the data again. I'd guess that makes memory consumption grow pretty quickly...
In those scenarios a distributed cache system is the best choice, because all processes can leverage the already-cached data...
I have worked with two pretty popular distributed in-memory cache systems. One is memcached, which was also mentioned in your link.
The other one is AppFabric Cache; here is a good example of how to use it.
Memcached is a "simple" cache; it doesn't care about security and all that stuff in the first place. But it is very easy to implement, and there are good .NET clients which are really simple to use - almost exactly like the built-in .NET cache (a rough illustration follows below).
If you want to encrypt your cache's data transfers and have all of this highly secured, you might want to go with AppFabric...
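For illustration, typical usage of a .NET memcached client (the Enyim client here; the exact API may differ between versions, so treat this as a sketch):

```csharp
using Enyim.Caching;
using Enyim.Caching.Memcached;

// Rough sketch of Enyim memcached client usage. Server addresses come from
// the enyim.com/memcached section of app.config/web.config in the usual setup.
public class DistributedCacheExample
{
    public void Demo()
    {
        var client = new MemcachedClient();

        // Store an item; every worker process / machine sees the same entry.
        client.Store(StoreMode.Set, "user:42", "cached payload");

        // Read it back from any process.
        string value = client.Get<string>("user:42");

        // Invalidating across all processes is a single call.
        client.Remove("user:42");
    }
}
```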

Creating OLTP system using WCF

Is it a good idea to build an OLTP system using WCF?
The system must process 5-8K requests per second.
As noted by @nonnb in a comment, WCF is a great platform for building service-oriented or distributed applications. This includes using WCF in OLTP applications (we do that here). With WCF you could theoretically keep adding servers to scale and handle the load, but usually you will end up hitting some database contention (e.g. locking).
5K-8K requests per second is a large number; that translates to roughly 300K-500K requests per minute. To put this in perspective, if you look at the TPC-C benchmark results, the top end of your range is almost in the top 50 results, with the lower end being in (maybe) the top third of results.
Note that the Microsoft TPC-C results are C++ running in COM+ and do not involve .NET or WCF.
In terms of WCF, some reading of interest would be Creating High Performance WCF Services and A Performance Comparison of Windows Communication Foundation. The latter is almost four years old, so some of those performance numbers may have improved over the years.
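If you do go the WCF route, the service throttle is one of the first knobs for this kind of load; a self-hosted sketch, with placeholder names and limits to be tuned from load testing:

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Description;

[ServiceContract]
public interface IOrderService
{
    [OperationContract]
    void Submit(string order);
}

// Placeholder implementation; the real OLTP logic goes here.
public class OrderService : IOrderService
{
    public void Submit(string order) { /* write to the database */ }
}

class Program
{
    static void Main()
    {
        var host = new ServiceHost(typeof(OrderService),
            new Uri("net.tcp://localhost:9000/orders"));
        host.AddServiceEndpoint(typeof(IOrderService), new NetTcpBinding(), string.Empty);

        // The default throttle is typically too low for thousands of requests
        // per second; these values are illustrative, not recommendations.
        host.Description.Behaviors.Add(new ServiceThrottlingBehavior
        {
            MaxConcurrentCalls = 512,
            MaxConcurrentInstances = 512,
            MaxConcurrentSessions = 512
        });

        host.Open();
        Console.WriteLine("Service running. Press Enter to stop.");
        Console.ReadLine();
        host.Close();
    }
}
```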

An approach towards a Distributed Computing problem using the .Net framework

I'm interested in programming a project which distributes a certain computation on large files throughout several computers. The need for distributed computing arises from the crashy and unstable nature of the software I'm using to do the actual computing - so it might crash on some computers, but others will surely do the job.
The ideas I have so far include:
- Using several servers, each pulling a task from a master server whenever possible
- Using VMware virtual machines
- Using a load-balancing cluster
Which is more suited for the job? Any other ideas I should be aware of?
Also, if you can recommend any reliable distributed computing C# framework, that would be helpful.
I haven't used any of these myself (yet), but I bookmarked this question a little while ago. There are some good suggestions there.
Have you looked at Hadoop MapReduce? It's an open-source implementation of Google's MapReduce framework. Though it's Java and not C#, it sounds like it could be perfectly suited to your scenario; the master server automatically handles load balancing and fault-tolerance in a distributed environment.
I would check out the Appistry CloudIQ Platform. It links multiple machines into a single computing framework identified by a unified address. Your client simply submits jobs to that address, and the framework distributes them to individual machines. It also monitors task execution and can automatically restart failed jobs, so if your application is prone to crashing, it could be ideal for your reliability concerns. Rather than submitting the same job to multiple machines (and wasting CPU) to cover the failure case, submit it once and let the framework handle restarting the jobs that actually fail.
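The "each server pulls a task from a master whenever possible" idea from the question can be sketched in miniature with a blocking queue and requeue-on-failure; a real system would put the queue behind a network boundary (MSMQ, a WCF service, etc.):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Sketch of the pull model with requeue-on-failure. In a real deployment the
// queue would live on the master server and workers would pull over the network.
class WorkQueueDemo
{
    static void Main()
    {
        var tasks = new BlockingCollection<string>();
        for (int i = 0; i < 20; i++) tasks.Add("file-" + i);
        tasks.CompleteAdding();

        // Each worker pulls whenever it is free; failed items are collected
        // and retried rather than lost (modelling the crash-prone software).
        var retries = new ConcurrentQueue<string>();
        Parallel.ForEach(tasks.GetConsumingEnumerable(), file =>
        {
            if (!TryProcess(file))
                retries.Enqueue(file);   // requeue instead of losing the task
        });

        string failed;
        while (retries.TryDequeue(out failed))
            while (!TryProcess(failed)) { /* keep retrying until it succeeds */ }

        Console.WriteLine("All tasks done.");
    }

    static bool TryProcess(string file)
    {
        Thread.Sleep(50);   // stand-in for the actual computation
        // Simulate a roughly 25% crash rate.
        return new Random(file.GetHashCode() ^ Environment.TickCount).Next(4) != 0;
    }
}
```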

Scalability of .NET web services

Can anyone help me with a question about web services and scalability? I have written a web service as a facade into our document management system and need to think about scalability issues. What areas should I be looking at to ensure performance and availability?
Thanks in advance
Performance is separate from scalability. Scalability means that you can add more servers to linearly increase system throughput (i.e. more client connections). The best way to start is with stateless web services; that way any client can call any of the n web service instances on n different machines. If there is a shared database at the end for persistence, that will ultimately be your bottleneck. There are ways to reduce that with data partitioning and sharding, but only when you get to that point.
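When you do get to that point, the simplest form of sharding is a stable function from key to database; a toy sketch (the shard layout and connection strings are hypothetical):

```csharp
using System;

// Toy sketch of data sharding: a stable function of the key decides which
// database holds the row, so each shard sees only a fraction of the load.
public static class ShardMap
{
    private static readonly string[] ConnectionStrings =
    {
        "Server=db0;Database=Docs;Integrated Security=true",
        "Server=db1;Database=Docs;Integrated Security=true"
    };

    public static string ForCustomer(int customerId)
    {
        // The uint cast keeps the modulo non-negative for any input.
        int shard = (int)((uint)customerId % (uint)ConnectionStrings.Length);
        return ConnectionStrings[shard];
    }
}
```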
First of all, decide what acceptable behaviour for your web service is. What should it be able to cope with - 1,000 connections per second? What response time should each connection have?
Then you need to automate the usage of your web service so you can stress test the system.
What happens when you have 100 requests per second? 1000? 10000?
Then you can decide whether performance is acceptable, whether the acceptable behaviour was too strict, or whether you need to do heavy performance tuning based on actual profiling data.
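A crude harness for that kind of test might look like this (the URL and numbers are placeholders; a real load-testing tool will give you percentiles and ramp-up):

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

// Crude load-test sketch: fire N concurrent GETs at the service and report
// throughput and failures. Only good for a first impression.
class LoadTest
{
    static void Main()
    {
        const string url = "http://localhost:8080/service/ping"; // placeholder
        const int totalRequests = 1000;
        int failures = 0;

        var watch = Stopwatch.StartNew();
        Parallel.For(0, totalRequests,
            new ParallelOptions { MaxDegreeOfParallelism = 50 }, i =>
        {
            try
            {
                var request = (HttpWebRequest)WebRequest.Create(url);
                using (var response = (HttpWebResponse)request.GetResponse())
                {
                    if (response.StatusCode != HttpStatusCode.OK)
                        Interlocked.Increment(ref failures);
                }
            }
            catch (WebException)
            {
                Interlocked.Increment(ref failures);
            }
        });
        watch.Stop();

        long elapsedMs = Math.Max(1, watch.ElapsedMilliseconds);
        Console.WriteLine("{0} requests in {1} ms ({2:F0} req/s), {3} failures",
            totalRequests, elapsedMs, totalRequests * 1000.0 / elapsedMs, failures);
    }
}
```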
You should be looking to host your WCF service in IIS. IIS has a lot of performance, scalability, security etc. mechanisms built in and is the best starting point to save you reinventing the wheel.
Some of the performance is certainly down to your own code, but let's assume it's already optimized. At that point, the additional performance scaling issues involve the service host (e.g. IIS), the machines that host it, and their network (inter/intranet) connection speeds. You'll need to do some speed tests to be sure.
Well it really depends on what you're doing in your web service, but the only way you're going to find out is by simulating lots of users and measuring it.
Take a look at my answer to this question: Measuring performance
When we tested our code in this manner (where the web services were hosted in Windows services), we found that the bottleneck was authenticating each user in the facade service. In particular, the Windows component LSASS was using most of the CPU.
Luckily we were able to create new machines, each with a facade service, which then called through to our main set of web services. This enabled us to scale up to a large number of users (in the region of 100,000 users using our software normally).
