WCF cross-call contamination (that's a thing, right?) - C#

I have a huuuuge problem. While developing a solution for a customer I never had a problem with this, but after deploying the solution in their environment more and more issues are surfacing.
Setup
We have a WPF application that is served some basic HTML markup, which is then rendered (through Essential Objects). The markup is fetched through a WCF service that is (now) configured as PerCall.
Problem
Sometimes the result of a call made by user1 gets mixed up and served to user2. This happens more frequently as more users connect and use the program. Since the data is used in day-to-day operations, this is viewed as bad.
Bad solution
We've implemented a workaround that should mask the symptom, but the underlying problem needs to be fixed anyway: we call each service three times and compare the results, to ensure that the majority of responses agree. The problem happens without any discernible pattern, other than occurring more often when many users are connected.
Better solution
I've been looking at Web API 2 as a potential replacement for handling our calls. However, we send large amounts of data and objects that, put simply, would require some reconstruction on both the server and the client, and time is of the essence at this juncture.
Googling has yielded few, if any, results on this. What am I missing?
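For reference, the per-call instancing mentioned above boils down to something like the following minimal sketch (the service and contract names are made up for illustration). Note that PerCall only gives each request its own service instance; any static or otherwise shared state (caches, singletons, pooled render objects) is still shared between users and is a classic source of exactly this kind of cross-call contamination:

using System.ServiceModel;

[ServiceContract]
public interface IMarkupService
{
    [OperationContract]
    string GetMarkup(string pageKey);
}

// PerCall: a fresh service instance is created for every call,
// so instance fields cannot leak between callers.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
public class MarkupService : IMarkupService
{
    // WARNING: anything static is still shared by all calls and all users.
    // private static string _lastRenderedMarkup;

    public string GetMarkup(string pageKey)
    {
        // Build and return the markup for this request only.
        return "<html>...</html>";
    }
}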

Related

Pattern or library for caching data from web service calls and update in the background

I'm working on a web application that uses a number of external data sources for data that we need to display on the front end. Some of the external calls are expensive and some also come with a monetary cost, so we need a way to persist the results of these external requests so that they survive e.g. an app restart.
I've started with a proof of concept, and my current solution is a combination of a persistent cache/storage (storing serialized JSON in files on disk) and a runtime cache. When the app starts it populates the runtime cache from the persistent cache; if the persistent cache is empty it goes ahead and calls the web services. The next time the app restarts we load from the persistent cache, avoiding the calls to the external sources.
After the first population we want the cache to be updated in the background by some kind of update process on a given schedule. We also want this update process to be smart enough to only update the cache if the request to the web service was successful - otherwise keep the old version. There's also a twist here: some web services might return a complete collection while others require one call per entity, so the update process might differ depending on the concrete web service.
I'm thinking that this scenario can't be totally unique, so I've looked around and done a fair bit of Googling, but I haven't found any patterns or libraries that deal with something like this.
So what I'm looking for is any patterns that might be useful for us, and any C# libraries or articles on the subject as well. I don't want to "reinvent the wheel". If anyone has solved similar problems I would love to hear more about how you approached them.
Thank you so much!
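As an illustration of the approach described above, here is a minimal sketch of a runtime cache backed by JSON files on disk (the class name, the cache-directory handling and the use of Newtonsoft.Json are assumptions for the example, not a recommendation of a specific library):

using System;
using System.Collections.Concurrent;
using System.IO;
using Newtonsoft.Json;

public class FileBackedCache
{
    private readonly string _directory;
    private readonly ConcurrentDictionary<string, string> _runtimeCache =
        new ConcurrentDictionary<string, string>();

    public FileBackedCache(string directory)
    {
        _directory = directory;
        Directory.CreateDirectory(directory);

        // Populate the runtime cache from the persistent cache at startup.
        foreach (var file in Directory.GetFiles(directory, "*.json"))
            _runtimeCache[Path.GetFileNameWithoutExtension(file)] = File.ReadAllText(file);
    }

    // Returns the cached value, or calls the (expensive) external source and persists the result.
    public T GetOrFetch<T>(string key, Func<T> fetchFromWebService)
    {
        string json;
        if (_runtimeCache.TryGetValue(key, out json))
            return JsonConvert.DeserializeObject<T>(json);

        var value = fetchFromWebService();
        Store(key, value);
        return value;
    }

    // Used by the scheduled background refresh: only call this after a successful fetch,
    // so a failed refresh keeps the old version.
    public void Store<T>(string key, T value)
    {
        var json = JsonConvert.SerializeObject(value);
        _runtimeCache[key] = json;
        File.WriteAllText(Path.Combine(_directory, key + ".json"), json);
    }
}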

Handling limitations in multithreaded server

In my client-server architecture I have a few API functions whose usage needs to be limited.
The server is written in C# (.NET) and runs on IIS.
Until now I didn't need to perform any synchronization. The code was written in such a way that even if a client sent the same request multiple times (e.g. a create request), one call would end with success and all the others with an error (because of the server code + DB structure).
What is the best way to enforce such limitations? For example, I want no more than one call to the API method foo() per user per minute.
I thought about some SynchronizationTable which would have just one column, unique_text, and before computing a foo() call I'd write something like foo{userId}{date}{HH:mm} to this table. If the call ends with success I know that there wasn't a foo call from that user in the current minute.
I think there is a much better way, probably in server code, without using the DB for that. Of course, there could be thousands of users calling foo.
To clarify what I need: I think it could be some lightweight DictionaryMutex.
For example:
private static DictionaryMutex FooLock = new DictionaryMutex();

FooLock.Lock(User.GUID);
try
{
    ...
}
finally
{
    FooLock.Unlock(User.GUID);
}
EDIT:
A solution in which one user cannot call foo twice at the same time would also be sufficient for me. By "at the same time" I mean that the server starts handling the second call before it has returned the result of the first call.
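For what it's worth, here is a minimal in-memory sketch of the kind of "DictionaryMutex" described above, built on ConcurrentDictionary and SemaphoreSlim (the class itself does not exist in the framework; the caveats about in-process state in the answer below still apply):

using System;
using System.Collections.Concurrent;
using System.Threading;

public class DictionaryMutex
{
    // One semaphore per key; GetOrAdd ensures concurrent callers for the same key share one instance.
    private readonly ConcurrentDictionary<Guid, SemaphoreSlim> _locks =
        new ConcurrentDictionary<Guid, SemaphoreSlim>();

    public void Lock(Guid key)
    {
        _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1)).Wait();
    }

    public void Unlock(Guid key)
    {
        SemaphoreSlim semaphore;
        if (_locks.TryGetValue(key, out semaphore))
            semaphore.Release();
    }
}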
Note that keeping this state in memory in an IIS worker process opens up the possibility of losing all of this data at any point in time. Worker processes can restart for any number of reasons.
Also, you probably want to have two web servers for high availability. Keeping the state inside of worker processes makes the application no longer clustering-ready. This is often a no-go.
Web apps really should be stateless. Many reasons for that. If you can help it, don't manage your own data structures like suggested in the question and comments.
Depending on how big the call volume is, I'd consider these options:
SQL Server. Your queries are extremely simple and easy to optimize for. Expect thousands of such queries per second per CPU core. This can bear a lot of load. You can use SQL Server Express for free.
A specialized store like Redis. Stack Overflow is using Redis as a persistent, clustering-enabled cache. A good idea.
A distributed cache, like Microsoft Velocity. Or others.
This storage problem is rather easy because it fits a key/value store model well. And the data is near worthless, so you don't even need to back it up.
I think you're overestimating how costly this rate limitation will be. Your web service is probably doing far more costly things than a single UPDATE by primary key against a simple table.
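To make the database option concrete, here is a rough sketch of the one-call-per-user-per-minute check as a single INSERT against a table with a composite primary key (the table, column names and connection handling are made up for the example; error 2627 is SQL Server's primary key violation):

using System;
using System.Data.SqlClient;

public static class FooRateLimiter
{
    // Assumed table:
    // CREATE TABLE FooCalls (UserId UNIQUEIDENTIFIER, MinuteBucket CHAR(16),
    //                        PRIMARY KEY (UserId, MinuteBucket));
    public static bool TryRegisterCall(string connectionString, Guid userId)
    {
        var minuteBucket = DateTime.UtcNow.ToString("yyyy-MM-dd HH:mm");

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "INSERT INTO FooCalls (UserId, MinuteBucket) VALUES (@userId, @minute)", connection))
        {
            command.Parameters.AddWithValue("@userId", userId);
            command.Parameters.AddWithValue("@minute", minuteBucket);
            connection.Open();
            try
            {
                command.ExecuteNonQuery();
                return true;   // first foo() call from this user in the current minute
            }
            catch (SqlException ex) when (ex.Number == 2627)
            {
                return false;  // primary key violation: the user already called foo() this minute
            }
        }
    }
}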

Could we replace multiple WCF calls by a single WCF call?

I am currently working on an MVC 4 application that reads data from a set of WCF services. Currently, when a user hits a page, a number of WCF requests are triggered to get the data for the different parts of the page. I want to improve its performance.
My idea is that when a user lands on a page, a single WCF call is made which retrieves all the data that the multiple calls previously did, and the result is put into the user's request HttpContext.
Which performs better: the single but larger WCF call over named pipes, or the multiple smaller calls over named pipes? And are there any performance implications of putting a large set of data into the HttpContext?
I think you are trying to solve one problem by producing even more problems.
If you query all the data at once and store it in HttpContext, it will speed up opening new pages, but it will take considerably longer to open the page the first time. You may also easily run out of memory, especially with many users at a time, if you store data in HttpContext per user.
I think you first need to localize the problem and find the root cause of the poor performance. It may be a query or it may be some database locks.
In any case caching is a good idea, but don't use HttpContext for it. Use the ASP.NET cache or some distributed cache like AppFabric. These tools provide a lot of built-in features and will make it easier to scale your application later.
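As a rough illustration of the in-process option (the key, the five-minute expiration and the loadFromWcf delegate are placeholders), something along these lines with System.Runtime.Caching could replace the HttpContext idea:

using System;
using System.Runtime.Caching;

public static class PageDataCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    public static PageData Get(string pageKey, Func<PageData> loadFromWcf)
    {
        var cached = Cache.Get(pageKey) as PageData;
        if (cached != null)
            return cached;

        // One (larger) WCF call; the result is cached for subsequent requests.
        var data = loadFromWcf();
        Cache.Set(pageKey, data, DateTimeOffset.UtcNow.AddMinutes(5));
        return data;
    }
}

public class PageData { /* aggregated results of the WCF calls */ }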
Hope it helps.

SQL (or LINQ to SQL) works slower when called from WinForms than when called from a web application?

We ran into some strange SQL / LINQ behaviour today:
We used to use a web application to perform some intensive database actions on our system. Recently we moved to a WinForms interface for various reasons.
We found that performance has seriously decreased: an action that used to take about 15 minutes now takes as long as a whole hour. The strange thing is that it's the exact same method being called. The method performs quite a bit of reading and writing using LINQ to SQL, and profiling on the client machine showed that the problematic section is the SQL action itself, in LINQ's "Save" method.
The only difference between the two cases is that in one the method is called from a web application's code-behind (MVC in this case), and in the other from a Windows Form.
The one idea I could come up with is that SQL performance has something to do with the identity of the user accessing the DB, but I could not find any support for that assumption.
Any ideas?
Did you run both tests from the same machine? If not, hardware differences could be the issue... or the network... one machine could be in a higher-speed section of your network, for example in the same VLAN as the SQL Server. Try running the client code on the same server the web app was running on.
Also, if your app is updating progress in a synchronous manner, it could be waiting a long time for the display to update... as opposed to working with a stream à la Response.Write.
If you are actually outputting progress as you go, you should make sure that the progress updates are raised as events and that the display of those happens on another thread, so that the processing isn't waiting on the display. Actually, you should probably put the processing on its own thread and just have an event handler take care of the updates... but that is a whole different discussion. The point is that your app could be waiting on the progress display.
It's a very old issue, but I happened to run into the question just now. So for whomever it may concern nowadays, the solution (and, before it, the problem) was frustratingly silly: LINQ to SQL was configured on the dev machines to constantly write a log to the console.
This caused a huge delay simply from outputting large amounts of text to the console. On the web server the log was not being written, and therefore there was no performance drawback. There was a colossal face-palm once we figured this one out. Thanks to the helpers; I hope this answer helps someone solve it faster next time.
Unattended logging. That was the problem.
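For reference, the logging in question is controlled by the DataContext.Log property in LINQ to SQL; a minimal sketch of turning it off for the heavy batch (MyDataContext stands in for the generated context class):

using System;
using System.Data.Linq;

public class MyDataContext : DataContext
{
    public MyDataContext(string connectionString) : base(connectionString) { }
}

public static class BatchRunner
{
    public static void Run(string connectionString)
    {
        using (var context = new MyDataContext(connectionString))
        {
            // On the dev machines something like "context.Log = Console.Out;" was in effect,
            // which writes every generated SQL statement to the console and is very slow.
            context.Log = null;   // disable SQL logging for the intensive operation

            // ... perform the reads/writes and call context.SubmitChanges() here ...
        }
    }
}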

High availability & scalability for C#

I've got a C# service that currently runs single-instance on a PC. I'd like to split this component so that it runs on multiple PCs. Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine.
Data synchronization can be done by the DB, so that should not be much of an issue. My current idea is to use some kind of load balancer that splits and sends the incoming requests to the array of PCs and makes sure the work is actually processed.
How would I implement such a functionality? I'm not sure if I'm asking the right question. If my understanding of how this goal should be achieved is wrong, please give me a hint.
Edit:
I wonder if the idea given above (a load balancer splits work packages across the PCs and checks for results) is feasible at all. If there is some kind of already-implemented solution to this seemingly common problem, I'd love to use it.
Availability is a critical requirement.
I'd recommend looking at a Pull model of load-sharing rather than a Push model. When pushing work, the coordinating server(s)/load balancer must be aware of all the servers currently running in your system so that it knows where to forward requests; this must either be set in config or determined dynamically (such as in the Publisher/Subscriber model), and then constantly checked to detect whether any servers have gone offline. Whilst it's entirely feasible, it can complicate the scaling-out of your application.
With a Pull architecture, you have a central work queue (hosted in MSMQ, SQL Server Service Broker or similar) and each processing service pulls work off that queue. Expose a WCF service to accept external requests and place work onto the queue, safe in the knowledge that some server will do the work, even though you don't know exactly which one. This has the added benefit that each server monitors its own workload and picks up work as and when it is ready, and you can easily add or remove servers to/from this model without any change in config.
This architecture is supported by NServiceBus and the communication between Windows Azure Web & Worker roles.
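As a rough sketch of the pull model with MSMQ (the queue path and the WorkItem type are assumptions for the example), each processing node simply loops on Receive and competes for messages with the other nodes:

using System.Messaging;

public class WorkItem
{
    public string Payload { get; set; }
}

public class PullWorker
{
    public void Run()
    {
        // Every node points at the same central work queue.
        var queue = new MessageQueue(@"FormatName:DIRECT=OS:queueserver\private$\work");
        queue.Formatter = new XmlMessageFormatter(new[] { typeof(WorkItem) });

        while (true)
        {
            // Receive blocks until a message is available; each message goes to exactly one node.
            var message = queue.Receive();
            var work = (WorkItem)message.Body;
            Process(work);
        }
    }

    private void Process(WorkItem work)
    {
        // ... do the actual work item here ...
    }
}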
From what you said each PC will require a full copy of your service -
Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine.
Otherwise you won't be able to move its work to another PC.
I would be tempted to have a central server which farms out work to the individual PCs. This means that you would need some form of communication between the machines, and you would keep a record on the central server of what work has been assigned where.
You'll also need each machine to measure its CPU load and reject work if it is too busy.
A multi-threaded approach to the service would make good use of the multiple processor cores that are ubiquitous nowadays.
How about using a single server and multi-threading your processing? Or even multi-threading on a PC, as you can get many cores on a standard desktop now.
This obviously doesn't deal with the machine going down, but it could give you much more performance for less investment.
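For example, a minimal sketch of spreading the existing work items across all the cores of a single machine (the item type and ProcessItem are placeholders for whatever the service currently does sequentially):

using System.Collections.Generic;
using System.Threading.Tasks;

public static class LocalParallelism
{
    public static void ProcessAll(IEnumerable<string> workItems)
    {
        // Parallel.ForEach partitions the items across the available cores.
        Parallel.ForEach(workItems, item => ProcessItem(item));
    }

    private static void ProcessItem(string item)
    {
        // ... the existing per-item processing ...
    }
}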
You can check out Windows Clustering. You will have to handle a set of issues that depend on the behaviour of the service (if you can share more details about the service itself, I can give a more specific answer).
This depends on how you want to split your workload. This is usually done by either:
Splitting the same workload across multiple services
This means the same service is installed on different servers and does the same job. Assume your service reads huge amounts of data from the DB servers and processes it to produce large client-specific data files, which are finally sent to the clients. In this approach all the services installed on the different servers do the same work, but they split it between them to increase performance.
Splitting parts of the workload across multiple services
In this approach each service is assigned an individual job and works towards a different goal. In the above example, one service is responsible for reading data from the DB and generating the huge data files, and another service is configured only to read the data files and send them to the clients.
I have implemented the second approach in one of my projects, because it let me isolate and debug errors in case of any failures.
The usual approach for a load balancer is to split service requests evenly between all service instances.
For each work item (request) you can store the relevant information in a database. Each service should then also have at least one background thread checking the database for abandoned work items.
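A rough sketch of that background check (the reclaim delegate and the polling interval are assumptions; the actual reset of abandoned rows would be whatever UPDATE your schema needs):

using System;
using System.Threading;

public class AbandonedWorkMonitor : IDisposable
{
    private readonly Timer _timer;

    // reclaimAbandonedItems is whatever database call puts work items that were
    // claimed but never finished (e.g. started more than five minutes ago) back
    // into the pending state so another instance can pick them up.
    public AbandonedWorkMonitor(Action reclaimAbandonedItems)
    {
        // Poll once a minute.
        _timer = new Timer(_ => reclaimAbandonedItems(), null, TimeSpan.Zero, TimeSpan.FromMinutes(1));
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}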
I would suggest that you publish your service through WCF (Windows Communication Foundation).
Then implement a "central" client application which can keep track of available providers of your service and dish out work. The central app will act as scheduler and load balancer of the tasks to be performed.
Check out Juval Löwy's book on WCF ("Programming WCF Services") for a good introduction to this topic.
You can have a look at NGrid : http://ngrid.sourceforge.net/
or Alchemi : http://www.gridbus.org/~alchemi/index.html
Both are grid computing frameworks with load balancers that will get you started in no time.
Cheers,
Florian
