I'm working on a system which consists of a chain of data transformation components. Each of these components performs a very simple task on an object before passing it on to the next component in the chain. Each of these components runs as a Windows Service. I've designed the system in this way to improve maintenance and aid future extensibility.
At the moment, I'm passing these objects between the components using message queuing, but this is fast becoming a concern. There are likely to be tens of services all continually reading from and writing to message queues. As I understand it, all messages are serialized and written to disk, which seems very wasteful in a system where throughput needs to be high.
Are there any other established systems I could use for inter-process communication that avoid the overhead of continual disk reads and writes?
Thanks.
If you don't need guaranteed reliable or transactional messaging, you should be able to improve performance significantly in System.Messaging by setting the Message.Recoverable property to false. Incidentally, it is already false by default, so your assumption about disk usage may not be correct (assuming the original solution does use MSMQ).
Type: System.Boolean
true if delivery of the message is guaranteed (through saving the message to disk while en route); false if delivery is not assured. The default is false.
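For illustration, setting the flag per message with System.Messaging looks roughly like this (a minimal sketch; the queue path is a placeholder, not taken from your system):

    using System.Messaging;

    // Sends a message as "express": kept in memory only, never written to disk,
    // and therefore lost if the machine restarts before delivery.
    static void SendExpress(object payload)
    {
        using (var queue = new MessageQueue(@".\Private$\transformStage1"))  // placeholder path
        using (var msg = new Message(payload))
        {
            msg.Recoverable = false;   // false = express delivery; true = persisted to disk en route
            queue.Send(msg);
        }
    }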
The best solution depends on your service's requirements. You may be able to use this, or you might need some extremely low latency service bus (usually a commercial product), or something in between that you build yourself.
Windows Communication Foundation?
ZeroMQ is a memory-only (brokerless) message queue implementation.
If I have a single application running on a single computer, but want to have multiple asynchronous threads running and communicating with each other in order to control the complex behavior of machinery or robots, what software design pattern would that be?
I'm specifically looking for something similar to Robot Operating System (ROS), but more in the context of a single library for C# that handles the messages or the "message bus". There seems to be a lot of overlapping terminology for these things.
I'm essentially looking for a software implementation of a local, distributed node architecture in which nodes communicate with each other much the way nodes on a car's CAN bus do, to perform complex behavior in a distributed way.
Thanks
Your question has a lot of ambiguity. If you have a single application (read: a single process), then that is different from a distributed node architecture.
For a Single Application with Multiple Asynchronous Threads
ROS is not the best tool to accomplish this. ROS facilitates communication across nodes using either TCP or shared memory, neither of which is required for communication within a single process.
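For purely in-process communication in C#, something as small as System.Threading.Channels already gives a node-like, message-passing feel without any middleware. A minimal sketch, assuming .NET Core 3.0+ (or the System.Threading.Channels NuGet package); the "node" names are made up:

    using System;
    using System.Threading.Channels;
    using System.Threading.Tasks;

    class Program
    {
        static async Task Main()
        {
            // An in-process, bounded "topic" connecting two asynchronous workers.
            var channel = Channel.CreateBounded<string>(capacity: 100);

            // Producer node: publishes messages onto the channel.
            var sensorNode = Task.Run(async () =>
            {
                for (int i = 0; i < 10; i++)
                    await channel.Writer.WriteAsync($"reading {i}");
                channel.Writer.Complete();
            });

            // Consumer node: reacts to messages as they arrive.
            var controlNode = Task.Run(async () =>
            {
                await foreach (var message in channel.Reader.ReadAllAsync())
                    Console.WriteLine($"control node received: {message}");
            });

            await Task.WhenAll(sensorNode, controlNode);
        }
    }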
For Distributed Nodes
ROS can be a great tool for this, but you need to understand its limitations. First, ROS does not guarantee any real-time capabilities. Performance can of course be improved by using nodelets (shared memory), but again, no timing guarantees. Second, ROS is not really distributed. It still needs a ROS Master, which acts as the central registry.
I suggest looking into ROS2, which uses DDS underneath. ROS2 has a distributed architecture, and you have the freedom to define your own QoS parameters.
I'm looking for ideas on how to speed up message transfers through RabbitMQ.
I installed the latest version on 64-bit Windows, running a server on my local machine to which I also publish and from which I consume through a C# implementation. I initially maxed out at 40,000 messages per second, which is impressive but does not suit my needs (I am competing with a custom binary reader that can handle 24 million unparsed 16-byte arrays per second; obviously I don't expect to get close to that, but I am attempting to improve at least). I need to send around 115,000,000 messages as fast as possible. I do not want to persist the data, and the connection is going to be direct to one single consumer.

I then built chunks out of my 16-byte arrays and published them onto the bus without any improvement. The transfer rate maxed out at 45 MB/second. I find this very slow given that, in the end, it should boil down to raw transmission speed: I could create byte arrays several megabytes in size, where the routing overhead of the exchange becomes negligible versus raw transmission speed. Why does my message bus max out at 45 MB/second transfer speed?
Bump... and update: I have not seen any answer to this question for quite a while. I am a bit surprised that not a single RabbitMQ developer chimed in. I played extensively with RabbitMQ and ZeroMQ, and I decided that RabbitMQ is not up to the task when looking at high-throughput in-process messaging solutions. The broker implementation, and especially the parsing logic, is a major bottleneck to improving throughput. I dropped RabbitMQ from my list of possible options.
There was a white paper describing how they provided a solution for managing low-latency, high-throughput financial options data streams, but it sounds to me as if all they did was throw hardware at it rather than provide a solution that targets low-latency, high-throughput requirements.
ZeroMQ did a superb job once I studied the documentation more intensively. I can run communication in-process, and it provides stable enough push/pull, pub/sub, req/rep, and pair patterns, which is what I need. I was looking for blocking logic within the pub/sub pattern, which ZeroMQ does not provide (it drops messages instead when a high-water mark is exceeded), but the push/pull pattern does block. So pretty much everything I needed is provided. The only gripe I have is with their approach to event processing; the event structure implementation through poll/multiplex is not very satisfactory.
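For reference, a minimal sketch of the in-process push/pull pattern described above, assuming the NetMQ binding for C#; the address and message count are illustrative:

    using System.Threading.Tasks;
    using NetMQ;
    using NetMQ.Sockets;

    class Program
    {
        static void Main()
        {
            const string address = "inproc://ticks";   // in-process transport: no broker, no disk
            const int messageCount = 1000000;

            using (var pull = new PullSocket())
            {
                pull.Bind(address);                     // bind before connect (required for inproc)

                var producer = Task.Run(() =>
                {
                    using (var push = new PushSocket())
                    {
                        push.Connect(address);
                        var payload = new byte[16];
                        for (int i = 0; i < messageCount; i++)
                            push.SendFrame(payload);    // blocks once the high-water mark is reached
                    }
                });

                for (int i = 0; i < messageCount; i++)
                    pull.ReceiveFrameBytes();           // blocks until a frame is available

                producer.Wait();
            }

            NetMQConfig.Cleanup();
        }
    }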
I have an application that performs analysis on an incoming event flow (a CEP engine).
This flow can come from different sources (database, network, etc.).
For efficient decoupling, I want this service to expose a named pipe using WCF, and allow a different application to read the data from the source and feed it into the service.
So one process is in charge of getting and handling the incoming data while the other analyzes it, the two being connected using WCF with the named pipes binding. They will both be deployed on the same machine.
The question is: will I notice lower throughput with WCF in the middle than if I had simply coupled the two services into a single process and used regular events?
No, in modern mainstream operating systems, IPC will never be, and can never be, as fast as in-process eventing. The reason for this is the overhead of context switching associated with activating different processes. Even on a multi-core system where distinct processes run on distinct cores, so that they each run independently (and there is therefore no cost associated with activating one process versus another; they are both always active), communication across processes still requires crossing security boundaries, hitting the network stack (even if using pipes), and so on. Where a local function call will be on the order of thousands of CPU cycles to invoke, an IPC call will be millions.
So IPC will be slower than in-process communication. Whether that actually matters in your case is a different question. For example, suppose you have an operation that requires a Monte Carlo simulation that runs for 2 hours. In that case it really doesn't matter whether it takes 1 ms or 1000 ms to invoke the operation.
Usually, performance of the communication is not what you want to optimize for. Even if performance is important, focusing on one small aspect of performance - let's say, whether to use IPC or local function calls - is probably the wrong way to go about things.
I assumed "CEP" referred to "complex event processing" which implies high throughput, high volume processing. So I understand that performance is important to you.
But for true scalability and reliability, you cannot simply optimize in-process eventing; you will need to rely on multiple computers and scale out. This will imply some degree of IPC, one way or the other. It's obviously important to be efficient at the smaller scale (events), but your overall top-end performance will be largely bounded by the architecture you choose for scale-out.
WCF is nice because of the flexibility it allows in moving building blocks from the local machine to a remote machine, and because of the channel stack, which lets you add communication services in a modular way.
Whether this is important to you is up to you to decide.
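For the named-pipe scenario in the question, the WCF side is fairly small. A minimal sketch (the contract, class, and endpoint names here are made up, not taken from the actual design):

    using System;
    using System.ServiceModel;

    // Hypothetical contract for handing events to the analysis (CEP) process.
    [ServiceContract]
    public interface IEventSink
    {
        [OperationContract(IsOneWay = true)]
        void Push(string serializedEvent);
    }

    public class EventSink : IEventSink
    {
        public void Push(string serializedEvent)
        {
            // Hand the event to the analysis engine here.
        }
    }

    class Program
    {
        static void Main()
        {
            // The analysis process hosts the endpoint over named pipes (same machine only).
            using (var host = new ServiceHost(typeof(EventSink), new Uri("net.pipe://localhost/cep")))
            {
                host.AddServiceEndpoint(typeof(IEventSink), new NetNamedPipeBinding(), "events");
                host.Open();
                Console.WriteLine("Listening on net.pipe://localhost/cep/events");
                Console.ReadLine();
            }
        }
    }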
I have an app that moves a project and its files from preview to production using a Flex front-end and a .NET web service. Currently, the process takes about 5-10 minutes per project. Aside from latency concerns, it really shouldn't take that long. I'm wondering whether or not this is a good use case for multi-threading. Also, considering the user may want to push multiple projects, or one right after another, is there a way to queue the jobs?
Any suggestions and examples are greatly appreciated.
Thanks!
Something that does heavy disk I/O typically isn't a good candidate for multithreading, since a disk can really only do one thing at a time. However, if you're pushing to multiple servers or the servers have particularly good disk subsystems, some light threading may be beneficial.
As a note - regardless of whether or not you decide to queue the jobs, you will use multi-threading. Queueing is just one way of handling what is ultimately solved using multi-threading.
And yes, I'd recommend you build a queue to push out each project.
You should compare the speed of your code to just copying in Windows (i.e., Explorer or the command line) and to copying with something advanced like TeraCopy. If your code is significantly slower than Windows, then look for parts of your code to optimize using a profiler. If your code is about as fast as Windows but slower than TeraCopy, then multithreading could help.
Multithreading is not generally helpful when the operation is I/O bound, but copying files involves reading from the disk AND writing over the network. That is two I/O operations, so if you separate them onto different threads it could increase performance. For something like this you need a producer/consumer setup, where you have a circular queue with one thread reading from disk and writing to the queue, and another thread reading from the queue and writing to the network. It's important to keep in mind that the two threads will not run at the same speed, so if the queue gets full, wait before writing more data, and if it's empty, wait before reading. Also, the locking strategy could have a big impact on performance here and could cause performance to degrade to slower than a single-threaded implementation.
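A minimal sketch of that producer/consumer split using a bounded BlockingCollection, which handles the full/empty waiting for you (the streams and buffer sizes are placeholders):

    using System;
    using System.Collections.Concurrent;
    using System.IO;
    using System.Threading.Tasks;

    static class Pipeline
    {
        // One thread reads from disk, another writes to the destination; the bounded
        // queue between them blocks whichever side gets ahead.
        public static void Copy(Stream source, Stream destination)
        {
            var queue = new BlockingCollection<byte[]>(boundedCapacity: 16);

            var reader = Task.Run(() =>
            {
                var buffer = new byte[81920];
                int read;
                while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
                {
                    var chunk = new byte[read];
                    Array.Copy(buffer, chunk, read);
                    queue.Add(chunk);                                  // blocks when the queue is full
                }
                queue.CompleteAdding();
            });

            foreach (var chunk in queue.GetConsumingEnumerable())      // blocks when the queue is empty
                destination.Write(chunk, 0, chunk.Length);

            reader.Wait();
        }
    }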
If you're moving things between just two computers, the network is going to be the bottleneck, so you may want to queue these operations.
Likewise, on the same machine, the I/O is going to be the bottleneck, so you'd want to queue there, too.
You should try using the ThreadPool.
ThreadPool.QueueUserWorkItem(MoveProject, project);
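QueueUserWorkItem expects a WaitCallback that takes a single object argument, so (assuming some Project type and project list on your side; both are placeholders) the worker would look roughly like this:

    using System.Threading;

    // Hypothetical worker; Project stands in for whatever type describes a project to push.
    static void MoveProject(object state)
    {
        var project = (Project)state;
        // Copy the project's files from preview to production here.
    }

    // Queue one work item per project; the thread pool limits how many run at once.
    foreach (Project project in projectsToPush)
        ThreadPool.QueueUserWorkItem(MoveProject, project);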
Agreed with everyone about the limited performance gains from running the tasks in parallel.
If you have full control over your deployment environment, you could use Rhino Queues:
http://ayende.com/Blog/archive/2008/08/01/Rhino-Queues.aspx
This will allow you to produce a queue of jobs asynchronously (say from a WCF service being called from your Silverlight/Flex app) and consume them synchronously.
Alternatively you could use WCF and MSMQ, but the learning curve is greater.
When dealing with multiple files, using multiple threads usually IS a good idea as far as performance is concerned. The main reason is that most disks nowadays support Native Command Queuing.
I wrote an article recently about reading/writing files with multiple threads on ddj.com.
See http://www.ddj.com/go-parallel/article/showArticle.jhtml?articleID=220300055.
Also see related question
Will using multiple threads with a RandomAccessFile help performance?
In particular, my experience is that when dealing with very many files it IS a good idea to use a number of threads. Contrary to common expectation, using many threads in many cases does not slow applications down as much as expected.
Having said that, I'd say there is no other way to find out than trying the different approaches. It depends on very many conditions: hardware, OS, drivers, etc.
The very first thing you should do is point any kind of profiling tool towards your software. If you can't do that (like, if you haven't got such a tool), insert logging code.
What you need to figure out is what is taking a long time to complete, and then why it is taking a long time to complete. That your "copy" operation as a whole takes a long time to complete isn't good enough; you need to pinpoint the reason down to a method or a set of methods.
Until you do that, anything else you do to your code will likely be guesswork. My experience has taught me that when it comes to performance, 9 out of 10 reasons for things running slow come as surprises to the person(s) who wrote the code.
So measure first, then change.
For instance, you might discover that you're in fact reporting progress of copying the file on a byte-per-byte basis, to a GUI, using a synchronous call to the UI, in which case it wouldn't matter how fast the actual copying can run; you'd still be bound by message-handling speed.
But that's just conjecture until you know, so measure first, then change.
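If measurement did point at something like that, the usual fix is to report progress once per buffer (or per percentage point) rather than per byte; a rough sketch using IProgress<T> (all names and buffer sizes here are illustrative):

    using System;
    using System.IO;

    static class ProgressCopy
    {
        // Reports progress at most 100 times per file instead of once per byte,
        // so the copy is no longer bound by UI message-handling speed.
        public static void Copy(Stream source, Stream destination, long totalBytes, IProgress<int> progress)
        {
            var buffer = new byte[81920];
            long copied = 0;
            int lastPercent = -1;
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                destination.Write(buffer, 0, read);
                copied += read;
                int percent = (int)(copied * 100 / totalBytes);
                if (percent != lastPercent)
                {
                    progress.Report(percent);   // Progress<T> posts to the UI context asynchronously
                    lastPercent = percent;
                }
            }
        }
    }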
I currently have an application that sends XML over TCP sockets from a Windows client to a Windows server.
We are rewriting the architecture, and our servers are going to be in Java. One architecture we are looking at is REST over HTTP, so the C# WinForms clients would send info using that. We are looking for high throughput and low latency.
Does anyone have any performance metrics on this approach versus other C#-client-to-Java-server communication options?
This isn't really well-enough defined to make any metric statements; how big are the messages, how often would you be hitting the REST service, is it straight HTTP or do you need to secure it with SSL? In other words, what can you tell us about the workload parameters?
(I say this over and over again on performance questions: unless you can tell me something about the workload, I can't -- nobody really can -- tell you what will give better performance. That's why they used to say you couldn't consider performance until you had an implementation: it's not that you can't think about performance, it's that people often couldn't or at least wouldn't think about workload.)
That said, though, you can make some good estimates simply by looking at how many messages you want to exchange, because TCP/IP setup time often dominates a REST call. REST offers two advantages here: first, since the TCP/IP time often does dominate the message transmission, it helps that this is pretty well optimized in production web servers like Apache or lighttpd; second, a RESTful architecture enhances scalability by eliminating session state, which means you can scale freely using just a simple TCP/IP load balancer.
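On the C# client side, the practical consequence is to reuse connections rather than pay TCP (and possibly SSL) setup on every request. A small sketch with a shared HttpClient (the base URL and resource path are placeholders):

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;

    static class RestClient
    {
        // One shared HttpClient keeps TCP connections alive across requests,
        // so connection setup is not paid on every call.
        private static readonly HttpClient Client = new HttpClient
        {
            BaseAddress = new Uri("http://orders.example.com/")   // placeholder URL
        };

        public static Task<string> GetOrderAsync(string id)
        {
            // HTTP keep-alive is on by default, so the same connection is reused.
            return Client.GetStringAsync("api/orders/" + id);
        }
    }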
I would set up a test to try it and see. I understand that the only part of your application you're changing is the client/server communication. So analyse what you're sending now, and put together a test client/server setup sending messages which are representative of what you think your final solution is going to be doing (perhaps representative only in terms of size/throughput).
As noted in the previous post, there's not enough detail to really judge what the performance is going to be like. For example:
Is your message structure/format going to be the same, merely sent over HTTP rather than raw sockets?
Are you going to be sending subsets of the XML data? Processing large quantities of XML can be memory-intensive (e.g. if you're using a DOM-based approach).
What overhead is your chosen REST framework going to introduce (hopefully very little, but at the moment we don't know)?
The best solution is to set something up using (say) Jersey and spend some time testing various scenarios. If you're re-architecting a solution, it's going to be worth a few days investigating performance (let alone functionality, ease of development etc.)
It's going to be plenty fast, unless you have a very, very large number of concurrent clients hitting those servers. The XML shredding keeps getting faster in both Java and .NET. If you are on CLR2 and Java 5 or above, you will be fine. But of course you still need to do the tests to verify.
We've tested REST and SOAP transactions in our lab, and they are faster than you might think: tens of thousands of messages per second. A small number of modern CPUs generating XML messages can easily saturate a gigabit network. In other words, the network (transmission of the data) is the bottleneck, not the CPU (serializing and deserializing the XML).
AND, if you do your software design properly, then in the very unlikely situation where REST is not sufficient, swapping out the message format layer (REST => protobufs) will get you better transmission performance with minimal disruption.
But before you need to go there, you will be able to send some money to Cisco and get lots more headroom.
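One way to keep that message-format swap cheap is to hide the wire format behind a small serialization interface from day one. A hypothetical sketch (the interface and class names are illustrative, not from any existing design):

    using System.IO;
    using System.Xml.Serialization;

    // A thin seam between the application and the wire format, so moving from
    // XML over REST to something like protobufs only touches one class.
    public interface IMessageSerializer
    {
        byte[] Serialize<T>(T message);
        T Deserialize<T>(byte[] payload);
    }

    // Default implementation using the built-in XML serializer.
    public class XmlMessageSerializer : IMessageSerializer
    {
        public byte[] Serialize<T>(T message)
        {
            using (var stream = new MemoryStream())
            {
                new XmlSerializer(typeof(T)).Serialize(stream, message);
                return stream.ToArray();
            }
        }

        public T Deserialize<T>(byte[] payload)
        {
            using (var stream = new MemoryStream(payload))
            {
                return (T)new XmlSerializer(typeof(T)).Deserialize(stream);
            }
        }
    }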