I have a core .NET application that needs to spawn an arbitrary number of subprocesses. These processes need to be able to access some form of state object in the core application.
What is the best technique? I'll be moving a large amount of data between processes (Bitmaps), so it needs to be fast.
WCF would probably fit the bill.
Here's a really good article on .NET Remoting for performing distributed, computation-intensive analysis. Though Remoting has been superseded by WCF, the article is still relevant and shows how to make the calls asynchronously, etc.
This article contrasts WCF with .NET Remoting; the key takeaway is that WCF throughput outperforms Remoting for small data, but approaches Remoting performance as data size increases.
I have similar requirements and am using Windows Communication Foundation to do that right now. My data sizes are probably a bit smaller though.
For reference I'm doing about 30-60 requests of about 5 KB-30 KB per second on a quad-core machine. WCF has been holding up quite well so far.
With WCF you have the added advantages of choosing a transport protocol and security mode that is suitable for your application.
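For same-machine IPC like the scenario in the question, the named-pipe transport is the usual choice since it never touches the network stack. Below is a hedged, minimal self-hosting sketch; the contract, service names, address, and message-size limit are my own assumptions, not a definitive setup:

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IBitmapService
{
    [OperationContract]
    byte[] GetBitmap(string key);          // bitmaps travel as raw byte arrays
}

public class BitmapService : IBitmapService
{
    public byte[] GetBitmap(string key) { return new byte[0]; }   // hypothetical lookup in the core app's state
}

class Host
{
    static void Main()
    {
        using (var host = new ServiceHost(typeof(BitmapService),
                   new Uri("net.pipe://localhost/core")))
        {
            // Named-pipe transport for same-machine IPC; swap in NetTcpBinding
            // if the subprocesses ever need to run on another machine.
            var binding = new NetNamedPipeBinding { MaxReceivedMessageSize = 64 * 1024 * 1024 };
            host.AddServiceEndpoint(typeof(IBitmapService), binding, "bitmaps");
            host.Open();
            Console.ReadLine();            // keep the host alive while subprocesses connect
        }
    }
}
```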
I'd be hesitant to move large data around. I'd be inclined to move pointers to large data around instead, i.e., memory mapped files.
If you truly need to have separate processes there is always named pipes which would perform quite well.
However, would an application domain boundary suffice? Then you could do object marshalling and things would be a lot easier. Your application could share instances of the same object by deriving them from MarshalByRefObject.
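For illustration, a minimal sketch of sharing one object across an AppDomain boundary via MarshalByRefObject (the type and member names are hypothetical):

```csharp
using System;

// Hypothetical shared-state object. Deriving from MarshalByRefObject lets other
// application domains talk to the single instance through a proxy instead of a copy.
public class SharedState : MarshalByRefObject
{
    public int Counter { get; set; }
}

public static class AppDomainDemo
{
    public static void Run()
    {
        AppDomain worker = AppDomain.CreateDomain("Worker");

        // Create the object inside the worker domain and get a proxy to it here.
        var state = (SharedState)worker.CreateInstanceAndUnwrap(
            typeof(SharedState).Assembly.FullName,
            typeof(SharedState).FullName);

        state.Counter = 42;                // the call is marshalled across the domain boundary
        AppDomain.Unload(worker);
    }
}
```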
You can use .NET Remoting for inter-process communication (IPC) with IpcChannel. Otherwise you can search for shared memory wrappers and other IPC forms.
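For reference, a minimal Remoting-over-IpcChannel sketch; the port name, URI, and service type are my own choices, and the contract type must live in an assembly referenced by both processes:

```csharp
using System;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Ipc;

// Shared contract: deriving from MarshalByRefObject means clients get a
// transparent proxy instead of a serialized copy.
public class StateService : MarshalByRefObject
{
    public byte[] GetBitmapBytes(string key)
    {
        // Hypothetical: look up the bitmap in the core application's state.
        return new byte[0];
    }
}

public static class IpcDemo
{
    // Runs in the core application.
    public static void StartServer()
    {
        var channel = new IpcChannel("CoreAppPort");              // named-pipe based transport
        ChannelServices.RegisterChannel(channel, false);
        RemotingConfiguration.RegisterWellKnownServiceType(
            typeof(StateService), "state", WellKnownObjectMode.Singleton);
    }

    // Runs in each spawned subprocess (register the client channel once per process).
    public static byte[] FetchBitmap(string key)
    {
        ChannelServices.RegisterChannel(new IpcChannel(), false);  // client-only channel
        var proxy = (StateService)Activator.GetObject(
            typeof(StateService), "ipc://CoreAppPort/state");
        return proxy.GetBitmapBytes(key);
    }
}
```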
There is an MSDN article comparing WCF to a variety of methods, including Remoting. However, unless I am reading the bar graph wrong, it shows Remoting to be the same or slightly better (contrary to what the other answer says).
There is also a blog post about WCF vs. Remoting. It clearly shows Remoting is faster for binary objects, and since you are passing Bitmaps (binary objects), Remoting, shared memory, or another IPC option might be faster, although WCF would not be a bad choice anyway.
If I have a single application running on a single computer, but want to have multiple asynchronous threads running and communicating with each other in order to control the complex behavior of machinery or robots, what software design pattern would that be?
I'm specifically looking for something similar to Robot Operating System (ROS), but more in the context of a single C# library that handles the messages or the "message bus". There seems to be a lot of overlapping terminology for these things.
I'm essentially looking for a software implementation of a local, distributed node architecture whose nodes communicate with each other much in the same way that nodes on the CAN bus of a car do, in order to perform complex behavior in a distributed way.
Thanks
Your question has a lot of ambiguity. If you have a single application (read single process) then that is different from distributed node architecture.
For Single Application with multiple asynchronous threads
ROS is not the best tool to accomplish this. ROS facilitates communication across nodes using either TCP or shared memory, neither of which is required for communication within a single process.
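Within a single process, simple in-process queues can play the role of the "message bus". A minimal sketch using System.Threading.Channels (my own choice of library; the topic and node roles are hypothetical):

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// One producer "node" publishes sensor readings onto a channel,
// one consumer "node" reacts to them, all inside the same process.
public static class InProcessBusDemo
{
    public static async Task Main()
    {
        Channel<double> sensorTopic = Channel.CreateUnbounded<double>();

        Task producer = Task.Run(async () =>
        {
            for (int i = 0; i < 10; i++)
            {
                await sensorTopic.Writer.WriteAsync(i * 0.1);   // publish a reading
                await Task.Delay(100);
            }
            sensorTopic.Writer.Complete();                      // no more messages
        });

        Task consumer = Task.Run(async () =>
        {
            await foreach (double reading in sensorTopic.Reader.ReadAllAsync())
            {
                Console.WriteLine($"Received {reading:F1}");    // react to the message
            }
        });

        await Task.WhenAll(producer, consumer);
    }
}
```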
For Distributed Nodes
ROS can be a great tool for this, but you need to understand its limitations. First, ROS does not guarantee any real-time capabilities. Performance can of course be improved by using nodelets (shared memory), but again, no timing guarantees. Second, ROS is not really distributed: it still needs a ROS Master, which acts as the main registry.
I suggest looking into ROS2, which uses DDS underneath. ROS2 has a distributed architecture, and you have the freedom to define your own QoS parameters.
I'm considering using WCF or mORMot as the framework for a RESTful service, where the business/legacy code that needs to be accessed is written in Delphi. Performance is a key requirement of the project.
The application must be prepared for load balancing. The clients of the REST service are Windows desktop applications. These desktop clients allow the user to view large volumes of data, with huge result sets from SQL statements. What is the best way to implement a service that caches a recordset and lets the client consume it gradually through the REST service? Can you show a good example? The recordset must be cached in the session until the client finishes browsing it or decides to do a full fetch. What is the right architecture for this?
Will load balancing work in WCF? Since the recordset is cached on a single server, any subsequent row-fetch requests must land on the same server.
Both WCF and mORMot share the same high-performance kernel-mode http.sys server. Both feature IOCP and multi-threading.
For performance, mORMot will be lighter, will allocate (much) less memory, won't be affected by Garbage Collector freezes, and is able to get JSON content directly from the database engine (bypassing most temporary data conversion and allocation), so you can achieve amazing speed. In short, mORMot was designed from the ground up for serving REST/JSON content with high performance, with a multi-threaded kernel (whereas e.g. node.js is mono-threaded). If your purpose is also to cache some data, mORMot works very well as a 64-bit native service, giving access to all your system RAM if needed, and has built-in real-time content compression.
WCF is a great general-purpose communication library, which can be RESTful, but is not RESTful from its (historical) roots. The main issue I saw with WCF is the difficulty of configuring it between applications (.exe.config tuning may be confusing), and that it is a big black box. For instance, it was not possible to implement Cross-Origin Resource Sharing with WCF when the server is hosted as a Windows service (the Access-Control-Allow-Origin: HTTP headers are deleted by WCF!): you have to host it within IIS, and still can't fix the issue, whereas with a fully open-source solution you can fix any issue.
Load balancing can be implemented in mORMot and WCF with the same algorithm. Instead of using a round-robin algorithm, in your case a simple routing based on the content may be enough.
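As an illustration of such content-based routing (the backend URLs and the session-token scheme are my own assumptions), here is a sketch that pins every request carrying the same session token to the same backend, so a recordset cached on one server is always hit by that session's fetch requests:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sticky router: the same session token always maps to the same backend.
public static class StickyRouter
{
    static readonly IReadOnlyList<string> Backends = new[]
    {
        "https://node1.example.com",
        "https://node2.example.com",
        "https://node3.example.com"
    };

    public static string RouteFor(string sessionToken)
    {
        // Stable FNV-1a hash; string.GetHashCode() is not stable across processes.
        uint hash = 2166136261;
        foreach (char c in sessionToken)
        {
            hash ^= c;
            hash *= 16777619;
        }
        return Backends[(int)(hash % (uint)Backends.Count)];
    }
}
```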
Using WCF to serve business logic written in Delphi will be slow, error-prone, and difficult to maintain. Mixing technologies induces unneeded complexity. I would not go in this direction.
If you have an existing Delphi code base and some Delphi skills, I guess mORMot may be a better choice. It was reported, for example, that a single server in production is able to handle more than one million requests per day, serving thousands of concurrent clients, with a dedicated JavaScript process on the server side. One of the mORMot design goals was to help working with existing code and legacy projects. But I'm not 100% fair, since I'm the main maintainer of this open source project. :)
Currently I am working on a multi-process desktop application on Windows. This will be a shrink-wrapped application deployed on client machines across the world. While we can set broad specifications for the machines (e.g. Windows XP SP3 with .NET 4.0 CF), we won't have control over them and we can't be too specific about their configuration (e.g. we cannot require a CUDA 1.4-capable graphics processor, etc.).
Some of these processes are managed (.NET 4.0) and others are unmanaged (C++ Win32). The processes need to share data. The options I have evaluated to date are:
Tcp sockets
Named Pipes
Pipes seem to perform a little better, but for our needs the performance of both is acceptable. Sockets give us the flexibility of crossing machine (and operating system - we would like to support non-Microsoft OSes eventually) boundaries in the future, hence our preference for sockets.
However, my major concern is this: if we use TCP sockets, are we likely to run into issues with firewalls? Has anyone else deployed desktop applications/programs that use TCP for IPC and experienced issues? If so, what kind?
I know this is a fairly open ended question and I will be glad to rephrase. But I would really like to know what kind of potential problems we are likely to run into.
edit: To throw a little more light on this - we are only transporting a few PODs: ints, floats, and strings. We have built a layer of abstraction that offers two paradigms: request/response and subscription. The transport layer has been abstracted away, and currently we have two implementations: pipe-based and TCP-based.
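For illustration, the abstraction has roughly this shape (the interface and member names below are simplified stand-ins, not our actual code):

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical transport-agnostic layer: application code programs against this
// interface, and a pipe-based or TCP-based implementation is plugged in underneath.
public interface IMessageTransport
{
    // Request/response paradigm.
    Task<TResponse> RequestAsync<TRequest, TResponse>(string endpoint, TRequest request);

    // Subscription paradigm: the handler is invoked for each published message.
    IDisposable Subscribe<TMessage>(string topic, Action<TMessage> handler);
}
```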
Performance of pipes is often better on a fast LAN, but TCP is often better on slower networks or WANs. See the MSDN points below.
TCP is also more configurable. Concerning firewalls, they allow you to open/close communication ports. If that's not an option or it's a concern, an alternative would be HTTP (REST/JSON, web services, XML-RPC, etc.), but you have to consider whether the HTTP overhead is acceptable. Make sure you try it with real-world datasets: passing trivial data in a test makes the overhead seem unreasonable, when it would be perfectly reasonable with a real-world data set.
Some other info from MSDN:
> In a fast local area network (LAN) environment, Transmission Control Protocol/Internet Protocol (TCP/IP) Sockets and Named Pipes clients are comparable in terms of performance. However, the performance difference between the TCP/IP Sockets and Named Pipes clients becomes apparent with slower networks, such as across wide area networks (WANs) or dial-up networks. This is because of the different ways the interprocess communication (IPC) mechanisms communicate between peers.
>
> For named pipes, network communications are typically more interactive. A peer does not send data until another peer asks for it using a read command. A network read typically involves a series of peek named pipes messages before it begins to read the data. These can be very costly in a slow network and cause excessive network traffic, which in turn affects other network clients.
>
> It is also important to clarify if you are talking about local pipes or network pipes. If the server application is running locally on the computer running an instance of Microsoft® SQL Server™ 2000, the local Named Pipes protocol is an option. Local named pipes runs in kernel mode and is extremely fast.
>
> For TCP/IP Sockets, data transmissions are more streamlined and have less overhead. Data transmissions can also take advantage of TCP/IP Sockets performance enhancement mechanisms such as windowing, delayed acknowledgements, and so on, which can be very beneficial in a slow network. Depending on the type of applications, such performance differences can be significant.
>
> TCP/IP Sockets also support a backlog queue, which can provide a limited smoothing effect compared to named pipes that may lead to pipe busy errors when you are attempting to connect to SQL Server.
>
> In general, sockets are preferred in a slow LAN, WAN, or dial-up network, whereas named pipes can be a better choice when network speed is not the issue, as it offers more functionality, ease of use, and configuration options.
>
> For more information about TCP/IP, see the Microsoft Windows NT® documentation.
If you need to impersonate the named pipe client's security credentials, there's really only one option :) And named pipes also have nicer names (although DNS SRV records can provide those for TCP ports also).
Otherwise, there's not much difference. Both treat the data as a stream of bytes, leaving you responsible for finding message boundaries yourself. Named pipes have an additional option of keeping message boundaries for you, but be warned, you must both create the pipe in message mode and explicitly set the read mode as well.
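To illustrate that last point, a minimal sketch of message-mode named pipes in .NET (the pipe name and payload are hypothetical); note the transmission mode on the server side and the read mode on the client side:

```csharp
using System;
using System.IO;
using System.IO.Pipes;
using System.Text;

public static class MessagePipeDemo
{
    public static void RunServer()
    {
        using (var server = new NamedPipeServerStream(
                   "demo-pipe", PipeDirection.InOut, 1,
                   PipeTransmissionMode.Message))            // pipe created in message mode
        {
            server.WaitForConnection();
            byte[] msg = Encoding.UTF8.GetBytes("hello");
            server.Write(msg, 0, msg.Length);                // one Write = one message
        }
    }

    public static string RunClient()
    {
        using (var client = new NamedPipeClientStream(".", "demo-pipe", PipeDirection.InOut))
        {
            client.Connect();
            client.ReadMode = PipeTransmissionMode.Message;  // must also be set explicitly on the reader

            var buffer = new byte[256];
            using (var ms = new MemoryStream())
            {
                do
                {
                    int read = client.Read(buffer, 0, buffer.Length);
                    ms.Write(buffer, 0, read);
                } while (!client.IsMessageComplete);         // loop until the whole message has arrived
                return Encoding.UTF8.GetString(ms.ToArray());
            }
        }
    }
}
```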
If I understand your requirements correctly, you need to communicate between processes running on the same computer. The processes probably all run in the security context of the user who is logged on interactively.
In that case I should mention that there are different aspects to the solution. One problem is just sharing the data between the applications. Another problem is the protocol which defines how the common data can be accessed and modified and how the communication between the processes takes place. For example, you can have one process which provides the data while all the others subscribe to it. Another case: you can have common data which can be read or modified by all the applications, and you just need to make sure that nobody modifies the shared data at the same time and that nobody reads the data while another process is modifying it. Of course, there can be many other communication scenarios.
With that in mind, I would suggest two other options which you didn't include in your question:
usage of memory-mapped files (see here and here)
usage of a COM interface
Both approaches can be implemented well in both .NET and unmanaged C++. Memory-mapped files are the best option from a performance point of view. If you create a view which is not associated with any physical file, you have plain shared memory which can be used between the processes. You can additionally use a Mutex or Event to ensure that the memory is not used at the same time by multiple applications.
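A minimal .NET sketch of that idea; the map and mutex names ("Local\DemoMap", "Local\DemoMapMutex") are my own choices:

```csharp
using System;
using System.IO.MemoryMappedFiles;
using System.Threading;

public static class SharedMemoryDemo
{
    // Core application: create the mapping and keep the returned handle alive.
    // The view is not backed by a file, so it disappears once all handles are disposed.
    public static MemoryMappedFile CreateMap()
    {
        return MemoryMappedFile.CreateNew("Local\\DemoMap", 4096);
    }

    public static void Write(MemoryMappedFile map, int value)
    {
        using (var accessor = map.CreateViewAccessor())
        using (var mutex = new Mutex(false, "Local\\DemoMapMutex"))
        {
            mutex.WaitOne();                      // keep other processes out while writing
            try { accessor.Write(0, value); }
            finally { mutex.ReleaseMutex(); }
        }
    }

    // Any other process: open the existing mapping by name and read under the same mutex.
    public static int Read()
    {
        using (var map = MemoryMappedFile.OpenExisting("Local\\DemoMap"))
        using (var accessor = map.CreateViewAccessor())
        using (var mutex = new Mutex(false, "Local\\DemoMapMutex"))
        {
            mutex.WaitOne();
            try { return accessor.ReadInt32(0); }
            finally { mutex.ReleaseMutex(); }
        }
    }
}
```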
In the simplest scenario you can even use #pragma data_seg in C++ to place some data in a named section of a DLL and use the /SECTION linker option (like /SECTION:.MYSEC,RWS) to make the data shared. You can then use the DLL in all your .NET applications and in all your unmanaged C++ applications to access the common data. This gives you a very simple way of accessing the common data.
If you need a more complex communication scenario, the COM interface approach in C++/.NET could be the best choice. In that case I would recommend the article which describes step by step how to implement a Primary Interop Assembly with the COM interface in .NET only, and then use it in both .NET and C++ COM for the communication.
Is it a good idea to build an OLTP system using WCF?
The system must process 5-8K requests per second.
As noted by @nonnb in a comment, WCF is a great platform for building service-oriented or distributed applications. This includes using WCF in OLTP applications (we do that here). With WCF you could theoretically keep adding servers to scale and handle the load, but usually you will end up hitting some database contention (e.g. locking).
5K-8K requests per second is a large number. That translates to roughly 300K-500K requests per minute. To put this in perspective, if you look at the TPC-C benchmark results, the top end of your range is almost in the top 50 results, with the lower end being in (maybe) the top third of results.
Note that the Microsoft TPC-C results are C++ running in COM+ and do not involve .NET or WCF.
In terms of WCF some reading of interest would be Creating high performance WCF services and A Performance Comparison of Windows Communication Foundation. The latter is almost 4 years old so some of those performance benchmarks may have been improved over the years.
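For illustration, one common tuning point for WCF throughput is service throttling. Here is a hedged self-hosting sketch; the contract, address, and limit values are my own assumptions, not recommendations taken from those articles:

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Description;

[ServiceContract]
public interface IOrderService
{
    [OperationContract]
    void SubmitOrder(string payload);        // hypothetical OLTP operation
}

[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class OrderService : IOrderService
{
    public void SubmitOrder(string payload) { /* write to the database */ }
}

class Host
{
    static void Main()
    {
        using (var host = new ServiceHost(typeof(OrderService),
                   new Uri("net.tcp://localhost:9000/orders")))
        {
            host.AddServiceEndpoint(typeof(IOrderService),
                new NetTcpBinding(SecurityMode.None), "");

            // The default throttling limits are far too low for thousands of requests/second.
            host.Description.Behaviors.Add(new ServiceThrottlingBehavior
            {
                MaxConcurrentCalls = 512,
                MaxConcurrentSessions = 2048,
                MaxConcurrentInstances = 2048
            });

            host.Open();
            Console.ReadLine();
        }
    }
}
```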
Named Pipes? XML-RPC? Standard input/output? Web services?
I would not use unsafe stuff like shared memory and the like.
Named pipes would be the fastest method, but they only work for communication between processes on the same computer. Named pipe communication doesn't go all the way down the network stack (because it only works on the same computer), so it will always be faster.
Anonymous Pipes may only be used on the local machine. However, Named Pipes may traverse the network.
I left out Shared Memory since you specifically mentioned that you don't want to go that route. Shared Memory would be even faster than named pipes, though.
So it depends on whether you only need to communicate between processes on the same computer or on different computers. Any XML-based communication protocol (e.g. Web Services) will usually be slower due to the massive overhead of XML.
I don't think there's a quick answer to this. If I were you, I would buy/borrow a copy of Advanced Programming in the Unix Environment (APUE) by Stevens and Rago and read chapters 15 and 16 on IPC. It's a brilliant book if you really want to understand how *nix (a lot of it applies to any POSIX system) works down to the kernel level.
If you must have a quick answer, I would say the following (without putting a huge amount of thought into it), in descending order of efficiency:
Local Machine IPC
- Shared Memory / Memory-Mapped Files
- Named Pipe / FIFO (only between related processes, i.e. via fork)
- Unix Domain Socket

Network IPC / Internet Sockets
- Datagram Sockets
- Stream Sockets
- Raw Sockets
At both levels, you are going to have to think about how the data you transfer is encoded/decoded and trade off between memory usage and CPU utilization.
At the network level, you will have to consider what layers of protocols you are going to run on top of. Most commonly, just below the application layer you will be choosing between TCP and UDP. TCP has a lot more overhead as it does error correction, checksumming, and lots of other stuff. If you need in-order delivery of messages you need to use TCP as opposed to UDP.
On top of these are other protocols like HTTP, SOAP (on top of HTTP or another protocol like FTP/SMTP etc.). A binary protocol is going to be more efficient as long as you are network bound rather than CPU bound. If using SOAP on the MS.Net platform, then binary encoding of the messages is going to be quicker across the network but may be more CPU intensive.
I could go on; it's not a simple question. Learning where the latencies are and how buffering is handled is key to being able to make decisions about the trade-offs you are always forced to make with IPC. I'd recommend the APUE book above if you really want to know what is going on under the hood...
Windows messages are one of the fastest ways to do IPC; after all, Windows is built on them.
It's possible to use WM_COPYDATA with P/Invoke calls to exchange data between two form-based .NET applications, and I've got an open-source library for doing exactly that. I've benchmarked around 1771 msg/sec on a fairly hot laptop.
http://thecodeking.github.com/XDMessaging.Net
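For context, the raw Win32 mechanism underneath is WM_COPYDATA sent via SendMessage. A hand-rolled P/Invoke sketch of just that part (this is not the XDMessaging API; the window title and payload are hypothetical):

```csharp
using System;
using System.Runtime.InteropServices;

static class CopyDataSender
{
    const int WM_COPYDATA = 0x004A;

    [StructLayout(LayoutKind.Sequential)]
    struct COPYDATASTRUCT
    {
        public IntPtr dwData;   // application-defined value
        public int cbData;      // size of the payload in bytes
        public IntPtr lpData;   // pointer to the payload
    }

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

    [DllImport("user32.dll")]
    static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, ref COPYDATASTRUCT lParam);

    public static void Send(string windowTitle, string message)
    {
        IntPtr target = FindWindow(null, windowTitle);          // locate the receiver's top-level window
        if (target == IntPtr.Zero) throw new InvalidOperationException("Receiver window not found.");

        byte[] payload = System.Text.Encoding.Unicode.GetBytes(message);
        IntPtr buffer = Marshal.AllocHGlobal(payload.Length);
        try
        {
            Marshal.Copy(payload, 0, buffer, payload.Length);
            var cds = new COPYDATASTRUCT
            {
                dwData = IntPtr.Zero,
                cbData = payload.Length,
                lpData = buffer
            };
            // SendMessage blocks until the receiver's WndProc has handled WM_COPYDATA,
            // so the unmanaged buffer stays valid for the duration of the call.
            SendMessage(target, WM_COPYDATA, IntPtr.Zero, ref cds);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

The receiving form would override WndProc, check for WM_COPYDATA, and marshal the COPYDATASTRUCT back out; the XDMessaging library wraps all of this for you.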
I don't know why you won't go with shared memory, but it's very fast between C# apps on the same machine, and very reliable (unlike TCP sockets). spazzarama/SharedMemory is a fantastic C# library that supports shared arrays and buffers with a simple high-level API. You just initialize the class with a common memory file name (on the client and server sides) and then update the array. Values magically appear on the other side!