I'm going to split a program into two parts, because I'm running out of process memory. One part is taking a picture and storing it on the file system (GUI) and the other part is analyzing the picture (OCR) and reporting the results back to the main part.
The communication between the two processes will look like this:
Is the OCR process responding?
If not, start OCR process.
Tell the OCR process that there is a new picture.
Wait until the OCR process returns the result (most likely less than 1 KB of characters)
The three most important things, in order of priority for me are:
High performance
High stability
Low complexity - I've only got around three days to finish and test the program.
The GUI is written in .NET/C#, so the solution must be compatible with that. Which method of IPC would you recommend me to use?
I'd probably use point-to-point message queues for this. They perform very well and are stable; the kernel uses them for its own notification system. The MSDN article already has managed classes built for using them, so complexity is also low.
You could use WCF for Windows Mobile. Microsoft has released guidelines and sample projects showing how to do this. If you set it up to use message queue endpoints (I'm not sure if named pipes are available), then performance should be very good. Apart from that, WCF is a very easy technology to get started with. Good luck!
Given that the familiar form of .NET is run on Windows, which is not a real-time O/S, and MONO runs on Linux (standard kernel is also not a real-time O/S).
Given also, that any memory allocation scheme offering garbage collection (as in "managed" .NET), and indeed any heap memory scheme will introduce non-deterministic, potentially non-trivial delays into an application's execution behavior.
Is there any combination of alternate host O/S and coding paradigm in which one can leverage all of the power and conveniences of C# .NET while implementing a solution which can execute designated portions of code within tightly specified time constraints? e.g. start a C# method every 10ms to a tolerance of less than 1ms, with completion time determined only by the work performed in the method itself?
Obviously, the application would have to be carefully written; time-critical code would have to avoid memory allocations; the application would have to have completed all its memory allocation etc. work and have no other threads active once the hard real-time loop is started. Also, the host O/S would have to support real-time scheduling.
Is this possible within the .NET / MONO framework, or is it precluded by the design of the .NET runtime, framework, and O/Ss on which it (or compatible equivalent) is supported?
For example: is it possible to do reliable fine-grained (~1ms) machine control purely in C# with something like NETduino, or do they have limits or require alternate strategies for such applications?
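The kind of loop described above can at least be measured on an ordinary runtime. The following is a minimal sketch, not a hard real-time solution: it assumes a busy-wait on Stopwatch is acceptable, the helper name MeasureWorstLatenessMs is hypothetical, and the "work" is a placeholder that reuses a buffer allocated before the loop starts. It only reports how late each cycle started; it cannot prevent the OS or the garbage collector from interrupting.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: calls `work` once per period and returns the worst
// lateness (in ms) with which any iteration started.
double MeasureWorstLatenessMs(int iterations, double periodMs, Action work)
{
    long periodTicks = (long)(periodMs * Stopwatch.Frequency / 1000.0);
    long nextStart = periodTicks;
    double worstMs = 0.0;
    var clock = Stopwatch.StartNew();

    for (int i = 0; i < iterations; i++)
    {
        // Busy-wait for the deadline. This burns a core, but avoids the
        // coarse timer granularity of Thread.Sleep.
        while (clock.ElapsedTicks < nextStart)
        {
            Thread.SpinWait(20);
        }

        double lateMs = (clock.ElapsedTicks - nextStart) * 1000.0 / Stopwatch.Frequency;
        if (lateMs > worstMs) worstMs = lateMs;

        work();
        nextStart += periodTicks;
    }
    return worstMs;
}

// All memory is allocated before the loop starts; the work only reuses it.
var samples = new double[256];
int cursor = 0;
Action step = () => { samples[cursor] = cursor; cursor = (cursor + 1) % samples.Length; };

double worst = MeasureWorstLatenessMs(iterations: 50, periodMs: 10.0, work: step);
Console.WriteLine($"Worst start lateness over 50 cycles: {worst:F3} ms");
```

Running this under load on a desktop OS typically shows occasional late starts, which is exactly the non-determinism the question is asking about.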
Short Answer: No.
Longer answer: The closest you can get is running the .NET Micro Framework directly on hardware, but the TinyCLR still doesn't give you deterministic timings. Microsoft has Windows CE/Windows Embedded Compact as its real-time offering, but even that is only real-time for slower tasks (I believe somewhere in the range of 50 microseconds or more; I'm not sure if that qualifies as hard real-time).
I do not know whether it would be technically possible to create a real-time C# implementation, but no one has done so, and even .NET Native isn't made for that.
Can C# be used for hard real-time? Yes
When we talk about real-time it's most often (if not always) about robotics and IoT. And for that we almost always go with one of these options (forget Windows CE and Windows 10 IoT):
Microcontrollers (example: Arduino, RPi Pico, NodeMCU)
Linux based SBCs (example: Raspberry Pi, BeagleBone, Rock Pi)
Microcontrollers are by nature real-time. Basically the device will just run a loop forever (there are interrupts and multi-threading on some chips though). Top languages in this category are C/C++ and MicroPython. But C# can also be used:
Wilderness Labs (Netduino and Meadow F7)
.NET nanoFramework (several boards)
The second option (Linux based SBCs) is a bit more tricky. The OS has complete control over the hardware and it has a scheduler. That way many processes can be run on just one CPU. The OS itself has a lot of housekeeping as well.
Linux has a set of scheduling APIs that can be used to tell the OS that we want it to favor our process over others. The OS will do its best to comply, but there are no guarantees. This is usually called soft real-time. In .NET you can use Process.PriorityClass to change your process's nice value. Depending on how busy the OS is and the amount of resources available (CPUs and memory), you might get satisfying results.
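As a rough illustration of the Process.PriorityClass approach, here is a minimal sketch. The helper name TryRaisePriority is hypothetical, and the sketch assumes that a failed request (for example, raising priority on Linux without sufficient privileges) should simply be reported rather than treated as fatal.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;

// Hypothetical helper: asks the OS to run this process at the given priority
// class and reports whether the request was honored.
bool TryRaisePriority(ProcessPriorityClass desired)
{
    try
    {
        using var self = Process.GetCurrentProcess();
        self.PriorityClass = desired;
        self.Refresh(); // re-read the value the OS actually applied
        return self.PriorityClass == desired;
    }
    catch (Exception ex) when (ex is Win32Exception
                            || ex is InvalidOperationException
                            || ex is PlatformNotSupportedException)
    {
        // Typically a permissions problem; the process keeps its old priority.
        return false;
    }
}

bool raised = TryRaisePriority(ProcessPriorityClass.AboveNormal);
Console.WriteLine(raised ? "Priority raised." : "Priority change not permitted.");
```

Even when the call succeeds this is only a hint to the scheduler, which matches the "soft real-time" caveat above.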
Other than that, Linux also provides hard real-time capabilities with the PREEMPT_RT patch, and you can also isolate a CPU core for your selected processes. But to my knowledge .NET does not have any API to use these capabilities (P/Invoke may work).
I have a Windows Store app with an option to export certain data in a video file format. My app is in C#, but the encoding itself is handled by dropping into a C++ library adapted from this sample by David Catuhe and is working well. The problem I have found is that the encoding process can take a long time when run at high quality, and if the screen times out (say, on a Surface RT) or the user switches apps, the process fails. I'm not entirely sure what the source of the failure is and am working to verify it, but even if the process were able to survive suspension without changes, I don't know how to handle being tombstoned.
I can live with the encoding being interrupted in certain situations. What I don't want is to have to start over from scratch if the app goes away for some reason.
As far as I can tell, it isn't feasible to simply close the stream without finalizing the video and resume writing to it later. In light of this, I have considered a few options, but I can't tell which, if any, might actually work. I'd be very grateful for some direction.
1) If possible, it'd be great to be able to simply close the stream and reopen it later, picking up where I left off. At the moment I haven't been able to get this to work, but if it SHOULD work I'd love to know.
2) Push the encode process to a background task, either from the start or only when tombstoned. But is there a way to pass an open stream from my app to a background task? If not, is there a way to get my app's background task to run without CPU/memory limitations at least while my app is in the foreground? Because doing a whole encode within the very tight constraints that normally bind background tasks would take years.
3) Render segments of the video progressively while the app is in the foreground and then stitch the parts together at the end. This way, if the encode is interrupted I can pick up at the most recent segment. From my reading this should be possible in theory (I think it falls under the category of remuxing, which would avoid the need to re-encode the video). But I haven't found any samples that cover this scenario, not even in C++ (which I have almost no experience with). The Transcode API doesn't seem to cover joining multiple samples. I've looked into using SharpDX to do it, but the most likely candidate for what I'd want to use (a Media Session) is only exposed for desktop apps.
4) Push the work off to either a desktop or server app. The problem is I want to have this run on Windows RT (so desktop is out) and I don't currently have a business model that can support servers capable of handling such intensive work on my customers' behalf.
So my question is, what is my best line of attack here? Is there any way to hold onto my stream across suspension? And if, as I suspect, option #3 is my best bet, do you know of any samples or guides on how to do it? Obviously C# options would be very much preferred, so I hope I am overlooking one. C++ might be OK (as it was with Mr. Catuhe's sample that got me this far), but I'm afraid I'd need some pretty specific guidance. The MSDN documentation on this, incidentally, is so high-level that I have only a vague idea of even which pieces I would need to assemble and what each requires, let alone how to write the actual program in C++.
Any help you could offer would be very much appreciated.
Unfortunately I don't have enough reputation points on SO to just comment so I have to give this as an answer.
You could consider a combination of #3 and #4. Render in segments within your app and then upload the segments for stitching together. This would bring you back into the realms of using a commodity solution to create your final output.
Can C# be used for developing a real-time application that involves taking input from web cam continuously and processing the input?
You cannot use any mainstream garbage-collected language for “hard real-time systems”, as the garbage collector will sometimes stop the system from responding within a defined time. Avoiding object allocation can help; however, you need a way to prove that you are not creating any garbage and that the garbage collector will not kick in.
However, most “real time” systems don’t in fact need to always respond within a hard time limit, so it all comes down to what you mean by “real time”.
Even when parts of the system need to be “hard real time”, often other large parts of the system, like the UI, don’t.
(I think your app needs to be fast rather than “real time”, if 1 frame is lost every 100 years how many people will get killed?)
I've used C# to create multiple realtime, high speed, machine vision applications that run 24/7 and have moving machinery dependent on the application. If something goes wrong in the software, something immediately and visibly goes wrong in the real world.
I've found that C#/.NET provides pretty good functionality for doing so. As others have said, definitely stay on top of garbage collection. Break up the processing into several logical steps, and have separate threads working on each. I've found the Producer Consumer programming model to work well for this, perhaps with ConcurrentQueue for starters.
You could start with something like:
Thread 1 captures the camera image, converts it to some format, and puts it into an ImageQueue
Thread 2 consumes from the ImageQueue, processing the image and comes up with a data object that is put onto a ProcessedQueue
Thread 3 consumes from the ProcessedQueue and does something interesting with the results.
If Thread 2 takes too long, Threads 1 and 3 are still chugging along. If you have a multicore processor you'll be throwing more hardware at the math. You could also use several threads in place of any thread that I wrote above, although you'd have to take care of ordering the results manually.
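The three-stage layout above can be sketched roughly as follows. This is only an illustration under assumptions: it uses BlockingCollection (which wraps a ConcurrentQueue by default) with bounded capacity, Task.Run instead of dedicated threads, and fake two-byte "frames" in place of camera images. The names RunPipeline, imageQueue and processedQueue are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical pipeline: capture -> process -> report, one worker per stage.
List<int> RunPipeline(int frameCount)
{
    var imageQueue = new BlockingCollection<byte[]>(boundedCapacity: 4);
    var processedQueue = new BlockingCollection<int>(boundedCapacity: 4);
    var results = new List<int>();

    // Stage 1: stands in for grabbing a camera image.
    var capture = Task.Run(() =>
    {
        for (int i = 0; i < frameCount; i++)
        {
            imageQueue.Add(new byte[] { (byte)i, (byte)(i + 1) });
        }
        imageQueue.CompleteAdding(); // tells stage 2 that no more frames will arrive
    });

    // Stage 2: stands in for the image processing step.
    var process = Task.Run(() =>
    {
        foreach (byte[] frame in imageQueue.GetConsumingEnumerable())
        {
            processedQueue.Add(frame[0] + frame[1]);
        }
        processedQueue.CompleteAdding();
    });

    // Stage 3: does something with the results (here it just collects them).
    var report = Task.Run(() =>
    {
        foreach (int value in processedQueue.GetConsumingEnumerable())
        {
            results.Add(value);
        }
    });

    Task.WaitAll(capture, process, report);
    return results;
}

Console.WriteLine(string.Join(", ", RunPipeline(3))); // prints "1, 3, 5"
```

The bounded capacity means a slow middle stage eventually makes the capture stage block instead of letting frames pile up in memory. With a single worker per stage the order of results is preserved; with several workers per stage it would have to be restored manually, as noted above.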
Edit
After reading other people's answers, you could probably argue with my definition of "realtime". In my case, the computer produces targets that it sends to motion controllers, which do the actual realtime motion. The motion controllers provide their own safety layers for things like timing, max/min ranges, smooth accel/decelerations and safety sensors. These controllers read sensors across an entire factory with a cycle time of less than 1ms.
Absolutely. The key will be to avoid garbage collection and memory management as much as possible. Try to avoid new-ing objects as much as possible, using buffers or object pools when you can.
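One possible way to reuse buffers instead of new-ing them per frame is the shared array pool. This is a sketch only: it assumes System.Buffers.ArrayPool is available (built into newer .NET versions, a separate package on older ones), and the helper SumFrame with its fill-and-sum "processing" is purely hypothetical.

```csharp
using System;
using System.Buffers;

// Hypothetical per-frame routine that borrows its working buffer from a pool
// rather than allocating a new array on every call.
int SumFrame(int frameSize, byte fill)
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(frameSize);
    try
    {
        // Rent may hand back a larger array than requested, so only the
        // first frameSize bytes are treated as the frame.
        for (int i = 0; i < frameSize; i++) buffer[i] = fill;

        int sum = 0;
        for (int i = 0; i < frameSize; i++) sum += buffer[i];
        return sum;
    }
    finally
    {
        // Returning the buffer makes it available to the next caller.
        ArrayPool<byte>.Shared.Return(buffer);
    }
}

Console.WriteLine(SumFrame(100, 2)); // prints 200
```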
Of course, someone has even developed a library to do that: AForge.NET
As with any real-time application and not just C#, you'll have to manage the buffers well as #David suggested.
Not only that, there's also the XNA Framework (for things like 3D games), and you can program DirectX using C# as well, both of which are very real-time.
And did you know that, if you want, you can do pointer manipulations in C# too?
It depends on how 'real-time' it needs to be; i.e., what your timing constraints are, and how quickly you need to 'do something'.
If you can handle 'doing something' maybe every 300ms or so in .NET, say on a timer event, I've found Windows to work okay. Note that this is something I found true on multiple systems of different ages and different speeds. As always, YMMV.
But that number is awfully long for a lot of applications. Maybe not for yours.
Do some research, make sure your app responds quickly enough for your application.
I want it to work on windows servers.
It will be a cloud type server - it'll consist of modules\parts running on different machines all over the world using http\tcp + upnp to connect to each other
There are going to be controlling\monitoring\observing modules on each machine to provide stats on performance
This net is going to be working with large amounts of VIDEO\AUDIO live streaming\broadcasting data
It is going to use FFMPEG for re-encoding and OpenGL, OpenCV and such for filtering (.NET wrappers exist and work BTW)
It will not use any WCF or IIS
I want to develop it in a team of 2-4 developers, smart students.
So is it OK to create this in C# .NET, or should I not waste my time on the promises of ease it makes to a developer and go with C\C++?
So is it reasonable to write a server application in C# in my case?
Offtop - why not WCF
Warning: it gets way too subjective in here.
WCF is great when you have a big corporation with a relatively small amount of data exchanged per service session.
When you have video, LIVE video, it all gets complicated. Large amounts of data, lots of users stream in and out from your service at the same time.
Try live video streaming over the HTTP binding, then try it with the others, and you'll see why I do not like the idea of live streaming with WCF: it is slow and carries far too much information that live streaming doesn't need. And after all, have you ever seen a live video streaming app built on WCF? No, you haven't. Maybe you have seen more-or-less live video on the Silverlight + IIS pair, which I do not like because it is a solution only for Silverlight\Windows Media Player video streaming, while I want more than that.
I love to have cross-platform clients with rich UIs, and I do not like the Silverlight+IIS+WCF group (this is all my personal opinion, so it is subjective). So what shall I do? Right: go to sockets and streams in old, simple formats like FLV, with Flash as the client. That is simpler to develop in some parts, and a more conservative way of doing live video over the web than what you get from MS today.
I love Flash FLV live streaming because you just open a socket and start sending live FLV video data to it (for each user, the FLV header and then the FLV "TAGs", one by one: video tag, audio tag, video tag, audio tag, etc.) and Flash plays it! With no special\unusual code. It is fast, easy to support, and does not require anything new\unusual on the client. And on the server side you can make great use of that "TAG" form of video\audio data representation.
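To make the "header, then tags" idea concrete, here is a small sketch of how the byte layout could be assembled. It follows the published FLV container layout (a 9-byte header followed by a 4-byte "previous tag size" of zero, then tags with an 11-byte tag header and a trailing size field). The helper names are hypothetical, the payload is a dummy, and producing real codec data and writing the bytes to each client's socket are left out.

```csharp
using System;
using System.IO;

// Writes the low 24 bits of a value in big-endian order.
void WriteUInt24(Stream s, uint value)
{
    s.WriteByte((byte)(value >> 16));
    s.WriteByte((byte)(value >> 8));
    s.WriteByte((byte)value);
}

// FLV file header plus the mandatory PreviousTagSize0 field (always 0).
byte[] BuildFlvHeader(bool hasAudio, bool hasVideo)
{
    byte flags = (byte)((hasAudio ? 0x04 : 0) | (hasVideo ? 0x01 : 0));
    return new byte[]
    {
        (byte)'F', (byte)'L', (byte)'V',
        0x01,            // FLV version
        flags,           // audio/video presence flags
        0, 0, 0, 9,      // header length in bytes
        0, 0, 0, 0       // PreviousTagSize0
    };
}

// One FLV tag: tagType is 8 for audio, 9 for video, 18 for script data.
byte[] BuildFlvTag(byte tagType, uint timestampMs, byte[] payload)
{
    using var ms = new MemoryStream();
    ms.WriteByte(tagType);
    WriteUInt24(ms, (uint)payload.Length);     // data size
    WriteUInt24(ms, timestampMs & 0xFFFFFF);   // lower 24 bits of the timestamp
    ms.WriteByte((byte)(timestampMs >> 24));   // extended timestamp byte
    WriteUInt24(ms, 0);                        // stream id, always 0
    ms.Write(payload, 0, payload.Length);

    // Each tag is followed by its own total size (11-byte header + payload).
    uint tagSize = 11u + (uint)payload.Length;
    ms.WriteByte((byte)(tagSize >> 24));
    WriteUInt24(ms, tagSize & 0xFFFFFF);
    return ms.ToArray();
}

byte[] header = BuildFlvHeader(hasAudio: true, hasVideo: true);
byte[] videoTag = BuildFlvTag(9, 1000, new byte[] { 1, 2, 3 }); // dummy payload
Console.WriteLine($"{header.Length} header bytes, {videoTag.Length} tag bytes");
```

In a server along the lines described above, each newly connected client would first be sent the header bytes and then every tag as it is produced.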
So that is in short why I just do not want to use WCF - hard to get live video playing out from it on client side, no general benefits for live video server.
And when most of the live data goes through sockets, why bother using WCF for service management?
During the last half of 2009 and the first half of 2010 I was getting into WCF, live video streaming, Silverlight and Flash, comparing the process of client\server creation and reading different formats with a team of very interesting developers. In general, at the end of the project we had lots of mini servers streaming live data and lots of different clients receiving it. Comparing everything we'd done, we came to conclusions close to the ones I present here.
That is why I do not want to use WCF in my nearest project - I do not want to think about how to deliver media data, I want to focus on its filtering\editing.
Why the question appeared
We started playing with FFmpeg\OpenCV in C, and it is pretty simple to manipulate data using them... in C... on Linux...
But when we started to play with their .NET bindings (we are now playing with Tao.FFmpeg) we found that in most cases we end up using C# Marshal a lot, needing two variables for each C analog (the pointer problem), and so on. I hope we will not see such problems with Emgu CV, but it still makes me a little bit afraid...
I think it's entirely reasonable. The benefits of C# with regard to ease of development will greatly outweigh any performance drawbacks of not using C++.
C# is generally more cross-platform than C++. True, C++ is a cross-platform language, but there are large differences between the APIs that C++ programs use to interact with the system. C# and .Net/Mono have a much more standardized interface to the socket layer.
Finally, with ambitious projects like this, getting the project into a usable form is a much more important goal than getting the highest performance possible. Performance only matters if the project is complete. Write it in C# because that will give you the greatest odds of completion. Then worry about performance.
I'm not exactly sure why people have brought up Cross Platform concerns as clearly the OP has stated the app will run on Windows.
As to the actual questions.
Can you build a server application that communicates via tcp/http in C# that does not have to run in IIS. -> Yes.
Can you build a server application that is performant and scales in C# -> Yes.
Can you do so with Students -> Maybe. Depends on the students... ;) But that is irrespective of the language in use.
Is this something I would do? Yes. We've done that. We have a C# app running on approximately 20,000 machines right now that are communicating effectively over TCP. We aren't using WCF, but we did decide to use RESTful-style services over HTTP for the data transfer.
Our biggest issue was simply tuning the app to transfer the "right" amount of data over the wire at a time. This network is for data collection and storage. It's averaging around 200GB of data collected a day.
UPDATE
I wanted to clarify a bit about the above app. The 20,000 machines at the above installation are clients (XP, Vista, 7, 2003 Server, and 2008 Servers). There's only one data collection point server in the mix. The clients post data to the server, when connected to a network, once every 45 seconds. Roughly 97% of the machines stay connected in this manner, the rest connect a couple times a week.
This works out to the server processing about 37 million requests a day.
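The 37 million figure can be reproduced from the numbers given above. This is just the arithmetic spelled out; the helper name is hypothetical, and the machines that only connect a couple of times a week are ignored.

```csharp
using System;

// Requests per day from machines that post once every intervalSeconds.
long RequestsPerDay(int machines, double connectedFraction, int intervalSeconds)
{
    long connectedMachines = (long)Math.Round(machines * connectedFraction);
    long postsPerMachinePerDay = 86400 / intervalSeconds; // seconds in a day / interval
    return connectedMachines * postsPerMachinePerDay;
}

long requests = RequestsPerDay(20000, 0.97, 45);
Console.WriteLine(requests); // prints 37248000

// At roughly 5.5 KB per request this is on the order of 200 GB per day.
double gigabytes = requests * 5.5 / 1_000_000;
Console.WriteLine($"{gigabytes:F0} GB/day (approximate)");
```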
Now, to be sure, each request is relatively small at around 5KB to 6KB each. However, the sheer number of requests shows that a C# application can handle managing those connections, which is the bigger part of the OP's problem.
Because the OP's files are large (video), the real issue is simply data transfer, which will be hindered more by hard drive speed, network speed, and latency. Those issues are irrespective of which language you are working in and will limit the number of connections per server based on available bandwidth.
Working this out, let's limit it to one server as an example. If you have a video rate of 400kb/s and a 25Mb connection to the internet, then that box could physically only handle around 62 simultaneous connections, which is so FAR below the number of connections our app is handling as to be a rounding error.
Assuming perfect network conditions (which don't exist), pumping that internet connection up to 100Mb (which can be expensive) means a 4x increase in simultaneous connections, to around 250; still completely manageable.
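The connection estimates can be checked with a one-line calculation. This sketch assumes both figures are in bits (a link in megabits per second, a stream in kilobits per second) and ignores protocol overhead entirely; the helper name is hypothetical.

```csharp
using System;

// Upper bound on simultaneous streams a link can carry, ignoring all overhead.
int MaxStreams(double linkMegabitsPerSecond, double streamKilobitsPerSecond)
{
    return (int)(linkMegabitsPerSecond * 1000.0 / streamKilobitsPerSecond);
}

Console.WriteLine(MaxStreams(25, 400));  // prints 62
Console.WriteLine(MaxStreams(100, 400)); // prints 250
```

Either way, the limit comes from bandwidth, not from the language the server is written in.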
However, the network is only one side of the equation. Drive speed on the servers matters a lot. You had better have a good disk array capable of continuously delivering that amount of data. I know drives claim 3Gb/s data transfers, but a drive which can saturate the channel has never been built. That means serious planning and money in the server setup.
The point of all of this is to say that the language doesn't matter one bit in your situation. You have other much larger contention issues. With that being the case, go with the language that will help you get the project done faster.
Why stop at C#? If you (possibly) want cross-platform, write it in Python or similar. You'll find that the networking aspects of a scripting language are far better than C#'s (as that's pretty much the role scripting languages are put to nowadays: running web-based servers).
You'll find developer productivity is much improved over C# (just as C# has better productivity than C++), and there are lots of people who know and want to work on these systems. It sounds like performance of the servers themselves is of less importance than the networking, so it appears that a scripting language would be your best choice. Plus, the FFmpeg libraries are more tightly integrated with Python (using pyffmpeg) than with C# (well, mostly).
And it'd be a lot cooler, more fun, and very much cross-platform!
If you want C# and also cross-platform abilities, your development will have to target the Mono platform (or another cross-platform .NET runtime, if you can find one). You might have to give up Visual Studio, and maybe some Microsoft-specific libraries and tools, but you can still have C# on multiple platforms. Just make sure you start the multi-platform building and testing EARLY in the process or it will be hell to change things later.
If the target of the application is to run only on Windows platforms, I would definitely write this application in C#. Many applications like that are probably running right now without us even knowing it.
If the target is to run on multiple platforms, you should first encapsulate all the problems that a non-Windows platform can bring to your application.
Why do you have to write it in C++ if, in this case, C# is capable of doing everything that C++ does? I would use C++ to program hardware-level things, like a robot or something similar. For writing a server application, C# fits what you want very well; it was designed for these things.
And C# is cross-platform, you just need the right tool to make it work on a specific platform.
I have many unused computers at home. What would be the easiest way for me to utilize them to parallelize my C# program with little or no code changes?
The task I'm trying to do involves looping through lots of English sentences. The dataset can easily be broken into smaller chunks and processed on different machines concurrently.
… with little or no code changes?
Difficult. Basically, look into WCF as a way to communicate between various instances of the program across the network. Depending on the algorithm, the structure might have to be changed drastically, or not at all. In any case, you have to find a way to separate the problem into parts that act independently from each other. Then you have to devise a way of distributing these parts between different instances, and collecting the resulting data.
PLINQ offers a great way to parallelize your program without big changes, but it only works within one process, across different threads, and then only if the algorithm lends itself to parallelization. In general, some manual refactoring is necessary.
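For the single-machine case, the PLINQ change can be as small as one extra call. A minimal sketch, assuming each sentence can be processed independently; CountWords is a hypothetical stand-in for the real per-sentence work.

```csharp
using System;
using System.Linq;

// Hypothetical per-sentence work: count the words.
int CountWords(string sentence)
{
    return sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
}

// AsParallel spreads the work across cores; AsOrdered keeps results in input order.
int[] CountAll(string[] sentences)
{
    return sentences.AsParallel().AsOrdered().Select(s => CountWords(s)).ToArray();
}

int[] counts = CountAll(new[] { "the quick brown fox", "hello world" });
Console.WriteLine(string.Join(", ", counts)); // prints "4, 2"
```

This uses all cores of one machine only; spreading the work over several machines still needs one of the approaches discussed in the other answers.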
That's probably not possible.
How to parallelize a program depends entirely on what your program does and how it is written, and usually requires extensive code changes and increases the complexity of your program many fold.
The usual way to easily increase concurrency in a program is to take a task that is repeated many times and write a function that splits that task into chunks and sends them to different cores to process.
The answer depends on the nature of the work your application will be doing. Different types of work have different possible parallelization solutions. For some types there is no possible/feasible way to parallelize.
The easiest scenario I can think of is an application whose work can easily be broken into discrete job chunks. If this is the case, then you simply design your application to work on a single job chunk. Provide your application with the ability to accept new jobs and deliver the finished jobs. Then, build a job scheduler on top of it. This scheduler can be part of the same application (configure one machine to be the scheduler and the rest as clients), or a separate application.
There are other things to consider: How will the machines communicate (files? network connections?)? Does the application need to be able to report, or be queried about, the percentage of the job completed? Is there a need to be able to force the application to stop processing the current job? And so on.
If you need a more detailed answer, edit your question and include details about the application, the problem the application solves, the expected number of jobs, etc. Then the community will come up with more specific answers.
Dryad (Microsoft's variation of MapReduce) addresses exactly this problem (parallelize .net programs across multiple PCs). It's in research stage right now.
Too bad there are no CTPs yet :-(
You need to run your application on a distributed system. Google for "distributed computation windows" or for "grid computing c#".
Is each sentence processed independently, or are they somehow combined? If your processing operates on a single sentence at a time, you don't need to change your code at all. Just execute the same code on each of your machines and divide the data (your list of sentences) between them. You can do this either by installing a portion of the data on each machine, or by sharing the database and assigning a different chunk to each machine.
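Dividing the data between machines can be as simple as giving each machine every N-th sentence. A minimal sketch, assuming each machine is started with its own index and the total machine count (for example, as command-line arguments); the helper name SliceFor is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Returns the sentences that machine number machineIndex (0-based) should process
// when the list is shared round-robin between machineCount machines.
IEnumerable<string> SliceFor(IReadOnlyList<string> sentences, int machineIndex, int machineCount)
{
    for (int i = machineIndex; i < sentences.Count; i += machineCount)
    {
        yield return sentences[i];
    }
}

var all = new List<string> { "a", "b", "c", "d", "e" };
Console.WriteLine(string.Join(", ", SliceFor(all, 0, 2))); // prints "a, c, e"
Console.WriteLine(string.Join(", ", SliceFor(all, 1, 2))); // prints "b, d"
```

Every sentence lands on exactly one machine, so the outputs can simply be concatenated afterwards.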
If you want to change your code slightly to facilitate parallelism, share the entire database and have the code "mark" each sentence as it's processed, then look for the next unmarked sentence to process. This will give you a gentle introduction to the concept of thread safety -- techniques that ensure one processor doesn't adversely interfere with another.
As always, the more details you can provide about your specific application, the better the SO community can tailor our answers to your purpose.
Good luck -- this sounds like an interesting project!
Before I would invest in parallelizing your program, why not just try breaking the datasets down into pieces and manually run your program on each computer and collate the outputs by hand. If that works, then try automating it with scripts and write a program to collate the outputs.
There are several software solutions that allow you to use commodity based hardware. One is Appistry. I work at Appistry and we have done numerous solutions to run C# applications across hundreds of machines.
A few useful links:
http://www.appistry.com/resource-library/index.html
You can download the product for free here:
http://www.appistry.com/developers/
Hope this helps
-Brett
You might want to look at Flow-Based Programming - it has a Java and a C# implementation. Most approaches to this problem involve trying to take a conventional single-threaded program and figure out which parts can run in parallel. FBP takes a different approach: the application is designed from the start in terms of multiple "black-box" components running asynchronously (think of a manufacturing assembly line). Since a conventional single-threaded program acts like a single component in the FBP environment, it is very easy to extend an existing application. In fact, pieces of an existing app can often be broken off and turned into separate components, provided they can run asynchronously with the rest of the app (i.e. not subroutines). Someone called this "turning an iceberg into ice cubes".