What options do I have for SSIS cross-process communication? - c#

I would like to store an integer variable that gets incremented and decremented (a counting semaphore for limiting concurrent requests to an external API). This would be easy, except I need a way to read/write this variable from an SSIS package that is run in parallel SQL Agent jobs. Right now there can be 0 to 5 instances of the SQL Agent job, and therefore the SSIS package, running at once.
What are my options for reading and writing this variable? The code that will be using this variable is written as a custom SSIS task in .NET.
It is not particularly important that the value is exactly right, as long as it's generally within a tolerance range. Exact would be great, but not required.
I have access to the file system, registry, database, server, and the SSIS agent as a whole, but the variable will be checked very often by 15-30 threads, which has historically caused issues with a file-system approach (I'm probably doing it wrong), and is in my opinion too intensive to store in the database. Correct me if I'm wrong. Storing it in the registry prevents the variable from being accessible across a server farm.
If there's anyone out there that can help, I will gladly be your indentured servant.

If it is used as a counting semaphore, why not actually use a Windows semaphore object?
System.Threading.Semaphore is the .NET version of it, and if you specify a semaphore name in the constructor, the Win32 object will be shared between all processes that use that name.
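For illustration, a minimal sketch of that idea (the name "Global\ApiThrottle" and the limit of 5 are assumptions; the "Global\" prefix makes the semaphore visible across sessions on the machine):

using System;
using System.Threading;

class ApiThrottleExample
{
    static void CallExternalApi()
    {
        // Open or create a machine-wide semaphore with 5 slots.
        using (var semaphore = new Semaphore(initialCount: 5, maximumCount: 5, name: @"Global\ApiThrottle"))
        {
            semaphore.WaitOne();       // blocks until one of the 5 slots is free
            try
            {
                // ... call the external API here ...
            }
            finally
            {
                semaphore.Release();   // always give the slot back
            }
        }
    }
}

Each SQL Agent job instance that opens a semaphore with the same name shares the same underlying Win32 object, so the limit is enforced across processes without any file or database polling.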

Not sure I understand the question - you indicated you have access to a database, file system, registry, etc. Are you saying you don't want to / can't use these methods? Are you looking to persist the value so that, in the event the computer halts, you can recover it?
If persistence is not required, you could keep the value in memory and expose it via RPC, including COM or web services. Whatever the solution, it seems it needs to be global and visible to all running instances.
Is this variable metadata used as a semaphore to control and coordinate the processes, or is this variable domain data?


C# Windows Service queue with Pool

EDIT: Context
I have to develop an ASP.NET web application which will allow a user to create a "conversion request" for one or several CAO files.
My application should just upload files to a directory where I can convert them.
I then want to have a service that will check the database updated by the web application to see if a conversion is waiting to be done. Then I have to call a batch command on this file with some arguments.
The thing is that those conversions can freeze if the user supplies a CAO file that was created incorrectly. In this case, we want the user or an admin to be able to cancel the conversion process.
The batch command used to convert is a third-party tool that I can't change. It needs a token to convert, and I can multithread as long as I have more than one token available. Another application can use those tokens at the same moment, so I can't just have a pre-sized thread pool based on the number of tokens I have.
The only way I have to know if I can convert is to start the conversion with the command and see if the logs tell me that I have a licence problem, which means either "No token available" or "Current licence doesn't accept the given input format". As I allow only supported input formats on the web application, I can't hit the second problem.
The web application is almost done, meaning that I can upload files and download the results and conversion logs at the end. Now I need to build the service that will take the input files, convert them, update the conversion status in the database and finally add those files to the correct download directory.
I have to work on a service which will check the database at a high frequency (maybe every 5 or 10 seconds) for rows set to "Ready to convert" or "Must be canceled".
If a row is set to "Ready to convert" I must try to convert it, but I do so using a third-party DLL with a licence token system that only lets me run 5 conversions simultaneously at the moment.
If a row is set to "Must be canceled" I must kill the conversion, either because it froze and the admin had to kill it or because the user canceled his own task.
Also, conversion times can be very long, from 1 or 2 seconds to several hours depending on the file size and how it was created.
I was thinking about a pooling system, as I saw here :
Stackoverflow answer
A pooling system gives me the advantage of isolating the database-reading part from the conversion process. But I have the feeling that I lose some control over the background processes, which is maybe just because I'm not used to them.
But I'm not very used to services, and even if this pool system seems good, I don't know how I can cancel a task if needed.
The tool I use to convert works as a simple batch command that just returns an error if no licence is available right now. Using a pool, how can I make the conversion thread wait until a licence becomes available? Is a simple infinite while loop an appropriate answer? It seems quite bad to me.
Finally, I can't just use a "5-thread pool", as those licences are also used by 2 other applications, which means I never know at any given moment how many of them are available.
The idea of using a pool may also be wrong; as I said, I'm not very used to services, and before starting something stupid, I prefer to ask about the right way to do it.
Moreover, about the database reading/writing, even if I think the second option is better, should I:
Use the big model files that I already use in my ASP.NET web application, which will create a lot of objects (one for each row, as they are entity models), or
Not use the entity models but lighter ones, which will be less entity-oriented but will probably be less resource-demanding. This will also be harder to maintain.
So I'm asking more for advice on how I should do it than for a pure code answer, but some examples could be very useful.
EDIT: to be more precise, I need to find a way to:
(For the moment, I stay with only one licence available; I will evolve it later if needed.)
Have a service that runs as a loop and will, if possible, start a new thread for the given request. The service must keep running, as the status can be set to "Require to be canceled" at any time.
At the moment I was thinking about a task with a cancellation token, which would probably achieve this.
But if the task finds that the token isn't currently available, how can I tell the main loop of my service that it can't process now? I was thinking about having just an integer return value, where the return code would indicate the ending reason: Cancellation / No token / Ended... Is that a good way to do it?
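A rough sketch of that idea, for discussion only (ConversionResult and RunConverter are made-up names; RunConverter stands in for launching the third-party batch command and parsing its log, and an enum is used instead of a raw integer return code):

using System;
using System.Threading;
using System.Threading.Tasks;

enum ConversionResult { Completed, Cancelled, NoTokenAvailable }

static class ConversionRunner
{
    public static Task<ConversionResult> ConvertAsync(string inputFile, CancellationToken ct)
    {
        return Task.Run(() =>
        {
            // Honour a cancellation requested before the work even started.
            if (ct.IsCancellationRequested)
                return ConversionResult.Cancelled;

            // Hypothetical call: assumed to return false when no licence token is available.
            bool converted = RunConverter(inputFile, ct);
            if (!converted)
                return ConversionResult.NoTokenAvailable;

            return ConversionResult.Completed;
        }, ct);
    }

    static bool RunConverter(string inputFile, CancellationToken ct)
    {
        // Placeholder for starting the batch command, watching its log, and killing
        // the process if ct is signalled; details depend on the third-party tool.
        return true;
    }
}

The service's main loop can then inspect the returned ConversionResult to decide whether to retry later (no token), mark the row as canceled, or mark it as done.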
What I'm hearing is that the biggest bottleneck in your process is the conversion... pooling / object mapping / direct SQL don't sound as important as the conversion bottleneck.
There are several ways to solve this depending on your environment and what your constraints are. Good, fast, and cheap... pick 2.
As far as "best practice" go, there are large scale solutions (ESBs, Message Queue based etc), there are small scale solutions (console apps, batch files, powershell scripts on Windows scheduler, etc) and the in-between solutions (this is typically where the "Good Enough" answer is found). The volume of the stuff you need to process should decide which one is the best fit for you.
Regardless of which you choose above...
My first step will be to eliminate variables...
How much volume will you be processing? If you wrote something that's not optimized but works, will that be enough to process your current volume? (e.g. a console app to be run using the Windows Task Scheduler every 10 - 15 seconds and gets killed if it runs for more than say 15 minutes)
Does the third party dll support multi-threading? If no, that eliminates all your multi-threading related questions and narrows down your options on how to scale out.
You can then determine what approach will fit your problem domain...
will it be the same service deployed on several machines, each hitting the database every 10-15 seconds, or
will it be one service on one machine using multi-threading?
will it be something else (pooling might or might not be in play)?
Once you get the answers above, the next question is:
will the work required fit within the allocated budget and time constraints of your project? If not, go back to the questions above and see if there are any you can answer differently that would change the answer to this question to yes.
I hope that these questions help you get to a good answer.

Parallel Execution of a .net application

I want to write a small console application (C# 4.0/4.5) that will serve as a logger to a remote database. Said application could be called from numerous peripheral automation components/programs, not all of which will be .NET based. (Calls would be made via the command line, e.g., logme.exe appID, taskID, statusID, msg)
Question: what would happen if two or more of these programs were to execute the logger either a) at the same time, or b) while it's already in use elsewhere?
I'm unsure of the execution fundamentals and whether I should be concerned with this or not.
Thank you
Nothing bad would happen; your process would just run multiple times, as if you double-clicked it multiple times. As long as you're not trying to write to the same file multiple times or something similar, you should be fine (in the case of writing to a remote database, it should be handled for you).
If you want a more certain answer you'll have to post your console application code here so that we can tell you if nothing raises a red flag.
Make sure that any shared components can be handled in a multi-process environment. If that means using a mutex to lock a shared resource across multiple instances, then that needs to be done.
See Basic Synchronization by Joe Albahari on implementing differing locking operations.
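As an illustration only, a minimal sketch that assumes the logger also appends to a shared local file in addition to the remote database (the mutex name "Global\LogMeFileLock" and the file path are assumptions):

using System;
using System.IO;
using System.Threading;

class SharedFileLogger
{
    static void Main(string[] args)
    {
        using (var mutex = new Mutex(initiallyOwned: false, name: @"Global\LogMeFileLock"))
        {
            mutex.WaitOne();   // wait for any other logme.exe instance to finish writing
            try
            {
                File.AppendAllText("log.txt", string.Join(",", args) + Environment.NewLine);
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }
}

For the remote database itself this is unnecessary, since the database server serialises concurrent inserts for you.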

Pass information between separate console and Windows applications

I have two separate programs, one is a console application, and the other one is a windows application.
My windows application:
Has a graphical interface, buttons, and other functions.
One of the buttons is named "research": when I click on it, I launch the console application with this line of code:
string strResult = ProcessHelper.LaunchProcessWaitForPipedResult("MyExecFile.exe", strArguments, 10 * 60 * 1000, true); // 10 mins max
My console Application:
Does a query on all existing files in a directory.
My problem:
I want to create a progress-bar on the windows application to show the progress of the console application. The problem is I don't know how to pass this information between the two processes. The only restriction is to not use a database or file.
Given two processes in the same user session, and wanting to avoid any communication outside that session I would look at three options:
1. Using named pipes.
The parent process creates a named pipe using a random name (and confirms that name is not in use by opening it). It passes that name to the child process. A simple protocol is used that allows the child to send updates (a rough sketch follows the challenge list below).
There are a number of challenges to overcome:
Getting the logic to ensure the name is unique right (named pipe names are global).
Ensuring no other process can connect (the default named pipe ACL limits connections to the session: this might be enough).
Handling the case where a different parent process does not support progress updates.
Handling the child or parent crashing.
Avoiding getting too clever with the communication protocol, but allowing room for growth (what happens when more than a simple progress bar is wanted?)
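A rough sketch of option 1, assuming the parent passes the pipe name to the child on its command line and the child reports a simple percentage, one line per update:

using System;
using System.IO;
using System.IO.Pipes;

class ProgressPipeSketch
{
    // Parent (Windows application) side: create the pipe and read progress updates.
    static void RunParent(string pipeName)
    {
        using (var server = new NamedPipeServerStream(pipeName, PipeDirection.In))
        {
            server.WaitForConnection();
            using (var reader = new StreamReader(server))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    Console.WriteLine("Progress: " + line + "%");   // update the progress bar here
                }
            }
        }
    }

    // Child (console application) side: connect once, then send updates as work progresses.
    static void RunChild(string pipeName)
    {
        using (var client = new NamedPipeClientStream(".", pipeName, PipeDirection.Out))
        {
            client.Connect(5000);               // give up after 5 seconds if the parent is gone
            using (var writer = new StreamWriter(client) { AutoFlush = true })
            {
                for (int percent = 0; percent <= 100; percent += 10)
                {
                    // ... process the next slice of files here ...
                    writer.WriteLine(percent);  // one line per update keeps the protocol trivial
                }
            }
        }
    }
}

In the real application the parent would read on a background thread (or asynchronously) so the GUI stays responsive.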
2. Using Shared Memory
In this case names of objects are, by default, local to the session. By default this is more secure.
The parent process creates a sufficiently large amount of shared memory (for a simple progress update: not much), a mutex and an event.
The parent process then, concurrently with the GUI, waits for the event to be signalled; when it is, it enters the mutex and reads the content of the shared memory. It then resets the event and leaves the mutex.
Meanwhile, to send an update, the child enters the mutex, updates the memory and sets the event before leaving the mutex. (A rough sketch follows the challenge list below.)
The challenges here include:
Defining the layout of the shared memory. Without a shared assembly this is likely to be error prone.
Avoiding others using the shared memory and synchronisation objects. .NET makes things harder here: in Win32 I would make the handles inheritable thus not needing to name the objects (except for debugging) and pass to the child directly.
Getting the sequencing of shared memory, mutex and event correct is critical. Memory corruption and more subtle bugs await any errors.
It is harder to do variable length data with shared memory, not an issue for a simple progress count but customers always want more.
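A rough sketch of option 2 using .NET 4 memory-mapped files; the names "ProgressMap", "ProgressMutex" and "ProgressEvent" are placeholders and would need to be generated and passed to the child as discussed above:

using System;
using System.IO.MemoryMappedFiles;
using System.Threading;

class SharedMemoryProgressSketch
{
    // Parent side: create the shared objects, then wait for updates.
    static void RunParent()
    {
        using (var map = MemoryMappedFile.CreateNew("ProgressMap", capacity: 4))
        using (var accessor = map.CreateViewAccessor())
        using (var mutex = new Mutex(false, "ProgressMutex"))
        using (var signal = new EventWaitHandle(false, EventResetMode.AutoReset, "ProgressEvent"))
        {
            while (true)
            {
                signal.WaitOne();                          // child signalled new data
                mutex.WaitOne();
                int percent;
                try { percent = accessor.ReadInt32(0); }   // a single int at offset 0
                finally { mutex.ReleaseMutex(); }
                Console.WriteLine("Progress: " + percent + "%");
                if (percent >= 100) break;
            }
        }
    }

    // Child side: write the current percentage and signal the parent.
    static void ReportProgress(int percent)
    {
        using (var map = MemoryMappedFile.OpenExisting("ProgressMap"))
        using (var accessor = map.CreateViewAccessor())
        using (var mutex = Mutex.OpenExisting("ProgressMutex"))
        using (var signal = EventWaitHandle.OpenExisting("ProgressEvent"))
        {
            mutex.WaitOne();
            try
            {
                accessor.Write(0, percent);
                signal.Set();
            }
            finally { mutex.ReleaseMutex(); }
        }
    }
}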
Summary
I would probably look at named pipes in the first place (or perhaps custom WMI types if I wanted greater flexibility). BUT I would do that only after trying everything to avoid needing multiple processes in the first place. A shared library plus console wrapper for others, while I use the library directly would be a far easier option.

A good broadcast mechanism for inhouse .net applications to announce their location and version?

I would like to provide a large number of inhouse .net applications with a lightweight way to announce that they are being used. My goal is to keep track of which users might benefit from support check-ins and/or reminders to upgrade.
This is on an inhouse network. There is definitely IP connectivity among all the machines, and probably UDP. (But probably not multicast.)
Writing to a known inhouse share or loading a known URL would be possibilities, but I would like to minimize the impact on the application itself as completely as possible, even at the expense of reliability. So I would rather not risk a timeout (for example if I'm accessing some centralized resource and it has disappeared), and ideally I would rather not launch a worker thread either.
It would also be nice to permit multiple listeners, which is another reason I am thinking about broadcasting rather than invoking a service.
Is there some kind of fire-and-forget broadcast mechanism I could use safely and effectively for this?
There are certainly many options for this, but one that is very easy to implement and meets your criteria is an Asynchronous Web Service call.
This does not require you to start a worker thread (the Framework will do that behind the scenes). Rather than use one of the options outlined in that link to fetch the result, simply ignore the result since it is meaningless to the calling app.
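As a rough sketch of the fire-and-forget idea, using WebClient's async upload as a stand-in for a generated web-service proxy's async method (the endpoint URL is an assumption):

using System;
using System.Net;

class UsageAnnouncer
{
    static void AnnounceUsage(string appName, string version)
    {
        try
        {
            var client = new WebClient();
            // The completion callback only cleans up; the result is deliberately ignored.
            client.UploadStringCompleted += (s, e) => client.Dispose();
            client.UploadStringAsync(
                new Uri("http://intranet/usage-tracker/announce"),   // hypothetical endpoint
                appName + " " + version + " " + Environment.UserName);
        }
        catch
        {
            // Best effort only: announcing must never break or block the application.
        }
    }
}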
I did something similar, though not exactly a "broadcast".
I have an in-house tool that several non-techies in the company use. I have it check a network share for a specific EXE (the same EXE you would download if you wanted to use it) and compare the version # of that file with the executing assembly. If the one on the network is newer, it alerts the user to download the new one.
A lot simpler than trying to set up an auto updater for something that will only be used within the same building as me.
If upgrading is not an issue (i.e. there are no cases where using the old version is better), you can do what I did with something similar:
The application that people actually launch is an updater program: it checks the file version and timestamp on a network share and, if a newer version exists, copies it to the program directory. It then runs the program (whether it was updated or not).
// Requires System.IO and System.Diagnostics; 'local' and 'remote' are the paths
// to the local copy and the copy on the network share.
var current = new FileInfo(local);
var latest = new FileInfo(remote);
// First run: no local copy yet, so just pull it down.
if (!current.Exists)
    latest.CopyTo(local);
var currentVersion = FileVersionInfo.GetVersionInfo(local);
var latestVersion = FileVersionInfo.GetVersionInfo(remote);
// Overwrite the local copy if the share has a newer build.
if (latest.CreationTime > current.CreationTime || latestVersion.FileVersion != currentVersion.FileVersion)
    latest.CopyTo(local, true);
Process.Start(local);
I also have the program itself check to see if the updater needs updating (as the updater can't update itself due to file locks)
After some experimentation, I have been getting good results using Win32 mailslots.
There is no official managed wrapper, but the functions are simple to use via PInvoke, as demonstrated in examples like this one.
Using a 'domain' mailslot provides a true broadcast mechanism, permitting multiple listeners and no requirement for a well-known server.
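For reference, a rough sketch of the announcing (writer) side via P/Invoke, under the assumption that each listener has created a mailslot named "appannounce" with CreateMailslot; writing to \\*\mailslot\appannounce broadcasts the datagram to every listener in the domain:

using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;
using Microsoft.Win32.SafeHandles;

static class MailslotAnnouncer
{
    const uint GENERIC_WRITE = 0x40000000;
    const uint FILE_SHARE_READ = 0x00000001;
    const uint OPEN_EXISTING = 3;

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    static extern SafeFileHandle CreateFile(
        string lpFileName, uint dwDesiredAccess, uint dwShareMode,
        IntPtr lpSecurityAttributes, uint dwCreationDisposition,
        uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    public static void Announce(string message)
    {
        using (var handle = CreateFile(@"\\*\mailslot\appannounce",
            GENERIC_WRITE, FILE_SHARE_READ, IntPtr.Zero, OPEN_EXISTING, 0, IntPtr.Zero))
        {
            if (handle.IsInvalid)
                return;   // fire-and-forget: no listener or no rights, just give up

            using (var stream = new FileStream(handle, FileAccess.Write))
            {
                // Keep announcements short; broadcast mailslot datagrams have a small size limit.
                byte[] payload = Encoding.UTF8.GetBytes(message);
                stream.Write(payload, 0, payload.Length);
            }
        }
    }
}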

How to query a variable in another running application in c#?

I have an app that, when launched, checks for duplicate processes of itself.
That part I have right - but what I need is to check a state variable in the original running process in order to run some logic.
So: how do I make a variable (e.g. bool) available publicly to other applications so they can query it?
There are a bunch of ways to do this. A very primitive way would be to read/write from a file. The old Win32 way would be to use PostMessage. The more .NET way would be to use Remoting or WCF with named pipes.
.NET 4 is also getting support for memory-mapped files.
Here is a pretty thorough-looking article describing a few different approaches, including support for memory-mapped files outside of .NET 4:
http://www.codeproject.com/KB/threads/csthreadmsg.aspx
The easiest: Create a file, and write something in it.
More advanced, and when done correctly more robust, is using WCF: you use named pipes to set up a communication channel on the local computer only.
If you're using a Mutex to check whether another process is running (you should be), you could use another Mutex whose locked state would be the boolean flag you're looking for.
The standard way of doing this is to use the Windows API to create and lock a mutex. The first app to open will create and lock the mutex. Any subsequent executions of the app will not be able to get it and can then shutdown.
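A minimal sketch of that standard pattern (the mutex name is an arbitrary example):

using System;
using System.Threading;

class SingleInstanceCheck
{
    static void Main()
    {
        bool createdNew;
        using (var mutex = new Mutex(true, @"Global\MyAppSingleInstance", out createdNew))
        {
            if (!createdNew)
            {
                // Another instance already owns the mutex: query its state, then shut down.
                return;
            }
            // ... run the application; the mutex is held until this process exits ...
        }
    }
}

A second named mutex (or a named event) can then serve as the shared boolean flag mentioned above, since other processes can test whether it is currently held or signalled.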
