If I deploy a C# console app, which does the following:
reads message (ActiveMQ)
processes message contents
writes result to database (SQL Server)
Would there be any issues with running this multiple times e.g. what if I created a batch file and ran 100 instances? Would there be any conflict given that each instance would be using the same shared DLLs e.g. Apache.NMS.ActiveMQ.
The other option would be to deploy the app multiple times, but I'd rather not have to manage duplicated folders. I'm also avoiding threading at the moment but that will be an option for further development in future.
Just want to clarify what happens with those DLLs, and check that there wouldn't be a threading type conflict, e.g. one instance writing the results of another instance's processing to the database...
No, there will be no problem with loading the same DLL files into multiple processes as you describe. You would only run into problems running multiple instances of the same application if the process needed exclusive access to a shared resource, like a file. With regard to writing to a database, as long as you design your application so that multiple clients can write data without overwriting data or causing some sort of inconsistency with the domain integrity of the data then again, no problem.
However, I would strongly suggest you look at making your application multi-threaded if it is concurrency you need, or Application Domains if it is isolation you need. Running multiple processes is much more expensive in terms of resources than either of these two options.
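For illustration, here is a minimal sketch of the multi-threaded route: one process running many concurrent workers via Tasks. ProcessMessages is a hypothetical stand-in for your existing read/process/write loop, not code from your app.

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    // Hypothetical stand-in for the existing loop:
    // read a message from ActiveMQ, process it, write the result to SQL Server.
    static void ProcessMessages(int workerId, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            // ... read, process, write ...
        }
    }

    static void Main()
    {
        var cts = new CancellationTokenSource();
        const int workerCount = 100;

        // One process, many concurrent workers: each Task replaces one
        // separate console-app instance.
        Task[] workers = Enumerable.Range(0, workerCount)
            .Select(i => Task.Run(() => ProcessMessages(i, cts.Token)))
            .ToArray();

        Console.ReadLine(); // run until Enter is pressed
        cts.Cancel();
        Task.WaitAll(workers);
    }
}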
Related
I'm currently working on a C# project of an application we'd like to develop. We're brainstorming over the question of sharing the data between users. We'd like to be able to specify a folder where all the files of the application are going to be saved and we'd like to be able to save them on a shared folder (server, different PC or Mac, Nas, etc.).
The deployment would be like so:
Installation on the first PC: we choose a network drive, share, whatever, and create all the files for the application in this location.
On the second PC we install the application and choose the same location (on the network); the application doesn't create anything, it sees that the data already exists and uses those files as the application's data
Same thing on the other clients
The application's files are going to be documents (most likely XML formatted documents) and when opening the application we want to show all the existing documents. The thing is, we don't only want to have the list of documents and be able to edit their content, we would also like to be able to edit the documents' properties, so in a way we'd like a file (SQLite, XML, whatever) representing the list of all the documents and their attributes. Same thing for a list of addresses.
I know all that looks exactly like a client/server solution with a database, but that solution is out of the question. I was first looking at SQLite for my data files, but I know concurrency can be a real problem and file locking doesn't work well. The thing is, I would have the same problem with simple XML files (refreshing the content when several users are working, accessing locked files).
So I guess my final question is: Is it feasible? Is there an alternative I didn't see which would allow us to do that more easily?
EDIT:
OK, I'm not responding to every post or comment, because I'm currently testing concurrency with SQLite. What I did, and please correct me if the way I'm testing this is wrong, is launch X BackgroundWorkers which all insert records into a sample database (which is recreated every time I start the application). I tried launching 100 iterations of INSERT into the database via these BackgroundWorkers.
Of course concurrency works with one application running; it simply waits for the last BackgroundWorker to do its job and then writes the next record. I also tried inserting at (almost) the same time, meaning I put a loop in every BackgroundWorker waiting for a modulo-5 timestamp (every 5 seconds, every BackgroundWorker runs). Again, it waits for the previous insert query to end before doing the next, and everything works fine. I even tried it with 500 BackgroundWorkers and it worked fine.
I then tried launching my app several times and running them simultaneously. When doing this I did have some issues. With two instances of my app it was still working fine, but when trying this with 4-5 instances it got really buggy, and I got two types of error: 1. database is locked, 2. disk I/O failure. But mostly locked databases.
What I did was pretty intensive; in the scenario of my application it will never come to 5 processes trying to simultaneously insert 500 rows at the same time (maybe I'll get a concurrency of two or three connections). But what really bugged me, and what makes me think my testing method is not a good one, is that I got these errors trying to work on a database on a network share, on a NAS AND on my own HDD. Every time, it worked for maybe 30-40 queries and then threw a "database is locked" error.
Am I testing it wrong? Maybe I shouldn't be trying so hard to make this work, but I'm still not convinced that SQLite is not a good alternative to what I'm trying to do, since the concurrency is going to be really small.
With your optimistic/pessimistic locking, you are ultimately trying to build a database. Also, you WILL have issues with consistency while trying to keep multiple files in sync with each other. Think about what happens if you update the "metadata" file and the write fails half-way through because of a network blip. File corruption will ensue, and you will be left trying to reconstruct things from backups.
I would suggest a couple of likely solutions:
1) Host the content yourselves, and let them be pure clients (cloud based deployments are ideal for this). Most network/firewall issues can be circumvented by using HTTP as your transport (web services).
2) Have one of the workstations be the "server", which keeps its data files on the NFS share. This will give you transactional integrity, incremental backups, etc. There are lots of good embedded database management systems to help you manage this complexity. MS SQL Server even has some great options for this.
You're right, SQLite uses file locks on the database file, so storing all the data files in one database would bring a write-starvation problem for editing your documents.
Maybe it's a better choice to implement simple optimistic/pessimistic locking yourself at the individual-file level? For example, in the case of a pessimistic lock you just don't allow anyone to edit a particular file if somebody is already in the process of editing it. In this case you hold a lock on just one file, not on the entire database. If the possibility of a conflict (editing a particular file at the same time) is pretty low, it is better to go with optimistic locking.
Simple optimistic locking implementation:
When a user gets a file for reading, it's OK, no problem here. If a user gets a file for editing, you could calculate a hash for this file (or get the timestamp of the file's last update), and then, when the user tries to save the edited file, compare the current (at the moment of saving) hash/timestamp to make sure the file has not been changed by somebody else. If the file has not been changed then it's OK to save it. If the file has been changed, then the current user is out of luck and you need to inform him about it. This optimistic scenario is nice when the possibility of this "out of luck" case is pretty low. Otherwise it's better to stick with pessimistic locking, where you do not allow a user to even start editing a file if somebody else is doing it.
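A minimal sketch of that hash-based check (the class name and the SHA-256 choice are illustrative assumptions, not prescribed by the answer above):

using System;
using System.IO;
using System.Security.Cryptography;

static class OptimisticFileLock
{
    // Taken when the user opens the file for editing.
    public static string GetVersionStamp(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return Convert.ToBase64String(sha.ComputeHash(stream));
        }
    }

    // Save only if nobody else changed the file since it was opened.
    public static bool TrySave(string path, string newContent, string stampWhenOpened)
    {
        if (GetVersionStamp(path) != stampWhenOpened)
        {
            return false; // somebody else saved first; tell the user they are "out of luck"
        }
        File.WriteAllText(path, newContent);
        return true;
    }
}

Note there is still a small window between the comparison and the write; to close it you would need to hold the file open exclusively while comparing and saving.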
I have two separate programs, one is a console application, and the other one is a windows application.
My windows application:
Has a graphical interface, buttons, and other functions.
One of the buttons, named "research": when I click on it, I launch the console application with this line of code:
string strResult = ProcessHelper.LaunchProcessWaitForPipedResult("MyExecFile.exe", strArguments, 10 * 60 * 1000, true); // 10 mins max
My console Application:
Does a query on all existing files in a directory.
My problem:
I want to create a progress-bar on the windows application to show the progress of the console application. The problem is I don't know how to pass this information between the two processes. The only restriction is to not use a database or file.
Given two processes in the same user session, and wanting to avoid any communication outside that session I would look at three options:
1. Using named pipes.
The parent process creates a named pipe using a random name (and confirms that name is not in use by opening it). It passes that name to the child process. A simple protocol is used that allows the child to send updates (a minimal sketch follows the list of challenges below).
There are a number of challenges to overcome:
Getting the name-uniqueness logic right (named pipe names are global).
Ensuring no other process can connect (the default named pipe ACL limits connections to the session: this might be enough).
Handling the case where a different parent process does not support progress updates.
Handling the child or parent crashing.
Avoiding getting too clever with the communication protocol, but allowing room for growth (what happens when more than a simple progress bar is wanted?)
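Here is a rough sketch of that approach, assuming System.IO.Pipes and that the child executable receives the pipe name as its command-line argument; the one-number-per-line protocol is just an illustration:

using System;
using System.Diagnostics;
using System.IO;
using System.IO.Pipes;

class ProgressPipeParent
{
    static void Main()
    {
        // Pipe names are global, so a GUID makes a collision very unlikely.
        string pipeName = "progress-" + Guid.NewGuid().ToString("N");

        using (var server = new NamedPipeServerStream(pipeName, PipeDirection.In))
        {
            // The child gets the pipe name on its command line.
            Process.Start("MyExecFile.exe", pipeName);

            server.WaitForConnection();
            using (var reader = new StreamReader(server))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    // Trivial protocol: the child sends one "percent complete" value per line.
                    Console.WriteLine("Progress: {0}%", line);
                }
            }
        }
    }
}

// In the child (console) process:
// using (var client = new NamedPipeClientStream(".", args[0], PipeDirection.Out))
// {
//     client.Connect(5000);
//     using (var writer = new StreamWriter(client) { AutoFlush = true })
//     {
//         writer.WriteLine(percentDone); // repeat as the work progresses
//     }
// }

In a real GUI app the reading loop would run on a background thread and marshal each update onto the UI thread before touching the progress bar.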
2. Using Shared Memory
In this case the names of objects are, by default, local to the session; this default is more secure.
The parent process creates a sufficiently large amount of shared memory (for a simple progress update: not much), a mutex and an event.
The parent process then, concurrently with the GUI, waits for the event to be signalled; when it is, it enters the mutex and reads the content of the shared memory. It then resets the event and leaves the mutex.
Meanwhile, to send an update, the child enters the mutex, updates the memory and sets the event before leaving the mutex.
The challenges here include:
Defining the layout of the shared memory. Without a shared assembly this is likely to be error prone.
Avoiding others using the shared memory and synchronisation objects. .NET makes things harder here: in Win32 I would make the handles inheritable thus not needing to name the objects (except for debugging) and pass to the child directly.
Getting the sequencing of shared memory, mutex and event correct is critical. Memory corruption and more subtle bugs await any errors.
It is harder to do variable-length data with shared memory; not an issue for a simple progress count, but customers always want more.
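A minimal sketch of that scheme using MemoryMappedFile, a named Mutex and an EventWaitHandle. The object names and the single-int layout are illustrative assumptions, the parent loop would run on a worker thread rather than the GUI thread, and an auto-reset event is used so the parent does not have to reset it explicitly:

using System;
using System.IO.MemoryMappedFiles;
using System.Threading;

class SharedMemoryProgress
{
    // Illustrative names; by default these kernel object names are session-local.
    const string MapName   = "MyApp.Progress.Map";
    const string MutexName = "MyApp.Progress.Mutex";
    const string EventName = "MyApp.Progress.Event";

    // Parent: create the objects, then wait for updates (off the GUI thread).
    static void Parent()
    {
        using (var map = MemoryMappedFile.CreateNew(MapName, 4))
        using (var mutex = new Mutex(false, MutexName))
        using (var signal = new EventWaitHandle(false, EventResetMode.AutoReset, EventName))
        using (var view = map.CreateViewAccessor())
        {
            int percent = 0;
            while (percent < 100)
            {
                signal.WaitOne();          // child signalled an update
                mutex.WaitOne();
                percent = view.ReadInt32(0);
                mutex.ReleaseMutex();
                Console.WriteLine("Progress: {0}%", percent);
            }
        }
    }

    // Child: open the same objects and publish one update.
    static void ReportProgress(int percent)
    {
        using (var map = MemoryMappedFile.OpenExisting(MapName))
        using (var mutex = Mutex.OpenExisting(MutexName))
        using (var signal = EventWaitHandle.OpenExisting(EventName))
        using (var view = map.CreateViewAccessor())
        {
            mutex.WaitOne();
            view.Write(0, percent);        // the whole "layout" here is a single int
            mutex.ReleaseMutex();
            signal.Set();
        }
    }
}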
Summary
I would probably look at named pipes in the first place (or perhaps custom WMI types if I wanted greater flexibility). BUT I would do that only after trying everything to avoid needing multiple processes in the first place. A shared library plus a console wrapper for others, while I use the library directly, would be a far easier option.
I have a C# console app which I'm deploying around 20 times (with different config settings) and running. As you might imagine it's hard to keep an eye on what's happening with 20 apps running (I'm eventually going to deploy these as windows services), so is there anything that can show the output of these in one place easily?
I've thought about log files but these could get big quite fast, and it is a lot of files to open and look at - I just want to have some output to check things are still running as expected.
Edit:
I'm going to be writing errors and stop/start information to the database. What I'm talking about here is the general processing information, which isn't all that relevant to revisit, but interesting to look at while it's running in the console app.
I have successfully used log4net and its configurable UdpAppender. Then you can point all the UdpAppenders to a single machine where you can receive the UDP messages, with Log4View for example.
Since it's configurable, you can use it when you install and debug in production and then increase the logging level to only output ERROR messages instead of DEBUG or INFO messages.
http://logging.apache.org/log4net/
http://www.log4view.com
http://logging.apache.org/log4net/release/config-examples.html
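Normally the UdpAppender is set up in the XML configuration (see the config-examples link above); as a rough sketch, a programmatic equivalent could look something like this, with the address and port as placeholders for your log-viewer machine:

using System.Net;
using log4net;
using log4net.Appender;
using log4net.Config;
using log4net.Layout;

static class UdpLoggingSetup
{
    public static void Configure()
    {
        // Log4View and similar viewers understand the log4j-style XML layout.
        var layout = new XmlLayoutSchemaLog4j();
        layout.ActivateOptions();

        var appender = new UdpAppender
        {
            RemoteAddress = IPAddress.Parse("192.168.1.10"), // placeholder collector machine
            RemotePort = 8080,                               // placeholder port
            Layout = layout
        };
        appender.ActivateOptions();

        BasicConfigurator.Configure(appender);
    }
}

// In each of the 20 apps:
// UdpLoggingSetup.Configure();
// ILog log = LogManager.GetLogger(typeof(Program));
// log.Info("still processing as expected...");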
Maybe because I come from a heavy DB background, but how about using SQL Server with a Log table to track activity across different apps?
DBs are geared up for concurrency and will easily handle multiple applications inserting data into the same Log table; you also get the option of slicing and dicing the data as much as you would like, taking advantage of the aggregation functions already available in a DB environment.
If you go down that route, you will probably need to consider maintaining that table (Log retention period, etc.).
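For illustration, writing to such a Log table from each app could look like the sketch below (the table and column names are made up for the example, not taken from the answer):

using System;
using System.Data.SqlClient;

static class DbLog
{
    // Assumes a hypothetical table:
    //   CREATE TABLE AppLog (LoggedAt DATETIME, AppName VARCHAR(100), Message VARCHAR(MAX))
    public static void Write(string connectionString, string appName, string message)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "INSERT INTO AppLog (LoggedAt, AppName, Message) VALUES (@at, @app, @msg)", conn))
        {
            cmd.Parameters.AddWithValue("@at", DateTime.UtcNow);
            cmd.Parameters.AddWithValue("@app", appName);
            cmd.Parameters.AddWithValue("@msg", message);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}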
You could also potentially start using tools such as Splunk to collate all the log data, and start correlating app failures with system or environment failures (if these are being tracked).
I'd second Mikael Östberg and recommend using a logger library (log4net or NLog). There are many options for sending messages to a database, queues, etc. Since you can turn the logging on or off easily, you can even keep it in your services as a monitoring hook in case something weird happens.
I have a data file and from time to time I need to write a change to the file. The change consists of changing information in more than one place. For example, changing some data near the end of the file and also changing some information near the start. I want the two separate writes to either both succeed or both fail, otherwise it is left in an uncertain state and effectively corrupted. Is there any built-in support for this scenario in .NET or in general?
If not, then how do others solve this issue? How does a database on Windows solve this issue?
UPDATE: I do not want to use the Transactional NTFS capability because it is not available on older version of Windows such as XP and it is slow in the file overwrite scenario as described above.
Databases basically use a journal concept (at least the ones I'm aware of). The idea is that a write operation is recorded in the journal until the writer commits the transaction. (Sure, that's just a basic, simplified description.)
In your case, it could be a copy of your file where you're going to write the data, and if everything finishes successfully, you substitute the original file with its copy.
Substitution is: rename the original file as the old one, rename the backup copy as the original.
If substitution fails, this is a critical error that the application should handle via fault-tolerance strategies. It could be that it informs the user about a failed save operation and tries to recover. By the way, at any moment you have both copies of your file: the one from when the write operation started, and the one from when the write operation finished.
We used these techniques on past projects (VS-IDE-like systems for industrial control) with pretty good success.
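As a minimal sketch of that write-to-a-copy-then-substitute idea in C# (File.Replace performs both renames; the .tmp/.old naming is just an assumption for the example):

using System.IO;

static class SafeFileUpdate
{
    public static void Save(string path, byte[] newContent)
    {
        string temp = path + ".tmp";
        string backup = path + ".old";

        // Write the full updated content to a temporary copy first;
        // if this fails, the original file is untouched.
        File.WriteAllBytes(temp, newContent);

        // Substitution: the original becomes the backup, the copy becomes the original.
        // If this fails, both copies are still on disk for recovery.
        File.Replace(temp, path, backup);
    }
}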
If you are using Windows 6 or later (Vista/7/2008/2008R2) the NTFS filesystem supports transactions (including within a distributed transaction): but you will need to use P/Invoke to call Win32 APIs (see this question).
If you need to run on older versions of Windows, or non-NTFS partitions you would need to perform the transactions yourself. This is decidedly non-trivial: getting full ACID functionality while handling multiple processes (including remote access via shares) across process and system crashes even with the assumption that only your access methods will be used (some other process using normal Win32 APIs would of course break things).
In this case a database will almost certainly be easier: there are a number of in-process databases (SQL Compact Edition, SQLite, ...) so a database doesn't require a server process.
Question: I currently store ASP.net application data in XML files.
Now the problem is that I have asynchronous operations, which means I ran into the problem of simultaneous write access to an XML file...
Now, I'm considering moving to an embedded database to solve the issue.
I'm currently considering SQLite and embeddable Firebird.
I'm not sure however whether SQLite or Firebird can handle multiple concurrent write accesses.
And I certainly don't want the same problem again.
Anybody knows ?
SQLite certainly is better known, but which one is better - SQLite or Firebird? I tend to say Firebird, but I don't really know.
No MS Access or MS SQL Express recommendations please, I'm a sane person.
I will choose Firebird for many reasons, and for this one too:
Although it is transactional, SQLite does not support concurrent transactions, so if your embedded application needs two or more connections, they must be serialized. An embedded Firebird database is simple to upgrade to a fully shared database - just change the shared library.
Maybe you can also check this.
SQLite can be configured to gracefully handle simultaneous writes in most situations. What happens is that when one thread or process begins a write to the db, the file is locked. When a second write is attempted and encounters the lock, it backs off for a short period before attempting the write again, until it succeeds or times out. The timeout is configurable, but otherwise all this happens without the application code having to do anything special except enabling the option, like this:
// set SQLite to wait and retry for up to 100ms if database locked
sqlite3_busy_timeout( db, 100 );
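From C#, a roughly equivalent move (a sketch assuming one of the managed SQLite wrappers, here Microsoft.Data.Sqlite) is to issue the same setting as a PRAGMA on each connection:

using Microsoft.Data.Sqlite;

static class SqliteBusyTimeout
{
    public static void Apply(SqliteConnection conn)
    {
        using (var cmd = conn.CreateCommand())
        {
            // Same effect as sqlite3_busy_timeout(db, 100): wait and retry
            // for up to 100 ms whenever another connection holds the lock.
            cmd.CommandText = "PRAGMA busy_timeout = 100;";
            cmd.ExecuteNonQuery();
        }
    }
}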
All this works very well and without any difficulty, except in two circumstances:
If an application does a great many writes, say a thousand inserts, all in one transaction, then the database will be locked up for a significant period and can cause problems for any other application attempting to write. The solution is to break up such large writes into separate transactions, so other applications can get access to the database.
If the database is shared by different processes running on different machines, sharing a network mounted disk. Many operating systems have bugs in network mounted disks that make file locking unreliable. There is no answer to this. If you need to share a db on a network mounted disk, you need another database engine such as MySQL.
I do not have any experience with Firebird. I have used SQLITE in situations like this for many applications over several years.
Have you looked into Berkeley DB with the SQLite API for SQL support?
It sounds like SQLite will be a good fit. We use SQLite in a number of production apps; it supports (actually, it prefers) transactions, which go a long way toward handling concurrency.
transactional sqlite? in C#
I would add #3 to the list from ravenspoint above: if you have a large call-center or order-processing center, say, where dozens of people might be hitting the SAVE button at the same time, even if each is updating or inserting just one record, you can run into problems using the busy timeout approach.
For scenario #3, a true SQL engine that can serialize is ideal; less ideal but serviceable is a dbms that can do byte-range record locking of a shared-file. But be aware that even a byte-range record lock will be inadequate for a large number of concurrent writes when new records are appended to the end of the file like a caboose on the end of a freight train, so that multiple processes are trying at the same time to set a lock on the same byte-range. On the other hand, a byte-range record locking scheme coupled with a hashed-key sparse file approach (e.g. the old Revelation/OpenInsight database for LANs) will be far superior to ISAM for this scenario.
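To illustrate the byte-range record locking mentioned above, here is a minimal C# sketch using FileStream.Lock/Unlock (the fixed-offset record layout is an assumption for the example); note that, as the answer points out, appends that all contend for the same end-of-file range will still serialize:

using System.IO;

static class RecordLock
{
    // Lock only the byte range of the record being rewritten, so other
    // processes sharing the file can still work on other records.
    public static void UpdateRecord(string path, long offset, byte[] record)
    {
        using (var fs = new FileStream(path, FileMode.Open,
                                       FileAccess.ReadWrite, FileShare.ReadWrite))
        {
            fs.Lock(offset, record.Length);   // byte-range lock, not a whole-file lock
            try
            {
                fs.Seek(offset, SeekOrigin.Begin);
                fs.Write(record, 0, record.Length);
                fs.Flush();
            }
            finally
            {
                fs.Unlock(offset, record.Length);
            }
        }
    }
}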