First off, I'm talking about some legacy code that we are trying to avoid changing as much as possible. Also, my experience with Windows services and WCF is a bit limited, so some of this may sound like a newbie question. Just to give a bit of context before the question itself.
We have an existing service that loops. It checks via a database call to see if it has records to process. If it does not find any records, it sleeps for 30 seconds and then wakes back up to try again.
I would like to add an entry point to this service that would allow me to pass a record to it, in addition to it processing the records from the database. So the basic flow would be:
Loop
* Read record from database
* If no record from DB, process any records that were passed in via the entry point.
* No records at all, sleep for 30 seconds.
My concern is this: is it possible to implement this in one service, so that I keep the looping process but also allow calls to come in at any time and add items to a queue that gets processed within the loop? Specifically, I'm worried about concurrency and keeping the loop and the listener from stepping on each other.
I know this question may not be worded quite right, but I am fairly new to working with this. Any help would be appreciated.
> My concern is with concurrency and keeping the loop and the listener from stepping on each other.
This shouldn't be an issue, provided you synchronize access correctly.
The simplest option might be to use a thread-safe collection, such as a `ConcurrentQueue<T>`, to hold your items to process. The WCF service can just add items to the collection without worry, and your next processing step would handle them. The synchronization in this case is really minimal, as the queue is already fully thread-safe.
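As a minimal sketch of that shape (the `Record` type, the `TryReadNextRecordFromDb` stub, and the `Process` body are placeholders for your existing code):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public class Record { /* your record fields */ }

public class RecordProcessor
{
    private readonly ConcurrentQueue<Record> _incoming = new ConcurrentQueue<Record>();

    // Called from the WCF operation; ConcurrentQueue is safe to call from any thread.
    public void Submit(Record record)
    {
        _incoming.Enqueue(record);
    }

    // The existing service loop, extended to drain the queue when the DB is empty.
    public void RunLoop(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            Record record;
            if (TryReadNextRecordFromDb(out record))      // existing DB call
                Process(record);
            else if (_incoming.TryDequeue(out record))    // records pushed in via WCF
                Process(record);
            else
                Thread.Sleep(TimeSpan.FromSeconds(30));   // nothing at all to do
        }
    }

    private bool TryReadNextRecordFromDb(out Record record) { record = null; return false; } // stub
    private void Process(Record record) { /* existing processing logic */ }
}
```

The only shared state is the queue itself, so the WCF listener and the loop never need an explicit lock.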
In addition to Reed's excellent answer, you might want to persist the records in an MSMQ queue to prevent your service from losing records on shutdown, restart, or crash.
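A rough sketch of what that persistence could look like with `System.Messaging`, reusing the hypothetical `Record` type from above; the queue path is an assumption, and the queue is transactional so a crash mid-receive doesn't lose the message:

```csharp
using System;
using System.Messaging;

public static class DurableQueue
{
    private const string QueuePath = @".\private$\pendingRecords"; // hypothetical name

    public static void EnsureQueueExists()
    {
        if (!MessageQueue.Exists(QueuePath))
            MessageQueue.Create(QueuePath, true);   // true = transactional
    }

    // WCF side: persist the record instead of only holding it in memory.
    public static void Enqueue(Record record)
    {
        using (var queue = new MessageQueue(QueuePath))
        using (var tx = new MessageQueueTransaction())
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(Record) });
            tx.Begin();
            queue.Send(record, tx);
            tx.Commit();
        }
    }

    // Loop side: a short timeout keeps the DB poll responsive; Receive throws
    // MessageQueueException if the timeout elapses with no message waiting.
    public static Record TryDequeue()
    {
        using (var queue = new MessageQueue(QueuePath))
        using (var tx = new MessageQueueTransaction())
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(Record) });
            tx.Begin();
            try
            {
                Message message = queue.Receive(TimeSpan.FromSeconds(1), tx);
                var record = (Record)message.Body;
                tx.Commit();    // only commit once processing can proceed
                return record;
            }
            catch (MessageQueueException)
            {
                tx.Abort();
                return null;    // nothing waiting
            }
        }
    }
}
```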
So the question is long but pretty self-explanatory. I have an app that runs on multiple servers and uses parallel looping to handle objects coming out of a MongoDB collection. Since MongoDB allows concurrent reads, I cannot stop multiple processes and/or servers from grabbing the same document from the collection and duplicating work.
The app waits for information to appear, does some work to figure out what to do with it, then deletes the document once it's done. What I hope to achieve is to keep documents from being accessed at the same time; knowing that once a document has been read it will eventually be deleted, I could speed up overall throughput a bit by reducing the number of duplicates and letting the apps grab documents that aren't already being worked on.
I don't think pessimistic locking is quite what I'm looking for, but maybe I've misunderstood the concept. Also, if alternative setups are being used to solve the same problem, I would love to hear about them.
Thanks!
> What I hope to achieve is to keep documents from being accessed at the same time
The simplest way to achieve this is by introducing a dispatch-process architecture: add a dedicated process that just watches for changes and then delegates or dispatches the tasks out to multiple workers.
The process could use MongoDB Change Streams to access real-time data changes on a single collection, a database, or an entire deployment. Once it receives a change document, it just sends it to a worker for processing.
This also avoids multiple workers trying to grab the same task and then needing logic to back down.
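For illustration, a minimal sketch of such a dispatcher with the .NET MongoDB driver; it assumes a replica set (change streams require one), a placeholder connection string and collection, and a hypothetical `DispatchToWorker` hand-off:

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

public static class Dispatcher
{
    public static void Run()
    {
        var client = new MongoClient("mongodb://localhost");   // assumed connection string
        var tasks = client.GetDatabase("work").GetCollection<BsonDocument>("tasks");

        // Single watcher: only this process reads the stream, so no two workers
        // are ever handed the same document.
        using (var cursor = tasks.Watch())
        {
            foreach (var change in cursor.ToEnumerable())
            {
                if (change.OperationType == ChangeStreamOperationType.Insert)
                    DispatchToWorker(change.FullDocument);     // hypothetical hand-off
            }
        }
    }

    private static void DispatchToWorker(BsonDocument doc) { /* send to a worker */ }
}
```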
I have an ASP.NET web service which is like a reservation system: the same seat cannot be reserved by multiple people, and eligibility for a reservation is based on some other table values in SQL Server. I plan to use SQL Server queue processing as mentioned here. My customer wants this done in a synchronous call, meaning we want the result in the same web service method call. My question is: how efficient is it to let a synchronous method call process a queue and wait until the queue returns values (by means of a loop within a time span or so)? Please advise on the best possible approaches to achieve this.
You are stepping into the XY problem trap. Your primary goal is to make sure no seat can be reserved twice, so you should rethink whether a queue is the appropriate solution to this problem at all. A queue is great for efficiently spending your processing resources on a background task; for real-time processing (like yours) it will create more problems than it solves.
It seems like you want to avoid the race condition that occurs when multiple users try to reserve the same seat at the same time. The queue doesn't solve that problem, it just moves it to the enqueuing phase: the one who enters the queue first wins. In the end you have added an unnecessary complication that doesn't bring you any benefit.
A much simpler solution to your problem is to create a unique key in your DB that makes sure no seat can be reserved twice. When you try to reserve a seat that was taken just a moment ago, you will get a SqlException with error number 2627, and you can then prompt the user that the seat is taken.
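A sketch of that pattern, with a hypothetical `Reservations` table that has a unique constraint on `(EventId, SeatNumber)`:

```csharp
using System.Data.SqlClient;

public static class SeatReservation
{
    // Returns true if the seat was reserved, false if someone beat us to it.
    public static bool TryReserveSeat(string connectionString, int eventId, int seat, int userId)
    {
        const string sql =
            "INSERT INTO Reservations (EventId, SeatNumber, UserId) VALUES (@e, @s, @u)";
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@e", eventId);
            cmd.Parameters.AddWithValue("@s", seat);
            cmd.Parameters.AddWithValue("@u", userId);
            conn.Open();
            try
            {
                cmd.ExecuteNonQuery();
                return true;    // the INSERT won the race
            }
            catch (SqlException ex) when (ex.Number == 2627)   // unique key violation
            {
                return false;   // the seat was taken a moment earlier
            }
        }
    }
}
```

The database enforces the invariant, so the check is race-free no matter how many web servers call this at once.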
I'm dealing with a CSV file that's being imported client-side.
This CSV is supposed to contain information destined for an update to one of the tables in my company's database.
My C# function processes the file, looking for errors, and if no errors are found, it sends a bunch of update commands (files usually vary from 50 to 100,000 lines).
Until now, I was performing the update in the same thread (line by line), but that was getting a little slow depending on the file, so I chose to send all the SQL to an Azure SQL queue (a service that receives lots of "messages" and runs the SQL code against the database), so that the client wouldn't have to wait as long for the action to be performed.
It got a little faster, but it still takes a long time (due to the requests to the Azure SQL queue). So I tried putting that action in a separate thread, which worked: the thread sends all the SQL to my Azure SQL queue while the response returns.
I got a little worried about it though. Is it really safe to perform long actions in separate threads? Is it reliable?
A second thread is 100% like the main thread that you're used to working with. I wish I had an authoritative reference at hand, but this is such common practice that people don't write those anymore...
So, YES, off-loading the work to a second thread is safe, and most would consider it the recommended way to go about it.
Edit 1
OK, if your thread is running under IIS, you need to register it or it will die, because once the request/response cycle finishes, the runtime is free to kill it...
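On .NET 4.5.2 or later, one way to do that registration is `HostingEnvironment.QueueBackgroundWorkItem`; the `SendAllToQueueAsync` method below is a hypothetical stand-in for the code that pushes the SQL messages:

```csharp
using System.Threading;
using System.Threading.Tasks;
using System.Web.Hosting;

public class UploadStarter
{
    public void KickOff()
    {
        // ASP.NET tracks the queued item and delays app-domain shutdown
        // (for a bounded grace period) while it is still running.
        HostingEnvironment.QueueBackgroundWorkItem(async token =>
        {
            await SendAllToQueueAsync(token);   // hypothetical: pushes the SQL messages
        });
    }

    private Task SendAllToQueueAsync(CancellationToken token)
    {
        return Task.FromResult(0);   // stand-in for the real upload
    }
}
```

On framework versions before 4.5.2, the equivalent is implementing `IRegisteredObject` and calling `HostingEnvironment.RegisterObject` yourself.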
I am quite confused on which approach to take and what is best practice.
Let's say I have a C# application which does the following:
it sends emails from a queue; the emails to send and all their content are stored in the DB.
Now, I know how to make my C# application almost scalable, but I need to go somewhat further.
I want some way to distribute the tasks across, say, X servers, so it is not just one server doing all the processing but the work is shared amongst the servers.
If one server goes down, the load should be shared between the other servers. I know NLB does this, but I'm not looking for NLB here.
Sure, you could add a column of some kind to the DB table to indicate which server should process each record; each application instance would have an ID that matches the value in the DB and would only pull its own records. But I consider this cheap, bad practice, and unrealistic.
A DB table row lock is also not something I would use, due to potential deadlocks and other possible issues.
I am also NOT suggesting using threading "to the extreme" here, but yes, there will be a thread per item to process, or items batched up per thread across x threads.
How should I approach this, and what do you recommend for making a C# application which is scalable and has high availability? The aim is to have X servers, each running the same application, each able to get records and process them, with the load shared amongst the servers so that if one server or service fails, the others can take on that load until another server is put back.
Sorry for my lack of understanding or knowledge, but I have been thinking about this quite a lot and losing sleep trying to come up with a good, robust solution.
I would be thinking of batching up the work, so each app only pulls back x records at a time, marking those retrieved records as taken with a bool field in the table. I'd amend the SELECT statement to pull only records not marked as taken/done. Table locks would be OK in this instance for very short periods, to ensure there is no overlap of apps processing the same records.
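A sketch of the claim step, where a single UPDATE both marks and returns a batch so two servers can never pull the same rows (the table and column names are assumptions):

```csharp
using System;
using System.Data.SqlClient;

public static class WorkClaimer
{
    // One atomic statement marks a batch as taken and returns it; because the
    // UPDATE itself locks the rows it touches, no separate table lock is needed.
    private const string ClaimSql = @"
        UPDATE TOP (@batch) EmailQueue
        SET    Taken = 1, TakenBy = @server
        OUTPUT inserted.Id, inserted.Recipient
        WHERE  Taken = 0;";

    public static void ClaimAndProcess(string connectionString, int batchSize)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(ClaimSql, conn))
        {
            cmd.Parameters.AddWithValue("@batch", batchSize);
            cmd.Parameters.AddWithValue("@server", Environment.MachineName);
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    int id = reader.GetInt32(0);
                    string recipient = reader.GetString(1);
                    // send the email, then mark the row done
                }
            }
        }
    }
}
```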
EDIT: It's not very elegant, but you could have a datestamp and a status for each entry (instead of a bool field as above). Then you could run a periodic Agent job which runs a sproc to reset the status of any records which have a status of In Progress but which have gone beyond a time threshold without being set to complete. They would be ready for reprocessing by another app later on.
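As a sketch of that reset, shown here as a query the Agent job's sproc (or the service itself) could run periodically; the 15-minute threshold and the table/column names are assumptions:

```csharp
using System.Data.SqlClient;

public static class StaleWorkReset
{
    // Rows stuck 'InProgress' past the threshold go back to 'Pending' so
    // another app instance can pick them up for reprocessing.
    private const string ResetSql = @"
        UPDATE EmailQueue
        SET    Status = 'Pending', TakenBy = NULL
        WHERE  Status = 'InProgress'
          AND  TakenAt < DATEADD(MINUTE, -15, SYSUTCDATETIME());";

    public static int ResetStale(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(ResetSql, conn))
        {
            conn.Open();
            return cmd.ExecuteNonQuery();   // number of rows put back
        }
    }
}
```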
This may not be enterprise-y enough for your tastes, but I'd bet my hide that there are plenty of apps out there in the enterprise which are just as unsophisticated and work just fine. The best things work with the least complexity.
I have a requirement to monitor database rows continuously to check for changes (updates). If there are changes or updates from other sources, an event should be fired in my application (I am using WCF). Is there any way to listen to database rows continuously for changes?
I may end up with a large number of events monitoring different rows in the same table; is there any performance problem with that? I am using a C# web service to monitor the SQL Server back end.
You could use an AFTER UPDATE trigger on the respective tables to add an item to a SQL Server Service Broker queue. Then have the queued notifications sent to your web service.
Another poster mentioned SqlDependency, which I also thought of mentioning, but the MSDN documentation is a little strange in that it provides a Windows client example but also offers this advice:
> SqlDependency was designed to be used in ASP.NET or middle-tier services where there is a relatively small number of servers having dependencies active against the database. It was not designed for use in client applications, where hundreds or thousands of client computers would have SqlDependency objects set up for a single database server.
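For completeness, a minimal middle-tier SqlDependency sketch; it needs Service Broker enabled on the database, the query must use two-part table names with an explicit column list, and the table/column names and `OnRowsChanged` hook here are assumptions:

```csharp
using System.Data.SqlClient;

public static class ChangeListener
{
    public static void Subscribe(string connectionString)
    {
        SqlDependency.Start(connectionString);   // once per app domain

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("SELECT Id, Status FROM dbo.Orders", conn))
        {
            var dependency = new SqlDependency(cmd);
            dependency.OnChange += (sender, e) =>
            {
                // Fires once per subscription; call Subscribe again to keep listening.
                OnRowsChanged(e.Info);   // hypothetical hook into the WCF service
            };
            conn.Open();
            cmd.ExecuteReader().Dispose();   // executing the command registers the subscription
        }
    }

    private static void OnRowsChanged(SqlNotificationInfo info) { /* raise the event */ }
}
```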
I had a very similar requirement some time ago, and I solved it using a CLR SP to push the data into a message queue.
To ease deployment, I created a CLR SP with a tiny little function called SendMessage that just pushed a message into a message queue, and tied it to my tables using an AFTER INSERT trigger (a normal trigger, not a CLR trigger).
Performance was my main concern in this case, but I have stress tested it and it greatly exceeded my expectations. And compared to SQL Server Service Broker, it's a very easy-to-deploy solution. The code in the CLR SP is really trivial as well.
Monitoring "continuously" could mean every few hours, minutes, seconds or even milliseconds. This solution might not work for millisecond updates: but if you only have to "monitor" a table a few times a minute you could simply have an external process check a table for updates. (If there is a DateTime column present.) You could then process the changed or newly added rows and perform whatever notification you need to. So you wouldn't be listening for changes, you'd be checking for them. One benefit of doing the checking in this manner would be that you wouldn't risk as much of a performance hit if a lot of rows were updated during a given quantum of time since you'd bulk them together (as opposed to responding to each and every change individually.)
> I pondered the idea of a CLR function or something of the sort that calls the service after successfully inserting/updating/deleting data from the tables. Is that even good in this situation?
Probably it's not a good idea, but I guess it's still better than getting into table trigger hell.
I assume your problem is you want to do something after every data modification, let's say, recalculate some value or whatever. Letting the database be responsible for this is not a good idea because it can have severe impacts on performance.
You mentioned you want to detect inserts, updates and deletes on different tables. Doing it the way you are leaning towards would require you to set up three triggers/CLR functions per table and have them post an event to your WCF service (is that even supported in the subset of .NET available inside SQL Server?). The WCF service then takes the appropriate actions based on the events received.
A better solution for the problem would be moving the responsibility for detecting data modification from your database to your application. This can actually be implemented very easily and efficiently.
Each table has a primary key (int, GUID or whatever) and a timestamp column, indicating when the entry was last updated. This is a setup you'll see very often in optimistic concurrency scenarios, so it may not even be necessary to update your schema definitions. Though, if you need to add this column and can't offload updating the timestamp to the application using the database, you just need to write a single update trigger per table, updating the timestamp after each update.
To detect modifications, your WCF service/monitoring application builds up a local dictionary (a hashtable, essentially) of primary key/timestamp pairs at a given time interval. Using a covering index in the database, this operation should be really fast. The next step is to compare the current dictionary against the previous one and voilà, there you go.
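A sketch of that compare step, with assumed `Id`/`UpdatedAt` columns and hypothetical `OnInserted`/`OnUpdated`/`OnDeleted` hooks:

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public static class SnapshotDiffer
{
    // Pull (primary key, last-updated) pairs; a covering index on (Id, UpdatedAt)
    // keeps this query cheap.
    public static Dictionary<int, DateTime> TakeSnapshot(string connectionString)
    {
        var snapshot = new Dictionary<int, DateTime>();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("SELECT Id, UpdatedAt FROM dbo.Orders", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read())
                    snapshot[reader.GetInt32(0)] = reader.GetDateTime(1);
        }
        return snapshot;
    }

    public static void Diff(Dictionary<int, DateTime> previous, Dictionary<int, DateTime> current)
    {
        foreach (var pair in current)
        {
            DateTime oldStamp;
            if (!previous.TryGetValue(pair.Key, out oldStamp))
                OnInserted(pair.Key);               // key is new -> insert
            else if (oldStamp != pair.Value)
                OnUpdated(pair.Key);                // timestamp moved -> update
        }
        foreach (var key in previous.Keys)
            if (!current.ContainsKey(key))
                OnDeleted(key);                     // key vanished -> delete
    }

    private static void OnInserted(int id) { }
    private static void OnUpdated(int id) { }
    private static void OnDeleted(int id) { }
}
```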
There are some caveats to this approach, though. One is the number of records per table, another is the refresh frequency (if it is too low, the approach becomes ineffective), and yet another pain point is whether you need access to the data as it was before the modification/insertion.
Hope this helps.
Why don't you use SQL Server Notification Services? I think that's exactly what you are looking for. Go through the Notification Services documentation and see if it fits your requirement.
I think there are some great ideas here; from the scalability perspective I'd say that externalizing the check (e.g. Paul Sasik's answer) is probably the best one so far (+1 to him).
If, for some reason, you don't want to externalize the check, then another option would be to use the HttpCache to store a watcher and a callback.
In short, when you put the record you want to watch into the DB, you also add it to the cache (using the .Add method), set a SqlCacheDependency on it, and supply a callback to whatever logic you want to run when the dependency is invoked and the item is ejected from the cache.
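A sketch of that watcher, assuming `SqlDependency.Start` has already been called for the connection string and using hypothetical table/column names and an `OnRecordChanged` callback:

```csharp
using System;
using System.Data.SqlClient;
using System.Web;
using System.Web.Caching;

public static class RecordWatcher
{
    public static void Watch(string connectionString, int recordId)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "SELECT Id, Status FROM dbo.Orders WHERE Id = @id", conn))
        {
            cmd.Parameters.AddWithValue("@id", recordId);
            var dependency = new SqlCacheDependency(cmd); // invalidated when the row changes
            conn.Open();
            cmd.ExecuteReader().Dispose();                // running the query registers it

            HttpRuntime.Cache.Add(
                "watch:" + recordId,
                recordId,
                dependency,
                Cache.NoAbsoluteExpiration,
                Cache.NoSlidingExpiration,
                CacheItemPriority.NotRemovable,
                (key, value, reason) =>
                {
                    if (reason == CacheItemRemovedReason.DependencyChanged)
                        OnRecordChanged((int)value);      // hypothetical callback
                });
        }
    }

    private static void OnRecordChanged(int id) { /* run your logic */ }
}
```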