I am doing a project that needs to communicate with 20 small computer boards. I will need to keep track of their connections, and they will also return data to me. So my aim is to build a control/monitoring system for these boards.
I will be using Visual Studio 2010 and C# WPF.
My idea/plan is as follows:
On the main thread:
There will be only one control window, so the main thread will mainly be responsible for updating the displayed data. Data from each board will be displayed and refreshed at a 1 s interval. The source of the data will be a database, in which the main thread will look for the latest values (I have not decided which kind of database to use yet).
There will also be control buttons on the control window. I already have a .dll library, so I will only need to call its functions to direct the boards to act (by starting another thread).
There will be two services:
(Timer service) One will be a scheduled timer to turn the boards on/off at specific times. Users will be able to change the on/off times, which the service reads from the database.
(Connection service) The other will be responsible for requesting and receiving information/status from each board every 30 s or less. The work includes connecting to the board over the internet, requesting the data, receiving it, and writing it to the database, as well as logging any exceptions thrown when the connection fails.
My questions:
1) For the connection service, I am wondering if I should start 20 threads, one per board connection. If the connections were made by a single thread, each board would have to wait for the previous one to finish; at 1-2 minutes per board, the whole cycle could take 20-40 minutes. If I split the work across 20 threads, will it make a big difference in performance? The 20 threads never die; each keeps asking for data every 30 s if possible. Also, does that mean I would need 20 databases, since 20 threads writing at the same time would clash in a single database?
2) For updating the displayed data on the main thread every 1 s, should I also start a service to do this? And since the connection service accesses the same database, will the two clash?
There will be more than 100 boards to control and monitor in the future, so I would like to make the program as lightweight as possible.
Thank you very much! Comments and ideas very much appreciated!
Starting 20 threads would be your best bet (or, as Ralf said, use a thread when needed; in your specific case that will probably mean 20 at some point). Most databases are thread safe, meaning you can write to them from separate threads. If you use a "real" database, this isn't an issue at all.
No, use a Timer on the main thread to update your UI. The UI can easily read from the DB. As long as the update action itself does not take much time, it is fine to do it on the UI thread.
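In WPF the usual choice for this is a DispatcherTimer, whose Tick handler runs on the UI thread. As a rough self-contained sketch of the refresh loop (using a plain System.Threading.Timer here so it runs outside WPF; the read delegate is a stand-in for your database query):

```csharp
using System;
using System.Threading;

class DisplayRefresher
{
    private readonly Timer _timer;
    private readonly Func<string> _readLatest; // stand-in for the DB query
    public volatile string LastDisplayed;      // what the UI would show

    public DisplayRefresher(Func<string> readLatest, int intervalMs)
    {
        _readLatest = readLatest;
        // Fires immediately, then every intervalMs. In the real WPF app a
        // DispatcherTimer's Tick handler would do the same read + update,
        // safely on the UI thread.
        _timer = new Timer(_ => LastDisplayed = _readLatest(), null, 0, intervalMs);
    }

    public void Stop() { _timer.Dispose(); }
}
```

The key point from the answer holds either way: keep the per-tick work (one SELECT for the latest rows) short, and the 1 s refresh is perfectly fine on the UI thread.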
1) Why not use threads as needed? You can use one DBMS; they are built to process large amounts of information.
2) Not sure what you mean by starting a service for the UI thread. As with 1), database management systems are built to process data.
Related
I am developing a C# WinForms GUI that revolves around a few DataGridViews that need to be updated at a manageable interval depending on the number of updates, somewhere around the 1-3 second mark.
Currently, on startup, it retrieves the current batch of data for the grids from a MySQL database, then listens to various Redis channels for updates/adds/deletes to those grids. These updates are handled by the Redis multiplexer, heavy calculations are offloaded to worker threads, and the GUI is then updated. So it is one to one: one update/add/delete is processed and the GUI is updated. This works well enough so far, but during a crunch of data I'm beginning to notice slowness, which is worrying since I'm expecting much heavier loads in the future.
As the system throughput grows (it is currently at most around 100 Redis messages every couple of seconds), this should be designed to handle thousands.
Conceptually, when it comes to general GUI design in this fashion, would it be better to do one of the following:
Decouple from the current one-to-one update scenario described above (redis msg -> process -> update gui), and have all Redis messages queue up in a list or DataTable; then have the GUI poll this pending-update queue on a timer and apply the changes. This way the GUI is not flooded; it updates on its own schedule.
Since these updates coming from Redis are also persisted in the MySQL database, ignore Redis completely and query the database at some timed interval. However, this would probably mean re-pulling everything, since it will be tough to know what has changed since the last pull.
Do away with attempting to update the GUI in semi-realtime fashion and only provide a summary view; if the user digs in, retrieve data accordingly. But this still runs into the same problem, as the data then being viewed needs to be kept current, albeit a smaller subset. On the other hand, there exist plenty of sophisticated enterprise-level C# applications, especially in the finance industry, that present large amounts of updating data and seem to work just fine.
What is best practice here? I prefer options 1 or 2, because in theory they should work.
thank you in advance
I have about 10,000 jobs that I want handled by approximately 100 threads. Once a thread finishes, the free 'slot' should get a new job until there are no more jobs available.
Side note: processor load is not an issue; these jobs mostly wait for results or (socket) timeouts. The count of 100 is something I am going to play with to find an optimum. Each job will take between 2 seconds and 5 minutes, so I want to assign new jobs to free threads rather than pre-assign all jobs to threads.
My problem is that I am not sure how to do this. I'm primarily using Visual Basic .NET (but C# is also OK).
I tried to make an array of threads, but since each job/thread also returns a value (and takes 2 input vars), I used WithEvents and found out that you cannot do that on an array... maybe a collection would work? But I also need a way to manage the threads and feed them new jobs... And all results should go back to the main form (thread)...
I have it all running in one thread, but now I want to speed it up.
And then I thought: actually, this is a rather common problem. There is a bunch of work to be done that needs to be distributed over a number of worker threads... So that's why I am asking. What's the most common solution here?
I tried to make the question as generic as possible, so lots of people with the same kind of problem can be helped by your reply. Thanks!
Edit:
What I want to do, in more detail, is the following. I currently have about 1200 connected sensors that I want to read from via sockets. The first thing I want to know is whether the device is online (can connect on ip:port) or not. What happens after connecting depends on the device type, which is known once connected. From some devices I just read back a sensor value. Other devices need calibration to be performed, taking up to 5 minutes with mostly wait time and some reading/setting of values, all via the socket. Some even have FTP that I need to download a file from, but I do that via a socket too.
My problem: lots of waiting time, so lots of opportunity to do things in parallel and speed it up hugely.
My starting point is a list of ip:port addresses, and I want to end up with a file that shows the results; the results are also shown in a textbox on the main form (next to start/pause/stop buttons).
This was very helpful:
Multi Threading with Return value : vb.net
It explains the concept of a BackgroundWorker, which takes away a lot of the hassle. I am now trying to see where it will take me.
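Beyond BackgroundWorker, a common shape for "N jobs, at most M running at a time" on .NET 4 is a semaphore guarding task starts: each job waits for a free slot, runs, and releases the slot for the next job. A rough sketch (in C# rather than VB, and all names here are made up):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class JobPool
{
    // Runs every job, but never more than maxWorkers at once. A free slot
    // picks up the next job as soon as a running job releases one.
    public static IList<TResult> RunAll<TJob, TResult>(
        IEnumerable<TJob> jobs, Func<TJob, TResult> work, int maxWorkers)
    {
        var results = new ConcurrentBag<TResult>();
        using (var slots = new SemaphoreSlim(maxWorkers, maxWorkers))
        {
            var tasks = new List<Task>();
            foreach (var job in jobs)
            {
                slots.Wait();               // block until a worker slot frees up
                TJob captured = job;        // avoid modified-closure surprises
                tasks.Add(Task.Factory.StartNew(() =>
                {
                    try { results.Add(work(captured)); }
                    finally { slots.Release(); } // free the slot for the next job
                }));
            }
            Task.WaitAll(tasks.ToArray());
        }
        return new List<TResult>(results);
    }
}
```

Since the sensor jobs are I/O-bound (socket waits, timeouts), the worker count can be much higher than the CPU count; the semaphore count is the knob to experiment with, exactly as the question suggests.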
I have researched a lot and I haven't found anything that meets my needs. I'm hoping someone from SO can shed some light on this.
I have an application where the expected load is thousands of jobs per customer, and I can have hundreds of customers. Currently there are 50 customers with close to 1000 jobs each. These jobs are time sensitive (scheduled by the customer) and can run up to 15 minutes each.
In order to scale and meet the schedules, I'm planning to run this multi-threaded on a single server. So far so good. But the business wants to scale further (as needed) by adding more servers into the mix. Currently, when jobs become ready in the database, a console application picks up the first 500 and uses the Task Parallel Library to spawn 10 threads, waiting until they complete. I can't scale this to another server because that one could pick up the same records. I also can't just mark the DB record as being processed, because if the application crashes on one server, the job would be left in limbo.
I could use a message queue and have multiple machines pick from it. The problem is that the queue has to be transactional to handle crashes. Since a database is also involved, MSMQ supports only MS DTC transactions, and I'm not really comfortable with DTC transactions, especially across multiple threads and multiple machines: too much maintenance and setup, and possibly unknown issues.
Is SQL Service Broker a good approach instead? Has anyone done something like this in a production environment? I also want to keep the transactions short (a job could run for 15-20 minutes, mostly streaming data from a service). The only reason I'm using a transaction is to preserve the message integrity of the queue: I need the job to be re-picked (reappear in the queue) if a worker crashes.
Any words of wisdom?
Why not have an application receive the jobs and insert them into a table that acts as the job queue? Each worker process can then pick up a set of jobs and set their status to 'processing', complete the work, and set the status to 'done'. Other info, such as the name of the server that processed each job and start/end timestamps, could also be logged. Moreover, instead of using multiple threads, you could use independent worker processes to make your programming easier.
[EDIT]
SQL Server supports record-level locking, and lock escalation can also be prevented; see Is it possible to force row level locking in SQL Server?. Using such a mechanism, you can have your worker processes take exclusive locks on the jobs being processed until they are done or crash (thereby releasing the lock).
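One common shape for that claim step is a single atomic UPDATE using SQL Server table hints, so two workers can never grab the same row: READPAST skips rows another worker has already locked, and ROWLOCK keeps the locking at row granularity. A sketch (the table and column names below are hypothetical; the string would be executed through SqlCommand):

```csharp
static class JobClaimSql
{
    // Atomically claims one ready job, stamps which server took it,
    // and returns the claimed JobId via the OUTPUT clause.
    public const string ClaimOneJob = @"
        UPDATE TOP (1) dbo.Jobs WITH (ROWLOCK, READPAST)
        SET    Status    = 'Processing',
               ClaimedBy = @serverName,
               StartedAt = GETUTCDATE()
        OUTPUT inserted.JobId
        WHERE  Status = 'Ready';";
}
```

Because the claim itself is one short statement, the 15-minute job body can run outside any transaction; a separate sweep can re-queue rows whose ClaimedBy server has gone silent.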
We're using RabbitMQ for storing lightweight messages that we eventually want to store in our SQL Server database. There will be times when the queue is empty and times when there is a spike of traffic - 30,000 messages.
We have a C# console app running in the same server.
Do we have the console app run every minute or so and grab a designated number of items off the queue for insertion into the database? (taking manageable bites)
OR
Do we have the console app always "listen" and hammer items into the database as they come in? (more aggressive approach)
Personally, I'd go for the first approach. During those spike times you're going to hammer the database with potentially 30,000 inserts. While this could potentially complete quite quickly (depending on many variables outside the scope of this question), we can do it a little smarter.
Firstly, by polling periodically, you can grab x messages from the queue and bulk insert them in a single go (performance-wise, you might want to tweak the two variables here: the polling time and how many messages you take from the queue per poll).
One problem with this approach is that you might fall behind during busy periods. So you could make your application change its polling time based on how much it is receiving, while keeping it between min/max thresholds. E.g., if you suddenly get a spike and grab 500 messages, you might decrease your poll time. If on the next poll you can still get thousands, do it again and decrease the poll time further. As the number you are able to get drops off, you can then begin increasing your polling time again once it falls under a particular threshold.
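The adjustment logic itself can stay tiny. A sketch of one way to compute the next poll delay (the thresholds and scale factors here are arbitrary starting points to tune, not recommendations):

```csharp
using System;

static class AdaptivePoller
{
    // Shrinks the delay when batches come back full, grows it when they
    // come back small, and clamps the result between minMs and maxMs.
    public static int NextDelay(int lastBatchSize, int currentDelayMs,
                                int minMs = 500, int maxMs = 30000)
    {
        int next;
        if (lastBatchSize >= 500)    next = currentDelayMs / 2; // busy: poll faster
        else if (lastBatchSize < 50) next = currentDelayMs * 2; // quiet: back off
        else                         next = currentDelayMs;     // steady: keep pace
        return Math.Max(minMs, Math.Min(maxMs, next));
    }
}
```

After each bulk insert, feed the batch size back in: `delay = AdaptivePoller.NextDelay(batch.Count, delay);`. The clamp keeps the loop from spinning during a spike or going to sleep for too long during a lull.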
This would give you the best of both worlds, IMHO, and be reactive to the spike/lull periods.
It depends a bit on your requirements, but I would create a service that calls SqlBulkCopy to do bulk inserts every couple of minutes. This is by far the fastest approach. Also, if your spike is 30k records, I would not worry too much about falling behind.
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx
We have a C# console app running in the same server.
Why not a Service?
What I would do is have the console app always listen to RabbitMQ, and then, inside the console app, build your own queue for the database inserts; that way you can throttle the insertion rate. This lets you control the flow at busy times by only allowing so many tasks at once, while at slow times you get a faster reaction than polling every so often. The way I would do this is by raising an event when there is something in the queue; you can then check the queue length to decide how many transactions to process.
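One way to sketch that internal queue is a bounded BlockingCollection: the listener blocks once the buffer is full (back-pressure), instead of flooding the database writer during spikes, and the consumer reacts immediately when a message arrives rather than polling. The class and method names here are made up:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Bounded buffer between the RabbitMQ listener and the database writer.
class ThrottledWriter
{
    private readonly BlockingCollection<string> _buffer;

    public ThrottledWriter(int capacity)
    {
        // Enqueue blocks once 'capacity' items are waiting, which
        // throttles intake automatically during a spike.
        _buffer = new BlockingCollection<string>(capacity);
    }

    public void Enqueue(string msg) { _buffer.Add(msg); }      // RabbitMQ side
    public void CompleteAdding()    { _buffer.CompleteAdding(); }

    public Task StartConsumer(Action<string> insert)
    {
        return Task.Factory.StartNew(() =>
        {
            // Wakes as soon as anything arrives; no polling interval needed.
            foreach (var msg in _buffer.GetConsumingEnumerable())
                insert(msg); // one DB insert (or a batch) at a time
        });
    }
}
```

The capacity and the batch size inside `insert` are the two throttling knobs; RabbitMQ itself keeps anything the buffer refuses.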
Instead of using a Console Application, you could set up a Windows Service, and set up a timer on the service to poll every n minutes. Take a look at the links below:
http://www.codeproject.com/Questions/189250/how-to-use-a-timer-in-windows-service
http://msdn.microsoft.com/en-us/library/zt39148a.aspx
With a Windows Service, if the server is rebooted, the service can be set up to restart automatically.
First off, I will be talking about some legacy code, and we are trying to avoid changing it as much as possible. Also, my experience with Windows services and WCF is a bit limited, so some of the questions may be a bit newbie. Just to give a bit of context before the question.
We have an existing service that loops. It checks, via a database call, whether it has records to process. If it finds none, it sleeps for 30 seconds and then wakes up to try again.
I would like to add an entry point to this service that would allow me to pass a record to it, in addition to it processing the records from the database. So the basic flow would be:
Loop
* Read record from database
* If no record from DB, process any records that were passed in via the entry point.
* No records at all, sleep for 30 seconds.
My concern is this: is it possible to implement this in one service, such that I have the looping process but also allow calls to come in at any time, adding items to a queue that gets processed within the loop? My concern is with concurrency, and keeping the loop and the listener from stepping on each other.
I know this question may not be worded quite right, but I am new to working with this. Any help would be appreciated.
My concern is with concurrency and keeping the loop and the listener from stepping on each other.
This shouldn't be an issue, provided you synchronize access correctly.
The simplest option might be to use a thread safe collection, such as a ConcurrentQueue<T>, to hold your items to process. The WCF service can just add items to the collection without worry, and your next processing step would handle it. The synchronization in this case is really minimal, as the queue would already be fully thread safe.
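A minimal sketch of that shape (names hypothetical): the WCF operation enqueues from any thread, and the existing loop drains whatever has accumulated when the database has nothing to do. ConcurrentQueue handles all the synchronization.

```csharp
using System;
using System.Collections.Concurrent;

class RecordProcessor
{
    // Filled by the WCF entry point from any thread; drained by the loop.
    private readonly ConcurrentQueue<string> _pending = new ConcurrentQueue<string>();

    // Called by the WCF service operation.
    public void Submit(string record) { _pending.Enqueue(record); }

    // Called from the service loop when the DB returned no records.
    // Returns how many passed-in records were handled.
    public int DrainPending(Action<string> process)
    {
        int handled = 0;
        string record;
        while (_pending.TryDequeue(out record))
        {
            process(record);
            handled++;
        }
        return handled;
    }
}
```

If `DrainPending` also returns 0, the loop sleeps for its 30 seconds as before; no locks are needed because enqueue and dequeue are each atomic.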
In addition to Reed's excellent answer, you might want to persist the records in a MSMQ queue to prevent your service from losing records on shutdown, restart, or crash of your service.