Halt Linq query if it will take 'too' long - c#

Currently I have the need to create a reporting program that runs reports on many different tables within a SQL database. Multiple different clients require this functionality but some clients have larger databases than others. What I would like to know is whether it is possible to halt a query after a period of time if it has been taking 'too' long.
To give some context, some clients have tables with in excess of 2 million rows, although a different client may have only 50k rows in the same table. I want to be able to run the query for, say, 20 seconds and, if it has not finished by then, return a message to the user saying that the result set will be too large and the report needs to be generated out of hours, as we do not want to run resource-intensive operations during the day.

Set the timeout either on your connection string or on the DataContext via the CommandTimeout property. When the timeout expires, you will get a TimeoutException, and your query will be cancelled.
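A minimal sketch of that approach with LINQ to SQL; ReportsDataContext and its Orders table are placeholders, not anything from the original post, and depending on the provider the timeout may surface as a SqlException with error number -2 rather than a TimeoutException:

using System;
using System.Data.SqlClient;
using System.Linq;

public static class ReportRunner
{
    public static void RunWithTimeout(string connectionString)
    {
        // ReportsDataContext is a hypothetical LINQ to SQL context.
        using (var db = new ReportsDataContext(connectionString))
        {
            // Applies to every command this context issues (value is in seconds).
            db.CommandTimeout = 20;

            try
            {
                var totals = db.Orders
                               .GroupBy(o => o.CustomerId)
                               .Select(g => new { g.Key, Total = g.Count() })
                               .ToList(); // the query executes here
            }
            catch (SqlException ex)
            {
                if (ex.Number != -2) throw; // -2 is the SqlClient timeout error number
                Console.WriteLine("The result set is too large; please run this report out of hours.");
            }
        }
    }
}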
You cannot be sure that the query is cancelled on the server the very instant the timeout occurs, but in most cases it will cancel rather quickly. For details read the excellent article "There's no such thing as a query timeout...". The important part from there is:
A client signals a query timeout to the server using an attention event. An attention event is simply a distinct type of TDS packet a SQL Server client can send to it. In addition to connect/disconnect, T-SQL batch, and RPC events, a client can signal an attention to the server. An attention tells the server to cancel the connection's currently executing query (if there is one) as soon as possible. An attention doesn't rollback open transactions, and it doesn't stop the currently executing query on a dime -- the server aborts whatever it was doing for the connection at the next available opportunity. Usually, this happens pretty quickly, but not always.
But remember, it will differ from provider to provider and it might even be subject to change between server versions.

You can do that easily if you run the query on a background thread. Make the main thread start a timer and spawn a background thread that runs the query. If the background thread hasn't returned a result when the 20 seconds are up, the main thread can cancel it.
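A rough sketch of that idea using the Task Parallel Library; the runReportQuery delegate is a stand-in for whatever query actually builds the report, and the cancellation only takes effect if that code observes the token (or cancels the underlying command):

using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimedReport
{
    public static bool TryRunReport(Func<CancellationToken, object> runReportQuery, out object result)
    {
        using (var cts = new CancellationTokenSource())
        {
            var work = Task.Run(() => runReportQuery(cts.Token), cts.Token);

            // Give the query 20 seconds on the background thread.
            if (work.Wait(TimeSpan.FromSeconds(20)))
            {
                result = work.Result;
                return true;
            }

            // Timed out: ask the query to stop and tell the user to run it out of hours.
            cts.Cancel();
            result = null;
            return false;
        }
    }
}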

Related

What happens if I run WAITFOR from C# and the .NET app crashes?

I have a C# app that executes SQL Server commands sequentially, in a single batch, using Entity Framework. Something like this:
Database.ExecuteSqlCommand(@"Insert into tbl1...;
    WAITFOR DELAY '00:00:02.000';
    Delete from tbl1 where...");
One of the commands is WAITFOR, which requires the DB to wait for about 2 secs before continuing to the next statement.
What will happen if the app crashes while the WAITFOR is running? Will SQL Server still wait for the defined time and then execute the remaining commands, or will it stop in the middle?
Thanks!
To answer your question, I believe the client would have sent the entire batch to the server, and if the client crashes the server will still execute the commands until a specific KILL command is issued.
One approach is to put the code into a stored procedure and simply call the procedure. That way all the code will execute once it has been called.
If your purpose is to delete the record even if the client crashes, then you could try a SQL Agent job approach as suggested. This way your client is not waiting until the deletion is done (unless that is what you want). For this to work you would need a record-inserted datetime in the table, so that you can scan the table every two minutes and delete rows based on that insert datetime.
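As a sketch of that scan-and-delete idea, something the scheduled job could run every couple of minutes; tbl1 and InsertedAt are assumed names, not columns from the original post:

using System.Data.SqlClient;

public static class Tbl1Cleanup
{
    public static void DeleteExpiredRows(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "DELETE FROM tbl1 WHERE InsertedAt < DATEADD(MINUTE, -2, GETDATE());", conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery(); // removes rows older than the two-minute window
        }
    }
}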
If you want a more complex but more reliable solution, you could use Service Broker, where you queue the record to be deleted and a separate process deletes it. Service Broker might be overkill for this simple task, but it depends on how robust you want the deletion to be.

Is it safe to perform a long action in a separate thread?

I'm dealing with a CSV file that's being imported client-side.
This CSV is supposed to contain information that will be used to perform an update on one of the tables in my company's database.
My C# function processes the file, looking for errors, and, if no errors are found, it sends a bunch of update commands (files usually vary from 50 to 100,000 lines).
Until now, I was performing the update in the same thread, executing it line by line, but it was getting a little slow depending on the file, so I chose to send all the SQL to an Azure SQL Queue (a service that receives lots of "messages" and runs the SQL code against the database), so that the client wouldn't have to wait as long for the action to be performed.
It got a little faster, but it still takes a long time (due to the requests to the Azure SQL Queue). So I found that putting that action in a separate thread worked: it sent all the SQL to my Azure SQL Queue without the client waiting.
I got a little worried about it, though. Is it really safe to perform long actions in separate threads? Is it reliable?
A second thread is 100% like the main thread that you're used to working with. I wish I had an authoritative reference at hand, but this is such a common practice that people don't write those anymore...
So, YES, off-loading the work to a second thread is safe, and most would consider it the recommended way to go about it.
Edit 1
Ok, if your thread is running under IIS, you need to register it or it will die, because once the request/response cycle finishes IIS will kill it...
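One way to do that registration (a sketch, and only available on .NET 4.5.2 or later) is HostingEnvironment.QueueBackgroundWorkItem; sendBatchToQueueAsync below is a placeholder for whatever code pushes the SQL to the Azure queue:

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Web.Hosting;

public static class BackgroundImport
{
    public static void Queue(Func<CancellationToken, Task> sendBatchToQueueAsync)
    {
        // ASP.NET tracks this work item and briefly delays app-domain shutdown
        // until it finishes, instead of killing it with the request.
        HostingEnvironment.QueueBackgroundWorkItem(sendBatchToQueueAsync);
    }
}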

multi threading from multiple machines

I have researched a lot and I haven't found anything that meets my needs. I'm hoping someone from SO can offer some insight into this.
I have an application where the expected load is thousands of jobs per customer, and I can have hundreds of customers. Currently it is 50 customers with close to 1,000 jobs each. These jobs are time sensitive (scheduled by the customer) and each can run for up to 15 minutes.
In order to scale and meet the schedules, I'm planning to run this as a multi-threaded process on a single server. So far so good. But the business wants to scale further (as needed) by adding more servers into the mix. Currently, when jobs become ready in the database, a console application picks up the first 500 and uses the Task Parallel Library to spawn 10 threads, then waits until they are complete. I can't scale this to another server because that one could pick up the same records. I can't simply mark the db record as being processed, because if the application crashes on one server the job will be left in limbo.
I could use a message queue and have multiple machines pick from it. The problem is that the queue has to be transactional to handle crashes. MSMQ only supports MS DTC transactions when a database is involved, and I'm not really comfortable with DTC transactions, especially with multiple threads and multiple machines. Too much maintenance and setup, and possibly unknown issues.
Is SQL Service Broker a good approach instead? Has anyone done something like this in a production environment? I also want to keep the transactions short (a job could run for 15-20 minutes, mostly streaming data from a service). The only reason I'm using a transaction is to preserve the message integrity of the queue. I need the job to be re-picked (re-appear in the queue) if it crashes.
Any words of wisdom?
Why not have an application receive the jobs and insert them into a table that serves as the job queue? Each worker process can then pick up a set of jobs and set their status to processing, complete the work, and set the status to done. Other info, such as the name of the server that processed each job and the start and end timestamps, could also be logged. Moreover, instead of using multiple threads, you could use independent worker processes to make your programming easier.
[EDIT]
SQL Server supports record-level locking, and lock escalation can also be prevented; see Is it possible to force row level locking in SQL Server?. Using such a mechanism, you can have your worker processes take exclusive locks on the jobs being processed until they are done or crash (thereby releasing the lock).
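A rough sketch of that locking pattern; the Jobs table and its columns are assumed names. Each worker claims a row under UPDLOCK/ROWLOCK while READPAST lets other machines skip rows that are already claimed, and a crash rolls the transaction back so the job reappears for another worker:

using System.Data.SqlClient;

public static class JobWorker
{
    public static void ClaimAndProcessOne(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            {
                object jobId;
                using (var claim = new SqlCommand(@"
                    SELECT TOP (1) JobId
                    FROM   Jobs WITH (UPDLOCK, ROWLOCK, READPAST)
                    WHERE  Status = 'Ready'
                    ORDER  BY ScheduledAt;", conn, tx))
                {
                    jobId = claim.ExecuteScalar(); // null when nothing is ready
                }

                if (jobId == null) { tx.Rollback(); return; }

                // ... do the (possibly 15-minute) work here; if this process crashes,
                // the transaction rolls back and the lock is released, so another
                // worker can pick the job up again ...

                using (var done = new SqlCommand(
                    "UPDATE Jobs SET Status = 'Done', CompletedAt = GETUTCDATE() WHERE JobId = @id;",
                    conn, tx))
                {
                    done.Parameters.AddWithValue("@id", jobId);
                    done.ExecuteNonQuery();
                }
                tx.Commit();
            }
        }
    }
}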

how to synchronize near realtime reads from a sql server table

We have a reporting app that needs to update its charts as the data gets written to the corresponding table (the report is based on just one table). Currently we keep the last read sessionid + rowid (a unique combo) in memory, and a polling timer does a select where rowid > the one we have in memory, to pick up the latest rows added. The timer runs every second or so and the fast SQL reader does its job well. So far so good.
However, I feel this is not optimal, because sometimes there are pauses in the data writes by design (a user clicking the pause button on the system that writes the data, etc.). Meanwhile our timer keeps hitting the db and does not get any new rows. No errors or anything. How is this situation normally handled? The app that writes the data is separate from the reporting app, and the two apps run on different machines.
Bottom line: how do I get data into a C# app as and when it is written into a SQL Server table, without polling unnecessarily? Thank you.
SQL Server has the capability to notify a waiting application for changes, see The Mysterious Notification. This is how SqlDependency works. But this will only work up to a certain threshold of data change rate. If your data changes too frequently then the cost of setting up a query notification just to be immediately invalidated by receiving the notification is too much. For really high end rates of changes the best place is to notify the application directly from the writer, usually achieved via some forms of a pub-sub infrastructure.
You could also attempt a mixed approach: poll for changes in your display application, and only set up a query notification if there are no changes. This way you avoid the cost of constantly setting up Query Notifications when the rate of changes is high, but you also get the benefits of not polling once the writes settle down.
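A minimal SqlDependency sketch, assuming a table named dbo.ReportRows with SessionId/RowId columns; query notifications also require a broker-enabled database and a query that follows the notification rules (explicit column list, two-part table name):

using System;
using System.Data.SqlClient;

public class NewRowWatcher
{
    private readonly string _connectionString;

    public NewRowWatcher(string connectionString)
    {
        _connectionString = connectionString;
        SqlDependency.Start(_connectionString); // once per application domain
    }

    public void Watch()
    {
        using (var conn = new SqlConnection(_connectionString))
        using (var cmd = new SqlCommand(
            "SELECT SessionId, RowId FROM dbo.ReportRows;", conn))
        {
            var dependency = new SqlDependency(cmd);
            dependency.OnChange += (s, e) =>
            {
                // The notification fired: re-read the new rows and re-subscribe,
                // because each notification is one-shot.
                Watch();
            };

            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read()) { /* update the chart */ }
            }
        }
    }
}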
Unfortunately, the only 'proper' way is to poll; however, you can reduce the cost of this polling by having SQL Server wait in a loop (make sure you WAITFOR something like 30 ms on each loop pass) until data is available or a set time period elapses (e.g. 10 s). This is commonly used when writing SQL pseudo-queues.
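A hedged sketch of that idea (dbo.ReportRows, RowId and SessionId are assumed names): the client issues one call and the batch itself waits in 30 ms steps on the server until new rows appear or a 10-second deadline passes:

using System.Data.SqlClient;

public static class PseudoQueuePoller
{
    private const string WaitForRowsSql = @"
DECLARE @deadline DATETIME = DATEADD(SECOND, 10, GETDATE());
WHILE NOT EXISTS (SELECT 1 FROM dbo.ReportRows WHERE RowId > @lastRowId)
      AND GETDATE() < @deadline
BEGIN
    WAITFOR DELAY '00:00:00.030'; -- roughly 30 ms per loop pass
END
SELECT SessionId, RowId FROM dbo.ReportRows WHERE RowId > @lastRowId;";

    public static SqlDataReader WaitForNewRows(SqlConnection openConnection, long lastRowId)
    {
        var cmd = new SqlCommand(WaitForRowsSql, openConnection);
        cmd.CommandTimeout = 30; // longer than the in-batch deadline
        cmd.Parameters.AddWithValue("@lastRowId", lastRowId);
        return cmd.ExecuteReader();
    }
}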
You could use extended procs, but that is fragile; or you could drop messages into MSMQ.
If your reporting application is running on a single server then you can have the application that is writing the data to SQL Server also send a message to the reporting app letting it know that new data is available.
However, having your application connect to the server to see if new records have been added is the most common way of doing it. As long as you do the polling on a background thread, it shouldn't affect the performance of your application at all.
You will need to push the event out of the database into the realm of your application.
The application will need to listen for the message (you will need to decide what listening means: what port, what protocol, what format, etc.).
The database will send the message, based on the event, through a trigger (you will need to look up how to call external application logic from triggers).

SQL Design: Big table, thread access serialization

I have one BIG table (90k rows, roughly 60 MB) which holds info about free room capacities for about 50 hotels. This table has very few updates/inserts per hour.
My application sends async requests to this table (and joined tables) at most 30 times per second.
When I start 30 threads (with the default AppPool class in .NET 3.5 C#) at one time (each with a random valid SQL query string), only a few (about 4) are processed asynchronously and the other threads wait. Why?
Is it because of SQL Server 2008 table locking, or because of the .NET side? Or something else?
If it is a SQL problem, would it help if I split this big table into one table per hotel?
My goal is to have at least 10 threads served at a time.
This table is tiny. It doesn't even qualify as a "medium-sized" table. It's trivial.
You could be full-table-scanning it 30 times per second, or copying the whole thing into RAM, and no server would be the slightest bit bothered.
If your data fits in RAM, databases are fast. If you aren't seeing that, you're doing something REALLY WRONG. Therefore I also think the problems are all on the client side.
It is more than likely on the .NET side. If it were table locking, more threads would be processing, but they would be waiting on their queries to return. If I remember correctly, there's a property on the thread pool that controls how many actual threads it creates at once. If there are more pending work items than that number, they get in line and wait for running threads to finish. Check that.
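For reference, the thread-pool limits the answer is referring to can be inspected and raised like this (a sketch; tune the numbers to your own workload before relying on them):

using System;
using System.Threading;

public static class PoolDiagnostics
{
    public static void Print()
    {
        int minWorkers, minIo, maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);

        Console.WriteLine("Min: " + minWorkers + "/" + minIo +
                          "  Max: " + maxWorkers + "/" + maxIo +
                          "  Available: " + freeWorkers + "/" + freeIo);

        // Raising the minimum lets ~30 workers start immediately instead of
        // being injected gradually by the pool's ramp-up heuristic.
        ThreadPool.SetMinThreads(30, minIo);
    }
}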
Have you tried changing the transaction isolation level?
Even when reading from a table, SQL Server will take locks.
Try setting the isolation level to READ UNCOMMITTED and see if that improves the situation,
but be advised that you may then read 'dirty' data; make sure you understand the ramifications if this is the solution.
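A small sketch of that suggestion (the RoomCapacity table and its columns are made-up names); a NOLOCK table hint on the query achieves the same effect:

using System.Data;
using System.Data.SqlClient;

public static class DirtyReader
{
    public static void ReadCapacities(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            // Reads under READ UNCOMMITTED take no shared locks, but may return dirty data.
            using (var tx = conn.BeginTransaction(IsolationLevel.ReadUncommitted))
            using (var cmd = new SqlCommand(
                "SELECT HotelId, RoomType, FreeRooms FROM dbo.RoomCapacity;", conn, tx))
            {
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read()) { /* use the row */ }
                }
                tx.Commit();
            }
        }
    }
}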
Rather than ask, measure. Each SQL query actually submitted by your application creates a request on the server, and the sys.dm_exec_requests DMV shows the state of each request. When a request is blocked, the wait_type column shows a non-empty value. You can judge from this whether your requests are blocked or not, and if they are blocked you'll also know the reason why.
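For instance, something like this would surface the blocked requests and their wait types while your 30 queries are in flight (a sketch; the same SELECT can be run from SSMS, and viewing the DMV requires VIEW SERVER STATE permission):

using System;
using System.Data.SqlClient;

public static class RequestMonitor
{
    public static void PrintRequests(string connectionString)
    {
        const string sql = @"
SELECT session_id, status, wait_type, wait_time, blocking_session_id
FROM   sys.dm_exec_requests
WHERE  session_id > 50;"; // skip system sessions

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine("{0}  {1}  {2}  {3} ms  blocked by {4}",
                        reader["session_id"], reader["status"], reader["wait_type"],
                        reader["wait_time"], reader["blocking_session_id"]);
                }
            }
        }
    }
}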
