How to manually lock and unlock a table so that inserts are prevented - C#

The code below works but does not prevent a different user from inserting a row and thus creating a duplicate ID.
The IDs for the table being updated are auto-incremented and assigned. In the code below I do the following:
Get the next available ID (nextID)
Set the ID of each entity to nextID++
Bulk insert
How do I lock the table so that another user cannot insert while the three tasks above are running? I have seen similar questions that propose setting the isolation level to READ COMMITTED, but I don't think that will lock the table at the time I am getting the nextID.
public void BulkInsertEntities(List<Entity> entities)
{
    if (entities == null)
        throw new ArgumentNullException(nameof(entities));

    string tableName = "Entities";

    // -----------------------------------------------------------------
    // Prevent other users from inserting (but not reading) here
    // -----------------------------------------------------------------
    long lastID = GetLastID(tableName);
    entities.ForEach(x => x.ID = lastID++);

    using (SqlConnection con = new SqlConnection(db.Database.GetDbConnection().ConnectionString))
    {
        con.Open();
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(con.ConnectionString, SqlBulkCopyOptions.KeepIdentity))
        {
            bulkCopy.DestinationTableName = tableName;
            DataTable tbl = DataUtil.ToDataTable<Entity>(entities);
            foreach (DataColumn col in tbl.Columns)
                bulkCopy.ColumnMappings.Add(col.ColumnName, col.ColumnName);
            bulkCopy.WriteToServer(tbl);
        }
    }

    // ---------------------------
    // Allow other users to insert
    // ---------------------------
}
protected long GetLastID(string tableName)
{
    long lastID = 0;
    using (var command = db.Database.GetDbConnection().CreateCommand())
    {
        command.CommandText = $"SELECT IDENT_CURRENT('{tableName}') + IDENT_INCR('{tableName}')";
        db.Database.OpenConnection();
        lastID = Convert.ToInt64(command.ExecuteScalar());
    }
    return lastID;
}

For identity-like functionality with more flexibility, you can create a named sequence:
create sequence dbo.MySequence as int
...and have a default constraint on the table: default(next value for dbo.MySequence).
The nice thing about this is that you can "burn" IDs and send them to clients so they have a key they can put into their data...and then, when the data comes in pre-populated, no harm, no foul. It takes a little more work than identity fields, but it's not too terrible. By "burn" I mean you can get a new ID anytime by calling next value for dbo.MySequence anywhere you like. If you hold onto that value, you know it's not going to be assigned to the table; the table will get the next value after yours. You can then, at your leisure, insert a row with the value you got and held, knowing it's a legit key.
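A minimal C# sketch of using such a sequence from the application (assuming the dbo.MySequence sequence above exists; the helper name is mine, not from any answer):
// Returns the next value from the sequence; the table will never hand out this value again.
// Assumes an open SqlConnection and the dbo.MySequence sequence created above.
private long GetNextSequenceValue(SqlConnection con)
{
    using (var cmd = new SqlCommand("SELECT NEXT VALUE FOR dbo.MySequence;", con))
    {
        return Convert.ToInt64(cmd.ExecuteScalar());
    }
}
If you need a whole block of IDs at once, sp_sequence_get_range can reserve a contiguous range in a single call.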
There is a feature in SQL Server called application locks. I've only rarely seen it used, but your example might be suitable. Basically, the idea is that you'd put triggers on the tables that start by testing for an outstanding app lock:
if ( applock_test( 'public', 'MyLock', 'Exclusive', 'Session' ) = 0 )
begin
    -- 0 means the lock could not be granted, i.e. someone else is holding it
    raiserror( ... )
    return
    --> or wait and retry
end
...and the long-running process that can't be interrupted gets the applock at the beginning and releases it at the end:
declare @rc int

exec @rc = sp_getapplock @DbPrincipal = 'public', @Resource = 'MyLock',
                         @LockMode = 'Exclusive', @LockOwner = 'Session'
if ( @rc >= 0 )  -- 0 = granted immediately, 1 = granted after waiting; negative values mean failure
begin
    --> got the lock, do the damage...
    --> and then, after carefully handling the edge cases,
    --> and making sure we don't skip the release...
    exec sp_releaseapplock @Resource = 'MyLock', @LockOwner = 'Session', @DbPrincipal = 'public'
end
There are lots of variations: session-based locks, which can be auto-released when a session ends (beware of connection pooling); timeouts; multiple lock modes (shared, exclusive, etc.); and scoped locks (which may not apply to privileged db users).
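To tie this back to the original question, here is a rough C# sketch (mine, not the answerer's code) that takes a transaction-owned app lock on the same connection used for the ID lookup and the bulk copy, so any other writer that cooperates via the same lock (or is checked by a trigger like the one above) cannot insert in between. The lock name 'EntitiesInsert' and the 10-second timeout are assumptions:
// Requires System.Data and System.Data.SqlClient.
public void BulkInsertEntitiesWithAppLock(List<Entity> entities, string connectionString)
{
    using (var con = new SqlConnection(connectionString))
    {
        con.Open();
        using (var tran = con.BeginTransaction())
        {
            // Take an exclusive, transaction-owned app lock; it is released at commit/rollback.
            using (var lockCmd = new SqlCommand("sp_getapplock", con, tran) { CommandType = CommandType.StoredProcedure })
            {
                lockCmd.Parameters.AddWithValue("@Resource", "EntitiesInsert");   // assumed lock name
                lockCmd.Parameters.AddWithValue("@LockMode", "Exclusive");
                lockCmd.Parameters.AddWithValue("@LockOwner", "Transaction");
                lockCmd.Parameters.AddWithValue("@LockTimeout", 10000);           // milliseconds, assumed
                var rc = lockCmd.Parameters.Add("@rc", SqlDbType.Int);
                rc.Direction = ParameterDirection.ReturnValue;
                lockCmd.ExecuteNonQuery();
                if ((int)rc.Value < 0)
                    throw new TimeoutException("Could not acquire the 'EntitiesInsert' app lock.");
            }

            // Read the next ID under the lock.
            long nextId;
            using (var idCmd = new SqlCommand("SELECT IDENT_CURRENT('Entities') + IDENT_INCR('Entities');", con, tran))
            {
                nextId = Convert.ToInt64(idCmd.ExecuteScalar());
            }
            entities.ForEach(x => x.ID = nextId++);

            // Bulk copy on the same connection and transaction (column mappings omitted for brevity).
            using (var bulkCopy = new SqlBulkCopy(con, SqlBulkCopyOptions.KeepIdentity, tran))
            {
                bulkCopy.DestinationTableName = "Entities";
                bulkCopy.WriteToServer(DataUtil.ToDataTable<Entity>(entities));
            }

            tran.Commit(); // app lock released here
        }
    }
}
Note that, unlike a real table lock, this only keeps out writers that either take the same app lock or are blocked by a trigger such as the one sketched earlier; it does not stop code that inserts without checking the lock.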

Why does my SQL update for 20,000 records take over 5 minutes?

I have a piece of C# code which updates two specific columns for ~1000x20 records in a database on localhost. As far as I know (though I am really far from being a database expert), it should not take long, but it takes more than 5 minutes.
I tried SQL transactions, with no luck. SqlBulkCopy seems a bit of an overkill, since it's a large table with dozens of columns, and I only have to update one or two columns for a set of records, so I would like to keep it simple. Is there a better approach to improve efficiency?
The code itself:
public static bool UpdatePlayers(List<Match> matches)
{
    using (var connection = new SqlConnection(Database.myConnectionString))
    {
        connection.Open();
        SqlCommand cmd = connection.CreateCommand();
        foreach (Match m in matches)
        {
            cmd.CommandText = "";
            foreach (Player p in m.Players)
            {
                // Some player specific calculation, which takes almost no time.
                p.Morale = SomeSpecificCalculationWhichMilisecond();
                p.Condition = SomeSpecificCalculationWhichMilisecond();
                cmd.CommandText += "UPDATE [Players] SET [Morale] = @morale, [Condition] = @condition WHERE [ID] = @id;";
                cmd.Parameters.AddWithValue("@morale", p.Morale);
                cmd.Parameters.AddWithValue("@condition", p.Condition);
                cmd.Parameters.AddWithValue("@id", p.ID);
            }
            cmd.ExecuteNonQuery();
        }
    }
    return true;
}
Updating 20,000 records one at a time is a slow process, so taking over 5 minutes is to be expected.
From your query, I would suggest putting the data into a temp table, then joining the temp table in the update. This way the table being updated only has to be scanned once, updating all values in one pass.
Note: it could still take a while to do the update if you have indexes on the fields you are updating and/or there is a large amount of data in the table.
Example update query:
UPDATE P
SET [Morale] = TT.[Morale], [Condition] = TT.[Condition]
FROM [Players] AS P
INNER JOIN #TempTable AS TT ON TT.[ID] = P.[ID];
Populating the temp table
How to get the data into the temp table is up to you. I suspect you could use SqlBulkCopy but you might have to put it into an actual table, then delete the table once you are done.
If possible, I recommend putting a Primary Key on the ID column in the temp table. This may speed up the update process by making it faster to find the related ID in the temp table.
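A rough C# sketch of that approach (my own illustration, not the answerer's code): it assumes everything runs on a single open SqlConnection named connection so the #temp table stays in scope, that a DataTable dt holds the ID/Morale/Condition values, and that int is the right type for the two columns (adjust to your schema):
// 1) Create the temp table on the same session that will use it.
using (var create = new SqlCommand(
    "CREATE TABLE #TempTable (ID int PRIMARY KEY, Morale int, Condition int);", connection))
{
    create.ExecuteNonQuery();
}

// 2) Bulk copy the new values into the temp table.
using (var bulk = new SqlBulkCopy(connection))
{
    bulk.DestinationTableName = "#TempTable";
    bulk.WriteToServer(dt);
}

// 3) One set-based update joining the temp table, as in the answer's example query.
using (var update = new SqlCommand(
    "UPDATE P SET [Morale] = TT.[Morale], [Condition] = TT.[Condition] " +
    "FROM [Players] AS P INNER JOIN #TempTable AS TT ON TT.[ID] = P.[ID];", connection))
{
    update.ExecuteNonQuery();
}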
Minor improvements:
use a StringBuilder for the command text
ensure your parameter names are actually unique
clear your parameters before the next use
depending on how many players are in each match, batch N commands together rather than one per match (see the sketch after this list)
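A sketch of those minor improvements combined (illustrative only; allPlayers stands in for all players across the matches, and the batch size of 50 is an assumption):
// Builds one parameterized command per batch of rows, with unique parameter names,
// instead of re-appending to CommandText and re-adding duplicate parameter names.
// Requires System.Text for StringBuilder.
const int batchSize = 50;   // assumed; tune for your workload
var sb = new StringBuilder();
var cmd = connection.CreateCommand();
int i = 0;

foreach (Player p in allPlayers)
{
    p.Morale = SomeSpecificCalculationWhichMilisecond();
    p.Condition = SomeSpecificCalculationWhichMilisecond();

    sb.AppendLine($"UPDATE [Players] SET [Morale] = @morale{i}, [Condition] = @condition{i} WHERE [ID] = @id{i};");
    cmd.Parameters.AddWithValue($"@morale{i}", p.Morale);
    cmd.Parameters.AddWithValue($"@condition{i}", p.Condition);
    cmd.Parameters.AddWithValue($"@id{i}", p.ID);
    i++;

    if (i == batchSize)
    {
        cmd.CommandText = sb.ToString();
        cmd.ExecuteNonQuery();
        cmd.Parameters.Clear();   // clear for the next batch
        sb.Clear();
        i = 0;
    }
}

if (i > 0)   // flush the final partial batch
{
    cmd.CommandText = sb.ToString();
    cmd.ExecuteNonQuery();
}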
Bigger improvement:
use a table-valued parameter and a MERGE SQL statement, which should look something like this (untested):
CREATE TYPE [MoraleUpdate] AS TABLE (
    [Id] ...,
    [Condition] ...,
    [Morale] ...
)
GO

MERGE [dbo].[Players] AS [Target]
USING @Updates AS [Source]
ON [Target].[Id] = [Source].[Id]
WHEN MATCHED THEN
    UPDATE SET [Morale] = [Source].[Morale],
               [Condition] = [Source].[Condition];
DataTable dt = new DataTable();
dt.Columns.Add("Id", typeof(...));
dt.Columns.Add("Morale", typeof(...));
dt.Columns.Add("Condition", typeof(...));

foreach(...)
{
    dt.Rows.Add(p.Id, p.Morale, p.Condition);
}

SqlParameter sqlParam = cmd.Parameters.AddWithValue("@Updates", dt);
sqlParam.SqlDbType = SqlDbType.Structured;
sqlParam.TypeName = "dbo.[MoraleUpdate]";

cmd.ExecuteNonQuery();
You could also implement a DbDataReader to stream the values to the server while you are calculating them.
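A related sketch of that streaming idea (mine, and untested like the above): instead of a DataTable or DbDataReader, a structured parameter can also take an IEnumerable of SqlDataRecord, so rows are streamed to the server as they are produced. The column order and int types are assumed to match the MoraleUpdate type above, and allPlayers stands in for all players across the matches:
// Requires System.Data, System.Data.SqlClient and Microsoft.SqlServer.Server.
static IEnumerable<SqlDataRecord> ToRecords(IEnumerable<Player> players)
{
    var meta = new[]
    {
        new SqlMetaData("Id", SqlDbType.Int),          // types assumed; match the TYPE definition
        new SqlMetaData("Condition", SqlDbType.Int),
        new SqlMetaData("Morale", SqlDbType.Int)
    };

    foreach (var p in players)
    {
        p.Morale = SomeSpecificCalculationWhichMilisecond();
        p.Condition = SomeSpecificCalculationWhichMilisecond();

        var rec = new SqlDataRecord(meta);
        rec.SetInt32(0, p.ID);
        rec.SetInt32(1, p.Condition);
        rec.SetInt32(2, p.Morale);
        yield return rec;   // streamed to the server as the parameter is enumerated
    }
}

// Usage: pass the enumerable instead of a DataTable.
SqlParameter streamParam = cmd.Parameters.AddWithValue("@Updates", ToRecords(allPlayers));
streamParam.SqlDbType = SqlDbType.Structured;
streamParam.TypeName = "dbo.[MoraleUpdate]";
cmd.ExecuteNonQuery();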

How to get better performance from LINQ-to-SQL large dataset update

I have a c# console application that is updating a database with about 320,000 records. Basically it is encrypting a password in each record in a loop, then calling DatabaseContext.SubmitChanges(). The "UPDATE" part of the code takes about 20 seconds. I had to CTRL-C the app because it's taking over 15 minutes to do the "SubmitChanges" part: this is part of a time-sensitive system that should not be down for more than a couple minutes.
I ran SQL Profiler and I'm seeing queries like this for each update:
exec sp_executesql N'UPDATE [dbo].[PointRecord]
SET [PtPassword] = @p19
WHERE ([ID] = @p0) AND ([PtLocation] = @p1) AND ([PtIPAddress] = @p2) AND ([PtPort] = @p3) AND ([PtUsername] = @p4) AND ([PtPassword] = @p5) AND ([PtGWParam1] = @p6) AND ([PtGWParam2] = @p7) AND ([PtGWParam3] = @p8) AND ([PtGWParam4] = @p9) AND ([PtTag] = @p10) AND ([PtCapture] = @p11) AND ([PtGroup] = @p12) AND ([PtNumSuccess] = @p13) AND ([PtNumFailure] = @p14) AND ([PtControllerType] = @p15) AND ([PtControllerVersion] = @p16) AND ([PtAssocXMLGroupID] = @p17) AND ([PtErrorType] IS NULL) AND ([PtPollInterval] = @p18)',N'@p0 int,@p1 nvarchar(4000),@p2 nvarchar(4000),@p3 nvarchar(4000),@p4 nvarchar(4000),@p5 nvarchar(4000),@p6 nvarchar(4000),@p7 nvarchar(4000),@p8 nvarchar(4000),@p9 nvarchar(4000),@p10 nvarchar(4000),@p11 int,@p12 nvarchar(4000),@p13 int,@p14 int,@p15 nvarchar(4000),@p16 nvarchar(4000),@p17 int,@p18 int,@p19 nvarchar(4000)',@p0=296987,@p1=N'1234 Anytown USA',@p2=N'10.16.31.20',@p3=N'80',@p4=N'username1',@p5=N'password1',@p6=N'loadmon.htm?PARM2=21',@p7=N'>Operating Mode',@p8=N'',@p9=N'',@p10=N'1234 Anytown USA\HLTH SERVICE LTS\Operating Modeloadmon',@p11=0,@p12=N'1234 Anytown USA',@p13=0,@p14=0,@p15=N'DeviceA',@p16=N'3.5.0.2019.0219',@p17=309,@p18=15,@p19=N'hk+MUoeVMG69pOB3DHYB8g=='
As you can see, the "WHERE" part is asking for EVERY SINGLE FIELD to match, even though this is an indexed table with the unique primary key "ID". This is really time-consuming. Is there any way to get this to only use "WHERE ID=[value]"?
I understand now that checking every field is a requirement of concurrency checking in EF. To bypass, methods outside of LINQ are required. I ended up using a variation of what Mr. Petrov and Mr. Harvey suggested, using ExecuteCommand since I am updating the database, not querying for data. Here is sample code, in case it can help others with a similar issue.
It uses LINQ to get the records to update and the record count for user feedback.
It uses ExecuteCommand to update the records. I am actually updating three tables (only one is shown in the sample below), hence the use of a transaction object.
The EncryptPassword method is not shown. It is what I use to update the records. You should replace that with whatever update logic suits your needs.
static void Main(string[] args)
{
    DatabaseHelpers.Initialize();
    if (DatabaseHelpers.PasswordsEncrypted)
    {
        Console.WriteLine("DatabaseHelpers indicates that passwords are already encrypted. Exiting.");
        return;
    }
    // Note that the DatabaseHelpers.DbContext is in a helper library,
    // it is a copy of the auto-generated EF 'DataClasses1DataContext'.
    // It has already been opened using a generated connection string
    // (part of DatabaseHelpers.Initialize()).
    // I have altered some of the variable names to hide confidential information.
    try
    {
        // show user what's happening
        Console.WriteLine("Encrypting passwords...");
        // flip switch on encryption methods
        DatabaseHelpers.PasswordsEncrypted = true;
        int recordCount = 0;
        // Note: Using LINQ to update the records causes an unacceptable delay because of the concurrency checking
        // where the UPDATE statement (at SubmitChanges) checks EVERY field instead of just the ID
        // and we don't care about that!
        // We have to set up an explicit transaction in order to use context.ExecuteCommand statements
        // start transaction - all or nothing
        DatabaseHelpers.DbContext.Transaction = DatabaseHelpers.DbContext.Connection.BeginTransaction();
        // update non-null and non-empty passwords in groups
        Console.Write("Updating RecordGroups");
        List<RecordGroup> recordGroups = (from p in DatabaseHelpers.DbContext.RecordGroups
                                          where p.RecordPassword != null && p.RecordPassword != string.Empty
                                          select p).ToList();
        recordCount = recordGroups.Count;
        foreach (RecordGroup rGroup in recordGroups)
        {
            // bypass LINQ-to-SQL
            DatabaseHelpers.DbContext.ExecuteCommand("UPDATE RecordGroup SET RecordPassword={0} WHERE ID={1}", DatabaseHelpers.EncryptPassword(rGroup.RecordPassword), rGroup.ID);
            Console.Write('.');
        }
        // show user what's happening
        Console.WriteLine("\nCommitting transaction...");
        DatabaseHelpers.DbContext.Transaction.Commit();
        // display results
        Console.WriteLine($"Updated {recordCount} RecordGroup passwords. Exiting.");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"\nThere was an error executing the password encryption process: {ex}");
        DatabaseHelpers.DbContext.Transaction.Rollback();
    }
}

Deleting large data without transaction logs in SQL Azure

I want to frequently delete a large amount of data from an Azure SQL table using the code below, but when deleting records, transaction logs are created, which consume database storage. How can we perform the deletion without the transaction logs consuming database storage?
Task.Run(async () =>
{
    long maxId = crumbManager.GetMaxId(fromDate, tenantId);
    var startingTime = DateTime.UtcNow;
    while (!cancellationToken.IsCancellationRequested && maxId > 0 && startingTime.AddHours(2) > DateTime.UtcNow)
    {
        try
        {
            var query = $@"delete top(10000) from Crumbs where CrumbId <= @maxId and TenantId = @tenantId";
            using (var con = new SqlConnection(connection))
            {
                con.Open();
                using (var cmd = new SqlCommand(query, con))
                {
                    cmd.Parameters.AddWithValue("@maxId", maxId);
                    cmd.Parameters.AddWithValue("@tenantId", tenantId);
                    cmd.CommandTimeout = 200;
                    var affected = cmd.ExecuteNonQuery();
                    if (affected == 0)
                    {
                        break;
                    }
                }
            }
        }
        catch (Exception ex)
        {
        }
        finally
        {
            await Task.Delay(TimeSpan.FromSeconds(5), cancellationToken.Token);
        }
    }
});
You can't. Databases make changes using a transaction log so that they can handle failures in the middle of a transaction, so even delete operations use space in the transaction log. Now, the transaction log only takes space (when using full recovery, as SQL Azure does for user databases) until the next log backup operation. Those happen every few minutes today, so the time during which log space is required on disk is minimal.
There are some operations which are minimally logged and use less space than row-by-row deletes. For example, if you truncate the table, or switch a partition out of a partitioned table (and then drop it), you generate much less log than deleting row-by-row. You would need to consider some design changes to your schema to enable this pattern, since you aren't just deleting all rows now.
Ultimately, you should focus on making sure that the operation you perform in SQL Azure is efficient. If you loop over a heap and delete K rows over and over, that can end up doing many full scans over the table instead of range scans. If you arrange the deletes so they can use range scans (for example, over an index on the key you filter by), even without any of the fancy truncate/partition approaches, you may be able to improve the performance of the system over what you have now.
Hope that helps explain how SQL works a bit.
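To illustrate the range-scan point from that answer (this sketch is mine, not part of the answer): if Crumbs has a clustered index on CrumbId, each batch can be limited to a key range so the delete seeks instead of scanning. Variable names follow the question's code; the 10000-wide key window is an assumption and covers key values, not an exact row count:
// Walks the clustered key upward in fixed-size windows so every batch is a narrow range seek.
long lastDeleted = 0;
while (lastDeleted < maxId && !cancellationToken.IsCancellationRequested)
{
    long upper = Math.Min(lastDeleted + 10000, maxId);
    using (var con = new SqlConnection(connection))
    using (var cmd = new SqlCommand(
        "delete from Crumbs where CrumbId > @lower and CrumbId <= @upper and TenantId = @tenantId;", con))
    {
        cmd.Parameters.AddWithValue("@lower", lastDeleted);
        cmd.Parameters.AddWithValue("@upper", upper);
        cmd.Parameters.AddWithValue("@tenantId", tenantId);
        con.Open();
        cmd.ExecuteNonQuery();
    }
    lastDeleted = upper;   // move the window regardless of how many rows the batch removed
}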
Try to use batching techniques to minimize log usage.
declare
    @batch_size int,
    @del_rowcount int = 1

set @batch_size = 100
set nocount on;

while @del_rowcount > 0
begin
    begin tran

    delete top (@batch_size)
    from dbo.LargeDeleteTest

    set @del_rowcount = @@rowcount
    print 'Delete row count: ' + cast(@del_rowcount as nvarchar(32))

    commit tran
end
Dropping any foreign keys, deleting the rows, and then recreating the foreign keys can also speed things up.

How to execute a method or DML in a foreach loop in parallel to accelerate execution time?

I have the following code, which queries a table and updates rows inside it, then executes an insert depending on the previous query:
private void button1_Click(object sender, EventArgs e)
{
    doDBDML();
}

private void doDBDML()
{
    using (ThreadingDBEntities db = new ThreadingDBEntities())
    {
        var rows = db.people.Where(x => x.id == null).ToList();
        foreach (person p in rows) // how to execute it in parallel?
        {
            p.id = p.personID; // update null value.

            // execute stored procedure which has an output parameter to get the id
            // (return existing id or insert new row and get its id).
            ObjectParameter outParam = new ObjectParameter("p_id", typeof(Int32));
            db.sp_getCompanyId(p.company, outParam);

            // add a new row; this depends on the current person object id and the company id
            // returned from the stored procedure.
            User_Company userComp = new User_Company();
            userComp.person_Id = p.personID;
            userComp.Company_Id = (Int32) outParam.Value;

            db.SaveChanges();
        }
    }
}
The stored procedure is:
CREATE PROCEDURE [dbo].[sp_getCompanyId]
    @p_company_Name nvarchar(255),
    @p_id int output
AS
    SELECT @p_id = id
    from Company
    where Company_name = @p_company_Name

    if @p_id is null
    begin
        begin transaction;
        insert into Company (company_name) values (@p_company_Name);
        select @p_id = SCOPE_IDENTITY();
        commit transaction;
    end;

    RETURN @p_id
This works fine, but the problems are:
- The query returns a huge number of rows, so the execution takes a long time. How can I accelerate it? How can I call the doDBDML method above so it executes in parallel or as a bulk operation?
- While executing the application on a huge number of rows, the UI doesn't respond. How can I solve this? By running it in the background or on a new thread?
My question is about a Windows Forms application, however I also want to know if the solution is good for ASP.NET.
As said by TomC and TheGeneral, there's probably something wrong with the solution you chose to tackle your problem.
If you want to check how much speed you would gain with simple parallelism, you could check out
Parallel.ForEach
https://msdn.microsoft.com/de-de/library/dd460720(v=vs.100).aspx
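A minimal sketch of what that could look like for the code above (mine, not the answerer's; it assumes personID is the entity key, re-loads each row in a thread-local context because EF contexts and tracked entities are not thread-safe, and the degree of parallelism is an arbitrary choice):
// Requires System.Threading.Tasks.
private void doDBDMLParallel()
{
    List<person> rows;
    using (var db = new ThreadingDBEntities())
    {
        rows = db.people.Where(x => x.id == null).ToList();
    }

    Parallel.ForEach(rows, new ParallelOptions { MaxDegreeOfParallelism = 4 }, detached =>
    {
        // One context per iteration: EF contexts are not thread-safe, so nothing is shared.
        using (var db = new ThreadingDBEntities())
        {
            // Re-load the row in this thread's own context (assumes personID is the key).
            var p = db.people.Single(x => x.personID == detached.personID);
            p.id = p.personID;

            ObjectParameter outParam = new ObjectParameter("p_id", typeof(Int32));
            db.sp_getCompanyId(p.company, outParam);

            // Mirrors the question's code; attach userComp to the context here if that is required.
            User_Company userComp = new User_Company();
            userComp.person_Id = p.personID;
            userComp.Company_Id = (Int32)outParam.Value;

            db.SaveChanges();
        }
    });
}
Even then, the duplicate-company race described below still applies, and the database itself can become the bottleneck, so measure before assuming parallelism helps.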
But also keep in mind that if you work with multiple threads, you could possibly create duplicates inside the database.
e.g.
Thread #1 - enters stored-procedure
Thread #1 - cannot find customer "customerA" -> start transaction for creating a new customer
Thread #2 - enters stored-procedure
Thread #2 - cannot find customer "customerA" -> start transaction
Thread #2 - finish transaction -> return id '42'
Thread #1 - finish transaction -> return id '666'
Now you have two customer datasets for the same customer
ID | Name
42 | customerA
666 | customerA

Loop to insert data in OleDb breaks when I try to execute more than 1 query

I'm working on a small offline C# application with an Access 2002 database (.mdb) and OleDb.
I have 2 tables where I need to insert data at the same time, one holding a foreign key of the other. So, to simplify let's say one table has 2 attributes: "idTable1" (auto-increment integer) and "number", and the other has 2 attributes: "idTable2" (auto-increment integer) and "fkTable1" (foreign key containing an integer value that matches an "idTable1" from table 1).
A foreach loop iterates over a collection and inserts each element in Table1. Then the idea is to use a SELECT @@Identity query on Table1 to get the auto-incrementing id field of the last record that was inserted, and insert that in Table2 as a foreign key.
I'm just trying the first part before I attempt to insert the foreign key: loop over a collection, insert each item in Table1 and get the idTable1 of the last inserted record. But whenever I try to execute SELECT @@Identity I get only 1 record in my database, even when the loop correctly iterates over all the collection items.
My code looks like this:
string queryInsertTable1 = "INSERT INTO Table1 (numero) VALUES (?)";
string queryGetLastId = "Select @@Identity";

using (OleDbConnection dbConnection = new OleDbConnection(strDeConexion))
{
    using (OleDbCommand commandStatement = new OleDbCommand(queryInsertTable1, dbConnection))
    {
        dbConnection.Open();
        foreach (int c in Collection)
        {
            commandStatement.Parameters.AddWithValue("", c);
            commandStatement.ExecuteNonQuery();
            commandStatement.CommandText = queryGetLastId;
            LastInsertedId = (int)commandStatement.ExecuteScalar();
        }
    }
}
If I comment out the last two lines:
commandStatement.CommandText = queryGetLastId;
LastInsertedId = (int)commandStatement.ExecuteScalar();
Then all records from Collection are correctly inserted in the DB. But as soon as I un-comment those, I get just 1 record inserted, while the value stored in "c" is the last element in the collection (so the loop worked fine).
I also tried calling commandStatement.Parameters.Clear() right after the commandStatement.ExecuteNonQuery() statement, but that makes no difference (and it shouldn't, but I still tried).
I don't want to make things complicated by using transactions and such, if I can avoid them, since this is a very simple, single-computer, offline and small application. So if anyone knows what I could do to make that code work, I'd be very grateful :)
Here, commandStatement.CommandText = queryGetLastId; actually changes your command from inserting to selecting the identity.
Thus, on the next iteration it will not insert anything, but will again select the identity; that's why you have only one record inserted into the DB.
I think it's better to have two separate commands, one for inserting and one for selecting the identity.
Also note: you're adding a new parameter to commandStatement on each iteration, so on iteration N it will have N parameters. Either clear the parameters before adding a new one, or add the parameter outside the loop and only change its value inside the loop.
using (OleDbConnection dbConnection = new OleDbConnection(strDeConexion))
{
    using (OleDbCommand commandStatement = new OleDbCommand(queryInsertTable1, dbConnection))
    using (OleDbCommand commandIdentity = new OleDbCommand(queryGetLastId, dbConnection))
    {
        commandStatement.Parameters.Add(new OleDbParameter());
        dbConnection.Open();
        foreach (int c in Collection)
        {
            commandStatement.Parameters[0].Value = c;
            commandStatement.ExecuteNonQuery();
            LastInsertedId = (int)commandIdentity.ExecuteScalar();
        }
    }
}
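Building on that answer, a hedged sketch of the next step you mentioned (inserting the id into Table2 as a foreign key); Table2 and fkTable1 are the simplified names from your description, and the extra command is my own addition:
string queryInsertTable2 = "INSERT INTO Table2 (fkTable1) VALUES (?)";

using (OleDbConnection dbConnection = new OleDbConnection(strDeConexion))
using (OleDbCommand commandStatement = new OleDbCommand(queryInsertTable1, dbConnection))
using (OleDbCommand commandIdentity = new OleDbCommand(queryGetLastId, dbConnection))
using (OleDbCommand commandTable2 = new OleDbCommand(queryInsertTable2, dbConnection))
{
    commandStatement.Parameters.Add(new OleDbParameter());
    commandTable2.Parameters.Add(new OleDbParameter());
    dbConnection.Open();

    foreach (int c in Collection)
    {
        commandStatement.Parameters[0].Value = c;
        commandStatement.ExecuteNonQuery();

        // @@Identity is scoped to this connection, so it returns the id of the row just inserted above.
        int lastInsertedId = (int)commandIdentity.ExecuteScalar();

        commandTable2.Parameters[0].Value = lastInsertedId;
        commandTable2.ExecuteNonQuery();
    }
}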
Once you change the CommandText, the first query is gone from the command, so it won't insert any data into the first table again. Reassign it and try again.
