Suppose I have an application that extracts some data from an internet site and adds it to a database. This application runs in multiple instances; each instance extracts data for a specific country.
Some of the data is linked to a master table called rounds, whose PK is an auto-increment field. My doubt comes from this code:
using (MySqlConnection connection = new DBConnection().Connect())
{
    using (MySqlCommand command = new MySqlCommand())
    {
        command.Connection = connection;
        command.CommandText = "INSERT IGNORE INTO competition_rounds (round_id, season_id, `name`) " +
                              "VALUES (@round_id, @season_id, @round_name)";
        command.Parameters.Add("@round_id", MySqlDbType.Int32).Value = round.Id;
        command.Parameters.Add("@season_id", MySqlDbType.Int32).Value = round.seasonId;
        command.Parameters.Add("@round_name", MySqlDbType.VarChar).Value = round.Name;
        command.ExecuteNonQuery();
        return Convert.ToInt32(command.LastInsertedId);
    }
}
The code above adds a new round to the rounds table, and this works well. But if I have multiple instances running, is it possible that both instances fire the same code and return the same id? E.g.:
instance 1 -> fire round insert -> return 3
instance 2 -> fire round insert -> return 3
where both instances have executed the same method at exactly the same time. Could this situation happen? Is it possible to prevent it? Should I create a GUID or a composite PK?
The client loads the LastInsertedId property from the OK_PACKET:
An OK packet is sent from the server to the client to signal
successful completion of a command. As of MySQL 5.7.5, OK packets are
also used to indicate EOF, and EOF packets are deprecated.
On the server side, from the documentation:
You can retrieve the most recent automatically generated
AUTO_INCREMENT value with the LAST_INSERT_ID() SQL function or the
mysql_insert_id() C API function. These functions are
connection-specific, so their return values are not affected by
another connection which is also performing inserts.
In other words, this kind of situation is accounted for (in any respectable DB system).
You'll be fine.
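To see the connection-specific behaviour yourself, here is a rough sketch (the connection string and the simplified rounds table with an AUTO_INCREMENT id are made up for illustration) that reads LAST_INSERT_ID() on two separate connections:
using System;
using MySql.Data.MySqlClient;

static void ShowLastInsertIdIsPerConnection(string connString)
{
    // Assumes a hypothetical table: CREATE TABLE rounds (id INT AUTO_INCREMENT PRIMARY KEY, `name` VARCHAR(50));
    using (var a = new MySqlConnection(connString))
    using (var b = new MySqlConnection(connString))
    {
        a.Open();
        b.Open();

        // Interleave two inserts from two different connections.
        new MySqlCommand("INSERT INTO rounds (`name`) VALUES ('A')", a).ExecuteNonQuery();
        new MySqlCommand("INSERT INTO rounds (`name`) VALUES ('B')", b).ExecuteNonQuery();

        // LAST_INSERT_ID() is evaluated per connection, so each call returns the id
        // generated by that connection's own INSERT, never the other connection's.
        long idA = Convert.ToInt64(new MySqlCommand("SELECT LAST_INSERT_ID()", a).ExecuteScalar());
        long idB = Convert.ToInt64(new MySqlCommand("SELECT LAST_INSERT_ID()", b).ExecuteScalar());
        Console.WriteLine("connection A got {0}, connection B got {1}", idA, idB); // always different values
    }
}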
Database Management Systems (DBMSs) such as MySQL operate on a basis of ACID (Atomicity, Consistency, Isolation, Durability) transactions. These transactions are scheduled in a sequential, all-or-nothing fashion, so you don't need to worry about parallel transactions.
That said, with multiple application instances you may need to worry about which transaction is processed first. That is, UserA of application instanceA may send insert A, and UserB of application instanceB may send insert B some time after UserA. Even though UserA sent the request first, the requests can be received and processed in B-then-A order, perhaps due to network latency.
I am trying to fetch a collection of custom objects from an Oracle DB (v21) from a .NET client. Since I can't do any type mapping, I want to fetch it as JSON.
Here is the query:
select json_array("UDTARR") from sys.typetest
This is the result I see in SQL Developer (expected output):
This is what I get when I execute the same query via .NET:
"[]"
The same strategy (json_array()) seems to work fine in .NET for collections of primitive types as well as for non-collection-type fields of the same custom object.
Please, someone tell me I'm missing something obvious?
Here are the type definitions in case someone wants to try to replicate the issue:
The type that is used in the field "UDTARR":
create type udtarray AS VARRAY(5) OF TEST_DATATYPEEX;
Type "TEST_DATATYPEEX":
create type TEST_DATATYPEEX AS OBJECT
(test_id NUMBER,
vc VARCHAR2(20),
vcarray stringarray)
Type "STRINGARRAY":
create type stringarray AS VARRAY(5) OF VARCHAR2(50);
Code for executing the query and reading the value:
string query = "select json_array(\"UDTARR\") from sys.typetest";
using (var command = new OracleCommand(query, con))
using (var reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        Console.WriteLine(reader.GetString(0));
    }
}
In the event log both queries are recorded; in both cases the user is connected with DBA privileges:
(from sql developer)
Audit trail: LENGTH: '362' ACTION :[45] 'select json_array("UDTARR")
from sys.typetest' DATABASE USER:[3] 'SYS' PRIVILEGE :[6] 'SYSDBA'
(from .net)
Audit trail: LENGTH: '361' ACTION :[45] 'select json_array("UDTARR")
from sys.typetest' DATABASE USER:[3] 'SYS' PRIVILEGE :[6] 'SYSDBA'
UnCOMMITted data is only visible within the session that created it (and will ROLLBACK at the end of the session if it has not been COMMITted). If you can't see the data from another session (i.e. in C#) then make sure you have issued a COMMIT command in the SQL client where you INSERTed the data (i.e. SQL Developer).
Note: even if you connect as the same user, this will create a separate session and you will not be able to see the uncommitted data in the other session.
From the COMMIT documentation:
Until you commit a transaction:
You can see any changes you have made during the transaction by querying the modified tables, but other users cannot see the changes. After you commit the transaction, the changes are visible to other users' statements that execute after the commit.
You can roll back (undo) any changes made during the transaction with the ROLLBACK statement (see ROLLBACK).
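To make the same point concrete from the .NET side, here is a small sketch (demo_tab and the connection string are made up for illustration) showing that a second session only sees a row once the first session commits:
using System;
using Oracle.ManagedDataAccess.Client;

static void DemonstrateCommitVisibility(string connString)
{
    // demo_tab is a hypothetical table: CREATE TABLE demo_tab (id NUMBER);
    using (var session1 = new OracleConnection(connString))
    using (var session2 = new OracleConnection(connString))
    {
        session1.Open();
        session2.Open();

        OracleTransaction tx = session1.BeginTransaction();
        var insert = new OracleCommand("insert into demo_tab (id) values (1)", session1);
        insert.Transaction = tx;
        insert.ExecuteNonQuery();

        var count = new OracleCommand("select count(*) from demo_tab where id = 1", session2);
        Console.WriteLine(count.ExecuteScalar()); // 0: the uncommitted row is invisible to the other session

        tx.Commit();
        Console.WriteLine(count.ExecuteScalar()); // 1: visible to other sessions after the commit
    }
}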
OK, looks like I missed something obvious... closing SQL Developer gave me a prompt that there were uncommitted changes; after committing and closing SQL Developer I am now also receiving the expected data in .NET.
I have never seen behaviour like this in any other SQL management tool, but hey, you live and you learn :)
I have a method that needs to "claim" a payment number to ensure it is available at a later time. I cannot just get a new payment number when I am ready to commit to the database, because the number is added to a signed token; the payment number is later taken from the signed token when committing to the database, so that the token can be linked to the payment afterwards.
Payment numbers are sequential and the current method used in existing code is:
Create a Payment
Get the last payment number from the database
Increment the payment number
Use this payment number for the Payment
Update the database with the incremented payment number
In my service I am trying to prevent the following race-condition:
My service reads the payment number (eg. 100)
Another service uses and updates the payment number (now 101)
My service increments the number locally (to 101) and updates the database (still 101)
This would produce two payments with a payment number of 100.
Here is my implementation so far, in my Transaction class:
private DbSet<PaymentIdentifier> paymentIdentifier;
//...
private int ClaimNextPaymentNumber()
{
    int nextPaymentNumber = -1;
    using (var dbTransaction = db.Database.BeginTransaction())
    {
        int lastPaymentNumber = paymentIdentifier.ElementAt(0).Identifier;
        nextPaymentNumber = lastPaymentNumber + 1;
        paymentIdentifier.ElementAt(0).Identifier = nextPaymentNumber;
        db.SaveChanges();
        dbTransaction.Commit();
    }
    return nextPaymentNumber;
}
The PaymentIdentifier table has a single row and a single column "Identifier" (hence the .ElementAt(0)). I am unable to change the database structure as there is lots of legacy code relying on it that is very brittle.
Will having the code wrapped in a transaction (as I have done) protect against the race condition, or are there some Entity Framework / PostgreSQL idiosyncrasies I need to deal with to protect the identifier from being read whilst performing the transaction?
Thank you!
(As a side point, I believe lots of legacy code in the other software connecting to the database simply ignores the race condition and relies on it being "very fast")
It helps you with the race condition only if all code, including the legacy code, uses this method. If there is still code that continues using client-side incrementing without a transaction, you'll get the same problem. Just swap 'My service' and 'Another service' in your description:
1. Another service reads the payment number (eg. 100) **without** transaction
2. My service uses and updates the payment number (now 101) **with** transaction
3. Another service increments the number locally (to 101) and updates the database (still 101) **without** transaction
Note that you can replace your code with a simpler one by executing this query without an explicit transaction:
update PaymentIdentifier set Identifier = Identifier + 1 returning Identifier;
But again, it will not solve your concurrency problem until you replace all the places where the Identifier is incremented. If you can change that, you would be better off using a SEQUENCE or generators, which will safely provide you with incremental ids.
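For example, here is a minimal sketch of running that statement from the question's EF context (assuming EF6's Database.SqlQuery and the same PaymentIdentifier table); it is an illustration rather than drop-in code:
private int ClaimNextPaymentNumber()
{
    // The whole read-increment-write happens in one atomic statement on the
    // server, so no explicit transaction or retry loop is needed here.
    return db.Database
             .SqlQuery<int>("update PaymentIdentifier set Identifier = Identifier + 1 returning Identifier")
             .Single();
}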
A transaction does not automatically lock your table. A transaction just ensures that multiple changes to the database are done all together or not at all (see the A (atomicity) in ACID). But what you want is that only one session can read, add one, and update the value, and only after that is done is the next session allowed to do the same thing.
So you now have different possibilities:
Use a sequence: you can get the next value, for example, with SELECT nextval('mysequencename'). If two sessions try to get a value at the same time, they will get two different values (see the sketch after this list).
If you have more complex needs and want to store every "token" with additional data in a table, so that every token is a row with additional columns, you could use table locking. With this you can restrict access to the table so that only one session is allowed to access it at a time. But make sure you hold locks for as short a time as possible, because this will become your performance bottleneck.
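Here is a rough sketch of the sequence option from the first bullet, assuming a PostgreSQL sequence named payment_number_seq has already been created and reusing the question's EF context:
private int ClaimNextPaymentNumber()
{
    // nextval() hands out a distinct value to every caller, so concurrent
    // sessions can never receive the same payment number.
    return db.Database
             .SqlQuery<int>("SELECT nextval('payment_number_seq')::int")
             .Single();
}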
The database prevents the race condition by throwing a concurrency violation error in this case. So I looked at how this is handled in the legacy code (following the suggestion by @sergey-l), and it uses a simple retry mechanism. So I did the same:
private int ClaimNextPaymentNumber()
{
    DbContextTransaction dbTransaction;
    bool failed;
    int paymentNumber = -1;
    do
    {
        failed = false;
        using (dbTransaction = db.Database.BeginTransaction())
        {
            try
            {
                paymentNumber = TryToClaimNextPaymentNumber();
            }
            catch (DbUpdateConcurrencyException ex)
            {
                failed = true;
                ResetForClaimPaymentNumberRetry(ex);
            }
            dbTransaction.Commit();
            concurrencyExceptionRetryCount = 0;
        }
    }
    while (failed);
    return paymentNumber;
}
I have a big, frequently updated list of strings which must be uploaded to a MySQL database on a remote server each time the user calls a function: every row from index 0 to the last index is rewritten, and new rows are added.
Adding the data string by string takes a lot of time, even when the process doesn't hang:
foreach (string str in myList)
{
    string Query = "insert into tab(a) values(@a);";
    MySqlConnection conn = new MySqlConnection(connString);
    MySqlCommand conn_ = new MySqlCommand(Query, conn);
    conn_.Parameters.AddWithValue("@a", str);
    conn.Open();
    conn_.ExecuteNonQuery();
    conn.Close();
}
My goal is to figure out the proper way to do this quickly. Maybe I should create and update the table locally and then somehow upload it to the database.
I have a List<string> myList = new List<string>(); which contains about 5000 rows, and I have a table in a database on the remote server:
id | user | nickname
_____________________
0 | record | record
1 | ... | ...
My desired result is to update all records from 0 to the highest index, adding new records and removing extra records when the current upload contains fewer records than the previous one; it doesn't matter if the indexes end up with gaps where rows were removed.
You claim:
Adding of data to MySql database on remote server
This implies that you have multiple clients that know the connection string to the remote database. This is a security disaster! Stop even thinking about it! Also, what happens if the connection string to the database changes? You need to update every client. The only exception would be if you are in a trusted environment with trusted connections, but I doubt that, since you are using MySQL.
To your actual problem:
Your main problem is that for every item in your loop you create a connection, send something to the server, and close the connection. And again, and again. Basically you want to send one big command to the server instead of the many created by your loop (SQL can handle multiple insert statements in one command); see the sketch below.
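As an illustration only (reusing the question's connString, myList, and tab(a) table), one way to do this is to build a single parameterized multi-row INSERT and send it over one open connection:
using System.Collections.Generic;
using System.Text;
using MySql.Data.MySqlClient;

static void InsertAll(string connString, List<string> myList)
{
    if (myList.Count == 0) return;

    // Build "insert into tab(a) values (@a0), (@a1), ..." with one parameter per row.
    var sql = new StringBuilder("insert into tab(a) values ");
    using (var conn = new MySqlConnection(connString))
    using (var cmd = new MySqlCommand())
    {
        for (int i = 0; i < myList.Count; i++)
        {
            if (i > 0) sql.Append(", ");
            sql.Append("(@a").Append(i).Append(")");
            cmd.Parameters.AddWithValue("@a" + i, myList[i]);
        }

        cmd.Connection = conn;
        cmd.CommandText = sql.ToString() + ";";
        conn.Open();
        cmd.ExecuteNonQuery(); // one round trip instead of one per row
    }
}
For very large lists you may need to send the rows in chunks so the statement stays under MySQL's max_allowed_packet limit, or look at MySqlBulkLoader instead.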
The better (more secure) way:
Create an application on your server which accepts myList (as JSON, for example) and saves it there. You will probably need to handle authorization here.
Your client sends a save request with myList to the application mentioned above.
There are several technologies for this:
WebAPI
WCF
And many more
Warning: also, at first glance, you seem to have a problem with SQL injection. Look up what it is and how you can prevent it.
I have read and implemented several different versions of Microsoft's suggested methods for querying a SQL Server database. In everything I have read, each query is surrounded by using statements, e.g. in some method DoQuery:
List<List<string>> DoQuery(string cStr, string query)
{
    using (SqlConnection c = new SqlConnection(cStr))
    {
        c.Open();
        using (SqlCommand cmd = new SqlCommand(query, c))
        {
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    ...
                    // read columns and put into list to return
                }
                // close all of the using blocks
            }
        }
    }
    // return the list of rows containing the list of column values.
}
I need to run this code several hundreds of times for different query strings against the same database. It seems that creating a new connection each time would be inefficient and dropping it each time wasteful.
How should I structure this so that it is efficient? When I tried not using a using block and passing the connection into the DoQuery method, I got messages saying the connection had not been closed. If I closed it after the query, then I got messages saying it wasn't open.
I'm also trying to improve this because I keep getting somewhat random
IOException: Unable to read data from the transport connection: Operation on non-blocking socket would block.
I'm the only user of the database at this time and I'm not doing anything in multiple threads or async, etc. Just looping through query strings and running DoQuery on them.
Could my structure be part of that problem, i.e. not releasing the resources fast enough and thereby seeing the connection blocked?
I'm stuck here on efficiency and this blocking problem. Thanks in advance.
As it turns out, the query structure was fine and the queries were fine. The problem was that I had an ‘order by X desc’ on each query and that column was not indexed. This caused a full table scan to order the rows, even if only 2 were being returned. The table has about 3 million rows and I thought it could handle that better than it does. It timed out with a 360-second connection timeout! I indexed the column and no more ‘blocking’ nonsense, which, BTW, is a horrible message to return when it was actually a timeout. The queries now run fine if I index every column that appears in a where clause.
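For reference, the fix amounts to something like the following sketch (IX_MyTable_X, dbo.MyTable, and column X are placeholders, not the real schema):
using System.Data.SqlClient;

// Create an index on the column used in the ORDER BY ... DESC clause so the
// query can read the top rows directly instead of scanning ~3 million rows.
static void AddOrderByIndex(string cStr)
{
    using (var c = new SqlConnection(cStr))
    using (var cmd = new SqlCommand(
        "CREATE INDEX IX_MyTable_X ON dbo.MyTable (X DESC);", c))
    {
        c.Open();
        cmd.ExecuteNonQuery();
    }
}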
I'm a novice C# dev and I'm writing a database app that performs updates on two different tables and inserts on another two tables, with each process running on its own separate thread. So I have two threads handling inserts on two different tables and two threads handling updates on two different tables. Each process is updating or inserting approximately 4 or 5 times per second, so I don't close the connection until the complete session is over, and then I close the entire app. I want to know whether I should be closing the connection after each insert and update even though I'm performing these operations so frequently. Second, should each thread run on its own connection and command object?
By the way, I'm writing the app in C# and the database is MySQL. Also, as of now I'm using one connection and command object for all four threads. I keep getting an error message saying "There is already an open DataReader associated with this connection that must be closed first", which is why I'm asking whether I should be using multiple connection and command objects.
Thanks
-Donld
If you enable connection pooling, you should get optimal use of MySql connections for your scenario. Either way, generally the best pattern to follow is:
Acquire and open connection
Do work
Close/release connection
Something similar to (I'm a bit rusty on the class names for the MySql connector, so this may not be exactly correct, but you should get the general idea!):
private void DoMyPieceOfWork(int value1, int value2)
{
    using (MySqlConnection connection = new MySqlConnection(
        CONNECTION_STRING_GOES_HERE))
    {
        connection.Open();
        using (MySqlCommand command = new MySqlCommand(
            "INSERT INTO `blah` (Column1, Column2) VALUES (@column1, @column2)", connection))
        {
            command.Parameters.Add("@column1", MySqlDbType.Int32).Value = value1;
            command.Parameters.Add("@column2", MySqlDbType.Int32).Value = value2;
            command.ExecuteNonQuery();
        }
        connection.Close();
    }
}
Of course this is a contrived, simplistic example, but the gist of it stands.
You either have to create a new connection for each thread, or (just an idea) create a synchronized queue of commands and then process the queue in a single worker thread.
You may also take a look at the Task class of .NET Framework 4.
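As a rough sketch of the synchronized-queue idea (class, table, and column names are made up; it assumes the MySQL Connector/NET types used elsewhere in this thread), the producer threads only enqueue delegates and one long-running worker owns the single connection:
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using MySql.Data.MySqlClient;

class SqlWorkQueue
{
    private readonly BlockingCollection<Action<MySqlConnection>> queue =
        new BlockingCollection<Action<MySqlConnection>>();

    public SqlWorkQueue(string connString)
    {
        // One long-running worker owns the single connection, so no connection,
        // command, or DataReader is ever shared between threads.
        Task.Factory.StartNew(() =>
        {
            using (var conn = new MySqlConnection(connString))
            {
                conn.Open();
                foreach (var work in queue.GetConsumingEnumerable())
                    work(conn); // items are executed strictly one at a time
            }
        }, TaskCreationOptions.LongRunning);
    }

    public void Enqueue(Action<MySqlConnection> work)
    {
        queue.Add(work);
    }
}

// Usage from any of the four threads (hypothetical statement):
// workQueue.Enqueue(conn =>
// {
//     using (var cmd = new MySqlCommand("UPDATE stats SET hits = hits + 1", conn))
//         cmd.ExecuteNonQuery();
// });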