Add row into database, get id and populate second table - C#

I'm not great at .NET but am learning (at least trying to! ;) ). However, this bit of code I'm working on has me baffled. What I want to do is insert a row into a SQL Server 2008 database table called Comment, then use the id of this inserted row to populate a second table (CommentOtherAuthor) with new rows of data. Basically, a comment can have multiple authors.
Here's the code:
public static Comment MakeNew(int parentNodeId, string firstname, string surname, string occupation, string affiliation, string title, string email, bool publishemail, bool competinginterests, string competingintereststext, string[] otherfirstname, string[] othersurname, string[] otheroccupation, string[] otheraffiliation, string[] otheremail, bool approved, bool spam, DateTime created, string commentText, int statusId)
{
    var c = new Comment
    {
        ParentNodeId = parentNodeId,
        FirstName = firstname,
        Surname = surname,
        Occupation = occupation,
        Affiliation = affiliation,
        Title = title,
        Email = email,
        PublishEmail = publishemail,
        CompetingInterests = competinginterests,
        CompetingInterestsText = competingintereststext,
        OtherFirstName = otherfirstname,
        OtherSurname = othersurname,
        OtherOccupation = otheroccupation,
        OtherAffiliation = otheraffiliation,
        OtherEmail = otheremail,
        Approved = approved,
        Spam = spam,
        Created = created,
        CommenText = commentText,
        StatusId = statusId
    };

    var sqlHelper = DataLayerHelper.CreateSqlHelper(umbraco.GlobalSettings.DbDSN);

    c.Id = sqlHelper.ExecuteScalar<int>(
        @"insert into Comment(mainid,nodeid,firstname,surname,occupation,affiliation,title,email,publishemail,competinginterests,competingintereststext,comment,approved,spam,created,statusid)
          values(@mainid,@nodeid,@firstname,@surname,@occupation,@affiliation,@title,@email,@publishemail,@competinginterests,@competingintereststext,@comment,@approved,@spam,@created,@statusid)",
        sqlHelper.CreateParameter("@mainid", -1),
        sqlHelper.CreateParameter("@nodeid", c.ParentNodeId),
        sqlHelper.CreateParameter("@firstname", c.FirstName),
        sqlHelper.CreateParameter("@surname", c.Surname),
        sqlHelper.CreateParameter("@occupation", c.Occupation),
        sqlHelper.CreateParameter("@affiliation", c.Affiliation),
        sqlHelper.CreateParameter("@title", c.Title),
        sqlHelper.CreateParameter("@email", c.Email),
        sqlHelper.CreateParameter("@publishemail", c.PublishEmail),
        sqlHelper.CreateParameter("@competinginterests", c.CompetingInterests),
        sqlHelper.CreateParameter("@competingintereststext", c.CompetingInterestsText),
        sqlHelper.CreateParameter("@comment", c.CommenText),
        sqlHelper.CreateParameter("@approved", c.Approved),
        sqlHelper.CreateParameter("@spam", c.Spam),
        sqlHelper.CreateParameter("@created", c.Created),
        sqlHelper.CreateParameter("@statusid", c.StatusId));

    c.OnCommentCreated(EventArgs.Empty);

    for (int x = 0; x < otherfirstname.Length; x++)
    {
        sqlHelper.ExecuteScalar<int>(
            @"insert into CommentOtherAuthor(firstname,surname,occupation,affiliation,email,commentid) values(@firstname,@surname,@occupation,@affiliation,@email,@commentid)",
            sqlHelper.CreateParameter("@firstname", otherfirstname[x]),
            sqlHelper.CreateParameter("@surname", othersurname[x]),
            sqlHelper.CreateParameter("@occupation", otheroccupation[x]),
            sqlHelper.CreateParameter("@affiliation", otheraffiliation[x]),
            sqlHelper.CreateParameter("@email", otheremail[x]),
            sqlHelper.CreateParameter("@commentid", 123)
        );
    }

    if (c.Spam)
    {
        c.OnCommentSpam(EventArgs.Empty);
    }
    if (c.Approved)
    {
        c.OnCommentApproved(EventArgs.Empty);
    }

    return c;
}
The key line is:
sqlHelper.CreateParameter("#commentid", 123)
At the moment, I'm just hard-coding the id for the comment as 123, but really I need it to be the id of the record just inserted into the comment table.
I just don't really understand how to grab the id of the last row inserted into the Comment table without running a new query like
SELECT TOP 1 id FROM Comment ORDER BY id DESC
which doesn't strike me as the best way to do this.
Can anyone suggest how to get this working?
Many thanks!

That SELECT TOP 1 id ... query most likely wouldn't give you the proper results anyway in a system under load. If you have 20 or 50 clients inserting comments at the same time, by the time you query the table again, chances are very high you would be getting someone else's id ...
The best way I see to do this would be:
add an OUTPUT clause to your original insert and capture the newly inserted ID
use that ID for your second insert
Something along the lines of:
c.Id = sqlHelper.ExecuteScalar<int>(
    @"insert into Comment(......)
      output Inserted.ID
      values(.............)",
Using this approach, your c.Id value should now be the newly inserted ID - use that in your next insert statement! (note: right now, you're probably always getting a 1 back - the number of rows affected by your statement ...)
This approach assumes your table Comment has a column of type INT IDENTITY that will be automatically set when you insert a new row into it.
for (int x = 0; x < otherfirstname.Length; x++)
{
    sqlHelper.ExecuteScalar<int>(
        @"insert into CommentOtherAuthor(.....) values(.....)",
        sqlHelper.CreateParameter("@firstname", otherfirstname[x]),
        sqlHelper.CreateParameter("@surname", othersurname[x]),
        sqlHelper.CreateParameter("@occupation", otheroccupation[x]),
        sqlHelper.CreateParameter("@affiliation", otheraffiliation[x]),
        sqlHelper.CreateParameter("@email", otheremail[x]),
        sqlHelper.CreateParameter("@commentid", c.Id)   // <== use that value you got back!
    );
}
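For reference, here is a minimal sketch of what the first insert could look like with the OUTPUT clause spelled out. The column list is trimmed to a few fields for brevity (mirror the full list from the question), and it assumes the identity column on Comment is named ID:

c.Id = sqlHelper.ExecuteScalar<int>(
    @"insert into Comment(nodeid, firstname, surname, comment, created)
      output Inserted.ID
      values(@nodeid, @firstname, @surname, @comment, @created)",
    sqlHelper.CreateParameter("@nodeid", c.ParentNodeId),
    sqlHelper.CreateParameter("@firstname", c.FirstName),
    sqlHelper.CreateParameter("@surname", c.Surname),
    sqlHelper.CreateParameter("@comment", c.CommenText),
    sqlHelper.CreateParameter("@created", c.Created));   // ExecuteScalar<int> now returns the value produced by OUTPUT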

Assuming you are using Microsoft SQL Server, you could design your table Comment so the column Id has the property Identity set to true. This way the database will generate and auto-increment the id each time a row is inserted into the table.
You would have to add the following line to your SQL query:
OUTPUT INSERTED.Id
in order to return this Id to your C# code when the query is executed.
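With plain ADO.NET instead of the umbraco helper, the same idea is roughly this (a sketch only; the open SqlConnection and the trimmed column list are assumptions):

using (var command = connection.CreateCommand())
{
    command.CommandText = @"INSERT INTO Comment (nodeid, firstname, surname)
                            OUTPUT INSERTED.Id
                            VALUES (@nodeid, @firstname, @surname)";
    command.Parameters.AddWithValue("@nodeid", parentNodeId);
    command.Parameters.AddWithValue("@firstname", firstname);
    command.Parameters.AddWithValue("@surname", surname);
    int newId = (int)command.ExecuteScalar();   // the value produced by OUTPUT INSERTED.Id
}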

Related

How to get data without conditions in C#

Hi. I have 2 data tables like this:
I want to get the ID from Table1 depending on whether the User in Table2 exists or does not exist.
This is the code I use to test and get the data:
string idGet = "";
string getValue = "Select ID, Port, User from Table1";
DataTable dtgetValue = XLDL.ReadTable(getValue);
if(dtgetValue.Rows.Count > 0)
{
List<ListOtherUser> listOtherUser = new List<ListOtherUser>();
for (int i = 0; i < dtgetValue.Rows.Count; i++)
{
listOtherUser.Add(new ListOtherUser { ID = dtgetValue.Rows[i]["ID"].ToString(), User = dtgetValue.Rows[i]["User"].ToString(), Port = dtgetValue.Rows[i]["Port"].ToString() });
}
foreach (var itemuser in listOtherUser)
{
string checkUser = "Select ID from Table2 where User = N'" + itemuser.User + "'";
DataTable dtcheckUser = XLDL.ReadTable(checkUser);
if (dtcheckUser.Rows.Count > 0)
{
idGet += itemuser.ID + ",";
}
else
{
//Here I want to continue to get the data of row ID=3 from Table1. However I don't know how to solve it?
}
}
}
With the data above, I want the output to be: idGet = 1 and 3 from Table1.
With this data from Table1 and Table2:
With the data above, I want the output to be: idGet = 2 and 3 from Table1.
Looking forward to a solution from everyone. Thank you!
The best solution here would really be to do the exclusion with a SQL join as the data comes in, but if you want a solution in code instead, then depending on how large that dataset is or will be, this is a good spot for LINQ.
var result = Table2.Where(p => table1.All(p2 => p2.UserID != p.UserID));
Adapted from this question.
If you opted for SQL, your query would look something more like this. And looking at your logic: you should absolutely not do it the way you are trying to; that is a lot of single DB calls for absolutely no reason.
SELECT Table1.ID
FROM Table1
LEFT OUTER JOIN Table2 ON Table1.User = Table2.User
WHERE Table2.ID IS NULL;
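Running that single query from C# and building the comma-separated string could look roughly like this (a sketch that reuses the XLDL.ReadTable helper from the question; System.Linq is assumed to be imported):

string getMissing = @"SELECT Table1.ID
                      FROM Table1
                      LEFT OUTER JOIN Table2 ON Table1.[User] = Table2.[User]
                      WHERE Table2.ID IS NULL";
DataTable dtMissing = XLDL.ReadTable(getMissing);

// One DB call instead of one per row
string idGet = string.Join(",", dtMissing.Rows.Cast<DataRow>().Select(r => r["ID"].ToString()));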

How to extract an enumerable datatable from database when column names and count may vary?

How do I extract data from a table in a SQL Server database, knowing that the table could have different column names/column counts depending on the request?
I would usually use a SqlDataAdapter and then do da.Fill(dt), but this is not an option as I cannot enumerate through a DataTable in a Razor view. I wish to reproduce that table in a view using Razor Pages.
Here is an example of what I might normally do, but it involves knowing exactly what the column names will be and how many there will be. What can I put in the while loop to return all of the table data in a type that is enumerable?
SqlConnection connectionCalc = new SqlConnection("<connectionString>");
if (connectionCalc.State.Equals(ConnectionState.Closed))
    connectionCalc.Open();

using (var command = connectionCalc.CreateCommand())
{
    command.CommandText = $@"SELECT * FROM {tableName}";
    SqlDataReader reader = command.ExecuteReader();
    column = new Dictionary<string, string>();
    int FATUidSingle = -999;
    while (reader.Read())
    {
        TableUid.Add(reader[SelectedCalculation + "TABLE_UID"].ToString());
        FATUid.Add(Convert.ToInt32(reader["FAT_UID"]));
        ScheduledDate.Add(Convert.ToDateTime(reader["SOME_DATE"]));
        TableStatusUid.Add(reader[SelectedCalculation + "ST_UID"].ToString());
        StartDate.Add(Convert.ToDateTime(reader["ANOTHER_DATE"]));
        EndDate.Add(Convert.ToDateTime(reader["OTHER_DATE"]));
        Progress.Add(reader["PROGRESS"].ToString());
    }
}
Run a command like this first to get the field names, then you will know which fields to expect. You could use it to build the SQL and set ordinals to point to the columns you want.
SELECT TABLE_NAME, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'EmployeeDetail'
When you make your enumerable list, make a list of 'tuples' of string and object, perhaps, where the string is the field name and the object is the value.
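A minimal sketch of that idea, assuming an open SqlDataReader like the one in the question (and C# 7 tuples): each row becomes a list of (field name, value) pairs, and the outer list is something a Razor view can enumerate without knowing the columns up front.

var rows = new List<List<(string Field, object Value)>>();
while (reader.Read())
{
    var row = new List<(string Field, object Value)>();
    for (int i = 0; i < reader.FieldCount; i++)
    {
        // GetName gives the column name, GetValue the cell value, whatever the schema happens to be
        row.Add((reader.GetName(i), reader.GetValue(i)));
    }
    rows.Add(row);
}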
If your only worry is the exact column names, but you know the column sequence and data types, then you could simply rely on the index of the column from the select statement.
E.g.
Select * from Orders
The Orders table might have 3 columns: Id, Name and Price, where Id is an int, Name is a string and Price is a long.
So you can do:
var id = reader.GetInt32(0);
var name = reader.GetString(1);
var price = reader.GetInt64(2);

Insert Guid and retrieve value with dapper

I am trying to insert a record into my database and retrieve the GUID it just added in.
Let's say I have a table with 3 columns: GUID, FirstName, LastName. I need to insert a new record and then get back the GUID that was just generated. The problem is that first and last name are often duplicated. I am not quite sure how to accomplish this.
Here is what I tried. I know the below won't work, as I am not really telling it which column to select back, and I'm not sure how to tell it:
var query = #"INSERT INTO MyTable(GUID, FirstName, LastName)
SELECT
#GUID, #FirstName, #LastName);
using (var oConn = CreateConnection())
{
var test = oConn.Query<string>(query, new
{
GUID = Guid.NewGuid(),
"John",
"Doe"
}).Single();
}
The error that I get is
Sequence contains no elements
If you want only the Guid which you inserted, why not store it in a local variable in your code and use that as needed?
I also see some errors in your code. The code below is corrected and should work.
var guid = Guid.NewGuid();
var query = @"INSERT INTO MyTable (GUID, FirstName, LastName) VALUES (@GUID, @FirstName, @LastName);";

using (var conn = CreateConnection())
{
    conn.Execute(query, new { @GUID = guid, @FirstName = "John", @LastName = "Scott" });
}
// You can use the value in the guid variable now. It is the Id you just inserted.
Dapper.Contrib needs an auto-generated ID; it cannot be a GUID and you cannot pass a pre-generated Guid.
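If you would rather let the database generate the GUID (for example a DEFAULT NEWID() on the column), a hedged alternative is to have the INSERT hand it back via an OUTPUT clause. A sketch only, assuming Dapper and that column default:

var query = @"INSERT INTO MyTable (FirstName, LastName)
              OUTPUT INSERTED.GUID
              VALUES (@FirstName, @LastName);";

using (var conn = CreateConnection())
{
    // QuerySingle<Guid> reads the single value produced by the OUTPUT clause
    var newGuid = conn.QuerySingle<Guid>(query, new { FirstName = "John", LastName = "Doe" });
}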

Insert Data into MySQL in multiple Tables in C# efficiently

I need to insert a huge CSV file into 2 tables with a 1:n relationship within a MySQL database.
The CSV file comes weekly and is about 1 GB, which needs to be appended to the existing data.
Each of the 2 tables has an auto-increment primary key.
I've tried:
Entity Framework (takes most time of all approaches)
Datasets (same)
Bulk Upload (doesn't support multiple tables)
MySqlCommand with Parameters (needs to be nested, my current approach)
MySqlCommand with StoredProcedure including a Transaction
Any further suggestions?
Simplified, let's say this is my data structure:
public class User
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<string> Codes { get; set; }
}
I need to insert from the csv into this database:
User (1-n) Code
+---+-----+-----+ +---+---+-----+
|PID|FName|LName| |CID|PID|Code |
+---+-----+-----+ +---+---+-----+
| 1 |Jon | Foo | | 1 | 1 | ed3 |
| 2 |Max | Foo | | 2 | 1 | wst |
| 3 |Paul | Foo | | 3 | 2 | xsd |
+---+-----+-----+ +---+---+-----+
Here a sample line of the CSV-file
Jon;Foo;ed3,wst
A bulk load like LOAD DATA LOCAL INFILE is not possible because I have restricted write permissions.
Referring to your answer, I would replace
using (MySqlCommand myCmdNested = new MySqlCommand(cCommand, mConnection))
{
    foreach (string Code in item.Codes)
    {
        myCmdNested.Parameters.Add(new MySqlParameter("@UserID", UID));
        myCmdNested.Parameters.Add(new MySqlParameter("@Code", Code));
        myCmdNested.ExecuteNonQuery();
    }
}
with
List<string> lCodes = new List<string>();
foreach (string code in item.Codes)
{
    lCodes.Add(String.Format("('{0}','{1}')", UID, MySqlHelper.EscapeString(code)));
}
string cCommand = "INSERT INTO Code (UserID, Code) VALUES " + string.Join(",", lCodes);
using (MySqlCommand myCmdNested = new MySqlCommand(cCommand, mConnection))
{
    myCmdNested.ExecuteNonQuery();
}
That generates one insert statement instead of item.Codes.Count separate ones.
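For the UID itself, a rough sketch of how the parent insert could supply it: MySqlCommand exposes LastInsertedId after inserting into an AUTO_INCREMENT table (column names here follow the simplified example, so adjust to your real schema):

long UID;
using (MySqlCommand myCmdUser = new MySqlCommand(
    "INSERT INTO User (FirstName, LastName) VALUES (@FirstName, @LastName)", mConnection))
{
    myCmdUser.Parameters.AddWithValue("@FirstName", item.FirstName);
    myCmdUser.Parameters.AddWithValue("@LastName", item.LastName);
    myCmdUser.ExecuteNonQuery();
    UID = myCmdUser.LastInsertedId;   // the AUTO_INCREMENT value just generated on this connection
}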
Given the great size of data, the best approach (performance wise) is to leave as much data processing to the database and not the application.
Create a temporary table where the data from the .csv file will be temporarily saved.
CREATE TABLE `imported` (
  `id` int(11) NOT NULL,
  `firstname` varchar(45) DEFAULT NULL,
  `lastname` varchar(45) DEFAULT NULL,
  `codes` varchar(450) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Loading the data from the .csv to this table is pretty straightforward. I would suggest the use of MySqlCommand (which is also your current approach). Also, using the same MySqlConnection object for all INSERT statements will reduce the total execution time.
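A rough sketch of that loading step, reusing one connection and one parameterized command per line (the file path, connection string and semicolon layout follow the sample line Jon;Foo;ed3,wst and are otherwise assumptions):

using (var connection = new MySqlConnection(connectionString))
using (var command = new MySqlCommand(
    "INSERT INTO imported (id, firstname, lastname, codes) VALUES (@id, @firstname, @lastname, @codes)", connection))
{
    connection.Open();
    command.Parameters.Add("@id", MySqlDbType.Int32);
    command.Parameters.Add("@firstname", MySqlDbType.VarChar);
    command.Parameters.Add("@lastname", MySqlDbType.VarChar);
    command.Parameters.Add("@codes", MySqlDbType.VarChar);

    int id = 0;
    foreach (var line in File.ReadLines(csvPath))
    {
        var parts = line.Split(';');                  // "Jon;Foo;ed3,wst"
        command.Parameters["@id"].Value = ++id;       // imported.id is NOT NULL, so number the rows ourselves
        command.Parameters["@firstname"].Value = parts[0];
        command.Parameters["@lastname"].Value = parts[1];
        command.Parameters["@codes"].Value = parts[2];
        command.ExecuteNonQuery();
    }
}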
Then, to further process the data, you can create a stored procedure that will handle it.
Assuming these two tables (taken from your simplified example):
CREATE TABLE `users` (
  `PID` int(11) NOT NULL AUTO_INCREMENT,
  `FName` varchar(45) DEFAULT NULL,
  `LName` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`PID`)
) ENGINE=InnoDB AUTO_INCREMENT=3737 DEFAULT CHARSET=utf8;
and
CREATE TABLE `codes` (
  `CID` int(11) NOT NULL AUTO_INCREMENT,
  `PID` int(11) DEFAULT NULL,
  `code` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`CID`)
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8;
you can have the following stored procedure.
CREATE DEFINER=`root`@`localhost` PROCEDURE `import_data`()
BEGIN
    DECLARE fname VARCHAR(255);
    DECLARE lname VARCHAR(255);
    DECLARE codesstr VARCHAR(255);
    DECLARE splitted_value VARCHAR(255);
    DECLARE done INT DEFAULT 0;
    DECLARE newid INT DEFAULT 0;
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE cur CURSOR FOR SELECT firstname, lastname, codes FROM imported;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cur;

    import_loop: LOOP
        FETCH cur INTO fname, lname, codesstr;
        IF done = 1 THEN
            LEAVE import_loop;
        END IF;

        INSERT INTO users (FName, LName) VALUES (fname, lname);
        SET newid = LAST_INSERT_ID();

        SET i = 1;
        SET occurance = (SELECT LENGTH(codesstr) - LENGTH(REPLACE(codesstr, ',', '')) + 1);
        WHILE i <= occurance DO
            SET splitted_value =
                (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(codesstr, ',', i),
                    LENGTH(SUBSTRING_INDEX(codesstr, ',', i - 1)) + 1), ',', ''));
            INSERT INTO codes (PID, code) VALUES (newid, splitted_value);
            SET i = i + 1;
        END WHILE;
    END LOOP;

    CLOSE cur;
END
For every row in the source data, it makes an INSERT statement for the users table. Then there is a WHILE loop to split the comma-separated codes and make an INSERT statement into the codes table for each one.
Regarding the use of LAST_INSERT_ID(), it is reliable on a PER CONNECTION basis (see doc here). If the MySQL connection used to run this stored procedure is not used by other transactions, the use of LAST_INSERT_ID() is safe.
The ID that was generated is maintained in the server on a per-connection basis. This means that the value returned by the function to a given client is the first AUTO_INCREMENT value generated for most recent statement affecting an AUTO_INCREMENT column by that client. This value cannot be affected by other clients, even if they generate AUTO_INCREMENT values of their own. This behavior ensures that each client can retrieve its own ID without concern for the activity of other clients, and without the need for locks or transactions.
Edit: Here is the OP's variant that omits the temp table imported. Instead of inserting the data from the .csv into the imported table, you call the SP to store it directly in your database.
CREATE DEFINER=`root`@`localhost` PROCEDURE `import_data`(IN fname VARCHAR(255), IN lname VARCHAR(255), IN codesstr VARCHAR(255))
BEGIN
    DECLARE splitted_value VARCHAR(255);
    DECLARE done INT DEFAULT 0;
    DECLARE newid INT DEFAULT 0;
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;

    INSERT INTO users (FName, LName) VALUES (fname, lname);
    SET newid = LAST_INSERT_ID();

    SET i = 1;
    SET occurance = (SELECT LENGTH(codesstr) - LENGTH(REPLACE(codesstr, ',', '')) + 1);
    WHILE i <= occurance DO
        SET splitted_value =
            (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(codesstr, ',', i),
                LENGTH(SUBSTRING_INDEX(codesstr, ',', i - 1)) + 1), ',', ''));
        INSERT INTO codes (PID, code) VALUES (newid, splitted_value);
        SET i = i + 1;
    END WHILE;
END
Note: The code to split the codes is taken from here (MySQL does not provide a split function for strings).
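Calling the per-row variant from C# might look roughly like this (a sketch; the parameter names match the procedure above, connection is an open MySqlConnection, and users stands for the list parsed from the CSV):

using (var command = new MySqlCommand("import_data", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    command.Parameters.Add("@fname", MySqlDbType.VarChar);
    command.Parameters.Add("@lname", MySqlDbType.VarChar);
    command.Parameters.Add("@codesstr", MySqlDbType.VarChar);

    foreach (var user in users)
    {
        command.Parameters["@fname"].Value = user.FirstName;
        command.Parameters["@lname"].Value = user.LastName;
        command.Parameters["@codesstr"].Value = string.Join(",", user.Codes);
        command.ExecuteNonQuery();                    // one user row plus its codes per call
    }
}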
I developed my WPF application using Entity Framework with a SQL Server database, and I needed to read data from an Excel file and insert it into 2 tables that have a relationship between them. For roughly 15,000 rows in Excel it used to take around 4 hours. Then I switched to inserting a block of 500 rows per insert, which sped things up unbelievably; now it takes a mere 3-5 seconds to import the same data.
So I would suggest you add your rows to a Context 100/200/500 at a time, and then call the SaveChanges method (if you really want to be using EF). There are other helpful tips to speed up EF performance as well. Please read this for your reference.
var totalRecords = TestPacksData.Rows.Count;
var totalPages = (totalRecords / ImportRecordsPerPage) + 1;
while (count <= totalPages)
{
    var pageWiseRecords = TestPacksData.Rows.Cast<DataRow>().Skip(count * ImportRecordsPerPage).Take(ImportRecordsPerPage);
    count++;
    Project.CreateNewSheet(pageWiseRecords.ToList());
    Project.CreateNewSpool(pageWiseRecords.ToList());
}
And here is the CreateNewSheet method:
/// <summary>
/// Creates new Sheet records in the database
/// </summary>
/// <param name="rows">DataRows containing the Sheet records</param>
public void CreateNewSheet(List<DataRow> rows)
{
    var tempSheetsList = new List<Sheet>();
    foreach (var row in rows)
    {
        var sheetNo = row[SheetFields.Sheet_No.ToString()].ToString();
        if (string.IsNullOrWhiteSpace(sheetNo))
            continue;
        var testPackNo = row[SheetFields.Test_Pack_No.ToString()].ToString();
        TestPack testPack = null;
        if (!string.IsNullOrWhiteSpace(testPackNo))
            testPack = GetTestPackByTestPackNo(testPackNo);
        var existingSheet = GetSheetBySheetNo(sheetNo);
        if (existingSheet != null)
        {
            UpdateSheet(existingSheet, row);
            continue;
        }
        var isometricNo = GetIsometricNoFromSheetNo(sheetNo);
        var newSheet = new Sheet
        {
            sheet_no = sheetNo,
            isometric_no = isometricNo,
            ped_rev = row[SheetFields.PED_Rev.ToString()].ToString(),
            gpc_rev = row[SheetFields.GPC_Rev.ToString()].ToString()
        };
        if (testPack != null)
        {
            newSheet.test_pack_id = testPack.id;
            newSheet.test_pack_no = testPack.test_pack_no;
        }
        if (!tempSheetsList.Any(l => l.sheet_no == newSheet.sheet_no))
        {
            DataStore.Context.Sheets.Add(newSheet);
            tempSheetsList.Add(newSheet);
        }
    }
    try
    {
        DataStore.Context.SaveChanges();
        DataStore.Dispose();   // This is very important: dispose the context
    }
    catch (DbEntityValidationException ex)
    {
        // Create log for the exception here
    }
}
CreateNewSpool is the same method except for the field names and the table name, because it updates a child table. But the idea is the same.
1 - Add a column VirtualId to the User table & class.
EDITED
2 - Assign numbers in a loop to the VirtualId field of each User object (use negative numbers starting at -1 to avoid collisions in the last step). For each Code object c belonging to a User object u, set c.UserId = u.VirtualId.
3 - Bulk load Users into the User table, bulk load Codes into the Code table.
4 - UPDATE Code C, User U SET C.UserId = U.Id WHERE C.UserId = U.VirtualId (see the sketch after the classes below).
NOTE: If you have a FK constraint on Code.UserId you can drop it and re-add it after the insert.
public class User
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int VirtualId { get; set; }
}

public class Code
{
    public int Id { get; set; }
    public string Code { get; set; }
    public string UserId { get; set; }
}
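Step 4 above, run from C#, could be a one-liner sketch like this (MySQL's multi-table UPDATE syntax; connection is assumed to be an open MySqlConnection, and any FK on Code.UserId is dropped beforehand as per the note):

using (var command = new MySqlCommand(
    @"UPDATE Code C
      JOIN User U ON C.UserId = U.VirtualId
      SET C.UserId = U.Id", connection))
{
    command.ExecuteNonQuery();   // rewrites the temporary negative VirtualIds to the real AUTO_INCREMENT ids
}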
Can you break the CSV into two files?
E.g. Suppose your file has the following columns:
... A ... | ... B ...
a0 | b0
a0 | b1
a0 | b2 <-- data
a1 | b3
a1 | b4
So one set of A might have multiple B entries. After you break it apart, you get:
... A ...
a0
a1
... B ...
b0
b1
b2
b3
b4
Then you bulk insert them separately.
Edit: Pseudo code
Based on the conversation, something like:
DataTable tableA = ...; // query schema for TableA
DataTable tableB = ...; // query schema for TableB

List<String> usernames = select distinct username from TableA;
Hashtable htUsername = new Hashtable(StringComparer.InvariantCultureIgnoreCase);
foreach (String username in usernames)
    htUsername[username] = "";

int colUsername = ...;
foreach (String[] row in CSVFile) {
    String un = row[colUsername] as String;
    if (htUsername[un] == null) {
        // add new row to tableA
        DataRow newRow = tableA.NewRow();
        newRow["Username"] = un;
        // etc.
        tableA.Rows.Add(newRow);
        htUsername[un] = "";
    }
}
// bulk insert TableA

select userid, username from TableA
Hashtable htUserId = new Hashtable(StringComparer.InvariantCultureIgnoreCase);
// htUserId[username] = userid;

int colUserId = ...;
foreach (String[] row in CSVFile) {
    String un = row[colUsername] as String;
    int userid = (int) htUserId[un];
    DataRow newRow = tableB.NewRow();
    newRow[colUserId] = userid;
    // fill in other values
    tableB.Rows.Add(newRow);
    if (tableB.Rows.Count == 65000) {
        // bulk insert TableB
        var t = tableB.Clone();
        tableB.Dispose();
        tableB = t;
    }
}
if (tableB.Rows.Count > 0)
    // bulk insert TableB
AFAIK, insertions into a single table are sequential, while insertions into different tables can be done in parallel. Open two separate new connections to the same database and then insert in parallel, maybe by using the Task Parallel Library.
However, if there are integrity constraints for the 1:n relationship between the tables, then:
Insertions might fail, and thus any parallel insert approach would be wrong. Clearly your best bet then would be to do sequential inserts only, one table after the other.
You can try to sort the data of both tables and write the InsertInto method below such that the insert into the second table happens only after you are done inserting the data into the first one.
Edit: Since you have requested it: if it is possible for you to perform the inserts in parallel, the following is the code template you can use.
private void ParallelInserts()
{
    ...
    // Other code in the method
    ...

    // Read first csv into memory. It's just a GB so should be fine
    ReadFirstCSV();
    // Read second csv into memory...
    ReadSecondCSV();

    // Because the inserts will last more than a few CPU cycles...
    var taskFactory = new TaskFactory(TaskCreationOptions.LongRunning, TaskContinuationOptions.None);

    // An array to hold the two parallel inserts
    var insertTasks = new Task[2];

    // Begin insert into first table...
    insertTasks[0] = taskFactory.StartNew(() => InsertInto(commandStringFirst, connectionStringFirst));
    // Begin insert into second table...
    insertTasks[1] = taskFactory.StartNew(() => InsertInto(commandStringSecond, connectionStringSecond));

    // Let them be done...
    Task.WaitAll(insertTasks);
    Console.WriteLine("Parallel insert finished.");
}

// Defining the InsertInto method which we are passing to the tasks in the method above
private static void InsertInto(string commandString, string connectionString)
{
    using (/* open a new connection using the connectionString passed */)
    {
        // In a while loop, iterate until you have 100/200/500 rows
        while (fileIsNotExhausted)
        {
            using (/* commandString */)
            {
                // Execute command to insert in bulk
            }
        }
    }
}
When you say "efficiently" are you talking memory, or time?
In terms of improving the speed of the inserts, if you can do multiple value blocks per insert statement, you can get 500% improvement in speed. I did some benchmarks on this over in this question: Which is faster: multiple single INSERTs or one multiple-row INSERT?
My approach is described in the answer, but simply put, reading in up to say 50 "rows" (to be inserted) at once and bundling them into a single INSERT INTO(...), VALUES(...),(...),(...)...(...),(...) type statement seems to really speed things up. At least, if you're restricted from being able to bulk load.
Another approach btw if you have live data you can't drop indexes on during the upload, is to create a memory table on the mysql server without indexes, dump the data there, and then do an INSERT INTO live SELECT * FROM mem. Though that uses more memory on the server, hence the question at the start of this answer about "what do you mean by 'efficiently'?" :)
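A rough sketch of that memory-table idea (live and mem are placeholder names; note that MEMORY tables don't support TEXT/BLOB columns, and CREATE TABLE ... LIKE copies the indexes too, so drop them on mem if index-free staging is the point):

using (var command = connection.CreateCommand())
{
    // Stage into a MEMORY copy, then move everything across in one statement
    command.CommandText = "CREATE TABLE mem LIKE live";
    command.ExecuteNonQuery();
    command.CommandText = "ALTER TABLE mem ENGINE=MEMORY";
    command.ExecuteNonQuery();

    // ... bulk-insert the CSV rows into mem here ...

    command.CommandText = "INSERT INTO live SELECT * FROM mem";
    command.ExecuteNonQuery();
}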
Oh, and there's probably nothing wrong with iterating through the file and doing all the first table inserts first, and then doing the second table ones. Unless the data is being used live, I guess. In that case you could definitely still use the bundled approach, but the application logic to do that is a lot more complex.
UPDATE: OP requested example C# code for multivalue insert blocks.
Note: this code assumes you have a number of structures already configured:
tables List<string> - table names to insert into
fieldslist Dictionary<string, List<String>> - list of field names for each table
typeslist Dictionary<string, List<MySqlDbType>> - list of MySqlDbTypes for each table, same order as the field names.
nullslist Dictionary<string, List<Boolean>> - list of flags to tell if a field is nullable or not, for each table (same order as field names).
prikey Dictionary<string, string> - list of primary key field name, per table (note: this doesn't support multiple field primary keys, though if you needed it you could probably hack it in - I think somewhere I have a version that does support this, but... meh).
theData Dictionary<string, List<Dictionary<int, object>>> - the actual data, as a list of fieldnum-value dictionaries, per table.
Oh yeah, and localcmd is a MySqlCommand created by using CreateCommand() on the local MySqlConnection object.
Further note: I wrote this quite a while back when I was kind of starting. If this causes your eyes or brain to bleed, I apologise in advance :)
const int perinsert = 50;

foreach (string table in tables)
{
    string[] fields = fieldslist[table].ToArray();
    MySqlDbType[] types = typeslist[table].ToArray();
    bool[] nulls = nullslist[table].ToArray();

    int thisblock = perinsert;
    int rowstotal = theData[table].Count;
    int rowsremainder = rowstotal % perinsert;
    int rowscopied = 0;

    // Do the bulk (multi-VALUES block) INSERTs, but only if we have more rows than there are in a single bulk insert to perform:
    while (rowscopied < rowstotal)
    {
        if (rowstotal - rowscopied < perinsert)
            thisblock = rowstotal - rowscopied;

        // Generate a 'perquery' multi-VALUES prepared INSERT statement:
        List<string> extravals = new List<string>();
        for (int j = 0; j < thisblock; j++)
            extravals.Add(String.Format("(@{0}_{1})", j, String.Join(String.Format(", @{0}_", j), fields)));
        localcmd.CommandText = String.Format("INSERT INTO {0} VALUES{1}", table, String.Join(",", extravals.ToArray()));

        // Now create the parameters to match these:
        for (int j = 0; j < thisblock; j++)
            for (int i = 0; i < fields.Length; i++)
                localcmd.Parameters.Add(String.Format("{0}_{1}", j, fields[i]), types[i]).IsNullable = nulls[i];

        // Keep doing bulk INSERTs until there's less rows left than we need for another one:
        while (rowstotal - rowscopied >= thisblock)
        {
            // Queue up all the VALUES for this block INSERT:
            for (int j = 0; j < thisblock; j++)
            {
                Dictionary<int, object> row = theData[table][rowscopied++];
                for (int i = 0; i < fields.Length; i++)
                    localcmd.Parameters[String.Format("{0}_{1}", j, fields[i])].Value = row[i];
            }
            // Run the query:
            localcmd.ExecuteNonQuery();
        }

        // Clear all the parameters - we're done here:
        localcmd.Parameters.Clear();
    }
}

Special character not saved in MS SQL

I have a small problem with this Czech character ů. This little bugger won't save into my MS SQL table when the update is triggered from my C# code.
If I manually add it in SQL Management Studio it is saved as it should, and it is also shown on the website as normal.
But the process of saving it from C# only ends with the character u being saved in the table instead of ů.
My DB fields are of type nvarchar and ntext and the collation of the DB is French_CI_AS. I'm using MSSQL 2008.
Code:
SqlCommand sqlcomLoggedIn = new SqlCommand("UPDATE Table SET id = 1, title = 'Title with ů' WHERE id = 1", sqlCon, sqlTrans);
int status = sqlcomLoggedIn.ExecuteNonQuery();
Any thoughts?
The immediate problem is that you aren't using a unicode literal; it would have to be:
UPDATE Table SET id = 1, title = N'Title with ů' WHERE id = 1
the N is important.
The much bigger issue is that you should be using parameters:
int id = 1;
string title = "Title with ů";
// ^^ probably coming from C# parameters

using (var sqlcomLoggedIn = new SqlCommand(
    "UPDATE Table SET title = @title WHERE id = @id", sqlCon, sqlTrans))
{
    sqlcomLoggedIn.Parameters.AddWithValue("id", id);
    sqlcomLoggedIn.Parameters.AddWithValue("title", title);
    int status = sqlcomLoggedIn.ExecuteNonQuery();
}
Then the problem goes away, you get to use cached query plans, and you avoid SQL injection.
