Assume I have the following structure for a SQL table:
Name:
UserTable
Fields:
ID bigint IDENTITY(1, 1)
Name nvarchar(200) NOT NULL
ParentID bigint NULL
Note:
ParentID is an optional self-referencing foreign key to the primary key ID.
Now, switching over to my C# project, I find myself wondering how to insert this entity many times from an import.
public static void InsertTable(DataTable table)
{
    var connection = CreateConnection();
    string query = "INSERT INTO [dbo].[User] (Name, ParentID) " +
                   "OUTPUT INSERTED.ID " +
                   "VALUES " +
                   "(@Name, @ParentID)";

    using (connection)
    {
        for (int i = 0; i < table.Rows.Count; i++)
        {
            DataRow row = table.Rows[i];
            using (SqlCommand command = connection.CreateCommand())
            {
                command.CommandText = query;
                InsertParameters(row, command);
                long insertedID = (long)command.ExecuteScalar();
                row["ID"] = insertedID;
            }
        }
    }
}
I set the parameters like this:
private static void InsertParameters(DataRow row, SqlCommand command)
{
    string name = (string)row["Name"];
    command.Parameters.AddWithValue("@Name", name);

    if (row["ParentID"] is DBNull)
    {
        command.Parameters.AddWithValue("@ParentID", DBNull.Value);
    }
    else
    {
        command.Parameters.AddWithValue("@ParentID", (long)row["ParentID"]);
    }
}
I figured that I won't be able to insert these entities into this table in arbitrary order. My approach was to insert the entities with no reference to a parent first. While this works in a simple example like this one, I struggle to find an approach that handles multiple levels of references.
I worked around this by mapping the relations in some Dictionary<T1, T2> objects and revisiting the rows with references later, once the ID property of the referenced entity had been set.
My problem with this is that while I can easily map one DataRow to another, I cannot insert them as easily, because I cannot know the ID beforehand. I'd like to know if there are better approaches to this.
I stumbled upon this particular problem while doing an import of some customer-related data. My solution so far is okay-ish, but not satisfactory. One case where it all breaks down, I think, is a circular reference.
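To make the workaround concrete, here is a minimal sketch of that multi-pass idea. It assumes the import ships provisional values in the ID column that ParentID refers to, and InsertRow is a hypothetical helper wrapping the single-row INSERT ... OUTPUT INSERTED.ID shown above:
// Sketch only: inserts rows in passes, parents before children, and
// detects circular references when a pass makes no progress.
private static void InsertTableInPasses(DataTable table)
{
    // Maps provisional (import-side) IDs to the real database IDs.
    var insertedIds = new Dictionary<long, long>();
    var pending = table.Rows.Cast<DataRow>().ToList();

    while (pending.Count > 0)
    {
        // Rows whose parent is NULL or already inserted can go in this pass.
        var ready = pending
            .Where(r => r["ParentID"] is DBNull
                     || insertedIds.ContainsKey((long)r["ParentID"]))
            .ToList();

        if (ready.Count == 0)
            throw new InvalidOperationException("Circular reference detected.");

        foreach (DataRow row in ready)
        {
            // Rewrite the import-side ParentID to the real database ID first.
            if (!(row["ParentID"] is DBNull))
                row["ParentID"] = insertedIds[(long)row["ParentID"]];

            long provisionalId = (long)row["ID"];
            long newId = InsertRow(row);   // hypothetical helper, see above
            insertedIds[provisionalId] = newId;
            row["ID"] = newId;
            pending.Remove(row);
        }
    }
}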
Anyway: how would you tackle this problem, and how would you improve my method so far?
I would create a stored procedure that does the whole process and returns the IDs. Then call the sproc from your C# code.
This is an example from my NuGet package SQLJSONReader (see the GitHub project page), where the SQL Server sproc returns JSON and my reader, ExecuteJsonReader, then converts the table result to a JSON string.
string sproc = "dbo.DoIt";
string result;
using (SqlConnection conn = new SqlConnection(connection))
{
    conn.Open();
    using (var cmd = new SqlCommand(sproc, conn) { CommandType = CommandType.StoredProcedure, CommandTimeout = 600 })
    {
        if (parameters != null)
            cmd.Parameters.AddRange(parameters);

        var reader = await cmd.ExecuteJsonReaderAsync();
        result = await reader.ReadAllAsync();
    }
}
So your process is similar; just use your own reader.
I am a multithreading novice and a SQL novice, so please excuse any rookie mistakes.
I am trying to execute many SQL queries asynchronously. The queries are all select statements from the same table in the same database. I can run them synchronously and everything works fine, but testing a small subset leads me to believe that to run all the queries synchronously would take approximately 150 hours, which is far too long. As such, I'm trying to figure out how to run them in parallel.
I have tried to model the code after the answer at "run a method multiple times simultaneously in c#", but my code is not executing correctly (it errors, though I do not know specifically how; the code just says an error occurred).
Here is what I have (A much smaller and simpler version of what I am actually doing):
class Program
{
    static void Main(string[] args)
    {
        List<string> EmployeeIDs = File.ReadAllLines(/* Filepath */).ToList();
        List<Tuple<string, string>> NamesByID = new List<Tuple<string, string>>();

        // What I do not want to do (because it takes too long) ...
        using (SqlConnection conn = new SqlConnection(/* connection string */))
        {
            foreach (string id in EmployeeIDs)
            {
                using (SqlCommand cmd = new SqlCommand("SELECT FirstName FROM Employees WITH (NOLOCK) WHERE EmployeeID = " + id, conn))
                {
                    try
                    {
                        conn.Open();
                        NamesByID.Add(new Tuple<string, string>(id, cmd.ExecuteScalar().ToString()));
                    }
                    finally
                    {
                        conn.Close();
                    }
                }
            }
        }

        // What I do want to do (but it errors) ...
        var tasks = EmployeeIDs.Select(id => Task<Tuple<string, string>>.Factory.StartNew(() => RunQuery(id))).ToArray();
        Task.WaitAll(tasks);
        NamesByID = tasks.Select(task => task.Result).ToList();
    }

    private static Tuple<string, string> RunQuery(string id)
    {
        using (SqlConnection conn = new SqlConnection(/* connection string */))
        {
            using (SqlCommand cmd = new SqlCommand("SELECT FirstName FROM Employees WITH (NOLOCK) WHERE EmployeeID = " + id, conn))
            {
                try
                {
                    conn.Open();
                    return new Tuple<string, string>(id, cmd.ExecuteScalar().ToString());
                }
                finally
                {
                    conn.Close();
                }
            }
        }
    }
}
Note: I do not care exactly how this is multithreaded (tasks, Parallel.ForEach, BackgroundWorker, etc.). This is going to be used to run ~30,000 SELECT queries exactly once, so I just need it to run fast (I'm hoping for ~8 hrs = one work day, but I'll take what I can get). It doesn't really have to be pretty.
Thank you in advance!
This is just plain wrong. You should build one query that selects all the FirstNames you need. If you need to pass a bunch of IDs to the server, that is no problem: just use a table-valued parameter (aka TVP); a comma-separated list of values really does not scale well. If the query is correctly written and the tables are indexed, it should be quite fast. A 100k-row table is a small table.
The query might then look like this:
SELECT DollarAmount, comp.CompanyID
FROM Transactions
JOIN (SELECT MIN(TransactionID) AS minTransactionID, CompanyID
      FROM CompanyTransactions
      GROUP BY CompanyID
     ) AS comp
    ON Transactions.TransactionID = comp.minTransactionID
JOIN @IDList ON id = comp.CompanyID
You may use IN instead of JOIN if the IDs in the TVP are not unique.
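On the C# side, a minimal sketch of passing the IDs as a TVP; the dbo.IntIdList type name is an assumption (create it once with CREATE TYPE), and EmployeeIDs is the list from the question:
// Build a DataTable that matches the hypothetical dbo.IntIdList table type:
// CREATE TYPE dbo.IntIdList AS TABLE (id int);
var idTable = new DataTable();
idTable.Columns.Add("id", typeof(int));
foreach (string id in EmployeeIDs)
    idTable.Rows.Add(int.Parse(id));

var namesById = new List<Tuple<string, string>>();
using (var conn = new SqlConnection(/* connection string */))
using (var cmd = new SqlCommand(
    "SELECT EmployeeID, FirstName FROM Employees " +
    "WHERE EmployeeID IN (SELECT id FROM @IDList)", conn))
{
    var p = cmd.Parameters.AddWithValue("@IDList", idTable);
    p.SqlDbType = SqlDbType.Structured;  // marks the parameter as a TVP
    p.TypeName = "dbo.IntIdList";        // must match the CREATE TYPE name

    conn.Open();
    using (var rdr = cmd.ExecuteReader())
    {
        while (rdr.Read())
            namesById.Add(Tuple.Create(rdr.GetInt32(0).ToString(), rdr.GetString(1)));
    }
}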
By the way, do you know what NOLOCK means? If you are the only user of the database, or you use it single-threaded, or you do not modify any data, then you are safe. Otherwise it means that you are okay with a small chance that:
some records may be missing from the result
there are duplicate records in the result
there are rows in the result that were never committed and never accepted as valid data
if you use varchar(max), you may get text that was never actually stored
You want to do one query to get all of the ID/Name combinations, then put them into a dictionary (for quick access). This removes the very slow process of running 30,000 queries and also reduces the complexity of your code.
I could get you something more concrete if you posted the actual SQL query (you can change the column and table names if you need to), but this should be close:
;WITH CompTransCTE AS (
    SELECT CompanyID, MIN(TransactionID) AS TransactionID
    FROM CompanyTransactions
    WHERE CompanyID IN (/* Comma-separated list of values */)
    GROUP BY CompanyID
)
SELECT CT.CompanyID, T.DollarAmount, T.TransactionID
FROM Transactions AS T
INNER JOIN CompTransCTE AS CT ON CT.TransactionID = T.TransactionID;
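A hedged sketch of the C# side, loading that combined result into a dictionary keyed by CompanyID (combinedQuery stands for the query above, and DollarAmount is assumed to be a decimal column):
// Read the single combined result set into a dictionary for quick lookup.
var amountsByCompany = new Dictionary<int, decimal>();
using (var conn = new SqlConnection(/* connection string */))
using (var cmd = new SqlCommand(combinedQuery, conn)) // the CTE query above
{
    conn.Open();
    using (var rdr = cmd.ExecuteReader())
    {
        while (rdr.Read())
            amountsByCompany[rdr.GetInt32(0)] = rdr.GetDecimal(1); // CompanyID, DollarAmount
    }
}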
Without creating a User-Defined Table Type in the database, you can use SqlBulkCopy to load the IDs into a temp table, and reference that in the query.
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;

namespace ConsoleApp11
{
    class Program
    {
        static void Main(string[] args)
        {
            //var EmployeeIDs = File.ReadAllLines(""/* Filepath */);
            var EmployeeIDs = Enumerable.Range(1, 30 * 1000).ToList();

            var dt = new DataTable();
            dt.Columns.Add("id", typeof(int));
            dt.BeginLoadData();
            foreach (var id in EmployeeIDs)
            {
                var row = dt.NewRow();
                row[0] = id;
                dt.Rows.Add(row);
            }
            dt.EndLoadData();

            using (SqlConnection conn = new SqlConnection("server=.;database=tempdb;integrated security=true"))
            {
                conn.Open();

                var cmdCreateTemptable = new SqlCommand("create table #ids(id int primary key)", conn);
                cmdCreateTemptable.ExecuteNonQuery();

                //var cmdCreateEmpTable = new SqlCommand("create table Employees(EmployeeId int primary key, FirstName varchar(2000))", conn);
                //cmdCreateEmpTable.ExecuteNonQuery();

                var bc = new SqlBulkCopy(conn);
                bc.DestinationTableName = "#ids";
                bc.ColumnMappings.Add("id", "id");
                bc.WriteToServer(dt);

                var names = new List<string>();
                var cmd = new SqlCommand("SELECT FirstName, EmployeeId FROM Employees WHERE EmployeeID in (select id from #ids)", conn);
                using (var rdr = cmd.ExecuteReader())
                {
                    while (rdr.Read())
                    {
                        var firstName = rdr.GetString(0);
                        var id = rdr.GetInt32(1);
                        names.Add(firstName);
                    }
                }

                Console.WriteLine("Hit any key to continue");
                Console.ReadKey();
            }
        }
    }
}
I'm using C# and SQL Server. I have a list of IDs for documents which corresponds to the primary key for a table in SQL Server that has a row for each document and the row contains (among other things) the ID and the document for that ID. I want to get the document in the row for each of the IDs. Currently, I execute a query for each ID, but since there are 10,000s of them, this runs a ton of queries and takes a very long time. It ends up being faster to simply load everything from the table into memory and then filter by the ids I have, but that seems inefficient and won't scale over time. If that doesn't make sense, hopefully the following code that takes a long time to run shows what I'm trying to do.
private static Dictionary<Guid, string> foo(IEnumerable<Guid> guids, SqlConnection conn)
{
    using (SqlCommand command = new SqlCommand(null, conn))
    {
        command.CommandText = "select document from Documents where id = @id";
        SqlParameter idParam = new SqlParameter("@id", SqlDbType.UniqueIdentifier);
        command.Parameters.Add(idParam);
        command.Prepare();

        var documents = new Dictionary<Guid, string>();
        foreach (var guid in guids)
        {
            idParam.Value = guid;
            object obj = command.ExecuteScalar();
            if (obj != null)
            {
                documents[guid] = (string)obj;
            }
        }
        return documents;
    }
}
I could programmatically construct query strings with a WHERE clause like ".... where id in (ID1, ID2, ID3, ..., ID100)" to get 100 documents at a time or something like that, but this feels janky, and it seems to me like there's got to be a better way.
I'm sure I'm not the only one to run into this. Is there an accepted way to go about this?
You can use table-valued parameters, with no limit on the number of GUIDs.
In the code, you create a single SqlParameter that carries all the IDs you need.
First, you need to create the type of the parameter in SQL Server:
CREATE TYPE IdTableType AS TABLE
(
    Id uniqueidentifier
);
Then, in the code:
private static Dictionary<Guid, string> foo(IEnumerable<Guid> guids, SqlConnection conn)
{
    using (SqlCommand command = new SqlCommand(null, conn))
    {
        // Use the parameter as a normal table in the query; select the id
        // as well, so each document can be keyed in the dictionary.
        command.CommandText =
            "select d.id, d.document from Documents d inner join @AllIds a ON d.id = a.Id";

        // A DataTable is used as the value of the table-valued parameter
        DataTable allIds = new DataTable();
        allIds.Columns.Add("Id", typeof(Guid)); // Column name must match the one in the created type
        foreach (var id in guids)
            allIds.Rows.Add(id);

        SqlParameter idParam = new SqlParameter
        {
            ParameterName = "@AllIds",
            SqlDbType = SqlDbType.Structured, // Important for table-valued parameters
            TypeName = "IdTableType",         // Important! The name of the type must be provided
            Value = allIds
        };
        command.Parameters.Add(idParam);

        var documents = new Dictionary<Guid, string>();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                documents[reader.GetGuid(0)] = reader.GetString(1);
            }
        }
        return documents;
    }
}
You don't need to prepare the command any more. Besides, after the first execution, subsequent queries will reuse the same compiled query plan, because the query text remains the same.
You can batch them into sets of IDs and pass a table-valued parameter into the query. With Dapper this looks a bit like:
connection.Query("select document from Documents where id in @ids", new { ids = guids });
BEWARE though: SQL Server has a limit of roughly 2,100 parameters per command, so you will need to batch up your reads.
By the way, I'd highly recommend looking at Dapper or another micro-ORM for this type of data access.
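For illustration, a minimal sketch of that batching with Dapper (reusing connection and guids from above; the batch size is an arbitrary assumption):
// Query in chunks so each command stays well under the parameter limit;
// Dapper expands "in @ids" into one parameter per value in the list.
const int batchSize = 1000;
var idList = guids.ToList();
var documents = new Dictionary<Guid, string>();

for (int i = 0; i < idList.Count; i += batchSize)
{
    var batch = idList.Skip(i).Take(batchSize).ToList();
    var rows = connection.Query(
        "select id, document from Documents where id in @ids",
        new { ids = batch });

    foreach (var row in rows)
        documents[(Guid)row.id] = (string)row.document;
}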
Here's the stored proc I want to use in my DB:
ALTER PROCEDURE dbo.usp_DeleteExample
    @ExampleID int, @LoggedInUserID int, @SessionID int, @AppID smallint
AS
    DECLARE @ExampleName varchar(255)
    SET @ExampleName = (SELECT VehicleName FROM Example WHERE ExampleID = @ExampleID)

    DELETE FROM ExampleToGeofence WHERE ExampleID = @ExampleID;

    EXEC usp_DeleteExampleHistoryData @ExampleID;

    DELETE FROM Example WHERE ExampleID = @ExampleID;

    INSERT INTO UserActionHistory(ActionID, UserID, SessionID, ItemID, OldValue, ApplicationID, EventUTCDate)
    VALUES (203, @LoggedInUserID, @SessionID, @ExampleID, @ExampleName, @AppID, getutcdate());
Here's my code where I'm trying to use it:
public Example Delete(Example example)
{
    db.Examples.SqlQuery("usp_DeleteExample @ExampleID",
        new SqlParameter("ExampleID", example.ExampleID)
    );
    db.SaveChanges();
    return example;
}
Yet, when I call this through my WebAPI, nothing gets deleted.
What am I doing wrong here? Please let me know if you need more information.
Thank you.
Edit:
ModelBuilder configuration I just added:
modelBuilder.Entity<Example>()
    .MapToStoredProcedures(e =>
        e.Delete(v => v.HasName("usp_DeleteExample")));
I believe you should be using DbContext.Database.ExecuteSqlCommand. I'm not aware of any such form as DbContext.TableName.ExecuteSqlCommand.
db.Database.ExecuteSqlCommand("usp_DeleteExample @ExampleID, @LoggedInUserID, @SessionID, @AppID",
    new SqlParameter("ExampleID", example.ExampleID),
    new SqlParameter("LoggedInUserID", variableName),
    new SqlParameter("SessionID", variableName),
    new SqlParameter("AppID", variableName));
Because you are not modifying an entity, there's no need to call SaveChanges() either. Personally, I import the SPs in the designer so they are readily available in C#.
Also note that you need to add the other three parameters, because none of the parameters are optional.
SqlQuery is used to return entities. I think you want ExecuteSqlCommand (on Database, since a DbSet has no such method):
public void Delete(Example example)
{
    db.Database.ExecuteSqlCommand("exec usp_DeleteExample @ExampleID",
        new SqlParameter("@ExampleID", example.ExampleID)
    );
}
It seems there are some limitations on the use of stored procedures for CRUD operations:
You must map insert, update and delete stored procedures to an entity
if you want to use stored procedure for CUD operations. Mapping only
one of them is not allowed.
Source
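In other words, mapping all three operations together should satisfy that limitation; a hedged sketch (the insert/update sproc names are hypothetical):
// EF6: map Insert, Update, and Delete together; mapping only Delete
// is what triggers the limitation quoted above.
modelBuilder.Entity<Example>()
    .MapToStoredProcedures(s => s
        .Insert(i => i.HasName("usp_InsertExample"))   // hypothetical sproc
        .Update(u => u.HasName("usp_UpdateExample"))   // hypothetical sproc
        .Delete(d => d.HasName("usp_DeleteExample")));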
public HttpResponseMessage Delete(int id)
{
    conn.ConnectionString = @"Data Source=DESKTOP-QRACUST\SQLEXPRESS;Initial Catalog=LeadManagement;Integrated Security=True";
    conn.Open();

    DataTable dt = new DataTable();
    using (conn)
    // delete via plain query:
    //using (var cmd = new SqlCommand("DELETE FROM EmployeeDetails WHERE EmployeeId=3;", conn))
    // delete via stored procedure:
    using (var cmd = new SqlCommand("EmployeeDetailsDelete", conn))
    using (var da = new SqlDataAdapter(cmd))
    {
        da.Fill(dt);
    }
    return Request.CreateResponse(HttpStatusCode.OK, dt);
}
I am writing a simple reporting tool that will need to move data from a table in one Access database to a table in another Access database (the table structure is identical). However, I am new to C# and am finding it hard to come up with a reliable solution.
Any pointers would be greatly appreciated.
Access SQL supports using an IN clause to specify that a table resides in a different database. The following C# code SELECTs rows from a table named [YourTable] in Database1.accdb and INSERTs them into an existing table named [YourTable] (with the identical structure) in Database2.accdb:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.OleDb;

namespace oleDbTest
{
    class Program
    {
        static void Main(string[] args)
        {
            string myConnectionString;
            myConnectionString =
                @"Provider=Microsoft.ACE.OLEDB.12.0;" +
                @"Data Source=C:\Users\Public\Database1.accdb;";

            using (var con = new OleDbConnection())
            {
                con.ConnectionString = myConnectionString;
                con.Open();
                using (var cmd = new OleDbCommand())
                {
                    cmd.Connection = con;
                    cmd.CommandType = System.Data.CommandType.Text;
                    cmd.CommandText =
                        @"INSERT INTO YourTable IN 'C:\Users\Public\Database2.accdb' " +
                        @"SELECT * FROM YourTable WHERE ID < 103";
                    cmd.ExecuteNonQuery();
                }
                con.Close();
            }
            Console.WriteLine("Done.");
        }
    }
}
Many ways.
0) If it's only once, copy and paste the table.
1) If you want to do this inside Access, the easiest way is to create a linked table in the new database, and then run a make-table query in the new database.
2) You can reference the second table directly.
SELECT *
FROM TableInDbX IN 'C:\SomeFolder\DB X';
3) In a macro, you can use the TransferDatabase method of the DoCmd object to link relevant tables and then run suitable append and update queries to synchronize.
4) VBA
http://www.techonthenet.com/access/questions/new_mdb.php
Given column names Col1, Col2, and Col3:
private static void Migrate(string dbConn1, string dbConn2) {
    // DataTable to store your info in
    var table = new DataTable();

    // Modify your SELECT command as needed
    string sqlSelect = "SELECT Col1, Col2, Col3 FROM aTableInOneAccessDatabase ";

    // Notice this uses the connection string to DB1
    using (var cmd = new OleDbCommand(sqlSelect, new OleDbConnection(dbConn1))) {
        cmd.Connection.Open();
        table.Load(cmd.ExecuteReader());
        cmd.Connection.Close();
    }

    // Modify your INSERT command as needed
    string sqlInsert = "INSERT INTO aTableInAnotherAccessDatabase " +
                       "(Col1, Col2, Col3) VALUES (@Col1, @Col2, @Col3) ";

    // Notice this uses the connection string to DB2
    using (var cmd = new OleDbCommand(sqlInsert, new OleDbConnection(dbConn2))) {
        // Modify these parameters to match the column types in the new table
        cmd.Parameters.Add("@Col1", OleDbType.Integer);
        cmd.Parameters.Add("@Col2", OleDbType.VarChar, 50);
        cmd.Parameters.Add("@Col3", OleDbType.Date);

        cmd.Connection.Open();
        foreach (DataRow row in table.Rows) {
            // Fill in each parameter with data from your table's row
            cmd.Parameters["@Col1"].Value = row["Col1"];
            cmd.Parameters["@Col2"].Value = row["Col2"];
            cmd.Parameters["@Col3"].Value = row["Col3"];

            // Insert that data
            cmd.ExecuteNonQuery();
        }
        cmd.Connection.Close();
    }
}
Now, I do not work with Access databases very often, so you may need to tweak something up there.
That should get you well on your way, though.
Worth noting:
If I remember correctly, Access does NOT pay attention to your OleDbParameter names! You could call them whatever you want, and in fact most people just use a question mark ? for the parameter fields.
So, you have to add and update these parameters in the same order that your statement calls them.
So, why did I name the parameters @Col1, @Col2, @Col3? Here, it is just to help you and me understand where each parameter is intended to map. It is also a good practice to get into. If you ever migrate to a better database, hopefully it will pay attention to what the parameters are named.
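For illustration, a minimal sketch of the same INSERT using positional ? placeholders, the more common OleDb style (table and column names as in Migrate above):
// OleDb binds parameters by position, so the order of the Add calls must
// match the order of the ? placeholders in the statement.
string sqlInsert = "INSERT INTO aTableInAnotherAccessDatabase " +
                   "(Col1, Col2, Col3) VALUES (?, ?, ?) ";

using (var cmd = new OleDbCommand(sqlInsert, new OleDbConnection(dbConn2))) {
    cmd.Parameters.Add("@Col1", OleDbType.Integer);     // position 1
    cmd.Parameters.Add("@Col2", OleDbType.VarChar, 50); // position 2
    cmd.Parameters.Add("@Col3", OleDbType.Date);        // position 3
    // ... open, set values, and ExecuteNonQuery as in Migrate above
}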
I'm using Oracle's ODAC.NET for a .NET 3.5 project against an Oracle 11 Express database, and I'm seeing behavior that I can't explain (and can't seem to work around).
ODAC should be the latest, I just pulled it 3 days ago, but the versions are as follows:
Oracle.DataAccess.dll version 2.112.3.0 (release 5)
oci.dll (instant client) version 11.2.0.1
I have a Table, People, that has 3 columns:
ID
FirstName
LastName
In code I run an ALTER TABLE command, using OracleCommand.ExecuteNonQuery, to add a new column named "MIDDLE_NAME" to the table. That command succeeds. If I look at the table with Oracle SQL Developer, the column shows up. All well and good.
Now if I run OracleCommand.ExecuteReader with a command text of SELECT * FROM People right after I do the alter table, I get back data with only 3 columns, not 4!
Here is code that reproduces the problem:
public void FieldTest()
{
    var sql1 = "CREATE TABLE People (" +
               "ID NUMBER PRIMARY KEY, " +
               "FirstName NVARCHAR2 (200), " +
               "LastName NVARCHAR2 (200) NOT NULL)";

    var sql2 = "ALTER TABLE People " +
               "ADD Middle_Name NUMBER";

    var sql3 = "SELECT * FROM People";

    var sql4 = "SELECT column_name FROM all_tab_cols WHERE table_name = 'PEOPLE'";

    var cnInfo = new OracleConnectionInfo("192.168.10.246", 1521, "XE", "system", "password");
    var connectionString = BuildConnectionString(cnInfo);
    using (var connection = new OracleConnection(connectionString))
    {
        connection.Open();

        using (var create = new OracleCommand(sql1, connection))
        {
            create.ExecuteNonQuery();
        }

        using (var get = new OracleCommand(sql3, connection))
        {
            using (var reader = get.ExecuteReader())
            {
                Debug.WriteLine("Columns: " + reader.FieldCount);
                // outputs 3, which is right
            }
        }

        using (var alter = new OracleCommand(sql2, connection))
        {
            alter.ExecuteNonQuery();
        }

        using (var get = new OracleCommand(sql3, connection))
        {
            using (var reader = get.ExecuteReader())
            {
                Debug.WriteLine("Columns: " + reader.FieldCount);
                // outputs 3, which is *wrong*  <---- Here's the problem
            }
        }

        using (var cols = new OracleCommand(sql4, connection))
        {
            using (var reader = cols.ExecuteReader())
            {
                int count = 0;
                while (reader.Read())
                {
                    count++;
                    Debug.WriteLine("Col: " + reader.GetString(0));
                }
                Debug.WriteLine("Columns: " + count.ToString());
                // outputs 4, which is right
            }
        }
    }
}
I've tried some things to prevent the behavior, and none of them give me back the 4th column:
I close the connection and re-open it
I use a different OracleConnection for the SELECT than for the ALTER
I use the same OracleConnection for the SELECT and the ALTER
I use a different OracleCommand for the SELECT than for the ALTER
I use the same OracleCommand for the SELECT and the ALTER
I call PurgeStatementCache on the connection between the ALTER and the SELECT
I call FlushCache on the connection between the ALTER and the SELECT
I explicitly Close and Dispose the OracleCommand and OracleConnection (as opposed to the using block) used for the ALTER and the SELECT
I restarted the calling PC and the PC hosting the Oracle database
If I look at the column list by doing a SELECT * FROM all_tab_cols, the new column is there.
The only thing that seems to work reliably is closing the app and restarting it (well, it's from a unit test, but it's a shutdown and restart of the test host). Then I get that 4th column. Sometimes I can use breakpoints and re-execute queries and the 4th column will appear, but nothing that is specifically repeatable with straight execution of code (meaning without setting a breakpoint and moving the execution point back up).
Something in the bowels of ODAC seems to be caching the schema of that table, but I can't figure out what, why, or how to prevent it. Does anyone have any experience with this, or ideas on how I might prevent it?
I know this answer comes years later, but if new readers run into problems with caching, try setting:
Metadata Pooling = false, Self Tuning = false, and Statement Cache Size = 0
...in the connection string. Keep in mind that there are performance implications for doing so.
https://docs.oracle.com/database/122/ODPNT/featConnecting.htm#GUID-0CFEB161-68EF-4BC2-8943-3BDFFB878602
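A hedged example of such a connection string (the user, password, and data source are placeholders):
// ODP.NET connection string with the caching-related settings disabled.
// Expect slower repeated statement execution with these turned off.
var connectionString =
    "User Id=scott;Password=tiger;Data Source=XE;" +
    "Metadata Pooling=false;Self Tuning=false;Statement Cache Size=0";

using (var connection = new OracleConnection(connectionString))
{
    connection.Open();
    // ... ALTER TABLE and subsequent SELECTs should now see new columns
}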
Maybe post some of your C# code. The following is a test that behaves as expected, meaning I can see the new column immediately after adding it. This uses ODP 11.2 release 5 hitting an 11g DB, on the 4.0 framework:
The test table is:
CREATE TABLE T1
(
    DTE DATE DEFAULT sysdate
);
Drop and recreate it after each run of the following C# code (a bit dirty, but anyway):
string connStr = "User Id=xxx;Password=yyy;Data Source=my11gDb;";
using (OracleConnection con = new OracleConnection(connStr))
{
    string s = "ALTER TABLE T1 ADD (added_col VARCHAR2(10))";
    using (OracleCommand cmd = new OracleCommand(s, con))
    {
        con.Open();
        cmd.ExecuteNonQuery();

        string s2 = "select column_name from all_tab_columns where table_name = 'T1'";
        //con.FlushCache(); // doesn't seem to matter, works with or without

        using (OracleCommand cmd2 = new OracleCommand(s2, con))
        {
            OracleDataReader rdr = cmd2.ExecuteReader();
            for (int i = 0; rdr.Read(); i++)
            {
                Console.WriteLine("Column {0} => {1}", i + 1, rdr.GetString(0));
            }
            rdr.Close();
        }
    }
}
Output:
Column 1 => DTE
Column 2 => ADDED_COL
Edit:
Ah, OK, I see what you're saying; it looks like statement caching. I played around with changing the cache size to 0 (in the connection string, use "Statement Cache Size=0"), and I also tried cmd.AddToStatementCache = false, but these did not work.
One thing that does work is to use a slightly different query string, like adding a space. I know it's a hack, but it's all I can get to work for me anyway.
Try your example with:
var sql3 = "SELECT * FROM People";
var sql5 = "SELECT * FROM People "; // note extra space
And use sql3 before adding the column, and sql5 after adding it.
Hope that helps.