I have a C# application that uses ADO.NET to connect to MS SQL Server.
I need to create a table (with a dynamic number of columns), then insert many records, and then select back out of that table.
Each step must be a separate call from C#, although I can keep a connection/transaction open for the duration.
There are two types of temp tables in SQL Server, local temp tables and global temp tables. From the BOL:
Prefix local temporary table names with a single number sign (#tablename), and prefix global temporary table names with a double number sign (##tablename).
Local temp tables live only for your current connection. Global ones are available to all connections. Thus, if you re-use the same connection across your related calls (and you did say you could), you can just use a local temp table without worrying about simultaneous processes interfering with each other's temp tables.
You can get more info on this from the BOL article, specifically under the "Temporary Tables" section about halfway down.
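As a minimal sketch of that approach (the connection string, table name, and columns below are placeholders, not from the question), each step is its own command, but all of them run on the same open connection, so the local temp table survives between calls:

using System;
using Microsoft.Data.SqlClient;

class TempTableDemo
{
    static void Main()
    {
        // Placeholder connection string - point it at your own server/database.
        var connectionString = @"Server=(localdb)\mssqllocaldb;Database=tempdb;Trusted_Connection=True";

        using var conn = new SqlConnection(connectionString);
        conn.Open();

        // Call 1: create the temp table (the column list could be built dynamically into this string)
        using (var create = new SqlCommand("CREATE TABLE #Results (Id int, Val nvarchar(100))", conn))
            create.ExecuteNonQuery();

        // Call 2: insert records (parameterized; repeat or loop as needed)
        using (var insert = new SqlCommand("INSERT INTO #Results (Id, Val) VALUES (@id, @val)", conn))
        {
            insert.Parameters.AddWithValue("@id", 1);
            insert.Parameters.AddWithValue("@val", "first");
            insert.ExecuteNonQuery();
        }

        // Call 3: select back out - #Results is still there because the same connection stayed open
        using (var select = new SqlCommand("SELECT Id, Val FROM #Results", conn))
        using (var reader = select.ExecuteReader())
            while (reader.Read())
                Console.WriteLine($"{reader.GetInt32(0)}: {reader.GetString(1)}");
    }
}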
The issue is that #Temp tables exist only within the Connection AND the Scope of the execution.
When the first call from C# to SQL completes, control passes up to a higher level of scope.
This is just as if you had a T-SQL script that called two stored procedures, each of which created a table named #MyTable. The second SP would be referencing a completely different table than the first SP.
However, if the parent T-SQL code created the table, both SPs could see it - but they still couldn't see each other's.
The solution here is to use ##Temp tables. They cross scopes and connections.
The danger, though, is that if you use a hard-coded name, two instances of your program running at the same time could see the same table. So dynamically set the table name to something that will always be unique.
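For example (a sketch; the prefix is arbitrary, and only the letters and digits of a GUID go into the name, so it is safe to splice into the DDL string):

// Assumes conn is an open SqlConnection (see the earlier sketch).
// Each program instance gets its own global temp table name.
string tempTableName = "##Batch_" + Guid.NewGuid().ToString("N");

using (var cmd = new SqlCommand($"CREATE TABLE [{tempTableName}] (Id int, Val nvarchar(100))", conn))
    cmd.ExecuteNonQuery();

// Later calls just reuse the same generated name in their SQL text.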
You might take a look at the repository pattern for dealing with this in C#. It lets you have a low-level repository layer for data access where each method performs one task, but the connection is passed in to the method and the actual actions are performed within a transaction scope. This means you can call many different methods in your data access layer (implemented as a repository) and, if any of them fail, roll back the whole operation.
http://martinfowler.com/eaaCatalog/repository.html
The other aspects of your question are handled by standard SQL: you can dynamically create a table, insert into it, delete from it, and so on. The tricky part is keeping one transaction isolated from another. You might look at using temp tables... or you might simply have a second database specifically for this dynamic-table work.
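As a rough sketch of that shape (the class and method names here are invented for illustration, not a known library), each repository method performs one step but borrows the caller's connection and transaction, so the caller decides when to commit or roll back:

using System.Data;
using Microsoft.Data.SqlClient;

// Hypothetical repository: it never owns the connection, so several calls can share one transaction.
public class WorkTableRepository
{
    public void Execute(SqlConnection conn, SqlTransaction tran, string sql)
    {
        using var cmd = new SqlCommand(sql, conn, tran);
        cmd.ExecuteNonQuery();
    }

    public DataTable Select(SqlConnection conn, SqlTransaction tran, string sql)
    {
        using var cmd = new SqlCommand(sql, conn, tran);
        var table = new DataTable();
        table.Load(cmd.ExecuteReader());
        return table;
    }
}

The caller opens the connection, begins the transaction, calls Execute for the CREATE TABLE and the INSERTs, calls Select at the end, and then commits - or rolls everything back if any step throws.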
Personally, I think you are doing this the hard way. Do all the steps in one stored proc.
One way to extend the scope/lifetime of your single-pound-sign #temp table is to use a transaction. For as long as the transaction lives, the #temp table continues to exist. You can also use TransactionScope to give you the same effect, because TransactionScope creates an ambient transaction in the background.
The below test methods pass, proving that the #temp table contents survive between executions.
This may be preferable to using double-pound temp tables, because ##temp tables are global objects. If you have more than one client that happens to use the same ##temp table name, then they could step on each other. Also, ##temp tables do not survive a server restart, so their lifespan is technically not forever. IMHO it's best to control the scope of #temp tables because they're meant to be limited.
using System.Linq;
using System.Transactions;
using Dapper;
using Microsoft.Data.SqlClient;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using IsolationLevel = System.Data.IsolationLevel;

namespace TestTempAcrossConnection
{
    [TestClass]
    public class UnitTest1
    {
        private string _testDbConnectionString = @"Server=(localdb)\mssqllocaldb;Database=master;trusted_connection=true";

        class TestTable1
        {
            public int Col1 { get; set; }
            public string Col2 { get; set; }
        }

        [TestMethod]
        public void TempTableBetweenExecutionsTest()
        {
            using var conn = new SqlConnection(_testDbConnectionString);
            conn.Open();
            var tran = conn.BeginTransaction(IsolationLevel.ReadCommitted);
            conn.Execute("create table #test1(col1 int, col2 varchar(20))", transaction: tran);
            conn.Execute("insert into #test1(col1,col2) values (1, 'one'),(2,'two')", transaction: tran);
            var tableResult = conn.Query<TestTable1>("select col1, col2 from #test1", transaction: tran).ToList();
            Assert.AreEqual(1, tableResult[0].Col1);
            Assert.AreEqual("one", tableResult[0].Col2);
            tran.Commit();
        }

        [TestMethod]
        public void TempTableBetweenExecutionsScopeTest()
        {
            using var scope = new TransactionScope();
            using var conn = new SqlConnection(_testDbConnectionString);
            conn.Open();
            conn.Execute("create table #test1(col1 int, col2 varchar(20))");
            conn.Execute("insert into #test1(col1,col2) values (1, 'one'),(2,'two')");
            var tableResult = conn.Query<TestTable1>("select col1, col2 from #test1").ToList();
            Assert.AreEqual(2, tableResult[1].Col1);
            Assert.AreEqual("two", tableResult[1].Col2);
            scope.Complete();
        }
    }
}
I'm trying to determine how I should handle a one to many relationship in my DB, when using the data to build a model in C#. Ideally, I'd like to make a single call to the DB. However, it seems that two (or more) calls might be required.
For simplicity, assume my tables look like this...
CREATE TABLE [dbo].[Users]
(
    [userId] INT PRIMARY KEY IDENTITY(0,1),
    [userName] NVARCHAR(500) NOT NULL
)

CREATE TABLE [dbo].[Tasks]
(
    [taskId] INT PRIMARY KEY IDENTITY(0,1),
    [description] NVARCHAR(1000) NOT NULL,
    [userId] INT FOREIGN KEY REFERENCES [dbo].[Users](userId)
)
So each user can have many tasks. I have a stored procedure that will return the details of a user, that looks like this...
CREATE PROCEDURE [dbo].[sp_GetUserDetail]
    @userId INT
AS
BEGIN
    SELECT
        [dbo].[Users].[userName] AS 'User',
        [dbo].[Tasks].[description] AS 'Task Description'
    FROM
        [dbo].[Users]
    INNER JOIN
        [dbo].[Tasks]
    ON
        [dbo].[Tasks].[userId] = [dbo].[Users].[userId]
    WHERE
        [dbo].[Users].[userId] = @userId
END
This procedure returns as many rows as there are tasks assigned to the user. The model I'm trying to fill would look something like this.
public interface User
{
    string Name { get; set; }
    List<string> Tasks { get; set; }
}
I see my options as follows:
Use this code, and loop through the rows that are returned from the DB to build the Tasks list.
Call one stored procedure to return the data from the Users table, then another to get the data from the Tasks table.
Some (unknown to me magic) way to have a single stored procedure return all the data in a single row.
Some other option I don't even know about.
How is this problem typically handled by experienced Developers?
There are some language/framework specific answers which I won't cover (because C# is not my forte), but it's worth looking at "data binding", which is one of the features of the .Net framework. You could also look at ORM tools for C#.
The example you give - "how do I load child information for my parent" - is common, and you have to trade the number of database queries against the amount of data each query returns and the complexity of your user interface code. For instance, if tasks also had foreign keys to sub-tasks (i.e. a self join), to a task_type, and to a project_id, you would have either:
1 query per table (your option 2): simplest to implement in the UI layer and in the database layer, but could easily cause dozens of database calls per screen.
1 query to retrieve all data for the screen (your option 1): a single database hit, so it should be faster, but the UI and database logic are more complex; you could potentially load the entire database into memory if you keep following foreign key relationships, and not all of that data may be necessary for the screen.
There is no "right" answer to this - it really depends on your application design.
However, there is an option you haven't mentioned (this is SQL Server-specific): a stored procedure can return multiple result sets. So, you could have one result set to provide the "header" data (user information), and one to provide task information.
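A sketch of consuming that from ADO.NET (the procedure name sp_GetUserDetailMulti and its two SELECTs - one against Users, one against Tasks - are assumptions, not the procedure above): SqlDataReader.NextResult() moves from the first result set to the second.

using System.Collections.Generic;
using System.Data;
using Microsoft.Data.SqlClient;

string connectionString = "Server=.;Database=YourDb;Trusted_Connection=True"; // placeholder
int userId = 42;                                                              // placeholder

using var conn = new SqlConnection(connectionString);
using var cmd = new SqlCommand("dbo.sp_GetUserDetailMulti", conn) { CommandType = CommandType.StoredProcedure };
cmd.Parameters.AddWithValue("@userId", userId);
conn.Open();

using var reader = cmd.ExecuteReader();

// Result set 1: the single user row
string name = null;
if (reader.Read())
    name = reader.GetString(reader.GetOrdinal("userName"));

// Result set 2: zero or more task rows
var tasks = new List<string>();
if (reader.NextResult())
    while (reader.Read())
        tasks.Add(reader.GetString(reader.GetOrdinal("description")));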
Is a single call to ExecuteNonQuery() atomic or does it make sense to use Transactions if there are multiple sql statements in a single DbCommand?
See my example for clarification:
using (var ts = new TransactionScope())
{
    using (DbCommand lCmd = pConnection.CreateCommand())
    {
        lCmd.CommandText = @"
            DELETE FROM ...;
            INSERT INTO ...";
        lCmd.ExecuteNonQuery();
    }
    ts.Complete();
}
If you don't ask for a transaction, you (mostly) don't get one. SQL Server wants everything in transactions and so, by default (with no other transaction management), for each separate statement, SQL Server will create a transaction and automatically commit it. So in your sample (if there was no TransactionScope), you'll get two separate transactions, both independently committed or rolled back (on error).
(Unless you've turned IMPLICIT_TRANSACTIONS ON for that connection, in which case you'll get one transaction, but you need an explicit COMMIT or ROLLBACK at the end. The only people I've found using this mode are people porting from Oracle who are trying to minimize changes. I wouldn't recommend turning it on for greenfield work because it'll just confuse people used to SQL Server's defaults.)
It's not atomic. The SQL engine treats that text as two separate statements. A TransactionScope is required (or any other form of transaction, e.g. an explicit BEGIN TRAN-COMMIT in the SQL text, if you prefer).
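A sketch of that second option, reusing the question's command but with hypothetical table names; SET XACT_ABORT ON makes SQL Server roll back the whole batch if either statement fails, instead of committing the DELETE on its own:

using (DbCommand lCmd = pConnection.CreateCommand())
{
    lCmd.CommandText = @"
        SET XACT_ABORT ON;
        BEGIN TRAN;

        DELETE FROM dbo.StagingRows;              -- hypothetical table
        INSERT INTO dbo.StagingRows (Id)          -- hypothetical table/column
            SELECT Id FROM dbo.SourceRows;        -- hypothetical table

        COMMIT;";
    lCmd.ExecuteNonQuery();
}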
No - as the above answers say, the command (as opposed to the individual statements within the command) will not be run inside a transaction.
It is easy to verify.
Sample code:
create table t1
(
    Id int not null,
    Name text
)
using (var conn = new SqlConnection(...))
using (var cmd = conn.CreateCommand())
{
    conn.Open();
    cmd.CommandText = @"
        insert into t1 values (1, 'abc');
        insert into t1 values (null, 'pqr');
    ";
    cmd.ExecuteNonQuery();
}
The second statement will fail. But the first statement will execute and you'll have a row in the table.
I am building an application in which I will producing some reports based off the results of some SQL queries executed against a number of different databases and servers. Since I am unable to create stored procedures on each server, I have my SQL scripts saved locally, load them into my C# application and execute them against each server using ADO.NET. All of the SQL scripts are selects that return tables, however, some of them are more complicated than others and involve multiple selects into table variables that get joined on, like the super basic example below.
My question is, using ADO.NET, is it possible to assign a string of multiple SQL queries that ultimately only returns a single data table to a SqlCommand object - e.g. the two SELECT statements below comprising my complete script? Or would I have to create a transaction and execute each individual query separately as its own command?
-- First Select
SELECT *
INTO #temp
FROM Table1;
--Second Select
SELECT *
FROM Table1
JOIN #temp
ON Table1.Id = #temp.Id;
Additionally, some of my scripts have comments embedded in them like the rudimentary example above - would these need to be removed or are they effectively ignored within the string? This seems to be working with single queries, in other words the "--This is a comment" is effectively ignored.
private void button1_Click(object sender, EventArgs e)
{
    string ConnectionString = "Server=server1;Database=test1;Trusted_Connection=True";
    using (SqlConnection conn = new SqlConnection(ConnectionString))
    {
        SqlCommand cmd = new SqlCommand("--This is a comment \n SELECT TOP 10 * FROM dbo.Tablw1;");
        DataTable dt = new DataTable();
        SqlDataAdapter sqlAdapt = new SqlDataAdapter(cmd.CommandText.ToString(), conn);
        sqlAdapt.Fill(dt);
        MessageBox.Show(dt.Rows.Count.ToString());
    }
}
Yes, that is absolutely fine; comments are ignored and it should work. The only thing to watch is the scoping of temporary tables: if you are used to working with stored procedures, temp tables created there are scoped to the procedure (they are removed when the stored procedure ends), whereas with direct commands they are not - they are connection-specific and survive between multiple operations. If that is a problem, take a look at "table variables".
Note: technically this is up to the backend provider; assuming you are using a standard database engine, you'll be OK. If you are using something exotic, then it might be a genuine question. For example, it might not work on "Bob's homemade OneNote ADO.NET provider".
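If that connection-level lifetime is a concern, a table variable keeps everything scoped to a single batch. A sketch using the tables from the question (staging only the Id column for brevity; ConnectionString as in your snippet):

// The whole script is one batch: the table variable exists only while it runs,
// and the adapter fills the DataTable from the final SELECT (the only result set).
var sql = @"
    DECLARE @temp TABLE (Id int);

    INSERT INTO @temp (Id)
    SELECT Id FROM Table1;

    SELECT t.*
    FROM Table1 t
    JOIN @temp x ON t.Id = x.Id;";

var dt = new DataTable();
using (var conn = new SqlConnection(ConnectionString))
using (var adapter = new SqlDataAdapter(sql, conn))
{
    adapter.Fill(dt);
}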
Yes, you can certainly do it.
You can build the query text with a collection or a StringBuilder, or simply assign the whole script to a string variable.
Whether you stage intermediate results in a temp table or a CTE is entirely up to you; either way you can fill the results into a DataTable.
And if you want a set of inserts, updates, or deletes to succeed or fail together, you can wrap them in a transaction without any issue (see the sketch below).
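A compact sketch of that transaction point (assuming the usual System.Text and SqlClient usings, the ConnectionString from the question, and purely illustrative table/column names):

// Compose a multi-statement modification script and run it in one transaction,
// so either both statements take effect or neither does.
var sql = new StringBuilder();
sql.AppendLine("UPDATE Table1 SET Processed = 1 WHERE Id = @id;");   // hypothetical column
sql.AppendLine("DELETE FROM WorkQueue WHERE Table1Id = @id;");       // hypothetical table

using (var conn = new SqlConnection(ConnectionString))
{
    conn.Open();
    using (var tran = conn.BeginTransaction())
    using (var cmd = new SqlCommand(sql.ToString(), conn, tran))
    {
        cmd.Parameters.AddWithValue("@id", 1);
        cmd.ExecuteNonQuery();
        tran.Commit(); // if anything throws before this, the transaction rolls back on dispose
    }
}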
I don't use ADO.NET, I use Entity Framework, but I think this is more of a SQL question than an ADO.NET question, so forgive me if I'm wrong. Provided you are selecting from Table1 in both queries, I think you should use this query instead:
select *
from Table1 tbl1
join Table1 tbl2
on tbl1.id = tbl2.id
Actually, I don't really see a reason you would ever have to move things into temp tables when options like Common Table Expressions are available to you.
Look up CTEs if you don't already know about them:
https://www.simple-talk.com/sql/t-sql-programming/sql-server-cte-basics/
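For reference, a sketch of the question's script expressed with a CTE instead of the temp table (assuming cmd is a SqlCommand on an open connection, and Table1 as above):

// The CTE only lives for this one statement - no temp object to create or clean up.
cmd.CommandText = @"
    WITH staged AS
    (
        SELECT Id FROM Table1
    )
    SELECT t.*
    FROM Table1 t
    JOIN staged s ON t.Id = s.Id;";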
I am creating a progress tracker for a school.
The progress tracker stores scores for each student in various threads and criteria within the threads.
I am currently planning on using a table per class (of students), which stores each student's progress in each thread, and then a table per thread, which stores their progress in each criterion within that thread.
I have no way of knowing how many classes (tables) there will be in the school, so I need some way of allowing Administrator accounts to create classes (tables) with a name specified by the Admin.
The easiest way I could think of was to use a variable as the table name at creation time, but is there a better way of doing this?
You CAN do something like that, but as D Stanley highlighted, you can't use parameters for table names. As such, you wouldn't be able to parameterise the user's input if it is to be used as the table name, which makes it a very bad idea: it immediately opens you up to SQL injection, which is never a good plan.
Even with tight sanitisation of the user's input there are too many variables to consider, which would no doubt require far more work than desired and could still fall prone to attacks as SQL evolves.
I would suggest rewording your question to give a general idea of what your app is trying to achieve, to see if there's another way forward without creating a table per user.
UPDATE
Based on your rewording of your question it sounds like you need to think about your desired database structure. I'd be tempted to have the following tables:
Students, with 1 entry per student, primary key of StudentId
Classes - with 1 entry per class, primary key of ClassId
Criteria - 1 entry per type of class criteria, primary key of CriteriaId
Progress - potentially multiple entries per student referencing the StudentId, ClassId, CriteriaId and the Score (perhaps ClassScore and CriteriaScore).
You could then have queries to the Progress table that pulled out a student's progress based on just their Id, or their Id and ClassId, or further still their Id, ClassId and CriteriaId etc.
In terms of allowing Admins to create their own you'd simply create queries that allow Admins to insert student records into the Student table, classes into the Class table and criteria into the Criteria table. On creating a Student record you'd also presumably capture their classes and criteria at the same time, which would insert their record into the Progress table (initially 0 for progress so far). You'd presumably also want an update statement to allow admins to update the Progress table for any given student.
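With that structure the admin screens only ever run parameterized statements against fixed table names, so nothing the user types ends up as an identifier. A sketch (conn is an open SqlConnection; the table and column names follow the outline above and are assumptions, as are the value variables):

// Record a score for a student in a class against a criterion.
using (var cmd = new SqlCommand(
    @"INSERT INTO Progress (StudentId, ClassId, CriteriaId, Score)
      VALUES (@studentId, @classId, @criteriaId, @score)", conn))
{
    cmd.Parameters.AddWithValue("@studentId", studentId);
    cmd.Parameters.AddWithValue("@classId", classId);
    cmd.Parameters.AddWithValue("@criteriaId", criteriaId);
    cmd.Parameters.AddWithValue("@score", score);
    cmd.ExecuteNonQuery();
}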
Anyway, hopefully this is enough of a pointer to enable you to not have to create a table per student or per class etc.
Well, first you must create the database and the table (or you can create them later from C#). You connect C# to SQL using a connection string, which you can keep in your resources file; it looks something like this example:
Provider=XXXXXX.DataSource.1 ; Data Source=XXXX.XXXX.XXXX.CXXX;
Persist Security Info=True;User ID=XXXXX;pASSWORD=XXXXXX;
Initial Catalog=XXXXX;Force Translate=0;
Catalog Library List=XXXXXX,XXXXX
Then you create a SqlConnection object and build the connection, for example with a method such as CreateConnection().
Assign it the connection string you put in your .resx file (or Resources), for example:
NameOfObject.ConnectionString = ConnStr();
and open it with NameOfObject.Open();
Now you can insert, delete, and execute queries. In the end your code will look something like this:
SqlConnection sqlConnection1 = new SqlConnection("Your Connection String"); // Here you can put the string that you'll use in your resx file
SqlCommand cmd = new SqlCommand();   // Initialize the command object for executing instructions (queries)

cmd.CommandText = "INSERT INTO TABLE_NAME_HERE VALUES (@name)"; // Parameterize the user's value instead of concatenating it
cmd.Parameters.AddWithValue("@name", nameYouWillExtractFromTheUser);
cmd.CommandType = CommandType.Text;  // The command is plain text
cmd.Connection = sqlConnection1;     // Assign the connection to the command
sqlConnection1.Open();               // Open the connection
cmd.ExecuteNonQuery();               // Execute the INSERT (no result set to read)
I am building a batch processing system. Batches of Units come in quantities from 20-1000. Each Unit is essentially a hierarchy of models (one main model and many child models). My task involves saving each model hierarchy to a database as a single transaction (either each hierarchy commits or it rolls back). Unfortunately EF was unable to handle two portions of the model hierarchy due to their potential to contain thousands of records.
What I've done to resolve this is set up SqlBulkCopy to handle these two potentially high count models and let EF handle the rest of the inserts (and referential integrity).
Batch Loop:
foreach (var unitDetails in BatchUnits)
{
var unitOfWork = new Unit(unitDetails);
Task.Factory.StartNew(() =>
{
unitOfWork.ProcessX(); // data preparation
unitOfWork.ProcessY(); // data preparation
unitOfWork.PersistCase();
});
}
Unit:
class Unit
{
public PersistCase()
{
using (var dbContext = new CustomDbContext())
{
// Need an explicit transaction so that
// EF + SqlBulkCopy act as a single block
using (var scope = new TransactionScope(TransactionScopeOption.Required,
new TransactionOptions() {
IsolationLevel = System.Transaction.IsolationLevel.ReadCommitted
}))
{
// Let EF Insert most of the records
// Note Insert is all it is doing, no update or delete
dbContext.Units.Add(thisUnit);
dbContext.SaveChanges(); // deadlocks, DbConcurrencyExceptions here
// Copy Auto Inc Generated Id (set by EF) to DataTables
// for referential integrity of SqlBulkCopy inserts
CopyGeneratedId(thisUnit.AutoIncrementedId, dataTables);
// Execute SqlBulkCopy for potentially numerous model #1
SqlBulkCopy bulkCopy1 = new SqlBulkCopy(...);
...
bulkCopy1.WriteToServer(dataTables["#1"]);
// Execute SqlBulkCopy for potentially number model #2
SqlBulkCopy bulkCopy2 = new SqlBulkCopy(...);
...
bulkCopy2.WriteToServer(dataTables["#2"]);
// Commit transaction
scope.Complete();
}
}
}
}
Right now I'm essentially stuck between a rock and a hard place. If I leave the IsolationLevel set to ReadCommitted, I get deadlocks between EF INSERT statements in different Tasks.
If I set the IsolationLevel to ReadUncommitted (which I thought would be fine since I'm not doing any SELECTs) I get DbConcurrencyExceptions.
I've been unable to find any good information about DbConcurrencyExceptions and Entity Framework but I'm guessing that ReadUncommitted is essentially causing EF to receive invalid "rows inserted" information.
UPDATE
Here is some background information on what is actually causing my deadlocking issues while doing INSERTS:
http://connect.microsoft.com/VisualStudio/feedback/details/562148/how-to-avoid-using-scope-identity-based-insert-commands-on-sql-server-2005
Apparently this same issue was present a few years ago when Linq To SQL came out and Microsoft fixed it by changing how scope_identity() gets selected. Not sure why their position has changed to this being a SQL Server problem when the same issue came up with Entity Framework.
This issue is explained fairly well here: http://connect.microsoft.com/VisualStudio/feedback/details/562148/how-to-avoid-using-scope-identity-based-insert-commands-on-sql-server-2005
Essentially it's an internal EF issue. I migrated my code to use LINQ to SQL and it now works fine (it no longer does the unnecessary SELECT for the identity value).
Relevant quote from the exact same issue in Linq To Sql which was fixed:
When a table has an identity column, Linq to SQL generates extremely
inefficient SQL for insertion into such a table. Assume the table is
Order and the identity column is Id. The SQL generated is:
exec sp_executesql N'INSERT INTO [dbo].[Order]([Colum1], [Column2])
VALUES (@p0, @p1)
SELECT [t0].[Id] FROM [dbo].[Order] AS [t0] WHERE [t0].[Id] =
(SCOPE_IDENTITY())',N'@p0 int,@p1 int',@p0=124,@p1=432
As one can see instead of returning SCOPE_IDENTITY() directly by using
'SELECT SCOPE_IDENTITY()', the generated SQL performs a SELECT on the
Id column using the value returned by SCOPE_IDENTITY(). When the
number of the records in the table is large, this significantly slows
down the insertion. When the table is partitioned, the problem gets
even worse.
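For contrast, a sketch of the direct pattern the report says should be used (hypothetical Order table and columns; the identity value comes straight back from SCOPE_IDENTITY() instead of being used to re-query the table):

// Assumes conn is an open SqlConnection.
using (var cmd = new SqlCommand(
    @"INSERT INTO [dbo].[Order] ([Column1], [Column2]) VALUES (@p0, @p1);
      SELECT CAST(SCOPE_IDENTITY() AS int);", conn))
{
    cmd.Parameters.AddWithValue("@p0", 124);
    cmd.Parameters.AddWithValue("@p1", 432);
    int newOrderId = (int)cmd.ExecuteScalar(); // the new identity value, no extra lookup
}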