We are using the following code sample to send data from my application to an external database via a stored procedure (SQL Server).
I now need to support MySQL as well. So, based on the database selected by the end user, we need to send the data to either MySQL or SQL Server.
The C# code will be running on one machine and the database server will be a different server.
C# Code
using (SqlConnection sqlConnection = new SqlConnection(<<MyConnectionString>>))
{
    using (SqlCommand sqlCommand = new SqlCommand(<<StoredProcedureName>>, sqlConnection))
    {
        sqlCommand.CommandType = CommandType.StoredProcedure;
        sqlCommand.Parameters.Add("tblStudent", SqlDbType.Xml).Value = students.ToList().ToXML();
        sqlConnection.Open();
        sqlCommand.ExecuteNonQuery();
    }
}
Stored Procedure
CREATE PROCEDURE usp_UpdateStudent (@tblStudent XML)
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO Student (StudentId, StudentName)
    SELECT Student.value('(StudentId)[1]', 'nvarchar(100)') AS StudentId,
           Student.value('(StudentName)[1]', 'nvarchar(100)') AS StudentName
    FROM @tblStudent.nodes('/ArrayOfStudent/Student') AS TEMPTABLE(Student)
END
I searched the web for how to pass an XML string as an input parameter from C# to a stored procedure, but I didn't find a concrete answer.
Please advise on how to create a stored procedure with XML as an input parameter, and how to pass the XML string from C# to it.
Note: The above code works as expected in SQL Server. When I tried to implement the same with MySQL, I found that MySQL does not support XML as an input parameter type in a stored procedure. It looks like I need to pass the XML as plain text and parse the text inside the stored procedure.
Please let me know if there is a more efficient way to do this.
LOAD_FILE() imports the XML data into a local variable, and ExtractValue() then queries the XML data using XPath. For instance, the code below retrieves a count of students from the xml_content variable:
declare xml_content text;
declare v_row_count int unsigned;

set xml_content = load_file(path);
set v_row_count = extractValue(xml_content, concat('count(', node, ')'));

-- example values: path = 'C:\students1.xml', node = '/student_list/student'
I realise this is an old question, but I too failed to find a good answer on SO, so I was forced to do a bit of work myself! The end result seems to work. Adapting it for your purposes gives you something like:
DELIMITER $$
CREATE PROCEDURE `usp_InsertStudent` (ptblStudent text)
BEGIN
    declare cnt int;
    declare ptr int;
    declare rowPtr varchar(100);

    set cnt = extractValue(ptblStudent, 'count(/ArrayOfStudent/Student)');
    set ptr = 0;

    while ptr < cnt do
        SET ptr = ptr + 1;
        SET rowPtr = concat('/ArrayOfStudent/Student[', ptr, ']');
        INSERT INTO Student (StudentId, StudentName)
        values (extractValue(ptblStudent, concat(rowPtr, '/StudentId')),
                extractValue(ptblStudent, concat(rowPtr, '/StudentName')));
    end while;

    SELECT ptr;
END$$
DELIMITER ;
As an aside - I changed the routine name (your example was doing an insert not an update).
By way of explanation: if you do extractValue(ptblStudent, '/ArrayOfStudent/Student/StudentId') you get a single result string, with all the values separated by a space. Given that it is not so easy (in my limited experience) to split a string in MySQL, it seemed better to extract the individual values row by row; hence the need to check the row count and the use of the while loop. This was particularly true when there are many fields: I did first try extracting all fields at once into space-separated strings, splitting the strings into separate temporary tables each with an auto_increment id, and then joining the temporary tables on the ids, but this quickly became messy when more than three fields were required.
This does mean that the inserts become single inserts, which is why I return ptr to indicate the number of rows added, rather than row_count() (which in this case would be 1!).
I was pleasantly surprised at the flexibility of MySQL when it came to implicit casting: I have so far tested ints, doubles and DateTimes successfully, in addition to strings; no explicit casting has been necessary to date.
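To pass the XML from C# to this procedure, a minimal sketch might look like the following (assuming the MySql.Data provider; the connection string placeholder and the students.ToList().ToXML() helper are the same ones used in your SQL Server code):

// Requires: using MySql.Data.MySqlClient;
using (MySqlConnection mySqlConnection = new MySqlConnection(<<MyConnectionString>>))
{
    using (MySqlCommand mySqlCommand = new MySqlCommand("usp_InsertStudent", mySqlConnection))
    {
        mySqlCommand.CommandType = CommandType.StoredProcedure;
        // The XML is passed as plain text; the procedure parses it with extractValue().
        mySqlCommand.Parameters.Add("ptblStudent", MySqlDbType.Text).Value = students.ToList().ToXML();
        mySqlConnection.Open();
        mySqlCommand.ExecuteNonQuery();
    }
}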
On a more general level, your question also raised the issue of coding to multiple data providers. Some years ago, a colleague of mine persuaded me to go the DbConnection route as advocated in some of the comments. With the benefit of hindsight, this was a mistake: you lose the chance to take advantage of particular features of one or other db provider as exposed through their .Net libraries.
So what I do now is very much what you were proposing: I define my Data Access Layer by means of an interface; this interface is then implemented by one class per db provider, each using the native .Net libraries. Thus the SQL Server implementation uses SqlConnection, the MySQL implementation MySqlConnection, the Oracle implementation OracleConnection, and so on. In this way my application does not care about the implementation details, but I am free to take advantage of features unique to one db or another.
To give a simple example: MS SQL Server allows stored procedures to return multiple recordsets (with differing fields), thus allowing you to populate a complex DataSet in one call to the db; all other dbs that I use require one procedure per recordset, making it necessary to build the DataSet within the DAL. To the application there is no difference, as the interface expects a DataSet to be returned.
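As a rough illustration of that structure (the type names such as IStudentStore are hypothetical; the procedures are the ones discussed above):

// One interface, one implementation per provider.
public interface IStudentStore
{
    void SaveStudents(IEnumerable<Student> students);
}

// SQL Server implementation: free to use SqlDbType.Xml and other SqlClient features.
public class SqlServerStudentStore : IStudentStore
{
    private readonly string _connectionString;
    public SqlServerStudentStore(string connectionString) { _connectionString = connectionString; }

    public void SaveStudents(IEnumerable<Student> students)
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand("usp_UpdateStudent", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.Add("@tblStudent", SqlDbType.Xml).Value = students.ToList().ToXML();
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}

// MySQL implementation: passes the same XML as text to the procedure shown earlier.
public class MySqlStudentStore : IStudentStore
{
    private readonly string _connectionString;
    public MySqlStudentStore(string connectionString) { _connectionString = connectionString; }

    public void SaveStudents(IEnumerable<Student> students)
    {
        using (var connection = new MySqlConnection(_connectionString))
        using (var command = new MySqlCommand("usp_InsertStudent", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.Add("ptblStudent", MySqlDbType.Text).Value = students.ToList().ToXML();
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}

The application then picks an implementation based on the user's database selection, without knowing anything about SqlClient or MySql.Data.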
I'm writing a customized, simple Web interface for Oracle DB, using ASP.NET, a Web API project in C#, and Oracle.DataAccess (ODP.NET). This is an educational project which I am designing as an extra project for a college course. There are several reasons for designing this project, but the upshot is that the Oracle-provided tools (SQL Developer, Enterprise Manager Express, etc.) are not suitable for the task at hand.
I have an API call that can accept a query string, execute it against the DBMS and return the DBMS's output as JSON data, along with some additional return data. This has been sufficient for simple SELECT queries and other basic DDL/DML queries. However, now we're branching into PL/SQL.
For example, the most basic PL/SQL HELLO WORLD program that we'd execute looks like:
BEGIN
    DBMS_OUTPUT.PUT_LINE('Hello World');
END;
When I feed this query into my C# API, it does execute successfully. However, I want to be able to retrieve the output of the DBMS_OUTPUT.PUT_LINE call(s).
This question has been addressed before and I have looked into a few of the solutions, and came down on one involving a piece of code which calls the following PL/SQL on the database:
BEGIN
Dbms_output.get_line(:line, :status);
END;
The C# code obviously creates and adds the correct parameter objects to the request before sending it. I plan to call this function repeatedly until a NULL value comes back, indicating the end of output. This data would then be added to the JSON object returned by the API so that the Web interface can display the output. However, this function never returns any lines of output.
My hunch (I'm still learning Oracle myself, so I'm not sure) is that either the server isn't actually outputting the data, or the buffer is flushed after the anonymous PL/SQL block (the Hello World program) finishes.
It was also suggested to add set serveroutput on; to the PL/SQL query but this did not work: it produced the error ORA-00922: missing or invalid option.
Here is the actual C# code being used to retrieve a line of output from the DBMS_OUTPUT buffer:
private string GetDbmsOutputLine(OracleConnection conn)
{
    OracleCommand command = new OracleCommand
    {
        CommandText = "begin dbms_output.get_line(:line, :status); end;",
        CommandType = CommandType.Text,
        Connection = conn,
    };

    OracleParameter lineParameter = new OracleParameter("line", OracleDbType.Varchar2);
    lineParameter.Size = 32000;
    lineParameter.Direction = ParameterDirection.Output;
    command.Parameters.Add(lineParameter);

    OracleParameter statusParameter = new OracleParameter("status", OracleDbType.Int32);
    statusParameter.Direction = ParameterDirection.Output;
    command.Parameters.Add(statusParameter);

    command.ExecuteNonQuery();

    if (command.Parameters["line"].Value is DBNull)
        return null;

    string line = command.Parameters["line"].Value as string;
    return line;
}
Edit: I tried manually calling the following procedure prior to executing the user's code: BEGIN DBMS_OUTPUT.ENABLE(32768); END;. This executes without error but after doing so the later calls to DBMS_OUTPUT.GET_LINE still return null.
It looks like what may be happening is that each time I execute a new query against the database, even though it's on the same connection, the DBMS_OUTPUT buffer is being cleared. I am not sure if this is the case, but it seems to be; nothing else would readily explain the lack of data in the buffer.
Still searching for a way to handle this...
Points to keep in mind:
This is an academic project for student training and development; hence, it is not expected that this mini-application be "production-ready" in any way. Allowing users to execute raw queries posted via the Web obviously leads to all sorts of security risks - which is why this would never be put into an actual production scenario.
I currently open a connection and maintain it throughout a single API call by passing it into each OracleCommand object I create. This, in theory, should mean that the buffer is maintained, but it doesn't appear to be the case. Either the data I write is not making it to the buffer in the first place, or the buffer is flushed each time an OracleCommand object is actually executed against the database connection.
With the caveat that in reality you'd never write code that expects that anyone will ever see data that you attempt to write to the dbms_output...
Within a session, you'd need to call dbms_output.enable, which allocates the buffer that is written to by dbms_output. Depending on the Oracle version, you may be able to pass in a NULL to indicate that you want an unlimited buffer size; in older versions, you'd need to allocate a fixed buffer size (and you'd get an error if you try to write too much data to the buffer).
Then you'd call the procedure that calls dbms_output.put[_line]. Finally, you'd be able to call dbms_output.get[_line]. Note that all three things have to happen in the context of a single session; each session has a separate dbms_output buffer (or no dbms_output buffer).
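Applied to the code in the question, a minimal sketch of that sequence on a single connection might look like this (it reuses the GetDbmsOutputLine helper shown above; passing NULL to dbms_output.enable for an unlimited buffer requires a reasonably recent Oracle version):

using (OracleConnection conn = new OracleConnection(connectionString))
{
    conn.Open();

    // 1. Allocate the DBMS_OUTPUT buffer for this session.
    using (OracleCommand enable = new OracleCommand("begin dbms_output.enable(null); end;", conn))
    {
        enable.ExecuteNonQuery();
    }

    // 2. Run the user's anonymous block, which writes to the buffer.
    using (OracleCommand userBlock = new OracleCommand("begin dbms_output.put_line('Hello World'); end;", conn))
    {
        userBlock.ExecuteNonQuery();
    }

    // 3. Drain the buffer line by line until get_line reports no more data.
    string line;
    while ((line = GetDbmsOutputLine(conn)) != null)
    {
        Console.WriteLine(line);
    }
}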
If I run this command in SSMS:
set showplan_xml on
GO
exec some_procedure 'arg1', 'arg2','arg3'
GO
set showplan_xml off
GO
I get XML output of the full call stack involved in the query execution, as well as any suggested indexes etc.
How might one read this from C#?
(One use case might be to periodically enable this and log these results in a production environment to keep an eye on index suggestions.)
This is, for the most part, two separate (though related) questions.
Is it possible to capture or somehow get the Missing Index information?
If you want only the suggested indexes (and don't care about the rest of the execution plan), then you would probably be better off using the DMVs associated with missing indexes. You just need to write some queries instead of app code (a sketch of reading them from C# follows the list below). Of course, DMV info is reset whenever the service restarts, but you can capture query results into a table if you want/need to keep a history. Please see the following MSDN pages for full details:
sys.dm_db_missing_index_groups
sys.dm_db_missing_index_group_stats
sys.dm_db_missing_index_details
sys.dm_db_missing_index_columns
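A hedged sketch of reading these DMVs from C# follows; the join shape is the usual one for these views, but the column selection and the ordering by estimated impact are illustrative choices:

const string missingIndexQuery = @"
    SELECT d.statement AS table_name,
           d.equality_columns,
           d.inequality_columns,
           d.included_columns,
           s.user_seeks,
           s.avg_user_impact
    FROM sys.dm_db_missing_index_details d
    JOIN sys.dm_db_missing_index_groups g ON g.index_handle = d.index_handle
    JOIN sys.dm_db_missing_index_group_stats s ON s.group_handle = g.index_group_handle
    ORDER BY s.user_seeks * s.avg_user_impact DESC;";

using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand(missingIndexQuery, connection))
{
    connection.Open();
    using (SqlDataReader reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine("{0}: seeks={1}, impact={2}%",
                reader["table_name"], reader["user_seeks"], reader["avg_user_impact"]);
        }
    }
}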
The only benefit that I can see to capturing the Execution Plan to get this info is that it would include the query text that resulted in the suggestion, which obviously is great for doing that research for determining which indexes to implement, but will also potentially explode the number of rows of data if many variations of a query or queries result in the same suggested index. Just something to keep in mind.
Do not implement suggested indexes programmatically. They are for review and consideration. They are assessed per each query at that moment, and do not take into account:
how many indexes are already on the table
what other queries might benefit from a similar index (meaning, there could be a combination of fields that is not apparent to any individual query, but helps 3 or more queries, and hence only adds one index instead of 3 or more to the table).
Is it possible to programmatically capture execution plans?
Yes, this is definitely doable and I have done it myself. You can do it in .NET whether it is a Console App, Windows Form, Web App, SQLCLR, etc.
Here are the details of what you need to know if you want to capture XML plans:
XML Execution plans are:
sent as separate result sets
sent as datatype of NVARCHAR / string
of two types: Estimated and Actual
ESTIMATED plans:
are just that: estimated
are returned if you execute: SET SHOWPLAN_XML ON;
return only 1 plan that will contain multiple queries if there was more than 1 query in the batch
will return plans for simple queries such as SELECT 1 and DECLARE @Bob INT; SET @Bob = 52;
do not execute any of the queries. Hence, this method will return a single result set being the execution plan
ACTUAL plans:
are the real deal, yo!
are returned if you execute: SET STATISTICS XML ON;
return 1 plan per query as a separate result set
will not return plans for simple queries such as SELECT 1 and DECLARE @Bob INT; SET @Bob = 52;
execute all queries in the batch. Hence:
Per query, this method will return one or two result sets: if the query returns data, then the query results will be the first result set, and the execution plan will be either the only result set (if the query doesn't return data) or the second result set.
For multiple queries, the execution plans will be interspersed with any query results. But, since some queries do not return any results, you cannot simply capture every other result set. I test for a single field in the result set, of type NVARCHAR, with a field name of Microsoft SQL Server 2005 XML Showplan (which has been consistent, at least up through SQL Server 2014; I haven't yet tested SQL Server 2016).
For testing purposes you might want to wrap these queries in a BEGIN TRAN; / ROLLBACK TRAN; pair so that no actual data modifications persist.
SET commands need to be in their own batch, so get plans via something like:
SqlConnection _Connection = new SqlConnection(_ConnectionStringFromSomewhere);
SqlCommand _Command = _Connection.CreateCommand();
SqlDataReader _Reader = null;

try
{
    _Connection.Open();

    // SET command needs to be in its own batch
    _Command.CommandText = "SET something ON";
    _Command.ExecuteNonQuery();

    // Now we can run the desired query
    _Command.CommandText = _QueryToTest;
    _Reader = _Command.ExecuteReader();

    // ..get you some execution plans!
}
finally
{
    if (_Reader != null)
    {
        _Reader.Dispose();
    }

    _Command.Dispose();
    _Connection.Dispose();
}
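The "..get you some execution plans!" part, for ACTUAL plans (SET STATISTICS XML ON), might look roughly like this: walk every result set and keep the ones whose single NVARCHAR column carries the showplan field name mentioned above.

List<string> _Plans = new List<string>();
do
{
    if (_Reader.FieldCount == 1
        && _Reader.GetName(0) == "Microsoft SQL Server 2005 XML Showplan")
    {
        while (_Reader.Read())
        {
            _Plans.Add(_Reader.GetString(0)); // the XML plan as a string
        }
    }
    // else: this result set is ordinary query output; skip it or consume it as needed
} while (_Reader.NextResult());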
As a final note I will mention that for anyone interested in capturing execution plans but not interested in writing any code to get them, I have already implemented this as a SQLCLR stored procedure. The procedure gets not only the XML Execution Plan(s), but also the output from STATISTICS TIME and STATISTICS IO, both of which are harder to capture as they are returned as messages (just like PRINT statements). And, the results of all 3 types of output can be captured into tables for further analysis across multiple executions (handy for doing A / B comparisons of current and revised code). This is available in the SQL# SQLCLR library (which again, I am the author of). Please note that while there is a Free version of SQL#, this particular stored procedure, DB_GetQueryInfo, is only available in the Full version, not the Free version.
UPDATE:
Interestingly enough, I just ran across the following MSDN article that describes how to use SQLCLR to grab the estimated plan, extract the estimated cost, pass it back as an OUTPUT parameter of the SQLCLR Stored Procedure, and then make a decision based on that. I don't think I would use it for such a purpose, but very interesting given that the article was written in 2005:
Processing XML Showplans Using SQLCLR in SQL Server 2005
I'm working on a .NET component that gets a set of data from the database, performs some business logic on that set of data, and then updates single records in the database via a stored procedure that looks something like spUpdateOrderDetailDiscountedItem.
For small sets of data, this isn't a problem, but when I had a very large set of data that required an iteration of 368 stored proc calls to update the records in the database, I realized I had a problem. A senior dev looked at my stored proc code and said it looked fine, but now I'd like to explore a better method for sending "batch" data to the database.
What options do I have for updating the database in batch? Is this possible with stored procs? What other options do I have?
I won't have the option of installing a full-fledged ORM, but any advice is appreciated.
Additional Background Info:
Our current data access model was built 5 years ago and all calls to the db currently get executed via modular/static functions with names like ExecQuery and GetDataTable. I'm not certain that I'm required to stay within that model, but I'd have to provide a very good justification for going outside of our current DAL to get to the DB.
Also worth noting, I'm fairly new when it comes to CRUD operations and the database. I much prefer to play/work in the .NET side of code, but the data has to be stored somewhere, right?
Stored Proc contents:
ALTER PROCEDURE [dbo].[spUpdateOrderDetailDiscountedItem]
    -- Add the parameters for the stored procedure here
    @OrderDetailID decimal = 0,
    @Discount money = 0,
    @ExtPrice money = 0,
    @LineDiscountTypeID int = 0,
    @OrdersID decimal = 0,
    @QuantityDiscounted money = 0,
    @UpdateOrderHeader int = 0,
    @PromoCode varchar(6) = '',
    @TotalDiscount money = 0
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for procedure here
    Update OrderDetail
    Set Discount = @Discount, ExtPrice = @ExtPrice, LineDiscountTypeID = @LineDiscountTypeID, LineDiscountPercent = @QuantityDiscounted
    From OrderDetail with (nolock)
    Where OrderDetailID = @OrderDetailID

    if @UpdateOrderHeader = -1
    Begin
        -- This code should run the last time this query is executed, but only then.
        exec spUpdateOrdersHeaderForSkuGroupSourceCode @OrdersID, 7, 0, @PromoCode, @TotalDiscount
    End
END
If you are using SQL 2008, then you can use a table-valued parameter to push all of the updates in one sproc call.
Update:
Incidentally, we are using this in combination with the merge statement. That way sql server takes care of figuring out if we are inserting new records or updating existing ones. This mechanism is used at several major locations in our web app and handles hundreds of changes at a time. During regular load we will see this proc get called around 50 times a second and it is MUCH faster than any other way we've found... and certainly a LOT cheaper than buying bigger DB servers.
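A minimal sketch of the table-valued-parameter approach (the table type dbo.OrderDetailDiscountType and the procedure name are illustrative; both have to be created on the server first):

// C# side: pass a DataTable as a single table-valued parameter.
DataTable items = new DataTable();
items.Columns.Add("OrderDetailID", typeof(decimal));
items.Columns.Add("Discount", typeof(decimal));
items.Columns.Add("ExtPrice", typeof(decimal));
// ... add one row per record to update

using (SqlConnection connection = new SqlConnection(connectionString))
using (SqlCommand command = new SqlCommand("dbo.spUpdateOrderDetailDiscountedItems", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    SqlParameter tvp = command.Parameters.AddWithValue("@Items", items);
    tvp.SqlDbType = SqlDbType.Structured;
    tvp.TypeName = "dbo.OrderDetailDiscountType";
    connection.Open();
    command.ExecuteNonQuery(); // one round trip for the whole batch
}

Inside the procedure the parameter would be declared as @Items dbo.OrderDetailDiscountType READONLY and can then be joined or merged against OrderDetail as a set.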
An easy alternative I've seen in use is to build a single SQL statement consisting of a series of EXEC calls to the sproc with the parameters inline in the string. Not sure if this is advised or not, but from the .NET perspective you are only populating one SqlCommand and calling ExecuteNonQuery once...
Note if you choose this then please, please use the StringBuilder! :-)
Update: I much prefer Chris Lively's answer, didn't know about table-valued parameters until now... unfortunately the OP is using 2005.
You can send the full set of data as XML input to the stored procedure. Then you can perform set operations to modify the database. Set-based will beat RBAR on performance almost every single time.
If you are using a version of SQL Server prior to 2008, you can move your code entirely into the stored procedure itself.
There are good and "bad" things about this.
Good
No need to pull the data across a network wire.
Faster if your logic is set based
Scales up
Bad
If you have rules against any logic in the database, this would break your design.
If the logic cannot be set based then you might end up with a different set of performance problems
If you have outside dependencies, this might increase difficulty.
Without details on exactly what operations you are performing on the data it's hard to give a solid recommendation.
UPDATE
Ben asked what I meant in one of my comments about the CLR and SQL Server. Read Using CLR Integration in SQL Server 2005. The basic idea is that you can write .Net code to do your data manipulation and have that code live inside the SQL server itself. This saves you from having to read all of the data across the network and send updates back that way.
The code is callable by your existing proc's and gives you the entire power of .net so that you don't have to do things like cursors. The sql will stay set based while the .net code can perform operations on individual records.
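For illustration only, a SQLCLR procedure skeleton might look like this (the class and method names are hypothetical; the "context connection" keeps everything inside the hosting SQL Server session):

using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public class OrderDiscountProcedures
{
    [SqlProcedure]
    public static void ApplyDiscounts()
    {
        // Runs inside SQL Server itself; no data crosses the network.
        using (SqlConnection connection = new SqlConnection("context connection=true"))
        {
            connection.Open();
            using (SqlCommand command = new SqlCommand(
                "UPDATE OrderDetail SET Discount = @discount WHERE OrderDetailID = @id", connection))
            {
                command.Parameters.AddWithValue("@discount", 0m);
                command.Parameters.AddWithValue("@id", 1m);
                command.ExecuteNonQuery();
            }
        }
    }
}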
Incidentally, this is how things like hierarchyid were implemented in SQL 2008.
The only real downside is that some DBA's don't like to introduce developer code like this into the database server. So depending on your environment, this may not be an option. However, if it is, then it is a very powerful way to take care of your problem while leaving the data and processing within your database server.
Could you create a batched statement with 368 calls to your proc? Then at least you will not have 368 round trips. i.e. pseudo-code:
var lotsOfCommands = "exec spUpdateOrderDetailDiscountedItem 1; exec spUpdateOrderDetailDiscountedItem 2; ... exec spUpdateOrderDetailDiscountedItem 368";
var command = new SqlCommand(lotsOfCommands, connection);
command.CommandType = CommandType.Text;
// execute command
I had issues when trying to do the same thing (via inserts, updates, whatever). While using an OleDbCommand with parameters, it took a lot of time to constantly re-create the object and parameters each time I called it. So, I made the command a property on my object and added the appropriate parameters to it once. Then, when I needed to actually call/execute it, I would loop through each parameter in the object, set it to whatever I needed it to be, and then execute it. This created a SIGNIFICANT performance improvement. Pseudo-code of my operation:
protected OleDbCommand oSQLInsert = new OleDbCommand();
// the "?" are place-holders for parameters... can be named parameters,
// just for visual purposes
oSQLInsert.CommandText = "insert into MyTable ( fld1, fld2, fld3 ) values ( ?, ?, ? )";
// Now, add the parameters
OleDbParameter NewParm = new OleDbParameter("parmFld1", 0);
oSQLInsert.Parameters.Add( NewParm );
NewParm = new OleDbParameter("parmFld2", "something" );
oSQLInsert.Parameters.Add( NewParm );
NewParm = new OleDbParameter("parmFld3", 0);
oSQLInsert.Parameters.Add( NewParm );
Now, the SQL command and the place-holders for the call are all ready to go... Then, when I'm ready to actually call it, I would do something like:
oSQLInsert.Parameters[0].Value = 123;
oSQLInsert.Parameters[1].Value = "New Value";
oSQLInsert.Parameters[2].Value = 3;
Then, just execute it. Re-creating your commands over and over across hundreds of calls is what kills the time; reusing them avoids that.
good luck.
Is this a one-time action (like "just import those 368 new customers once") or do you regularly have to do 368 sproc calls?
If it's a one-time action, just go with the 368 calls.
(if the sproc does much more than just updates and is likely to drag down the performance, run it in the evening or at night or whenever no one's working).
IMO, premature optimization of database calls for one-time actions is not worth the time you spend with it.
Bulk CSV Import
(1) Build the data output as CSV via a StringBuilder, then do a bulk CSV import:
http://msdn.microsoft.com/en-us/library/ms188365.aspx
Table-valued parameters would be best, but since you're on SQL 05, you can use the SqlBulkCopy class to insert batches of records. In my experience, this is very fast.
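A minimal SqlBulkCopy sketch (the staging table name and columns are illustrative): bulk-insert the batch into a staging table, then run one set-based update from it.

DataTable staging = new DataTable();
staging.Columns.Add("OrderDetailID", typeof(decimal));
staging.Columns.Add("Discount", typeof(decimal));
// ... fill one row per record

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.OrderDetailDiscountStaging";
        bulkCopy.WriteToServer(staging);
    }
    // Then one stored procedure call can join the staging table to OrderDetail
    // and apply all of the updates as a single set-based statement.
}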
I know there have been numerous questions here about inline sql vs stored procedures...
I don't want to start another one like that! This one is about inline (or dynamic) sql.
I also know this point has become more or less moot with Linq to SQL and its successor Entity Framework.
But... suppose you have chosen (or are required by your superiors) to work with plain old ADO.NET and inline (or dynamic) sql. What are then the best practices for this and for formatting the sql?
What I do now is the following:
I like to create my SQL statements in a stored procedure first. This gives me syntax coloring in SQL Server Management Studio and the ability to test the query easily without having to execute it in code through the application I'm developing.
So as long as I'm implementing/debugging, my code looks like this:
using (SqlConnection conn = new SqlConnection("myDbConnectionString"))
{
conn.Open();
using (SqlCommand cmd = conn.CreateCommand())
{
cmd.CommandType = CommandType.StoredProcedure;
cmd.CommandText = "myStoredProcName";
// add parameters here
using (SqlDataReader rd = cmd.ExecuteReader())
{
// read data and fill object graph
}
}
}
Once the debugging and testing phase is done, I change the code above like this:
using (SqlConnection conn = new SqlConnection("myDbConnectionString"))
{
conn.Open();
using (SqlCommand cmd = conn.CreateCommand())
{
cmd.CommandType = CommandType.Text;
cmd.CommandText = GetQuery();
// add parameters here
using (SqlDataReader rd = cmd.ExecuteReader())
{
// read data and fill object graph
}
}
}
And I add an extra private method, e.g. GetQuery(), into which I copy/paste the whole body of the stored procedure, like this:
private string GetQuery()
{
    return @"
        SET NOCOUNT ON;
        SELECT col1, col2 from tableX where id = @id
        -- more sql here
        ";
}
Working like this has the benefit that I can easily revert the code to call the stored procedure again if I have to debug/update the SQL later, and once that's done I can easily put the SQL back with copy/paste, without having to put quotes around every line and so on.
Is it good practice to include newlines in the query?
Are there other things or tricks that I haven't thought of which can make this approach better?
How do you guys do things like this?
Or am I the only one who still uses (has to use) inline sql?
Inline (with or without the literal #"..." syntax) is fine for short queries... but for anything longer, consider having the tsql as a file in the project; either as embedded resources / resx, or as flat files. Of course, by that stage, you should probably make it a stored procedure anyway ;-p
But having it as a separate file forces the same separation that will make it a breeze to turn into a stored procedure later (probably just adding CREATE PROC etc).
One issue with inline - it makes it so tempting for somebody to concatenate user input... which is obviously bad (you've correctly used parameters in the example).
I've used .NET resource files in the past. These were handy for keeping a library of all queries used in a particular code library, particularly when the same query might be used in multiple places (yes, I realize this also indicates some poor design, but sometimes you need to work within the box given to you).
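Reading a query that has been stored as an embedded resource might look roughly like this (the resource name is illustrative):

private string GetQueryFromResource(string resourceName)
{
    // e.g. resourceName = "MyApp.Queries.SelectStudents.sql"
    var assembly = System.Reflection.Assembly.GetExecutingAssembly();
    using (var stream = assembly.GetManifestResourceStream(resourceName))
    using (var reader = new System.IO.StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}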
Beyond non-trivial single-line SQL statements, I always take advantage of multi-line strings and make the query a const:
const string SelectMyTable = #"
SELECT column_one
, column_two
, column_three
FROM my_table
";
This allows me to cut and paste the query into SQL Server Management Studio for testing.