I have about 20 (relatively) big queries hard-coded in my C# code that I would like to move somewhere else, as they are making my code unmanageable. These queries all take the same fixed set of parameters (two, to be precise).
I am looking at where to place them in my project and how to manage them. I was thinking of creating a separate .sql file for each query, in a folder the code would read from, somehow passing in the two parameters before actually executing the query.
The question is the following: are there any standard/efficient ways of doing this in C#? I really do not like having SQL queries hard-coded in my projects, but I am also mindful that these are parameterised queries, so the above might not be achievable.
Any guidance or help would be most welcome. In case it helps, I do not have access rights to write stored procedures to solve this situation.
Too bad that it has to be a query and not a stored procedure.
If stored procedures are not an option, you can put the queries in text files and load all of those text files into a resource file. You can then access each one as "resourceFileName.TextFileName". Keep the parameters in the text as named placeholders (for SQL Server, @ plus the parameter name) so they can be bound before execution.
You can load a query with:

using System.Reflection;
using System.Resources;

// load the assembly that holds the compiled resources
// (use Assembly.GetExecutingAssembly() if they live in the current assembly)
var resourceManager = new ResourceManager("resourceFileName", Assembly.LoadFrom("resourceDll.dll"));
string myQuery = resourceManager.GetString("TextFileName");
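Binding the two fixed parameters at execution time is then straightforward. A minimal sketch, assuming SQL Server, @-style placeholders and made-up parameter names:

using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(myQuery, conn))
{
    // the two fixed parameters every query receives (names assumed)
    cmd.Parameters.AddWithValue("@param1", firstValue);
    cmd.Parameters.AddWithValue("@param2", secondValue);

    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // consume the results
        }
    }
}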
Related
I have around 500 stored procedures that are used for our ETL process, and I have been asked to identify all the source and target tables used by each one. A stored procedure could have a connection to an Oracle linked server, or to another SQL Server; it could also be using OPENQUERY to extract data from our transactional systems.
Since I have some basic .NET/C# programming chops, I was hoping to leverage the .NET Regex class to get started. However, I am looking for suggestions on how I should approach this. I don't have to reinvent the wheel if someone already has a solution for this.
For context, we are implementing PowerDesigner as a metadata repository, so we are looking to extract metadata from our BI reports (mapping reports to their source tables/views) and from our Informatica and T-SQL ETL scripts.
Thanks
I'd suggest a two-pronged approach. First, I'd avoid using regex for something as complex as parsing SQL queries, especially since there are tools in place for this kind of thing.
https://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.smo.dependencywalker.aspx
The SMO library exposes a class that will let you connect to a server and retrieve a dependency tree for a given stored procedure. How to do this exactly is left as an exercise for the reader :)
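That exercise might look roughly like this (server, database and procedure names are placeholders; you'll need references to the SMO assemblies):

using System;
using Microsoft.SqlServer.Management.Smo;

// connect and locate one of the procedures
Server server = new Server("myServer");
Database db = server.Databases["myEtlDb"];
StoredProcedure proc = db.StoredProcedures["procName_sp", "dbo"];

// discover everything the procedure depends on
var walker = new DependencyWalker(server);
DependencyTree tree = walker.DiscoverDependencies(
    new SqlSmoObject[] { proc }, DependencyType.Parents);
DependencyCollection dependencies = walker.WalkDependencies(tree);

foreach (DependencyCollectionNode node in dependencies)
    Console.WriteLine(node.Urn);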
However, this class won't pick up dependencies that are introduced via dynamic SQL or through OPENQUERY. If the number of procedures that do this is small, I'd recommend handling those manually and then merging the results. You could use the SMO scripting capabilities to pick up all instances of OPENQUERY or exec/sp_executesql; at least then you would have an idea of the 'suspect' pieces of code.
Merging the results will be tricky. Not only do you have to manually update dependencies for procedures containing dynamic dependencies, but you have to update procedures that depend on procedures containing dynamic dependencies.
You can use the dynamic management function sys.dm_sql_referenced_entities to get some dependency information from SQL Server itself, but there are some limitations. I'm not sure whether the Dependency Walker leverages this function, but the pros and cons are very similar.
The main limitation that I know of and have experienced is that you won't get any dependency information for an object that is referenced through dynamic SQL. We have very contained usages of dynamic SQL, so I feel pretty confident leveraging this function and manually accounting for the objects hit by those specific procs.
We don't use linked servers, but my understanding is that those would show up here. As for OPENQUERY, I did a little research but did not test it out; I am guessing those references would not be surfaced. Like the previous poster said, you may need a two-pronged approach to get everything you're looking for.
And just for reference, a simple example of using the function:
SELECT DISTINCT
[database] = COALESCE(r.referenced_database_name, DB_NAME())
, [schema] = r.referenced_schema_name
, name = r.referenced_entity_name
, r.referenced_id
FROM sys.dm_sql_referenced_entities('dbo.procName_sp', 'OBJECT') AS r
WHERE r.referenced_id IS NOT NULL;
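If you want to drive that from C# across all ~500 procedures, a rough sketch (the connection string is a placeholder; note the function raises an error when a dependency can't be resolved, hence the try/catch):

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

using (var conn = new SqlConnection("...connection string..."))
{
    conn.Open();

    // collect every procedure name in the database
    var procs = new List<string>();
    using (var cmd = new SqlCommand(
        "SELECT SCHEMA_NAME(schema_id) + '.' + name FROM sys.procedures", conn))
    using (var rdr = cmd.ExecuteReader())
    {
        while (rdr.Read()) procs.Add(rdr.GetString(0));
    }

    // ask the function for each procedure's referenced objects
    foreach (string proc in procs)
    {
        using (var cmd = new SqlCommand(
            @"SELECT DISTINCT
                  COALESCE(referenced_database_name, DB_NAME()),
                  ISNULL(referenced_schema_name, ''),
                  referenced_entity_name
              FROM sys.dm_sql_referenced_entities(@proc, 'OBJECT')
              WHERE referenced_id IS NOT NULL", conn))
        {
            cmd.Parameters.AddWithValue("@proc", proc);
            try
            {
                using (var rdr = cmd.ExecuteReader())
                    while (rdr.Read())
                        Console.WriteLine("{0} -> {1}.{2}.{3}", proc,
                            rdr.GetString(0), rdr.GetString(1), rdr.GetString(2));
            }
            catch (SqlException ex)
            {
                Console.WriteLine("{0} -> could not resolve: {1}", proc, ex.Message);
            }
        }
    }
}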
I wouldn't use C# for this. However, maybe something like this will do the job.
select *
from DatabaseName.information_schema.routines
where routine_type = 'PROCEDURE'
SELECT name, type
FROM dbo.sysobjects
WHERE type IN (
'P', -- stored procedures
'FN', -- scalar functions
'IF', -- inline table-valued functions
'TF' -- table-valued functions
)
ORDER BY type, name
Or, if you want SProcs and parameters:
select * from information_schema.parameters
Finally, this link looks pretty helpful for your situation.
http://blog.sqlauthority.com/2010/02/04/sql-server-get-the-list-of-object-dependencies-sp_depends-and-information_schema-routines-and-sys-dm_sql_referencing_entities/
We have a use case where an app that sends out emails finds a specific string ('smart tag') in an email and replaces it with the results of a stored procedure.
So, for example, the email could have "Dear <ST:Name>" in the body; the code would identify this string and run the stored procedure to find the client name, passing in the client ID as a parameter.
The list of these tags and the stored procedures that need to be run are currently hard coded, so every time a new 'smart tag' needs to be added, a code change and deployment is required.
Our BAs are skilled in SQL and want to be able to add new tags manually.
Is it bad practice to store the procedure and parameters in a database table? Would this be a suitable design for such a table? Would it be necessary to store parameter type?
SmartTag
    SmartTagId, SmartTag, StoredProcedure

SmartTagParameters
    SmartTagParameterId, SmartTagId, ParameterName
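For illustration, a minimal sketch of how the app might resolve tags against these two tables at runtime (the method, regex and column names are assumptions, not from the post):

using System.Data;
using System.Data.SqlClient;
using System.Text.RegularExpressions;

static string ExpandSmartTags(SqlConnection conn, string body, int clientId)
{
    // find every <ST:xxx> token and replace it with the procedure's result
    return Regex.Replace(body, @"<ST:(\w+)>", m =>
    {
        // look up which stored procedure handles this tag
        string procName;
        using (var lookup = new SqlCommand(
            "SELECT StoredProcedure FROM SmartTag WHERE SmartTag = @tag", conn))
        {
            lookup.Parameters.AddWithValue("@tag", m.Groups[1].Value);
            procName = (string)lookup.ExecuteScalar();
        }

        // run it, passing the client id (assumes a single @ClientId parameter)
        using (var cmd = new SqlCommand(procName, conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@ClientId", clientId);
            return (string)cmd.ExecuteScalar();
        }
    });
}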
Table driven configuration, data driven programming, is good.
The primary thing to watch out for is SQL injection risk (or in your case it would be called 'tag injection'...): the email could be used as an attack vector to gain elevated privileges, by inserting a crafted procedure that would be run under higher privileges. Note that this is more than the usual caution around SQL injection, since you are already accepting arbitrary code to be executed; this is more of a sandboxing problem.
Typical problems come from the type system: parameters have various types, but the declaration tables store them all as strings. SQL_VARIANT can help.
Another potential problem is the language used to declare and discover tags. Soon you'll be asked to recognise <tag:foo>, but only before <tag:bar>; a fully fledged context-sensitive parser usually follows shortly after the first iteration... It helps to leverage something already familiar (e.g. think how jQuery uses the CSS selector syntax). HtmlAgilityPack could perhaps help you (and by the way, this is a task perfectly suited to SQLCLR; don't try to build an elaborate stateful parser in T-SQL...).
It's not bad practice; what you are doing is totally fine. As long as only your admins/BAs can add and change parameters and configuration, you do not have to worry about injection. If end users can add and change parameters, you really do need to check their input and whitelist certain characters.
It's not only SQL injection you have to check for, but cross-site scripting, DOM injection and cross-site request forgery as well. The merged text is displayed on a user's computer, so you have to protect them when they view your merge result.
Interesting question; I will follow it, as it has something to do with mine. It is not bad practice at all. In fact, we use the same approach: I'm trying to achieve similar goals in my XSL editor, using a combination of XML tags, stored procedures and VB.Net logic to do the replacements.
I use a combination of a table holding all the XML tags in use (they are used in other places in the application) and stored procedures that do all the dirty work. One set of stored procedures transforms tagged text into user-readable text; another set builds an XML tree from the XML tags table so users can choose tags when editing their text.
SQL injection is not an issue for us, as we use these procedures to create emails, not to parse them from external sources.
Regarding a comment on the question: we also manage the tags directly from SSMS, with no admin window, at least for now. But we plan to add a simple admin window so it will be easier to add/delete/modify tags once the application is deployed.
I have a table with a large number of rows (~200 million) whose values I want to process in C# after reading them into memory. The processing requires grouping entries by column values in a way that can't be done inside SQL Server itself. The problem is that reading all the data at once gives me an OutOfMemoryException, and takes a long time to execute even partially.
So I want to break my query into smaller pieces. One method is obviously to do an independent select and then use a WHERE ... IN clause. Another method that has been suggested to me is to use SQL cursors. I want to choose one of these methods (or another one, if possible), especially with regard to the following points:
What would be the performance impact of these schemes on the server? Which would perform faster?
Can I safely parallelise the SQL cursor queries? Would I get a performance benefit if I parallelised the first scheme (the one with the WHERE IN clause)?
How many values can I specify in a WHERE IN clause? Is it only limited by the size of the query string?
Any other suggestions are also welcome.
Edit 1: I have been given different solutions, but I would still like to know the answers to my original questions (out of curiosity).
If you have to do the grouping logic in code, you can try writing it as a managed (SQLCLR) stored procedure in SQL Server, which can then be used in the grouping query.
Check out:
How to: Create and Run a SQL Server Stored Procedure by using Common Language Run-time Integration
How to: Create and Run a SQL Server User-Defined Function by using Common Language Run-time Integration
This will allow you to group on the server before returning the dataset to your client.
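A minimal sketch of what such a SQLCLR procedure could look like (names are illustrative; it is deployed with CREATE ASSEMBLY / CREATE PROCEDURE and runs inside SQL Server, so only the grouped results cross the wire):

using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public static class GroupingProcs
{
    [SqlProcedure]
    public static void GroupRows()
    {
        // the context connection runs in-process, inside SQL Server
        using (var conn = new SqlConnection("context connection=true"))
        {
            conn.Open();
            using (var cmd = new SqlCommand(
                "SELECT GroupKey, Value FROM dbo.BigTable", conn))
            {
                // real code would apply the custom grouping here and send
                // grouped rows back through SqlContext.Pipe; this line just
                // streams the raw query results to the caller
                SqlContext.Pipe.ExecuteAndSend(cmd);
            }
        }
    }
}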
[Edit - regarding your comments on using Dictionaries]
You can check out my project on CodePlex, which has a disk-persisting Dictionary<T,V>. This would prevent the out-of-memory exception. It would be interesting to see how it performs in your scenario. (If you are on a 32-bit system, read the note on the intro page.)
If you are using SQL Server 2005 or higher, you should check out SQL-based paging.
http://blogs.x2line.com/al/archive/2005/11/18/1323.aspx
It should work for what you are trying to do and is a better option than the two you listed.
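For instance, a sketch of reading the table in fixed-size batches with ROW_NUMBER() (SQL Server 2005+; table and column names are made up, and keyset paging on an indexed key would be cheaper still):

using System.Data.SqlClient;

const int batchSize = 100000;
long offset = 0;
bool more = true;

using (var conn = new SqlConnection("...connection string..."))
{
    conn.Open();
    while (more)
    {
        int rowsRead = 0;
        using (var cmd = new SqlCommand(
            @"SELECT Id, GroupKey, Value
              FROM (SELECT Id, GroupKey, Value,
                           ROW_NUMBER() OVER (ORDER BY Id) AS rn
                    FROM dbo.BigTable) t
              WHERE t.rn > @from AND t.rn <= @to", conn))
        {
            cmd.Parameters.AddWithValue("@from", offset);
            cmd.Parameters.AddWithValue("@to", offset + batchSize);
            using (var rdr = cmd.ExecuteReader())
            {
                while (rdr.Read())
                {
                    rowsRead++;
                    // feed each row into the C# grouping logic here
                }
            }
        }
        more = rowsRead == batchSize;
        offset += batchSize;
    }
}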
I have an existing SQL Server database whose structure I can't really change, although I can add stored procedures or new tables if I want. I have to write a stand-alone program to access the DB, process the data and produce some reports. I've chosen C# and Visual Studio as we're pretty much an MS shop.
I've made a start at exploring VS 2008 to create said program, and I'm trying to decide where to put some of the SQL logic. My primary aims are to keep development as simple as possible and for the program to perform quickly.
Should I put the SQL logic into a stored procedure and simply call the stored procedure and have SQL Server do the grunt work and hand me the results? Or am I better off keeping the SQL query in my code, creating the corresponding command and executing it against the SQL Server?
I have a feeling the former might perform better, but I've then got to manage the stored procedures separately from the rest of my code base, don't I?
UPDATE: It's been pointed out the performance should be the same if it's the same SQL code in a C# program or a stored procedure. If this is the case, which is the easiest to maintain?
2009-10-02: I had to really think about which answer to select. At the time of writing, there were 8 answers, basically split 5-3 in favour of putting the SQL logic in the application. On the other hand, there were 11 up-votes, split 9-2 in favour of putting the SQL logic in stored procedures (along with a couple of warnings about going this way). So I'm torn. In the end I'm going with the up-votes. However, if I run into trouble I'm going to come back and change my selected answer :)
If it is heavy data manipulation, keep it in the database in stored procedures. If the queries might change, the database is the better place too; otherwise a redeploy might be required for each change.
Keeping the mainstay of the work in stored procedures has the advantage of flexibility - I find it easier to modify a procedure than implement a program change. Unfortunately flexibility is a double-edged sword; it's much easier to make an ill-advised change as well.
I suggest taking a look at LINQ to Entities, which provides an object-relational mapping wrapper around your SQL statements (CRUD), abstracting away the logic needed to talk to the database and allowing you to write OO code instead of using SqlConnection and SqlCommand directly.
OO code (the save method does not exist but you get the gist of it):
// this adds a new car to the Car table in SQL, without using ANY SQL code
Car car = new Car();
car.BrandName = "Audi";
car.Save(); // Save is called something else and lives on the
            // DataContext the car is in, but for brevity's sake...
SQL code as a string in a SqlCommand:
// open a sql connection in your app and
// create a command that inserts the car
SqlConnection conn = new SqlConnection(connstring);
SqlCommand comm = new SqlCommand("INSERT INTO CAR...", conn);
// execute
conn.Open();
comm.ExecuteNonQuery();
Versioning and maintaining stored procedures is a nightmare. Unless you hit serious performance issues that you think will be resolved by stored procedures, I think it is better to implement the logic in your C# code (LINQ, SubSonic or anything like that).
With regard to your point concerning the performance variation between embedding your SQL in .NET source and placing it within SQL Server stored procedures, you should actually see no difference between the two methods!
This is because SQL Server will generate the same execution plan, provided the data-access T-SQL in the two sources is the same.
You can see this in action by running a SQL Server Profiler trace and comparing the execution plans generated by the two different query sources.
In light of this, and back to the main point of your question, your choice of implementation should be determined by ease of development and your future extensibility requirements. As you appear to be the sole individual working on the project, go with what you prefer, which I suspect is to keep the code centralised, i.e. within a Visual Studio Data Access Layer (DAL).
Stored Procedures can come into their own however when you have separate development functions within your organisation/team. For example, you may have database developers on your team who can create your data access code for you and do so independently of the application, freeing you to work on other code modules.
Update deployment: if you need to change the logic, you can update a stored procedure without your users ever knowing, without taking the server offline. Updating the C# means pushing out a new EXE to all your users!
Have a look at Entity Spaces. It's a code generation tool - but it'll do more.
There's a small amount of leg work to do in learning the tool, but once you're up and running you'll never look back. Saves hours of work. (I don't work for them BTW!)
Should I put the SQL logic into a stored procedure
Well that depends on what the “SQL logic” is, doesn't it? If it's purely database-related, a stored procedure might be most appropriate. If it's ‘business logic’, the rules that decide how your application operates, it definitely belongs in your application.
which is the easiest to maintain?
Personally I find application-side code easier as modern languages like C# have much more expressive power than SQL. Doing any significant processing in T-SQL quickly becomes tedious and difficult to read.
I will likely be responsible for porting a VB6 application to C#. The application is a Windows app that interacts with an Access DB; the data access is encapsulated in basic business objects, basically one class per table. The existing VB6 business objects read and write to the DB via DAO. I have written DALs and ORMs a few times before, but they all targeted SQL Server only; this one will need to target both Access and SQL Server. In previous projects I would place the SQL strings in the private parts of the business object, and maybe move the redundant SQL code (connecting, creating commands) into a common base class to reduce duplication.
This time, I'm thinking about writing the SQL strings into a .settings file or some other key/value text file. I would then write a SQL utility to edit this file and let me run and test the parameterised queries. The queries would be referenced by name in the business objects instead of being embedded in the code.
I know a standard approach is to create a DAL for each targeted database and have configuration state which DAL to use. I really don't want to create two DAL classes for each business object; it seems like less code to just reference the correct query by key name and open the proper type of connection.
So, are you guys doing things like this? How would you approach, or how have you approached, this problem?
What works best for you?
Thanks!
Well, there's a lot of options - so it really depends on what your most pressing needs are :-)
One approach might be to create the SQL statements as text files inside your VS solution and mark them as "Embedded Resource" in the "Build Action". That way, the SQL is included in your resulting assembly and can be retrieved from it at runtime using Assembly.GetManifestResourceStream from the .NET framework:
using System.IO;
using System.Reflection;

private string LoadSQLStatement(string statementName)
{
    string sqlStatement = string.Empty;
    string namespacePart = "ConsoleApplication1";
    string resourceName = namespacePart + "." + statementName;

    using (Stream stm = Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName))
    {
        if (stm != null)
        {
            using (var reader = new StreamReader(stm))
            {
                sqlStatement = reader.ReadToEnd();
            }
        }
    }
    return sqlStatement;
}
You need to replace "ConsoleApplication1" with the actual namespace in which the SQL statement files reside; you reference them by their fully qualified name. Then you can load a SQL statement with this line:
string mySQLStatement = LoadSQLStatement("MySQLStatement.sql");
This, however, makes the queries rather "static": you cannot configure and change them at runtime, as they're baked right into the compiled binary. But on the other hand, in VS you get a nice clean separation between your C# program code and the SQL statements.
If you need to be able to possibly tweak and change them at runtime, I'd put them into a single SQL table which contains e.g. a keyword and the actual SQL query as fields. You can then retrieve them as needed, and execute them. Since they're in the database table, you can also change, fix, amend them at will - even at runtime - without having to re-deploy your whole app.
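A minimal retrieval helper for that table-based variant might look like this (table and column names are hypothetical):

using System.Data.SqlClient;

private string LoadSQLStatementFromDb(SqlConnection conn, string keyword)
{
    using (var cmd = new SqlCommand(
        "SELECT QueryText FROM AppQueries WHERE Keyword = @keyword", conn))
    {
        cmd.Parameters.AddWithValue("@keyword", keyword);
        return (string)cmd.ExecuteScalar();
    }
}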
Marc
When I really need it, I put the queries into individual *.sql files, then include them in Resources.resx. There is a 'Files' section in it, which allows you to include embedded resource files.
After that, I can use the generated Resources.MyQuery property, which both guarantees that the resource exists and saves me from writing a custom resource-loading method.
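Usage is then just a strongly typed property access; a small sketch (resource and query names assumed):

using System.Data.SqlClient;

// MyQuery.sql was added under the 'Files' section of Resources.resx
string sql = Properties.Resources.MyQuery;
using (var cmd = new SqlCommand(sql, conn))
{
    cmd.Parameters.AddWithValue("@param1", value1);
    // ...
}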
LINQ to DataSet sounds like the way to go for you.
If you haven't used .NET 3.5 / LINQ before, you're in for a treat. LINQ saves you from writing raw SQL in string literals and provides a more logical way of creating queries.
Anyway, check this link out for using LINQ on Access databases - http://msdn.microsoft.com/en-us/library/bb386977.aspx
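As a taste of what that buys you, a small LINQ to DataSet sketch (table and column names are invented; it requires a reference to System.Data.DataSetExtensions):

using System;
using System.Data;
using System.Linq;

// group a loaded DataTable client-side instead of building SQL strings
DataSet ds = LoadCars(); // hypothetical method that fills the DataSet from Access
var byBrand = ds.Tables["Car"].AsEnumerable()
    .GroupBy(r => r.Field<string>("Brand"))
    .Select(g => new { Brand = g.Key, Count = g.Count() });

foreach (var g in byBrand)
    Console.WriteLine("{0}: {1}", g.Brand, g.Count);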
If I had to create an application for both SQL Server and Access, I'd use some IDAL interface, a DALCommon implementation of the shared functionality, and separate DALSql and DALAccess classes inherited from DALCommon for the specific stuff: exceptions, transaction handling, security, etc.
I used to keep stored procedure names or queries in resource files.
I'll tell you where I won't ever put it: something I saw done in some code I inherited. It was in Java, but it applies to any language.
A base class that declared protected static member variables for the SQL statements, initialised to null, with a get method that returns each individual SQL statement
A sub class for each supported database server, with an init method that assigns to the base class member variables
Several DA classes that use the base class method to retrieve SQL statements
The application start-up class with the responsibility to create the correct sub-class object and call its init method
I will also not go into explaining why I would never, ever do this :-)
One method we used was to have a class that connects to the DB, with methods for calling procedures that take the procedure name as a parameter, so all the SQL code lives in the procedures themselves. Since C# cannot overload on return type alone, we had a variant of the method for each return type:

using System;
using System.Xml;

class ConnectToSql
{
    // connection setup (connection string read from a settings file, presumably)

    public XmlDataDocument RunProcedureAsXml(string procedureName) { throw new NotImplementedException(); }
    public int RunProcedureAsInt(string procedureName) { throw new NotImplementedException(); }
    // etc....
}
Sometimes, as with custom reporting apps, you really need to embrace the impedance mismatch and give special importance to the SQL. In these cases I recommend the following: for each module that contains SQL strings, create a single static "SQL" class to hold them all. Some of the SQL strings will likely require parameters, so be consistent and put each string behind its own static method.
I only do this for the occasional custom reporting app, but it always works out great and feels refreshing and liberating. And it's quite nice to come back months later to make an enhancement, and find all of the SQL waiting for you in a single SQL.cs file. Just by reading that one file, it all comes back, and often this is the only file that needs to be changed.
I don't see a need in these cases for hiding the SQL in resources or elsewhere. When SQL is important, then it's important. Interestingly, more and more developers are now freely mixing SQL with C#, including I believe this site, because essentially, that's what LINQ is.
Finally, as always, make sure you are not susceptible to SQL injection attacks. Especially if user input is involved, make sure you are using some kind of parameterization and that you are not using string concatenation.
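To make that concrete, a sketch of such a static SQL class with one parameterised statement (all names invented):

using System.Data.SqlClient;

internal static class Sql
{
    // each statement lives behind its own static method; parameters stay
    // as @placeholders and are bound at execution time, never concatenated
    public static string CustomerOrders()
    {
        return @"SELECT OrderId, OrderDate, Total
                 FROM dbo.Orders
                 WHERE CustomerId = @customerId";
    }
}

// usage:
using (var cmd = new SqlCommand(Sql.CustomerOrders(), conn))
{
    cmd.Parameters.AddWithValue("@customerId", customerId);
    // ...
}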
The embedding solutions shown above may not work if the SQL query has a WHERE clause whose value changes between runs, e.g. the same query needs PropertyID='113' on the next run because the PropertyID is read in at runtime.
Glad you asked! Put your sql in a QueryFirst .sql template.
It's automatically compiled into your app as an embedded resource, but you don't need to care about that. You just write the SQL in a real SQL window, connected to your DB, with syntax validation and IntelliSense for tables and columns, then use it via the generated Execute() methods, with IntelliSense for your inputs and results.
Disclaimer: I wrote QueryFirst.