I've got a bit of a poor situation here. I'm stuck working with Commerce Server, which doesn't do a whole lot of sanitization/parameterization.
I'm trying to build up my queries to prevent SQL Injection, however some things like the searches / where clause on the search object need to be built up, and there's no parameterized interface.
Basically, I cannot parameterize, however I was hoping to be able to use the same engine to BUILD my query text if possible. Is there a way to do this, aside from writing my own parameterizing engine which will probably still not be as good as parameterized queries?
Update: Example
The where clause has to be built up as a sql query where clause essentially:
CatalogSearch search = /// Create Search object from commerce server
search.WhereClause = string.Format("[cy_list_price] > {0} AND [Hide] is not NULL AND [DateOfIntroduction] BETWEEN '{1}' AND '{2}'", 12.99m, DateTime.Now.AddDays(-2), DateTime.Now);
The example above is how you refine the search. However, our testing shows that this string is NOT SANITIZED.
This is where my problem lies, because any of those inputs in the .Format could be user input, and while I can clean up my input from text boxes easily, I'm going to miss edge cases; it's just the nature of things. I do not have the option here to use a parameterized query because Commerce Server has some insane backwards logic in how it handles the extensible set of fields (schema), and the free-text search words are pre-compiled somewhere. This means I cannot go directly to the SQL tables.
What I'd /love/ to see is something along the lines of:
SqlCommand cmd = new SqlCommand("[cy_list_price] > @MinPrice AND [DateOfIntroduction] BETWEEN @StartDate AND @EndDate");
cmd.Parameters.AddWithValue("@MinPrice", 12.99m);
cmd.Parameters.AddWithValue("@StartDate", DateTime.Now.AddDays(-2));
cmd.Parameters.AddWithValue("@EndDate", DateTime.Now);
CatalogSearch search = /// constructor
search.WhereClause = cmd.ToSqlString();
It sounds like you'll have to go old school and validate the data yourself before constructing the query. I'm not a .NET guy but in the CGI world I would sanitize the input with something like:
$foo =~ s/[^a-zA-Z0-9*%]//g
That will thwart any SQL injection I can think of and still allow wildcards. The only problem is that regexes are expensive.
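In C#, the same whitelist idea can be combined with typed formatting, so that numbers and dates never pass through user-controlled text at all. This is only a sketch: `SqlLiteral` and its methods are hypothetical names, and whether Commerce Server accepts the ISO 8601 date format used here is an assumption you would need to verify.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: renders typed values as SQL literals for a WHERE clause string.
public static class SqlLiteral
{
    // Numbers use the invariant culture, so a decimal never becomes "12,99"
    // on a machine with a comma decimal separator.
    public static string From(decimal value) =>
        value.ToString(CultureInfo.InvariantCulture);

    // Dates come from a DateTime, never from raw text (ISO 8601 format is an assumption).
    public static string From(DateTime value) =>
        "'" + value.ToString("yyyy-MM-ddTHH:mm:ss", CultureInfo.InvariantCulture) + "'";

    // Free text: keep only whitelisted characters (same set as the Perl regex above),
    // then wrap the result in quotes.
    public static string FromText(string value) =>
        "'" + Regex.Replace(value ?? string.Empty, "[^a-zA-Z0-9*%]", string.Empty) + "'";

    // Example of building the clause from the question with these helpers.
    public static string ExampleWhereClause(decimal minPrice, DateTime start, DateTime end) =>
        string.Format(
            "[cy_list_price] > {0} AND [Hide] is not NULL AND [DateOfIntroduction] BETWEEN {1} AND {2}",
            From(minPrice), From(start), From(end));
}
```

The whitelist is deliberately strict: it drops spaces and punctuation too, so it suits single search tokens rather than free-form phrases.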
Related
We have a couple of SQL queries as strings:
public class Query
{
    public static string CreditTransferId(string expectedValue, string table, int statusId, int messageTypeId, int destination103, int destination202, string StoreStatus202Id)
    {
        return $"SELECT top 1 Id from {table} where MessageId = '{expectedValue}' and FlowId=3 and StatusId={statusId} and MessageTypeId={messageTypeId} and " +
               $" Destination103={destination103} and Destination202={destination202} and StoreStatus103Id is null and StoreStatus202Id {StoreStatus202Id}";
    }
}
We have them returned as strings from methods inside the Query class. We want to refactor the code, since we have a method with more than 3 parameters which is pretty hard to use.
How would you go about this? What's the cleanest way to organize SQL queries which need a lot of parameters?
Dynamic SQL is a very bad thing for a start, as these queries are open to SQL injection. You should use parameterised queries and return a string.
e.g. "SELECT top 1 Id from [Table] where [MessageId] = @messageId" etc.
So you don't need to pass in any values; you would add these to your list of SqlParameters.
The table name parameter is probably pointless as it is tied to the SQL, so just put it into the SQL string.
This doesn't really need an extra class. Just create the SQL variable where you call it, so it's right there if you need it.
..or use Stored Procedures
..or use Entity Framework
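To make the parameterised suggestion concrete for the `CreditTransferId` query, here is a rough sketch that replaces the seven-argument method with a parameter object and returns the SQL text together with named parameter values. `CreditTransferCriteria`, `CreditTransferQuery` and the table names are hypothetical, and treating `StoreStatus202Id` as a nullable int (null meaning "is null") is an assumption about what the original SQL fragment was for.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical parameter object replacing the long argument list.
public sealed class CreditTransferCriteria
{
    public string Table { get; set; }
    public string MessageId { get; set; }
    public int StatusId { get; set; }
    public int MessageTypeId { get; set; }
    public int Destination103 { get; set; }
    public int Destination202 { get; set; }
    public int? StoreStatus202Id { get; set; } // assumption: null means "is null"
}

public static class CreditTransferQuery
{
    // Table names cannot be bound as parameters, so only known names are accepted.
    // These names are placeholders.
    private static readonly HashSet<string> AllowedTables =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Messages", "MessagesArchive" };

    public static (string Sql, Dictionary<string, object> Parameters) Build(CreditTransferCriteria c)
    {
        if (c.Table == null || !AllowedTables.Contains(c.Table))
            throw new ArgumentException("Unknown table: " + c.Table);

        var parameters = new Dictionary<string, object>
        {
            ["@messageId"] = c.MessageId,
            ["@statusId"] = c.StatusId,
            ["@messageTypeId"] = c.MessageTypeId,
            ["@destination103"] = c.Destination103,
            ["@destination202"] = c.Destination202
        };

        string storeStatus = "StoreStatus202Id is null";
        if (c.StoreStatus202Id.HasValue)
        {
            storeStatus = "StoreStatus202Id = @storeStatus202Id";
            parameters["@storeStatus202Id"] = c.StoreStatus202Id.Value;
        }

        string sql =
            "SELECT top 1 Id from [" + c.Table + "] where MessageId = @messageId and FlowId = 3" +
            " and StatusId = @statusId and MessageTypeId = @messageTypeId" +
            " and Destination103 = @destination103 and Destination202 = @destination202" +
            " and StoreStatus103Id is null and " + storeStatus;
        return (sql, parameters);
    }
}
```

Each dictionary entry would then be added to the command's parameter collection by whatever data access layer you use; user input never becomes part of the SQL text.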
EF is great and you have to decide if that's what you want. There are some cases where it is not suitable. If you decide to stick with plain text queries, how about dividing queries into parts:
FromBuilder
JoinBuilder
GroupBuilder
ConditionBuilder etc.
ex.:
return @"
"+new TableIdSelectorBuilder(table).Get() +@"
"+new FromBuilder().Get(table) +@"
"+new TableConditionByIdBuilder(table).Get(id)+@"
";
EDIT:
Stored procedures let you change queries without publishing a new app version, but it is a bit of a pain in the ass to work on a living organism. At least sometimes.
Hopefully this helps you a bit. I was figuring this out a long time ago too.
Use nameof instead of hardcoded column names
One piece of advice: try to use nameof() for column names (in case your domain model matches your database structure exactly).
Refactoring becomes a lot easier as well.
Create a query builder
I created a class which allows me to create simple to medium complex queries (Select, Insert, Update, Joins, Deletes, Limit, Offset, ...). For most of the queries that I write I can use this query builder to get what I need and the code will look something like this:
var query = new QueryBuilder()
    .Update(nameof(MyTable))
    .Set(nameof(MyTable.Name), entity.Name)
    .Set(nameof(MyTable.Firstname), entity.Firstname)
    .WhereEquals(nameof(MyTable.Id), entity.Id)
    .Build();
Unfortunately, I haven't found a good QueryBuilder out there yet. Therefore I created my own.
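The answer does not show the builder itself, so the following is only a guess at what a minimal version could look like, not the author's implementation. It covers just the UPDATE shape used above; `UpdateQueryBuilder` and the `@p0`, `@p1` parameter naming are made up for illustration. Values are collected as parameters and never concatenated into the SQL.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal builder for parameterized UPDATE statements.
public sealed class UpdateQueryBuilder
{
    private string _table;
    private readonly List<string> _sets = new List<string>();
    private readonly List<string> _wheres = new List<string>();

    // Parameter name -> value, to be bound by the data access layer.
    public Dictionary<string, object> Parameters { get; } = new Dictionary<string, object>();

    public UpdateQueryBuilder Update(string table)
    {
        _table = table;
        return this;
    }

    public UpdateQueryBuilder Set(string column, object value)
    {
        _sets.Add("[" + column + "] = " + AddParameter(value));
        return this;
    }

    public UpdateQueryBuilder WhereEquals(string column, object value)
    {
        _wheres.Add("[" + column + "] = " + AddParameter(value));
        return this;
    }

    // Registers the value and returns the generated parameter name.
    private string AddParameter(object value)
    {
        string name = "@p" + Parameters.Count;
        Parameters[name] = value;
        return name;
    }

    public string Build()
    {
        if (_table == null || _sets.Count == 0)
            throw new InvalidOperationException("Update() and at least one Set() are required.");

        string sql = "UPDATE [" + _table + "] SET " + string.Join(", ", _sets);
        if (_wheres.Count > 0)
            sql += " WHERE " + string.Join(" AND ", _wheres);
        return sql;
    }
}
```

Table and column names are still concatenated, which is acceptable only because they come from `nameof()` in code rather than from user input.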
Use string literals
Microsoft Documentation
Another solution that I recently encountered is by doing the following:
var query = $$"""
SELECT * FROM {{nameof(MyTable)}}
WHERE
XX
""";
This at least allows you to indent correctly and the code will remain clean.
Outsource your queries in separate files
I was also thinking about having bigger queries in separate files and include them in the Assembly. When you want to execute the query, you load the query from the file and run it like this.
You might also want to cache that query then, to not always load it from the file.
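A rough sketch of that idea, assuming the .sql files are compiled into the assembly as embedded resources. `QueryStore` is a hypothetical name, and the resource name you pass in depends on your project's default namespace and folder layout.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Reflection;

// Hypothetical cache for query text that lives outside the C# source.
public sealed class QueryStore
{
    private readonly ConcurrentDictionary<string, string> _cache =
        new ConcurrentDictionary<string, string>();
    private readonly Func<string, string> _load;

    // The loader is injected so the same cache works for files, resources or tests.
    public QueryStore(Func<string, string> load)
    {
        _load = load;
    }

    // Reads queries from the embedded resources of the given assembly.
    public static QueryStore FromEmbeddedResources(Assembly assembly) =>
        new QueryStore(name =>
        {
            using (Stream stream = assembly.GetManifestResourceStream(name))
            {
                if (stream == null)
                    throw new FileNotFoundException("No embedded resource named " + name);
                using (var reader = new StreamReader(stream))
                    return reader.ReadToEnd();
            }
        });

    // Returns the cached text, loading it on first use.
    public string Get(string name) => _cache.GetOrAdd(name, _load);
}
```

Under concurrent first access the loader may run more than once for the same name, which is harmless here because loading has no side effects.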
Final words
Personally, I try to use the QueryBuilder as often as possible. It creates clean code which looks somewhat like LINQ.
My backup approach is to use the string literals as shown above.
I have never tried separate files for queries, because I have had no use for them.
I would also avoid stored procedures, because in my opinion they are a pain to write and hard to debug.
The cleanest option would be to use Entity Framework, but if you want to use micro-ORMs like Dapper, try one of the solutions above.
I use SQL in string format with an ORM (BLToolkit). I prefer not to use LINQ without need, because:
complex queries are hard to build with LINQ, and there aren't enough resources on building complex queries with LINQ.
by using SQL strings with an ORM, you keep improving your SQL skills. Otherwise you fall out of practice (I suppose LINQ doesn't improve your SQL querying).
you can still secure your queries against SQL injection by binding parameters.
What do you think? Is it a good practice?
An example of using an ORM (in my case, BLToolkit) without LINQ is below:
var db = new Veritabani("HstConn");
try
{
    var sorgu = @"select t.tcno ""KullaniciAdi"", t.ad ""Ad"", t.soyad ""Soyad"", t.kurum_kodu ""KurumKodu"",
        t.ilkodu ""IlKodu"", t.kurum_turu ""KurumTuru"", t.e_posta ""Eposta"", t.dogrulama_kodu ""DogrulamaKodu""
        from saglikcalisanlari t
        where tcno = :kullaniciAdi
        and sifre = :sifre";
    // Values are bound as parameters, not concatenated into the SQL text.
    return db.SetCommand(sorgu,
        db.Parameter(":kullaniciAdi", kullaniciAdi.Trim()),
        db.Parameter(":sifre", sifre.Trim().Md5Hash())).ExecuteObject<SaglikCalisani>();
}
catch (Exception exc)
{
    // Keep the original exception as the inner exception so the stack trace is not lost.
    throw new Exception("Database error: " + exc.Message, exc);
}
finally
{
    db.Close();
    db.Dispose();
}
This will always be subjective and context dependent. Perhaps the real question is "will you genuinely target a different db". If the answer to that is "no", then fixed SQL is most likely fine. If you need to target multiple different backends, then an abstraction like HQL or ESQL may be more appropriate - which is different to LINQ, but still platform independent... ish.
There are also plenty of cases where you want to hand-tune the SQL because frankly in complex cases a dev will out-perform a generator (LINQ etc) 9 times out of 10 (according to a study by the EU department of invented statistics).
As long as you correctly parameterize, SQL by itself is fine in those scenarios.
stackoverflow.com uses hand-written TSQL extensively, because:
we have no plans to change backend, and if we ever do: the queries are only the tip of the iceberg
we really really care about performance:
parsing an expression tree (LINQ) or a DSL (HQL/ESQL) to generate TSQL takes time
we want the TSQL to be well-written and tested bespokely
we were genuinely seeing measured performance issues in the tooling we were using, even with precompiled LINQ queries
we wrote our own tool ("dapper") to remove every feature we didn't absolutely need - to make it simply query-in, objects-out
and our own tool ("mini-profiler") to monitor the performance live on the site
Yes, the reasons you have mentioned are good ones for choosing SQL strings. Performance-wise you will find too little difference to go religiously for one or the other.
The idea is to use tools that make you more productive without asking you to give up the performance of your application.
If you like SQL queries more than LINQ and are more comfortable with them, I don't see any disadvantage in that. On the other hand, if you are avoiding LINQ only because it takes a lot out of you and you find it difficult, it's time to put in some effort to learn it. Unless you can compare your tools, you cannot choose the best one for the job.
Have a look at NPoco - it is based on PetaPoco but has had a few recent enhancements.
Building SQL strings is much easier using the SQL string builder and the latest version also has the ability to use LINQ if you wish.
I want to have an app where a user (typically a power user) can enter a boolean expression. I want to be able to execute the boolean expression in both .NET and in SQL.
The expressions themselves aren't that hard, they are things like:
country is united states
country is one of: united states, canada, mexico
(country is united states) AND (age is 20)
(country is united states) OR ((age is 20) and country is one of: united states, canada)
I need to be able to support basic things like 'in', equals, greater/less than, between, contains, startswith, etc. I want to be able to compile into C# and run it against an object of type dynamic, and also be able to compile the expression into SQL. I would be plugging the result into a where clause of a very specific query.
I do not want to use Nhibernate or EntityFramework, I want to be able to execute the SQL directly.
UPDATE:
I already know that I want to execute it using ADO.NET. Sorry if I didn't state it clearly. I just want to know what a good approach would be to store a boolean expression that can be executed both in C# and SQL. I don't care about stored procedures and parameterization (the latter is obvious and trivial once I'm able to generate a query). The users entering the expressions are internal to my company, mostly developers and power users.
So I'm thinking along the lines of using things like Expression Trees, Abstract Syntax Trees, LINQ, etc., to accomplish this. I don't want to use an ORM, but I want to do something very similar to what the ORMs do in their LINQ expressions to convert lambdas into the code for the WHERE clause.
UPDATE2:
The way we're thinking of doing this so far is having the expressions entered as C# and stored as strings in a database. When we want to execute in a .NET context we'd compile into a lambda or Expression and maybe wrap it in an object that has an interface with a method to represent the boolean expression: interface IDynamicFilter { bool PassesFilter<SomePocoType>(poco); }. Or we could drop the POCO(s) into an IEnumerable and run a LINQ .Where() with the lambda/expression passed in to pull out the objects that match the filters.
For the SQL, this is the part I'm more fuzzy on. We want to replicate what the ORMs do -- visit an Expression tree and render it into an SQL string. We don't need to support the full set of SQL. We only have to support pretty simple operators like grouping with parentheses, AND/OR/NOT, in/not in, GT/LT/EQ/NEQ/between. We also want to support some basic maths operations (a + b > c).
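The "visit an Expression tree and render it into SQL" step could start from something like the sketch below, built on the framework's `ExpressionVisitor` base class. `WhereClauseBuilder` and `Person` are hypothetical names; the sketch handles only literal constants, direct properties of the lambda parameter, and a handful of operators. Captured variables, `in`, `between` and NULL comparisons are deliberately left out and would need extra cases.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Text;

// Hypothetical POCO used only for the example.
public sealed class Person
{
    public string Country { get; set; }
    public int Age { get; set; }
}

// Renders a small subset of expression trees as a parameterized WHERE clause.
public sealed class WhereClauseBuilder : ExpressionVisitor
{
    private readonly StringBuilder _sql = new StringBuilder();
    private readonly List<object> _parameters = new List<object>();

    public static (string Sql, List<object> Parameters) Translate<T>(Expression<Func<T, bool>> filter)
    {
        var builder = new WhereClauseBuilder();
        builder.Visit(filter.Body);
        return (builder._sql.ToString(), builder._parameters);
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        _sql.Append("(");
        Visit(node.Left);
        _sql.Append(" " + Operator(node.NodeType) + " ");
        Visit(node.Right);
        _sql.Append(")");
        return node;
    }

    protected override Expression VisitUnary(UnaryExpression node)
    {
        if (node.NodeType == ExpressionType.Not) _sql.Append("NOT ");
        else if (node.NodeType != ExpressionType.Convert)
            throw new NotSupportedException("Unary operator not supported: " + node.NodeType);
        Visit(node.Operand);
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Only properties read directly from the lambda parameter map to columns.
        if (!(node.Expression is ParameterExpression))
            throw new NotSupportedException("Only direct properties are supported: " + node);
        _sql.Append("[" + node.Member.Name + "]");
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        // Constants become positional parameters, never inline SQL text.
        _sql.Append("@p" + _parameters.Count);
        _parameters.Add(node.Value);
        return node;
    }

    private static string Operator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            case ExpressionType.Add: return "+";
            case ExpressionType.Subtract: return "-";
            default: throw new NotSupportedException("Operator not supported: " + type);
        }
    }
}
```

For example, `WhereClauseBuilder.Translate<Person>(p => p.Country == "US" && p.Age > 20)` yields the text `(([Country] = @p0) AND ([Age] > @p1))` plus the two values. The same `Expression<Func<Person, bool>>` can be compiled with `.Compile()` for the in-memory check, which is what makes a single stored representation usable on both sides.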
Microsoft CRM has something like this where they give you the form fields and a drop down designating what you're trying to find and what your logical operator is, so you might have.
Column:
Select Column in Database
Operator:
>
<
>=
<=
LIKE
=
IN
NOT IN
Value:
TextBox for user input
So you would take the user input and just create the queries based upon that. If you don't want to use an ORM like Entity, then you could use ADO.NET.
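If you go the form-driven route, the column and operator should be checked against fixed lists and only the value should travel as a parameter. A minimal sketch of that, where `FilterClause` and the column names are hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: turns one (column, operator, values) row from the form
// into a parameterized SQL fragment.
public static class FilterClause
{
    // Placeholder column names; in practice these come from your schema.
    private static readonly HashSet<string> Columns =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "CountryCode", "Age" };

    private static readonly HashSet<string> ScalarOperators =
        new HashSet<string> { ">", "<", ">=", "<=", "=", "LIKE" };

    private static readonly HashSet<string> ListOperators =
        new HashSet<string> { "IN", "NOT IN" };

    public static string Build(string column, string op, IReadOnlyList<object> values,
                               IDictionary<string, object> parameters)
    {
        if (column == null || !Columns.Contains(column))
            throw new ArgumentException("Unknown column: " + column);

        string normalized = (op ?? string.Empty).Trim().ToUpperInvariant();
        bool isList = ListOperators.Contains(normalized);
        bool isScalar = ScalarOperators.Contains(normalized);
        if (!isList && !isScalar)
            throw new ArgumentException("Unknown operator: " + op);
        if (values == null || values.Count == 0 || (isScalar && values.Count != 1))
            throw new ArgumentException("Wrong number of values for operator " + normalized);

        // Every value becomes a named parameter; none is written into the SQL text.
        var names = new List<string>();
        foreach (object value in values)
        {
            string name = "@p" + parameters.Count;
            parameters[name] = value;
            names.Add(name);
        }

        return isList
            ? "[" + column + "] " + normalized + " (" + string.Join(", ", names) + ")"
            : "[" + column + "] " + normalized + " " + names[0];
    }
}
```

Several fragments can be joined with AND/OR by the caller, sharing one parameter dictionary so the generated names stay unique.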
If I understand your problem, you can do something like this (and keep in mind I do not recommend this sort of thing for production):
var query = "SELECT * FROM TABLE as T WHERE 1=1";
if ([some condition]) query += " AND T.CountryCode = 'USA'";
if ([some other condition]) query += " AND T.CountryCode IN ('USA', 'CAN', 'MEX')";
if ([yet another condition]) query += " AND T.CountryCode = 'USA' AND T.Age = 20";
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    // SqlCommand takes the command text first, then the connection.
    using (var comm = new SqlCommand(query, conn))
    {
        var results = comm.ExecuteReader(); // returns a data reader you can loop through
    }
}
I'll say again, make sure you are parameterizing any variables you are searching on, this sort of query is quite vulnerable to sql injection. This is the lazy developer's way of doing something, and can be quite dangerous. If you want to avoid ORM's, this sort of logic should be in a stored procedure that you pass parameters to.
I am having a lot of fun with Linq2Sql. Expression Trees have been great, and just the standard Linq2Sql syntax has been a lot of fun.
I am now down to the part of my application where I have to somehow store queries in a database that are custom for different customers who use the same database and the same tables (well, views, but you know what I mean). Basically, I can't hard-code anything, and I have to keep the query language in clear text so someone can write a new where-clause type query.
So, if that description was harsh, let me clarify:
In a previous version of our application, we used to make direct calls to the db using raw SQL. Yeah, it was fun, dirty, and it worked. We would have a database table full of different criteria like
(EventType = 6 and Total > 0)
or a subquery style
(EventType = 7
AND Exists (
select *
from events as e1
where events.EventType = e1.EventType
and e1.objectNumber = 89)
)
(sql injection anyone?)
In Linq2Sql, this is a little more challenging. I can make all these queries no problem in the CLR, but being able to pass dynamic where criteria to Linq is a little more challenging, especially if I want to perform a subquery (like the above example).
Some ideas I had:
Get the raw expression, and store it --- but I have no idea how to take the raw text expression and reverse it back to executable to object expression.
Write a SQL-like language, and have it parse the code and generate Linq expressions -- wow, that could be a lot of fun
I am quite sure there is no SomeIqueryable.Where("EventType = 6 and Total > 54"). I was reading that it was available in beta1, but I don't see how you can do that now.
var exp2 = context.POSDataEventView.Where("EmployeeNumber == @0", 8310);
This would be the easiest way for me to deploy.. I think.
Store serialized Expressions -- wow.. that would be confusing to a user trying to write a query --- hell, I'm not sure I could even type it all out.
So, I am looking for some ideas on how I can store a query in some kind of clear text, and then execute it against my Linq2Sql objects in some fashion without calling the ExecuteSQL. I want to use the LinqObjects.
P.S. I am using pLinqo for this application if that helps. It's still linq2sql though.
Thanks in advance!
Perhaps the Dynamic LINQ Library (in the MSDN samples) would help?
In particular, usage like:
This should work with any IQueryable<T> source - including LINQ-to-Objects simply by calling .AsQueryable() on the sequence (typically IEnumerable<T>).
What's the best way to convert search terms entered by a user, into a query that can be used in a where clause for full-text searching to query a table and get back relevant results? For example, the following query entered by the user:
+"e-mail" +attachment -"word document" -"e-learning"
Should translate into something like:
SELECT * FROM MyTable WHERE (CONTAINS(*, '"e-mail"')) AND (CONTAINS(*, '"attachment"')) AND (NOT CONTAINS(*, '"word document"')) AND (NOT CONTAINS(*, '"e-learning"'))
I'm using a query parser class at the moment, which parses the query entered by users into tokens using a regular expression, and then constructs the where clause from the tokens.
However, given that this is probably a common requirement by a lot of systems using full-text search, I'm curious as to how other developers have approached this problem, and whether there's a better way of doing things.
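For comparison, the regex-and-tokens approach described above can be kept quite small. The sketch below is one possible shape, not the asker's parser: `SearchTermParser` is a hypothetical name, and instead of inlining the terms it emits `@t0`, `@t1`, ... placeholders and returns the quoted terms separately, on the assumption that each CONTAINS search condition is bound as a parameter.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical parser for queries like: +"e-mail" +attachment -"word document"
public static class SearchTermParser
{
    // Group 1: optional +/- prefix. Group 2: quoted phrase. Group 3: bare word.
    private static readonly Regex Token =
        new Regex("([+-]?)(?:\"([^\"]+)\"|([^\\s\"]+))");

    public static (string Where, List<string> Terms) Parse(string input)
    {
        var clauses = new List<string>();
        var terms = new List<string>();

        foreach (Match match in Token.Matches(input ?? string.Empty))
        {
            string word = match.Groups[2].Success
                ? match.Groups[2].Value
                : match.Groups[3].Value;

            // The term is wrapped in double quotes and bound to a parameter,
            // so it never becomes part of the SQL text.
            string name = "@t" + terms.Count;
            terms.Add("\"" + word + "\"");

            string clause = "CONTAINS(*, " + name + ")";
            clauses.Add(match.Groups[1].Value == "-"
                ? "(NOT " + clause + ")"
                : "(" + clause + ")");
        }

        // An empty input produces an empty string; the caller should skip the WHERE then.
        return (string.Join(" AND ", clauses), terms);
    }
}
```

With the sample query from the question this produces four clauses, the last two negated, and the terms list holds `"e-mail"`, `"attachment"`, `"word document"` and `"e-learning"` (each including its double quotes).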
How to implement the accepted answer using .Net / C# / Entity Framework...
Install Irony using nuget.
Add the sample class from:
http://irony.codeplex.com/SourceControl/latest#Irony.Samples/FullTextSearchQueryConverter/SearchGrammar.cs
Write code like this to convert the user-entered string to a query.
var grammar = new Irony.Samples.FullTextSearch.SearchGrammar();
var parser = new Irony.Parsing.Parser(grammar);
var parseTree = parser.Parse(userEnteredSearchString);
string query = Irony.Samples.FullTextSearch.SearchGrammar.ConvertQuery(parseTree.Root);
Perhaps write a stored procedure like this:
create procedure [dbo].[SearchLivingFish]
    @Query nvarchar(2000)
as
select *
from Fish
inner join containstable(Fish, *, @Query, 100) as ft
    on ft.[Key] = FishId
where IsLiving = 1
order by rank desc
Run the query.
var fishes = db.SearchLivingFish(query);
This may not be exactly what you are looking for but it may offer you some further ideas.
http://www.sqlservercentral.com/articles/Full-Text+Search+(2008)/64248/
In addition to @franzo's answer above you probably also want to change the default stop word behaviour in SQL. Otherwise queries containing single digit numbers (or other stop words) will not return any results.
Either disable stop words, create your own stop word list and/or set noise words to be transformed as explained in SQL 2008: Turn off Stop Words for Full Text Search Query
To view the system list of (English) sql stop words, run:
select * from sys.fulltext_system_stopwords where language_id = 1033
I realize it's a bit of a side-step from your original question, but have you considered moving away from SQL fulltext indexes and using something like Lucene/Solr instead?
The easiest way to do this is to use dynamic SQL (I know, insert security issues here) and break the phrase into a correctly formatted string.
You can use a function to break the phrase into a table variable that you can use to create the new string.
A combination of GoldParser and Calitha should sort you out here.
This article: http://www.15seconds.com/issue/070719.htm has a googleToSql class as well, which does some of the translation for you.