I am allowing users to generate expressions against predefined columns on the table. A user can create columns, tables, and can define constraints such as unique and not null columns. I also want to allow them to generate "Calculated columns". I am aware that PostgreSQL does not allow calculated columns so to get around that I'll use expressions like this:
SELECT CarPrice, TaxRate, CarPrice + (CarPrice * TaxRate) AS FullPrice FROM CarMSRP
The user can enter something like this
{{CarPrice}} + ({{CarPrice}} * {{TaxRate}})
Then it gets translated to
CarPrice + (CarPrice * TaxRate)
I'm not sure whether this is vulnerable to SQL injection. If it is, how would I make it secure?
Why don't you utilize STORED PROCEDURES to conduct this?
This way you can, for instance, define variables to receive what the user wrote and check whether it contains BLACKLISTED words (like DELETE, TRUNCATE, ALL, *, and so forth).
I don't know PostgreSQL, but if that isn't possible there, you can also check for those problematic commands BEFORE translating the input into your SELECT statement.
If I understand you correctly, you just take user input as described above and substitute it into the select column list. If so, that is certainly not safe, because something like:
"* from SomeSystemTable--({{CarPrice}} + ({{CarPrice}} * {{TaxRate}})"
will allow the user to select anything from any other table he has permissions for. You can try to build an expression tree to avoid that: parse the user input into a structure describing variables and the arithmetic operations between them (like parsing arithmetic expressions). Otherwise, you can remove every {{}} placeholder from the string (ensuring that each {{}} corresponds to a column in the table) and check that only "+-*()" and whitespace characters are left.
Note that from a user experience viewpoint you will need to parse the expression anyway, to warn the user about errors without actually running the query.
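The second suggestion (strip the placeholders, then check what is left) can be sketched roughly as follows. This is a Python illustration only, not the asker's code: ALLOWED_COLUMNS and translate_expression are hypothetical names, and the column list would really come from your table metadata. One assumption beyond the answer: the sketch also rejects "--", because the minus sign is in the allowed character set and two of them in a row start a SQL comment.

```python
import re

# Hypothetical whitelist; in practice, load this from the table's metadata.
ALLOWED_COLUMNS = {"CarPrice", "TaxRate"}

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")
# Once placeholders are removed, only these characters may remain.
LEFTOVER_OK = re.compile(r"[+\-*()\s]*")


def translate_expression(user_expr: str) -> str:
    """Turn '{{CarPrice}} * {{TaxRate}}' into 'CarPrice * TaxRate', or raise ValueError."""
    unknown = [n for n in PLACEHOLDER.findall(user_expr) if n not in ALLOWED_COLUMNS]
    if unknown:
        raise ValueError(f"unknown column(s): {unknown}")

    leftover = PLACEHOLDER.sub("", user_expr)
    if not LEFTOVER_OK.fullmatch(leftover):
        raise ValueError("only + - * ( ) and whitespace are allowed between columns")

    translated = PLACEHOLDER.sub(r"\1", user_expr)
    # '-' is allowed, so '--' (a SQL line comment) has to be rejected explicitly.
    if "--" in translated:
        raise ValueError("'--' is not allowed")
    return translated
```

This does not check that the expression is well formed (balanced parentheses, operators between operands), so a real parser is still the better option for reporting errors to the user.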
I have the following C# function
SomeFunction(string table, string column, string where) {
Sql sql = new Sql("SELECT ");
// [...] validate table and column values
sql.Append(column);
sql.Append(" FROM ");
sql.Append(table);
sql.Append(" WHERE ");
sql.Append(where); // This is the issue
}
As you can see, this is awful. I'm dealing with very old legacy code, and changing the function signature and the way the clients use it is just not feasible. What I have to do is secure the 'where' clause, which may contain any number of conditions and data types.
I had a bunch of ideas, but I don't think any of them is a good solution. I think this requires properly written and tested code; if I write it myself from scratch, it will probably have holes. Here are some thoughts:
Splitting the string by char '=' -> what if that's not the condition operator
Find if string contains semicolons -> the SELECT clause remains vulnerable, and maybe one of the conditions contains that char so it'd give a false positive
If you have any idea/suggestion/pointing in the right direction I will be most grateful.
If the where clause is currently based on being a pre-composed string, then frankly I don't think it is a viable approach to attempt to "secure" it. It is theoretically possible, but any attempt at parsing the SQL will fail if the composed and compromised (injected) where clause is legitimate (but abusive). At that point: you've already lost track of the original intent. That's kinda the entire point of SQL injection: the resultant SQL is valid SQL - so it is very hard for you to tell the difference between where Name = 'Fred Orson' -- check name (probably fine) and where Name = 'Fred' Or 1=1 --' (injected - query widening).
So: while I acknowledge that you say:
changing the function signature and the way the clients use it is just not feasible.
Not changing the function signature doesn't really help you solve the problem. Trying to detect certain patterns is just an arms race, where you need to win every time and the attacker needs to win only once.
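To make the "valid but abusive" point concrete, here is a small Python/sqlite3 sketch (not the asker's C#; the Users table and count_users are made up for the demonstration). Both strings are perfectly legal WHERE clauses, so nothing at the SQL level distinguishes them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Users (Name) VALUES (?)",
    [("Fred Orson",), ("Fred",), ("Alice",)],
)


def count_users(where: str) -> int:
    # Deliberately unsafe: mirrors the legacy pattern of appending a
    # pre-composed WHERE string to the query text.
    return conn.execute("SELECT COUNT(*) FROM Users WHERE " + where).fetchone()[0]


honest = count_users("Name = 'Fred Orson' -- check name")
widened = count_users("Name = 'Fred' OR 1=1 --'")
print(honest, widened)
```

The first call matches one row; the second matches every row in the table, and by the time the string reaches the function there is no record of which parts the caller intended as data.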
If it was me, I'd be doing something like:
[Obsolete("Please specify parameters separately - use 'null' if no parameters are needed")]
SomeFunction(string table, string column, string where) {
return SomeFunction(table, column, where, null);
}
SomeFunction(string table, string column, string where, object args) {
// ...
}
and using an approach like "Dapper" uses to compose the parameters from the args parameter - or just use "Dapper" itself to run the query, and use that functionality for free.
This approach:
prevents new uses of the dangerous API being added
lets the existing code continue to work for now
but lets you track how many outstanding problem calls there are, by watching the warnings
Edit: note: the point of the args parameter is to allow the caller to parameterize their inputs, i.e.
string name = ...
var users = SomeFunction("Users", "Id", "Name=@name", new { name });
With SomeFunction decomposing args and adding parameter name/value pairs from the properties on args (if it is non-null). There are various approaches to composing parameter sets, but the approach shown here is simple and easy to implement correctly - which makes it a clear win for me.
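As a rough illustration of that overload, here is the same shape in Python with sqlite3 rather than C# and Dapper. A dict stands in for the anonymous args object, sqlite3 spells the named placeholder :name, and the table and column are assumed to have been validated already, as in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Users (Name) VALUES (?)", [("Fred",), ("Alice",)])


def some_function(conn, table, column, where, args=None):
    # table and column are assumed to be validated elsewhere; only the values
    # referenced by the WHERE text are handled here.
    sql = f"SELECT {column} FROM {table} WHERE {where}"
    # Each key in args is bound to the placeholder of the same name, so the
    # value is never spliced into the SQL text.
    return conn.execute(sql, args or {}).fetchall()


name = "Fred' OR 1=1 --"
users = some_function(conn, "Users", "Id", "Name = :name", {"name": name})
print(users)
```

The hostile value is compared as a plain string and matches nothing, while old callers that pass no args keep working unchanged.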
I am trying to write an algorithm that builds queries. With one or two conditions I can handle each combination with a switch statement and string formatting (pic. 1).
But with more than two conditions, the number of variants grows very quickly (pic. 2).
All I want is a SELECT against the database with varying conditions.
Does anyone know an approach for constructing many different combinations of conditions?
As long as they're always querying/filtering the same denormalized set, you can write a WHERE clause builder, but you'll need to treat each field/operator/value and clause independently.
Each value in your field combo box should correspond to one table.field name in the set, and each value in your operator combo box to a SQL operator to add to the clause. The values users enter are the hard part: you need to distinguish numbers from strings (formatted without or with single quotes), and there are date formats to consider as well.
You may also get people making combinations of fields and operators that don't make sense. 'After' makes sense for dates, but not email addresses. Consider limiting your choices in the operator combo by the data type of the field selection.
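A minimal sketch of such a builder, in Python for illustration. The field names, types, and operator labels below are invented; yours would mirror the combo boxes. Each condition is handled independently, the operator choices are limited by field type, and values go into a parameter list rather than the SQL text, which sidesteps the quoting and date-format problem:

```python
# Hypothetical catalogue: field name -> data type shown in the field combo box.
FIELDS = {"Email": "text", "SignupDate": "date", "Age": "number"}

# Operator label -> (SQL operator, field types it makes sense for).
OPERATORS = {
    "equals": ("=", {"text", "date", "number"}),
    "contains": ("LIKE", {"text"}),
    "after": (">", {"date"}),
    "greater than": (">", {"number"}),
}


def build_where(conditions):
    """conditions: list of (field, operator_label, value) triples, AND-ed together."""
    clauses, params = [], []
    for field, op_label, value in conditions:
        if field not in FIELDS:
            raise ValueError(f"unknown field: {field}")
        if op_label not in OPERATORS:
            raise ValueError(f"unknown operator: {op_label}")
        sql_op, allowed_types = OPERATORS[op_label]
        if FIELDS[field] not in allowed_types:
            raise ValueError(f"'{op_label}' does not apply to {field}")
        if sql_op == "LIKE":
            value = f"%{value}%"
        clauses.append(f"{field} {sql_op} ?")
        params.append(value)
    # With no conditions, fall back to a clause that matches every row.
    return " AND ".join(clauses) or "1=1", params


where, params = build_where(
    [("Email", "contains", "example.com"), ("SignupDate", "after", "2020-01-01")]
)
print(where, params)
```

Any number of conditions goes through the same loop, so there is no switch over combinations to maintain.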
Suppose I'm querying a Sql Server DB for row count based on a LIKE comparison on a column whose value is supplied via windows form text input. Using parameters here is important due to possible user input leading to injection. Eventually I'm to execute member function ExecuteScalar() on an instantiated SqlCommand object I named cmd, but first I've to add the parameter. For example:
cmd.Parameters.AddWithValue("@param1", textBox1.Text);
Sql Server uses the special character % as a wildcard match for the LIKE comparison. What I wanted to do was allow the user to use * for wildcards instead. Hence, a simple replace:
textBox1.Text.Replace('*','%');
The problem is I run into issues with values containing special symbols % and _. One way to search for a literal % rather than use it as a wildcard is to enclose it in square brackets: [%].
So, now my replace has to become:
textBox1.Text.Replace("%","[%]").Replace("_","[_]").Replace('*','%');
Order is important here as well, since if the last Replace were made sooner the % would be treated incorrectly.
I'm not sure I've covered all my bases, are there other characters I need to worry about here? Does this really protect from injection? Is there some other preferred way of doing this?
An example query might be something like this:
SELECT COUNT(*) FROM [MyTable] WHERE [Column1] LIKE @param1
Where MyTable is your table name, and Column1 is a valid column name within MyTable. We can assume that Column1 is some nvarchar type.
You shouldn't really need anything else, but I'd make sure to test oddball things from user input. Note that [ itself opens a character class in LIKE, so a literal [ needs escaping as [[], and inside such a class these characters have special meaning:
^
-
Since you're passing a parameter into the statement and not blindly appending the string, there should be little danger of injection, but you may want to try variations of user input such as:
foo'; DELETE dbo.[UnimportantTable];
foo''; DELETE dbo.[UnimportantTable];
foo''''; DELETE dbo.[UnimportantTable];
Again, I'm not sure if you're vulnerable because I can't see the whole thing, but I do think it's very easy to construct a variety of tests so that you know all of the potential outcomes with a wide sampling of potential inputs.
As @Bryan pointed out, certainly a good way to limit the risk is to connect using a login that has very explicit read-only permissions only on the objects you want them to be able to read. Then even if they do exploit some hole in your scaffolding, getting in doesn't buy them much.
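For the wildcard translation itself, a single pass over the input avoids the ordering problem from the question, because no character is ever examined twice. The sketch below is Python rather than C#, and to_like_pattern is a made-up name; it escapes [ as well as % and _, on the assumption that SQL Server's bracket syntax is the target:

```python
def to_like_pattern(user_text: str) -> str:
    """Translate user text with '*' wildcards into a SQL Server LIKE pattern."""
    out = []
    for ch in user_text:
        if ch == "*":
            out.append("%")             # the user's wildcard becomes LIKE's wildcard
        elif ch in "[%_":
            out.append("[" + ch + "]")  # wrap LIKE metacharacters so they match literally
        else:
            out.append(ch)
    return "".join(out)


print(to_like_pattern("50%_off*"))
```

The result still has to be passed as a parameter value; the escaping only controls how LIKE interprets the pattern and is not a defence against injection by itself.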
I'm working a C# form application that ties into an access database. Part of this database is outside of my control, specifically a part that contains strings with ", ), and other such characters. Needless to say, this is mucking up some queries as I need to use that column to select other pieces of data. This is just a desktop form application and the issue lies in an exporter function, so there's no concern over SQL injection or other such things. How do I tell this thing to ignore quotes and such in a query when I'm using a variable that may contain them and match that to what is stored in the Access database?
Well, an example would be that I've extracted several columns from a single row. One of them might be something like:
large (3-1/16" dia)
You get the idea. The quotes are breaking the query. I'm currently using OleDb to dig into the database and didn't have an issue until now. I'd rather not gut what I've currently done if it can be helped, at least not until I'm ready for a proper refactor.
This is actually not as big a problem as it may seem: just do NOT build SQL queries as plain strings. Use a command object with query parameters (SqlCommand for SQL Server; OleDbCommand in your case, since you are on OleDb). This way the parameter values travel separately from the SQL text, so the engine knows what is code to be executed and what is a value, and quotes inside a value cannot break the query.
You are trying to protect against a SQL injection attack; see https://www.owasp.org/index.php/SQL_Injection.
The easiest way to prevent these attacks is to use query parameters; http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlparameter.aspx
var cmd = new SqlCommand("select * from someTable where id = @id");
cmd.Parameters.Add("@id", SqlDbType.Int).Value = theID;
At least for single quotes, adding another quote seems to work: '' becomes '.
Even though injection shouldn't be an issue, I would still look into using parameters. They are the simpler option at the end of the day as they avoid a number of unforeseen problems, injection being only one of them.
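To show that parameters make the quoting problem disappear, here is a small sketch using Python's sqlite3 instead of C# and OleDb. The Parts table is invented for the demonstration; the positional ? placeholder is, as far as I know, also the style OleDb expects:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parts (Id INTEGER PRIMARY KEY, Description TEXT)")

# Values with awkward characters, like the one in the question.
description = 'large (3-1/16" dia)'
other = "O'Brien's washer"
conn.execute("INSERT INTO Parts (Description) VALUES (?)", (description,))
conn.execute("INSERT INTO Parts (Description) VALUES (?)", (other,))

# The value is bound to the placeholder, never spliced into the SQL text,
# so the quotes and parentheses need no escaping at all.
row = conn.execute(
    "SELECT Id FROM Parts WHERE Description = ?", (description,)
).fetchone()
print(row)
```

No escaping function is involved anywhere, which is why this tends to be less work than patching the string-building code.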
So as I read your question, you are building up a query as a string in C#, concatenating already queried column values, and the resulting string is either ceasing to be a string in C#, or it won't match stuff in the access db.
If the problem is in C#, I guess you'll need some sort of escaping function like
stringvar += Escaped(columnvalue);
...
private static string Escaped(string cv) {
    // Double each single quote so the value survives inside a '...' literal
    return cv.Replace("'", "''");
}
If the problem is in Access, then
'' (two single quotes) stands for ' inside a '...' literal
"" (two double quotes) stands for " inside a "..." literal
and you can put a column value containing " inside of '...' and it should work.
However my real thought is that, the SQL you're trying to run might be better restructured to use subqueries to get the matched value(s) and then you're simply comparing column name with column name.
If you post some more information re exactly what the query you're producing is, and some hint of the table structures, I'll try and help further - or someone else is bound to be able to give you something constructive (though you may need to adjust it per Jet SQL syntax)
I need a regex (run in C#) that will take a string containing a SQL UPDATE statement as input and return a list of the columns to be updated. It should be able to handle columns whether or not they are surrounded by brackets.
// Example Sql Statement
Update Employees
Set FirstName = 'Jim', [LastName] = 'Smith', CodeNum = codes.Num
From Employees as em
Join CodeNumbers as codes on codes.EmployeeID = em.EmployeeID
In the end I would want to return an IEnumerable or List containing:
FirstName
LastName
CodeNum
Anyone have any good suggestions on implementation?
Update: The sql is user-generated, so I have to parse the Sql as it is given. The purpose of extracting the column names in my case is to validate that the user has permission to update the columns included in the query.
You're doing it backwards. Store the data in a broken out form, with the table to be updated, the column names, and the expressions to generate the new values all separate. From this canonical representation, generate both the SQL (when you need it) and the list of columns being updated (when you need that instead).
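A minimal sketch of that "broken out" form, in Python for illustration. UpdateSpec and check_permissions are hypothetical names, and new values are restricted to literals here, whereas the real thing would also need a representation for expressions such as codes.Num:

```python
from dataclasses import dataclass


@dataclass
class UpdateSpec:
    """Canonical form of an update: the SQL is generated from it, never parsed."""
    table: str
    assignments: dict  # column name -> new literal value

    def columns(self):
        return list(self.assignments)

    def to_sql(self):
        # Values become parameters; only the column names appear in the text,
        # so check_permissions must run before this is executed.
        set_clause = ", ".join(f"[{col}] = ?" for col in self.assignments)
        return f"UPDATE [{self.table}] SET {set_clause}", list(self.assignments.values())


def check_permissions(spec, permitted_columns):
    denied = [c for c in spec.columns() if c not in permitted_columns]
    if denied:
        raise PermissionError(f"not allowed to update: {denied}")


spec = UpdateSpec("Employees", {"FirstName": "Jim", "LastName": "Smith"})
check_permissions(spec, {"FirstName", "LastName", "CodeNum"})
sql, params = spec.to_sql()
print(spec.columns(), sql, params)
```

The column list needed for the permission check falls out of the data structure directly, so no regex or parser is required.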
If you absolutely must pull the column names out of a SQL statement, I don't think that regular expressions are the correct way to go. For example, in the general case you may need to skip over new value expressions that contain arbitrarily nested parentheses. You will probably want a full SQL parser. The book Lex & Yacc by Levine, Mason, and Brown has a chapter on parsing SQL.
Response to update:
You are in for a world of hurt. The only way to do what you want is to fully parse the SQL, because you also need to make sure that you don't have any subexpressions that perform unauthorized actions.
I very, very strongly recommend that you come up with another way to do whatever it is that you are doing. Maybe break out the modifiable fields into a separate table and use access controls? Maybe come up with another interface for them to use in specifying what they want done? Whatever it is that you're doing, there is almost certainly a better way to do it. Down that path there be dragons.
Regular expressions cannot do this task, because SQL is not a regular language.
You can do this, but not with a regular expression. You need a full-blown parser.
You can use ANTLR to generate parsers in C#, and there are free grammars available for parsing SQL in ANTLR.
However, I agree with Glomek that allowing user-supplied SQL to be run against your system, even after you have tried to validate that it includes no "unauthorized actions," is foolish. There are too many cases that may circumvent your validation.
Instead, if you have only a single text field, you should define a simplified Domain-Specific Language that permits users to specify only actions that they are authorized to do. From this input, you can build the SQL yourself.
SQL has a complex recursive grammar, and there will always be some subselect, GROUP BY, or literal that will break your regex-based parser.
Why not use a SQL parser to achieve what you need? Here is an article that shows how to do it within 3 minutes.