Automatically escaping characters within strings

Automatically escaping characters within strings - c#

I am working on a C# project where I am exporting data from a database that is defined by the user so I have no idea what the data is going to contain or the format it is going to be in.
Some of the strings within the database might include apostrophe's (') which I need to escape but everything I've found on the internet shows that I would have to do string.replace("'", "\'"); which seems a bit odd as it would be a mass of replace statements for every possibility.
Isn't there a better way to do this.
Thanks for any help you can provide.

I recently had to make a code fix for this same problem. I had to put a ton of string.replace() statements everywhere. My recommendation would be to create a method that handles all escape character possibilities and have your query strings pass through this method before being executed. If you design your structure correctly you should only have to call this method once.
public string FixEscapeCharacterSequence(string query)
{
query = query.Replace("'", "\'");
//..Any other replace statements you need
//....
return query;
}

Related

character to use when splitting strings in visual c#?

Ok, I'm racking my brains over this one. It's pretty simple though (I think).
I'm currently creating a text file as a comma separated string of values.
Later, I read in that file data and then use the .split function to split the data by commas.
I discovered that sometimes one of the description fields in the data conatins an embedded comma, which ends up throwing the split command off.
Is there any special character I could use that could pretty much guarantee wouldn't be in the data, or is there a better way to accomplish this? Thanks!
// Initial Load
fullString = fileName + "," + String.Join(",", fieldValues);
// Access later
String[] valuesArray = myString.Split(',');

Short answer, there's no "simple" way to do it using Split. The best you can hope for is to set the deliminator as something cooky that wouldn't ever get used (but even that's not a guarantee).
The simple method would be to used something like CsvHelper (get it through Nuget) or any of the other dozen or so packages that are designed for parsing CSV.

Replacing single quotes in full sql queries with C#

In our C# desktop-application we generate a lot of dynamic sql-queries. Now we have some troubles with single quotes in strings. Here's a sample:
INSERT INTO Addresses (CompanyName) VALUES ('Thomas' Imbiss')
My question is: How can I find and replace all single quotes between 2 other single quotes in a string? Unfortunately I can't replace the single quotes when creating the different queries. I can only do that after the full query is created and right before the query gets executed.
I tried this pattern (Regular Expressions): "\w\'\w"
But this pattern doesn't work, because after "s'" there's a space instead of a char.

I am sorry to say, there is no solution in approach you expect.
For example, have these columns and values:
column A, value ,A',
column B, value ,B',
If they are together in column list, you have ',A',',',B','.
Now, where is the boundary between first and second value? It is ambiguous.
You must take action when creating text fields for SQL. Either use SQL parameters or properly escape qoutes and other problematic characters there.
Consider showing the above ambiguous example to managers, pushing the whole problem back as algorithmically unsolvable at your end. Or offer implementing a guess-work and ask them whether they will be happy if content of several text fields can get mixed in some cases like above one.
At time of SQL query creation, if they do not want to start using SQL parameters, the solution for enquoting any input string is as simple as replacing:
string Enquote(string input)
{
return input.All(c => Strings.AscW(c) < 128) ? "'" : "N'"
+ input.Replace("'", "''")
+ "'"
}
Of course, it can have problem with deliberately malformed Unicode strings (surrogate pairs to hide ') but it is not normally possible to produce these strings through the user interface. Generally this can be still faster than converting all queries to versions with SQL parameters.

Comparing Strings in .NET

I am running into what must be a HUGE misunderstanding...
I have an object with a string component ID, I am trying to compare this ID to a string in my code in the following way...
if(object.ID == "8jh0086s)
{
//Execute code
}
However, when debugging, I can see that ID is in fact "8jh0086s" but the code is not being executed. I have also tried the following
if(String.Compare(object.ID,"8jh0086s")==0)
{
//Execute code
}
as well as
if(object.ID.Equals("8jh0086s"))
{
//Execute code
}
And I still get nothing...however I do notice that when I am debugging the '0' in the string object.ID does not have a line through it, like the one in the compare string. But I don't know if that is affecting anything. It is not the letter 'o' or 'O', it's a zero but without a line through it.
Any ideas??

I suspect there's something not easily apparent in one of your strings, like a non-printable character for example.
Trying running both strings through this to look at their actual byte values. Both arrays should contain the same numerical values.
var test1 = System.Text.Encoding.UTF8.GetBytes(object.ID);
var test2 = System.Text.Encoding.UTF8.GetBytes("8jh0086s");
==== Update from first comment ====
A very easy way to do this is to use the immediate window or watch statements to execute those statements and view the results without having to modify your code.

Your first example should be correct.
My guess is there is an un-rendered character present in the Object.ID.
You can inspect this further by debugging, copying both values into an editor like Notepad++ and turning on view all symbols.

I suspect you answered your own question. If one string has O and the other has 0, then they will compare differently. I have been in similar situations where strings seem the same but they really aren't. Worst-case, write a loop to compare each individual character one at a time and you might find some subtle difference like that.
Alternatively, if object.ID is not a string, but perhaps something of type "object" then look at this:
http://blog.coverity.com/2014/01/13/inconsistent-equality
The example uses int, not string, but it can give you an idea of the complications with == when dealing with different objects. But I suspect this is not your problem since you explicitly called String.Compare. That was the right thing to do, and it tells you that the strings really are different!

Replace all single quotes with two single quotes in a string

I am trying to read value from DB using c#.
The query string contains multiple single quotes - such as: Esca'pes' (the query strings are being read from a text file)
So, I wanted to replace all the single quotes with two single quotes before forming the SQL query. My code is as below:
if (name.Contains('\''))
{
name = name.Replace('\'','\''');
}
How to fix this?

Use strings, not char literals.
name = name.Replace("'", "''");
However it sounds like you're concatenating SQL strings together. This is a huge "DO NOT" rule in modern application design because of the risk of SQL injection. Please use SQL parameters instead. Every modern DBMS platform supports them, including ADO.NET with SQL Server and MySQL, even Access supports them.

name = name.Replace("'","''");
On an unrelated note, you're concatenating strings for use in SQL? Try parameters instead, that's what they're meant for. You're probably making it harder than it needs to be.

Since you want to replace a single character with two characters, you need to use the String overload of Replace
if (name.Contains('\''))
{
name = name.Replace("'","''");
}
(Note: single quotes don't require escaping in Strings like they do in character notation.)

Regex index in matching string where the match failed

I am wondering if it is possible to extract the index position in a given string where a Regex failed when trying to match it?
For example, if my regex was "abc" and I tried to match that with "abd" the match would fail at index 2.
Edit for clarification. The reason I need this is to allow me to simplify the parsing component of my application. The application is an Assmebly language teaching tool which allows students to write, compile, and execute assembly like programs.
Currently I have a tokenizer class which converts input strings into Tokens using regex's. This works very well. For example:
The tokenizer would produce the following tokens given the following input = "INP :x:":
Token.OPCODE, Token.WHITESPACE, Token.LABEL, Token.EOL
These tokens are then analysed to ensure they conform to a syntax for a given statement. Currently this is done using IF statements and is proving cumbersome. The upside of this approach is that I can provide detailed error messages. I.E
if(token[2] != Token.LABEL) { throw new SyntaxError("Expected label");}
I want to use a regular expression to define a syntax instead of the annoying IF statements. But in doing so I lose the ability to return detailed error reports. I therefore would at least like to inform the user of WHERE the error occurred.

I agree with Colin Younger, I don't think it is possible with the existing Regex class. However, I think it is doable if you are willing to sweat a little:
Get the Regex class source code
(e.g.
http://www.codeplex.com/NetMassDownloader
to download the .Net source).
Change the code to have a readonly
property with the failure index.
Make sure your code uses that Regex
rather than Microsoft's.

I guess such an index would only have meaning in some simple case, like in your example.
If you'll take a regex like "ab*c*z" (where by * I mean any character) and a string "abbbcbbcdd", what should be the index, you are talking about?
It will depend on the algorithm used for mathcing...
Could fail on "abbbc..." or on "abbbcbbc..."

I don't believe it's possible, but I am intrigued why you would want it.

In order to do that you would need either callbacks embedded in the regex (which AFAIK C# doesn't support) or preferably hooks into the regex engine. Even then, it's not clear what result you would want if backtracking was involved.

It is not possible to be able to tell where a regex fails. as a result you need to take a different approach. You need to compare strings. Use a regex to remove all the things that could vary and compare it with the string that you know it does not change.
I run into the same problem came up to your answer and had to work out my own solution. Here it is:
https://stackoverflow.com/a/11730035/637142
hope it helps

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Automatically escaping characters within strings - c#

Related

character to use when splitting strings in visual c#?

Replacing single quotes in full sql queries with C#

Comparing Strings in .NET

Replace all single quotes with two single quotes in a string

Regex index in matching string where the match failed

Categories

Resources