Comparing strings with quotation marks


Hello guys, I'm trying to write a program in C# that compares two strings that contain double quotation marks. My problem is how to compare them for equality, because the comparison seems to ignore the words within the quotation marks and does not give me the right result.
An example is if
string1 = Hi "insert name" here.
string2 = Hi "insert name" here.
I want to use string1.Equals(string2), but it tells me the strings are not equal. How do I do this? Please help.
PS. I have no control over what the strings will look like, as they are dynamic variables, so I can't just add an escape sequence to them.

string s1 = "Hi \"insert name\" here.";
string s2 = "Hi \"insert name\" here.";
Console.WriteLine((s1 == s2).ToString()); //True
I have no problem ...

.NET will not ignore string values with double quotes when doing comparisons. I think your analysis of what is happening is flawed. For example, given these values:
var string1 = "This contains a \"quoted value\"";
var string2 = "This contains a \"quoted value\"";
var string3 = "This contains a \"different value\"";
string1.Equals(string2) will equal true, and string2.Equals(string3) will equal false.
Here are some potential reasons why you're not seeing an expected result when comparing:
One string may contain different quote characters than another. For example, "this", and “this” are completely different strings.
Your comparison may be failing due to other content not matching. For example, one string may have trailing spaces, and the other may not.
You may be comparing two objects instead of two strings. When the variables are typed as object, the == operator checks whether both refer to the same object rather than comparing contents. If you're not dealing with String references, the wrong comparison may be happening.
There are many more potential causes for your issue, but it's not because string comparison ignores double quotes. The more details you provide in your question, the easier it is for us to narrow down what you're seeing.
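To narrow down which of these causes applies, it can help to look at the raw characters instead of the printed text. The following is only a sketch; StringDiagnostics, DumpCodeUnits and FirstDifference are hypothetical helper names, not part of .NET.

```csharp
using System;
using System.Linq;

// Hypothetical helpers for finding out why two look-alike strings differ.
public static class StringDiagnostics
{
    // Renders every UTF-16 code unit as U+XXXX so look-alike characters
    // (straight vs. curly quotes, trailing spaces) become visible.
    public static string DumpCodeUnits(string s) =>
        string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

    // Returns the index of the first position where the strings differ,
    // or -1 if they are identical.
    public static int FirstDifference(string a, string b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i]) return i;
        }
        // Same prefix: the strings differ only if one is longer.
        return a.Length == b.Length ? -1 : shared;
    }
}
```

Calling FirstDifference on the two strings gives the index to inspect, and DumpCodeUnits shows whether the character there is, say, U+0022 (straight quote) or U+201C (curly quote).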

Related

C# Regular expressions, string variables, and inevitable quotation marks

Although this is a basic example, there are so many questions around escaping quote marks that this basic question about string variables seems to get lost in the 'noise'.
For purposes of this question, C# is always in the context of Visual Studio C#, in this case Visual Studio-2019.
In C#, both the variable in the string that I want to test for a pattern match and the string that contains the pattern are surrounded by quote marks. These quote marks are present in the C# program-code and the debugger string variable values as well. This seems to be inevitable.
Since these quote marks are part of the C# string variables themselves, I would hope that they would just be ignored by regex as part of the standard syntax.
This appears to be the case.
However, I want to verify that this works correctly, and how it works.
Example:
string ourTestString = "Smith";
string ourRegexToMatch = "^(Sm)";
Regex ourRegexVar = new Regex(ourRegexToMatch, RegexOptions.Singleline);
var matchCollection = ourRegexVar.Matches(ourTestString);
bool ourMatch = matchCollection.Count == 1;
The intent is to match for Sm at the beginning of the line and it is currently case sensitive.
In the above code, ourMatch is indeed true, as expected/hoped for.
It appears in the debugger that the ourRegexVar itself does not have the quote marks that surround the C# variable. There are curly brackets around everything which I would suppose is standard for such Regex variables.
One could easily imagine complex scenarios that involve strings that really do have quotation marks and escaped quotation marks and so forth, so it could get much more complicated than the above rather simple example.
My question is:
For the purposes of regex and C# variables, is it ALWAYS the case that, for both the ourTestString and the ourRegexToMatch string variables, the "" that the compiler requires around a C# string literal behave exactly as if they were not there?

The quotes tell the compiler that it's a string. Strings in memory aren't delimited by quotes.
If you do have a quote in a string, like Sm"ith, you have to escape it, like so:
var s1 = "Sm\"ith";
var s2 = @"Sm""ith";
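A quick way to convince yourself of this is to check lengths and equality directly. This is a minimal sketch; QuoteDemo and its method names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class: shows that delimiting quotes are not stored in the string.
public static class QuoteDemo
{
    // The delimiting quotes are source-code syntax only, so this is 5, not 7.
    public static int LengthOfSmith() => "Smith".Length;

    // A regular literal with \" and a verbatim literal with "" both
    // produce the same six characters: S m " i t h
    public static bool LiteralsAgree() => "Sm\"ith" == @"Sm""ith";

    // A quote that really is part of the data is matched by Regex
    // like any other character.
    public static bool MatchesEmbeddedQuote() => Regex.IsMatch("Sm\"ith", "^Sm\"");
}
```

Regex only ever sees the characters that are in the string at run time, so the literal syntax used in the source file makes no difference to matching.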

Splitting strings in c#

Let's assume in my console the user inputs a couple or few strings separated by spaces.
I'm using these lines of code to organize the inputs into an array:
string[] inputs = Console.ReadLine().Split();
string firstName = inputs[0];
string lastName = inputs[1];
My goal by posting this is to better understand the Console.ReadLine().Split(); command. Microsoft documentation is a bit lost on me. Does this command read inputs and enable them to be separated by empty spaces? I'm assuming that is the case because in the code snippet we are declaring index 0 to be the string variable firstName and index 1 to be the string variable lastName.
I have also seen this command used as Console.ReadLine().Split(" ");. What kind of different functionality does this offer?
Edit: For duplicate notification: This question concerns the mechanics of this command and how it gets placed into an array specifically. Thanks for your responses. The 'duplicate' is a bit more general and did not succeed in answering my question.

These are two different operations, Console.ReadLine() and String.Split(): the first returns a string from user input, and the second splits it. The chained call is equivalent to:
string input = Console.ReadLine();
string[] result = input.Split();
You can chain as many member accesses (methods, properties, fields, etc.) as you want with the dot operator, but it is better to keep your code readable (in this example it is pretty simple).
If no parameter is passed, Split uses whitespace as the separator by default. From MSDN:
If the separator argument is null or contains no characters, the method treats white-space characters as the delimiters. White-space characters are defined by the Unicode standard; they return true if they are passed to the Char.IsWhiteSpace method.
References: Console.ReadLine(), String.Split, . Operator
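The difference between the parameterless call and an explicit space separator shows up as soon as the input contains tabs or repeated spaces. A small sketch follows; SplitDemo and its method names are hypothetical.

```csharp
using System;

// Hypothetical demo class comparing the Split variants discussed above.
public static class SplitDemo
{
    // No separator given: every Unicode white-space character is a delimiter.
    public static string[] SplitDefault(string line) => line.Split();

    // Explicit separator: only the space character is a delimiter.
    public static string[] SplitOnSpace(string line) => line.Split(' ');

    // An empty separator array also means "white-space"; RemoveEmptyEntries
    // drops the empty strings produced by consecutive delimiters.
    public static string[] SplitOnWhitespaceCompact(string line) =>
        line.Split(Array.Empty<char>(), StringSplitOptions.RemoveEmptyEntries);
}
```

For the input "John\tSmith", SplitDefault returns two elements while SplitOnSpace returns one. For "John  Smith" with two spaces, both return three elements (the middle one empty), and only SplitOnWhitespaceCompact returns two.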

Read the input from the console
var inputs = Console.ReadLine();
Split the input string on space characters
var splitInputs = inputs.Split(' ');
Check if the split array has at least one element and take its value
string firstName = splitInputs.Length > 0 ? splitInputs[0] : string.Empty;
Check if the split array has at least two elements and take the second value
string lastName = splitInputs.Length > 1 ? splitInputs[1] : string.Empty;

Remove double backslashes c# (for use ESC/POS programming)

I've searched for a long time, and it seems that my problem is known worldwide. But none of the answers given work for me. Most of the time, people say 'there is no problem'.
The problem: I'm programming a POS solution, and I'm using an Epson POS printer. To print the bottom of the receipt, I'm storing a string in the database, so that users can adjust the text at the bottom of the receipt. But when I pull the string out of the database, C# adds slashes to the string, so my escape characters won't work. I know that this is usually not a problem, but in my case it is, because my ESC/POS commands won't work.
I've already tried some scripts that replace the double \ with a single \, but they don't work (e.g. String.Replace(@"\\", @"\")).
Problem:
I have a string: "foo \n bar"
Needs to print as:
foo
bar
C# adds slashes: "foo \\n bar"
Now it's printed as:
foo \n bar
Anyone an idea?

The problem is a misunderstanding of how C# handles strings. Take the following sample code:
string foo = "a\nb";
int fooLength = foo.Length; // 3 characters.
int bar = (int)(foo[1]); // 10 = linefeed character.
versus:
string foo = @"a\nb"; // NB: @ prefix!
int fooLength = foo.Length; // 4 characters.
int bar = (int)(foo[1]); // 92 = backslash character.
The first example uses a string literal ("a\nb") which is interpreted by the C# compiler to yield three characters. The second example uses a verbatim string literal, due to the prefix @, which suppresses the interpretation of escape sequences in the string.
Note that the debugger is designed to add to the confusion by displaying strings with escape codes added, e.g. string foo = "a\nb" + (Char)9; results in a string that the debugger shows as "a\nb\t". If you use the "text visualizer" in the debugger (by clicking on the magnifying glass when examining the variable's value) you can see the difference between literal and interpreted characters.
Databases are, as a rule, designed to accept and return string values without interpretation. That way you needn't worry about names like "Delete D'table". Neither the presence of a SQL keyword, nor punctuation used in SQL statements, should present a problem in a data column.
Now the OP's issue should be becoming clearer. The string retrieved from the database does not contain a linefeed, but instead contains the characters '\' and 'n'. .NET has no reason to change those values when the string is read from the database and written to a printer. Unfortunately, the debugger confounds the difference. (Use the text visualizer as described above.)
The solution involves adding code to reproduce the C# compiler's processing of escape sequences. (This should include escaping escape characters!) Alternatively, tokens can be added that are suitable for the application at hand, e.g. occurrences of «ESC» could be replaced with an ASCII escape character. This can be employed for longer sequences, for example if a printer uses several characters to introduce a font change then write the code to replace «SetFont» with the correct sequence. More generally, you can replace a snippet with a dynamic value, e.g. «Now» could be replaced with the current date/time when the receipt is being printed. (Register number, cashier name, store hours, ... .) This makes the values in the database more human readable than embedded Unicode oddities and more flexible than fixed strings.
Left as an exercise for the reader: extend snippets to support formatting and null value substitution. «Now|DD.MM.YY hh:mm» to specify a format, «Discount|*|n/a» to specify a value ("n/a") to be displayed if the field is null.
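The escape-processing and token ideas described above could be sketched roughly as follows. This is not a full reimplementation of the compiler's rules: it handles only \n, \r, \t and \\, and the names ReceiptText, Unescape and ExpandTokens are assumptions made for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper for text read from the database before printing.
public static class ReceiptText
{
    // Converts the two-character sequences \n, \r, \t and \\ into the
    // characters they stand for. Unknown sequences are left untouched.
    public static string Unescape(string stored)
    {
        var sb = new StringBuilder(stored.Length);
        for (int i = 0; i < stored.Length; i++)
        {
            char c = stored[i];
            if (c != '\\' || i == stored.Length - 1)
            {
                sb.Append(c); // ordinary character, or a lone trailing backslash
                continue;
            }
            char next = stored[++i];
            switch (next)
            {
                case 'n': sb.Append('\n'); break;
                case 'r': sb.Append('\r'); break;
                case 't': sb.Append('\t'); break;
                case '\\': sb.Append('\\'); break; // escaped escape character
                default: sb.Append('\\').Append(next); break; // keep as-is
            }
        }
        return sb.ToString();
    }

    // Token variant: replaces the placeholder «ESC» with the ASCII
    // escape character (0x1B) that ESC/POS commands start with.
    public static string ExpandTokens(string text) =>
        text.Replace("«ESC»", "\u001B");
}
```

With this in place, the stored text foo \n bar (backslash followed by n) would be passed through ReceiptText.Unescape before being sent to the printer, so that it contains a real linefeed.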

C# Trouble with Regex.Replace

Been scratching my head all day about this one!
Ok, so I have a string which contains the following:
?\"width=\"1\"height=\"1\"border=\"0\"style=\"display:none;\">');
I want to convert that string to the following:
?\"width=1height=1border=0style=\"display:none;\">');
I could theoretically just do a String.Replace on "\"1\"" etc. But this isn't really a viable option as the string could theoretically have any number within the expression.
I also thought about removing the string "\"", however there are other occurrences of this which I don't want to be replaced.
I have been attempting to use the Regex.Replace method as I believe this exists to solve problems along my lines. Here's what I've got:
chunkContents = Regex.Replace(chunkContents, "\".\"", ".");
Now that really messes things up (it replaces the correct elements, but with a full stop), but I think you can see what I am attempting to do with it. I am also worried that this will only work for single digits (\"1\" rather than \"11\"). So that led me into thinking about using the "*" or "+" quantifier after the ".", however I foresaw the problem of this picking up all of the text in between the desired characters (which are dotted all over the place), whereas I obviously only want to replace the ones with numeric characters in between them.
Hope I've explained that clearly enough, will be happy to provide any extra info if needed :)

Try this
var str = "?\"width=\"1\"height=\"1234\"border=\"0\"style=\"display:none;\">');";
str = Regex.Replace(str , "\"(\\d+)\"", "$1");
(\\d+) is a capturing group that looks for one or more digits and $1 references what the group captured.

This works
String input = @"?\""width=\""1\""height=\""1\""border=\""0\""style=\""display:none;\"">');";
//replace the entire match of the regex with only what's captured (the number)
String result = Regex.Replace(input, @"\\""(\d+)\\""", match => match.Result("$1"));
//control string for expected result
String shouldBe = @"?\""width=1height=1border=0style=\""display:none;\"">');";
//prints true
Console.WriteLine(result.Equals(shouldBe).ToString());

compare strings with one having nonreadable chars

I am having a problem comparing two strings. One string has one or more nonreadable characters in it, while the other string is the same but in readable format.
When I try to use this, I am having trouble
if (Alemria=Almería)...
I have the string Almería stored in a table.
How can this be done?

Use an overload of string.Equals that takes a StringComparison enum - use one of the CurrentCulture enum members.
You will need to set the current culture to a culture that can sort by these characters.

Depending on how strict you want to be, you might try this. If you know the characters in question are only from the Spanish alphabet, you could strip those out of the seed values (maybe use RegEx) and modify your comparison logic to do the same with target records. For example, remove all 'ñ' and 'n' from both sides and maybe add a length comparison to increase reliability. Of course, do this with all the special characters, not just 'ñ'.

See if this article will help you, you could replace all accented characters in the word and then do your comparison.

I suppose CompareOptions.IgnoreNonSpace is what you are looking for. Compare will then ignore nonspacing combining characters such as accents, diacritics, and vowel marks.
string str1 = "mun";
string str2 = "mün";
int result1 = string.Compare(str1, str2, CultureInfo.InvariantCulture, CompareOptions.IgnoreNonSpace);
int result2 = string.Compare(str1, str2, CultureInfo.CurrentCulture, CompareOptions.IgnoreNonSpace);
But Alemria will differ from Almería anyway. Seems like it's considered totally another symbol.
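As an alternative to a culture-sensitive Compare, the accented characters can be replaced before an ordinal comparison, as one of the other answers suggests. The sketch below does this by decomposing the string and dropping combining marks; AccentInsensitive, RemoveDiacritics and EqualsIgnoringAccents are hypothetical names, and it assumes the runtime supports Unicode normalization.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper for accent-insensitive comparison.
public static class AccentInsensitive
{
    // Decomposes each character into base letter + combining marks (FormD),
    // drops the combining marks, then recomposes the result (FormC).
    public static string RemoveDiacritics(string s)
    {
        string decomposed = s.Normalize(NormalizationForm.FormD);
        var kept = decomposed.Where(c =>
            CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);
        return new string(kept.ToArray()).Normalize(NormalizationForm.FormC);
    }

    // Compares the accent-stripped forms character by character.
    public static bool EqualsIgnoringAccents(string a, string b) =>
        string.Equals(RemoveDiacritics(a), RemoveDiacritics(b), StringComparison.Ordinal);
}
```

This treats mun and mün as equal, but it does not help when the letters themselves differ or are in a different order, which is why Alemria still does not match Almería.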

Try
if(var.StartsWith("Almer"))//replace var with your string variable
MessageBox.Show("String matched");//do whatever you want to do here
