Visual Studio 2013: Testing / Assert Strings - c#

I am testing a library of mine that generates XML-style text. So far I have been testing with
Assert.AreEqual(string1, string2);
But the XML-style strings are more than 300 characters long. When a single character is wrong, the test fails and reports that the strings are not equal, but it does not say at which position they differ.
So my question is: is there an existing function that compares two strings and also tells me at which position they differ, along with the strings themselves?

Try it this way:
var indexBroke = 0;
var maxLength = Math.Min(string1.Length, string2.Length);
while (indexBroke < maxLength && string1[indexBroke] == string2[indexBroke]) {
    indexBroke++;
}
return ++indexBroke;
The logic is that you compare the strings character by character and stop at the first difference. After the loop, indexBroke is the zero-based index of the first differing character (or the length of the shorter string if one string is a prefix of the other), so the function returns its 1-based position.
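The same idea can be wrapped in a small helper that also produces a failure message. This is only a sketch: the class name StringDiff and its two methods are made up for illustration, and it reports a zero-based index (or -1 when the strings are equal) rather than the 1-based position used above.

```csharp
using System;

public static class StringDiff
{
    // Returns the zero-based index of the first differing character,
    // or -1 if the two strings are identical.
    public static int FirstDifference(string expected, string actual)
    {
        int shared = Math.Min(expected.Length, actual.Length);
        for (int i = 0; i < shared; i++)
        {
            if (expected[i] != actual[i]) return i;
        }
        // One string is a prefix of the other, or they are equal.
        return expected.Length == actual.Length ? -1 : shared;
    }

    // Builds a message that shows the position and both strings.
    public static string Describe(string expected, string actual)
    {
        int index = FirstDifference(expected, actual);
        if (index < 0) return "Strings are equal.";
        return string.Format(
            "Strings differ at index {0}.\nExpected: {1}\nActual:   {2}",
            index, expected, actual);
    }
}
```

In an MSTest test, the message could be passed as the third argument, for example Assert.AreEqual(expected, actual, StringDiff.Describe(expected, actual)).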

For that reason (and many others), I can recommend using FluentAssertions.
With FluentAssertions you would write your assertion like this:
string1.Should().Be(string2);
If the strings do not match, you get an informative message that helps you track down the problem:
Expected string to be
"<p>Line one<br/>Line two</p>" with a length of 28, but
"<p>Line one<br>Line two</p>" has a length of 27.
Additionally, you can give a reason to make the error message even more clear:
string1.Should().Be(string2, "a multiline-input should have been successfully parsed");
That would give you the following message:
Expected string to be
"<p>Line one<br/>Line two</p>" with a length of 28 because a multiline-input should have been successfully parsed, but
"<p>Line one<br>Line two</p>" has a length of 27.
These reason arguments are especially valuable when comparing values that provide no meaning by themselves, such as booleans and numbers.
BTW, FluentAssertions also helps greatly in comparing object graphs.

Related

Comparing strings with if in C# code error [duplicate]

I'm a beginner in C# and I'm making a console guess-the-number game. You enter a number and it tells you to guess higher or lower, or that you guessed the number. Anyway, I'm having trouble comparing the answer with the user's guess.
I've tried comparing string guess with string answer using <= in an if statement. I got an error that says "Operator '<=' cannot be applied to operands of 'string' and 'string'".
The code:
string answer = "537";
string guess = Console.ReadLine();
*if (guess <= answer)*
The code with asterisks is the code I'm getting an error from. Does anyone know what I'm doing wrong and a solution?
Since you've said that you're a beginner, here is the short version: <= isn't valid for strings.
Imagine if I did this:
string foo = "Hello world";
string bar = "Wassup?";
if (foo <= bar)
{
    // do something
}
What, exactly, would foo <= bar mean in that context? We could try to compare the length of the strings (bar is shorter than foo), the sum of the ASCII values of the characters in each string, or just about anything. It's possible to implement methods that do those things, but none of them make sense in the general case, so the language doesn't try, and it shouldn't.
The difference between a string and an int is that the former is intended to contain character data, like a name or a sentence. Mathematical comparisons like <= apply to numeric data, like integers and floating point values. So, to get the behavior you're looking for, you need to convert your text data into a numeric type.
The nature of data types and how they are stored, comparisons, etc. is a nontrivial discussion. But, suffice it to say that the string "123" is NOT the same as the number (integer, most likely) 123.
The easiest fix for your code would be something like:
string answer = "537";
string guess = Console.ReadLine();
var intAnswer = Int32.Parse(answer);
var intGuess = Int32.Parse(guess);
if (intGuess <= intAnswer)
{
    // do something...
}
Note that this will throw an exception if the user enters anything in the console that is not a valid integer. (Look up TryParse for a better solution, but that's beyond the scope of this answer and I think it would just confuse the issue in this case.)
I'd spend some time reading about data types, int vs string, etc. This is a reasonable question about something that is not obvious to those just getting started.
Keep at it. We all started somewhere, and this is as good a place as any.
Strings cannot be treated as numbers; the only comparison operators they support check whether they are equal. If the input is numeric, convert both the guess and the answer to int first. If the guess will always be a number, this is enough:
if (Convert.ToInt32(guess) <= Convert.ToInt32(answer))
{
}
If it may not be a number, wrap the conversion in a try/catch or use Int32.TryParse.
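A minimal sketch of the TryParse route mentioned in both answers. The class name GuessChecker, the method Check, and the message texts are invented for illustration; only int.TryParse is the real API.

```csharp
using System;

public static class GuessChecker
{
    // Compares a raw console input against the answer without throwing
    // when the input is not a number.
    public static string Check(string guess, int answer)
    {
        int value;
        if (!int.TryParse(guess, out value))
        {
            return "Please enter a whole number.";
        }
        if (value < answer) return "Guess higher.";
        if (value > answer) return "Guess lower.";
        return "You guessed the number!";
    }
}
```

In the game loop this could be called as Console.WriteLine(GuessChecker.Check(Console.ReadLine(), 537)). TryParse returns false instead of throwing, so no try/catch is needed.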

C# String.CompareTo not returning the results I would expect

I am trying to compare strings with less-than, greater-than, and so on, in the same way I would compare numbers.
My issue is that the following comparison returns true:
var expectThisToBeFalse = "315160".CompareTo("40000") < 0;
I know I can compare these as numbers, but in my application I do not know whether they are numbers or letters.
Can anyone explain what I am missing, and whether there is a comparison method that would work?
eg would show:
"1" is less than "2"
"a" is less than "b"
"aa" is greater than "b"
etc...
You are not missing anything. The method you use compares two strings alphabetically. That means that if string A comes before string B in alphabetical order, it returns a negative number (-1 in practice).
Because you're comparing two strings, not two numbers, the function looks at the first character of each string ("3" and "4" in your example). Because "3" has a lower ASCII code than "4" (51 and 52, respectively), the function concludes that "315160" comes before "40000" alphabetically, so it returns -1. Because you compared the result of this function (-1) with 0, the variable is (correctly) true, because -1 < 0.
For what you want, you will need to write your own function. I don't know of an existing function that does this.
Later edit: more info on string.compare.
Later edit 2: something else struck me as interesting:
but in my application I do not know if they are numbers or letters.
For a simpler way of solving this, you may begin by checking if the two inputs are numbers or letters. You would save yourself a lot of trouble, because sometimes these two inputs will be numbers and solving is super-easy.
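One possible shape for such a function is sketched below. It rests on an assumption drawn from the examples in the question: values that both parse as integers are compared numerically, and everything else is ordered by length first and then character by character, which is what makes "aa" greater than "b". The class name LooseComparer is made up.

```csharp
using System;

public static class LooseComparer
{
    // Returns a negative number, zero, or a positive number,
    // following the usual CompareTo convention.
    public static int Compare(string a, string b)
    {
        long x, y;
        // Both inputs are numbers: compare them as numbers.
        if (long.TryParse(a, out x) && long.TryParse(b, out y))
        {
            return x.CompareTo(y);
        }
        // Otherwise shorter strings sort first...
        if (a.Length != b.Length)
        {
            return a.Length.CompareTo(b.Length);
        }
        // ...and equal-length strings are compared by character code.
        return string.CompareOrdinal(a, b);
    }
}
```

With this sketch, LooseComparer.Compare("315160", "40000") is positive, so the comparison from the question would be false as expected. Whether length-first ordering is right for mixed input depends on the application.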

How to check a partial similarity of two strings in C#

Is there any function in C# that checks the percentage of similarity between two strings?
For example, I have:
var string1="Hello how are you doing";
var string2= " hi, how are you";
and
function(string1, string2)
would return a similarity ratio, because the words "how", "are", "you" are present in both lines.
Or even better, it would return 60% similarity, because "how", "are", "you" make up 3/5 of string1.
Does a function that does this exist in C#?
A common measure for similarity of strings is the so-called Levenshtein distance or edit distance. In this approach, a fixed set of edit operations is defined. The Levenshtein distance is the minimum number of edit steps necessary to obtain the second string from the first. Closely related is the Damerau-Levenshtein distance, which uses a different set of edit operations.
Algorithmically, the Levenshtein distance can be calculated using Dynamic programming, which can be considered efficient. However, note that this approach does not actually take single words into account and cannot directly express the similarity in percent.
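For reference, a compact sketch of the dynamic-programming calculation, using insertion, deletion, and substitution as the edit operations. The Similarity method is an assumption added here, not part of the definition: dividing the distance by the longer length is one common convention for turning it into a ratio.

```csharp
using System;

public static class EditDistance
{
    // Classic dynamic-programming table: d[i, j] is the distance between
    // the first i characters of a and the first j characters of b.
    public static int Levenshtein(string a, string b)
    {
        int[,] d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i; // delete i characters
        for (int j = 0; j <= b.Length; j++) d[0, j] = j; // insert j characters

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1,      // deletion
                             d[i, j - 1] + 1),     // insertion
                    d[i - 1, j - 1] + cost);       // substitution or match
            }
        }
        return d[a.Length, b.Length];
    }

    // Hypothetical normalization: 1.0 means identical, 0.0 means nothing in common.
    public static double Similarity(string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        return longest == 0 ? 1.0 : 1.0 - (double)Levenshtein(a, b) / longest;
    }
}
```

The table needs (a.Length + 1) × (b.Length + 1) cells, which is fine for short strings. The comparison is character-based, so word order and shared words are not considered.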
Now I am going to risk a -1 here for my suggestion, but in situations where you want something that is close enough and not too complex, there are a lot of simpler solutions than the Levenshtein distance, which is perfect if you need exact results and have time to code it.
If you are a bit looser about accuracy, then I would follow these simple rules:
1. Compare the literal strings first (strSearch == strReal). If they match, exit.
2. Convert the search string and the real string to lowercase.
3. Remove vowels and other characters from both strings: [aeiou-"!]
Now you have two converted strings. Your search string:
mths dhlgrn mtbrn
and the real string to compare it to:
rstrnt mths dhlgrn
4. Compare the converted strings. If they match, exit.
5. Split only the search string into words, either with a simple split function or with the regular expression \W+.
6. Calculate the virtual value (weight) of one part by dividing 100 by the number of parts, in this case 33.
7. Check whether each part of the search string is contained in the real string, and add the weight for each match to your total. In this case we have three parts and two matches, so the result is 66, a 66% match.
This method is simple and can be extended to go into more and more detail. You could use steps 1-7 and, if step 7 returns anything above 50%, treat it as a match; otherwise you fall back to more complex calculations.
OK, don't -1 me too fast. The other answers are perfect; this is just a solution for lazy developers and might be of value where the result is good enough.
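The seven steps above can be sketched as follows. The names LazyMatch, Normalize, and Score are invented, and the set of removed characters is copied from the bracket expression in the answer; integer division reproduces the weight of 33 from the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LazyMatch
{
    // Steps 2 and 3: lowercase, then strip vowels and the characters - " !
    private static string Normalize(string s)
    {
        return Regex.Replace(s.ToLowerInvariant(), "[aeiou\"!-]", "");
    }

    // Returns a rough match score between 0 and 100.
    public static int Score(string search, string real)
    {
        if (search == real) return 100;               // step 1

        string s = Normalize(search);
        string r = Normalize(real);
        if (s == r) return 100;                       // step 4

        // Step 5: split only the search string into words.
        string[] parts = Regex.Split(s, @"\W+")
                              .Where(p => p.Length > 0)
                              .ToArray();
        if (parts.Length == 0) return 0;

        int weight = 100 / parts.Length;              // step 6
        // Step 7: add the weight for every part found in the real string.
        return parts.Count(p => r.Contains(p)) * weight;
    }
}
```

For the converted strings from the answer, LazyMatch.Score("mths dhlgrn mtbrn", "rstrnt mths dhlgrn") finds two of three parts and yields 66.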
You can create a function that splits both strings into arrays, and then iterate over one of them to check if the word exists in the other one.
If you want percentage of it you would have to count total amount of words and see how many are similar and create a number based on that.
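A short sketch of that word-overlap idea, under a few assumptions: words are split on non-word characters, the comparison ignores case, and the percentage is taken relative to the first string, as in the 3/5 example from the question. WordOverlap and Percent are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordOverlap
{
    private static string[] Words(string s)
    {
        return Regex.Split(s.ToLowerInvariant(), @"\W+")
                    .Where(w => w.Length > 0)
                    .ToArray();
    }

    // Percentage of words in 'first' that also occur in 'second'.
    public static double Percent(string first, string second)
    {
        string[] firstWords = Words(first);
        if (firstWords.Length == 0) return 0.0;

        var secondWords = new HashSet<string>(Words(second));
        int shared = firstWords.Count(w => secondWords.Contains(w));
        return 100.0 * shared / firstWords.Length;
    }
}
```

For the strings in the question, WordOverlap.Percent("Hello how are you doing", " hi, how are you") counts "how", "are", and "you" out of five words and gives 60.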

Uniqueness for a shortened guid

I have to append a unique code as a querystring, for every url generated.
So, the option I chose is to shorten a guid (found here on SO).
public static string CreateGuid()
{
    Guid guid = Guid.NewGuid();
    return Convert.ToBase64String(guid.ToByteArray());
}
Will this be as unique as a GUID? I have several URLs to generate, and this GUID will be saved in the DB.
Yup, the default string representation of a guid is base16. By reformatting the same value as base64, you get a shorter (but possibly uglier) string.
You should watch out if you are using this in the url. While the string will be shorter, it will potentially have characters that are illegal in urls, so you may need to run it past
HttpUtility.UrlEncode()
to be safe. Of course, once you do that, it will get a little longer again.
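An alternative to URL-encoding, shown here only as a sketch, is the URL-safe Base64 variant: replace + and / with - and _ and drop the == padding. This is a different technique from the UrlEncode call suggested above, and the class name ShortGuid with its Encode and Decode methods is invented for illustration.

```csharp
using System;

public static class ShortGuid
{
    // 16 bytes encode to 24 Base64 characters, the last two being '=' padding,
    // so the result is always 22 characters long.
    public static string Encode(Guid guid)
    {
        return Convert.ToBase64String(guid.ToByteArray())
                      .TrimEnd('=')
                      .Replace('+', '-')
                      .Replace('/', '_');
    }

    // Reverses Encode: restore the standard alphabet and the padding.
    public static Guid Decode(string shortGuid)
    {
        string base64 = shortGuid.Replace('-', '+').Replace('_', '/') + "==";
        return new Guid(Convert.FromBase64String(base64));
    }
}
```

Because the encoding is reversible, the short form carries exactly the same information as the original GUID and is therefore just as unique.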
Edit:
Your comment makes it seem like you want some sort of math, so here goes:
Let's assume that you always have 24 alphanumeric characters and that casing does not matter. That means each character can be 0-9 or a-z, which is 36 possibilities, so there are 36^24 different possible strings. Refer to this website then:
http://davidjohnstone.net/pages/hash-collision-probability
It lets you plug in the number of possible values and the number of times you will run your code. 36^24 is roughly 2^124; I originally estimated 2^100 after some googling, which is on the conservative side. Plugging 100 into the "number of bits in your hash" field at the link above means that if you run your code 1000000 times, you will still only have about a 3.944300×10^-19 chance of a collision, that is, of the same value coming up twice. That's minuscule, but you may run into issues if you are writing something that will be used many, many more times than that.
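The calculator's figure can be reproduced with the usual birthday-problem approximation, p ≈ n(n-1) / (2 · 2^bits), which holds while p is small. The sketch below assumes that approximation; the class and method names are made up.

```csharp
using System;

public static class Collision
{
    // Approximate probability that at least two of 'samples' random values
    // collide in a space of 2^bits values. Only valid while the result is small.
    public static double Probability(double samples, int bits)
    {
        double space = Math.Pow(2, bits);
        return samples * (samples - 1) / (2 * space);
    }
}
```

Collision.Probability(1000000, 100) comes out at roughly 3.94×10^-19, in line with the number quoted above.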

IndexOf method returns 0 when it should have returned -1 in C# / Java

A friend of mine came to me with this strange behavior, which I can't explain. Any insight would be appreciated.
I'm running VS 2005 (C# 2.0). The following code shows the behavior:
int rr = "test".IndexOf("");
Console.WriteLine(rr.ToString());
The code above prints "0", when it clearly should have returned -1.
This also happens in Java, where the following class shows the behavior:
public class Test{
    public static void main(String[] args){
        System.out.println("Result->"+("test".indexOf("")));
    }
}
I'm running Java 1.6.0_17.
Quote from the C# documentation:
If value is Empty, the return value is 0.
The behavior that you describe is entirely as expected (at least in C#).
0 is correct. Start at position zero and you can (trivially) match a zero-length string. Likewise, "" contains "".
This is not an exception to the rule, but rather a natural consequence of how indexOf and startsWith are defined.
You’re claiming that "test".indexOf("") should return -1. This is essentially equivalent to the claim that "test".startsWith("") should return false. Why is this? Although this case is specifically addressed in the documentation as returning true, this is not just an arbitrary decision.
How would you decide "test".startsWith("te"), for example? The simplest way is to use recursion. Since both strings start with the character 't', you call "est".startsWith("e") and return the result. Similarly, you will call "st".startsWith("") and return the result. But you already know that the answer should be true, so that is why every string starts with "".
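The recursive argument can be written out directly. This is an illustrative sketch only, not how the framework implements StartsWith; the class name RecursiveDemo is made up.

```csharp
using System;

public static class RecursiveDemo
{
    // Strips one matching character from both strings per call.
    public static bool StartsWith(string text, string prefix)
    {
        // Base case: nothing left to match, so the answer is true.
        if (prefix.Length == 0) return true;
        // Ran out of text, or the first characters differ.
        if (text.Length == 0 || text[0] != prefix[0]) return false;
        return StartsWith(text.Substring(1), prefix.Substring(1));
    }
}
```

Checking "test" against "te" ends in the call StartsWith("st", ""), which hits the base case. Without that base case returning true, no non-empty prefix could ever match.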
0 is correct. The Javadocs point out that indexOf works as follows:
The integer returned is the smallest value k such that:
this.startsWith(str, k)
Any string starting with "" is equal to the original string (and every string starts with ""), so the smallest k for str = "" is always 0.
Think of it this way: IndexOf, when looking for a string, will start at position 0, try to match the string, if it doesn't fit, move on to position 1, 2, etc. When you call it with an empty string, it attempts to match the empty string with the string starting at position 0 with length 0. And hooray, nothing equals nothing.
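That scan can be sketched as a naive search. It is a simplified ordinal version written for illustration, not the actual library code, and NaiveSearch is an invented name.

```csharp
using System;

public static class NaiveSearch
{
    // Tries every start position in order and returns the first one
    // at which 'value' matches, or -1 if there is none.
    public static int IndexOf(string text, string value)
    {
        for (int start = 0; start + value.Length <= text.Length; start++)
        {
            int matched = 0;
            while (matched < value.Length && text[start + matched] == value[matched])
            {
                matched++;
            }
            // For an empty value the inner loop never runs, so matched (0)
            // already equals value.Length (0) at start position 0.
            if (matched == value.Length) return start;
        }
        return -1;
    }
}
```

With an empty search string the very first position succeeds, so the function returns 0 without any special-casing.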
Side note: There's no real reason to use ToString when you're using Console.Write/WriteLine. The function automatically calls the ToString method of the object in question. (Unless overloading ToString)
It should return 0. You are looking for the first occurrence of an empty string, right? :)
More fun: PHP actually does a way better job!
php -r "print strpos('test','');"
PHP Warning: strpos(): Empty delimiter. in Command line code on line 1
Just for the fun of it: it also works like that in Python.
>>> "test".startswith("")
True
>>> "test".index("")
0
When the substring is not found, Python's index raises a ValueError instead of returning -1, which is nice.
>>> "test".index('r')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: substring not found
