Seemingly identical primitive values aren't equal

Seemingly identical primitive values aren't equal - c#

A strange case popped up. Two properties of two objects are being cast to the same primitive type and (seemingly) have the same value. However, the equality comparer returns false. If we use the Equals method (or other means of comparing the two values), then we get the correct result.
Even stranger, actually placing the result of the cast into a new variable also seems to works.
Below is a VERY simplified code example and and it will NOT yield the same results when copied and compiled. It's just used to illustrate the general setting where the problem occurs.
class Program
{
static void Main(string[] args)
{
var v1 = new Object1 { SomeValue = (short)-1d };
var invalidResult = (int)v1.SomeValue == (int)SomeEnum.Value1; //for some reason this returns false
var validResult = ((int)v1.SomeValue).CompareTo((int)SomeEnum.Value1) == 0; //this works
var extraValidResult = ((int)v1.SomeValue).Equals((int)SomeEnum.Value1);
var cast1 = (int)v1.SomeValue;
var cast2 = (int)SomeEnum.Value1;
var otherValidResult = cast1 == cast2; //this also works
}
}
public class Object1
{
public short SomeValue { get; set; }
}
public enum SomeEnum : short
{
Value1 = -1,
Value2 = 0,
Value3 = 1
}
Here's a screenshot of the VS watch window as proof of what we're seeing:
I know sometimes VS can show invalid values in the "watches" window, however the effects aren't limited to that window and a case actually fails an if check where it should not in one of our tests. AFAIK there's no trickery in the code (like overriding == or Equals).
What could possibly be happening here?
(We've obviously "fixed" the issue using the CompareTo method, but we're all still scratching our heads wondering what exactly had happened...)
EDIT:
I realize the code example above is... a tad bit useless. However posting the actual code in question might prove to be very difficult; there's a lot of it. Finally, the "live" values of some objects are populated from a SQL server (using Entity Framework), which complicates sharing of the code even further. I'm happy to try and answer any additional questions to try and narrow down the issue, but sharing of the FULL code is, unfortunately, not possible (a specific block of it is possible, but it won't compile for obvious reasons). The example code was provided to show how strange the issue is.
EDIT 2:
Sorry for the delay. Here's the particular method in question:
public bool IsLocalizationBlockedByMagPElem(int localizationId)
{
IEnumerable<MagPElem> magPElems = MagPElemRepository.GetByLocalizationIdAndStatusesOrderedByIdDescThenSubLpAsc(localizationId, DocumentStatus.InBuffer, DocumentStatus.InBufferReedition);
if (magPElems.Count() != 0)
{
var magPElem = magPElems.First();
//this commented out code did not return the expected value due to the strange comparison issue
//return (magPElem.MaP_GIDTyp == (int)ExternalSystemType.PM_GIDTyp || (magPElem.MaP_GIDTyp == (int)ExternalSystemType.MP_GIDTyp && magPElem.MaP_SubGIDLp == (short)LocalizationDirection.Destination));
//to avoid the issue CompareTo is being used, but Equals would work just as well
return (magPElem.MaP_GIDTyp == (int)ExternalSystemType.PM_GIDTyp || (magPElem.MaP_GIDTyp == (int)ExternalSystemType.MP_GIDTyp && magPElem.MaP_SubGIDLp.CompareTo((short)LocalizationDirection.Destination) == 0));
}
return false;
}
I've also updated the proof screenshot to encompass more of the screen. In the screenshot there are some minor discrepancies with the above code as we were testing to see what's going on (like trying out different casts or assigning the cast result to a variable). But the gist of the problem is there.

Related

bugs in java code after converting from C#

here is a function prints repeating int in a array.
in c#:
int [] ReturnDups(int[] a)
{
int repeats = 0;
Dictionary<int, bool> hash = new Dictionary<int>();
for(int i = 0; i < a.Length i++)
{
bool repeatSeen;
if (hash.TryGetValue(a[i], out repeatSeen))
{
if (!repeatSeen)
{
hash[a[i]] = true;
repeats ++;
}
}
else
{
hash[a[i]] = false;
}
}
int[] result = new int[repeats];
int current = 0;
if (repeats > 0)
{
foreach(KeyValuePair<int,bool> p in hash)
{
if(p.Value)
{
result[current++] = p.Key;
}
}
}
return result;
}
now converted to JAVA by Tangible software's tool.
in java:
private int[] ReturnDups(int[] a)
{
int repeats = 0;
java.util.HashMap<Integer, Boolean> hash = new java.util.HashMap<Integer>();
for (int i = 0; i < a.length i++)
{
boolean repeatSeen = false;
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
{
if (!repeatSeen)
{
hash.put(a[i], true);
repeats++;
}
}
else
{
hash.put(a[i], false);
}
}
int[] result = new int[repeats];
int current = 0;
if (repeats > 0)
{
for (java.util.Map.Entry<Integer,Boolean> p : hash.entrySet())
{
if (p.getValue())
{
result[current++] = p.getKey();
}
}
}
return result;
}
but findbug find this line of code as bugs. and it looks very odd to me too.
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
can someone pls explain to me what this line does and how do i write it in java properly?
thanks

You have overcomplicated the code for TryGetValue - this simple translation should work:
if ( hash.containsKey(a[i]) ) {
if (!hash.get(a[i])) {
hash.put(a[i], true);
}
} else {
hash.put(a[i], false);
}
C# has a way to get the value and a flag that tells you if the value has been found in a single call; Java does not have a similar API, because it lacks an ability to pass variables by reference.

Do not directly convert C# implementation. assign repeatSeen value only if the id is there.
if (hash.containsKey(a[i]))
{
repeatSeen = hash.get(a[i]).equals(repeatSeen)
if (!repeatSeen)
{
hash.put(a[i], true);
repeats++;
}
}

To answer the actual question that was asked:
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
is indeed syntactically wrong. I haven't looked at the rest of the code, but having written parsers/code-generators in my time I'm guessing it was supposed to be
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen) : false)
It's gratuitously ugly -- which often happens with code generators, especially ones without an optimizing pass -- but it's syntactically correct. Let's see if it actually does have a well-defined meaning.
CAVEAT: I haven't crosschecked this by running it -- if someone spots an error, please tell me!
First off, x?y:z is indeed a ternary operator, which Java inherited from C via C++. It's an if-then-else expression -- if x is true it has the value y, whereas if x is false it has the value z. So this one-liner means the same thing as:
boolean implied;
if (hash.containsKey(a[i]) then
implied = (repeatSeen = hash.get(a[i])) == repeatSeen);
else
implied = false;
if(implied)
... and so on.
Now, the remaining bit of ugliness is the second half of that and-expression. I don't know if you're familiar with the use of = (assignment) as an expression operator; its value as an operator is the same value being assigned to the variable. That's mostly intended to let you do things like a=b=0;, but it can also be used to set variables "in passing" in the middle of an expression. Hardcore C hackers do some very clever, and ugly, things with it (he says, being one)... and here's it's being used to get the value from the hashtable, assign it to repeatSeen, and then -- via the == -- test that same value against repeatSeen.
Now the question is, what order are the two arguments of == evaluated in? If the left side is evaluated first, the == must always be true because the assignment will occur before the right-hand side retrieves the value. If the right side is evaluated first, we'd be comparing the new value against the previous value, in an very non-obvious way.
Well, in fact, there's another StackOverflow entry which addresses that question:
What are the rules for evaluation order in Java?
According to that, the rule for Java is that the left argument of an operator is always evaluated before the right argument. So the first case applies, the == always returns true.
Rewriting our translation one more time to reflect that, it turns into
boolean implied;
if (hash.containsKey(a[i]) then
{
repeatSeen = hash.get(a[i]));
implied = true;
}
else
implied = false;
if(implied)
Which could be further rewritten as
if (hash.containsKey(a[i]) then
{
repeatSeen = hash.get(a[i]));
// and go on to do whatever else was in the body of the original if statement
"If that's what they meant, why didn't they just write it that way?" ... As I say, I've written code generators, and in many cases the easiest thing to do is just make sure all the fragments you're writing are individually correct for what they're trying to do and not worry about whether they at all resemble what a human would have written do do the same thing. In particular, it's tempting to generate code according to templates which allow for cases you may not actually use, rather than trying to recognize the simpler situation and generate code differently.
I'm guessing that the compiler was drawing in and translating bits of computation as it realized it needed them, and that this created the odd nesting as it started the if, then realized it needed a conditional assignment to repeatSeen, and for whatever reason tried to make that happen in the if's test rather than in its body. Believe me, I've seen worse kluging from code generators.

c# which is the better way to declare two parameters

I have a very quick question about the best way to use two variables. Essentially I have an enum and an int, the value for which I want to get within several ifs. Should I declare them outside the if's or inside - consider the following examples:
e.g.a:
public void test() {
EnumName? value = null;
int distance = 0;
if(anotherValue == something) {
distance = 10;
value = getValue(distance);
}
else if(anotherValue == somethingElse) {
distance = 20;
value = getValue(distance);
}
if (value == theValueWeWant){
//Do something
}
OR
e.g.2
public void test() {
if(anotherValue == something) {
int distance = 10;
EnumType value = getValue(distance);
if (value == theValueWeWant){
//Do something
}
else if(anotherValue == somethingElse) {
int distance = 20;
EnumType value = getValue(distance);
if (value == theValueWeWant){
//Do something
}
}
I am just curious which is best? or if there is a better way?

Purely in terms of maintenance, the first code block is better as it does not duplicate code (assuming that "Do something" is the same in both cases).
In terms of performance, the difference should be negligible. The second case does generate twice as many locals in the compiled IL, but the JIT should notice that their usage does not overlap and optimize them away. The second case is also going to cause emission of the same code twice (if (value == theValueWeWant) { ...), but this should also not cause any significant performance penalty.
(Though both aspects of the second example will cause the compiled assembly to be very slightly larger, more IL does not always imply worse performance.)

Both examples do two different things:
Version 1 will run the same code if you get the desired value, where as Version 2 will potentially run different code even if you get the desired value.
There's a lot of possible (micro)optimizations you could do.
For Example, if distance is only ever used in getValue(distance), you could get rid of it entirely:
/*Waring, micro-optimization!*/
public void test() {
EnumType value = getValue((anotherValue == something) ? 10 : (anotherValue == somethingElse) ? 20 : 0);
if (value == theValueWeWant){
//Do something
}
}

If you wish to use those later on, then g for the second method. Those variables will be lost as soon as they're out of scope.
Even if you don't want to use them later, declaring them before the if's is something you should do, to avoid code repetition.

This question is purely a matter of style and hence has no correct answer, only opinions
The C# best practice is generally to declare variables in the scope where they are used. This would point to the second example as the answer. Even though the types and names are the same, they represent different uses and should be constrained to the blocks in which they are created.

Objects with many value checks c#

I want to see your ideas on a efficient way to check values of a newly serialized object.
Example I have an xml document I have serialized into an object, now I want to do value checks. First and most basic idea I can think of is to use nested if statments and checks each property, could be from one value checking that it has he correct url format, to checking another proprieties value that is a date but making sue it is in the correct range etc.
So my question is how would people do checks on all values in an object? Type checks are not important as this is already taken care of it is more to do with the value itself. It needs to be for quite large objects this is why I did not really want to use nested if statements.
Edit:
I want to achieve complete value validation on all properties in a given object.
I want to check the value it self not that it is null. I want to check the value for specific things if i have, an object with many properties one is of type string and named homepage.
I want to be able to check that the string in the in the correct URL format if not fail. This is just one example in the same object I could check that a date is in a given range if any are not I will return false or some form of fail.
I am using c# .net 4.

Try to use Fluent Validation, it is separation of concerns and configure validation out of your object

public class Validator<T>
{
List<Func<T,bool>> _verifiers = new List<Func<T, bool>>();
public void AddPropertyValidator(Func<T, bool> propValidator)
{
_verifiers.Add(propValidator);
}
public bool IsValid(T objectToValidate)
{
try {
return _verifiers.All(pv => pv(objectToValidate));
} catch(Exception) {
return false;
}
}
}
class ExampleObject {
public string Name {get; set;}
public int BirthYear { get;set;}
}
public static void Main(string[] args)
{
var validator = new Validator<ExampleObject>();
validator.AddPropertyValidator(o => !string.IsNullOrEmpty(o.Name));
validator.AddPropertyValidator(o => o.BirthYear > 1900 && o.BirthYear < DateTime.Now.Year );
validator.AddPropertyValidator(o => o.Name.Length > 3);
validator.Validate(new ExampleObject());
}

I suggest using Automapper with a ValueResolver. You can deserialize the XML into an object in a very elegant way using autommaper and check if the values you get are valid with a ValueResolver.
You can use a base ValueResolver that check for Nulls or invalid casts, and some CustomResolver's that check if the Values you get are correct.
It might not be exacly what you are looking for, but I think it's an elegant way to do it.
Check this out here: http://dannydouglass.com/2010/11/06/simplify-using-xml-data-with-automapper-and-linqtoxml

In functional languages, such as Haskell, your problem could be solved with the Maybe-monad:
The Maybe monad embodies the strategy of combining a chain of
computations that may each return Nothing by ending the chain early if
any step produces Nothing as output. It is useful when a computation
entails a sequence of steps that depend on one another, and in which
some steps may fail to return a value.
Replace Nothing with null, and the same thing applies for C#.
There are several ways to try and solve the problem, none of them are particularly pretty. If you want a runtime-validation that something is not null, you could use an AOP framework to inject null-checking code into your type. Otherwise you would really have to end up doing nested if checks for null, which is not only ugly, it will probably violate the Law of Demeter.
As a compromise, you could use a Maybe-monad like set of extension methods, which would allow you to query the object, and choose what to do in case one of the properties is null.
Have a look at this article by Dmitri Nesteruk: http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
Hope that helps.

I assume your question is: How do I efficiently check whether my object is valid?
If so, it does not matter that your object was just deserialized from some text source. If your question regards checking the object while deserializing to quickly stop deserializing if an error is found, that is another issue and you should update your question.
Validating an object efficiently is not often discussed when it comes to C# and administrative tools. The reason is that it is very quick no matter how you do it. It is more common to discuss how to do the checks in a manner that is easy to read and easily maintained.
Since your question is about efficiency, here are some ideas:
If you have a huge number of objects to be checked and performance is of key importance, you might want to change your objects into arrays of data so that they can be checked in a consistent manner. Example:
Instead of having MyObject[] MyObjects where MyObject has a lot of properties, break out each property and put them into an array like this:
int[] MyFirstProperties
float[] MySecondProperties
This way, the loop that traverses the list and checks the values, can be as quick as possible and you will not have many cache misses in the CPU cache, since you loop forward in the memory. Just be sure to use regular arrays or lists that are not implemented as linked lists, since that is likely to generate a lot of cache misses.
If you do not want to break up your objects into arrays of properties, it seems that top speed is not of interest but almost top speed. Then, your best bet is to keep your objects in a serial array and do:
.
bool wasOk = true;
foreach (MyObject obj in MyObjects)
{
if (obj.MyFirstProperty == someBadValue)
{
wasOk = false;
break;
}
if (obj.MySecondProperty == someOtherBadValue)
{
wasOk = false;
break;
}
}
This checks whether all your objects' properties are ok. I am not sure what your case really is but I think you get the point. Speed is already great when it comes to just checking properties of an object.
If you do string compares, make sure that you use x = y where possible, instead of using more sophisticated string compares, since x = y has a few quick opt outs, like if any of them is null, return, if the memory address is the same, the strings are equal and a few more clever things if I remember correctly. For any Java guy reading this, do not do this in Java!!! It will work sometimes but not always.
If I did not answer your question, you need to improve your question.

I'm not certain I understand the depth of your question but, wouldn't you just do somthing like this,
public SomeClass
{
private const string UrlValidatorRegex = "http://...
private const DateTime MinValidSomeDate = ...
private const DateTime MaxValidSomeDate = ...
public string SomeUrl { get; set; }
public DateTime SomeDate { get; set; }
...
private ValidationResult ValidateProperties()
{
var urlValidator = new RegEx(urlValidatorRegex);
if (!urlValidator.IsMatch(this.Someurl))
{
return new ValidationResult
{
IsValid = false,
Message = "SomeUrl format invalid."
};
}
if (this.SomeDate < MinValidSomeDate
|| this.SomeDate > MinValidSomeDate)
{
return new ValidationResult
{
IsValid = false,
Message = "SomeDate outside permitted bounds."
};
}
...
// Check other fields and properties here, return false on failure.
...
return new ValidationResult
{
IsValid = true,
};
}
...
private struct ValidationResult
{
public bool IsValid;
public string Message;
}
}
The exact valdiation code would vary depending on how you would like your class to work, no? Consider a property of a familar type,
public string SomeString { get; set; }
What are the valid values for this property. Both null and string.Empty may or may not be valid depending on the Class adorned with the property. There may be maximal length that should be allowed but, these details would vary by implementation.
If any suggested answer is more complicated than code above without offering an increase in performance or functionality, can it be more efficient?
Is your question actually, how can I check the values on an object without having to write much code?

Why is this string property shown to be fully covered when it's not supposed to be?

class Program
{
static void Main(string[] args)
{
var x = new Program();
Console.Write(x.Text);
Console.Write(x.Num);
//Console.Write(x.Num);//line A
}
private string Text_;
public string Text
{
get
{
return Text_ ?? (Text_ = "hello");//line B
}
}
private int? Num_;
public int Num
{
get
{
return (int)(Num_ ?? (Num_ = 42));//line C
}
}
}
I'm using Visual Studio 2010 to get code coverage results. It shows that line B is totally covered while line C is partially covered. I would expect that line B is partially covered as well instead of being totally covered. Why is it the case that the code coverage results shows line B as being totally covered?
To demonstrate that it's working "correctly" for Num property, uncomment line A and run the coverage. It should show line C as totally being covered.
When I rewrite the code to a more verbose form (see below), it works correctly and reports that Text_ is partially covered. I prefer to use the former for its brevity, and would like to know if the two forms are equivalent too. Thanks in advance.
if (Text_ != null)
{
return Text_;
}
else
{
return Text_ = "hello";
}

The ?? on Nullable<T> expands to something more complex at the compiler, i.e.
a ?? b
is really
a.HasValue ? a.GetValueOrDefault() : b
Now, since a was null/empty for the only time it was executed, the a.GetValueOrDefault() has never been called. The underlying code when using a string (or any flat reference) is simpler.
In reality, just call it twice to make this go away:
Console.Write(x.Text); // first call; performs init
Console.Write(x.Text); // test once initialized
Console.Write(x.Num); // first call; performs init
Console.Write(x.Num); // test once initialized

Without going into the IL generated by this line I'd suspect that the compiler has done some internal optimisation that causes your "line B" to be regarded as a single statement.
The principal difference between the two statements is that line B refers to an object, whereas line C uses a shorthand reference to Num_.Value.
In general, it's definitely not worth getting stressed about "100% code coverage" -- as this example indicates, sometimes the instrumentation is wrong, anyway. What's important is that most of the code is covered, at least the principal execution paths through business logic.
Coverage is no replacement for code reviews by another human.

C# Reflection Property.GetValue() Problem

I am having a problem with the following code:
int errorCount = 0;
foreach (var cinf in client.GetType().GetProperties())
{
var vinf = viewModel.GetType().GetProperty(cinf.Name);
if (vinf != null)
{
if (cinf.GetValue(client, null) != vinf.GetValue(viewModel, null))
{
errorCount++;
}
}
}
It's for an automated test to see whether mapping a model object from a DTO has worked. If I use the more cumbersome approch of writing this for each property:
Assert.AreEqual(viewModel.ClientCompanyID, client.ClientCompanyID);
This works fine.
The problem is: the reflection code evaluates "if val1 != val2" statement incorrectly (or so it seems). If I step through this code, the evaluation basically says "1 does not equal 1", and incorrectly adds an error. Furthermore, if I test this with this code, I get the same seemingly false result:
var clientEx = client.GetType().GetProperty("ClientCompanyID");
var viewModelEx = viewModel.GetType().GetProperty("ClientCompanyID");
var clientVal = clientEx.GetValue(client, null);
var viewModelVal = viewModelEx.GetValue(viewModel, null);
bool test = (clientVal == viewModelVal);
The bool returns false even when, stepping through the code, clientVal = 1 and viewModelVal = 1. See attached picture.
Any help with this would be greatly appreciated!
Thanks guys.
Tim.
EDIT: Could have given you all the answer. Glad it was simple in the end. Thanks alot for your help. Cheers.

You need to compare with object.Equals() instead of using reference equality. Boxed value types will not compare as equal without using object.Equals(). Try this:
if (!object.Equals(cinf.GetValue(client, null), vinf.GetValue(viewModel, null)))
For example, take this simple case:
csharp> object a = 1;
csharp> object b = 1;
csharp> a == b;
false
csharp> object.Equals(a, b);
true

You're comparing to different boxed integers by reference.
Change it to
if (!Equals(cinf.GetValue(client, null), vinf.GetValue(viewModel, null))
This calls the static Object.Equals method, which will call the virtual Object.Equals method (after checking for null) to compare objects by value.
You'll see the same issue for strings.

This is only natural. If you compare to objects together using ==, it will compare their reference which is different.
Use objectA.Equals(objectB).

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.