I am doing some custom serialization, and to save space I want to serialize the decimals as ints when the value allows it. Performance is a concern, since I am dealing with a high volume of data. The current method I use is:
if ((value > Int32.MinValue) && (value < Int32.MaxValue) && ((valueAsInt = Decimal.ToInt32(value)) == value))
{
    return true;
}
Can this be improved?
Do you have any negative values? I'm guessing yes, since you have the MinValue check; otherwise you can skip it. You could even use an unsigned int, which would let you convert more of your decimal values to ints.
Edit: Also, if you have more positive numbers, you can swap the first two conditions. That way the first one is the most likely to fail, decreasing the total number of comparisons.
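A minimal sketch of that reordering, keeping the rest of the condition exactly as in the question (assuming most values are positive, so the MaxValue test is the one most likely to fail):
// Same check as in the question, with the two range tests swapped so the
// comparison most likely to fail short-circuits the chain sooner.
if ((value < Int32.MaxValue) && (value > Int32.MinValue) && ((valueAsInt = Decimal.ToInt32(value)) == value))
{
    return true;
}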
Your invalidation criteria are:
1) Is it greater than MaxValue?
2) Is it smaller than MinValue?
3) Does it contain a fractional component?
It sounds like you have them covered. My implementation would be:
public bool IsConvertibleToInt(decimal value)
{
    if (value > int.MaxValue)
        return false;
    if (value < int.MinValue)
        return false;
    if (Math.Floor(value) < value && Math.Ceiling(value) > value)
        return false;
    return true;
}
How about this? I think it should take fewer operations (or at least fewer comparisons):
return (value == (Int32)value);
Also remember, if an if statement simply returns a boolean, you can just return the comparison. That alone might make it faster (unless the compiler already optimizes for this). If you have to use the if statement, you can similarly do this:
if (value == (Int32)value)
{
    //Do stuff...
    return true;
}
else
{
    //Do stuff...
    return false;
}
EDIT: I realize this doesn't actually work. I was thinking the Int32 cast would just copy in the first 32 bits from the decimal, leaving behind any remaining bits (and not throw an exception), but alas, it didn't work that way (not to mention it would be wrong for all negative values).
It depends on how many decimal places you have, or really care about. If you can say you only care about up to 3 decimal places, then the largest number you can store in an int is int.MaxValue / 1000. If you are only working with positive numbers, you can raise that ceiling by using uint. In any case, the approach is to consistently reserve space for the decimal places: multiply by 1000 to encode and divide by 1000 to decode to/from decimal.
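A rough sketch of that encoding, assuming 3 decimal places; the class and method names here are invented purely for illustration:
// Illustrative only: fixed-point encoding with 3 decimal places (scale factor 1000).
static class FixedPoint3
{
    const int Scale = 1000;

    public static int Encode(decimal value)
    {
        // Throws OverflowException if value * 1000 does not fit in an int.
        return decimal.ToInt32(decimal.Round(value * Scale, 0));
    }

    public static decimal Decode(int encoded)
    {
        return (decimal)encoded / Scale;
    }
}
For example, Encode(12.345m) gives 12345 and Decode(12345) gives 12.345m back.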
No need for "valueAsInt =". I believe (Decimal.ToInt32(value) == value)) gets you the same result with one less assignment. Are you using valueAsInt as some sort of output parameter?
Wouldn't you be able to just do something like:
if (Decimal.ToInt32(value) == value)
{
    return true;
}
Not an expert on .NET, but I think that should be all it'd require. Also, your two comparison operators should be 'or equal', since the min/max values are also valid.
Edit: As pointed out in the comment, this would throw an exception. You could try catching the exception and returning false, but at that point it likely would be much faster to do the min/max testing yourself.
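For completeness, a sketch of that exception-based variant (the method name is invented here; the explicit range checks from the question are usually the faster choice when out-of-range values are common):
// Sketch only: Decimal.ToInt32 truncates fractional digits and throws
// OverflowException when the value is outside the Int32 range.
public static bool FitsInInt32(decimal value)
{
    try
    {
        return Decimal.ToInt32(value) == value;
    }
    catch (OverflowException)
    {
        return false;
    }
}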
I've been struggling to get my head around a natural way of using TryParse because I keep expecting it to work the other way around (i.e. to return the parsed value and emit the boolean for whether the input parsed).
For example, if we take a basic implementation of Parse, the return value is the parsed input:
int parsedValue = int.Parse(input);
This works fine until it gets a value that it can't parse, at which point it entirely reasonably throws an exception. To me, the option then is either to wrap the parse in something like a try-catch block to handle the exception and a condition to set a default, or just to use TryParse to let C# do all that for me. Except that's not how TryParse works. The above example now looks like this:
bool parseSucceeded = int.TryParse(input, out int parsedValue);
To get it to assign in the same way as Parse, I wrap it in a ternary conditional with parsedValue and a default value (in this case, 0) as the true and false results respectively:
int parsedValue = int.TryParse(input, out parsedValue) ? parsedValue : 0;
But I still feel like I'm missing the point with TryParse if I'm just working around its default behaviour like this. I've read Tim Schmelter's excellent answer in which he shows its internal workings, from which I can suppose that it returns the boolean because it's easier internally than passing it out at all the various places that it currently returns. But I'm not sure about this, and I'm not satisfied that I understand its intent correctly. I also tried reading the documentation for it, but the remarks don't clear up my confusion (I don't think they even make its differences with Parse clear enough, like the change in return type).
Am I understanding it correctly, or am I missing something?
Sure, it could have been implemented as
int TryParse(string input, out bool succeeded)
{
}
But as mentioned in a comment, the common use case for the function is:
string input;
int parsedValue;
if (int.TryParse(input, out parsedValue))
{
    // use parsedValue here
}
With the signature you propose, that code would now be:
string input;
bool succeeded;
int parsedValue = int.TryParse(input, out succeeded);
if (succeeded)
{
    // use parsedValue here
}
So there's more code for no functional benefit. Also, with your ternary operator, if the parse fails you just set a value of zero, which is unnecessary, since the out parameter is already set to 0 when the parse fails. You could just do:
int parsedValue;
int.TryParse(input, out parsedValue);
If the parse fails, parsedValue will have a value of 0. (I also question how you'd distinguish an actual result of 0 from a failed parse, but I'm sure you have a reason.)
So there's no technical reason why the signature is the way it is; it's a design decision that is appropriate for the most common use cases.
Of course, now with tuples in C# 7 you could have:
(int parsedValue, bool succeeded) = int.TryParse(input);
but again there's little functional benefit, and it prevents you from inlining the TryParse in an if statement.
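If you really want the tuple shape today, you can wrap the existing method yourself; the extension method name below is invented for the example:
// Hypothetical wrapper (not part of the framework) exposing int.TryParse as a tuple.
public static class ParseExtensions
{
    public static (int parsedValue, bool succeeded) TryParseInt(this string input)
    {
        bool succeeded = int.TryParse(input, out int parsedValue);
        return (parsedValue, succeeded);
    }
}
// Usage: var (parsedValue, succeeded) = input.TryParseInt();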
Because logically you would want to check that the TryParse succeeded before trying to use the out value.
So this is more concise:
if (int.TryParse(input, out int parsedValue))
{
    // Do something with parsedValue
}
Than this:
int parsedValue = int.TryParse(input, out bool succeeded);
if (succeeded)
{
    // Do something with parsedValue
}
I think a large part of your confusion stems from the fact that the method isn't named quite right:
int parsedValue = int.Parse("42");
This makes perfect sense: give me the integer representation of a string.
int parsedValue = int.TryParse(input);
This makes sense as an extension of the concept: input might be '42' or 'Oswald', but if it's a number, I want that number.
In 2020, I think a better name would be CanParse(string input, out int result).
It better matches style guides and naming conventions, where a method returning a bool should be named with Is, Has, or Can.
It better matches how we use TryParse 99% of the time:
if (int.CanParse(input, out int result))
{
    return result * 10;
}
But where I feel the current name makes sense, is the problem I assume it was trying to solve: To get rid of the following boilerplate code:
int result;
bool hasValidNumber = false;
try
{
    result = int.Parse(input);
    hasValidNumber = true;
}
catch
{
    // swallow this exception
}
if (hasValidNumber)
{
    // do things with result
}
else
{
    // use a default or other logic
}
I have the following:
string outOfRange = "2147483648"; // +1 over int.MaxValue
Obviously if you have anything other than a number this will fail:
var defaultValue = 0;
int.TryParse(outOfRange, out defaultValue);
My question is, since this IS a number, and it WILL fail when you int.TryParse(), how do you tell that it failed because the string was out of the bounds of the container it's stored in?
I'd go with the Try/Catch solution for this scenario.
string outOfRange = "2147483648";
try
{
int.Parse(outOfRange);
}
catch (OverflowException oex)
{
}
catch (Exception ex)
{ }
I know that most people here would recommend avoiding this but sometimes we just have to use it (or we don't have to but it would just save us a lot of time).
Here's a little post about the efficiency of Try/Catch.
You can parse to decimal and then check the range, which avoids try/catch:
string s = "2147483648";
bool outOfRange = decimal.Parse(s) > int.MaxValue;
I would attempt to parse, if it fails, then attempt to parse a higher-capacity value. If the higher capacity value passes parsing, then you know it's out of range. If it fails as well, then it's bad input.
string outOfRange = "2147483648"; // +1 over int.MaxValue
int result;
if (!Int32.TryParse(outOfRange, out result))
{
long rangeChecker;
if (Int64.TryParse(outOfRange, out rangeChecker))
//out of range
else
//bad format
}
Unfortunately, I don't think there's a way to do this generically for any type; you'd have to write an implementation for each type. For example, what to do for Int64? Maybe use BigInteger instead:
string outOfRange = "9223372036854775808"; // +1 over Int64.MaxValue
long result;
if (!Int64.TryParse(outOfRange, out result))
{
BigInteger rangeChecker;
if (BigInteger.TryParse(outOfRange, out rangeChecker))
//out of range
else
//bad format
}
EDIT: double floating-point values may be more fun, since AFAIK there's no "BigDecimal", and you may also have to account for values that approach 0 at the very extreme (not sure about that). Possibly you could do a variation on the BigInteger check, but you'd also have to account for decimal points (a simple regex allowing only digits, an optional negative sign, and at most one decimal point would probably be best here). If there is a decimal point, you'd have to truncate it out and simply check the integer portion of the string.
EDITx2: Here's a pretty ugly implementation for checking double values too:
// +bajillion over Double.MaxValue
// requires: using System.Numerics; using System.Text.RegularExpressions;
string outOfRange = "90000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.1";
double result;
if (!Double.TryParse(outOfRange, out result))
{
    string bigIntegerInput = outOfRange;
    if (!Regex.IsMatch(bigIntegerInput, @"^-?[0-9]\d*(\.\d+)?$"))
    {
        // bad format; stop here in real code
    }
    int decimalIndex = bigIntegerInput.IndexOf('.');
    if (decimalIndex > -1)
        bigIntegerInput = bigIntegerInput.Substring(0, decimalIndex);
    BigInteger rangeChecker;
    if (BigInteger.TryParse(bigIntegerInput, out rangeChecker))
    {
        // out of range
    }
    else
    {
        // bad format
    }
}
But honestly, at this point I think we've gone off the deep end. Unless you have a real performance bottleneck, or your application receives out-of-range values frequently, you might be better off just catching them the odd time it happens, as in this answer, or, more simply, applying a regex to the input. In my last example I might as well have just stopped after the regex anyway (though I don't know off the top of my head whether the TryParse implementations are more lenient, e.g. allowing exponential/scientific notation; if so, the regex would have to cover that as well).
string outOfRange = "2147483648"; // +1 over int.MaxValue
int value;
if(! int.TryParse(outOfRange, out value)) {
try {
int.Parse(defaultValue);
} catch(OverflowException e) {
// was overflow
} catch(Exception e) {
// was other reason
}
}
Assuming that there are few cases where the number is too large, the overhead of exception throwing and catching may be tolerable, as the normal cases are handled with the faster TryParse method without involving exceptions.
This would work similarly for other numeric data types, like floats.
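As a sketch, here is the same pattern for short (Int16). Note that for float/double the overflow behavior is less clear-cut (older .NET Framework versions throw OverflowException for out-of-range values, while newer runtimes return infinity instead), so the integral types are the cleanest fit for this trick:
string input = "40000"; // over short.MaxValue
short value;
if (!short.TryParse(input, out value))
{
    try
    {
        short.Parse(input);
    }
    catch (OverflowException)
    {
        // numeric, but out of range for short
    }
    catch (FormatException)
    {
        // not a number at all
    }
}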
You could try parsing with BigInteger.
BigInteger bigInt;
bool isAnOutOfRangeInt = BigInteger.TryParse(input, out bigInt)
&& (bigInt > int.MaxValue || bigInt < int.MinValue);
// if you care to have the value as an int:
if (!isAnOutOfRangeInt)
{
    int intValue = (int)bigInt;
}
Use the normal Parse instead of TryParse, and call it inside a try/catch, because it will give you the appropriate exception. See this for details: http://msdn.microsoft.com/en-us/library/b3h1hf19.aspx. The exception you are looking for is OverflowException.
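A minimal sketch of that suggestion:
string input = "2147483648"; // +1 over int.MaxValue
try
{
    int value = int.Parse(input);
    // use value
}
catch (OverflowException)
{
    // numeric, but outside the int range
}
catch (FormatException)
{
    // not a number at all
}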
I would look at using System.Convert.ToInt32(String) as the mechanism to convert things; namely because OverflowException has already been implemented for you.
This is convenient because you can do something simple like
try
{
    result = Convert.ToInt32(value);
    Console.WriteLine("Converted the {0} value '{1}' to the {2} value {3}.",
        value.GetType().Name, value, result.GetType().Name, result);
}
catch (OverflowException)
{
    Console.WriteLine("{0} is outside the range of the Int32 type.", value);
}
catch (FormatException)
{
    Console.WriteLine("The {0} value '{1}' is not in a recognizable format.",
        value.GetType().Name, value);
}
and the logic is already a part of the standard System library.
The straightforward way would be to instead use Int32.Parse(string s) and catch the OverflowException:
OverflowException
s represents a number less than MinValue or greater than MaxValue.
I have a very quick question about the best way to use two variables. Essentially I have an enum and an int, whose values I want to set within several ifs. Should I declare them outside the ifs or inside? Consider the following examples:
e.g.1:
public void test() {
    EnumName? value = null;
    int distance = 0;
    if (anotherValue == something) {
        distance = 10;
        value = getValue(distance);
    }
    else if (anotherValue == somethingElse) {
        distance = 20;
        value = getValue(distance);
    }
    if (value == theValueWeWant) {
        //Do something
    }
}
OR
e.g.2
public void test() {
    if (anotherValue == something) {
        int distance = 10;
        EnumType value = getValue(distance);
        if (value == theValueWeWant) {
            //Do something
        }
    }
    else if (anotherValue == somethingElse) {
        int distance = 20;
        EnumType value = getValue(distance);
        if (value == theValueWeWant) {
            //Do something
        }
    }
}
I am just curious which is best? or if there is a better way?
Purely in terms of maintenance, the first code block is better as it does not duplicate code (assuming that "Do something" is the same in both cases).
In terms of performance, the difference should be negligible. The second case does generate twice as many locals in the compiled IL, but the JIT should notice that their usage does not overlap and optimize them away. The second case is also going to cause emission of the same code twice (if (value == theValueWeWant) { ...), but this should also not cause any significant performance penalty.
(Though both aspects of the second example will cause the compiled assembly to be very slightly larger, more IL does not always imply worse performance.)
The two examples do different things:
Version 1 will run the same code whenever you get the desired value, whereas Version 2 will potentially run different code even when you get the desired value.
There are a lot of possible (micro-)optimizations you could do.
For example, if distance is only ever used in getValue(distance), you could get rid of it entirely:
/* Warning, micro-optimization! */
public void test() {
    EnumType value = getValue((anotherValue == something) ? 10 : (anotherValue == somethingElse) ? 20 : 0);
    if (value == theValueWeWant) {
        //Do something
    }
}
If you wish to use those values later on, go for the first method; with the second, the variables are lost as soon as they go out of scope.
Even if you don't want to use them later, declaring them before the ifs is something you should do anyway, to avoid code repetition.
This question is purely a matter of style and hence has no correct answer, only opinions
The C# best practice is generally to declare variables in the scope where they are used. This would point to the second example as the answer. Even though the types and names are the same, they represent different uses and should be constrained to the blocks in which they are created.
I need to limit the value on overflow.
I implemented this as follows:
public static sbyte LimitValueToSByte(this int val)
{
    if (val > sbyte.MaxValue) return sbyte.MaxValue;
    if (val < sbyte.MinValue) return sbyte.MinValue;
    return (sbyte)val;
}
Is there a more elegant way?
This is the code in a time critical system, therefore performance is important.
That seems like perfectly readable and valid code that doesn't need any improvement whatsoever. It's just the name of the method: maybe use ToSbyte rather than LimitValueToSByte.
Can't think of a better way to write that function.
I'd call it ClampToSByte, since this kind of limiting operation is usually called Clamp. Limit is a bit less specific, and allows for other boundary conditions, such as wrapping around.
You should be careful if you implement similar code for floating point numbers. In particular you need to decide what behavior you want for NaNs and signed zeros. But luckily that are no issues with integral values.
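As a sketch of what that decision might look like for float (the method name and the NaN policy here are just one arbitrary choice among several reasonable ones):
// Without the explicit IsNaN check, NaN would slip through unchanged, because
// every comparison involving NaN evaluates to false.
public static float Clamp(this float val, float min, float max)
{
    if (float.IsNaN(val)) return min; // deliberate policy choice
    if (val > max) return max;
    if (val < min) return min;
    return val;
}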
Looks pretty good to me. If you want something more elegant, how about a generic clamp function?
public static T Clamp<T>(this T value, T min, T max)
    where T : IComparable<T>
{
    if (value.CompareTo(min) <= 0) return min;
    if (value.CompareTo(max) >= 0) return max;
    return value;
}
(Warning: I didn't test this.) You could use it like this:
int a = 42;
sbyte b = (sbyte)a.Clamp(sbyte.MinValue, sbyte.MaxValue);
My new solution to this problem:
public static sbyte Clamp(this int val)
{
    return (sbyte)Math.Max(Math.Min(val, sbyte.MaxValue), sbyte.MinValue);
}
Often I find myself with an expression where a division by an int is part of a larger formula. Here is a simple example that illustrates the problem:
int a = 2;
int b = 4;
int c = 5;
int d = a * (b / c);
In this case, d equals 0 as expected, but I would like it to be 1, since 4/5 multiplied by 2 is 1 3/5, which when converted to int gets "rounded" to 1. So I find myself having to cast c to double, and then, since that makes the whole expression a double, casting the entire expression back to int. That code looks like this:
int a = 2;
int b = 4;
int c = 5;
int d = (int)(a * (b / (double)c));
In this small example it's not that bad, but in a big formula it gets quite messy.
Also, I guess that casting will take a (small) hit on performance.
So my question is basically if there is any better approach to this than casting both divisor and result.
I know that in this example, changing a*(b/c) to (a*b)/c would solve the problem, but in larger real-life scenarios, making this change will not be possible.
EDIT (added a case from an existing program):
In this case I'm calculating the position of a scrollbar according to the size of the scrollbar and the size of its container. So if there are twice as many elements as fit on the page, the scrollbar will be half the height of the container, and if we have scrolled through half of the possible elements, the scroller position should be moved 1/4 down so it sits in the middle of the container. The calculations work as they should, and it displays fine; I just don't like how the expression looks in my code.
The important parts of the code are shown here:
int scrollerheight = (menusize.Height * menusize.Height) / originalheight;
int maxofset = originalheight - menusize.Height;
int scrollerposition = (int)((menusize.Height - scrollerheight) * (_overlayofset / (double)maxofset));
originalheight here is the height of all elements, so in the case described above, this will be the double of menusize.Height.
Disclaimer: I typed all this out, and then I thought, Should I even post this? I mean, it's a pretty bad idea and therefore doesn't really help the OP... In the end I figured, hey, I already typed it all out; I might as well go ahead and click "Post Your Answer." Even though it's a "bad" idea, it's kind of interesting (to me, anyway). So maybe you'll benefit in some strange way by reading it.
For some reason I have a suspicion the above disclaimer's not going to protect me from downvotes, though...
Here's a totally crazy idea.
I would actually not recommend putting this into any sort of production environment, at all, because I literally thought of it just now, which means I haven't really thought it through completely, and I'm sure there are about a billion problems with it. It's just an idea.
But the basic concept is to create a type that can be used for arithmetic expressions, internally using a double for every term in the expression, only to be evaluated as the desired type (in this case: int) at the end.
You'd start with a type like this:
// Probably you'd make this implement IEquatable<Term>, IEquatable<double>, etc.
// Probably you'd also give it a more descriptive, less ambiguous name.
// Probably you also just flat-out wouldn't use it at all.
struct Term
{
    readonly double _value;

    internal Term(double value)
    {
        _value = value;
    }

    public override bool Equals(object obj)
    {
        // You would want to override this properly, of course; a minimal version:
        return obj is Term other && _value.Equals(other._value);
    }

    public override int GetHashCode()
    {
        // ...as well as this...
        return _value.GetHashCode();
    }

    public override string ToString()
    {
        // ...as well as this.
        return _value.ToString();
    }
}
Then you'd define implicit conversions to/from double and the type(s) you want to support (again: int). Like this:
public static implicit operator Term(int x)
{
    return new Term((double)x);
}

public static implicit operator int(Term x)
{
    return (int)x._value;
}

// ...and so on.
Next, define the operations themselves: Plus, Minus, etc. In the case of your example code, we'd need Times (for *) and DividedBy (for /):
public Term Times(Term multiplier)
{
    // This would work because you would've defined an implicit conversion
    // from double to Term.
    return _value * multiplier._value;
}

public Term DividedBy(Term divisor)
{
    // Same as above.
    return _value / divisor._value;
}
Lastly, write a static helper class to enable you to perform Term-based operations on whatever types you want to work with (probably just int for starters):
public static class TermHelper
{
    public static Term Times(this int number, Term multiplier)
    {
        return ((Term)number).Times(multiplier);
    }

    public static Term DividedBy(this int number, Term divisor)
    {
        return ((Term)number).DividedBy(divisor);
    }
}
What would all of this buy you? Practically nothing! But it would clean up your expressions, hiding away all those unsightly explicit casts, making your code significantly more attractive and considerably more impossible to debug. (Once again, this is not an endorsement, just a crazy-ass idea.)
So instead of this:
int d = (int)(a * (b / (double)c)); // Output: 1
You'd have this:
int d = a.Times(b.DividedBy(c)); // Output: 1
Is it worth it?
Well, if having to write casting operations were the worst thing in the world, like, even worse than relying on code that's too clever for its own good, then maybe a solution like this would be worth pursuing.
Since the above is clearly not true... the answer is a pretty emphatic NO. But I just thought I'd share this idea anyway, to show that such a thing is (maybe) possible.
First of all, C# truncates the result of integer division and of casts to int; there's no rounding.
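For example:
int x = 7 / 2;                // 3: integer division truncates
int y = (int)3.9;             // 3: the cast truncates too
int z = (int)Math.Round(3.9); // 4: use Math.Round if you actually want rounding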
There's no way to do b / c first without any conversions.
Multiply b times 100. Then divide by 100 at the end.
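Applied to the values from the question (still truncating, and watch for overflow when b is large), that might look like:
int a = 2;
int b = 4;
int c = 5;
int d = a * (b * 100 / c) / 100; // b*100/c = 80, a*80 = 160, 160/100 = 1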
In this case, I would suggest using double instead, because you don't need 'exact' precision.
However, if you really want to avoid floating-point operations altogether, I would suggest creating some kind of fraction class, which is far more complex and less efficient, but lets you keep track of every dividend and divisor and then calculate it all at once.