I'm wondering what the correct way is to compare two characters, ignoring case, that will work for all cultures. Also, is Comparer<char>.Default the best way to test two characters without ignoring case? Does this work for surrogate pairs?
EDIT: Added sample IComparer<char> implementation
If this helps anyone this is what I've decided to use
public class CaseInsensitiveCharComparer : IComparer<char> {
private readonly System.Globalization.CultureInfo ci;
public CaseInsensitiveCharComparer(System.Globalization.CultureInfo ci) {
this.ci = ci;
}
public CaseInsensitiveCharComparer()
: this(System.Globalization.CultureInfo.CurrentCulture) { }
public int Compare(char x, char y) {
return Char.ToUpper(x, ci) - Char.ToUpper(y, ci);
}
}
// Prints 3
Console.WriteLine("This is a test".CountChars('t', new CaseInsensitiveCharComparer()));
It depends on what you mean by "work for all cultures". Would you want "i" and "I" to be equal even in Turkey?
You could use:
bool equal = char.ToUpperInvariant(x) == char.ToUpperInvariant(y);
... but I'm not sure whether that "works" according to all cultures by your understanding of "works".
Of course you could convert both characters to strings and then perform whatever comparison you want on the strings. Somewhat less efficient, but it does give you all the range of comparisons available in the framework:
bool equal = x.ToString().Equals(y.ToString(),
StringComparison.InvariantCultureIgnoreCase);
For surrogate pairs, a Comparer<char> isn't going to be feasible anyway, because you don't have a single char. You could create a Comparer<int> though.
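A minimal sketch of the Comparer<int> idea, assuming the caller has already converted each character (or surrogate pair) to a Unicode code point, for instance with char.ConvertToUtf32(text, index). The class name CodePointIgnoreCaseComparer is made up for illustration; it delegates to string.Compare so that the chosen culture's casing rules apply.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical comparer over Unicode code points (ints) rather than UTF-16 chars.
public sealed class CodePointIgnoreCaseComparer : IComparer<int>
{
    private readonly CultureInfo culture;

    public CodePointIgnoreCaseComparer(CultureInfo culture)
    {
        this.culture = culture;
    }

    public int Compare(int x, int y)
    {
        // ConvertFromUtf32 returns a one-char string, or a two-char surrogate
        // pair for code points above U+FFFF. It throws for invalid code points.
        string left = char.ConvertFromUtf32(x);
        string right = char.ConvertFromUtf32(y);
        return string.Compare(left, right, true, culture);
    }
}
```

Whether a given pair of supplementary-plane letters compares as equal depends on the culture data available to the runtime, so treat this as a starting point rather than a guarantee.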
As I understand it, there isn't really a way that will "work for all cultures". Either you want to compare characters for some kind of internal, non-displayed-to-the-user reason (in which case you should use the InvariantCulture), or you want to use the CurrentCulture of the user. Obviously, using the user's current culture will mean that you will get different results in different locales, but they will be consistent with what your users in those locales will expect.
Without knowing more about WHY you are comparing two characters, I can't really advise you on which one you should be using.
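To make the two choices concrete, here is a small sketch with one helper per option. The class and method names (CasingDemo, EqualsInvariant, EqualsForCulture) are invented for this example.

```csharp
using System;
using System.Globalization;

public static class CasingDemo
{
    // Invariant comparison: same answer on every machine, suited to
    // internal, non-displayed data.
    public static bool EqualsInvariant(char x, char y)
    {
        return char.ToUpperInvariant(x) == char.ToUpperInvariant(y);
    }

    // Culture-specific comparison: pass CultureInfo.CurrentCulture to get
    // what users in that locale would expect.
    public static bool EqualsForCulture(char x, char y, CultureInfo culture)
    {
        return char.ToUpper(x, culture) == char.ToUpper(y, culture);
    }
}
```

With a Turkish culture such as new CultureInfo("tr-TR"), EqualsForCulture('i', 'I', ...) is expected to return false, because the Turkish uppercase of 'i' is the dotted 'İ' (U+0130). This assumes the runtime has culture data available (that is, it is not running in invariant-globalization mode).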
I would recommend comparing uppercase, and if they don't match then comparing lowercase, just in case the locale's uppercasing and lowercasing logic behave slightly differently.
Addendum
For example,
int CompareChar(char c1, char c2)
{
int dif;
dif = char.ToUpper(c1) - char.ToUpper(c2);
if (dif != 0)
dif = char.ToLower(c1) - char.ToLower(c2);
return dif;
}
What I was thinking that would be available within the runtime is something like the following
public class CaseInsensitiveCharComparer : IComparer<char> {
private readonly System.Globalization.CultureInfo ci;
public CaseInsensitiveCharComparer(System.Globalization.CultureInfo ci) {
this.ci = ci;
}
public CaseInsensitiveCharComparer()
: this(System.Globalization.CultureInfo.CurrentCulture) { }
public int Compare(char x, char y) {
return Char.ToUpper(x, ci) - Char.ToUpper(y, ci);
}
}
// Prints 3
Console.WriteLine("This is a test".CountChars('t', new CaseInsensitiveCharComparer()));
You could try:
class Test{
static int Compare(char t, char p){
return string.Compare(t.ToString(), p.ToString(), StringComparison.CurrentCultureIgnoreCase);
}
}
I doubt this is the "optimal" way to do it, though, and I'm not sure of all the cases you need to be checking...
string.Compare("string a","STRING A",true)
It will work for every string
I know this is an old post, but things have changed since then.
The question above can be answered by using an extension. This would extend the char.Equals to allow for locality and case insensitivity.
In an extension class, add something such as:
internal static Boolean Equals(this Char src, Char ch, StringComparison comp)
{
return $"{src}".Equals($"{ch}", comp);
}
I'm currently at work, so can't check this, but it should work.
Andy
You can pass true as the last argument for a case-insensitive match
string.Compare(lowerCase, upperCase, true);
For example if both values are int type it adds them.... ie 2+2=4
if both values are float....ie 2.2+2.3=4.5
or if one value is string and second is int...ie 1 + Pak=1Pak
We will get these two values from the user using two textboxes
This would be one way of doing it, without having to convert to string and then back to numeric.
public object Add(IConvertible a, IConvertible b)
{
if(IsNumeric(a) && IsNumeric(b))
return a.ToDouble(CultureInfo.InvariantCulture) + b.ToDouble(CultureInfo.InvariantCulture);
return a.ToString() + b.ToString();
}
public static bool IsNumeric(object o)
{
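var code = Type.GetTypeCode(o.GetType());
// SByte through Decimal are the numeric type codes; Char is excluded because its ToDouble throws.
return code >= TypeCode.SByte && code <= TypeCode.Decimal;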
}
You can't do it using generics. You'll receive string from your textbox anyway. The only thing to do is to implement it "manually" exactly this way as you said:
public string TrySumTwoStrings(string input1, string input2)
{
double numeric1, numeric2;
if (Double.TryParse(input1, out numeric1) && Double.TryParse(input2, out numeric2))
return (numeric1 + numeric2).ToString();
return input1 + input2;
}
There's no way to use generics if we have no different types (everything is typed as string here).
You wouldn't, generics cannot be constrained in a way to support arithmetic operators (or concatenation). You would need to create overloads.
public int Add(int x, int y)
public double Add(double x, double y)
public decimal Add(decimal x, decimal y)
// etc.
Of course, you still have the problem of determining how exactly to parse your data. The source being a TextBox, the data will inherently be strings. You will have to determine which type of number it should be, if any.
If doing this for a real application, you shouldn't have this problem. Your textbox should be expected to receive input from the user in the form of an integer, or a decimal, or a string, etc. If it's not convertible to the proper type, it's an invalid input from your user. You wouldn't want the input to have to be magically deduced.
string Str1 = textBox1.Text.Trim();
string Str2 = textBox2.Text.Trim();
double Num1, Num2;
bool isNum1 = double.TryParse(Str1, out Num1);
bool isNum2 = double.TryParse(Str2, out Num2);
if (isNum1 && isNum2)
MessageBox.Show((Num1 + Num2).ToString());
else
MessageBox.Show( Str1 + Str2);
Check out the MiscUtil library: http://www.yoda.arachsys.com/csharp/miscutil/ It contains some very clever Expression Tree stuff to allow operators with generics. It's not going to work exactly how you want (as others have stated, you can't constrain types to have operators), but it should get you most of the way there.
I don't know how it handles adding different types together though, I've not tried that.
I would think it would take some processing of the values beforehand, with things like int.Parse, double.Parse, etc.
Try them in order of complexity, because 1 will convert to a string, whereas x will not convert to an integer or float.
Officially changed my answer to same as comment...
The best suggestion I have would be to let the user select which type to interpret the data as, and pass it to the appropriate function based on what the user meant. There are too many ways to interpret character strings to know what the user's intention was; code processes logic, not intent.
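As a rough sketch of that suggestion: the caller states how the input should be interpreted, and the code dispatches on that choice instead of guessing. InputKind and TypedAdder are hypothetical names, and the invariant culture is an assumption made to keep the example deterministic.

```csharp
using System;
using System.Globalization;

// Hypothetical: the interpretation the user picked, e.g. from a drop-down.
public enum InputKind { Integer, Decimal, Text }

public static class TypedAdder
{
    public static string Add(string a, string b, InputKind kind)
    {
        CultureInfo inv = CultureInfo.InvariantCulture;
        switch (kind)
        {
            case InputKind.Integer:
                // Throws FormatException if either input is not an integer.
                return (int.Parse(a, inv) + int.Parse(b, inv)).ToString(inv);
            case InputKind.Decimal:
                return (decimal.Parse(a, inv) + decimal.Parse(b, inv)).ToString(inv);
            default:
                // Text: plain concatenation.
                return a + b;
        }
    }
}
```

Invalid input surfaces as an exception here; in a UI you would more likely use the TryParse variants and show a validation message.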
One thing that has bothered me about C# since its release is the lack of a generic IsNumeric function. I know it is difficult to come up with a one-stop solution to determine if a value is numeric.
I have used the following solution in the past, but it is not best practice because I am generating an exception to determine whether the value is numeric:
public bool IsNumeric(string input)
{
try
{
int.Parse(input);
return true;
}
catch
{
return false;
}
}
Is this still the best way to approach this problem or is there a more efficient way to determine if a value is numeric in C#?
Try this:
int temp;
return int.TryParse(input, out temp);
Of course, the behavior will be different from Visual Basic IsNumeric. If you want that behavior, you can add a reference to "Microsoft.VisualBasic" assembly and call the Microsoft.VisualBasic.Information.IsNumeric function directly.
You can use extension methods to extend the String type to include IsInteger:
namespace ExtensionMethods
{
public static class MyExtensions
{
public static bool IsInteger(this String input)
{
int temp;
return int.TryParse(input, out temp);
}
}
}
Rather than using int.Parse, you can use int.TryParse and avoid the exception.
Something like this
public static bool IsNumeric(string input)
{
int dummy;
return int.TryParse(input, out dummy);
}
More generically you might want to look at double.TryParse.
One thing you should also consider is the potential of handling numeric string for different cultures. For example Greek (el-GR) uses , as a decimal separator while the UK (en-GB) uses a .. So the string "1,000" will either be 1000 or 1 depending on the current culture. Given this, you might consider providing overloads for IsNumeric that support passing the intended culture, number format etc. Take a look at the 2 overloads for double.TryParse.
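A sketch of what such overloads might look like, built on the double.TryParse overload that takes a NumberStyles and an IFormatProvider. NumericChecks is a made-up class name, and the chosen NumberStyles flags are one reasonable option, not the only one.

```csharp
using System.Globalization;

public static class NumericChecks
{
    // Culture-aware check: the caller decides how separators are interpreted.
    public static bool IsNumeric(string input, CultureInfo culture)
    {
        double ignored;
        return double.TryParse(
            input,
            NumberStyles.Float | NumberStyles.AllowThousands,
            culture,
            out ignored);
    }

    // Convenience overload that falls back to the user's current culture.
    public static bool IsNumeric(string input)
    {
        return IsNumeric(input, CultureInfo.CurrentCulture);
    }
}
```

Passing new CultureInfo("el-GR") or new CultureInfo("en-GB") lets the same string be validated against the separators each locale expects.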
I've used the following extension method before, if it helps at all:
public static int? AsNumeric(this string source)
{
int result;
return Int32.TryParse(source, out result) ? result : (int?)null;
}
Then you can use .HasValue for the bool you have now, or .Value for the value, but convert just once...just throwing it out there, not sure what situation you're using it for afterwards.
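For illustration, here is how a caller might consume that nullable result. The extension method is repeated so the snippet stands alone; AsNumericDemo and Describe are hypothetical names.

```csharp
using System;

public static class NumericExtensions
{
    // Same idea as above: null means "not an integer".
    public static int? AsNumeric(this string source)
    {
        int result;
        return int.TryParse(source, out result) ? result : (int?)null;
    }
}

public static class AsNumericDemo
{
    public static string Describe(string input)
    {
        // Parse once, then use HasValue as the test and Value as the number.
        int? parsed = input.AsNumeric();
        return parsed.HasValue ? "number " + parsed.Value : "not a number";
    }
}
```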
If you use Int32.TryParse then you don't need to wrap the call in a try/catch block, but otherwise, yes, that is the approach to take.
Not exactly crazy about this approach, but you can just call the vb.net isNumeric function from C# by adding a reference to the Microsoft.VisualBasic.dll library...
bool x= Microsoft.VisualBasic.Information.IsNumeric("123");
The other approaches given are superior, but wanted to add this for the sake of completeness.
Lots of TryParse answers. Here's something a bit different using Char.IsNumber():
public bool IsNumeric(string s)
{
for (int i = 0; i < s.Length; i++)
{
if (char.IsNumber(s, i) == false)
{
return false;
}
}
return true;
}
Take a look at the following answer:
What is the C# equivalent of NaN or IsNumeric?
Double.TryParse takes care of all numeric values and not only ints.
Another option - LINQ!
public static class StringExtensions
{
public static bool IsDigits(this String text)
{
return !text.Any(c => !char.IsDigit(c));
}
}
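Note that this assumes you only want digits. It does not handle a decimal point, sign, exponent, etc., and replacing IsDigit() with IsNumber() won't help there, since IsNumber() only adds other Unicode numeric characters such as fractions and superscripts. For those cases, use double.TryParse instead.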
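One detail worth knowing: char.IsDigit returns true for any Unicode decimal digit (for example Arabic-Indic digits), not just '0' through '9'. If ASCII digits are what you actually mean, a stricter check could be sketched like this. IsAsciiDigits is a hypothetical name, and rejecting empty or null input is a choice made here, not something the original method does.

```csharp
using System.Linq;

public static class DigitChecks
{
    // True only when every character is one of '0'..'9'.
    public static bool IsAsciiDigits(string text)
    {
        return !string.IsNullOrEmpty(text)
            && text.All(c => c >= '0' && c <= '9');
    }
}
```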
I've been using the following small code snippet for years as a pure C# IsNumeric function.
Granted, it's not exactly the same as the Microsoft.VisualBasic library's IsNumeric function as that (if you look at the decompiled code) involves lots of type checking and usage of the IConvertible interface, however this small function has worked well for me.
Note also that this function uses double.TryParse rather than int.TryParse to allow both integer numbers (including longs) as well as floating point numbers to be parsed. Also note that it parses with the InvariantCulture and NumberStyles.Any, so both 123.00 and 123,00 are accepted as numeric. Be aware, though, that under the invariant culture the comma is a group separator, so 123,00 is read as 12300 rather than 123.00.
using System;
using System.Globalization;
namespace MyNumberFunctions
{
public static class NumberFunctions
{
public static bool IsNumeric(this object expression)
{
if (expression == null)
{
return false;
}
double number;
return Double.TryParse(Convert.ToString(expression, CultureInfo.InvariantCulture), NumberStyles.Any, NumberFormatInfo.InvariantInfo, out number);
}
}
}
Usage is incredibly simple, since this is implemented as an extension method:
string myNumberToParse = "123.00";
bool isThisNumeric = myNumberToParse.IsNumeric();
public bool IsNumeric(string input)
{
int result;
return Int32.TryParse(input,out result);
}
try this:
public static bool IsNumeric(object o)
{
const NumberStyles sty = NumberStyles.Any;
double d;
return (o != null && Double.TryParse(o.ToString(), sty, null, out d));
}
You can still use the Visual Basic function in C#. The only thing you have to do is just follow my instructions shown below:
Add the reference to the Visual Basic Library by right clicking on your project and selecting "Add Reference":
Then import it in your class as shown below:
using Microsoft.VisualBasic;
Next use it wherever you want as shown below:
if (!Information.IsNumeric(softwareVersion))
{
throw new DataException(string.Format("[{0}] is an invalid App Version! Only numeric values are supported at this time.", softwareVersion));
}
Hope, this helps and good luck!
Consider the need for a function in C# that will test whether a string is a numeric value.
The requirements:
must return a boolean.
function should be able to allow for whole numbers, decimals, and negatives.
assume no using Microsoft.VisualBasic to call into IsNumeric(). Here's a case of reinventing the wheel, but the exercise is good.
Current implementation:
//determine whether the input value is a number
public static bool IsNumeric(string someValue)
{
Regex isNumber = new Regex(@"^\d+$");
try
{
Match m = isNumber.Match(someValue);
return m.Success;
}
catch (FormatException)
{return false;}
}
Question: how can this be improved so that the regex would match negatives and decimals? Any radical improvements that you'd make?
Just off of the top of my head - why not just use double.TryParse ? I mean, unless you really want a regexp solution - which I'm not sure you really need in this case :)
Can you just use .TryParse?
int x;
double y;
string spork = "-3.14";
if (int.TryParse(spork, out x))
Console.WriteLine("Yay it's an int (boy)!");
if (double.TryParse(spork, out y))
Console.WriteLine("Yay it's a double (girl)!");
Regex isNumber = new Regex(@"^[-+]?(\d*\.)?\d+$");
Updated to allow either + or - in front of the number.
Edit: Your try block isn't doing anything as none of the methods within it actually throw a FormatException. The entire method could be written:
// Determine whether the input value is a number
public static bool IsNumeric(string someValue)
{
return new Regex(@"^[-+]?(\d*\.)?\d+$").IsMatch(someValue);
}
Well, for negatives you'd need to include an optional minus sign at the start:
^-?\d+$
For decimals you'd need to account for a decimal point:
^-?\d*\.?\d*$
And possible exponential notation:
^-?\d*\.?\d*(e\d+)?$
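Because every part of the decimal and exponent patterns above is optional, they also match an empty string or a lone "-". A somewhat stricter variant that requires at least one digit could be sketched as follows. The pattern and the NumericPattern class are illustrative assumptions, not a definitive grammar for numbers.

```csharp
using System.Text.RegularExpressions;

public static class NumericPattern
{
    // Optional minus, then either digits with an optional fraction ("12", "12.", "12.5")
    // or a bare fraction (".5"), then an optional exponent ("e10", "E-3").
    private static readonly Regex Number =
        new Regex(@"^-?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?$");

    public static bool IsNumeric(string value)
    {
        return value != null && Number.IsMatch(value);
    }
}
```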
I can't say that I would use regular expressions to check if a string is a numeric value. Slow and heavy for such a simple process. I would simply run over the string one character at a time until I enter an invalid state:
public static bool IsNumeric(string value)
{
bool isNumber = true;
bool afterDecimal = false;
for (int i=0; i<value.Length; i++)
{
char c = value[i];
if (c == '-' && i == 0) continue;
if (Char.IsDigit(c))
{
continue;
}
if (c == '.' && !afterDecimal)
{
afterDecimal = true;
continue;
}
isNumber = false;
break;
}
return isNumber;
}
The above example is simple, and should get the job done for most numbers. It is not culturally sensitive, however, but it should be straightforward enough to make it culturally sensitive.
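One way the culture-sensitive version might look, as a sketch: take the negative sign and decimal separator from the culture's NumberFormatInfo instead of hard-coding '-' and '.'. CultureAwareNumeric is a hypothetical name, and unlike the loop above this variant also requires at least one digit, so "", "-" and "." are rejected.

```csharp
using System;
using System.Globalization;

public static class CultureAwareNumeric
{
    public static bool IsNumeric(string value, CultureInfo culture)
    {
        if (string.IsNullOrEmpty(value)) return false;

        NumberFormatInfo format = culture.NumberFormat;
        string sign = format.NegativeSign;
        string separator = format.NumberDecimalSeparator;

        // Skip a leading negative sign; it may be longer than one char.
        int i = value.StartsWith(sign, StringComparison.Ordinal) ? sign.Length : 0;

        bool seenDigit = false;
        bool afterDecimal = false;
        while (i < value.Length)
        {
            if (char.IsDigit(value[i]))
            {
                seenDigit = true;
                i++;
            }
            else if (!afterDecimal &&
                     string.CompareOrdinal(value, i, separator, 0, separator.Length) == 0)
            {
                // First decimal separator: allowed once.
                afterDecimal = true;
                i += separator.Length;
            }
            else
            {
                return false; // invalid state
            }
        }
        return seenDigit;
    }
}
```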
Also, make sure the resulting code passes the Turkey Test:
http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html
Unless you really want to use regex, Noldorin posted a nice extension method in another Q&A.
Update
As Patrick rightly pointed out, the link points to an extension method that check whether the object is a numeric type or not, not whether it represents a numeric value. Then using double.TryParse as suggested by Saulius and yodaj007 is probably the best choice, handling all sorts of quirks with different decimal separators, thousand separators and so on. Just wrap it up in a nice extension method:
public static bool IsNumeric(this string value)
{
double temp;
return double.TryParse(value.ToString(), out temp);
}
...and fire away:
string someValue = "89.9";
if (someValue.IsNumeric()) // will be true in the US, but not in Sweden
{
// wow, it's a number!
}
Can anyone think of a nicer way to do the following:
public string ShortDescription
{
get { return this.Description.Length <= 25 ? this.Description : this.Description.Substring(0, 25) + "..."; }
}
I would have liked to just do string.Substring(0, 25) but it throws an exception if the string is less than the length supplied.
I needed this so often, I wrote an extension method for it:
public static class StringExtensions
{
public static string SafeSubstring(this string input, int startIndex, int length, string suffix)
{
// Todo: Check that startIndex + length does not cause an arithmetic overflow - not that this is likely, but still...
if (input.Length >= (startIndex + length))
{
if (suffix == null) suffix = string.Empty;
return input.Substring(startIndex, length) + suffix;
}
else
{
if (input.Length > startIndex)
{
return input.Substring(startIndex);
}
else
{
return string.Empty;
}
}
}
}
if you only need it once, that is overkill, but if you need it more often then it can come in handy.
Edit: Added support for a string suffix. Pass in "..." and you get your ellipsis on truncated strings, or pass in string.Empty for no special suffix.
return this.Description.Substring(0, Math.Min(this.Description.Length, 25));
Doesn't have the ... part. Your way is probably the best, actually.
public static string Take(this string s, int i)
{
if (s.Length <= i)
return s;
else
return s.Substring(0, i) + "...";
}
public string ShortDescription
{
get { return this.Description.Take(25); }
}
The way you've done it seems fine to me, with the exception that I wouldn't use the magic number 25; I'd have that as a constant.
Do you really want to store this in your bean though? Presumably this is for display somewhere, so your renderer should be the thing doing the truncating instead of the data object
Well I know there's answer accepted already and I may get crucified for throwing out a regular expression here but this is how I usually do it:
//may return more than 25 characters depending on where in the string 25 characters is at
public string ShortDescription(string val)
{
return Regex.Replace(val, @"(.{25}[^\s]*).*", "$1...");
}
// stricter version that only returns 25 characters, plus 3 for ...
public string ShortDescriptionStrict(string val)
{
return Regex.Replace(val, @"(.{25}).*", "$1...");
}
It has the nice side benefit of not cutting a word in half, as it stops at the first whitespace character past 25 characters. (Of course, if you need it to truncate text going into a database, that might be a problem.)
Downside, well I'm sure it's not the fastest solution possible.
EDIT: replaced … with "..." since not sure if this solution is for the web!
without .... this should be the shortest :
public string ShortDescription
{
get { return Microsoft.VisualBasic.Strings.Left(this.Description, 25); }
}
I think the approach is sound, though I'd recommend a few adjustments
Move the magic number to a const or configuration value
Use a regular if conditional rather than the ternary operator
Use a string.Format("{0}...") rather than + "..."
Have just one return point from the function
So:
public string ShortDescription
{
get
{
const int SHORT_DESCRIPTION_LENGTH = 25;
string _shortDescription = Description;
if (Description.Length > SHORT_DESCRIPTION_LENGTH)
{
_shortDescription = string.Format("{0}...", Description.Substring(0, SHORT_DESCRIPTION_LENGTH));
}
return _shortDescription;
}
}
For a more general approach, you might like to move the logic to an extension method:
public static string ToTruncated(this string s, int truncateAt)
{
string truncated = s;
if (s.Length > truncateAt)
{
truncated = string.Format("{0}...", s.Substring(0, truncateAt));
}
return truncated;
}
Edit
I use the ternary operator extensively, but prefer to avoid it if the code becomes sufficiently verbose that it starts to extend past 120 characters or so. In that case I'd like to wrap it onto multiple lines, so find that a regular if conditional is more readable.
Edit2
For typographical correctness you could also consider using the ellipsis character (…) as opposed to three dots/periods/full stops (...).
One way to do it:
int length = Math.Min(Description.Length, 25);
return Description.Substring(0, length) + "...";
There are two lines instead of one, but shorter ones :).
Edit:
As pointed out in the comments, this gets you the ... all the time, so the answer was wrong. Correcting it means we go back to the original solution.
At this point, I think using string extensions is the only option to shorten the code. And that makes sense only when that code is repeated in at least a few places...
Looks fine to me; being really picky, I would replace "..." with the entity reference "&hellip;"
I can't think of any but your approach might not be the best. Are you adding presentation logic into your data object? If so then I suggest you put that logic elsewhere, for example a static StringDisplayUtils class with a GetShortStringMethod( int maxCharsToDisplay, string stringToShorten).
However, that approach might not be great either. What about different fonts and character sets? You'd have to start measuring the actual string length in terms of pixels. Check out the AutoEllipsis property on the winform's Label class (you'll prob need to set AutoSize to false if using this). The AutoEllipsis property, when true, will shorten a string and add the '...' chars for you.
I'd stick with what you have tbh, but just as an alternative, if you have LINQ to objects you could
new string(this.Description.ToCharArray().Take(25).ToArray())
//And to maintain the ...
+ (this.Description.Length <= 25 ? String.Empty : "...")
As others have said, you'd likely want to store 25 in a constant
You should see if you can reference the Microsoft.VisualBasic DLL into your app so you can make use of the "Left" function.