Best way to parse a log file in C#

I have the following log file:
START:SOME_STRING
BL:2
LK:3
LH:5
end
START:SOME_STRING
BL:5
LK:6
LH:6
end
It contains multiple START: -> end structures. Is there a better, 'non-sloppy' way of parsing this file than reading it line by line and using Split?

You can try to formalize your ini-file's grammar and use one of the parser generators. See this question for more detail.
Be aware, however, that for a grammar as simple as yours it might be easier to parse manually :-P
class IniEntry
{
    public int BL;
    public int LK;
    public int LH;
    public IniEntry Clone() { return new IniEntry { BL = BL, LK = LK, LH = LH }; }
}
IEnumerable<IniEntry> Parse()
{
    IniEntry ie;
    while (ParseEntry(out ie))
        yield return ie.Clone();
}
bool ParseEntry(out IniEntry ie)
{
    ie = new IniEntry();
    return ParseStart(ie) &&
           ParseBL(ie) &&
           ParseLK(ie) &&
           ParseLH(ie) &&
           ParseEnd(ie);
}
bool ParseStart(IniEntry ie)
{
    string dummy;
    return ParseLine("START", out dummy);
}
bool ParseBL(IniEntry ie)
{
    string BL;
    return ParseLine("BL", out BL) && int.TryParse(BL, out ie.BL);
}
bool ParseLK(IniEntry ie)
{
    string LK;
    return ParseLine("LK", out LK) && int.TryParse(LK, out ie.LK);
}
bool ParseLH(IniEntry ie)
{
    string LH;
    return ParseLine("LH", out LH) && int.TryParse(LH, out ie.LH);
}
bool ParseLine(string key, out string value)
{
    value = null;
    // GetNextLine is assumed to return null once the input is exhausted
    string line = GetNextLine();
    if (line == null) return false;
    var parts = line.Split(':');
    if (parts.Length != 2) return false;
    if (parts[0] != key) return false;
    value = parts[1];
    return true;
}
// ParseEnd has to match the bare "end" line, which has no ':' separator
etc.

This is a good candidate for a while loop and a state machine.
With this approach you would use less memory and likely get better performance than with string.Split().
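A minimal sketch of what such a loop could look like, assuming the exact layout shown in the question (START:, then BL/LK/LH lines, then end). The names LogEntry and LogParser are made up for illustration, and malformed lines are simply skipped here, which may not be what you want:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical container for one START..end block.
public sealed class LogEntry
{
    public string Name;
    public int BL, LK, LH;
}

public static class LogParser
{
    // Reads one line at a time; only the entry being built is held in memory.
    public static IEnumerable<LogEntry> Parse(TextReader reader)
    {
        LogEntry current = null; // null means "outside a START..end block"
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length == 0) continue;

            if (current == null)
            {
                // State 1: waiting for a START line; anything else is ignored.
                if (line.StartsWith("START:", StringComparison.Ordinal))
                    current = new LogEntry { Name = line.Substring(6) };
                continue;
            }

            // State 2: inside a block.
            if (line == "end")
            {
                yield return current;
                current = null;
                continue;
            }

            int colon = line.IndexOf(':');
            if (colon < 0) continue; // malformed line, skipped in this sketch
            int value;
            if (!int.TryParse(line.Substring(colon + 1), out value)) continue;

            switch (line.Substring(0, colon))
            {
                case "BL": current.BL = value; break;
                case "LK": current.LK = value; break;
                case "LH": current.LH = value; break;
            }
        }
    }
}
```

You would call it with something like LogParser.Parse(new StreamReader(path)) and iterate over the result with foreach.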

If it is certain that the START/end pairs are always matched (apologies, my C# is embarrassing, so plain English):
Read the whole file into a string (for example with StreamReader.ReadToEnd or File.ReadAllText)
Parse the whole thing in one go with a regular expression
Iterate over the regex results
The regex would be something like "START:([^\r\n]+)\r?\nBL:([^\r\n]+)\r?\nLK:([^\r\n]+)\r?\nLH:([^\r\n]+)\r?\nend", off the top of my head; you'll need to validate/adjust it according to how your parameters BL/LK etc. occur.
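The steps above could be sketched in C# roughly as follows. This assumes the fields always appear in the order BL, LK, LH and hold plain decimal integers; the class name RegexLogParser and the group names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexLogParser
{
    // One match per START..end block; \r?\n accepts both Windows and Unix line endings.
    private static readonly Regex EntryPattern = new Regex(
        @"START:(?<name>[^\r\n]+)\r?\n" +
        @"BL:(?<bl>[0-9]+)\r?\n" +
        @"LK:(?<lk>[0-9]+)\r?\n" +
        @"LH:(?<lh>[0-9]+)\r?\n" +
        @"end");

    public static List<(string Name, int BL, int LK, int LH)> Parse(string text)
    {
        var result = new List<(string Name, int BL, int LK, int LH)>();
        foreach (Match m in EntryPattern.Matches(text))
        {
            result.Add((
                m.Groups["name"].Value,
                int.Parse(m.Groups["bl"].Value),
                int.Parse(m.Groups["lk"].Value),
                int.Parse(m.Groups["lh"].Value)));
        }
        return result;
    }
}
```

Typical use would be RegexLogParser.Parse(File.ReadAllText(path)). Note that blocks which do not fit the pattern are silently skipped, so this suits trusted input better than untrusted input.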

Related

How Validate input as only character and can have space between in C#

I need a method that validates the input to make sure it contains only letters. I also want to allow white space, as in "Alton Drive", but the code I have allows only letters and no white space.
public bool IsCorrectString(string val)
{
foreach (char c in val)
{
if (!char.IsLetter(c))
return false;
}
return true;
}
I am not sure if regex is better to use here or not?

A regex will certainly be much simpler.
^[A-Za-z\s]*$
This regex will match letters and whitespace only, and will fail for a string that contains anything else.
You'll want to use IsMatch for this.
For example:
public bool IsCorrectString(string val)
{
return Regex.IsMatch(val, @"^[A-Za-z\s]*$");
}

Boolean HasSpecialChars(string yourString)
{
return yourString.Any(ch => !Char.IsLetter(ch) && !Char.IsWhiteSpace(ch));
}

You can use Char.IsWhiteSpace:
public bool IsCorrectString(string val)
{
foreach (char c in val)
{
if (!char.IsWhiteSpace(c) && !char.IsLetter(c))
return false;
}
return true;
}
or shorter:
return val.All(c => Char.IsWhiteSpace(c) || Char.IsLetter(c));
Note that IsWhiteSpace also accepts other whitespace characters such as tabs or newlines. If that's not desired, compare against a plain space (c == ' ') instead.

You can use this condition:
if (!char.IsLetter(c) && !char.IsWhiteSpace(c)) {
return false;
}

You can do it with LINQ in one statement:
public bool IsCorrectString(string val)
{
return val.All(x => char.IsLetter(x) || char.IsWhiteSpace(x));
}
If you want to allow space only between two words then you can also use Split like this:
public bool IsCorrectString(string val)
{
return val.Split().All(x => x.All(char.IsLetter) && x != string.Empty);
}

You can just augment your if statement from if (!char.IsLetter(c)) to if (!char.IsLetter(c) && c != ' ') to solve the problem. You could also use a regex; the code would be cleaner but the performance would be worse. Your code could also be cleaned up, and I personally think that would be the best choice (use iteration but have cleaner code). Here's my clean one-line version with LINQ:
return val.Where(c => !char.IsLetter(c) && !char.IsWhiteSpace(c)).Count() == 0;

Using unicode characters bigger than 2 bytes with .Net

I'm using this code to generate U+10FFFC
var s = Encoding.UTF8.GetString(new byte[] {0xF4,0x8F,0xBF,0xBC});
I know it's for private-use and such, but it does display a single character as I'd expect when displaying it. The problems come when manipulating this unicode character.
If I later do this:
foreach(var ch in s)
{
Console.WriteLine(ch);
}
Instead of it printing just the single character, it prints two characters (i.e. the string is apparently composed of two characters). If I alter my loop to add these characters back to an empty string like so:
string tmp="";
foreach(var ch in s)
{
Console.WriteLine(ch);
tmp += ch;
}
At the end of this, tmp will print just a single character.
What exactly is going on here? I thought that char contains a single unicode character and I never had to worry about how many bytes a character is unless I'm doing conversion to bytes. My real use case is I need to be able to detect when very large unicode characters are used in a string. Currently I have something like this:
foreach(var ch in s)
{
if(ch>=0x100000 && ch<=0x10FFFF)
{
Console.WriteLine("special character!");
}
}
However, because of this splitting of very large characters, this doesn't work. How can I modify this to make it work?

U+10FFFC is one Unicode code point, but string's interface does not expose a sequence of Unicode code points directly. Its interface exposes a sequence of UTF-16 code units. That is a very low-level view of text. It is quite unfortunate that such a low-level view of text was grafted onto the most obvious and intuitive interface available... I'll try not to rant much about how I don't like this design, and just say that no matter how unfortunate, it is just a (sad) fact you have to live with.
First off, I will suggest using char.ConvertFromUtf32 to get your initial string. Much simpler, much more readable:
var s = char.ConvertFromUtf32(0x10FFFC);
So, this string's Length is not 1, because, as I said, the interface deals in UTF-16 code units, not Unicode code points. U+10FFFC uses two UTF-16 code units, so s.Length is 2. All code points above U+FFFF require two UTF-16 code units for their representation.
You should note that ConvertFromUtf32 doesn't return a char: char is a UTF-16 code unit, not a Unicode code point. To be able to return all Unicode code points, that method cannot return a single char. Sometimes it needs to return two, and that's why it makes it a string. Sometimes you will find some APIs dealing in ints instead of char because int can be used to handle all code points too (that's what ConvertFromUtf32 takes as argument, and what ConvertToUtf32 produces as result).
string implements IEnumerable<char>, which means that when you iterate over a string you get one UTF-16 code unit per iteration. That's why iterating your string and printing it out yields some broken output with two "things" in it. Those are the two UTF-16 code units that make up the representation of U+10FFFC. They are called "surrogates". The first one is a high/lead surrogate and the second one is a low/trail surrogate. When you print them individually they do not produce meaningful output because lone surrogates are not even valid in UTF-16, and they are not considered Unicode characters either.
When you append those two surrogates to the string in the loop, you effectively reconstruct the surrogate pair, and printing that pair later as one gets you the right output.
On the ranting front, note how nothing complains that you used a malformed UTF-16 sequence in that loop. It creates a string with a lone surrogate, and yet everything carries on as if nothing happened: the string type is not even the type of well-formed UTF-16 code unit sequences, but the type of any UTF-16 code unit sequence.
The char structure provides static methods to deal with surrogates: IsHighSurrogate, IsLowSurrogate, IsSurrogatePair, ConvertToUtf32, and ConvertFromUtf32. If you want you can write an iterator that iterates over Unicode characters instead of UTF-16 code units:
static IEnumerable<int> AsCodePoints(this string s)
{
for(int i = 0; i < s.Length; ++i)
{
yield return char.ConvertToUtf32(s, i);
if(char.IsHighSurrogate(s, i))
i++;
}
}
Then you can iterate like:
foreach(int codePoint in s.AsCodePoints())
{
// do stuff. codePoint will be an int with value 0x10FFFC in your example
}
If you prefer to get each code point as a string instead, change the return type to IEnumerable<string> and the yield line to:
yield return char.ConvertFromUtf32(char.ConvertToUtf32(s, i));
With that version, the following works as-is:
foreach(string codePoint in s.AsCodePoints())
{
Console.WriteLine(codePoint);
}
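For the use case in the question (detecting code points between U+100000 and U+10FFFF), a small sketch along the same lines might look like this. The class and method names are made up for illustration, and lone surrogates are treated as ordinary 16-bit values instead of throwing:

```csharp
using System;

public static class LargeCodePointDetector
{
    // Returns true if s contains at least one code point in [min, max].
    public static bool ContainsCodePointInRange(string s, int min, int max)
    {
        for (int i = 0; i < s.Length; i++)
        {
            int codePoint;
            if (char.IsSurrogatePair(s, i))
            {
                // Two UTF-16 code units form one code point above U+FFFF.
                codePoint = char.ConvertToUtf32(s, i);
                i++; // skip the low surrogate
            }
            else
            {
                // A BMP character, or a lone surrogate taken at face value.
                codePoint = s[i];
            }
            if (codePoint >= min && codePoint <= max) return true;
        }
        return false;
    }
}
```

Calling ContainsCodePointInRange(s, 0x100000, 0x10FFFF) then replaces the char-by-char comparison from the question.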

As Martinho already posted, it is much easier to create the string with this private code point this way:
var s = char.ConvertFromUtf32(0x10FFFC);
But to loop through the two char elements of that string is senseless:
foreach(var ch in s)
{
Console.WriteLine(ch);
}
What for? You will just get the high and low surrogate that encode the codepoint. Remember a char is a 16 bit type so it can hold just a max value of 0xFFFF. Your codepoint doesn't fit into a 16 bit type, indeed for the highest codepoint you'll need 21 bits (0x10FFFF) so the next wider type would just be a 32 bit type. The two char elements are not characters, but a surrogate pair. The value of 0x10FFFC is encoded into the two surrogates.
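To make the encoding concrete, here is a sketch of the standard UTF-16 arithmetic that turns a code point above U+FFFF into its two surrogates. The helper name SurrogateMath is made up; in real code char.ConvertFromUtf32 already does this for you:

```csharp
using System;

public static class SurrogateMath
{
    // Splits a supplementary code point (U+10000..U+10FFFF) into its surrogate pair.
    public static void Split(int codePoint, out char high, out char low)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException("codePoint");

        int offset = codePoint - 0x10000;          // a 20-bit value
        high = (char)(0xD800 + (offset >> 10));    // top 10 bits
        low = (char)(0xDC00 + (offset & 0x3FF));   // bottom 10 bits
    }
}
```

For 0x10FFFC the offset is 0xFFFFC, which gives the high surrogate 0xDBFF and the low surrogate 0xDFFC. Those are the two char values the foreach loop prints.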

While @R. Martinho Fernandes's answer is correct, his AsCodePoints extension method has two issues:
It will throw an ArgumentException on invalid code points (high surrogate without low surrogate or vice versa).
You can't use char static methods that take (char) or (string, int) (such as char.IsNumber()) if you only have int code points.
I've split the code into two methods. One is similar to the original but returns the Unicode Replacement Character on invalid code points. The other returns an IEnumerable of structs with more useful fields:
StringCodePointExtensions.cs:
public static class StringCodePointExtensions {
const char ReplacementCharacter = '\ufffd';
public static IEnumerable<CodePointIndex> CodePointIndexes(this string s) {
for (int i = 0; i < s.Length; i++) {
if (char.IsHighSurrogate(s, i)) {
if (i + 1 < s.Length && char.IsLowSurrogate(s, i + 1)) {
yield return CodePointIndex.Create(i, true, true);
i++;
continue;
} else {
// High surrogate without low surrogate
yield return CodePointIndex.Create(i, false, false);
continue;
}
} else if (char.IsLowSurrogate(s, i)) {
// Low surrogate without high surrogate
yield return CodePointIndex.Create(i, false, false);
continue;
}
yield return CodePointIndex.Create(i, true, false);
}
}
public static IEnumerable<int> CodePointInts(this string s) {
return s
.CodePointIndexes()
.Select(
cpi => {
if (cpi.Valid) {
return char.ConvertToUtf32(s, cpi.Index);
} else {
return (int)ReplacementCharacter;
}
});
}
}
CodePointIndex.cs:
public struct CodePointIndex {
public int Index;
public bool Valid;
public bool IsSurrogatePair;
public static CodePointIndex Create(int index, bool valid, bool isSurrogatePair) {
return new CodePointIndex {
Index = index,
Valid = valid,
IsSurrogatePair = isSurrogatePair,
};
}
}
To the extent possible under law, the person who associated CC0 with this work has waived all copyright and related or neighboring rights to this work.

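Yet another alternative for enumerating the UTF-32 characters in a C# string is the System.Globalization.StringInfo.GetTextElementEnumerator method, as in the code below. Note that it enumerates text elements, so a base character followed by combining marks comes back as one element, and UTF32Code then reports only the first code point of that element.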
public static class StringExtensions
{
public static System.Collections.Generic.IEnumerable<UTF32Char> GetUTF32Chars(this string s)
{
var tee = System.Globalization.StringInfo.GetTextElementEnumerator(s);
while (tee.MoveNext())
{
yield return new UTF32Char(s, tee.ElementIndex);
}
}
}
public struct UTF32Char
{
private string s;
private int index;
public UTF32Char(string s, int index)
{
this.s = s;
this.index = index;
}
public override string ToString()
{
return char.ConvertFromUtf32(this.UTF32Code);
}
public int UTF32Code { get { return char.ConvertToUtf32(s, index); } }
public double NumericValue { get { return char.GetNumericValue(s, index); } }
public UnicodeCategory UnicodeCategory { get { return char.GetUnicodeCategory(s, index); } }
public bool IsControl { get { return char.IsControl(s, index); } }
public bool IsDigit { get { return char.IsDigit(s, index); } }
public bool IsLetter { get { return char.IsLetter(s, index); } }
public bool IsLetterOrDigit { get { return char.IsLetterOrDigit(s, index); } }
public bool IsLower { get { return char.IsLower(s, index); } }
public bool IsNumber { get { return char.IsNumber(s, index); } }
public bool IsPunctuation { get { return char.IsPunctuation(s, index); } }
public bool IsSeparator { get { return char.IsSeparator(s, index); } }
public bool IsSurrogatePair { get { return char.IsSurrogatePair(s, index); } }
public bool IsSymbol { get { return char.IsSymbol(s, index); } }
public bool IsUpper { get { return char.IsUpper(s, index); } }
public bool IsWhiteSpace { get { return char.IsWhiteSpace(s, index); } }
}

How to check if a string has at least 1 alphabetic character? [duplicate]

My ASP.NET application requires me to generate a huge number of random strings such that each contains at least 1 alphabetic and 1 numeric character and is alphanumeric on the whole.
For this my logic is to generate the code again if the random string is numeric:
public static string GenerateCode(int length)
{
if (length < 2 || length > 32)
{
throw new RSGException("Length cannot be less than 2 or greater than 32.");
}
string newcode = Guid.NewGuid().ToString("n").Substring(0, length).ToUpper();
return newcode;
}
public static string GenerateNonNumericCode(int length)
{
string newcode = string.Empty;
try
{
newcode = GenerateCode(length);
}
catch (Exception)
{
throw;
}
while (IsNumeric(newcode))
{
return GenerateNonNumericCode(length);
}
return newcode;
}
public static bool IsNumeric(string str)
{
bool isNumeric = false;
try
{
long number = Convert.ToInt64(str);
isNumeric = true;
}
catch (Exception)
{
isNumeric = false;
}
return isNumeric;
}
While debugging, it works properly, but when I ask it to create 10,000 random strings, it's not able to handle it properly. When I export that data to Excel, I find on average at least 20 strings that are numeric.
Is it a problem with my code or C#? - Mine.
If anyone's looking for code,
public static string GenerateCode(int length)
{
if (length < 2)
{
throw new A1Exception("Length cannot be less than 2.");
}
var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
var random = new Random();
var result = new string(
Enumerable.Repeat(chars, length)
.Select(s => s[random.Next(s.Length)])
.ToArray());
return result;
}
public static string GenerateAlphaNumericCode(int length)
{
string newcode = string.Empty;
try
{
newcode = GenerateCode(length);
while (!IsAlphaNumeric(newcode))
{
newcode = GenerateCode(length);
}
}
catch (Exception)
{
throw;
}
return newcode;
}
public static bool IsAlphaNumeric(string str)
{
// Require at least one letter and at least one digit; "[0-9A-Z]+" alone
// would also accept all-letter or all-digit strings.
return Regex.IsMatch(str, "[A-Z]") && Regex.IsMatch(str, "[0-9]");
}
Thanks to all for your ideas.

If you want to stick with the Guid as the generator, you could always validate using a Regex
This will only return true if at least one alpha is present
Regex reg = new Regex("[a-zA-Z]+");
Then just use the IsMatch method to see if your string is valid
That way you don't need the (IMHO rather ugly) try..catch around the Convert.
Update : I see your subsequent comment about actually making your code slower. Are you instantiating the Regex object only once, or every time that the test is being done? If the latter then this will be rather inefficient, and you should consider using a "lazy-loaded" property on your class, e.g.
private Regex reg;
private Regex AlphaRegex
{
get
{
if (reg == null) reg = new Regex("[a-zA-Z]+");
return reg;
}
}
Then just use AlphaRegex.IsMatch() in your method. I would expect this to make a difference.

Add using System.Linq; and work with the plain string to check whether it contains at least one letter and at least one digit.
using System.Linq;
string StrCheck = "abcd123";
To check whether the string has letters: StrCheck.Any(char.IsLetter)
To check whether the string has digits: StrCheck.Any(char.IsDigit)
if (StrCheck.Any(char.IsLetter) && StrCheck.Any(char.IsDigit))
{
//statement goes here.....
}
sorry for the late reply ...

I didn't quite understand what you want in the string besides letters (abc etc.); let's say numbers.
You can generate a random character as follows:
Random r = new Random();
char lower = (char)r.Next('a', 'z' + 1); // lowercase; the upper bound is exclusive
char upper = (char)r.Next('A', 'Z' + 1); // capitals
// or you can convert lowercase to capital:
char c = (char)('k' + ('A' - 'a'));
If you want to create a string:
var s = new StringBuilder();
for(int i = 0; i < length; ++i)
s.Append((char)r.Next('a', 'z' + 1)); //Changed to char
return s.ToString();
Note: I don't know ASP.NET so much, so I just act like it's C#.

To answer your question strictly, using your existing code: there is a problem with your recursion logic, which can be avoided by not using recursion (there is absolutely no reason to use recursion in GenerateNonNumericCode). Do the following instead:
public static string GenerateNonNumericCode(int length)
{
string newcode = GenerateCode(length);
while (IsNumeric(newcode))
{
newcode = GenerateCode(length);
}
return newcode;
}
Other General Notes
Your code is very inefficient--throwing exceptions is expensive, so using try/catch in a loop is therefore slow and pointless. As others have suggested, regex makes more sense (System.Text.RegularExpressions namespace).
Is it a problem with my code or C#?
When in doubt, the problem is almost never C#.
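A side note on the IsNumeric helper from the question, independent of the recursion issue: Convert.ToInt64 throws an OverflowException for all-digit strings that exceed long.MaxValue (19 digits), and the catch-all then reports them as "not numeric". Whether that explains the numbers seen in Excel depends on the code length used, which the question does not state, so treat this as one possible contributor only. A sketch with made-up names:

```csharp
using System;
using System.Linq;

public static class NumericCheck
{
    // Mirrors the question's approach: any exception means "not numeric".
    public static bool IsNumericViaConvert(string str)
    {
        try { Convert.ToInt64(str); return true; }
        catch (Exception) { return false; }
    }

    // Checks the characters directly, so the length of the string does not matter.
    public static bool IsAllDigits(string str)
    {
        return str.Length > 0 && str.All(c => c >= '0' && c <= '9');
    }
}
```

For a 23-digit string such as "12345678901234567890123", IsNumericViaConvert returns false because of the overflow, while IsAllDigits returns true.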

So, I would change the code to this:
static Random r = new Random();
public static string GenerateNonNumericCodeFaster(int length) {
var firstLength = r.Next(0, length - 1);
var secondLength = length - 1 - firstLength;
return GenerateCode(firstLength)
+ (char) r.Next((int)'A', (int)'G')
+ GenerateCode(secondLength);
}
You can keep your GenerateCode function, except that its minimum-length check has to go (or move into this method), because either part may be shorter than 2 characters. Everything else you toss out. The idea here of course is, rather than testing if the string contains an alphabetic character, you just explicitly PUT one in. In my tests, this code could generate 10,000 8-character strings in 0.0172963 seconds, compared to your code, which takes around 52 seconds. So, yeah, this is about 3000 times faster :)

Compare two values using RegEx

If I have two values, e.g. ABC001 and ABC100 or A0B0C1 and A1B0C0, is there a RegEx I can use to make sure the two values have the same pattern?

Well, here's my shot at it. This doesn't use regular expressions, and assumes s1 and s2 only contain letters or digits:
public static bool SamePattern(string s1, string s2)
{
if (s1.Length == s2.Length)
{
char[] chars1 = s1.ToCharArray();
char[] chars2 = s2.ToCharArray();
for (int i = 0; i < chars1.Length; i++)
{
if (!Char.IsDigit(chars1[i]) && chars1[i] != chars2[i])
{
return false;
}
else if (Char.IsDigit(chars1[i]) != Char.IsDigit(chars2[i]))
{
return false;
}
}
return true;
}
else
{
return false;
}
}
A description of the algorithm is as follows:
If the strings have different lengths, return false.
Otherwise, check the characters in the same position in both strings:
If they are both digits, or they are the same non-digit character, move on to the next iteration.
If the first is not a digit and the two differ, return false.
If one is a digit and the other is not, return false.
If all characters in both strings were checked successfully, return true.

If you don't know the pattern in advance, but are only going to encounter two groups of characters (alpha and digits), then you could do the following:
Write some C# that parses the first pattern, looking at each char to determine whether it's alpha or digit, then generate a regex accordingly from that pattern.
You may find that there's no point writing code to generate a regex, as it could be just as simple to check the second string against the first.
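If you do want to go the regex route, a rough sketch of generating the pattern from the first value could look like this. It assumes "same pattern" means digits may differ while every other character must match exactly, which is one reading of the question; PatternMatcher is a made-up name:

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PatternMatcher
{
    // Digits in the sample become [0-9]; every other character must match literally.
    public static Regex BuildPattern(string sample)
    {
        var sb = new StringBuilder("^");
        foreach (char c in sample)
        {
            if (c >= '0' && c <= '9')
                sb.Append("[0-9]");
            else
                sb.Append(Regex.Escape(c.ToString()));
        }
        sb.Append(@"\z"); // \z anchors at the very end, even before a trailing newline
        return new Regex(sb.ToString());
    }

    public static bool SamePattern(string first, string second)
    {
        return BuildPattern(first).IsMatch(second);
    }
}
```

For "ABC001" this builds ^ABC[0-9][0-9][0-9]\z, which matches "ABC100" but not "A0B0C1".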
Alternatively, without regex:
First check the strings are the same length.
Then loop through both strings at the same time, char by char. If char[x] from string 1 is alpha, and char[x] from string 2 is the same, your patterns match so far.
Try this, it should cope if a string sneaks in some symbols. Edited to compare character values ... and use Char.IsLetter and Char.IsDigit
private bool matchPattern(string string1, string string2)
{
bool result = (string1.Length == string2.Length);
if (!result) return false; // avoid indexing past the end of the shorter string
char[] chars1 = string1.ToCharArray();
char[] chars2 = string2.ToCharArray();
for (int i = 0; i < string1.Length; i++)
{
if (Char.IsLetter(chars1[i]) != Char.IsLetter(chars2[i]))
{
result = false;
}
if (Char.IsLetter(chars1[i]) && (chars1[i] != chars2[i]))
{
//Characters must be identical
result = false;
}
if (Char.IsDigit(chars1[i]) != Char.IsDigit(chars2[i]))
result = false;
}
return result;
}

Consider using Char.GetUnicodeCategory
You can write a helper class for this task:
public class Mask
{
public Mask(string originalString)
{
OriginalString = originalString;
CharCategories = originalString.Select(Char.GetUnicodeCategory).ToList();
}
public string OriginalString { get; private set; }
public IEnumerable<UnicodeCategory> CharCategories { get; private set; }
public bool HasSameCharCategories(Mask other)
{
//null checks
return CharCategories.SequenceEqual(other.CharCategories);
}
}
Use as
Mask mask1 = new Mask("ab12c3");
Mask mask2 = new Mask("ds124d");
MessageBox.Show(mask1.HasSameCharCategories(mask2).ToString());

I don't know C# syntax but here is a pseudo code:
split the strings on ''
sort the 2 arrays
join each arrays with ''
compare the 2 strings

A general-purpose solution with LINQ can be achieved quite easily. The idea is:
Sort the two strings (reordering the characters).
Compare each sorted string as a character sequence using SequenceEqual.
This scheme enables a short, graceful and configurable solution, for example:
// We will be using this in SequenceEqual
class MyComparer : IEqualityComparer<char>
{
public bool Equals(char x, char y)
{
return x.Equals(y);
}
public int GetHashCode(char obj)
{
return obj.GetHashCode();
}
}
// and then:
var s1 = "ABC0102";
var s2 = "AC201B0";
Func<char, char> orderFunction = c => c; // order by the character itself so letters get sorted too
var comparer = new MyComparer();
var result = s1.OrderBy(orderFunction).SequenceEqual(s2.OrderBy(orderFunction), comparer);
Console.WriteLine("result = " + result);
As you can see, it's all in 3 lines of code (not counting the comparer class). It's also very very easily configurable.
The code as it stands checks if s1 is a permutation of s2.
Do you want to check if s1 has the same number and kind of characters with s2, but not necessarily the same characters (e.g. "ABC" to be equal to "ABB")? No problem, change MyComparer.Equals to return char.GetUnicodeCategory(x).Equals(char.GetUnicodeCategory(y));.
By changing the values of orderFunction and comparer you can configure a multitude of other comparison options.
And finally, since I don't find it very elegant to define a MyComparer class just to enable this scenario, you can also use the technique described in this question:
Wrap a delegate in an IEqualityComparer
to define your comparer as an inline lambda. This would result in a configurable solution contained in 2-3 lines of code.

What is the C# equivalent of NaN or IsNumeric?

What is the most efficient way of testing whether an input string contains a numeric value (or conversely is Not A Number)? I guess I can use Double.Parse or a regex (see below), but I was wondering if there is some built-in way to do this, such as JavaScript's isNaN() or IsNumeric() (was that VB? I can't remember).
public static bool IsNumeric(this string value)
{
return Regex.IsMatch(value, "^\\d+$");
}

This doesn't have the regex overhead
double myNum = 0;
String testVar = "Not A Number";
if (Double.TryParse(testVar, out myNum)) {
// it is a number
} else {
// it is not a number
}
Incidentally, all of the standard data types support TryParse; Guid was the glaring exception until Guid.TryParse arrived in .NET 4.
update
secretwep brought up that the value "2345," will pass the above test as a number. However, if you need to ensure that all of the characters within the string are digits, then another approach should be taken.
example 1:
public Boolean IsNumber(String s) {
Boolean value = true;
foreach(Char c in s.ToCharArray()) {
value = value && Char.IsDigit(c);
}
return value;
}
or if you want to be a little more fancy
public Boolean IsNumber(String value) {
return value.All(Char.IsDigit);
}
update 2 (from @stackonfire, to deal with null or empty strings)
public Boolean IsNumber(String s) {
Boolean value = true;
if (s == String.Empty || s == null) {
value=false;
} else {
foreach(Char c in s.ToCharArray()) {
value = value && Char.IsDigit(c);
}
}
return value;
}

I prefer something like this, it lets you decide what NumberStyle to test for.
public static Boolean IsNumeric(String input, NumberStyles numberStyle) {
Double temp;
Boolean result = Double.TryParse(input, numberStyle, CultureInfo.CurrentCulture, out temp);
return result;
}
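A short sketch of how the style argument changes the outcome. This variant takes the culture as a parameter instead of using CultureInfo.CurrentCulture, purely so the behavior does not depend on the machine's regional settings; the class name is made up:

```csharp
using System;
using System.Globalization;

public static class NumericStyles
{
    // Same idea as the method above, with the culture passed in explicitly.
    public static bool IsNumeric(string input, NumberStyles style, IFormatProvider culture)
    {
        double temp;
        return double.TryParse(input, style, culture, out temp);
    }
}
```

With CultureInfo.InvariantCulture, NumberStyles.Integer accepts "2345" but rejects both "2345," and "-12.5", because that style allows neither thousands separators nor a decimal point. NumberStyles.Float accepts "-12.5".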

In addition to the previous correct answers it is probably worth pointing out that "Not a Number" (NaN) in its general usage is not equivalent to a string that cannot be evaluated as a numeric value. NaN is usually understood as a numeric value used to represent the result of an "impossible" calculation, where the result is undefined. In this respect I would say the JavaScript usage is slightly misleading. In C#, NaN is defined as a constant on the Single and Double numeric types and represents the result of undefined operations such as dividing zero by zero. Other languages use it to represent different "impossible" values.
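The distinction can be seen in a few lines. This is only an illustration with made-up names: a floating-point 0/0 produces NaN, whereas a string that fails to parse is reported through TryParse's bool result and leaves the out value at 0:

```csharp
using System;
using System.Globalization;

public static class NaNDemo
{
    // Dividing zero by zero in floating point yields NaN instead of throwing.
    public static double ZeroOverZero()
    {
        double zero = 0.0;
        return zero / zero;
    }

    // An unparsable string gives false; the out value is then 0, not NaN.
    public static bool TryParseInvariant(string text, out double value)
    {
        return double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }
}
```

Because NaN compares unequal to everything, including itself, test for it with double.IsNaN(x) and not with x == double.NaN.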

I know this has been answered in many different ways, with extensions and lambda examples, but here is a combination of both for the simplest solution.
public static bool IsNumeric(this String s)
{
return s.All(Char.IsDigit);
}
or if you are using Visual Studio 2015 (C# 6.0 or greater) then
public static bool IsNumeric(this String s) => s.All(Char.IsDigit);
Awesome C#6 on one line. Of course this is limited because it just tests for only numeric characters.
To use, just have a string and call the method on it, such as:
bool IsaNumber = "123456".IsNumeric();

Yeah, IsNumeric is VB. Usually people use the TryParse() method, though it is a bit clunky. As you suggested, you can always write your own.
int i;
if (int.TryParse(input, out i)) // input is the string being tested
{
}

I like the extension method, but don't like throwing exceptions if possible.
I opted for an extension method taking the best of 2 answers here.
/// <summary>
/// Extension method that works out if a string is numeric or not
/// </summary>
/// <param name="str">string that may be a number</param>
/// <returns>true if numeric, false if not</returns>
public static bool IsNumeric(this String str)
{
double myNum = 0;
if (Double.TryParse(str, out myNum))
{
return true;
}
return false;
}

You can still use the Visual Basic function in C#. The only thing you have to do is just follow my instructions shown below:
Add the reference to the Visual Basic Library by right clicking on your project and selecting "Add Reference":
Then import it in your class as shown below:
using Microsoft.VisualBasic;
Next use it wherever you want as shown below:
if (!Information.IsNumeric(softwareVersion))
{
throw new DataException(string.Format("[{0}] is an invalid App Version! Only numeric values are supported at this time.", softwareVersion));
}
Hope, this helps and good luck!

VB has the IsNumeric function. You could reference Microsoft.VisualBasic.dll and use it.

Simple extension:
public static bool IsNumeric(this String str)
{
try
{
Double.Parse(str.ToString());
return true;
}
catch {
}
return false;
}

public static bool IsNumeric(string anyString)
{
if (anyString == null)
{
anyString = "";
}
if (anyString.Length > 0)
{
double dummyOut = new double();
System.Globalization.CultureInfo cultureInfo = new System.Globalization.CultureInfo("en-US", true);
return Double.TryParse(anyString, System.Globalization.NumberStyles.Any, cultureInfo.NumberFormat, out dummyOut);
}
else
{
return false;
}
}

Maybe this is a C# 3 feature, but you could use double.NaN.

Actually, Double.NaN is supported in all .NET versions 2.0 and greater.

I was using Chris Lively's snippet (selected answer) encapsulated in a bool function like Gishu's suggestion for a year or two. I used it to make sure certain query strings were only numeric before proceeding with further processing. I started getting some errant querystrings that the marked answer was not handling, specifically, whenever a comma was passed after a number like "3645," (returned true). This is the resulting mod:
static public bool IsNumeric(string s)
{
double myNum = 0;
if (Double.TryParse(s, out myNum))
{
if (s.Contains(",")) return false;
return true;
}
else
{
return false;
}
}

This is a modified version of the solution proposed by Mr Siir. I find that adding an extension method is the best solution for reuse and simplicity in the calling method.
public static bool IsNumeric(this String s)
{
try { double.Parse(s); return true; }
catch (Exception) { return false; }
}
I modified the method body to fit on 2 lines and removed the unnecessary .ToString() implementation. For those not familiar with extension methods here is how to implement:
Create a class file called ExtensionMethods. Paste in this code:
using System;
using System.Collections.Generic;
using System.Text;
namespace YourNameSpaceHere
{
public static class ExtensionMethods
{
public static bool IsNumeric(this String s)
{
try { double.Parse(s); return true; }
catch (Exception) { return false; }
}
}
}
Replace YourNameSpaceHere with your actual NameSpace. Save changes. Now you can use the extension method anywhere in your app:
bool validInput = stringVariable.IsNumeric();
Note: this method will return true for integers and decimals, but will return false if the string contains a comma. If you want to accept input with commas or symbols like "$" I would suggest implementing a method to remove those characters first then test if IsNumeric.

I have a slightly different version which returns the number. I would guess that in most cases after testing the string you would want to use the number.
public bool IsNumeric(string numericString, out Double numericValue)
{
if (Double.TryParse(numericString, out numericValue))
return true;
else
return false;
}

If you don't want the overhead of adding the Microsoft.VisualBasic library just for isNumeric, here's the code reverse engineered:
public bool IsNumeric(string s)
{
if (s == null) return false;
int state = 0; // state 0 = before number, state 1 = during number, state 2 = after number
bool hasdigits = false;
bool hasdollar = false;
bool hasperiod = false;
bool hasplusminus = false;
bool hasparens = false;
bool inparens = false;
for (var i = 0; i <= s.Length - 1; i++)
{
switch (s[i])
{
case char n when (n >= '0' && n <= '9'):
if (state == 2) return false; // no more numbers at the end (i.e. "1 2" is not valid)
if (state == 0) state = 1; // begin number state
hasdigits = true;
break;
case '-':
case '+':
// a plus/minus is allowed almost anywhere, but only one, and you cannot combine it with parenthesis
if (hasplusminus || hasparens) return false;
if (state == 1) state = 2; // exit number state (i.e. "1-" is valid but 1-1 is not)
hasplusminus = true;
break;
case ' ':
case '\t':
case '\r':
case '\n':
// don't allow any spaces after parenthesis/plus/minus, unless there's a $
if (state == 0 && (hasparens || (hasplusminus && !hasdollar))) return false;
if (state == 1) state = 2; // exit number state
break;
case ',':
// do not allow commas unless in the middle of the number, and not after a decimal
if (state != 1 || hasperiod) return false;
break;
case '.':
// only allow one period in the number
if (hasperiod || state == 2) return false;
if (state == 0) state = 1; // begin number state
hasperiod = true;
break;
case '$':
// dollar symbol allowed anywhere, but only one
if (hasdollar) return false;
if (state == 1) state = 2; // exit number state (i.e. "1$" is valid but "1$1" is not)
hasdollar = true;
break;
case '(':
// only allow one parens at the beginning, and cannot combine with plus/minus
if (state != 0 || hasparens || hasplusminus) return false;
hasparens = true;
inparens = true;
break;
case ')':
if (state == 1 && inparens) state = 2; // exit number state
if (state != 2 || !inparens) return false; // invalid end parens
inparens = false; // end parens mode
break;
default:
// oh oh, we hit a bad character
return false;
}
}
// must have at least one digit, and cannot have imbalanced parentheses
if (!hasdigits || inparens) return false;
// if we got all the way to here...
return true;
}
