Getting the index of the first non-alpha character in a string (C#)

I'm trying to split a string at the index of the first character that is neither a letter nor whitespace.
My regex is really rusty, and I'm looking for some direction on getting this to work.
Example: the string is "Payments Received by 08/14/2015 $0.00", and I'm able to find the first digit:
string line = "Payments Received by 08/14/2015 $0.00";
string alphabet = String.Empty;
string digit = String.Empty;
int digitStartIndex;
Match regexMatch = Regex.Match(line, "\\d");
digitStartIndex = regexMatch.Index;
alphabet = line.Substring(0, digitStartIndex);
digit = line.Substring(digitStartIndex);
The problem arises with a string like "Amount This Period + $57.00":
I end up with "Amount This Period + $".
Using Regex in C#, how can I also check for specific non-alphanumeric characters such as $, + and -?
Edit: For the example I'm struggling with, I'm looking for the output (variables alphabet and digit) to be:
"Amount This Period"
"+ $57.00"

To split a string the way you describe, use a regular expression that captures the initial alpha/space characters and then the rest.
var s = "Payments Received by 08/14/2015 $0.00";
var re = new Regex("^([a-z ]+)(.+)", RegexOptions.IgnoreCase);
var m = re.Match(s);
if (m.Success)
{
Console.WriteLine(m.Groups[1]);
Console.WriteLine(m.Groups[2]);
}
The ^ is important to find characters at the start.

Ah, then you want this I think:
void Main()
{
var regex = new Regex(@"(.*?)([\$\+\-].*)");
var a = "Payments Received by 08/14/2015 $0.00";
var b = "Amount This Period + $57.00";
Console.WriteLine(regex.Match(a).Groups[1].Value);
Console.WriteLine(regex.Match(a).Groups[2].Value);
Console.WriteLine(regex.Match(b).Groups[1].Value);
Console.WriteLine(regex.Match(b).Groups[2].Value);
}
Outputs:
Payments Received by 08/14/2015
$0.00
Amount This Period
+ $57.00
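Both patterns above appear to leave the space before the second part attached to the first group. A minimal sketch of splitting at the first character that is neither a letter nor whitespace, while dropping that separating space, could look like the following. The names AlphaSplitter and Split are hypothetical, and the pattern assumes ASCII letters only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlphaSplitter
{
    // Group 1: leading letters/whitespace (lazy, so trailing spaces are left out).
    // Group 2: everything from the first non-letter, non-whitespace character on.
    private static readonly Regex SplitPattern =
        new Regex(@"^([A-Za-z\s]*?)\s*([^A-Za-z\s].*)$", RegexOptions.Singleline);

    // Returns { alphabetPart, rest }; rest is empty when the line has only letters/whitespace.
    public static string[] Split(string line)
    {
        Match m = SplitPattern.Match(line);
        if (!m.Success)
        {
            return new[] { line, string.Empty };
        }
        return new[] { m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

For "Amount This Period + $57.00" this should give "Amount This Period" and "+ $57.00", which is the output the question's edit asks for.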

Related

c#: how to match a variable string followed by a space and currency and pull the currency?

I'm trying to match the following cases and pull the number value:
"b 30.00"
"bill 30.00"
"bill 30"
"b 30"
I've tried:
var regex = new Regex("^b(?-i:ill)?$ ^$?d+(.d{2})?$", RegexOptions.IgnoreCase);
However, this doesn't seem to return a match, and I'm not sure how to pull the digit.
You haven't quite understood how to use the anchors ^ and $; read up on them.
var regex = new Regex(@"^[Bb](?:ill)? \d+(?:\.\d{2})?$");
or better, since you only need ASCII digits (and not every digit character in Unicode):
var regex = new Regex(@"^[Bb](?:ill)? [0-9]+(?:\.[0-9]{2})?$");
If you want to match a literal . you must escape it (the same goes for a literal $). Note the use of a verbatim string to avoid double backslashes.
Feel free to add capture groups around what you want to capture.
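As a rough sketch of what adding a capture group might look like, the helper below wraps the number in group 1 and parses it as a decimal. BillParser and TryGetAmount are hypothetical names, and the invariant culture is an assumption so that "." is always treated as the decimal separator.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class BillParser
{
    // Same pattern as above, with a capture group around the amount.
    private static readonly Regex BillPattern =
        new Regex(@"^[Bb](?:ill)? ([0-9]+(?:\.[0-9]{2})?)$");

    // Returns true and sets amount when input looks like "b 30" or "bill 30.00".
    public static bool TryGetAmount(string input, out decimal amount)
    {
        amount = 0m;
        Match m = BillPattern.Match(input);
        return m.Success
            && decimal.TryParse(m.Groups[1].Value, NumberStyles.AllowDecimalPoint,
                                CultureInfo.InvariantCulture, out amount);
    }
}
```

Calling BillParser.TryGetAmount("bill 30.00", out var amount) should return true with amount equal to 30.00m, while an input such as "bill 30.5" is rejected because the pattern requires exactly two decimal places.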
You didn't mention if RegEx is actually required to accomplish your goal. If RegEx is not required, and you know that your string is in a specific format, you could just split the string:
string val = "bill 30.00";
string[] split = val.Split(' ');
string name = string.Empty;
decimal currency = 0m;
if (split.Length > 1)
{
name = split[0];
decimal.TryParse(split[1], out currency);
}
new Regex(@"\b\d+(\.\d{2})?") should give you what you want.
Just try this code:
string Value = "bill 30.00";
string resultString = Regex.Match(Value, @"\d+").Value; // "30" (integer part only)

Replace a part of string containing Password

Slightly similar to this question, I want to replace argv contents:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
to this:
"-help=none\n-URL=(default)\n-password=********\n-uname=Khanna\n-p=100"
I have tried very basic string find and search operations (using IndexOf, Substring, etc.). I am looking for a more elegant solution to replace this part of the string:
-password=AnyPassword
to:
-password=*******
while keeping the rest of the string intact. I am wondering whether String.Replace or Regex.Replace may help.
What I've tried (without much error checking):
var pwd_index = argv.IndexOf("-password=");
string converted;
if (pwd_index >= 0)
{
var leftPart = argv.Substring(0, pwd_index);
var pwdStr = argv.Substring(pwd_index);
var rightPart = pwdStr.Substring(pwdStr.IndexOf("\n") + 1);
converted = leftPart + "-password=********\n" + rightPart;
}
else
converted = argv;
Console.WriteLine(converted);
Solution
Similar to Rubens Farias' solution but a little bit more elegant:
string argv = "-help=none\n-URL=(default)\n-password=\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, @"(password=)[^\n]*", "$1********");
It matches password= literally, stores it in capture group $1 and then keeps matching until a \n is reached.
This yields a constant number of *'s, though. But revealing how many characters a password has might already convey too much information to hackers anyway.
Working example: https://dotnetfiddle.net/xOFCyG
Regular expression breakdown
( // Store the following match in capture group $1.
password= // Match "password=" literally.
)
[ // Match one from a set of characters.
^ // Negate a set of characters (i.e., match anything not
// contained in the following set).
\n // The character set: consists only of the new line character.
]
* // Match the preceding character set zero or more times.
This code replaces the password value by several "*" characters:
string argv = "-help=none\n-URL=(default)\n-password=look\n-uname=Khanna\n-p=100";
string result = Regex.Replace(argv, @"(password=)([\s\S]*?\n)",
match => match.Groups[1].Value + new String('*', match.Groups[2].Value.Length - 1) + "\n");
You can also remove the new String() part and replace it with a string constant.
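One possible refinement, sketched below under the assumption that options are separated by "\n" and each starts at the beginning of a line: anchoring the pattern with ^ in multiline mode keeps it from touching other options whose names merely end in "password=". PasswordMasker and Mask are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PasswordMasker
{
    // With RegexOptions.Multiline, ^ matches at the start of every line,
    // so only an option that begins with "-password=" is rewritten.
    private static readonly Regex PasswordOption =
        new Regex(@"^(-password=)[^\n]*", RegexOptions.Multiline);

    // Replaces the password value with a fixed-length mask.
    public static string Mask(string argv)
    {
        return PasswordOption.Replace(argv, "$1********");
    }
}
```

Because [^\n]* may match zero characters, this also masks an empty password and a password option that is the last line of the string.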

Regex to extract specific numbers in a String

string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
I need to extract the number 12345 alone and I don't need the numbers 9677125 and 35478.
What regex can I use?
Here is the regex for extracting 5 digit number in the beginning of the string:
^(\d{5})&
If length is arbitrary:
^(\d+)&
If termination pattern is not always &:
^(\d+)[^\d]
Based on Sayse's comment, you can simply rewrite this as:
^(\d+)
And if the number is terminated by some other number (for instance 999), then:
^(\d+)999
You don't need regex if you only want to extract the first number:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
int first = Int32.Parse(String.Join("", temp.TakeWhile(c => Char.IsDigit(c))));
Console.WriteLine(first); // 12345
If the number you want is always at the beginning of the string and terminated by an ampersand (&) you don't need a regex at all. Just split the string on the ampersand and get the first element of the resulting array:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var splitArray = temp.Split('&');
var number = splitArray[0]; // returns "12345"
Alternatively, you can get the index of the ampersand and substring up to that point:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var ampersandIndex = temp.IndexOf("&");
var number = temp.Substring(0, ampersandIndex); // returns "12345"
From what you have given us, this is fairly simple:
var regex = new Regex(@"^(?<number>\d+)&");
var match = regex.Match("12345&refere?X=Assess9677125?Folder_id=35478");
if (match.Success)
{
var number = int.Parse(match.Groups["number"].Value);
}
Edit: Of course you can replace the argument of new Regex with any of the combinations Giorgi has given.

RegEx - Find and Replace while ignoring a number in the middle?

In the middle of a long string, I am looking for "No. 1234. "
The number (1234) in my example above can be any length whole number. It also has to match on the space at the end.
So I am looking for examples:
1) This is a test No. 42. Hello Nice People
2) I have no idea wtf No. 1234412344124. I am doing.
I have figured out a way to match on this pattern with the following regex:
(No. [\d]{1,}. )
What I cannot figure out, though, is how to do one simple thing when finding a match: Replace that last period with a darn comma!
So, with the two examples up above, I want to transform them into:
1) This is a test No. 42, Hello Nice People
2) I have no idea wtf No. 1234412344124, I am doing.
(Notice the commas now after the numbers)
How might one do this in C# and RegEx? Thank you!
EDIT:
Another way of looking at this is...
I can do this easily and have for years:
str = Replace(str, "Find this", "Replace it with this")
However, how can I do that by combining regex and the unknown portion of the string in the middle to replace the last period (not to be confused with the last character since the last character still needs to be a space)
This is a test No. 42. Hello Nice People
This is a test No. (some unknown length number). Hello Nice People
becomes
This is a test No. 42, Hello Nice People
This is a test No. (some unknown length number), Hello Nice People
(Notice the comma)
So you are essentially trying to match two adjacent groups, "\d+" and ". " then replace the second with ", ".
var r = new Regex(@"(\d+)(\. )");
var input = "This is a test No. 42. Hello Nice People";
var output = r.Replace(input, "$1, ");
Use parentheses to capture two groups; in the replacement, keep the first group ($1) and substitute ", " for the second.
Edit: derp, escape that period.
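The pattern above changes any "digits, period, space" sequence in the input. If only numbers introduced by "No. " should be affected, a variant along these lines might be closer to what the question describes. NumberPeriodFixer and Fix are hypothetical names, and the sketch assumes exactly one space between "No." and the number, as in the question's examples.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberPeriodFixer
{
    // Group 1 captures "No. <digits>"; the period and space after it are matched
    // but not captured, so they can be swapped for ", " in the replacement.
    private static readonly Regex NumberedItem = new Regex(@"(No\. \d+)\. ");

    public static string Fix(string input)
    {
        return NumberedItem.Replace(input, "$1, ");
    }
}
```

With this, "This is a test No. 42. Hello Nice People" should become "This is a test No. 42, Hello Nice People", while a sentence such as "Step 3. Done" is left alone.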
Edit - #1:
neilh's way is much better!
OK, I know the code looks ugly; I don't know how to edit the last char of a match directly in a regex.
string[] stringhe = new string[5] {
"This is a test No. 42, Hello Nice People",
"I have no idea wtf No. 1234412344124. I am doing.",
"Very long No. 74385748957348957893458934; Hello World",
"Nope No. 48394839!!!",
"Nope"
};
Regex reg = new Regex(@"No\.\s*([0-9]+)");
Match match;
int idx = 0;
StringBuilder builder;
foreach(string stringa in stringhe)
{
match = reg.Match(stringa);
if (match.Success)
{
Console.WriteLine("No. Stringa #" + idx + ": " + stringhe[idx]);
int indexEnd = match.Groups[1].Index + match.Groups[1].Length;
builder = new StringBuilder(stringa);
builder[indexEnd] = ',';
stringhe[idx] = builder.ToString();
Console.WriteLine("New String: " + stringhe[idx]);
}
++idx;
}
Console.ReadKey(true);
If you want to edit the char after the number only if it's a '.':
int indexEnd = match.Groups[1].Index + match.Groups[1].Length;
if (stringa[indexEnd] == '.')
{
builder = new StringBuilder(stringa);
builder[indexEnd] = ',';
stringhe[idx] = builder.ToString();
Console.WriteLine("New String: " + stringhe[idx]);
}
Or (better anyway), we can edit the Regex so it matches only when the number is followed by a period:
No\.\s*([0-9]+)\.
I'm not the best at Regex, but this should do what you want.
No\.\s+([0-9]+)
If you expect zero or more whitespace characters between "No." and the number, this Regex should do the work:
No\.\s*([0-9]+)
An example of how the C# code can look:
string[] stringhe = new string[4] {
"This is a test No. 42, Hello Nice People",
"I have no idea wtf No. 1234412344124. I am doing.",
"Very long No. 74385748957348957893458934; Hello World",
"Nope No. 48394839!!!"
};
Regex reg = new Regex(@"No\.\s+([0-9]+)");
Match match;
int idx = 0;
foreach(string stringa in stringhe)
{
match = reg.Match(stringa);
if (match.Success)
{
Console.WriteLine("No. Stringa #" + idx + ": " + match.Groups[1].Value);
}
++idx;
}
Here is the code:
private string Format(string input)
{
Match m = new Regex(@"No\. [0-9]+\.").Match(input);
if (!m.Success) return input; // nothing to replace
int targetIndex = m.Index + m.Length - 1; // index of the trailing period
return input.Remove(targetIndex, 1).Insert(targetIndex, ",");
}

regex to strip number from var in string

I have a long string and I have a var inside it
var abc = '123456'
Now I wish to get the 123456 from it.
I have tried a regex, but it's not working properly:
Regex regex = new Regex("(?<abc>+)=(?<var>+)");
Match m = regex.Match(body);
if (m.Success)
{
string key = m.Groups["var"].Value;
}
How can I get the number from the var abc?
Thanks for your help and time
var body = @" fsd fsda f var abc = '123456' fsda fasd f";
Regex regex = new Regex(@"var (?<name>\w*) = '(?<number>\d*)'");
Match m = regex.Match(body);
Console.WriteLine("name: " + m.Groups["name"]);
Console.WriteLine("number: " + m.Groups["number"]);
prints:
name: abc
number: 123456
Your regex is not correct:
(?<abc>+)=(?<var>+)
The + signs are quantifiers, meaning that the preceding element is repeated at least once. Here there is no preceding element, since (?< ... > ... ) is a named capture group and is not an element to repeat in itself.
You perhaps meant:
(?<abc>.+)=(?<var>.+)
And a better regex might be:
(?<abc>[^=]+)=\s*'(?<var>[^']+)'
[^=]+ will match any character except an equal sign.
\s* means any number of space characters (will also match tabs, newlines and form feeds though)
[^']+ will match any character except a single quote.
To specifically match the variable abc, you then put it like this:
(?<abc>abc)\s*=\s*'(?<var>[^']+)'
(I added some more allowances for spaces)
From the example you provided, the number can be extracted like this:
Console.WriteLine(
Regex.Match("var abc = '123456'", @"(?<var>\d+)").Groups["var"].Value); // 123456
\d+ means 1 or more numbers (digits).
But I surmise your data doesn't look like your example.
Try this:
var body = @"my word 1, my word 2, my word var abc = '123456' 3, my word x";
Regex regex = new Regex(@"(?<=var \w+ = ')\d+");
Match m = regex.Match(body);
Console.WriteLine(m.Value); // 123456
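If the variable name is known up front, the ideas from the answers above can be folded into one small helper. This is only a sketch: VarExtractor and GetQuotedValue are hypothetical names, and it assumes the value is always wrapped in single quotes as in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VarExtractor
{
    // Finds "var <name> = '<value>'" in body and returns <value>, or null if absent.
    // Regex.Escape guards against regex metacharacters in the variable name.
    public static string GetQuotedValue(string body, string name)
    {
        string pattern = @"\bvar\s+" + Regex.Escape(name) + @"\s*=\s*'(?<value>[^']*)'";
        Match m = Regex.Match(body, pattern);
        return m.Success ? m.Groups["value"].Value : null;
    }
}
```

For the body used above, VarExtractor.GetQuotedValue(body, "abc") should return "123456"; asking for a name that does not occur returns null.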
