Regex to find numbers in a string - C#

I have the following string format:
session=11;reserID=1000001
How do I get a string array of the numbers?
My code:
var value = "session=11;reserID=1000001";
var numbers = Regex.Split(value, @"^\d+");

You probably were on the right track but forgot the character class:
Regex.Split(value, @"[^\d]+");
You can also write it shorter by using \D+ which is equivalent.
However, you'd get an empty element at the start of the returned array, so be careful when consuming the result. Sadly, Regex.Split() doesn't have an option that removes empty elements (String.Split does, however). A not very pretty way of resolving that:
Regex.Replace(value, @"[^\d;]", "").Split(';');
based on the assumption that the semicolon is actually the relevant piece where you want to split.
Quick PowerShell test:
PS> 'session=11;reserID=1000001' -replace '[^\d;]+' -split ';'
11
1000001
Another option would be to just skip the element:
Regex.Split(...).Skip(1).ToArray();
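A minimal sketch of the empty-element issue and one way around it, assuming the input always looks like the sample string. SplitDemo and SplitNumbers are made-up names for illustration; Regex.Split and LINQ's Where are the only real APIs involved.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: split on runs of non-digits and drop the empty
    // strings that Regex.Split leaves at the edges of the input.
    public static string[] SplitNumbers(string value)
    {
        return Regex.Split(value, @"\D+")
                    .Where(s => s.Length > 0)
                    .ToArray();
    }

    public static void Main()
    {
        var value = "session=11;reserID=1000001";

        // Raw split: the leading "session=" counts as a separator,
        // so element 0 of the result is an empty string.
        string[] raw = Regex.Split(value, @"\D+");
        Console.WriteLine(raw.Length); // 3

        Console.WriteLine(string.Join(",", SplitNumbers(value))); // 11,1000001
    }
}
```

Filtering with Where also copes with input that ends in a non-digit, which Skip(1) alone would not.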

Regex
    .Matches("session=11;reserID=1000001", @"\d+") // match every run of digits
    .Cast<Match>() // MatchCollection is a non-generic IEnumerable; cast so LINQ can be used
    .Select(m => m.Value) // for each Match, select its (string) Value
    .ToArray(); // convert to array, as per the question

.NET has a built-in feature for this that does not need Regex. Try System.Web.HttpUtility.ParseQueryString, passing it the string with the semicolons replaced by ampersands. You need to reference the System.Web assembly, but it shouldn't require a web context.
var value = "session=11;reserID=1000001";
NameValueCollection numbers =
System.Web.HttpUtility.ParseQueryString(value.Replace(";","&"));
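If referencing System.Web is not an option, a similar key/value view can be sketched with plain string.Split. This is a swapped-in alternative, not what ParseQueryString does internally, and it assumes every pair has the form name=value with no escaping. PairParser and ParsePairs are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairParser
{
    // Hypothetical helper: "a=1;b=2" -> { a: "1", b: "2" }.
    // Assumes ';' separates pairs and the first '=' separates name from value.
    // ToDictionary throws if the same name appears twice.
    public static Dictionary<string, string> ParsePairs(string value)
    {
        return value
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(pair => pair.Split(new[] { '=' }, 2))
            .Where(parts => parts.Length == 2)
            .ToDictionary(parts => parts[0], parts => parts[1]);
    }

    public static void Main()
    {
        var pairs = ParsePairs("session=11;reserID=1000001");
        Console.WriteLine(pairs["session"]); // 11
        Console.WriteLine(pairs["reserID"]); // 1000001
    }
}
```

Unlike the regex answers, this keeps the names, so you can ask for a specific value instead of relying on position.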

I will re-use my code from another question:
private void button1_Click(object sender, EventArgs e)
{
    string sauce = htm.Text; // htm = textbox
    Regex myRegex = new Regex(@"[0-9]+(?:\.[0-9]*)?", RegexOptions.Compiled);
    foreach (Match iMatch in myRegex.Matches(sauce))
    {
        txt.AppendText(Environment.NewLine + iMatch.Value); // txt = textbox
    }
}
If you want to play around with regex here is a good site: http://gskinner.com/RegExr/
They also have a desktop app: http://gskinner.com/RegExr/desktop/ - It uses adobe air so install that first.

var numbers = Regex.Split(value, @".*?(.\d+).*?");
or, to return each digit:
var numbers = Regex.Split(value, @".*?(\d).*?");

Related

How to move 12 digit numbers from richtextbox to textbox2

I want to move 12 digit numbers from richtextbox to textbox2 by a program.
I enter these words for richtextbox
sdgsjglksdjgkl,512025151988,512025151988,512025151988,512025151988,512025151988,sdgsgd
I need to get only these 12-digit numbers into textbox2.
I tried the code below, but it prints System.Text.RegularExpressions.MatchCollection instead of the digits:
private void button2_Click(object sender, EventArgs e)
{
Regex RX = new Regex("[0-9]{1,12}$");
textBox2.Text = (RX.Matches(richTextBox1.Text)).ToString();
}
I don't know how to move these numbers to textbox2. Please help me.
Split with a comma, then take all items that are of length 12 and are all digits:
var richTextBox1_Text = "sdgsjglksdjgkl,512025151988,512025151988,512025151988,512025151988,512025151988,sdgsgd";
Console.Write(
    string.Join(",",
        richTextBox1_Text.Split(',')
            .Where(m => m.Length == 12 && m.All(char.IsDigit))));
// => 512025151988,512025151988,512025151988,512025151988,512025151988
See the C# demo
In your code:
textBox2.Text = string.Join(",",
richTextBox1.Text.Split(',')
.Where(m=>m.Length==12 && m.All(char.IsDigit)));
For more complex scenarios, use a \b\d{12}\b regex like this:
textBox2.Text = string.Join("\r\n",
    Regex.Matches(richTextBox1.Text, @"\b\d{12}\b")
        .Cast<Match>()
        .Select(m => m.Value));
I have created a method that returns a collection of strings, each containing a number.
public static IEnumerable<string> SeparateNumbers(string inputText)
{
var matches = Regex.Matches(inputText, "[0-9]{12}");
foreach (Match match in matches)
{
yield return inputText.Substring(match.Index, match.Length);
}
}
You would simply use it like this. I have also added a way to comma separate them again:
string inputText = "sdgsjglksdjgkl,512025151988,512025151988,512025151988,512025151988,512025151988,sdgsgd";
var separatedNumbers = SeparateNumbers(inputText)
.ToArray();
string numbersOnly = string.Join(",", separatedNumbers);
I hope this helps.
Edit:
The reason that it gives you this: System.Text.RegularExpressions.MatchCollection is because of the default implementation of the ToString method, it simply gives you the full name of the type (including namespaces).
Also, if you want it to match any amount of numbers up to 12, simply change the regex to [0-9]{1,12} as you did initially.
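To make the ToString() point concrete, here is a small sketch that prints what the original code put in the textbox next to the joined match values. MatchCollectionDemo and JoinMatches are illustrative names, and the input is a shortened version of the sample text.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchCollectionDemo
{
    // Hypothetical helper: join the value of every match with commas.
    public static string JoinMatches(string input, string pattern)
    {
        return string.Join(",",
            Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        var input = "abc,512025151988,512025151988,xyz";
        MatchCollection matches = Regex.Matches(input, "[0-9]{12}");

        // MatchCollection does not override ToString(), so this prints the type name.
        Console.WriteLine(matches.ToString());
        // System.Text.RegularExpressions.MatchCollection

        Console.WriteLine(JoinMatches(input, "[0-9]{12}"));
        // 512025151988,512025151988
    }
}
```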
1) If you want to return all 12-digit numbers (no more, no less), your regex should be [0-9]{12}. The $ you had in your OP matches the end of a string or line, and the {1,12} in your OP matches any number of digits from 1 to 12. If the number has to be surrounded by commas or string anchors so that 13-digit numbers are not matched, your regex would look something like (?<=^|,)[0-9]{12}(?=,|$).
2) Regex.Matches(string) returns a MatchCollection. Calling ToString() on it just gives you the type name. You have to visit each item in the collection, or step through the matches one at a time, like:
Match match = regex.Match(input);
while (match.Success) {
// Your logic here
match = match.NextMatch();
}
3) I think string.Split(',') is easier to use. Then, loop through the array and return all strings that are 12 characters long and are numeric. Alternatively, you could use Linq as others have pointed out.
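The difference between the two patterns in point 1 shows up as soon as a longer number appears in the input. The sketch below uses a made-up input with one 13-digit entry; TwelveDigitDemo and Find are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TwelveDigitDemo
{
    // Hypothetical helper: return the value of every match as a string array.
    public static string[] Find(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // The second number has 13 digits and should not count.
        var input = "abc,512025151988,1234567890123,x";

        // Unanchored: also picks up the first 12 digits of the 13-digit number.
        Console.WriteLine(string.Join(" ", Find(input, "[0-9]{12}")));
        // 512025151988 123456789012

        // Anchored on commas or string boundaries: only the exact 12-digit field.
        Console.WriteLine(string.Join(" ", Find(input, @"(?<=^|,)[0-9]{12}(?=,|$)")));
        // 512025151988
    }
}
```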
You can simply use LINQ for this purpose. If you want to get just the first number (512025151988):
textBox2.Text = string.Join("",richTextBox1.Text.SkipWhile(c =>
!char.IsDigit(c)).TakeWhile(char.IsDigit));
Or if you want to get all numbers (512025151988,512025151988,512025151988,512025151988,512025151988):
textBox2.Text = string.Join(",",richTextBox1.Text.Split(',')
.Select(d => string.Join("",d
.SkipWhile(c => !char.IsDigit(c)).TakeWhile(char.IsDigit))))
.TrimStart(',').TrimEnd(',');
Replace first comma with space if you need to join results with space. string.Join(" ",...

How can I split a regex into exact words?

I need a little help regarding Regular Expressions in C#
I have the following string
"[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r"
The string could also contain spaces and, theoretically, any other characters, so I really need to match the [[words]].
What I need is a text array like this
"[[Sender.Name]]",
"[[Sender.AdditionalInfo]]",
"[[Sender.Street]]",
// ... And so on.
I'm pretty sure that this is perfectly doable with:
var stringArray = Regex.Split(line, @"\[\[+\]\]")
I'm just too stupid to find the correct Regex for the Regex.Split() call.
Anyone here that can tell me the correct Regular Expression to use in my case?
As you can tell I'm not that experienced with RegEx :)
Why don't you split on "\r"? You don't need regex for that; just use the standard string function:
string[] delimiters = { "\r" }; // a real carriage return; @"\r" would be a backslash followed by r
string[] split = line.Split(delimiters, StringSplitOptions.None);
Do matching if you want to get the [[..]] block.
Regex rgx = new Regex(@"\[\[.*?\]\]");
foreach (Match m in rgx.Matches(input))
    Console.WriteLine(m.Groups[0].Value);
The regex you are using (\[\[+\]\]) matches a literal [ followed by one or more further [s, then two literal ]s, with nothing allowed in between.
A regex solution is to match everything that is not a ] between the doubled brackets (the text inside the brackets should not be empty, I guess?) and cast the MatchCollection to a list or array (here is an example with a list):
var str = "[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r";
var rgx22 = new Regex(@"\[\[[^]]+?\]\]");
var res345 = rgx22.Matches(str).Cast<Match>().ToList();
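Since the question asks for a text array rather than a list of Match objects, here is a sketch of the same idea that projects each match to its string value. PlaceholderDemo and ExtractPlaceholders are hypothetical names; the pattern is the same negated-class approach as above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Hypothetical helper: return every [[...]] block as a string.
    // [^\]]+ requires at least one character that is not ']' inside the brackets.
    public static string[] ExtractPlaceholders(string text)
    {
        return Regex.Matches(text, @"\[\[[^\]]+\]\]")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        var str = "[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r"
                + "[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r";

        string[] placeholders = ExtractPlaceholders(str);
        Console.WriteLine(placeholders.Length); // 6
        foreach (var p in placeholders)
            Console.WriteLine(p); // [[Sender.Name]], [[Sender.AdditionalInfo]], ...
    }
}
```

Because this matches the blocks instead of splitting on separators, it does not matter whether they are separated by \r, spaces, or anything else.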

RegEx for split text on string .NET

I found this answer to my question, but it is for PHP. Is there an equivalent for .NET? I know about the Split method, but I don't understand how to keep the text outside my <#any_text#> tags, and I need a regular expression (it is a condition of the task).
For example:
string: aaa<#bbb#>aaa<#bb#>c
list: aaa
<#bbb#>
aaa
<#bb#>
c
Here is a passing test. It wasn't hard to find on the web, and you will learn more by first trying to find a solution yourself, experimenting with some code, and only then asking a question.
[TestMethod]
public void TestMethod1()
{
    string source = "aaa<#bbb#>aaa<#bb#>c";
    Regex r = new Regex("(<#.+?#>)");
    string[] result = r.Split(source);
    Assert.AreEqual(5, result.Length);
}
string input = @"aaa<#bbb#>aaa<#bb#>c";
var list = Regex.Matches(input, @"\<.+?\>|[^\<].+?[^\>]|.+?")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();
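The Split-based answer works because, in .NET, text captured by a parenthesised group in the pattern is included in the array that Regex.Split returns. Here is a sketch of that behaviour, with empty strings filtered out for inputs that start or end with a tag. TagSplitDemo and SplitKeepingTags are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagSplitDemo
{
    // Hypothetical helper: split on <#...#> tags but keep the tags themselves.
    // The capture group makes Regex.Split return the separators as well.
    public static string[] SplitKeepingTags(string source)
    {
        return Regex.Split(source, "(<#.+?#>)")
                    .Where(s => s.Length > 0)
                    .ToArray();
    }

    public static void Main()
    {
        string[] parts = SplitKeepingTags("aaa<#bbb#>aaa<#bb#>c");
        Console.WriteLine(string.Join("|", parts)); // aaa|<#bbb#>|aaa|<#bb#>|c
    }
}
```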

Regex to get the file extension

I have a list which contains file names (without their full path)
List<string> list=new List<string>();
list.Add("File1.doc");
list.Add("File2.pdf");
list.Add("File3.xls");
foreach(var item in list) {
var val=item.Split('.');
var ext=val[1];
}
I don't want to use String.Split, how will I get the extension of the file with regex?
You don't need to use regex for that. You can use Path.GetExtension method.
Returns the extension of the specified path string.
string name = "notepad.exe";
string ext = Path.GetExtension(name).Replace(".", ""); // exe
Here is a DEMO.
To get the extension using regex:
foreach (var item in list) {
var ext = Regex.Match( item, "[^.]+$" ).Value;
}
Or if you want to make sure there is a dot:
@"(?<=\.)[^.]+$"
You could use Path.GetExtension().
Example (also removes the dot):
string filename = "MyAwesomeFileName.ext";
string extension = Path.GetExtension(filename).Replace(".", "");
// extension now contains "ext"
The regex is
\.([A-Za-z0-9]+)$
Escaped period, 1 or more alpha-numeric characters, end of string
You could also use LastIndexOf("."):
int delim = fileName.LastIndexOf(".");
string ext = delim >= 0 ? fileName.Substring(delim) : string.Empty; // includes the dot; empty when there is no dot
But using the built in function is always more convenient.
For the benefit of googlers -
I was dealing with bizarre filenames e.g. FirstPart.SecondPart.xml, with the extension being unknown.
In this case, Path.GetExtension() seemed to get confused by the extra dots.
The regex I used was
\.[A-Za-z]{3,4}$
i.e. match only a dot followed by 3 or 4 letters at the end of the string. You can test it here at Regexr. Not a prize winner, but did the trick.
The obvious flaw is that if the second part were 3-4 chars and the file had no extension, it would pick that up, however I knew that was not a situation I would encounter.
"\\.[^\\.]+$" matches a . character followed by one or more non-. characters at the end of the string.
By the way the others are right, regex is overkill here.
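For comparison, here is a short sketch that runs Path.GetExtension and the lookbehind regex from above over a few made-up file names, including one with two dots and one with none. ExtensionDemo and ExtensionByRegex are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class ExtensionDemo
{
    // Hypothetical helper: extension without the dot, or "" when there is no dot.
    public static string ExtensionByRegex(string fileName)
    {
        // A failed match has an empty Value, so no null check is needed.
        return Regex.Match(fileName, @"(?<=\.)[^.]+$").Value;
    }

    public static void Main()
    {
        string[] names = { "File1.doc", "FirstPart.SecondPart.xml", "README" };
        foreach (var name in names)
        {
            // Path.GetExtension keeps the leading dot; the regex does not.
            Console.WriteLine($"{name}: '{Path.GetExtension(name)}' '{ExtensionByRegex(name)}'");
        }
        // File1.doc: '.doc' 'doc'
        // FirstPart.SecondPart.xml: '.xml' 'xml'
        // README: '' ''
    }
}
```

Both approaches take whatever follows the last dot, so a name like FirstPart.SecondPart with no real extension would still report SecondPart.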

C# Regular Expression to return only the numbers

Let's say I have the following within my source code, and I want to return only the numbers within the string:
The source is coming from a website, just FYI, and I already have it parsed out so that it comes into the program, but then I need to parse these numbers to return what I want. I'm having a doozy of a time trying to figure it out, though :(
like: 13|100|0;
How could I write this regex?
var cData = new Array(
"g;13|g;100|g;0",
"g;40|g;100|g;1.37",
"h;43|h;100|h;0",
"h;27|h;100|h;0",
"i;34|i;100|i;0",
"i;39|i;100|i;0",
);
Not sure you actually need regex here.
var str = "g;13|g;100|g;0";
str = str.Replace("g;", "");
would give you "13|100|0".
Or a slight improvement on spinon's answer:
// \- included in case numbers can be negative. Leave it out if not.
Regex.Replace("g;13|g;100|g;0", @"[^0-9|.\-]", "");
Or an option using split and join:
String.Join("|", "g;13|g;100|g;0".Split('|').Select(pipe => pipe.Split(';')[1]));
I would use something like this so you only keep numbers and separator:
Regex.Replace("g;13|g;100|g;0", "[^0-9|]", "");
Regex might be overkill in this case. Given the uniform delimiting of | and ; I would recommend String.Split(). Then you could either split again or use String.Replace() to get rid of the extra chars (i.e. g;).
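A sketch of the Split-only route that also converts the fields to numbers, assuming every field has the form letter;number and uses '.' as the decimal separator. FieldDemo and ParseFields are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class FieldDemo
{
    // Hypothetical helper: "g;40|g;100|g;1.37" -> [40, 100, 1.37].
    // Throws if a field has no ';' or the part after it is not a number.
    public static double[] ParseFields(string row)
    {
        return row.Split('|')
                  .Select(field => field.Split(';')[1])
                  .Select(text => double.Parse(text, CultureInfo.InvariantCulture))
                  .ToArray();
    }

    public static void Main()
    {
        double[] values = ParseFields("g;40|g;100|g;1.37");
        Console.WriteLine(string.Join(" ",
            values.Select(v => v.ToString(CultureInfo.InvariantCulture))));
        // 40 100 1.37
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps 1.37 from being misread on machines whose locale uses a comma as the decimal separator.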
It looks like you have a number of solutions, but I'll throw in one more where you iterate over each match and read its capture group to get the number out if you want.
Regex regexObj = new Regex(@"\w;([\d.]+)\|?");
// Match() only finds the first number, so walk every match instead.
foreach (Match matchResult in regexObj.Matches("g;13|g;100|g;0"))
{
    Group groupObj = matchResult.Groups[1];
    if (groupObj.Success)
    {
        // groupObj.Value will be the number you want
    }
}
I hope this helps.
