I have a string like this:
"8/6/08mz: Last name corrected from Paniaguato Arevalo-Paniaguaas listed on bills/MR, email Shasta, 1132644 06/24/08jh:To Concentra/froi."
I want to split it wherever a pattern like "8/6/08mz:" (a date followed by two initials and a colon) begins, so that the result is:
"8/6/08mz: Last name corrected from Paniaguato Arevalo-Paniaguaas listed on bills/MR, email Shasta, 1132644"
"06/24/08jh:To Concentra/froi."
How can I do this in C#?
You could use Regex.Split() with a regular expression.
Here is a rough pattern that matches the date-and-initials prefix:
[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{1,2}[a-z]{2}:
https://regex101.com/r/cEFbbZ/1
You can verify that the string starts with what you want, then split on the space that is preceded by 7 digits:
if (s.StartsWith("8/6/08mz: ")) {
    var ans = Regex.Split(s, @"(?<=[0-9]{7}) ");
}
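As a more general sketch, the split can be done before every date-and-initials prefix rather than after a fixed number of digits. The class and method names below (NoteSplitter, SplitNotes) are made up for illustration, and the pattern assumes two-digit years and exactly two lowercase initials, as in the sample string.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NoteSplitter
{
    // Hypothetical helper: splits before each "M/D/YYxx:" prefix.
    // The lookahead (?=...) is zero-width, so the prefix stays in the output.
    public static string[] SplitNotes(string s)
    {
        return Regex.Split(s, @"\s*(?=\b[0-9]{1,2}/[0-9]{1,2}/[0-9]{2}[a-z]{2}:)")
                    .Where(part => part.Length > 0) // zero-width matches leave empty pieces
                    .ToArray();
    }

    public static void Main()
    {
        string s = "8/6/08mz: Last name corrected, 1132644 06/24/08jh:To Concentra/froi.";
        foreach (string part in SplitNotes(s))
        {
            Console.WriteLine(part);
        }
    }
}
```

Filtering out empty strings matters here because the pattern also matches at the very start of the input, which would otherwise produce an empty first element.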
I have a string which I extract from an HTML document like this:
var elas = htmlDoc.DocumentNode.SelectSingleNode("//a[@class='a-size-small a-link-normal a-text-normal']");
if (elas != null)
{
//
_extractedString = elas.Attributes["href"].Value;
}
The HREF attribute contains this part of the string:
gp/offer-listing/B002755TC0/
I'm trying to extract the B002755TC0 value, but its length varies, so I cannot simply use C#'s Substring method with fixed positions.
Instead, I was wondering whether there is a clever way to locate the value by matching the text that comes right before it.
I know for a fact that each href has the structure shown above, so I could match this keyword:
offer-listing/
Then I would start extracting right after the keyword and take everything (B002755TC0 here) up to the next "/".
Can someone help me out with this?
This is a perfect job for a regular expression:
string text = "gp/offer-listing/B002755TC0/";
Regex pattern = new Regex(@"offer-listing/(\w+)/");
Match match = pattern.Match(text);
string whatYouAreLookingFor = match.Groups[1].Value;
Explanation: we just match the exact pattern you need:
'offer-listing/'
followed by one or more 'word characters' (letters, digits, and the underscore; note that \w does not match a hyphen),
followed by a slash.
The parentheses () mean 'capture this group' (so we can extract it later with match.Groups[1]).
EDIT: if you also want to extract from this: /dp/B01KRHBT9Q/
then you could use this pattern:
Regex pattern = new Regex(@"/(\w+)/$");
which will match both this string and the previous one. The $ stands for the end of the string, so this literally means:
capture the characters in between the last two slashes of the string
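A small sketch of the first pattern with a guard for the no-match case, since match.Groups[1].Value is just an empty string when the pattern is not found. The names HrefParser and ExtractListingId are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefParser
{
    // Hypothetical helper: returns the segment after "offer-listing/",
    // or null when the href does not contain that structure.
    public static string ExtractListingId(string href)
    {
        Match match = Regex.Match(href, @"offer-listing/(\w+)/");
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractListingId("gp/offer-listing/B002755TC0/"));
        // This href has no "offer-listing/" part, so the helper returns null.
        Console.WriteLine(ExtractListingId("/dp/B01KRHBT9Q/") ?? "(no match)");
    }
}
```

Checking match.Success makes a missing value explicit instead of letting an empty string flow into later code.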
Though there is already an accepted answer, I thought I would share another solution that does not use Regex. Find the position of your pattern in the input and add its length; the wanted text starts at that index. To find the end, search for the first "/" after the beginning of the wanted text:
string input = "gp/offer-listing/B002755TC0/";
string pat = "offer-listing/";
int begining = input.IndexOf(pat)+pat.Length;
int end = input.IndexOf("/",begining);
string result = input.Substring(begining,end-begining);
If your desired output is always the last piece, you could also use split and get the last non-empty piece:
// Requires "using System.Linq;" for Last(); the intermediate ToList() is not needed.
string result2 = input.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries)
                      .Last();
I have this text and I want to get two matches from it, but I always get only one. This is the sample code in C#:
string formattedTag = "{Tag 1}::[FORMAT] asdfa {Tag 2}::[FORMAT]";
var tagMatches = Regex.Matches(formattedTag, @"(\{.+\}\:\:\[.+\])");
I expect two matches here, "{Tag 1}::[FORMAT]" and "{Tag 2}::[FORMAT]",
but the single match I get is the entire value of formattedTag.
The problem must be in the regex pattern. Can somebody help me figure it out?
Any help is appreciated. Thanks in advance!
You need to use the following regular expression:
(\{[^}]+\}\:\:\[[^]]+\])
Inside the braces, match any character except the closing brace, and inside the square brackets, any character except the closing bracket. Otherwise the whole string is matched, because quantifiers such as .+ are greedy by default and take the longest match they can.
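To make the difference concrete, here is a minimal sketch that runs the greedy pattern and the negated-character-class pattern on the question's input. TagFinder and FindTags are illustrative names, not part of any library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagFinder
{
    // Hypothetical helper: returns every "{...}::[...]" tag in the input.
    // [^}]+ and [^\]]+ cannot run past the closing delimiter, so each tag matches separately.
    public static string[] FindTags(string input)
    {
        return Regex.Matches(input, @"\{[^}]+\}::\[[^\]]+\]")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string formattedTag = "{Tag 1}::[FORMAT] asdfa {Tag 2}::[FORMAT]";

        // Greedy version from the question: .+ stretches to the last '}' and ']'.
        Console.WriteLine(Regex.Matches(formattedTag, @"\{.+\}::\[.+\]").Count);

        foreach (string tag in FindTags(formattedTag))
        {
            Console.WriteLine(tag);
        }
    }
}
```

A lazy quantifier (.+? instead of .+) is another common way to get the same two matches on this input.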
string formattedTag = "{tag 1}::[admin] adfaf{tag 2}::[test.user]";
var tagMatches = Regex.Matches(formattedTag, @"\{(\w+\s*\d{1,2})\}::\[(.*?)\]");
foreach (Match item in tagMatches)
{
    Console.WriteLine(item.Groups[0]);
    Console.WriteLine(item.Groups[1] + "=" + item.Groups[2]);
}
In a MySQL table I store strings like this example:
name.surname@thedomain.com
I need to split this string in C# to get this output:
NAME SURNAME
I tried this solution:
string[] emails = strEMail_user.ToString().Split('.');
string newUserName = emails[0].ToUpper().ToString() + " "
+ emails[1].ToUpper().ToString();
But the output I get is wrong:
NAME SURNAME@THEDOMAIN
If your pattern is always name.surname@thedomain.com, then you can just use Split with two delimiters and take the first and second parts:
var parts = "name.surname@thedomain.com".Split('.', '@');
string name = parts[0];
string surname = parts[1];
You could do this using a combination of string.Split calls, but it would be much neater to use the Regex class.
Your regex should look something like this:
([a-zA-Z]*)\.([a-zA-Z]*)@thedomain\.com
You can then use the Regex.Match method to obtain the values from the matching groups.
Use strEMail_user.ToString().Split('@')[0] to get the first part of the email, where the name and surname are stored. Then you can split that on . to get the name and surname just like you did, if that is the pattern of your emails.
You are splitting the string name.surname@thedomain.com on the . character. That will give you the following pieces:
name
surname@thedomain
com
Remember, there is a .com at the end of your string. What you want is to split only the beginning of the email address. Try splitting the entire email address on the @ symbol first, which returns the following pieces:
name.surname
thedomain.com
Now you can split the first piece on the . character to get your name and surname.
Note that this assumes all of your email addresses have this form, and it ignores cultural differences in first name/last name placement.
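The two-step split described above could be sketched like this. EmailName and ToDisplayName are hypothetical names, and the code assumes the address really has the name.surname@domain form.

```csharp
using System;

public static class EmailName
{
    // Hypothetical helper: "name.surname@thedomain.com" -> "NAME SURNAME".
    public static string ToDisplayName(string email)
    {
        // Step 1: keep only the part before the '@'.
        string localPart = email.Split('@')[0];
        // Step 2: split that part on '.' and join the pieces with spaces.
        string[] names = localPart.Split('.');
        return string.Join(" ", names).ToUpperInvariant();
    }

    public static void Main()
    {
        Console.WriteLine(ToDisplayName("name.surname@thedomain.com"));
    }
}
```

Splitting on the @ first keeps the dots in the domain from ever reaching the second split.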
C#/.NET 4.0
I need to parse a string containing an 18-digit number. I also need the substrings to the left and right of it.
Example strings:
string a = "Frl Camp Gerbesklooster 871687120000000691 OPLDN 2010 H1";
string b = "some text with spaces 123456789012345678 more text";
How it should be parsed:
string aParsed[0] = "Frl Camp Gerbesklooster";
string aParsed[1] = "871687120000000691";
string aParsed[2] = "OPLDN 2010 H1";
string bParsed[0] = "some text with spaces";
string bParsed[1] = "123456789012345678";
string bParsed[2] = "more text";
There is always an 18-digit number in the middle of the string. I'm an absolute newbie to regex, so I don't have an attempt of my own to show.
What is the best way to do this? Should I use regular expressions?
Thanks.
You can use something like the regex: (.*)(\d{18})(.*).
The key here is to use {18} to specify that there must be exactly 18 digits and to capture each part in a group.
var parts = Regex.Matches(s, @"(.*)(\d{18})(.*)")
.Cast<Match>()
.SelectMany(m => m.Groups.Cast<Group>().Skip(1).Select(g=>g.Value))
.ToArray();
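With the pattern above, the spaces around the number end up inside the first and third groups. A variant that trims them inside the regex might look like the sketch below; NumberSplitter and SplitAroundNumber are illustrative names, and the \b anchors are an added assumption that the number is never part of a longer digit run.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumberSplitter
{
    // Hypothetical helper: returns { left text, 18-digit number, right text },
    // or an empty array when no standalone 18-digit number is found.
    public static string[] SplitAroundNumber(string s)
    {
        // (.*?) is lazy so the surrounding \s* can absorb the spaces next to the number.
        Match m = Regex.Match(s, @"^(.*?)\s*\b(\d{18})\b\s*(.*)$");
        if (!m.Success)
        {
            return new string[0];
        }
        return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value };
    }

    public static void Main()
    {
        string a = "Frl Camp Gerbesklooster 871687120000000691 OPLDN 2010 H1";
        foreach (string part in SplitAroundNumber(a))
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```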
Daniël,
Although the question has been answered, the following may be a useful reference for learning regular expressions:
http://txt2re.com
Regards,
Liam
In a text box, I keep e-mail addresses, for example:
Text_box.value="a@hotmail.com,b@hotmail.com,c@hotmail.com"
How can I split out all of the e-mail addresses? Should I use Regex?
In the end, I want to keep only the e-mail addresses that the user entered correctly.
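string[] s = Text_box.Text.Split(',');
// Verbatim string so \b stays a word boundary; IgnoreCase because the classes list only A-Z.
Regex R = new Regex(@"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", RegexOptions.IgnoreCase);
var temp = from t in s where R.IsMatch(t) select t;
List<string> final = new List<string>();
final.AddRange(temp);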
Use this:
string[] emails = list.Split(new char[]{','});
The following prints only the email addresses that match; anything that does not match is skipped.
private void Match()
{
    Regex validationExpression = new Regex(@"\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*");
    string text = "whatever@gmail;a@hotmail.com,gmail@sync,b@hotmail.com,c@hotmail.com,what,dhinesh@c";
    MatchCollection matchCollection = validationExpression.Matches(text);
    foreach (Match matchedEmailAddress in matchCollection)
    {
        Console.WriteLine(matchedEmailAddress.Value);
    }
    Console.ReadLine();
}
This will print
a@hotmail.com
b@hotmail.com
c@hotmail.com
The other entries are not matched by the regular expression.
"a@hotmail.com,b@hotmail.com,c@hotmail.com".Split(',');
There are two ways to split a string.
1) Every string object has a method called Split(), which takes an array of characters or an array of strings. The elements of this array are the separators used to split the given string.
string[] parts = Text_box.value.Split(new char[] {','});
2) Although string.Split() is enough in this example, we can achieve the same result with regular expressions, using Regex.Split():
string[] parts = Regex.Split(Text_box.value, @",");
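Splitting and validating can be combined in one pass. The sketch below is only an assumption-laden example: EmailList and ValidEmails are made-up names, and the pattern is a deliberately simple check, not a full RFC 5322 validator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EmailList
{
    // Simplified check: local part, '@', domain, and a TLD of at least two letters.
    private static readonly Regex SimpleEmail = new Regex(
        @"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: splits on commas and keeps only entries that pass the check.
    public static List<string> ValidEmails(string csv)
    {
        return csv.Split(',')
                  .Select(entry => entry.Trim())
                  .Where(entry => SimpleEmail.IsMatch(entry))
                  .ToList();
    }

    public static void Main()
    {
        string input = "a@hotmail.com, b@hotmail.com,what,dhinesh@c,c@hotmail.com";
        foreach (string email in ValidEmails(input))
        {
            Console.WriteLine(email);
        }
    }
}
```

Anchoring the pattern with ^ and $ checks each entry as a whole, which is stricter than searching for an address somewhere inside it.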
You have to use a correct regexp to find all forms of email addresses (with Latin letters).
Check Wikipedia ( http://en.wikipedia.org/wiki/Email_address ) for the correct syntax of an email address (the easier way), or RFC 5322 and RFC 5321 (much harder to understand).
I'm using this:
(?:[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*|""(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*"")@(?:(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-zA-Z0-9-]*[a-zA-Z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])