Extracting string between two characters? - c#

I want to extract the email IDs between < and >.
For example:
input string : "abc" <abc#gmail.com>; "pqr" <pqr#gmail.com>;
output string : abc#gmail.com;pqr#gmail.com

Without regex, you can use this:
public static string GetStringBetweenCharacters(string input, char charFrom, char charTo)
{
int posFrom = input.IndexOf(charFrom);
if (posFrom != -1) //if found char
{
int posTo = input.IndexOf(charTo, posFrom + 1);
if (posTo != -1) //if found char
{
return input.Substring(posFrom + 1, posTo - posFrom - 1);
}
}
return string.Empty;
}
And then:
GetStringBetweenCharacters("\"abc\" <abc#gmail.com>;", '<', '>')
you will get
abc#gmail.com

string input = @"""abc"" <abc#gmail.com>; ""pqr"" <pqr#gmail.com>;";
var output = String.Join(";", Regex.Matches(input, @"\<(.+?)\>")
.Cast<Match>()
.Select(m => m.Groups[1].Value));

Tested
string input = "\"abc\" <abc#gmail.com>; \"pqr\" <pqr#gmail.com>;";
string matchedValuesConcatenated = string.Join(";",
Regex.Matches(input, @"(?<=<)([^>]+)(?=>)")
.Cast<Match>()
.Select(m => m.Value));
(?<=<) is a non-capturing lookbehind, so < must precede the match but is not included in the output.
The capturing group matches anything other than >, one or more times.
(?=>) is the corresponding lookahead for the closing >.
You can also use non-capturing groups, @"(?:<)([^>]+)(?:>)", but then the brackets are part of m.Value, so read m.Groups[1].Value instead.
The answer from LB (+1) is also correct. I just did not realize it was correct until I wrote an answer myself.

Use the String.IndexOf(char, int) method to search for < starting at a given index in the string (that is, the index just after the last > you found at the end of the previous e-mail address, or 0 when looking for the first address).
Write a loop that repeats for as long as you find another < character, and every time you find one, look for the next > character. Then use the String.Substring(int, int) method to extract the e-mail address, whose start and end positions are now known.
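The loop described above could be sketched roughly as follows. The class and method names (AngleBracketExtractor, ExtractAll) are made up for illustration and do not come from the answer; the sketch also assumes that an unmatched < should simply end the search.

```csharp
using System;
using System.Collections.Generic;

public static class AngleBracketExtractor
{
    // Hypothetical helper: collects every substring found between '<' and '>'.
    public static List<string> ExtractAll(string input)
    {
        var results = new List<string>();
        int searchFrom = 0; // 0 for the first address, then just past the previous '>'

        while (searchFrom < input.Length)
        {
            int open = input.IndexOf('<', searchFrom);
            if (open == -1) break;      // no further '<' in the string

            int close = input.IndexOf('>', open + 1);
            if (close == -1) break;     // unmatched '<' (assumption: stop here)

            // Second argument of Substring is a length, not an end index.
            results.Add(input.Substring(open + 1, close - open - 1));
            searchFrom = close + 1;
        }
        return results;
    }

    public static void Main()
    {
        string input = "\"abc\" <abc#gmail.com>; \"pqr\" <pqr#gmail.com>;";
        Console.WriteLine(string.Join(";", ExtractAll(input)));
    }
}
```

For the input from the question this prints abc#gmail.com;pqr#gmail.com.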

Could use the following regex and some linq.
var regex = new Regex(@"\<(.*?)\>");
var input= @"""abc"" <abc#gmail.com>; ""pqr"" <pqr#gmail.com>";
var matches = regex.Matches(input);
var res = string.Join(";", matches.Cast<Match>().Select(x => x.Value.Replace("<","").Replace(">","")).ToArray());
The < and > brackets are removed afterwards; you could also build that into the regex itself with a capturing group.

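string str = "\"abc\" <abc#gmail.com>; \"pqr\" <pqr#gmail.com>;";
string output = string.Empty;
int start;
while ((start = str.IndexOf('<')) != -1)
{
int end = str.IndexOf('>', start + 1);
if (end == -1) break; // unmatched '<'
if (output.Length > 0) output += ";"; // separate the addresses
// Substring takes a length, so subtract the start position
output += str.Substring(start + 1, end - start - 1);
str = str.Substring(end + 1); // continue after the '>'
}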

Related

How can I get the substring after a specific word and up to another word in a string

Say I have the string "Old Macdonald had a farm and on".
I want to get the substring "had a farm".
I want to get everything after the word "Macdonald" and up to the word "farm".
So the constants in the string are:
"Macdonald" - which I don't want included in the substring
"farm" - which I want included as the end word of the substring
I've been trying to use IndexOf and similar functions but can't seem to get it to work.

You could use RegEx with (?<=Macdonald\s).*(?=\sand)
Explanation
Positive Lookbehind (?<=Macdonald\s)
Macdonald matches the characters Macdonald literally
\s matches any whitespace character
.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Positive Lookahead (?=\sand)
\s matches any whitespace character
and matches the characters "and" literally
Example
var input = "Old Macdonald had a farm and on";
var regex = new Regex(@"(?<=Macdonald\s).*(?=\sand)", RegexOptions.Compiled | RegexOptions.IgnoreCase);
var match = regex.Match(input);
if (match.Success)
{
Console.WriteLine(match.Value);
}
else
{
Console.WriteLine("No farms for you");
}
Output
had a farm

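You can use IndexOf() and Substring() as follows:
string input = "Old Macdonald had a farm and on";
string first = "Macdonald";
string second = "farm";
int pos = input.IndexOf(first);
if (pos > -1)
{
// Start after "Macdonald" so that it is not part of the result
string s2 = input.Substring(pos + first.Length);
int pos2 = s2.IndexOf(second);
if (pos2 > -1)
{
// Keep everything up to and including "farm"; Trim drops the leading space
Console.WriteLine("Substring: {0}", s2.Substring(0, pos2 + second.Length).Trim());
}
}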

As I mentioned in my comment, I would recommend using Regex (the way TheGeneral suggested). But there is another way to do it.
I'm adding it as a workaround:
string input = "Old Macdonald had a farm and on";
List<string> words = input.Split(" ".ToCharArray()).ToList();
string finalString = "";
int indexOfMac = words.IndexOf("Macdonald");
int indexOfFarm = words.IndexOf("farm");
if (indexOfFarm != -1 && indexOfMac != -1 && //if word is not there in string, index will be '-1'
indexOfMac < indexOfFarm) //checking if 'macdonald' comes before 'farm' or not
{
//looping from Macdonald + 1 to farm, and make final string
for(int i = indexOfMac + 1; i <= indexOfFarm; i++)
{
finalString += words[i] + " ";
}
}
else
{
finalString = "No farms for you";
}
Console.WriteLine(finalString);

And of course Linq can't be excluded from the list of solutions:
string SearchString = "Old Macdonald had a farm and on";
string fromPattern = "Macdonald";
string toPattern = "farm";
string result = string.Join(" ", SearchString
.Split((char)32)
.SkipWhile(s => s != fromPattern)
.Skip(1)
.TakeWhile(s => s != toPattern)
.Append(toPattern));

Remove last occurrence of a string in a string

I have a string that is of nature
RTT(50)
RTT(A)(50)
RTT(A)(B)(C)(50)
What I want is to remove the last () occurrence from the string. That is, if the string is RTT(50), I want only RTT returned. If it is RTT(A)(50), I want RTT(A) returned, and so on.
How do I achieve this? I currently use a substring method that takes out every () occurrence regardless. I thought of using:
Regex.Matches(node.Text, "( )").Count
To count the number of occurrences so I did something like below.
if(Regex.Matches(node.Text, "( )").Count > 1)
//value = node.Text.Remove(Regex.//Substring(1, node.Text.IndexOf(" ("));
else
value = node.Text.Substring(0, node.Text.IndexOf(" ("));
The else part will do what I want. However, how to remove the last occurrence in the if part is where I am stuck.

The String.LastIndexOf method does what you need - returns the last index of a char or string.
If you're sure that every string will have at least one set of parentheses:
var result = node.Text.Substring(0, node.Text.LastIndexOf("("));
Otherwise, you could test the result of LastIndexOf:
var lastParenSet = node.Text.LastIndexOf("(");
var result =
node.Text.Substring(0, lastParenSet > -1 ? lastParenSet : node.Text.Length);

This should do what you want :
your_string = your_string.Remove(your_string.LastIndexOf(string_to_remove));
It's that simple.

There are a couple of different options to consider.
LastIndexOf
Get the last index of the ( character and take the substring up to that index. The downside of this approach is an additional last index check for ) would be needed to ensure that the format is correct and that it's a pair with the closing parenthesis occurring after the opening parenthesis (I did not perform this check in the code below).
var index = input.LastIndexOf('(');
if (index >= 0)
{
var result = input.Substring(0, index);
Console.WriteLine(result);
}
Regex with RegexOptions.RightToLeft
By using RegexOptions.RightToLeft we can grab the last index of a pair of parentheses.
var pattern = @"\(.+?\)";
var match = Regex.Match(input, pattern, RegexOptions.RightToLeft);
if (match.Success)
{
var result = input.Substring(0, match.Index);
Console.WriteLine(result);
}
else
{
Console.WriteLine(input);
}
Regex depending on numeric format
If you're always expecting the final parentheses to have numeric content, similar to your example values where (50) is getting removed, we can use a pattern that matches any numbers inside parentheses.
var patternNumeric = @"\(\d+\)";
var result = Regex.Replace(input, patternNumeric, "");
Console.WriteLine(result);
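One caveat with the numeric pattern: without an anchor, Regex.Replace removes every numeric group, so "RTT(1)(50)" would become "RTT". A minimal sketch of an anchored variant follows; the helper name RemoveTrailingNumericGroup is an assumption for illustration, not part of the answer, and it assumes the group to remove always sits at the very end of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingGroupRemover
{
    // Hypothetical helper: strips a numeric "(...)" group only when it ends the string.
    public static string RemoveTrailingNumericGroup(string input)
    {
        // '$' anchors the match to the end, so earlier groups such as "(1)" are kept.
        return Regex.Replace(input, @"\(\d+\)$", "");
    }

    public static void Main()
    {
        foreach (var s in new[] { "RTT(50)", "RTT(A)(50)", "RTT(1)(50)" })
        {
            Console.WriteLine(RemoveTrailingNumericGroup(s));
        }
    }
}
```

This prints RTT, RTT(A) and RTT(1), one per line.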

It's very simple. You can do it like this:
string a = "RTT(50)";
string res = a.Substring(0, a.LastIndexOf("("));

As an extension method:
namespace CustomExtensions
{
public static class StringExtension
{
public static string ReplaceLastOf(this string str, string fromStr, string toStr)
{
int lastIndexOf = str.LastIndexOf(fromStr);
if (lastIndexOf < 0)
return str;
string leading = str.Substring(0, lastIndexOf);
int charsToEnd = str.Length - (lastIndexOf + fromStr.Length);
string trailing = str.Substring(lastIndexOf+fromStr.Length, charsToEnd);
return leading + toStr + trailing;
}
}
}
Use:
string myFavColor = "My favourite color is blue";
string newFavColor = myFavColor.ReplaceLastOf("blue", "red");

Try a function like this:
public static string ReplaceLastOccurrence(string source, string find, string replace)
{
int place = source.LastIndexOf(find);
if (place == -1)
return source; // nothing to replace
return source.Remove(place, find.Length).Insert(place, replace);
}
It replaces the last occurrence of one string with another. Use it like this:
string result = ReplaceLastOccurrence(value, "(", string.Empty);
In this case it finds the last ( in value and replaces it with string.Empty. It can also be used to substitute any other text.

Substring or split word in clamp

I need a solution for my problem. I have a string like:
Hello guys I am cool (test)
And now I need an effective method to extract only the part in the parentheses; the result should be:
test
My attempt was to split the string into words, as below, but I don't think that is the best way.
string[] words = s.Split(' ');

I do not think that split is the solution to your problem
Regex is very good for extracting data.
using System.Text.RegularExpressions;
...
string result = Regex.Match(s, @"\((.*?)\)").Groups[1].Value;
This should do the trick.

Assuming:
var input = "Hello guys I am cool (test)";
..Non-Regex version:
var nonRegex = input.Substring(input.IndexOf('(') + 1, input.LastIndexOf(')') - (input.IndexOf('(') + 1));
..Regex version:
var regex = Regex.Match(input, @"\((\w+)\)").Groups[1].Value;

You can use regex for this:
string parenthesized = Regex.Match(s, @"(?<=\()[^)]+(?=\))").Value;
Here's an explanation of the various parts of the regex pattern:
(?<=\(): Lookbehind for the ( (excluded from the match)
[^)]+: Sequence of characters consisting of anything except )
(?=\)): Lookahead for the ) (excluded from the match)

The most efficient way is to use string methods, but you need Substring and IndexOf rather than Split. Note that this currently finds only a single word in parentheses:
string text = "Hello guys I am cool (test)";
string result = "--no parentheses--";
int index = text.IndexOf('(');
if(index++ >= 0) // ++ used to look behind ( which is a single character
{
int endIndex = text.IndexOf(')', index);
if(endIndex >= 0)
{
result = text.Substring(index, endIndex - index);
}
}

string s = "Hello guys I am cool (test)";
var result = s.Substring(s.IndexOf("test"), 4);

String not in regular expression

I want to use Regex to find matches in a string. There are other ways to find the pattern I am looking for, but I am interested in the Regex solution.
Consider these strings
"ABC123"
"ABC245"
"ABC435"
"ABC Oh say can You see"
I want to match "ABC" followed by ANYTHING BUT "123". What is the correct regex?

Using a negative lookahead:
/ABC(?!123)/
You can check if there are matches in a string str with:
Regex.IsMatch(str, "ABC(?!123)")
Full example:
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string[] strings = {
"ABC123",
"ABC245",
"ABC435",
"ABC Oh say can You see"
};
string pattern = "ABC(?!123)";
foreach (string str in strings)
{
Console.WriteLine(
"\"{0}\" {1} match.",
str, Regex.IsMatch(str, pattern) ? "does" : "does not"
);
}
}
}
Note that the regex above matches ABC as long as it is not followed by 123, including when nothing follows it at all. If you need at least one character after ABC (that is, do not match ABC on its own at the end of the string), use ABC(?!123). instead; the dot ensures that at least one character follows ABC.
I believe the first Regex is what you're looking for though (as long as "nothing" can be considered "anything" :P).
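To make the difference between the two patterns concrete, here is a small sketch that runs both against a bare "ABC". The names MatchesLoose and MatchesStrict are invented for this illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical wrappers around the two patterns discussed above.
    public static bool MatchesLoose(string s)
    {
        return Regex.IsMatch(s, "ABC(?!123)");   // "nothing" after ABC is accepted
    }

    public static bool MatchesStrict(string s)
    {
        return Regex.IsMatch(s, "ABC(?!123).");  // requires one more character
    }

    public static void Main()
    {
        foreach (var s in new[] { "ABC", "ABC123", "ABC245" })
        {
            Console.WriteLine("{0}: loose={1}, strict={2}",
                s, MatchesLoose(s), MatchesStrict(s));
        }
    }
}
```

Only the bare "ABC" is treated differently: the first pattern matches it and the second does not, while "ABC123" fails both and "ABC245" passes both.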

Try the following test code. This should do what you require
string s1 = "ABC123";
string s2 = "we ABC123 weew";
string s3 = "ABC435";
string s4 = "Can ABC Oh say can You see";
List<string> list = new List<string>() { s1, s2, s3, s4 };
Regex regex = new Regex(@".*(?<=.*ABC(?!.*123.*)).*");
foreach (string s in list)
{
// Regex.Match never returns null, so test Success instead
Match m = regex.Match(s);
if (m.Success)
Console.WriteLine(m.ToString());
}
The output is:
ABC435
Can ABC Oh say can You see
This uses both a 'Negative Lookahead' and a 'Positive Lookbehind'.
I hope this helps.

An alternative to regex, should you find this easier to use. Only a suggestion.
List<string> strs = new List<string>() { "ABC123",
"ABC245",
"ABC435",
"NOTABC",
"ABC Oh say can You see"
};
//Iterate backwards so that RemoveAt does not skip the element after a removed one
for (int i = strs.Count - 1; i >= 0; i--)
{
//Set the current string variable
string str = strs[i];
//Get the index of "ABC"
int index = str.IndexOf("ABC");
//Do you want to remove if ABC doesn't exist?
if (index == -1)
continue;
//Set the index to be the next character from ABC
index += 3;
//If at least 3 more characters (123) follow ABC
if (index + 3 <= str.Length && str.Substring(index, 3) == "123")
strs.RemoveAt(i);
}

How to find the number of occurrences of a letter in only the first sentence of a string?

I want to find number of letter "a" in only first sentence. The code below finds "a" in all sentences, but I want in only first sentence.
static void Main(string[] args)
{
string text; int k = 0;
text = "bla bla bla. something second. maybe last sentence.";
foreach (char a in text)
{
char b = 'a';
if (b == a)
{
k += 1;
}
}
Console.WriteLine("number of a in first sentence is " + k);
Console.ReadKey();
}

This splits the string into an array on the sentence separators, then counts the number of 'a' characters in the first element of the array (the first sentence).
var count = text.Split(new[] { '.', '!', '?' })[0].Count(c => c == 'a');
This example assumes a sentence is separated by a ., ? or !. If you have a decimal number in your string (e.g. 123.456), that will count as a sentence break. Breaking up a string into accurate sentences is a fairly complex exercise.

This is perhaps more verbose than what you were looking for, but hopefully it'll breed understanding as you read through it.
public static void Main()
{
//Make an array of the possible sentence enders. Doing this pattern lets us easily update
// the code later if it becomes necessary, or allows us easily to move this to an input
// parameter
string[] SentenceEnders = new string[] {"$", @"\.", @"\?", @"\!" /* Add Any Others */};
string WhatToFind = "a"; //What are we looking for? Regular Expressions Will Work Too!!!
string SentenceToCheck = "This, but not to exclude any others, is a sample."; //First example
string MultipleSentencesToCheck = @"
Is this a sentence
that breaks up
among multiple lines?
Yes!
It also has
more than one
sentence.
"; //Second Example
//This will split the input on all the enders put together(by way of joining them in [] inside a regular
// expression.
string[] SplitSentences = Regex.Split(SentenceToCheck, "[" + String.Join("", SentenceEnders) + "]", RegexOptions.IgnoreCase);
//SplitSentences is an array, with sentences on each index. The first index is the first sentence
string FirstSentence = SplitSentences[0];
//Now, split that single sentence on our matching pattern for what we should be counting
string[] SubSplitSentence = Regex.Split(FirstSentence, WhatToFind, RegexOptions.IgnoreCase);
//Now that it's split, it's split a number of times that matches how many matches we found, plus one
// (The "Left over" is the +1
int HowMany = SubSplitSentence.Length - 1;
System.Console.WriteLine(string.Format("We found, in the first sentence, {0} '{1}'.", HowMany, WhatToFind));
//Do all this again for the second example. Note that ideally, this would be in a separate function
// and you wouldn't be writing code twice, but I wanted you to see it without all the comments so you can
// compare and contrast
SplitSentences = Regex.Split(MultipleSentencesToCheck, "[" + String.Join("", SentenceEnders) + "]", RegexOptions.IgnoreCase | RegexOptions.Singleline);
SubSplitSentence = Regex.Split(SplitSentences[0], WhatToFind, RegexOptions.IgnoreCase | RegexOptions.Singleline);
HowMany = SubSplitSentence.Length - 1;
System.Console.WriteLine(string.Format("We found, in the second sentence, {0} '{1}'.", HowMany, WhatToFind));
}
Here is the output:
We found, in the first sentence, 3 'a'.
We found, in the second sentence, 4 'a'.

You didn't define "sentence", but if we assume it's always terminated by a period (.), just add this inside the loop:
if (a == '.') {
break;
}
Expand from this to support other sentence delimiters.

Simply "break" the foreach(...) loop when you encounter a "." (period)

Well, assuming you define a sentence as ending with a '.':
Use String.IndexOf() to find the position of the first '.'. After that, search in a Substring instead of the entire string.
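A minimal sketch of that IndexOf/Substring idea, assuming a sentence ends at the first '.' and that a text without any period counts as one sentence. The helper name CountInFirstSentence is hypothetical.

```csharp
using System;
using System.Linq;

public static class FirstSentenceCounter
{
    // Hypothetical helper; assumes the first sentence ends at the first '.'.
    public static int CountInFirstSentence(string text, char letter)
    {
        int end = text.IndexOf('.');
        // No period found: treat the whole text as a single sentence (assumption).
        string first = end == -1 ? text : text.Substring(0, end);
        return first.Count(c => c == letter);
    }

    public static void Main()
    {
        string text = "bla bla bla. something second. maybe last sentence.";
        Console.WriteLine("number of a in first sentence is "
            + CountInFirstSentence(text, 'a'));
    }
}
```

With the text from the question this reports 3, one for each "bla" before the first period.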

find the place of the '.' in the text ( you can use split )
count the 'a' in the text from the place 0 to instance of the '.'

string SentenceToCheck = "Hi, I can wonder this situation where I can do best";
//Here are several ways to count the letter
//Using regular expressions
int HowMany = Regex.Split(SentenceToCheck, "a", RegexOptions.IgnoreCase).Length - 1;
int i = Regex.Matches(SentenceToCheck, "a").Count;
// Simple way
int Count = SentenceToCheck.Length - SentenceToCheck.Replace("a", "").Length;
//Linq (case-insensitive); Where filters, whereas Select would only project to bools and count every character
var _lamdaCount = SentenceToCheck.ToCharArray()
.Where(t => t.ToString().ToUpper().Equals("A")).Count();
var _linqAIEnumareable = from _char in SentenceToCheck.ToCharArray()
where !String.IsNullOrEmpty(_char.ToString())
&& _char.ToString().ToUpper().Equals("A")
select _char;
int a = _linqAIEnumareable.Count();
var _linqCount = from g in SentenceToCheck.ToCharArray()
where g.ToString().Equals("a")
select g;
int b = _linqCount.Count();
