I am trying to get the string between two identical delimiters:
The texts starts here ** Get This String ** Some other text ongoing here.....
I am wondering how to get the string between the stars. Should I use a regex or some other function?
You can try Split:
string source =
"The texts starts here** Get This String **Some other text ongoing here.....";
// 3: we need 3 chunks and we'll take the middle (1) one
string result = source.Split(new string[] { "**" }, 3, StringSplitOptions.None)[1];
You can use IndexOf to do the same without regular expressions.
This one returns the first occurrence of a string between two "**" delimiters, with surrounding whitespace trimmed. It also handles the case where no matching string exists.
public string FindTextBetween(string text, string left, string right)
{
// TODO: Validate input arguments
int beginIndex = text.IndexOf(left); // find first occurrence of the left delimiter
if (beginIndex == -1)
return string.Empty; // or throw exception?
beginIndex += left.Length;
int endIndex = text.IndexOf(right, beginIndex); // find the right delimiter after the left one
if (endIndex == -1)
return string.Empty; // or throw exception?
return text.Substring(beginIndex, endIndex - beginIndex).Trim();
}
string str = "The texts starts here ** Get This String ** Some other text ongoing here.....";
string result = FindTextBetween(str, "**", "**");
I usually prefer to not use regex whenever possible.
If you want to use regex, this could do:
.*\*\*(.*)\*\*.*
The first and only capture has the text between stars.
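A minimal sketch of how that pattern might be applied with Regex.Match. The class and method names (StarCapture, Extract) are made up for illustration, and trimming the captured text is an assumption about what the asker wants.

```csharp
using System.Text.RegularExpressions;

public static class StarCapture
{
    // Returns the trimmed text captured between the "**" markers,
    // or null when the input does not contain two markers.
    public static string Extract(string input)
    {
        Match match = Regex.Match(input, @".*\*\*(.*)\*\*.*");
        // Groups[1] is the first (and only) capture group of the pattern.
        return match.Success ? match.Groups[1].Value.Trim() : null;
    }
}
```

Calling StarCapture.Extract on the sample sentence from the question should give "Get This String". Because the leading .* is greedy, an input with several "**" pairs would likely yield the last pair rather than the first, so this pattern fits best when there is exactly one pair.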
Another option would be using IndexOf to find the position of the first star, check if the following character is a star too and then repeat that for the second set. Substring the part between those indexes.
If you can have multiple pieces of text to find in one string, you can use following regex:
\*\*(.*?)\*\*
Sample code:
string data = "The texts starts here ** Get This String ** Some other text ongoing here..... ** Some more text to find** ...";
Regex regex = new Regex(@"\*\*(.*?)\*\*");
MatchCollection matches = regex.Matches(data);
foreach (Match match in matches)
{
Console.WriteLine(match.Groups[1].Value);
}
You could use split but this would only work if there is 1 occurrence of the word.
Example:
string output = "";
string input = "The texts starts here **Get This String **Some other text ongoing here..";
var splits = input.Split( new string[] { "**", "**" }, StringSplitOptions.None );
//Check if the index is available
//if there are no '**' in the string the [1] index will fail
if ( splits.Length >= 2 )
output = splits[1];
Console.Write( output );
Console.ReadKey();
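You can use Substring for this:
string s = "The texts starts here ** Get This String ** Some other text ongoing here";
s = s.Substring(s.IndexOf("**") + 2); // drop everything up to and including the first "**"
s = s.Substring(0, s.IndexOf("**")); // keep everything before the next "**"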
Related
I'm trying to make an app that's looking for a string entered by a user. There will be a text file that's going to store a lot of strings and the app will be checking if the string can be found within this file and display the index of the string. In case the string can't be found, the app will look for specific patterns.
Here's an example of the text file:
This
This |
This is |
This car is #
| - one word
# - one or more words
How will the app work?
If "This" is the string entered by the user, the app will display the index of the first line (0).
If "This apple" is the string entered by the user, the app will display the index of "This |" (1).
If "This is awesome" is the string entered by the user, the app will display the index of "This is |" (2).
If "This car is blue and I like it" is the string entered by the user, the app will display the index of "This car is #" (3).
Usually, if I'm looking for a string I would use this code:
string[] grammarFile = File.ReadAllLines(@"C:\Users\user_name\Desktop\Text.txt");
int resp = Array.IndexOf(grammarFile, userString);
Console.WriteLine(resp);
The main problem is that I have no idea how I could do this for patterns.
You need a definition for a word. I will assume that a word is a consecutive string of any non-whitespace characters.
Let's define a regex that matches a single word:
var singleWordRegex = @"[^\s]+";
and a regex that matches one or more words (a sequence of non-whitespace characters, followed by a sequence of whitespace characters or the end of the string):
var oneOrMoreWordsRegex = @"([^\s]+([\s]|$)+)+";
Now you can transform each string from your textfile to a regex like this:
// Extension methods must be static and declared inside a static class.
static Regex ToRegex(this string grammarEntry)
{
var singleWordRegex = @"[^\s]+";
var oneOrMoreWordsRegex = @"([^\s]+([\s]|$)+)+";
return new Regex("^" + grammarEntry.Replace("|", singleWordRegex).Replace("#", oneOrMoreWordsRegex) + "$");
}
and test every grammar entry like this:
var userString = ReadUserString();
string[] grammarFile = File.ReadAllLines(@"C:\Users\user_name\Desktop\Text.txt");
var resp = -1;
for(int i = 0; i < grammarFile.Length; ++i)
{
var grammarEntry = grammarFile[i];
if(grammarEntry.ToRegex().IsMatch(userString))
{
resp = i;
break;
}
}
Console.WriteLine(resp);
On a side note, if you're going to perform many matches it might be wise to save all ToRegex calls to an array as preprocessing.
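That preprocessing step could look roughly like the sketch below. GrammarMatcher, Compile and FindIndex are hypothetical names, and the grammar lines are passed in as an array so the example does not depend on a file path.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class GrammarMatcher
{
    // Same building blocks as in the answer above.
    private const string SingleWord = @"[^\s]+";
    private const string OneOrMoreWords = @"([^\s]+([\s]|$)+)+";

    // Builds one Regex per grammar line, once, so repeated lookups
    // do not have to rebuild the patterns.
    public static Regex[] Compile(string[] grammarLines)
    {
        return grammarLines
            .Select(line => new Regex(
                "^" + line.Replace("|", SingleWord).Replace("#", OneOrMoreWords) + "$"))
            .ToArray();
    }

    // Returns the index of the first grammar entry that matches, or -1.
    public static int FindIndex(Regex[] compiled, string userString)
    {
        return Array.FindIndex(compiled, regex => regex.IsMatch(userString));
    }
}
```

Compile would be called once after reading the file, and FindIndex once per user string. Note that the grammar lines are inserted into the pattern unescaped, as in the original answer, so entries containing regex metacharacters would need Regex.Escape applied to their literal parts.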
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
I need to extract the number 12345 alone and I don't need the numbers 9677125 and 35478.
What regex can I use?
Here is the regex for extracting a 5-digit number at the beginning of the string:
^(\d{5})&
If length is arbitrary:
^(\d+)&
If termination pattern is not always &:
^(\d+)[^\d]
Based on Sayse's comment, you can simply rewrite this as:
^(\d+)
and if the number is terminated by some other number (for instance 999), then:
^(\d+)999
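For illustration, the simplest of these patterns, ^(\d+), might be used from C# as sketched below. LeadingNumber and TryExtract are hypothetical names, and parsing into a long is an assumption about the expected range of the number.

```csharp
using System.Text.RegularExpressions;

public static class LeadingNumber
{
    // Tries to read the run of digits at the very start of the input.
    // Returns false when the input does not start with a digit
    // or the digits do not fit into a long.
    public static bool TryExtract(string input, out long number)
    {
        number = 0;
        Match match = Regex.Match(input, @"^(\d+)");
        return match.Success && long.TryParse(match.Groups[1].Value, out number);
    }
}
```

With the string from the question, TryExtract should return true and set number to 12345. The later digit runs are ignored because the pattern is anchored to the start of the string.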
You don't need regex if you only want to extract the first number:
string temp = "12345&refere?X=Assess9677125?Folder_id=35478";
int first = Int32.Parse(String.Join("", temp.TakeWhile(c => Char.IsDigit(c))));
Console.WriteLine(first); // 12345
If the number you want is always at the beginning of the string and terminated by an ampersand (&) you don't need a regex at all. Just split the string on the ampersand and get the first element of the resulting array:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var splitArray = temp.Split('&');
var number = splitArray[0]; // returns 12345
Alternatively, you can get the index of the ampersand and substring up to that point:
String temp = "12345&refere?X=Assess9677125?Folder_id=35478";
var ampersandIndex = temp.IndexOf("&");
var number = temp.Substring(0, ampersandIndex); // returns 12345
From what you have given us, this is fairly simple:
var regex = new Regex(@"^(?<number>\d+)&");
var match = regex.Match("12345&refere?X=Assess9677125?Folder_id=35478");
if (match.Success)
{
var number = int.Parse(match.Groups["number"].Value);
}
Edit: Of course you can replace the argument of new Regex with any of the combinations Giorgi has given.
I have a string in this format:
ABCD_EFDG20120700.0.xml
This has a pattern which has three parts to it:
First is the set of chars before the '_', the 'ABCD'
Second are the set of chars 'EFDG' after the '_'
Third are the remaining 20120700.0.xml
I can split the original string and get the number(s) from the second element in the split result using this switch:
\d+
Match m = Regex.Match(splitname[1], "\\d+");
That returns only '20120700'. But I need '20120700.0'.
How do I get the required string?
You can extend your regex to look for any number of digits, then period and then any number of digits once again:
Match m = Regex.Match(splitname[1], "\\d+\\.\\d+");
Although with such regular expression you don't even need to split the string:
string s = "ABCD_EFDG20120700.0.xml";
Match m = Regex.Match(s, "\\d+\\.\\d+");
string result = m.Value; // result is 20120700.0
I suggest using a single regex that captures all the parts at once, like this:
var rgx = new Regex(@"^([^_]+)_([^\d.]+)([\d.]+\d+)\.(.*)$");
var matches = rgx.Matches(input);
if (matches.Count > 0)
{
Console.WriteLine("{0}", matches[0].Groups[0]); // All input string
Console.WriteLine("{0}", matches[0].Groups[1]); // ABCD
Console.WriteLine("{0}", matches[0].Groups[2]); // EFDG
Console.WriteLine("{0}", matches[0].Groups[3]); // 20120700.0
Console.WriteLine("{0}", matches[0].Groups[4]); // xml
}
I have strings of the following form:
RTT(50)
RTT(A)(50)
RTT(A)(B)(C)(50)
What I want is to remove the last () occurrence from the string. That is, if the string is RTT(50), I want only RTT returned. If it is RTT(A)(50), I want RTT(A) returned, and so on.
How do I achieve this? I currently use a substring method that takes out any occurrence of the () regardless. I thought of using:
Regex.Matches(node.Text, "( )").Count
To count the number of occurrences so I did something like below.
if(Regex.Matches(node.Text, "( )").Count > 1)
//value = node.Text.Remove(Regex.//Substring(1, node.Text.IndexOf(" ("));
else
value = node.Text.Substring(0, node.Text.IndexOf(" ("));
The else part will do what I want. However, how to remove the last occurrence in the if part is where I am stuck.
The String.LastIndexOf method does what you need - returns the last index of a char or string.
If you're sure that every string will have at least one set of parentheses:
var result = node.Text.Substring(0, node.Text.LastIndexOf("("));
Otherwise, you could test the result of LastIndexOf:
var lastParenSet = node.Text.LastIndexOf("(");
var result =
node.Text.Substring(0, lastParenSet > -1 ? lastParenSet : node.Text.Length);
This should do what you want :
your_string = your_string.Remove(your_string.LastIndexOf(string_to_remove));
It's that simple.
There are a couple of different options to consider.
LastIndexOf
Get the last index of the ( character and take the substring up to that index. The downside of this approach is an additional last index check for ) would be needed to ensure that the format is correct and that it's a pair with the closing parenthesis occurring after the opening parenthesis (I did not perform this check in the code below).
var index = input.LastIndexOf('(');
if (index >= 0)
{
var result = input.Substring(0, index);
Console.WriteLine(result);
}
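The check that was skipped above could be sketched as follows. ParenTrimmer and RemoveLastGroup are hypothetical names, and the rule that the closing parenthesis must be the last character of the string is an assumption based on the example values in the question.

```csharp
public static class ParenTrimmer
{
    // Removes the trailing "(...)" group, but only when the input ends with ')'
    // and the last '(' comes before it. Otherwise the input is returned unchanged.
    public static string RemoveLastGroup(string input)
    {
        int open = input.LastIndexOf('(');
        int close = input.LastIndexOf(')');
        bool wellFormed = open >= 0 && close > open && close == input.Length - 1;
        return wellFormed ? input.Substring(0, open) : input;
    }
}
```

For the three sample strings this should return RTT, RTT(A) and RTT(A)(B)(C). A malformed value such as RTT(50 is left as it is rather than being cut at the opening parenthesis.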
Regex with RegexOptions.RightToLeft
By using RegexOptions.RightToLeft we can grab the last index of a pair of parentheses.
var pattern = @"\(.+?\)";
var match = Regex.Match(input, pattern, RegexOptions.RightToLeft);
if (match.Success)
{
var result = input.Substring(0, match.Index);
Console.WriteLine(result);
}
else
{
Console.WriteLine(input);
}
Regex depending on numeric format
If you're always expecting the final parentheses to have numeric content, similar to your example values where (50) is getting removed, we can use a pattern that matches any numbers inside parentheses.
var patternNumeric = @"\(\d+\)";
var result = Regex.Replace(input, patternNumeric, "");
Console.WriteLine(result);
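Note that Regex.Replace with this pattern removes every numeric group in the string, not only the last one. If earlier groups can also be numeric, one possible variation (an assumption, not part of the original answer) is to anchor the pattern to the end of the string. NumericSuffix and Remove are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class NumericSuffix
{
    // Removes a "(digits)" group only when it sits at the end of the input.
    public static string Remove(string input)
    {
        return Regex.Replace(input, @"\(\d+\)$", "");
    }
}
```

With this variation RTT(1)(50) should become RTT(1), whereas the unanchored pattern would reduce it to RTT.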
It's very simple. You can do it like this:
string a = "RTT(50)";
string res = a.Substring(0, a.LastIndexOf("("));
As an extension method:
namespace CustomExtensions
{
public static class StringExtension
{
public static string ReplaceLastOf(this string str, string fromStr, string toStr)
{
int lastIndexOf = str.LastIndexOf(fromStr);
if (lastIndexOf < 0)
return str;
string leading = str.Substring(0, lastIndexOf);
int charsToEnd = str.Length - (lastIndexOf + fromStr.Length);
string trailing = str.Substring(lastIndexOf+fromStr.Length, charsToEnd);
return leading + toStr + trailing;
}
}
}
Use:
string myFavColor = "My favourite color is blue";
string newFavColor = myFavColor.ReplaceLastOf("blue", "red");
Try a function like this:
public static string ReplaceLastOccurrence(string source, string find, string replace)
{
int place = source.LastIndexOf(find);
if (place < 0)
return source; // nothing to replace; Remove would throw for a negative index
return source.Remove(place, find.Length).Insert(place, replace);
}
It removes the last occurrence of a string and puts another one in its place. Use it like this:
string result = ReplaceLastOccurrence(value, "(", string.Empty);
In this case it finds the last ( in the value string and replaces it with string.Empty. It can also be used to replace it with any other text.
I have a string 4(4X),4(4N),3(3X), and from it I want to build the string 4,4,3. If I get the string 4(4N),3(3A),2(2X), then I want 4,3,2.
How can I do this?
This LINQ query takes a substring of each part of the input string, from the beginning up to the first opening parenthesis:
string input = "4(4N),3(3A),2(2X)";
string result = String.Join(",", input.Split(',')
.Select(s => s.Substring(0, s.IndexOf('('))));
// 4,3,2
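If some parts might not contain a parenthesis at all, IndexOf returns -1 and Substring throws. A guarded variant might look like the sketch below. PrefixJoiner and StripParenthesized are hypothetical names, and keeping such parts unchanged is an assumption about the desired behavior.

```csharp
using System.Linq;

public static class PrefixJoiner
{
    // Keeps the text before the first '(' in every comma-separated part.
    // Parts without '(' are kept as they are.
    public static string StripParenthesized(string input)
    {
        return string.Join(",", input.Split(',').Select(part =>
        {
            int open = part.IndexOf('(');
            return open >= 0 ? part.Substring(0, open) : part;
        }));
    }
}
```

For 4(4N),3(3A),2(2X) this should still give 4,3,2, and an input such as 4(4X),7,3(3X) should give 4,7,3 instead of raising an exception.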
This may help:
string inputString = "4(4X),4(4N),3(3X)";
string[] temp = inputString.Split(',');
List<string> result = new List<string>();
foreach (string item in temp)
{
result.Add(item.Split('(')[0]);
}
var whatYouNeed = string.Join(",", result);
You can use regular expressions
String input = #"4(4X),4(4N),3(3X)";
String pattern = #"(\d)\(\1.\)";
// ( ) - first capture group.
// \d - one digit.
// \( and \) - literal parentheses.
// \1 - backreference: the same text that the first group matched.
String result = Regex.Replace(input, pattern, "$1");
// $1 means that each match is replaced by the content of the first group
//result = 4,4,3