Split constantly on the last delimiter in C# - c#

I have the following string:
string x = "hello;there;;you;;;!;";
The result I want is a list of length four with the following substrings:
"hello"
"there;"
"you;;"
"!"
In other words, how do I split on the last occurrence when the delimiter is repeating multiple times? Thanks.

You need to use a regex based split:
var s = "hello;there;;you;;;!;";
var res = Regex.Split(s, @";(?!;)").Where(m => !string.IsNullOrEmpty(m));
Console.WriteLine(string.Join(", ", res));
// => hello, there;, you;;, !
See the C# demo
The ;(?!;) regex matches any ; that is not followed by another ;.
To also avoid matching a ; at the end of the string (and thus keep it attached to the last item in the resulting list) use ;(?!;|$) where $ matches the end of string (can be replaced with \z if the very end of the string should be checked for).
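The difference between the two patterns is easiest to see by running both on the sample string. The following is only a sketch; the variable names withoutEndCheck and withEndCheck are illustrative and not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var s = "hello;there;;you;;;!;";

// ;(?!;) also matches the final ';', which leaves an empty trailing item to filter out.
var withoutEndCheck = Regex.Split(s, @";(?!;)")
    .Where(part => !string.IsNullOrEmpty(part))
    .ToArray();

// ;(?!;|$) skips a ';' at the end of the string, so it stays attached to the last item.
var withEndCheck = Regex.Split(s, @";(?!;|$)");

Console.WriteLine(string.Join(" | ", withoutEndCheck)); // hello | there; | you;; | !
Console.WriteLine(string.Join(" | ", withEndCheck));    // hello | there; | you;; | !;
```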

It seems that you don't want to remove empty entries but keep the separators.
You can use this code:
string s = "hello;there;;you;;;!;";
MatchCollection matches = Regex.Matches(s, @"(.+?);(?!;)");
foreach (Match match in matches)
{
    // Group 1 holds the item without the final separator
    Console.WriteLine(match.Groups[1].Value);
}

string x = "hello;there;;you;;;!;";
var splitted = x.Split(new char[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
foreach (var s in splitted)
    Console.WriteLine("{0}", s);
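Note that a plain Split with RemoveEmptyEntries discards every semicolon, so it does not keep the trailing separators the question asks for. A short check of what it returns for the sample string (the variable name pieces is illustrative):

```csharp
using System;

var x = "hello;there;;you;;;!;";

// Every ';' is treated as a separator and the resulting empty items are dropped.
var pieces = x.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine(string.Join(", ", pieces)); // hello, there, you, !
```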

Related

How to first 'Split a string to an Array' then 'Add something to that Array'? || C# Console App

I'm trying to create a program that splits a string to an array then adds
to that array.
Splitting the string works but adding to the array is really putting up a
fight.
//here i create the text
string text = Console.ReadLine();
Console.WriteLine();
//Here i split my text to elements in an Array
var punctuation = text.Where(Char.IsPunctuation).Distinct().ToArray();
var words = text.Split().Select(x => x.Trim(punctuation));
//here i display the splitted string
foreach (string x in words)
{
Console.WriteLine(x);
}
//Here a try to add something to the Array
Array.words(ref words, words.Length + 1);
words[words.Length - 1] = "addThis";
//I try to display the updated array
foreach (var x in words)
{
Console.WriteLine(x);
}
/* Here are the error messages (the parts marked |*...*| are flagged as errors):
Array.|*words*|(ref words, words.|*Length*| + 1);
words[words.|*Length*| - 1] = "addThis";
'Array' does not contain a definition for 'words'
Does not contain a definition for 'Length'
Does not contain a definition for 'Length' */
Convert the IEnumerable to List:
var words = text.Split().Select(x => x.Trim(punctuation)).ToList();
Once it is a list, you can call Add
words.Add("addThis");
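If an array is really required, the call the question seems to be reaching for is probably Array.Resize (this is an assumption; Array.words does not exist). It needs an actual string[], so the sequence has to be materialized with ToArray() first. A minimal sketch, with a hard-coded sample sentence standing in for Console.ReadLine():

```csharp
using System;
using System.Linq;

var text = "Hello, world! How are you?"; // sample input, not from the question
var punctuation = text.Where(char.IsPunctuation).Distinct().ToArray();

// ToArray() turns the lazy IEnumerable<string> into a real array with a Length.
string[] words = text.Split().Select(w => w.Trim(punctuation)).ToArray();

// Array.Resize allocates a new, larger array and copies the old items over.
Array.Resize(ref words, words.Length + 1);
words[words.Length - 1] = "addThis";

Console.WriteLine(string.Join(", ", words)); // Hello, world, How, are, you, addThis
```

Because each resize copies the whole array, a List<string> is usually the better fit when items are added repeatedly.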
Technically, if you want to split on punctuation, I suggest Regex.Split instead of string.Split
using System.Text.RegularExpressions;
...
string text =
  @"Text with punctuation: comma, full stop. Apostroph's and ""quotation?"" - ! Yes!";
var result = Regex.Split(text, @"\p{P}");
Console.Write(string.Join(Environment.NewLine, result));
Outcome:
Text with punctuation # Space is not a punctuation, 3 words combined
comma
full stop
Apostroph # apostroph ' is a punctuation, split as required
s and
quotation
Yes
If you want to append some items, I suggest Linq Concat() and .ToArray():
// text is the same string as in the previous snippet
string[] words = Regex
  .Split(text, @"\p{P}")
  .Concat(new string[] {"addThis"})
  .ToArray();
However, it seems that you want to extract words rather than split on punctuation, which you can do by matching the words themselves:
using System.Linq;
using System.Text.RegularExpressions;
...
string text =
  @"Text with punctuation: comma, full stop. Apostroph's and ""quotation?"" - ! Yes!";
string[] words = Regex
  .Matches(text, @"[\p{L}']+") // Let a word be one or more letters or apostrophes
  .Cast<Match>()
  .Select(match => match.Value)
  .Concat(new string[] { "addThis" })
  .ToArray();
Console.Write(string.Join(Environment.NewLine, words));
Outcome:
Text
with
punctuation
comma
full
stop
Apostroph's
and
quotation
Yes
addThis

Remove numbers in specific part of string (within parentheses)

I have a string Test123(45) and I want to remove the numbers within the parentheses. How would I go about doing that?
So far I have tried the following:
string str = "Test123(45)";
string result = Regex.Replace(str, "(\\d)", string.Empty);
This however leads to the result Test(), when it should be Test123().
This replaces every pair of parentheses that contains only digits with an empty pair of parentheses:
string str = "Test123(45)";
string result = Regex.Replace(str, @"\(\d+\)", "()");
\d+(?=[^(]*\))
Try this. Use it with a verbatim string (@). The lookahead makes sure the digits are followed by a ) with no ( before it. Replace the match with an empty string.
See demo.
https://regex101.com/r/uE3cC4/4
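Applied through Regex.Replace, the lookahead pattern might be used as in the sketch below. The second input, Test123(ab45cd), is a made-up example to show that other text inside the parentheses is kept.

```csharp
using System;
using System.Text.RegularExpressions;

// \d+ matches a run of digits; (?=[^(]*\)) requires a ')' later on with no '(' in between.
var pattern = @"\d+(?=[^(]*\))";

var result = Regex.Replace("Test123(45)", pattern, string.Empty);
var mixed = Regex.Replace("Test123(ab45cd)", pattern, string.Empty);

Console.WriteLine(result); // Test123()
Console.WriteLine(mixed);  // Test123(abcd)
```

The pattern only looks ahead for a closing parenthesis, so digits in front of an unmatched ) would be removed as well.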
string str = "Test123(45)";
string result = Regex.Replace(str, @"\(\d+\)", "()");
you can also try this way:
string str = "Test123(45)";
string[] delimiters = { "(" };
string[] split = str.Split(delimiters, StringSplitOptions.None);
var b=split[0]+"()";
To remove the numbers inside parentheses while keeping the parentheses themselves and any other text inside them, match every parenthesized substring with \([^()]+\) and strip the digits from each match inside a MatchEvaluator passed to Regex.Replace.
Here is a C# sample program:
var str = "Test123(45) and More (5 numbers inside parentheses 123)";
var result = Regex.Replace(str, @"\([^()]+\)", m => Regex.Replace(m.Value, @"\d+", string.Empty));
// => Test123() and More ( numbers inside parentheses )
To remove digits that are enclosed in ( and ) with nothing else between them, ASh's \(\d+\) solution works well: \( matches a literal (, \d+ matches 1+ digits, \) matches a literal ).

Regex Ignore first and last terminator

I have a string in my text that uses | as a delimiter.
Example:
|2P|1|U|F8|
I want the result to be 2P|1|U|F8. How can I do that?
The regex is very easy, but why not just use Trim():
var str = "|2P|1|U|F8|";
str = str.Trim(new[] {'|'});
or just without new[] {...}:
str = str.Trim('|');
Output:
2P|1|U|F8
In case there are leading/trailing whitespaces, you can use chained Trims:
var str = "\r\n |2P|1|U|F8| \r\n";
str = str.Trim().Trim('|');
Output will be the same.
You can use String.Substring:
string str = "|2P|1|U|F8|";
string newStr = str.Substring(1, str.Length - 2);
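Trim and Substring give the same result for the sample string, but they differ when a delimiter is repeated at either end. A small sketch with a made-up input, ||2P|1||, to show the difference:

```csharp
using System;

var doubled = "||2P|1||"; // illustrative input with doubled delimiters

// Trim removes every leading and trailing '|', however many there are.
var trimmed = doubled.Trim('|');

// Substring drops exactly one character from each end.
var sliced = doubled.Substring(1, doubled.Length - 2);

Console.WriteLine(trimmed); // 2P|1
Console.WriteLine(sliced);  // |2P|1|
```

Substring also throws an ArgumentOutOfRangeException for strings shorter than two characters, so it is only safe when the input is known to start and end with a delimiter.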
Just remove the starting and the ending delimiter.
@"^\||\|$"
Use the regex above and replace each match with an empty string.
Regex rgx = new Regex(@"^\||\|$");
string result = rgx.Replace(input, "");
Use the multiline modifier m when you're dealing with multiple lines.
Regex rgx = new Regex(@"(?m)^\||\|$");
Since | is a special character in regex, you need to escape it in order to match a literal | symbol.
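The effect of the multiline modifier can be checked on a two-line input. This is a sketch only: the second line, |3Q|2|V|G9|, is invented for illustration, and the lines are separated by a bare \n.

```csharp
using System;
using System.Text.RegularExpressions;

var input = "|2P|1|U|F8|\n|3Q|2|V|G9|";

// Without (?m), ^ matches only at the very start of the input,
// so only the first and the last '|' of the whole text are removed.
var singleLine = Regex.Replace(input, @"^\||\|$", "");

// With (?m), ^ and $ match at the start and end of every line.
var multiLine = Regex.Replace(input, @"(?m)^\||\|$", "");

Console.WriteLine(singleLine);
Console.WriteLine(multiLine);
```

With Windows-style \r\n line endings, $ matches before the \n but not before the \r, so the pattern may need an optional \r in front of $.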
string input = "|2P|1|U|F8|";
foreach (string item in input.Split("|".ToCharArray(), StringSplitOptions.RemoveEmptyEntries))
{
Console.WriteLine(item);
}
Result is:
2P
1
U
F8
^\||\|$
You can try this. Replace the match with an empty string, and use a verbatim string (@) for the pattern. See demo.
https://regex101.com/r/oF9hR9/14
For completeness' sake, you can also use Mid (from the Microsoft.VisualBasic namespace):
string s = "|2P|1|U|F8|";
Strings.Mid(s, 2, s.Length - 2)
This cuts out the part from the second character to the second-to-last one and produces the correct output.
I'm assuming that at some point you will want to parse the string to extract its '|' separated components, so here goes another alternative that goes in that direction:
string.Join("|", theString.Split(new[] {'|'}, StringSplitOptions.RemoveEmptyEntries))

how to remove special char from the string and make new string?

I have a string 4(4X),4(4N),3(3X), and from it I want to make the string 4,4,3. If I get the string 4(4N),3(3A),2(2X), then I want to make the string 4,3,2.
Please someone tell me how can I solve my problem.
This Linq query takes, from each part of the input string, the substring from the beginning up to the first opening parenthesis:
string input = "4(4N),3(3A),2(2X)";
string result = String.Join(",", input.Split(',')
.Select(s => s.Substring(0, s.IndexOf('('))));
// 4,3,2
This may help:
string inputString = "4(4X),4(4N),3(3X)";
string[] temp = inputString.Split(',');
List<string> result = new List<string>();
foreach (string item in temp)
{
result.Add(item.Split('(')[0]);
}
var whatYouNeed = string.Join(",", result);
You can use regular expressions
String input = @"4(4X),4(4N),3(3X)";
String pattern = @"(\d)\(\1.\)";
// ( ) - first group.
// \d - one digit.
// \( and \) - literal parentheses.
// \1 - a repeat of whatever the first group matched.
// . - any single character.
String result = Regex.Replace(input, pattern, "$1");
// $1 means that each match is replaced by the contents of the first group
//result = 4,4,3
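The pattern above expects a single digit, the same digit repeated inside the parentheses, and exactly one more character. If the items can be longer, one alternative (an assumption about what the input may look like, not something the question states) is to delete every parenthesized part regardless of its contents:

```csharp
using System;
using System.Text.RegularExpressions;

// \( and \) match literal parentheses; [^)]* matches whatever sits between them.
var pattern = @"\([^)]*\)";

var result = Regex.Replace("4(4X),4(4N),3(3X)", pattern, string.Empty);
var longer = Regex.Replace("12(12AB),7", pattern, string.Empty); // made-up input

Console.WriteLine(result); // 4,4,3
Console.WriteLine(longer); // 12,7
```

Unlike the Substring and IndexOf approach, this also leaves items without any parentheses untouched instead of throwing.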

Regular expression to break string C#

Here is my string:
1-1 This is my first string. 1-2 This is my second string. 1-3 This is my third string.
How can I break it up in C# like this:
result[0] = This is my first string.
result[1] = This is my second string.
result[2] = This is my third string.
IEnumerable<string> lines = Regex.Split(text, "(?:^|[\r\n]+)[0-9-]+ ").Skip(1);
EDIT: If you want the result in an array you can do string[] result = lines.ToArray();
Regex regex = new Regex("^(?:[0-9]+-[0-9]+ )(.*?)$", RegexOptions.Multiline);
var str = "1-1 This is my first string.\n1-2 This is my second string.\n1-3 This is my third string.";
var matches = regex.Matches(str);
List<string> strings = matches.Cast<Match>().Select(p => p.Groups[1].Value).ToList();
foreach (var s in strings)
{
Console.WriteLine(s);
}
We use a multiline Regex, so that ^ and $ match the beginning and end of each line. We skip one or more digits, a -, one or more digits and a space: (?:[0-9]+-[0-9]+ ). We then lazily (*?) take every character (.) up to the end of the line: (.*?)$. The match is lazy so that the end of the line $ is more "important" than any character matched by the dot.
Then we put the matches in a List<string> using Linq.
Lines end with a newline, a carriage return, or both. This splits the string into lines on any run of line-ending characters:
using System.Text.RegularExpressions;
...
var lines = Regex.Split( input, "[\r\n]+" );
Then you can do what you want with each line.
var words = Regex.Split( lines[i], @"\s" );
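The answers above assume that each numbered item sits on its own line. If the text really is a single line, as it appears in the question, one option is to split on the n-n markers themselves. A minimal sketch under that assumption:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var text = "1-1 This is my first string. 1-2 This is my second string. 1-3 This is my third string.";

// Split on each "digits-digits" marker together with the whitespace around it,
// then drop the empty item produced by the marker at the very start.
string[] result = Regex.Split(text, @"\s*\d+-\d+\s+")
    .Where(part => part.Length > 0)
    .ToArray();

foreach (var item in result)
{
    Console.WriteLine(item);
}
```

A sentence that itself contains something like "3-4 " would also be split at that point, so the marker pattern may need tightening for real data.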
