Splitting on “,” but not “/,” - c#

Question: How do I write an expression to split a string on ',' but not '/,'? Later I'll want to replace '/,' with ', '.
Details...
Delimiter: ','
Skip Char: '/'
Example input: "Mister,Bill,is,made,of/,clay"
I want to split this input into an array: {"Mister", "Bill", "is", "made", "of, clay"}
I know how to do this with a char prev, cur; and some indexers, but that seems beta.
Java Regex has a split functionality, but I don't know how to replicate this behavior in C#.
Note: This isn't a duplicate question, this is the same question but for a different language.

I believe you're looking for a negative lookbehind:
var regex = new Regex("(?<!/),");
var result = regex.Split(str);
This will split str on all commas that are not preceded by a slash. If you want to keep the '/,' in the string, this is all you need.
Since you said that you wanted to split the string and later replace the '/,' with ', ', you'll want to do the above first then you can iterate over the result and replace the strings like so:
var replacedResult = result.Select(s => s.Replace("/,", ", "));
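Putting the two steps together, a minimal sketch might look like the following. The class and method names (EscapedCommaSplit, Split) are made up for illustration; only Regex.Split, Select and Replace come from the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedCommaSplit
{
    // Hypothetical helper: splits on every comma that is not preceded by '/',
    // then rewrites each escaped "/," inside a part as ", ".
    public static string[] Split(string input)
    {
        return Regex.Split(input, "(?<!/),")
                    .Select(part => part.Replace("/,", ", "))
                    .ToArray();
    }
}
```

Calling EscapedCommaSplit.Split("Mister,Bill,is,made,of/,clay") should give the five parts from the question, with "of, clay" as the last one.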

string s = "Mister,Bill,is,made,of/,clay";
var arr = s.Replace("/,"," ").Split(',');
result : {"Mister", "Bill", "is", "made", "of clay"}

Using Regex:
var result = Regex.Split("Mister,Bill,is,made,of/,clay", "(?<=[^/]),");

Temporarily replace each '/,' with a placeholder, split on ',', then turn the placeholder back into a comma:
s.Replace("/,", "//").Split(',').Select(x => x.Replace("//", ","));

You can use this in c#
string regex = @"(?<!/),"; // lookbehind, so the character before the comma is not consumed by the split
var match = Regex.Split("Mister,Bill,is,made,of/,clay", regex, RegexOptions.IgnoreCase);
After that you can replace /, and continue your operation as you like

Related

Regex Ignore first and last terminator

I have a string in my text that uses | as a delimiter.
Example:
|2P|1|U|F8|
I want the result to be 2P|1|U|F8. How can I do that?
The regex is very easy, but why not just use Trim():
var str = "|2P|1|U|F8|";
str = str.Trim(new[] {'|'});
or just without new[] {...}:
str = str.Trim('|');
Output:
2P|1|U|F8
In case there are leading/trailing whitespaces, you can use chained Trims:
var str = "\r\n |2P|1|U|F8| \r\n";
str = str.Trim().Trim('|');
Output will be the same.
You can use String.Substring:
string str = "|2P|1|U|F8|";
string newStr = str.Substring(1, str.Length - 2);
Just remove the starting and the ending delimiter.
@"^\||\|$"
Use the below regex and then replace the match with an empty string.
Regex rgx = new Regex(@"^\||\|$");
string result = rgx.Replace(input, "");
Use the multiline modifier m when you're dealing with multiple lines.
Regex rgx = new Regex(@"(?m)^\||\|$");
Since | is a special character in regex, you need to escape it in order to match a literal | symbol.
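As a rough sketch of the regex approach, the following strips one leading and one trailing '|' from every line. The names PipeTrim and StripOuterPipes are invented for this example, and it assumes lines are separated by "\n" only, since in .NET the multiline $ matches before "\n" but not before "\r".

```csharp
using System;
using System.Text.RegularExpressions;

public static class PipeTrim
{
    // Hypothetical helper: removes a '|' at the start and at the end of each line.
    // (?m) makes ^ and $ match at line boundaries rather than only at the
    // boundaries of the whole input.
    public static string StripOuterPipes(string input)
    {
        return Regex.Replace(input, @"(?m)^\||\|$", "");
    }
}
```

Unlike Trim('|'), this removes at most one delimiter from each end of a line, so an input such as "||a||" would keep one pipe on each side.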
string input = "|2P|1|U|F8|";
foreach (string item in input.Split("|".ToCharArray(), StringSplitOptions.RemoveEmptyEntries))
{
Console.WriteLine(item);
}
Result is:
2P
1
U
F8
^\||\|$
You can try this. Replace with an empty string. Use verbatim mode. See the demo.
https://regex101.com/r/oF9hR9/14
For completeness' sake, you can also use Mid (from the Microsoft.VisualBasic namespace):
string s = "|2P|1|U|F8|";
Strings.Mid(s, 2, s.Length - 2)
This will cut out the part from the second character to the second-to-last one and produce the correct output.
I'm assuming that at some point you will want to parse the string to extract its '|' separated components, so here goes another alternative that goes in that direction:
string.Join("|", theString.Split(new[] {'|'}, StringSplitOptions.RemoveEmptyEntries))

Split string into array that is enclosed in capturing parentheses

I have the following string in my project:
((1,01/31/2015)(1,Filepath)(1,name)(1,code)(1,String)(1, ))
I want to split this string into parts so that I get the information inside each pair of parentheses (for example 1,Filepath or (1,Filepath)), but as you can see the whole string is enclosed in parentheses too. I then try to put the result into an array with string[] array = Regex.Split(originalString, SomeRegexHere).
Now I am wondering what the best approach would be: just remove the first and last characters of the string so the parentheses no longer enclose the whole string, or is there some way to use regular expressions to get the result I want?
string s = "((1,01/31/2015)(1,Filepath)(1,name)(1,code)(1,String)(1, ))";
var data = s.Split(new string[]{"(", ")"}, StringSplitOptions.RemoveEmptyEntries);
Your data would then be:
["1,01/31/2015",
"1,Filepath",
"1,name",
"1,code",
"1,String",
"1, "]
You can take a substring without the first two and last two brackets and then split it on the ")(" pairs between the groups:
var s = "((1,01/31/2015)(1,Filepath)(1,name)(1,code)(1,String)(1, ))";
var result = s.Substring(2, s.Length - 4)
.Split(new string[]{")("}, StringSplitOptions.RemoveEmptyEntries);
foreach(var r in result)
Console.WriteLine(r);
Output
1,01/31/2015
1,Filepath
1,name
1,code
1,String
1,
Example
(?<=\()[^()]*(?=\))
Just do a match and get your contents instead of splitting. See the demo.
https://regex101.com/r/eS7gD7/15
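In C#, the match-instead-of-split idea could be sketched like this. ParenGroups and Extract are illustrative names, not something from the answer; the pattern is the one quoted above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ParenGroups
{
    // Hypothetical helper: returns the text of every innermost "(...)" group.
    // The lookbehind and lookahead require the surrounding parentheses
    // without including them in the match.
    public static string[] Extract(string input)
    {
        return Regex.Matches(input, @"(?<=\()[^()]*(?=\))")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Because [^()]* cannot cross a parenthesis, the outer pair that wraps the whole string never produces a match of its own, so there is no need to strip it first.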

Simple Regex Befuddlement

I have some strings of the form
string strA = "Cmd:param1:'C:\\SomePath\\SomeFileName.ext':param2";
string strB = "Cmd:'C:\\SomePath\\SomeFileName.ext':param2:param3";
I want to split these strings on ':' so I can extract the N parameters. Some parameters can contain file paths, as shown, and I don't want to split on the ':' characters inside the single quotes. I can do this with a regex, but I am confused about how to make the regex split only when there is no "'" on either side of the colon.
I have attempted
string regex = @"(?<!'):(?!')";
string regex = @"(?<!'(?=')):";
that is continue matching only if no "'" on the left and no "'" on the right (negative look behind/ahead), but this is still splitting on the colon contained in 'C:\SomePath\SomeFileName.ext'.
How can I amend this regex to do as I require?
Thanks for your time.
Note: I have found that the following regex works. However, I would like to know if there is a better way of doing this?
string regex = @"(?<!.*'.*):|:(?!.*'.*)";
Consider this approach:
var guid = Guid.NewGuid().ToString();
var r = Regex.Replace(strA, @"'.*'", m =>
{
return m.Value.Replace(":", guid);
})
.Split(':')
.Select(s => s.Replace(guid, ":"))
.ToList();
Rather than trying to construct a lookbehind regex to split on, you could construct a regex that matches the fields themselves and take the set of matches. A field is either a quoted run of non-quote characters (so it can include ':'), or an unquoted run that contains neither a quote nor the separator:
string regex = "'[^']*'|[^':]+";
var result = Regex.Matches(strA, regex);
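A self-contained sketch of this field-matching approach follows. QuotedFields and Parse are hypothetical names, stripping the surrounding quotes is an extra step added for illustration, and the unquoted alternative uses + so that no empty matches are returned.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedFields
{
    // Hypothetical helper: a field is either a single-quoted run (which may
    // contain ':') or a run of characters that are neither quotes nor colons.
    public static string[] Parse(string input)
    {
        return Regex.Matches(input, "'[^']*'|[^':]+")
                    .Cast<Match>()
                    .Select(m => m.Value.Trim('\''))   // drop the surrounding quotes
                    .ToArray();
    }
}
```

One consequence of matching fields rather than splitting: empty unquoted fields (as in "a::b") are skipped, so this sketch suits inputs where every parameter is non-empty.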
You want to split on (?<!\b[a-z]):(?!\\) (use RegexOptions.IgnoreCase).
Not as pretty, but you could replace :\ with safe characters and then change them back to :\ after the split.
string[] param = strA.Replace(@":\", "|||").Split(':').Select(x => x.Replace("|||", @":\")).ToArray();

C# Why i can not split the string?

string myNumber = "3.44";
Regex regex1 = new Regex(".");
string[] substrings = regex1.Split(myNumber);
foreach (var substring in substrings)
{
Console.WriteLine("The string is : {0} and the length is {1}",substring, substring.Length);
}
Console.ReadLine();
I tried to split the string by ".", but the split returns only empty strings. Why?
. means "any character" in regular expressions. So don't split using a regex - split using String.Split:
string[] substrings = myNumber.Split('.');
If you really want to use a regex, you could use:
Regex regex1 = new Regex(@"\.");
The @ makes it a verbatim string literal, to stop you from having to escape the backslash. The backslash within the string itself is an escape for the dot within the regex parser.
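If the delimiter is not known in advance, one option is to let Regex.Escape add the backslashes. This is a small sketch with made-up names (LiteralSplit, SplitOn), not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralSplit
{
    // Hypothetical helper: treats the delimiter as literal text by escaping
    // any regex metacharacters in it before splitting.
    public static string[] SplitOn(string input, string literalDelimiter)
    {
        return Regex.Split(input, Regex.Escape(literalDelimiter));
    }
}
```

LiteralSplit.SplitOn("3.44", ".") should yield "3" and "44", the same as myNumber.Split('.').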
the easiest solution would be: string[] val = myNumber.Split('.');
. is a reserved character in regex. if you literally want to match a period, try:
Regex regex1 = new Regex(@"\.");
However, you're better off simply using myNumber.Split('.');
The dot matches a single character, without caring what that character
is. The only exception are newline characters.
Source: http://www.regular-expressions.info/dot.html
Therefore your code is asking to split the string at every character.
Use this instead.
string[] substrings = myNumber.Split('.');
Keep it simple, use String.Split() method;
string[] substrings = myNumber.Split('.');
It has another overload which allows specifying split options:
public string[] Split(
char[] separator,
StringSplitOptions options
)
You don't need a regex; you can do that with the Split method of the string object:
string myNumber = "3.44";
string[] substrings = myNumber.Split('.');
foreach (var substring in substrings)
{
Console.WriteLine("The string is : {0} and the length is {1}",substring, substring.Length);
}
Console.ReadLine();
The period "." is being interpreted as any single character instead of a literal period.
Instead of using regular expressions you could just do:
string[] substrings = myNumber.Split('.');
In Regex patterns, the period character matches any single character. If you want the Regex to match the actual period character, you must escape it in the pattern, like so:
@"\."
Now, this case is somewhat simple for Regex matching; you could instead use String.Split() which will split based on the occurrence of one or more static strings or characters:
string[] substrings = myNumber.Split('.');
try
Regex regex1 = new Regex(@"\.");
EDIT: Er... I guess under a minute after Jon Skeet is not too bad, anyway...
You'll want to place an escape character before the "." - like this "\\."
"." in a regex matches any character, so if you pass 4 characters to a regex with only ".", it will return four empty strings. Check out this page for common operators.
Try
Regex regex1 = new Regex("[.]");

Substring of a variant string

I have the following return of a printer:
{Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000}
Ok, I need to save, in a list, the return in parts.
e.g.
[0] "Ta005000000000000000000F"
[1] "00000000000000000I"
[2] "00000000000000000N"
...
The problem is that the number of characters varies.
I tried to do it by finding each space and taking the substring, but failed...
Any suggestion?
Use String.Split on a single space, and use StringSplitOptions.RemoveEmptyEntries to make sure that multiple spaces are seen as only one delimiter:
var source = "00000000000000000FS 0000000...etc";
var myArray = source.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
EDIT: An elegant way to get rid of the braces is to include them as separators in the Split (thanks to Joachim Isaksson in the comments):
var myArray = source.Split(new[] {' ', '{', '}'}, StringSplitOptions.RemoveEmptyEntries);
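Since the question asks for a list, the same Split call can be wrapped as below. PrinterReturn and Parse are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrinterReturn
{
    // Hypothetical helper: splits the raw printer return on spaces and braces,
    // dropping the empty entries produced by the braces or by repeated spaces.
    public static List<string> Parse(string raw)
    {
        return raw.Split(new[] { ' ', '{', '}' }, StringSplitOptions.RemoveEmptyEntries)
                  .ToList();
    }
}
```

Because the parts are found by delimiter rather than by position, it does not matter that their lengths vary.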
You could use a Regex for this:
string input = "{Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000}";
IEnumerable<string> matches = Regex.Matches(input, "[0-9a-zA-Z]+").Cast<Match>().Select(m => m.Value);
You can use string.split to create an array of substrings. Split allows you to specify multiple separator characters and to ignore repeated splits if necessary.
You could use the .Split member of the "String" class and split the parts up to that you want.
Sample would be:
string input = "{Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000}";
string[] splits = input.Trim('{', '}').Split(' ');
Console.WriteLine(splits[0]); // Ta005000000000000000000F
And so on.
Just off the bat. Without considering the encompassing braces:
string printMsg = "Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000";
string[] msgs = printMsg.Split(' ').Select(s => s.Trim()).ToArray();
Could work.
