Split separates strings and ignore/remove other delimited string - c#

I'm reading a comma-delimited list of strings from a config file. I need to do the following:
1) check whether a string contains `[`, and if it does, remove or ignore it
2) split on `,` and `-` (which I am doing below)
Here is what I have so far:
string mediaString = "Cnn-[news],msnbc";
string[] split = mediaString.Split(new Char[] { ',', '-' }); //still returns the bracketed entry
What I want is to ignore or remove the string that is in brackets, so the end result should look like this:
mediaString =
Cnn
msnbc

Using Linq:
mediaString.Split(new Char[] { ',', '-' }).Where(val => !val.Contains('['))
You can make the test (val.Contains(...)) as sophisticated as you like (e.g. starts and ends with, regular expression, specific values, call an object provided via a DI framework if you want to get all enterprisey).

Use Regex replace to clean your string
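string str = @"Cnn-[news],msnbc";
Regex regex = new Regex(@"\[.*?\]"); //lazy match, so each bracketed part is removed separately
string cleanStr = regex.Replace(str, "");
string[] split = cleanStr.Split(new Char[] { ',', '-' }, StringSplitOptions.RemoveEmptyEntries); //skips the empty entry left between '-' and ','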

Without using LINQ or regex:
Split your string as you are doing now.
Create a data structure of string type, for example a List<string>.
Run over the results array and for each entry check whether it contains the specified character; if it doesn't, add it to the List.
In the end you should have a List with the required result.
The regex solution is far more elegant, but if you cannot use regex this one should do it.
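The steps above can be sketched roughly as follows. FilterTokens is a hypothetical helper name, and the sketch assumes that any token containing '[' should be dropped, as in the question's sample:

```csharp
using System;
using System.Collections.Generic;

// Top-level statements (C# 9 or later). FilterTokens is a hypothetical name.
static List<string> FilterTokens(string input)
{
    var kept = new List<string>();

    // Split exactly as in the question, on ',' and '-'.
    foreach (string token in input.Split(new[] { ',', '-' }))
    {
        // Skip bracketed entries such as "[news]"; keep everything else.
        if (token.IndexOf('[') < 0)
        {
            kept.Add(token);
        }
    }

    return kept;
}

List<string> result = FilterTokens("Cnn-[news],msnbc");
Console.WriteLine(string.Join(", ", result)); // Cnn, msnbc
```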

Related

C# Regex split by comma outside the { }

I am not as familiar with RegEx as I probably should be.
However, I am looking for an expression(s) that matches a variant of values.
My string:
2020/09/10 05:41:02,ABC,888,!"#$%'()=~|{`}*+_?><-^\#[;:]./\,{"data1-1":"48.16","data1-2":"!"#$%'()=~|{`}*+_?><-^\#[;:]./\"}
I am trying to split on commas using a regular expression:
string regex = "," + @"\s*(?![^{}]*\})";
List<string> listResult = Regex.Split(myString, regex).ToList();
The results I get are not correct.
Can regular expressions be used in this case?
What could I use to split that string on every comma outside the { }? Cheers
I'm not sure how this works with regular expressions. However, instead of using regex, you could just create a list with your delimiters and use the string.split method:
char[] delim = new [] {','}; //in your case just one delimiter
var listResult = myString.Split(delim, StringSplitOptions.RemoveEmptyEntries);
The string.Split method returns an array.
You can check how comma-separated value (CSV) format is usually parsed.
Here with a regex: https://stackoverflow.com/a/18147076/6424355
Splitting on commas is simpler if you don't need to handle quotes.
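If a regex proves awkward here, one alternative is a small hand-written scanner that tracks brace depth and only splits on commas at depth zero. This is only a sketch: SplitOutsideBraces is a hypothetical name, it assumes the braces are balanced, and it does not treat braces inside quoted strings specially. The sample input is a simplified stand-in for the string in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Top-level statements (C# 9 or later). SplitOutsideBraces is a hypothetical name.
static List<string> SplitOutsideBraces(string input)
{
    var fields = new List<string>();
    var current = new StringBuilder();
    int depth = 0; // number of currently open '{'

    foreach (char c in input)
    {
        if (c == '{') depth++;
        else if (c == '}' && depth > 0) depth--;

        if (c == ',' && depth == 0)
        {
            // A comma outside any braces ends the current field.
            fields.Add(current.ToString());
            current.Clear();
        }
        else
        {
            current.Append(c);
        }
    }

    fields.Add(current.ToString()); // last field has no trailing comma
    return fields;
}

foreach (string field in SplitOutsideBraces("a,b,{\"x\":\"1,2\",\"y\":\"3\"},c"))
{
    Console.WriteLine(field);
}
```

Unlike the plain Split(',') call, this keeps the whole {...} block as a single field.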

Split string into array that is enclosed in capturing parentheses

I have the following string in my project:
((1,01/31/2015)(1,Filepath)(1,name)(1,code)(1,String)(1, ))
I want to split this string into parts so that I get the information within each pair of parentheses (for example 1,Filepath or (1,Filepath)), but as you can see, the whole string is enclosed in parentheses too. I then try to put the result into an array with string[] array = Regex.Split(originalString,SomeRegexHere)
Now I am wondering what the best approach would be: just remove the first and last character of the string so the parentheses no longer enclose the whole string, or is there some way to use regular expressions to get the result I want?
string s = "((1,01/31/2015)(1,Filepath)(1,name)(1,code)(1,String)(1, ))";
var data = s.Split(new string[]{"(", ")"}, StringSplitOptions.RemoveEmptyEntries);
Your data would then be
["1,01/31/2015",
"1,Filepath",
"1,name",
"1,code",
"1,String",
"1, "]
You can create a substring without the first 2 and last 2 brackets and then split this on the enclosing brackets
var s = "((1,01/31/2015)(1,Filepath)(1,name)(1,code)(1,String)(1, ))";
var result = s.Substring(2, s.Length - 4)
.Split(new string[]{")("}, StringSplitOptions.RemoveEmptyEntries);
foreach(var r in result)
Console.WriteLine(r);
Output
1,01/31/2015
1,Filepath
1,name
1,code
1,String
1,
Just do a match and get your contents instead of splitting. Example:
(?<=\()[^()]*(?=\))
See demo:
https://regex101.com/r/eS7gD7/15
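In C#, that matching approach might look like the sketch below. It uses Regex.Matches with the same pattern; the Cast<Match>() call is there so the code also compiles on older frameworks where MatchCollection is not generic:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Top-level statements (C# 9 or later).
string s = "((1,01/31/2015)(1,Filepath)(1,name)(1,code)(1,String)(1, ))";

// Each match is the text between one '(' and the next ')'.
string[] parts = Regex.Matches(s, @"(?<=\()[^()]*(?=\))")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

foreach (string part in parts)
{
    Console.WriteLine("[" + part + "]"); // brackets make the trailing space in "1, " visible
}
```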

Splitting on “,” but not “/,”

Question: How do I write an expression to split a string on ',' but not '/,'? Later I'll want to replace '/,' with ', '.
Details...
Delimiter: ','
Skip Char: '/'
Example input: "Mister,Bill,is,made,of/,clay"
I want to split this input into an array: {"Mister", "Bill", "is", "made", "of, clay"}
I know how to do this with a char prev, cur; and some indexers, but that seems beta.
Java Regex has a split functionality, but I don't know how to replicate this behavior in C#.
Note: This isn't a duplicate question, this is the same question but for a different language.
I believe you're looking for a negative lookbehind:
var regex = new Regex("(?<!/),");
var result = regex.Split(str);
This will split str on all commas that are not preceded by a slash. If you want to keep the '/,' in the string then this will work for you.
Since you said that you wanted to split the string and later replace the '/,' with ', ', you'll want to do the above first; then you can iterate over the result and replace the strings like so:
var replacedResult = result.Select(s => s.Replace("/,", ", "));
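Put together as a runnable sketch on the question's sample input (the variable names here are only illustrative):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Top-level statements (C# 9 or later).
string str = "Mister,Bill,is,made,of/,clay";

// Split on commas that are not preceded by '/', then turn "/," into ", ".
string[] fields = Regex.Split(str, "(?<!/),")
    .Select(f => f.Replace("/,", ", "))
    .ToArray();

Console.WriteLine(string.Join(" | ", fields));
// Mister | Bill | is | made | of, clay
```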
string s = "Mister,Bill,is,made,of/,clay";
var arr = s.Replace("/,"," ").Split(',');
result : {"Mister", "Bill", "is", "made", "of clay"}
Using Regex:
var result = Regex.Split("Mister,Bill,is,made,of/,clay", "(?<=[^/]),");
Just use Replace to swap the escaped commas for a placeholder, split, then restore them (this assumes "//" does not otherwise occur in the string):
s.Replace("/,", "//").Split(',').Select(x => x.Replace("//", ","));
You can use this in C#:
string regex = @"(?<!\/),"; //lookbehind, so the character before the comma is not consumed by the split
var match = Regex.Split("Mister,Bill,is,made,of/,clay", regex);
After that you can replace /, and continue your operation as you like.

Substring of a variant string

I have the following return of a printer:
{Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000}
Ok, I need to save, in a list, the return in parts.
e.g.
[0] "Ta005000000000000000000F"
[1] "00000000000000000I"
[2] "00000000000000000N"
...
The problem is that the number of characters varies.
I tried to split it at the spaces and take the substrings, but failed...
Any suggestions?
Use String.Split on a single space, and use StringSplitOptions.RemoveEmptyEntries to make sure that multiple spaces are seen as only one delimiter:
var source = "00000000000000000FS 0000000...etc";
var myArray = source.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
EDIT: An elegant way to get rid of the braces is to include them as separators in the Split (thanks to Joachim Isaksson in the comments):
var myArray = source.Split(new[] {' ', '{', '}'}, StringSplitOptions.RemoveEmptyEntries);
You could use a Regex for this:
string input = "{Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000}";
IEnumerable<string> matches = Regex.Matches(input, "[0-9a-zA-Z]+").Cast<Match>().Select(m => m.Value);
You can use string.Split to create an array of substrings. Split allows you to specify multiple separator characters and to ignore empty entries if necessary.
You could use the Split method of the String class and split the string into the parts that you want.
Sample would be:
string input = "{Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000}";
string[] splits = input.Trim('{', '}').Split(' '); //strip the enclosing braces first
Console.WriteLine(splits[0]); // Ta005000000000000000000F
And so on.
Just off the bat, without considering the enclosing braces:
string printMsg = "Ta005000000000000000000F 00000000000000000I 00000000000000000N 00000000000000000FS 00000000000000000IS 00000000000000000NS 00000000000000000";
string[] msgs = printMsg.Split(' ').Select(s => s.Trim()).ToArray();
That could work.

C# string manipulation

I have a string like
A150[ff;1];A160;A100;D10;B10
from which I want to extract A150, D10, B10
In between these valid strings, I can have any characters. The one part that is consistent is the semicolon between each legitimate string.
Again, the junk characters that I am trying to remove can themselves contain a semicolon.
Without having more detail on the specific rules, it looks like you want to use String.Split(';') and then construct a regex to parse out the string you really need for each string in your newly created collection. Since you said that the semicolon can appear in the "junk", it's irrelevant since it won't match your regex.
var input = "A150[ff+1];A160;A150[ff-1]";
var temp = new List<string>();
foreach (var s in input.Split(';'))
{
temp.Add(Regex.Replace(s, "(A[0-9]*)\\[*.*", "$1"));
}
foreach (var s1 in temp.Distinct())
{
Console.WriteLine(s1);
}
produces the output
A150
A160
First, you should use
string s="A150[ff;1];A160;A100;D10;B1";
s.IndexOf("A160");
With this call you can get the index of A160, and likewise of the other words.
Then call s.Remove(index,count) to cut out the part you don't want.
If you only want to remove the 'junk' in between the '[' and ']' characters, you can use a regex for that:
Regex regex = new Regex(@"\[([^\]]+)\]"); //[^\]] stops at the first closing bracket
string result = regex.Replace("A150[ff;1];A160;A100;D10;B10", "");
Then use String.Split to get the individual items.
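A minimal sketch of that two-step approach, assuming the junk is always wrapped in a single, non-nested pair of square brackets:

```csharp
using System;
using System.Text.RegularExpressions;

// Top-level statements (C# 9 or later).
string input = "A150[ff;1];A160;A100;D10;B10";

// Step 1: drop every [...] block, including any semicolons inside it.
string cleaned = Regex.Replace(input, @"\[[^\]]*\]", "");

// Step 2: the remaining semicolons are real separators.
string[] items = cleaned.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine(string.Join(", ", items)); // A150, A160, A100, D10, B10
```

Removing the brackets before splitting matters here, because splitting first would cut "[ff;1]" in two at its inner semicolon.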
