How to split a string by whitespace in C#?

How can I split this by whitespace? (The first line is its header.)
I tried this code, but I get an "index out of range" error at cbay.ABS = columnsC[5], because the second line returns only 4 elements instead of 6 like the first line. I want the second line to return 6 elements as well.
using (StringReader strrdr = new StringReader(strData))
{
string str;
while ((str = strrdr.ReadLine()) != null)
{
// str = str.Trim();
if ((Regex.IsMatch(str.Substring(0, 1), @"J")) || (Regex.IsMatch(str.Substring(0, 1), @"C")))
{
columnsC = Regex.Split(str, " +");
cbay.AC = columnsC[1];
cbay.AU = columnsC[2];
cbay.SA = columnsC[3];
cbay.ABS = columnsC[5];
// cbay.ABS = str;
}
}
}

To get only the words, without the redundant whitespace, you can pass StringSplitOptions.RemoveEmptyEntries as the second argument to string.Split. Splitting on every single space produces an empty entry for each extra space, and this option discards those entries. Instead of using Regex, check this simple example:
string inputString = "Some   string  with words    separated with multiple   blank characters";
string[] words = inputString.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
string resultString = String.Join(" ", words); //joins the words without multiple whitespaces, this is for test only.
EDIT: In your particular case, this also works when the parts are separated by multiple spaces (three or more). Check the example:
string inputString = "J 16 16 13 3 3";
string[] words = inputString.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
EDIT2: This is the simplest and dumbest solution to your problem, but I think it will work:
if (str.Length > 0 && (str[0] == 'J' || str[0] == 'C'))
{
columnsC = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
if (str[0] == 'J')
{
cbay.AC = columnsC[1];
cbay.AU = columnsC[2];
cbay.SA = columnsC[3];
cbay.ABS = columnsC[5];
}
else
{
cbay.AU = columnsC[1];
cbay.SA = columnsC[2];
}
}
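A slightly more defensive variant of the idea above, as a sketch only. Passing a null separator array makes Split treat every whitespace character (spaces, tabs) as a delimiter, and checking the array length before indexing avoids the "index out of range" error on short rows. The sample rows and the ColumnOrNull helper are made up for illustration, since the real column layout is not shown in the question. Note that splitting alone cannot tell which columns were empty, so a short row still yields fewer elements.

```csharp
using System;

public static class WhitespaceSplitDemo
{
    // Splits a row on any run of whitespace (spaces, tabs, ...).
    public static string[] SplitColumns(string row)
    {
        // A null separator array means "split on whitespace characters".
        return row.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    // Hypothetical helper: returns the column at index, or null if the row is too short.
    public static string ColumnOrNull(string[] columns, int index)
    {
        return (index >= 0 && index < columns.Length) ? columns[index] : null;
    }

    public static void Main()
    {
        string[] fullRow = SplitColumns("J   16  16   13    3   3");
        string[] shortRow = SplitColumns("C   16 \t 13");

        Console.WriteLine(fullRow.Length);                          // 6
        Console.WriteLine(ColumnOrNull(shortRow, 5) ?? "(missing)"); // (missing)
    }
}
```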

You could first replace each run of two spaces with a zero placeholder and after that split on the remaining single spaces:
var test = "test 1   2 3";
var items = test.Replace("  ", " 0").Split(' ');
You might get several 0 entries where a gap is wide, but that will still work, I guess.

Related

How can I use indexof and substring to extract numbers from a string and make a List<int> of the numbers?

var file = File.ReadAllText(@"D:\localfile.html");
int idx = file.IndexOf("something");
int idx1 = file.IndexOf("</script>", idx);
string results = file.Substring(idx, idx1 - idx);
The result in results is :
arrayImageTimes.push('202110071730');arrayImageTimes.push('202110071745');arrayImageTimes.push('202110071800');arrayImageTimes.push('202110071815');arrayImageTimes.push('202110071830');arrayImageTimes.push('202110071845');arrayImageTimes.push('202110071900');arrayImageTimes.push('202110071915');arrayImageTimes.push('202110071930');arrayImageTimes.push('202110071945');
I need to extract each number between the single quotes and add each number to a List.
For example: extract the number 202110071730 and add it to the List.
You can first split by ; to get a list of statements.
Then split each statement by ' to get everything in front of, between and after the '. Take the middle one ([1]).
string s = "arrayImageTimes.push('202110071730');arrayImageTimes.push('202110071745');arrayImageTimes.push('202110071800');arrayImageTimes.push('202110071815');arrayImageTimes.push('202110071830');arrayImageTimes.push('202110071845');arrayImageTimes.push('202110071900');arrayImageTimes.push('202110071915');arrayImageTimes.push('202110071930');arrayImageTimes.push('202110071945');";
var statements = s.Split(new string[] { ";" }, StringSplitOptions.RemoveEmptyEntries);
foreach (var statement in statements)
{
Console.WriteLine(statement.Split('\'')[1]); // add to a list instead
}
Or, for all the Regex fanboys, '(\d+)' captures a group between ' with some digits:
Regex r= new Regex("'(\\d+)'");
var matches = r.Matches(s);
foreach (Match match in matches)
{
Console.WriteLine(match.Groups[1].Value); // add to a list instead
}
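To actually collect the numbers into a list, something like the following sketch could be used. One assumption to flag: values such as 202110071730 are larger than int.MaxValue (2,147,483,647), so this uses List<long> rather than the List<int> asked for in the question. The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TimestampExtractor
{
    // Returns every run of digits that appears between single quotes.
    public static List<long> ExtractNumbers(string script)
    {
        var numbers = new List<long>();
        foreach (Match match in Regex.Matches(script, @"'(\d+)'"))
        {
            // Group 1 holds the digits without the surrounding quotes.
            numbers.Add(long.Parse(match.Groups[1].Value));
        }
        return numbers;
    }

    public static void Main()
    {
        string s = "arrayImageTimes.push('202110071730');arrayImageTimes.push('202110071745');";
        List<long> numbers = ExtractNumbers(s);
        Console.WriteLine(string.Join(", ", numbers)); // 202110071730, 202110071745
    }
}
```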

c# split string and remove empty string

I want to remove empty and null strings in the split operation:
string number = "9811456789, ";
List<string> mobileNos = number.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries).Select(mobile => mobile.Trim()).ToList();
I tried this, but it is not removing the whitespace-only entry
var mobileNos = number.Replace(" ", "")
.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries).ToList();
As I understand it, this should help:
string number = "9811456789, ";
List<string> mobileNos = number.Split(',').Where(x => !string.IsNullOrWhiteSpace(x)).ToList();
The result is a list with only one element: [0] = "9811456789".
Hope it helps.
A string extension can do this in a neat way, as below.
The extension:
public static IEnumerable<string> SplitAndTrim(this string value, params char[] separators)
{
if (value == null) throw new ArgumentNullException(nameof(value));
return value.Trim().Split(separators, StringSplitOptions.RemoveEmptyEntries).Select(s => s.Trim());
}
Then you can use it with any string, as below:
char[] separator = { ' ', '-' };
var mobileNos = number.SplitAndTrim(separator);
I know it's an old question, but the following works just fine:
string number = "9811456789, ";
List<string> mobileNos = number.Split(new char[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
No need for extension methods or whatsoever.
"string,,,,string2".Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
returns ["string", "string2"]
The easiest and best solution is to combine StringSplitOptions.TrimEntries, which trims each result, with StringSplitOptions.RemoveEmptyEntries, which removes empty entries, using the bitwise OR operator (|). TrimEntries is available in .NET 5 and later.
string number = "9811456789, ";
List<string> mobileNos = number
.Split(',', StringSplitOptions.TrimEntries | StringSplitOptions.RemoveEmptyEntries)
.ToList();
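To compare how each option behaves on the string from the question, a small sketch like this can be run (it assumes .NET 5 or later, because of TrimEntries; the SplitWith helper is only for illustration):

```csharp
using System;

public static class SplitOptionsDemo
{
    // Splits on ',' with the given options so the results can be compared.
    public static string[] SplitWith(string input, StringSplitOptions options)
    {
        return input.Split(',', options);
    }

    public static void Main()
    {
        string number = "9811456789, ";

        // None: the trailing " " entry is kept as-is.
        Console.WriteLine(SplitWith(number, StringSplitOptions.None).Length);               // 2
        // RemoveEmptyEntries alone: " " is not empty, so it survives.
        Console.WriteLine(SplitWith(number, StringSplitOptions.RemoveEmptyEntries).Length); // 2
        // TrimEntries alone: " " is trimmed to "", but not removed.
        Console.WriteLine(SplitWith(number, StringSplitOptions.TrimEntries).Length);        // 2
        // Both: entries are trimmed first, then the empty ones are dropped.
        Console.WriteLine(SplitWith(number,
            StringSplitOptions.TrimEntries | StringSplitOptions.RemoveEmptyEntries).Length); // 1
    }
}
```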

Split string by List

I have a collection, SplitColl, with delimiters:
xx
yy
..
..
And string like this:
strxx
When I try to split the string:
var formattedText = "strxx";
var lst = new List<String>();
lst.Add("xx");
lst.Add("yy");
var arr = formattedText.Split(lst.ToArray(), 10, StringSplitOptions.RemoveEmptyEntries);
I get "str" as the result.
But how can I skip this result? I want to get an empty array in this case (when the delimiter is part of a word).
I expect that when formattedText="str xx", the result is str.
EDIT:
I have many address delimiters, such as street, city, town, etc.
And I try to get strings like: city DC -> DC.
But when I get a word like cityacdc, I get acdc, which is not the name of a city.
It seems that you are not using your keywords really as delimiters but as search criterion. In this case you could use RegEx to search for each pattern. Here is an example program to illustrate this procedure:
static void Main(string[] args)
{
List<string> delim = new List<string> { "city", "street" };
string formattedText = "strxx street BakerStreet cityxx city London";
List<string> results = new List<string>();
foreach (var del in delim)
{
string s = Regex.Match(formattedText, del + @"\s\w+\b").Value;
if (!String.IsNullOrWhiteSpace(s))
{
results.Add(s.Split(' ')[1]);
}
}
Console.WriteLine(String.Join("\n", results));
Console.ReadKey();
}
This would handle this case:
And I try to get strings like: city DC --> DC
to handle the case where you want to find the word in front of your keyword:
I expect, that when formattedText="str xx", result is str
just switch the places of the matching criterion:
string s = Regex.Match(formattedText, @"\b\w+\s" + del).Value;
and take the first element at the split
results.Add(s.Split(' ')[0]);
Give this a try. Basically, I first remove any leading or trailing delimiters (only if they are separated by a space) from the formattedText string. Then I split the remaining string on each delimiter that has spaces on both sides.
//usage
var result = FormatText(formattedText, delimiterlst);
//code
static string[] FormatText(string input, List<string> delimiters)
{
delimiters.ForEach(d => {
TrimInput(ref input, "start", d.ToCharArray());
TrimInput(ref input, "end", d.ToCharArray());
});
return input.Split(delimiters.Select(d => $" {d} ").ToArray(), 10, StringSplitOptions.RemoveEmptyEntries);
}
static void TrimInput(ref string input, string pos, char[] delimiter)
{
//backup
string temp = input;
//trim
input = (pos == "start") ? input.TrimStart(delimiter) : input.TrimEnd(delimiter);
string trimmed = (pos == "start") ? input.TrimStart() : input.TrimEnd();
//update string
input = (input != trimmed) ? trimmed : temp;
}
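For the "keyword followed by a value" case, the whole-word requirement can also be expressed in one pattern. The sketch below is not taken from either answer: it builds an alternation from the delimiter list, anchors it with \b so that only a keyword starting at a word boundary counts, and requires whitespace after the keyword so that cityacdc is skipped. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordValueFinder
{
    // Returns the word that follows each whole-word keyword, in order of appearance.
    public static List<string> FindValues(string text, IEnumerable<string> keywords)
    {
        // Escape each keyword so characters such as '.' are matched literally.
        string alternation = string.Join("|", keywords.Select(k => Regex.Escape(k)));
        string pattern = @"\b(?:" + alternation + @")\s+(\w+)";

        var values = new List<string>();
        foreach (Match match in Regex.Matches(text, pattern))
        {
            values.Add(match.Groups[1].Value);
        }
        return values;
    }

    public static void Main()
    {
        var delimiters = new List<string> { "city", "street" };
        string text = "strxx street BakerStreet cityxx city London";
        Console.WriteLine(string.Join(", ", FindValues(text, delimiters))); // BakerStreet, London
    }
}
```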

RegularExpressions with C#

How can I use Regular Expressions to split this string
String s = "[TEST name1=\"smith ben\" name2=\"Test\" abcd=\"Test=\" mmmm=\"Test=\"]";
into a list like below:
name1 smith ben
name2 Test
abcd Test=
mmmm Test=
It is similar to getting attributes from an element but not quite.
The first thing to do is remove the brackets and the 'TEST' part from the string, so you are left with just the keys and values. Then you can split it on the double-quote character into an array, where the entries at even indices are the keys and the entries at odd indices are the values. After that, it's easy enough to populate your list:
String s = "[TEST name1=\"smith ben\" name2=\"Test\" abcd=\"Test=\" mmmm=\"Test=\"]";
SortedList<string, string> list = new SortedList<string, string>();
//Remove the start and end tags
s = s.Remove(0, s.IndexOf(' '));
s = s.Remove(s.LastIndexOf('\"') + 1);
//Split the string
string[] pairs = s.Split(new char[] { '\"' }, StringSplitOptions.None);
//Add each pair to the list
for (int i = 0; i+1 < pairs.Length; i += 2)
{
string left = pairs[i].Trim('=', ' '); // strip the leading space and the trailing '='
string right = pairs[i+1].Trim('\"');
list.Add(left, right);
}
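Since the question asks for regular expressions, here is a possible regex-based sketch of the same extraction. It assumes keys consist of word characters and values never contain an escaped double quote; the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AttributeParser
{
    // Extracts key="value" pairs in the order they appear.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        // (\w+) captures the key, ([^"]*) captures everything up to the closing quote.
        foreach (Match match in Regex.Matches(input, @"(\w+)=""([^""]*)"""))
        {
            pairs.Add(new KeyValuePair<string, string>(
                match.Groups[1].Value, match.Groups[2].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        string s = "[TEST name1=\"smith ben\" name2=\"Test\" abcd=\"Test=\" mmmm=\"Test=\"]";
        foreach (var pair in Parse(s))
        {
            Console.WriteLine(pair.Key + " " + pair.Value);
        }
    }
}
```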

Split it into an array

I have this string of proxy addresses. They are separated by a space; however, X400 and X500 addresses can contain spaces themselves. What's the best approach to split it?
e.g.
smtp:john#a-mygot.com smtp:john#b-mygot.com smtp:john#c-mygot.com X400:C=us;A= ;P=mygot;O=Exchange;S=John;G=Gleen; SMTP:john#mygot.com
Expected result:
smtp:john#a-mygot.com
smtp:john#b-mygot.com
smtp:john#c-mygot.com
X400:C=us;A= ;P=mygot;O=Exchange;S=John;G=Gleen;
SMTP:john#mygot.com
thanks,
EDIT,
string mylist = "smtp:john#a-mygot.com smtp:john#b-mygot.com smtp:john#c-mygot.com X400:C=us;A= ;P=mygot;O=Exchange;S=John;G=Gleen; SMTP:john#mygot.com X500:/o=Example/ou=USA/cn=Recipients of /cn=juser smtp:myaddress";
string[] results = Regex.Split(mylist, @" +(?=\w+:)");
foreach (string part in results)
{
Console.WriteLine(part);
}
Result
smtp:john#a-mygot.com
smtp:john#b-mygot.com
smtp:john#c-mygot.com
X400:C=us;A= ;P=mygot;O=Exchange;S=John;G=Gleen;
SMTP:john#mygot.com
X500:/o=Example/ou=USA/cn=Recipients of /cn=juser
smtp:myaddress
Here is a Regex that should match the spaces before protocols. Try plugging it into Regex.Split like so:
string[] results = Regex.Split(input, @" +(?=\w+:)");
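The lookahead (?=\w+:) splits before any word followed by a colon, so an address body that happens to contain something like " note:" would be split as well. A stricter variant, sketched below, only splits before a known prefix. The prefix list (smtp, x400, x500) is an assumption based on the sample data and would need extending for other address types; the class name is made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProxyAddressSplitter
{
    // Splits on runs of spaces that are immediately followed by a known address prefix.
    public static string[] Split(string addresses)
    {
        // Non-capturing group, so Regex.Split does not add the prefix as a separate element.
        return Regex.Split(
            addresses.Trim(),
            @" +(?=(?:smtp|x400|x500):)",
            RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string list = "smtp:john#a-mygot.com X400:C=us;A= ;P=mygot;O=Exchange;S=John;G=Gleen; "
                    + "SMTP:john#mygot.com X500:/o=Example/ou=USA/cn=Recipients of /cn=juser smtp:myaddress";
        foreach (string part in Split(list))
        {
            Console.WriteLine(part);
        }
    }
}
```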
int index = smtpString.IndexOf("X400");
string[] smtps = smtpString.Substring(0, index).Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
int secondIndex = smtpString.IndexOf("SMTP", index);
string xfour = smtpString.Substring(index, secondIndex - index).Trim();
string lastString = smtpString.Substring(secondIndex);
This should work if the string always has that format, and if I didn't get the indexes wrong. You might want to check that IndexOf did not return -1, though.
Try this:
public static string[] SplitProxy(string text)
{
var list = new List<string>();
var tokens = text.Split(new char[] { ' ' });
var currentToken = new StringBuilder();
foreach (var token in tokens)
{
if (token.StartsWith("smtp", StringComparison.OrdinalIgnoreCase)) // safe for tokens shorter than 4 characters
{
if (currentToken.Length > 0)
{
list.Add(currentToken.ToString());
currentToken.Clear();
}
list.Add(token);
}
else
{
if (currentToken.Length > 0) currentToken.Append(' '); // restore the space removed by Split
currentToken.Append(token);
}
}
if (currentToken.Length > 0)
list.Add(currentToken.ToString());
return list.ToArray();
}
It splits the string by spaces into tokens, then goes through them one by one. If a token starts with smtp, it is added to the result array. If not, that token is concatenated with the following tokens to create one entry that is added to the result array. It should work with anything that has spaces and doesn't start with smtp.
I reckon the following line should do the work
var addrlist = variable.Split(new char[] { ' ' },StringSplitOptions.RemoveEmptyEntries);
