Splitting on ; then on =, having issues using String.Split, regex required - c#

"key1"="value1 http://www.example.com?a=1";"key2"="value2 http://www.example.com?a=2";
I need to split the above line twice: first on the semicolon character ; and then on the = sign.
It doesn't work correctly because the value part also contains an = sign.
My code assumed the value part doesn't contain an = sign, and it doesn't use a regex, just String.Split('=').
Can someone help with the regex required? I added double quotes around both the key and the value to help keep things separate.

I didn't use a regex, but you could do something like the following:
string test = @"""key1""=""value1 http://www.example.com?a=1"";""key2""=""value2 http://www.example.com?a=2""";
string[] arr = test.Split(';');
foreach (string s in arr)
{
    // Only the first '=' separates the key from the value.
    int index = s.IndexOf('=');
    string key = s.Substring(0, index);
    // Everything after the first '=' is the value.
    string value = s.Substring(index + 1);
}

Use the String.Split(char[], int) overload (http://msdn.microsoft.com/en-us/library/c1bs0eda.aspx). The second parameter will limit the number of substrings to return. If you know your strings will always have at least 1 equal sign (key/value pairs), then set the second parameter to 2.
string x = "key1=value1 http://www.example.com?a=1;key2=value2 http://www.example.com?a=2;";
char[] equal = new char[1] { '=' };
char[] semi = new char[1] { ';' };
string[] list = x.Split(semi, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in list)
{
string[] kvp = s.Split(equal, 2);
Console.WriteLine("Key: {0}, Value: {1}", kvp[0], kvp[1]);
}
Result:
Key: key1, Value: value1 http://www.example.com?a=1
Key: key2, Value: value2 http://www.example.com?a=2

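As a rough sketch of how the count-limited split above could be wrapped up for the quoted input from the question, the helper below collects the pairs into a dictionary. The names KeyValueParser and Parse are made up for illustration, and skipping pairs without an = sign is an assumption, not something the question specifies. It also assumes that a ; never appears inside a value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answer.
static class KeyValueParser
{
    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        foreach (string pair in input.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Count = 2: only the first '=' separates key from value.
            string[] kvp = pair.Split(new[] { '=' }, 2);
            if (kvp.Length != 2)
            {
                continue; // assumption: silently skip a pair that has no '='
            }
            // Strip the optional surrounding double quotes used in the question.
            result[kvp[0].Trim('"')] = kvp[1].Trim('"');
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        string input = "\"key1\"=\"value1 http://www.example.com?a=1\";\"key2\"=\"value2 http://www.example.com?a=2\";";
        foreach (KeyValuePair<string, string> entry in KeyValueParser.Parse(input))
        {
            Console.WriteLine("Key: {0}, Value: {1}", entry.Key, entry.Value);
        }
    }
}
```

If a value could ever contain a ; character, this approach would cut the value short, and matching with a regex is probably the safer route.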
Well what you can do, is you can use IndexOf to get the index of the first =
int i = myStr.IndexOf('=');
and then you can use the String.Substring to get the key and value
string key = myStr.Substring(0, i);
string value = myStr.Substring(i + 1);
Here is some documentation on the String Class that you might find useful

You need to match not split the text
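// Group 1 is the quoted key; group 2 runs from "http" up to the closing quote.
var keys = Regex.Matches(yourString, @"""(.*?)""=.*?(http.*?)"";").Cast<Match>().Select(x =>
    new
    {
        key = x.Groups[1].Value,
        value = x.Groups[2].Value
    }
);
foreach (var pair in keys)
{
    Console.WriteLine(pair.key);   // the key
    Console.WriteLine(pair.value); // the value
}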

Your regex should look like this.
"(.+?)"="(.+?)"
Sadly I do not know C# but this should work in every language. To get the results you have to select for every match:
group(1) as keys
group(2) as values
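For what it's worth, a C# version of that pattern might look like the sketch below. The QuotedPairs class and its Extract method are hypothetical names used only for this example. In a verbatim string each doubled "" stands for one literal double quote, so the pattern passed to the regex engine is exactly "(.+?)"="(.+?)".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the "(.+?)"="(.+?)" pattern in C#.
static class QuotedPairs
{
    public static List<KeyValuePair<string, string>> Extract(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Regex.Matches(input, @"""(.+?)""=""(.+?)"""))
        {
            // Groups[1] is the key, Groups[2] is the value (both without quotes).
            pairs.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return pairs;
    }
}

class Program
{
    static void Main()
    {
        string input = @"""key1""=""value1 http://www.example.com?a=1"";""key2""=""value2 http://www.example.com?a=2"";";
        foreach (KeyValuePair<string, string> pair in QuotedPairs.Extract(input))
        {
            Console.WriteLine("Key: {0}, Value: {1}", pair.Key, pair.Value);
        }
    }
}
```

Because the value group stops at the next double quote, an = sign inside the value does no harm, but a double quote inside a value would end the match early.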

You could also try using the Split method with multiple delimiter tokens.
This gives you a string[] of the values that were split out based on your tokens.
Passing StringSplitOptions.RemoveEmptyEntries removes the empty entries as well:
string strValue = @"""key1""=""value1 http://www.example.com?a=1"";""key2""=""value2 http://www.example.com?a=2""";
string[] strSplit = strValue.Split(new string[] { "\";\"", "\"=\"", "\"" }, StringSplitOptions.RemoveEmptyEntries);
Results
strSplit {string[4]} string[]
[0] "key1"
[1] "value1 http://www.example.com?a=1"
[2] "key2"
[3] "value2 http://www.example.com?a=2"

Use String.Split with StringSplitOptions.RemoveEmptyEntries and an array of strings with delimiters
string s = "\"key1\"=\"value1 http://www.example.com?a=1\";\"key2\"=\"value2 http://www.example.com?a=2\"";
string[] result = s.Split(new string[] { "\";\"", "\"=\"", "\"" },
StringSplitOptions.RemoveEmptyEntries);
result = {string[4]}
[0]: "key1"
[1]: "value1 http://www.example.com?a=1"
[2]: "key2"
[3]: "value2 http://www.example.com?a=2"
I use the following delimiters (including the double quotes):
";"
"="
"

Related

How to split string with longer separators being preferred over shorter ones?

I have a string which I want to split in two. Usually it is a name, operator and a value. I'd like to split it into name and value. The name can be anything, the value too. What I have, is an array of operators and my idea is to use it as separators:
var input = "name>=2";
var separators = new string[]
{
">",
">=",
};
var result = input.Split(separators, StringSplitOptions.RemoveEmptyEntries);
Code above gives result being name and =2. But if I rearrange the order of separators, so the >= would be first, like this:
var separators = new string[]
{
">=",
">",
};
That way, I'm getting nice name and 2 which is what I'm trying to achieve. Sadly, keeping the separators in a perfect order is a no go for me. Also, my collection of separators is not immutable. So, I'm thinking maybe I could split the string with longer separators given precedence over the shorter ones?
Thanks for help!
Here is a related question, explaining why such behaviour occurs in Split() method.
You can try several options. If you have a collection of the separators, you can sort them in the right order before splitting:
using System.Linq;
...
var result = input.Split(
    separators.OrderByDescending(item => item.Length).ToArray(), // longest first; Split needs a string[]
    StringSplitOptions.RemoveEmptyEntries);
You can try organizing all (including possible) separators into a single pattern, e.g.
[><=]+
here we split by the longest sequence of >, < and =
var result = Regex.Split(input, "[><=]+");
Demo:
using System;
using System.Linq;
using System.Text.RegularExpressions;
...
string[] tests = new string[] {
"name>123",
"name<4",
"name=78",
"name==other",
"name===other",
"name<>78",
"name<<=4",
"name=>name + 455",
"name>=456",
"a_b_c=d_e_f",
};
string report = string.Join(Environment.NewLine, tests
.Select(test => string.Join("; ", Regex.Split(test, "[><=]+"))));
Console.Write(report);
Outcome:
name; 123
name; 4
name; 78
name; other
name; other
name; 78
name; 4
name; name + 455
name; 456
a_b_c; d_e_f
You may try doing a regex split on an alternation which lists the longer >= first:
var input = "name>=2";
string[] parts = Regex.Split(input, "(?:>=|>)");
foreach (var item in parts)
{
    Console.WriteLine(item);
}
This prints:
name
2
Note that had we split on (?:>|>=), the output would have been name and =2.
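Since the asker's separator collection can change, the alternation could also be built at run time instead of being hard-coded. The sketch below is one possible way to do that; OperatorSplitter is a hypothetical name, and it assumes the separator list is not empty. Sorting by length puts >= ahead of > in the alternation, and Regex.Escape guards against separators that contain regex metacharacters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: builds a "longest first" alternation from any separator list.
static class OperatorSplitter
{
    public static string[] Split(string input, params string[] separators)
    {
        // Assumption: separators contains at least one non-empty string.
        string pattern = string.Join("|", separators
            .OrderByDescending(s => s.Length)   // longer operators are tried first
            .Select(s => Regex.Escape(s)));     // treat each separator literally
        return Regex.Split(input, pattern);
    }
}

class Program
{
    static void Main()
    {
        // The separators are deliberately passed in the "wrong" order.
        foreach (string part in OperatorSplitter.Split("name>=2", ">", ">="))
        {
            Console.WriteLine(part);
        }
    }
}
```

Unlike String.Split with RemoveEmptyEntries, Regex.Split keeps empty strings, so an input that starts or ends with an operator yields an empty element.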

How to remove Whitespace from stringArray formed based on whitespace

I have a string that contains values like this:
90 524 000 1234567890 2207 1926 00:34 02:40 S
I have split this string into a string array on whitespace. Now I want to create a second string array with all the whitespace entries removed, so that it contains only the real values.
I also want to find the position of an element in the original string array, given an element selected from the new array.
Please help me.
You can use StringSplitOptions.RemoveEmptyEntries via String.Split.
var values = input.Split(new [] {' '}, StringSplitOptions.RemoveEmptyEntries);
StringSplitOptions.RemoveEmptyEntries: The return value does not include array elements that contain an empty string
When the Split method encounters two consecutive whitespace characters, it returns an empty string between them. Using StringSplitOptions.RemoveEmptyEntries removes those empty strings and gives you only the values you want.
You can also achieve this using LINQ
var values = input.Split().Where(x => x != string.Empty).ToArray();
Edit: If I understand you correctly you want the positions of the values in your old array. If so you can do this by creating a dictionary where the keys are the actual values and the values are indexes:
var oldValues = input.Split(' ');
var values = input.Split().Where(x => x != string.Empty).ToArray();
var indexes = values.ToDictionary(x => x, x => Array.IndexOf(oldValues, x));
Then indexes["1234567890"] will give you the position of 1234567890 in the first array.
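One caveat with the dictionary approach: ToDictionary throws if the same value appears twice, and Array.IndexOf only ever finds the first occurrence. A possible alternative, sketched below with the hypothetical names TokenPositions and Locate, keeps each non-empty value together with its index in the original array by using the Select overload that also supplies the index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: pairs every non-empty token with its index in the raw Split(' ') array.
static class TokenPositions
{
    public static List<KeyValuePair<int, string>> Locate(string input)
    {
        return input.Split(' ')
            // Key = position in the original array, Value = the token itself.
            .Select((token, index) => new KeyValuePair<int, string>(index, token))
            .Where(pair => pair.Value.Length > 0)   // drop the empty entries
            .ToList();
    }
}

class Program
{
    static void Main()
    {
        // Two spaces after "90" produce an empty entry at index 1 in the original array.
        foreach (KeyValuePair<int, string> pair in TokenPositions.Locate("90  524 000"))
        {
            Console.WriteLine("{0} was at index {1}", pair.Value, pair.Key);
        }
    }
}
```

The position of an entry in the returned list is its index in the cleaned array, and its Key is the index in the original one, so duplicates are handled without any lookup.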
You can use StringSplitOptions.RemoveEmptyEntries:
string[] arr = str.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
Note that I've also added the tab character as a delimiter. There are other whitespace characters, such as the line separator character; add them as needed. Full list here.
string s = "90 524 000 1234567890 2207 1926 00:34 02:40 S ";
var values = s.Split(' ').Where(x => !String.IsNullOrWhiteSpace(x)).ToArray();

How to break a string at each comma?

Hi guys, I have a problem at hand that I can't seem to figure out. I have a string (C#) which looks like this:
string tags = "cars, motor, wheels, parts, windshield";
I need to break this string at every comma and assign each word to a new string by itself, like:
string individual_tag = "car";
I know I have to do some kind of loop here, but I'm not really sure how to approach this. Any help will be really appreciated.
No loop needed. Just a call to Split():
var individualStrings = tags.Split(new string[] { ", " }, StringSplitOptions.RemoveEmptyEntries);
You can use one of String.Split methods
Split Method (Char[])
Split Method (Char[], StringSplitOptions)
Split Method (String[], StringSplitOptions)
Let's try the second option:
I'm passing , and space as the split characters, so the input string is split at every occurrence of either one. That can leave empty strings in the results; we can remove them using the StringSplitOptions.RemoveEmptyEntries parameter.
string[] tagArray = tags.Split(new char[] { ',', ' ' },
    StringSplitOptions.RemoveEmptyEntries);
OR
string[] tagArray = tags.Split(", ".ToCharArray(),
    StringSplitOptions.RemoveEmptyEntries);
You can access each tag like this:
foreach (var t in tagArray)
{
    lblTags.Text = lblTags.Text + " " + t; // update the label with the tag values
    //System.Diagnostics.Debug.WriteLine(t); // this result can be seen in your VS Output window
}
The Split function will do this for you:
string[] s = tags.Split(',');
or
String.Split Method (Char[], StringSplitOptions)
char[] charSeparators = new char[] {',',' '};
string[] words = tags.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);
string[] words = tags.Split(',');
You are looking for the C# Split() method.
string[] tagArray = tags.Split(',');
Edit:
string[] tag = tags.Trim().Split(new string[] { ", " }, StringSplitOptions.RemoveEmptyEntries);
You should definitely use the form supplied by Justin Niessner. There were two key differences that may be helpful depending on the input you receive:
You had spaces after your ,s so it would be best to split on ", "
StringSplitOptions.RemoveEmptyEntries will remove the empty entry that is possible in the case that you have a trailing comma.
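If the spacing around the commas is not always exactly one space, another option is to split on the comma alone and trim each piece afterwards. This is only a sketch; TagSplitter is a hypothetical name, and treating whitespace-only entries as empty is an assumption about what the asker would want.

```csharp
using System;
using System.Linq;

// Hypothetical helper: tolerant of missing or extra spaces around the commas.
static class TagSplitter
{
    public static string[] Split(string tags)
    {
        return tags.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(tag => tag.Trim())        // remove spaces left around each tag
            .Where(tag => tag.Length > 0)     // assumption: drop entries that were only whitespace
            .ToArray();
    }
}

class Program
{
    static void Main()
    {
        // Irregular spacing, a blank entry, and a trailing comma.
        foreach (string tag in TagSplitter.Split("cars, motor,wheels , ,parts,"))
        {
            Console.WriteLine(tag);
        }
    }
}
```

Unlike splitting on both ',' and ' ' as characters, this keeps a multi-word tag such as "spare parts" in one piece.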
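Program that splits on commas and spaces [C#]
using System;
class Program
{
    static void Main()
    {
        string s = "there, is, a, cat";
        // Both ',' and ' ' are separators, so drop the empty entry between them.
        string[] words = s.Split(", ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
        foreach (string word in words)
        {
            Console.WriteLine(word);
        }
    }
}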
Output
there
is
a
cat

Splitting a String into only 2 parts

I want to take a string from a textbox (txtFrom) and save the first word and save whatever is left in another part. (the whatever is left is everything past the first space)
Example string = "Bob jones went to the store"
array[0] would give "Bob"
array[1] would give "jones went to the store"
I know there is string[] array = txtFrom.Split(' '); , but that gives me an array of 6 with individual words.
Use String.Split(Char[], Int32) overload like this:
string[] array = txtFrom.Text.Split(new char[]{' '},2);
http://msdn.microsoft.com/en-us/library/c1bs0eda.aspx
You simply combine a split with a join to get the first element:
string[] items = source.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
string firstItem = items[0];
string remainingItems = string.Join(" ", items.Skip(1).ToList());
You simply take the first item and then reform the remainder back into a string.
char[] delimiterChars = { ' ', ',' };
string text = txtString.Text;
string[] words = text.Split(delimiterChars, 2);
txtString1.Text = words[0].ToString();
txtString2.Text = words[1].ToString();
There is an overload of the String.Split() method which takes an integer representing the maximum number of substrings to return.
So your method call would become: string[] array = txtFrom.Text.Split(new[] { ' ' }, 2); (the shorter Split(' ', 2) form is only available in .NET Core 2.0 and later).
You can also try RegularExpressions
Match M = System.Text.RegularExpressions.Regex.Match(source, @"(.*?)\s(.*)");
string first = M.Groups[1].Value; // Bob
string rest = M.Groups[2].Value;  // jones went to the store
The regular expression matches everything up to the first whitespace character and stores it in the first group; the ? tells it to make the smallest match possible. The second group grabs everything after that whitespace character.
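One detail none of the answers above spell out is what happens when the text contains no space at all: Split with a count of 2 then returns a single-element array, so reading index 1 would throw. The sketch below shows one way to guard against that; FirstWord and SplitOnce are hypothetical names, and returning an empty remainder is just one possible choice.

```csharp
using System;

// Hypothetical helper: always returns exactly two elements (first word, remainder).
static class FirstWord
{
    public static string[] SplitOnce(string text)
    {
        string[] parts = text.Split(new[] { ' ' }, 2);
        // With no space in the input, Split returns one element; pad with an empty remainder.
        return parts.Length == 2 ? parts : new[] { parts[0], string.Empty };
    }
}

class Program
{
    static void Main()
    {
        string[] array = FirstWord.SplitOnce("Bob jones went to the store");
        Console.WriteLine(array[0]);
        Console.WriteLine(array[1]);
    }
}
```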

string.split - by multiple character delimiter

I am having trouble splitting a string in C# with a delimiter of "][".
For example the string "abc][rfd][5][,][."
Should yield an array containing;
abc
rfd
5
,
.
But I cannot seem to get it to work, even if I try RegEx I cannot get a split on the delimiter.
EDIT: Essentially I wanted to resolve this issue without the need for a Regular Expression. The solution that I accept is;
string Delimiter = "][";
string[] Result = StringToSplit.Split(new[] { Delimiter }, StringSplitOptions.None);
I am glad to be able to resolve this split question.
To show both string.Split and Regex usage:
string input = "abc][rfd][5][,][.";
string[] parts1 = input.Split(new string[] { "][" }, StringSplitOptions.None);
string[] parts2 = Regex.Split(input, @"\]\[");
string tests = "abc][rfd][5][,][.";
string[] reslts = tests.Split(new char[] { ']', '[' }, StringSplitOptions.RemoveEmptyEntries);
Another option:
Replace the string delimiter with a single character, then split on that character.
string input = "abc][rfd][5][,][.";
string[] parts1 = input.Replace("][","-").Split('-');
string[] parts = Regex.Split("abc][rfd][5][,][.", @"\]\[");
A quick wrapper that lets you pass the delimiter as a plain string instead of a string array:
string[] StringSplit(string StringToSplit, string Delimitator)
{
return StringToSplit.Split(new[] { Delimitator }, StringSplitOptions.None);
}
StringSplit("E' una bella giornata oggi", "giornata");
/* Output
[0] "E' una bella "
[1] " oggi"
*/
In .NET Core 2.0 and later, there is a Split overload that takes a string delimiter directly:
string delimiter = "][";
var results = stringToSplit.Split(delimiter);
Split (netcore 2.0 version)
