Remove commas between the [] brackets in a string using C#

I want to remove only the commas between the square brackets [], not every comma in the string.
Here is my string:
string result = "a,b,c,[c,d,e],f,g,[h,i,j]";
Expected output:
a,b,c,[cde],f,g,[hij]
Thanks in advance.

As I've written, you need a simple state machine (inside brackets, outside brackets)... Then for each character, you analyze it and if necessary you change the state of the state machine and decide if you need to output it or not.
public static string RemoveCommas(string str)
{
int bracketLevel = 0;
var sb = new StringBuilder(str.Length);
foreach (char ch in str)
{
switch (ch) {
case '[':
bracketLevel++;
sb.Append(ch);
break;
case ']':
if (bracketLevel > 0) {
bracketLevel--;
}
sb.Append(ch);
break;
case ',':
if (bracketLevel == 0) {
sb.Append(ch);
}
break;
default:
sb.Append(ch);
break;
}
}
return sb.ToString();
}
Use it like:
string result = "a,b,c,[c,d,e],f,g,[h,i,j]";
Console.WriteLine(RemoveCommas(result));
Note that to "save" the state of the state machine I'm using an int, so that it works with nested brackets, like a,b,[c,d,[e,f]g,h]i,j
Just as an interesting exercise, it can be done with a slower LINQ expression:
string result2 = result.Aggregate(new
{
BracketLevel = 0,
Result = string.Empty,
}, (state, ch) => new {
BracketLevel = ch == '[' ?
state.BracketLevel + 1 :
ch == ']' && state.BracketLevel > 0 ?
state.BracketLevel - 1 :
state.BracketLevel,
Result = ch != ',' || state.BracketLevel == 0 ? state.Result + ch : state.Result
}).Result;
In the end the code is very similar: there is a state that is carried along (the BracketLevel) plus the string (Result) that is being built. Please don't use it; it is only written as an amusing piece of LINQ.

Regex approach
string stringValue = "a,b,c,[c,d,e],f,g,[h,i,j]";
var result = Regex.Replace(stringValue, @",(?![^\]]*(?:\[|$))", string.Empty);
This works only if you don't have nested brackets.
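To see why the nested-bracket caveat matters, here is a small sketch that wraps the same pattern so it can be run on a flat and on a nested input. The names RegexCommaDemo and RemoveInnerCommas are made up for illustration; only Regex.Replace and the pattern come from the answer.

```csharp
using System.Text.RegularExpressions;

public static class RegexCommaDemo
{
    // Hypothetical wrapper around the pattern from the answer above.
    // The lookahead keeps a comma when the next bracket after it is '['
    // (or when no ']' follows at all); every other comma is removed.
    public static string RemoveInnerCommas(string input)
    {
        return Regex.Replace(input, @",(?![^\]]*(?:\[|$))", string.Empty);
    }
}
```

RemoveInnerCommas("a,b,c,[c,d,e],f,g,[h,i,j]") gives a,b,c,[cde],f,g,[hij], but for the nested input a,b,[c,d,[e,f]g,h]i,j it gives a,b,[c,d,[ef]gh]i,j. The commas after c and d survive because the next bracket they see is the inner [, so the pattern treats them as if they were outside. A depth counter does not have that problem.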

You can try this:
var output = new string(result
.Where((s, index) => s != ',' ||
IsOutside(result.Substring(0, index)))
.ToArray()
);
//output: a,b,c,[cde],f,g,[hij]
And
private static bool IsOutside(string value)
{
return value.Count(i => i == '[') <= value.Count(i => i == ']');
}
But remember that this is not an efficient way of doing the job: it rescans the preceding text for every comma.

Related

text parsing application c# without third party libraries

For example, there is a line:
name, tax, company.
To separate them I need a split method.
string[] text = File.ReadAllLines("file.csv", Encoding.Default);
foreach (string line in text)
{
string[] words = line.Split(',');
foreach (string word in words)
{
Console.WriteLine(word);
}
}
Console.ReadKey();
But how do I split a line when a quoted field itself contains a comma:
name, tax, "company, Ariel";
"name, surname", tax, company;
and so on.
To make it look like this:
Max | 12.3 | company, Ariel
Alex, Smith| 13.1 | Oriflame
It is necessary to take into account that the input data will not always be in an ideal format (as in the example). That is, there may be 3 quotes in a row or a string without commas. The program should not crash in any case. If a line cannot be parsed, it should issue a message saying so.
Split on the double quotes first, then split the first resulting string on commas.
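A related idea, sketched here as a single character-by-character scan rather than two Split calls, is to track whether the scan is currently inside quotes and only treat commas outside quotes as separators. The class and method names (QuotedLineSplitter, TrySplit) are hypothetical, and the sketch assumes that an odd number of quotes means the line cannot be parsed.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuotedLineSplitter
{
    // Hypothetical helper: splits one line on commas, treating text inside
    // double quotes as part of a single field. Returns false when the quotes
    // are unbalanced, so the caller can report the line instead of crashing.
    public static bool TrySplit(string line, out List<string> fields)
    {
        fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        foreach (char ch in line)
        {
            if (ch == '"')
            {
                inQuotes = !inQuotes;          // toggle state, drop the quote itself
            }
            else if (ch == ',' && !inQuotes)
            {
                fields.Add(current.ToString().Trim());   // field ends here
                current.Clear();
            }
            else
            {
                current.Append(ch);            // ordinary character, or a comma inside quotes
            }
        }
        fields.Add(current.ToString().Trim()); // last field has no trailing comma
        return !inQuotes;                      // still inside quotes means unbalanced
    }
}
```

With this sketch, the line Max, 12.3, "company, Ariel" yields the three fields Max, 12.3 and company, Ariel, while a line with three quotes in a row makes TrySplit return false. A line without commas simply comes back as one field.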
You can use TextFieldParser from Microsoft.VisualBasic.FileIO
var list = new List<Data>();
var isHeader=true;
using (TextFieldParser parser = new TextFieldParser(filePath))
{
parser.Delimiters = new string[] { "," };
while (true)
{
string[] parts = parser.ReadFields();
if(isHeader)
{
isHeader = false;
continue;
}
if (parts == null)
break;
list.Add(new Data
{
People = parts[0],
Tax = Double.Parse(parts[1]),
Company = parts[2]
});
}
}
Where Data is defined as
public class Data
{
public string People{get;set;}
public double Tax{get;set;}
public string Company{get;set;}
}
Please note you need to include Microsoft.VisualBasic.FileIO
Example Data,
Name,Tax,Company
Max,12.3,"company, Ariel"
Ariel,13.1,"company, Oriflame"
Here's a bit of code that might help, not the most efficient but I use it to 'see' what is going on with the parsing if a particular line is giving trouble.
string[] text = File.ReadAllLines("file.csv", Encoding.Default);
string[] datArr;
string tmpStr;
foreach (string line in text)
{
ParseString(line, ",", "!####!", out datArr, out tmpStr);
foreach(string s in datArr)
{
Console.WriteLine(s);
}
}
Console.ReadKey();
private static void ParseString(string inputString, string origDelim, string newDelim, out string[] retArr, out string retStr)
{
string tmpStr = inputString;
retArr = new[] {""};
retStr = "";
if (!string.IsNullOrWhiteSpace(tmpStr))
{
//If there is only one Quote character in the line, ignore/remove it:
if (tmpStr.Count(f => f == '"') == 1)
tmpStr = tmpStr.Replace("\"", "");
string[] tmpArr = tmpStr.Split(new[] {origDelim}, StringSplitOptions.None);
var inQuote = 0;
StringBuilder lineToWrite = new StringBuilder();
foreach (var s in tmpArr)
{
if (s.Contains("\""))
inQuote++;
switch (inQuote)
{
case 1:
//Begin quoted text
lineToWrite.Append(lineToWrite.Length > 0
? newDelim + s.Replace("\"", "")
: s.Replace("\"", ""));
if (s.Length > 4 && s.Substring(0, 2) == "\"\"" && s.Substring(s.Length - 2, 2) != "\"\"")
{
//if string has two quotes at the beginning and is > 4 characters and the last two characters are NOT quotes,
//inquote needs to be incremented.
inQuote++;
}
else if ((s.Substring(0, 1) == "\"" && s.Substring(s.Length - 1, 1) == "\"" &&
s.Length > 1) || (s.Count(x => x == '\"') % 2 == 0))
{
//if string has more than one character and both begins and ends with a quote, then it's ok and counter should be reset.
//if string has an EVEN number of quotes, it should be ok and counter should be reset.
inQuote = 0;
}
else
{
inQuote++;
}
break;
case 2:
//text between the quotes
//If we are here the origDelim value was found between the quotes
//include origDelim so there is no data loss.
//Example quoted text: "Dr. Mario, Sr, MD";
// ", Sr" would be handled here
// ", MD" would be handled in case 3 end of quoted text.
lineToWrite.Append(origDelim + s);
break;
case 3:
//End quoted text
//If we are here the origDelim value was found between the quotes
//and we are at the end of the quoted text
//include origDelim so there is no data loss.
//Example quoted text: "Dr. Mario, MD"
// ", MD" would be handled here.
lineToWrite.Append(origDelim + s.Replace("\"", ""));
inQuote = 0;
break;
default:
lineToWrite.Append(lineToWrite.Length > 0 ? newDelim + s : s);
break;
}
}
if (lineToWrite.Length > 0)
{
retStr = lineToWrite.ToString();
retArr = retStr.Split(new[] {newDelim}, StringSplitOptions.None);
}
}
}

Variant delimiting behavior depending on brackets wrapping string in C#

I have the following (incorrect) code:
this.MetadataProperties = new HashSet<string>(SplitMetadataProperties(metadataProperties));
private static List<string> SplitMetadataProperties(string properties)
{
// Speed up accesses and splits
char[] propertiesArray = properties.ToCharArray();
// Look for a delimiter
int bracketCount = 0;
int start = 0;
List<string> result = new List<string>();
for (int i = 0; i < propertiesArray.Length; ++i)
{
switch (properties[i])
{
case '[':
// Column open
bracketCount++;
break;
case ']':
// Column close
bracketCount--;
break;
case '|':
// Delimiter
if (bracketCount != 0)
{
// Treat this as a normal character, since it's not actually a delimiter. It's a part of a column.
break;
}
if (i > start)
{
// It's not empty, add it
int propertiesArrayLength = propertiesArray.Length - start;
result.Add(new string(propertiesArray, start, propertiesArrayLength));
}
// This is a delimiter. Split off this property and move to the next one
start = i + 1;
break;
}
// Add last item if needed
if (start < propertiesArray.Length)
{
int propertiesArrayLength = propertiesArray.Length - start;
result.Add(new string(propertiesArray, start, propertiesArrayLength));
}
}
return result;
}
Here's the desired behavior. Suppose we have:
string properties = "foo||||bar";
Desired result is:
expectedHashSet[0] = "foo";
expectedHashSet[1] = "bar";
However, right now I'm getting this:
"foo||||bar"
"|||bar"
"||bar"
"|bar"
"bar"
Note that if you have:
string properties = "[foo||||bar]";
then the desired result is instead the same exact string:
"[foo||||bar]";
"|" and "[" / "]" cause special behavior, hence the switch. I think the issue lies somewhere in the length calculations or related computation, but I'm not quite sure the exact change(s) I need. Any suggestions? Thanks!
I'd recommend the regular expressions solution here:
using System.Text.RegularExpressions;
And
private static IEnumerable<string> SplitMetadataProperties(string properties)
{
string pattern = @"(\[.+\])|[^\|\s]+";
foreach (Match match in Regex.Matches(properties, pattern))
yield return match.Value;
}
Now all you need is to define your HashSet
this.MetadataProperties = new HashSet<string>(SplitMetadataProperties(metadataProperties));
For the input:
[foo||||bar]
foo||||bar
[foo||||bar]
foo||||bar
foo|||bar
[foo||||bar]
[foo||||bar]
The output was:
[foo||||bar]
foo
bar
Pattern explanation:
Find either a string that starts with [ and ends with ] with at least one character in between, whatever it is; or a run of any characters other than the delimiter (|) and whitespace.
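To make the two alternatives concrete, here is a small sketch that collects the matches for a given pattern. The PatternDemo class and its members are hypothetical. Greedy is the pattern from this answer; Lazy is a variant with .+? that I am assuming for comparison.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // The pattern from the answer: a bracketed run, or a run without '|' and whitespace.
    public const string Greedy = @"(\[.+\])|[^\|\s]+";

    // Assumed variant: the lazy quantifier stops at the first ']' instead of the last.
    public const string Lazy = @"(\[.+?\])|[^\|\s]+";

    // Hypothetical helper: returns every match of the pattern, in order.
    public static List<string> Matches(string input, string pattern)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
            result.Add(m.Value);
        return result;
    }
}
```

Matches("foo||||bar", PatternDemo.Greedy) returns foo and bar, and "[foo||||bar]" comes back as one match. The greedy form has one catch: for "[a]|b|[c]" it returns the whole string as a single match, because .+ runs to the last ]. The lazy form returns [a], b and [c] for the same input.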
What you need is a state machine to parse the input. In this case, the state is inside [,] or outside. I normally like to create an IEnumerable state machine to return each parsed field, which can then be further processed as needed.
So, an extension method to handle your parsing:
public static class StringExt {
public static IEnumerable<string> SplitMetadataProperties(this string s) {
var sb = new StringBuilder();
bool inQuote = false;
foreach (var ch in s) {
switch (ch) {
case '[':
inQuote = true;
sb.Append(ch);
break;
case ']':
inQuote = false;
sb.Append(ch);
break;
case '|':
if (inQuote)
sb.Append(ch);
else {
if (sb.Length > 0) {
yield return sb.ToString();
sb.Clear();
}
}
break;
default:
sb.Append(ch);
break;
}
}
if (sb.Length > 0)
yield return sb.ToString();
}
}
And, now you can parse your input and process as needed:
this.MetadataProperties = new HashSet<string>(metadataProperties.SplitMetadataProperties());
NOTE: I would normally use an extension method (that is included in the latest .Net):
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source) => new HashSet<T>(source);
public static HashSet<T> ToHashSet<T>(this IEnumerable<T> source, IEqualityComparer<T> cmp) => new HashSet<T>(source, cmp);
So you would have
this.MetadataProperties = metadataProperties.SplitMetadataProperties().ToHashSet();
You want a list or array? For array:
string properties = "foo||||bar";
if (!(properties.StartsWith("[") && properties.EndsWith("]")))
{
var expectedHashSet = properties.Split(new char[] { '|'}, StringSplitOptions.RemoveEmptyEntries);
}
For a List simply use .ToList().
I think I now understand what you are asking. This one is actually based on Aly El-Haddad's reply, only made a slight addition for edge cases.
void Main()
{
string properties = "foo||||bar[as|is]ak|yak[yet|another]ll";
foreach (var t in SplitMetadataProperties(properties))
{
Console.WriteLine(t);
}
}
private static IEnumerable<string> SplitMetadataProperties(string properties)
{
foreach (Match m in Regex.Matches(properties, @"\[.+?\]"))
{
properties = properties.Replace(m.Value, "|" + m.Value + "|");
}
string pattern = @"(\[.+?\])|[^\|\s]+";
foreach (Match match in Regex.Matches(properties, pattern))
{
yield return match.Value;
}
}

C# string parse

I have string like this
string temp = "'ADDR_LINE_2','MODEL','TABLE',5,'S','Y','C40','MUL,NBLD,NITA,NUND','','Address line 2'";
Each pair of single quotes encloses a field, and the fields are delimited by commas. I want to empty the 8th field in the string. I cannot simply do Replace("MUL,NBLD,NITA,NUND","") because that field could contain anything. Also, please note that the 4th field is a number and therefore has no single quotes around the 5.
How can I achieve this?
static void Main()
{
var temp = "'ADDR_LINE_2','MODEL','TABLE',5,'S','Y','C40','MUL,NBLD,NITA,NUND','','Address line 2'";
var parts = Split(temp).ToArray();
parts[7] = null;
var ret = string.Join(",", parts);
// or replace the above 3 lines with this...
//var ret = string.Join(",", Split(temp).Select((v,i)=>i!=7 ? v : null));
//ret == "'ADDR_LINE_2','MODEL','TABLE',5,'S','Y','C40',,'','Address line 2'"
}
public static IEnumerable<string> Split(string input, char delimiter = ',', char quote = '\'')
{
string temp = "";
bool skipDelimiter = false;
foreach (var c in input)
{
if (c == quote)
skipDelimiter = !skipDelimiter;
else if (c == delimiter && !skipDelimiter)
{
//do split
yield return temp;
temp = "";
continue;
}
temp += c;
}
yield return temp;
}
I made a small implementation below. I explain the logic in the comments. Basically you want to write a simple parser to accomplish what you described.
edit0: just realized I did the opposite of what you asked for oops..fixed now
edit1: replacing the string with null as opposed to eliminating the entire field from the comma-delimited list.
static void Main(string[] args)
{
string temp = "'ADDR_LINE_2','MODEL','TABLE',5,'S','Y','C40','MUL,NBLD,NITA,NUND','','Address line 2'";
//keep track of the single quotes
int singleQuoteCount= 0;
//keep track of commas
int comma_count = 0;
String field = "";
foreach (Char chr in temp)
{
//add to the field string if we are not between the 7th and 8th comma not counting commas between single quotes
if (comma_count != 7)
field += chr;
//plug in an empty string between two single quotes instead of whatever chars are in the eighth field.
else if (chr == '\'' && singleQuoteCount %2 ==1)
field += "\'',";
if (chr == '\'') singleQuoteCount++;
//only want to add to comma_count if we are outside of single quotes.
if (singleQuoteCount % 2 == 0 && chr == ',') comma_count++;
}
}
If you used '-' (or another char) instead of ',' inside the fields (e.g. 'MUL-NBLD-NITA-NUND'), you could use this code:
static void Main(string[] args)
{
string temp = "'ADDR_LINE_2','MODEL','TABLE',5,'S','Y','C40','MUL-NBLD-NITA-NUND','','Address line 2'";
temp = replaceField(temp, 8);
}
static string replaceField(string list, int field)
{
string[] fields = list.Split(',');
string chosenField = fields[field - 1 /*<--Arrays start at 0!*/];
if(!(field == fields.Length))
list = list.Replace(chosenField + ",", "");
else
list = list.Replace("," + chosenField, "");
return list;
}
//Return-Value: "'ADDR_LINE_2','MODEL','TABLE',5,'S','Y','C40','','Address line 2'"

C# string.split() separate string by uppercase

I've been using the Split() method to split strings, but it only works if you pass a specific separator character to string.Split(). Is there any way to split a string wherever an uppercase letter occurs?
Is it possible to get the separate words out of an unseparated string like:
DeleteSensorFromTemplate
And the result string is to be like:
Delete Sensor From Template
Use Regex.Split:
string[] split = Regex.Split(str, @"(?<!^)(?=[A-Z])");
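As a small sketch of how that call can produce the requested output, the pieces can be joined back together with spaces. The CamelCaseDemo class and the SpaceOut name are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class CamelCaseDemo
{
    // Splits at every position that is followed by an uppercase letter,
    // except the very start of the string, then joins the pieces with spaces.
    public static string SpaceOut(string input)
    {
        string[] parts = Regex.Split(input, @"(?<!^)(?=[A-Z])");
        return string.Join(" ", parts);
    }
}
```

SpaceOut("DeleteSensorFromTemplate") returns "Delete Sensor From Template". Because the split happens before every uppercase letter, runs of capitals are broken up and digits stay attached to the preceding word: "TodayIUpdated32UPCCodes" becomes "Today I Updated32 U P C Codes".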
Another way with regex:
public static string SplitCamelCase(string input)
{
return System.Text.RegularExpressions.Regex.Replace(input, "([A-Z])", " $1", System.Text.RegularExpressions.RegexOptions.Compiled).Trim();
}
If you do not like RegEx and you really just want to insert the missing spaces, this will do the job too:
public static string InsertSpaceBeforeUpperCase(this string str)
{
var sb = new StringBuilder();
char previousChar = char.MinValue; // Unicode '\0'
foreach (char c in str)
{
if (char.IsUpper(c))
{
// If not the first character and previous character is not a space, insert a space before uppercase
if (sb.Length != 0 && previousChar != ' ')
{
sb.Append(' ');
}
}
sb.Append(c);
previousChar = c;
}
return sb.ToString();
}
I had some fun with this one and came up with a function that splits by case, as well as groups together caps (it assumes title case for whatever follows) and digits.
Examples:
Input -> "TodayIUpdated32UPCCodes"
Output -> "Today I Updated 32 UPC Codes"
Code (please excuse the funky symbols I use)...
public static string[] SplitByCase(this string s) {
var ʀ = new List<string>();
var ᴛ = new StringBuilder();
var previous = SplitByCaseModes.None;
foreach(var ɪ in s) {
SplitByCaseModes mode_ɪ;
if(string.IsNullOrWhiteSpace(ɪ.ToString())) {
mode_ɪ = SplitByCaseModes.WhiteSpace;
} else if("0123456789".Contains(ɪ)) {
mode_ɪ = SplitByCaseModes.Digit;
} else if(ɪ == ɪ.ToString().ToUpper()[0]) {
mode_ɪ = SplitByCaseModes.UpperCase;
} else {
mode_ɪ = SplitByCaseModes.LowerCase;
}
if((previous == SplitByCaseModes.None) || (previous == mode_ɪ)) {
ᴛ.Append(ɪ);
} else if((previous == SplitByCaseModes.UpperCase) && (mode_ɪ == SplitByCaseModes.LowerCase)) {
if(ᴛ.Length > 1) {
ʀ.Add(ᴛ.ToString().Substring(0, ᴛ.Length - 1));
ᴛ.Remove(0, ᴛ.Length - 1);
}
ᴛ.Append(ɪ);
} else {
ʀ.Add(ᴛ.ToString());
ᴛ.Clear();
ᴛ.Append(ɪ);
}
previous = mode_ɪ;
}
if(ᴛ.Length != 0) ʀ.Add(ᴛ.ToString());
return ʀ.ToArray();
}
private enum SplitByCaseModes { None, WhiteSpace, Digit, UpperCase, LowerCase }
Here's another different way if you don't want to be using string builders or RegEx, which are totally acceptable answers. I just want to offer a different solution:
string Split(string input)
{
string result = "";
for (int i = 0; i < input.Length; i++)
{
if (char.IsUpper(input[i]))
{
result += ' ';
}
result += input[i];
}
return result.Trim();
}

parsing of a string containing an array

I'd like to convert string containing recursive array of strings to an array of depth one.
Example:
StringToArray("[a, b, [c, [d, e]], f, [g, h], i]") == ["a", "b", "[c, [d, e]]", "f", "[g, h]", "i"]
Seems quite simple. But I come from a functional background and I'm not that familiar with the .NET Framework standard libraries, so every time (I started from scratch about 3 times) I end up with just plain ugly code. My latest implementation is here. As you see, it's ugly as hell.
So, what's the C# way to do this?
@ojlovecd has a good answer, using Regular Expressions.
However, his answer is overly complicated, so here's my similar, simpler answer.
public string[] StringToArray(string input) {
var pattern = new Regex(@"
\[
(?:
\s*
(?<results>(?:
(?(open) [^\[\]]+ | [^\[\],]+ )
|(?<open>\[)
|(?<-open>\])
)+)
(?(open)(?!))
,?
)*
\]
", RegexOptions.IgnorePatternWhitespace);
// Find the first match:
var result = pattern.Match(input);
if (result.Success) {
// Extract the captured values:
var captures = result.Groups["results"].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
return captures;
}
// Not a match
return null;
}
Using this code, you will see that StringToArray("[a, b, [c, [d, e]], f, [g, h], i]") will return the following array: ["a", "b", "[c, [d, e]]", "f", "[g, h]", "i"].
For more information on the balanced groups that I used for matching balanced braces, take a look at Microsoft's documentation.
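As a stripped-down illustration of the balancing-group mechanism on its own, the sketch below only checks whether the square brackets in a string are balanced. The BalanceDemo class and IsBalanced method are hypothetical names, and the pattern is a simplified one written for this example, not the one from the answer.

```csharp
using System.Text.RegularExpressions;

public static class BalanceDemo
{
    // (?<open>\[) pushes a capture onto the 'open' stack for each '['.
    // (?<-open>\]) pops one capture for each ']' and fails if the stack is empty.
    // (?(open)(?!)) forces a failure at the end if any '[' is still unmatched.
    private static readonly Regex Balanced = new Regex(
        @"^(?:[^\[\]]|(?<open>\[)|(?<-open>\]))*(?(open)(?!))$");

    public static bool IsBalanced(string input)
    {
        return Balanced.IsMatch(input);
    }
}
```

IsBalanced("[a, [b, c]]") is true, while "[a, [b, c]" (a bracket left open) and "a]" (a close with nothing to pop) are both false.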
Update:
As per the comments, if you want to also balance quotes, here's a possible modification. (Note that in a C# verbatim string the " is escaped as "".) I also added descriptions of the pattern to help clarify it:
var pattern = new Regex(@"
\[
(?:
\s*
(?<results>(?: # Capture everything into 'results'
(?(open) # If 'open' Then
[^\[\]]+ # Capture everything but brackets
| # Else (not open):
(?: # Capture either:
[^\[\],'""]+ # Unimportant characters
| # Or
['""][^'""]*?['""] # Anything between quotes
)
) # End If
|(?<open>\[) # Open bracket
|(?<-open>\]) # Close bracket
)+)
(?(open)(?!)) # Fail while there's an unbalanced 'open'
,?
)*
\]
", RegexOptions.IgnorePatternWhitespace);
With Regex, you can solve your problem:
static string[] StringToArray(string str)
{
Regex reg = new Regex(@"^\[(.*)\]$");
Match match = reg.Match(str);
if (!match.Success)
return null;
str = match.Groups[1].Value;
List<string> list = new List<string>();
reg = new Regex(@"\[[^\[\]]*(((?'Open'\[)[^\[\]]*)+((?'-Open'\])[^\[\]]*)+)*(?(Open)(?!))\]");
Dictionary<string, string> dic = new Dictionary<string, string>();
int index = 0;
str = reg.Replace(str, m =>
{
string temp = "ojlovecd" + (index++).ToString();
dic.Add(temp, m.Value);
return temp;
});
string[] result = str.Split(',');
for (int i = 0; i < result.Length; i++)
{
string s = result[i].Trim();
if (dic.ContainsKey(s))
result[i] = dic[s].Trim();
else
result[i] = s;
}
return result;
}
Honestly, I would just write this method in an F# assembly, as it's probably much easier. If you look at the JavaScriptSerializer implementation in C# (with a decompiler like dotPeek or Reflector) you can see how messy the array parsing code is for a similar array in JSON. Granted, this has to handle a much more varied array of tokens, but you get the idea.
Here is their DeserializeList implementation. It is uglier than it would normally be, as it's dotPeek's decompiled version and not the original. The DeserializeInternal call would recurse down to the child list.
private IList DeserializeList(int depth)
{
IList list = (IList) new ArrayList();
char? nullable1 = this._s.MoveNext();
if (((int) nullable1.GetValueOrDefault() != 91 ? 1 : (!nullable1.HasValue ? 1 : 0)) != 0)
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayStart));
bool flag = false;
char? nextNonEmptyChar;
char? nullable2;
do
{
char? nullable3 = nextNonEmptyChar = this._s.GetNextNonEmptyChar();
if ((nullable3.HasValue ? new int?((int) nullable3.GetValueOrDefault()) : new int?()).HasValue)
{
char? nullable4 = nextNonEmptyChar;
if (((int) nullable4.GetValueOrDefault() != 93 ? 1 : (!nullable4.HasValue ? 1 : 0)) != 0)
{
this._s.MovePrev();
object obj = this.DeserializeInternal(depth);
list.Add(obj);
flag = false;
nextNonEmptyChar = this._s.GetNextNonEmptyChar();
char? nullable5 = nextNonEmptyChar;
if (((int) nullable5.GetValueOrDefault() != 93 ? 0 : (nullable5.HasValue ? 1 : 0)) == 0)
{
flag = true;
nullable2 = nextNonEmptyChar;
}
else
goto label_8;
}
else
goto label_8;
}
else
goto label_8;
}
while (((int) nullable2.GetValueOrDefault() != 44 ? 1 : (!nullable2.HasValue ? 1 : 0)) == 0);
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayExpectComma));
label_8:
if (flag)
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayExtraComma));
char? nullable6 = nextNonEmptyChar;
if (((int) nullable6.GetValueOrDefault() != 93 ? 1 : (!nullable6.HasValue ? 1 : 0)) != 0)
throw new ArgumentException(this._s.GetDebugString(AtlasWeb.JSON_InvalidArrayEnd));
else
return list;
}
Recursive parsing is just not handled as well in C# as it is in F#.
There is no real "standard" way of doing this. Note that the implementation can get pretty messy if you want to consider all possibilities. I would recommend something recursive like:
private static IEnumerable<object> StringToArray2(string input)
{
var characters = input.GetEnumerator();
return InternalStringToArray2(characters);
}
private static IEnumerable<object> InternalStringToArray2(IEnumerator<char> characters)
{
StringBuilder valueBuilder = new StringBuilder();
while (characters.MoveNext())
{
char current = characters.Current;
switch (current)
{
case '[':
yield return InternalStringToArray2(characters);
break;
case ']':
yield return valueBuilder.ToString();
valueBuilder.Clear();
yield break;
case ',':
yield return valueBuilder.ToString();
valueBuilder.Clear();
break;
default:
valueBuilder.Append(current);
break;
}
}
}
Although you're not restricted to recursion and can always fall back to a single method like
private static IEnumerable<object> StringToArray1(string input)
{
Stack<List<object>> levelEntries = new Stack<List<object>>();
List<object> current = null;
StringBuilder currentLineBuilder = new StringBuilder();
foreach (char nextChar in input)
{
switch (nextChar)
{
case '[':
levelEntries.Push(current);
current = new List<object>();
break;
case ']':
current.Add(currentLineBuilder.ToString());
currentLineBuilder.Clear();
var last = current;
if (levelEntries.Peek() != null)
{
current = levelEntries.Pop();
current.Add(last);
}
break;
case ',':
current.Add(currentLineBuilder.ToString());
currentLineBuilder.Clear();
break;
default:
currentLineBuilder.Append(nextChar);
break;
}
}
return current;
}
Whatever smells good to you
using System;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.VisualBasic.FileIO; //Microsoft.VisualBasic.dll
using System.IO;
public class Sample {
static void Main(){
string data = "[a, b, [c, [d, e]], f, [g, h], i]";
string[] fields = StringToArray(data);
//check print
foreach(var item in fields){
Console.WriteLine("\"{0}\"",item);
}
}
static string[] StringToArray(string data){
string[] fields = null;
Regex innerPat = new Regex(@"\[\s*(.+)\s*\]");
string innerStr = innerPat.Matches(data)[0].Groups[1].Value;
StringBuilder wk = new StringBuilder();
var balance = 0;
for(var i = 0;i<innerStr.Length;++i){
char ch = innerStr[i];
switch(ch){
case '[':
if(balance == 0){
wk.Append('"');
}
wk.Append(ch);
++balance;
continue;
case ']':
wk.Append(ch);
--balance;
if(balance == 0){
wk.Append('"');
}
continue;
default:
wk.Append(ch);
break;
}
}
var reader = new StringReader(wk.ToString());
using(var csvReader = new TextFieldParser(reader)){
csvReader.SetDelimiters(new string[] {","});
csvReader.HasFieldsEnclosedInQuotes = true;
fields = csvReader.ReadFields();
}
return fields;
}
}
