How can I split strings by two commas ,,? - c#

void lvnf_SelectedIndexChanged(object sender, EventArgs e)
{
results = new List<int>();
richTextBox1.Text = File.ReadAllText(listViewCostumControl1.lvnf.Items[listViewCostumControl1.lvnf.SelectedIndices[0]].Text);
FileInfo fi = new FileInfo(listViewCostumControl1.lvnf.Items[listViewCostumControl1.lvnf.SelectedIndices[0]].Text);
lblfilesizeselected.Text = ExtensionMethods.ToFileSize(fi.Length);
lblfilesizeselected.Visible = true;
filePath = Path.GetDirectoryName(fi.FullName);
string words = textBox1.Text;
string[] splittedwords = words.Split(new string[] { ",," }, StringSplitOptions.None);
foreach (string myword in splittedwords)
{
HighlightPhrase(richTextBox1, myword, Color.Yellow);
lblviewerselectedfile.Text = results.Count.ToString();
lblviewerselectedfile.Visible = true;
if (results.Count > 0)
{
numericUpDown1.Maximum = results.Count;
numericUpDown1.Enabled = true;
richTextBox1.SelectionStart = results[(int)numericUpDown1.Value - 1];
richTextBox1.ScrollToCaret();
}
}
}
This is the line that does the split:
string[] splittedwords = words.Split(new string[] { ",," }, StringSplitOptions.None);
The problem is that if I type, for example, sadsdss,,s,,form1,,,,,,,,f,,dd,,,,,, into textBox1,
then every place that has more than two commas in a row produces an empty string, which is then passed to the method that highlights the words:
void HighlightPhrase(RichTextBox box, string phrase, Color color)
{
int pos = box.SelectionStart;
string s = box.Text;
for (int ix = 0; ;)
{
int jx = s.IndexOf(phrase, ix, StringComparison.CurrentCultureIgnoreCase);
if (jx < 0)
{
break;
}
else
{
box.SelectionStart = jx;
box.SelectionLength = phrase.Length;
box.SelectionColor = color;
ix = jx + 1;
results.Add(jx);
}
}
box.SelectionStart = pos;
box.SelectionLength = 0;
}
The exception is thrown on this line:
int jx = s.IndexOf(phrase, ix, StringComparison.CurrentCultureIgnoreCase);
System.ArgumentOutOfRangeException: 'Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: startIndex'
because the phrase is an empty string "".
What I want is for every place with more than two commas, like ,,, to be counted as a word of its own, even if the user types s,,1,,form1,,,,,,
So the words s, 1, form1 and ,,,,,, should all be counted as results and highlighted.

If you want to remove empty entries, just do it with the help of the StringSplitOptions.RemoveEmptyEntries option:
string[] splittedwords = words.Split(
new string[] { ",," },
StringSplitOptions.RemoveEmptyEntries);
Another possibility is to filter with the help of Linq, which can be helpful if you want to exclude (filter out) some words, e.g.
using System.Linq;
...
string[] splittedwords = words
.Split(new string[] { ",," }, StringSplitOptions.None)
.Where(item => !string.IsNullOrWhiteSpace(item))
.ToArray();
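Filtering the split result avoids the exception, but the search loop itself can also be made safe against an empty phrase. Below is a minimal sketch of that guard, separated from the RichTextBox so it can be run on its own. The names PhraseSearch and FindAll are made up for illustration, and it uses OrdinalIgnoreCase rather than the CurrentCultureIgnoreCase from the question, which is an assumption made to keep the behaviour culture-independent.

```csharp
using System;
using System.Collections.Generic;

public static class PhraseSearch
{
    // Hypothetical helper: returns the start index of every
    // case-insensitive occurrence of phrase inside text.
    public static List<int> FindAll(string text, string phrase)
    {
        var positions = new List<int>();

        // An empty phrase "matches" at every index, which is what eventually
        // pushes the start index past the end of the text. Bail out early.
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(phrase))
            return positions;

        int start = 0;
        while (start < text.Length)
        {
            int found = text.IndexOf(phrase, start, StringComparison.OrdinalIgnoreCase);
            if (found < 0)
                break;

            positions.Add(found);
            start = found + 1; // step by one so overlapping matches are found too
        }
        return positions;
    }
}
```

HighlightPhrase could call such a helper and colour each returned position, so an empty entry from the split simply yields no results.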

Related

C# - Split at 20 characters but not if it is in between a word

I want to split a string into smaller parts, not exceeding a string length of 20 characters.
The current code is able to split an input string into an array of strings of length 20. However, this could cut a word.
The current code is:
string[] Array;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < input.Length; i++)
{
if (i % 20 == 0 && i != 0) {
sb.Append('~');
}
sb.Append(input[i]);
}
Array = sb.ToString().Split('~');
For an input of this: Hello. This is a string. Goodbye., the output would be ['Hello. This is a str', 'ing. Goodbye.'].
However, I don’t want the string to be cut if it’s a word. That word should move to the next string in the array. How can I get the following output instead?
['Hello. This is a', 'string. Goodbye.']
First split your sentence on word-boundary:
var words = myString.Split();
Now concatenate words as long as not more than 20 characters are within your current line:
var lines = new List<string> { words[0] };
var lineNum = 0;
for(int i = 1; i < words.Length; i++)
{
if(lines[lineNum].Length + words[i].Length + 1 <= 20)
lines[lineNum] += " " + words[i];
else
{
lines.Add(words[i]);
lineNum++;
}
}
Here is a fiddle for testing: https://dotnetfiddle.net/s0LrFC
Could be more elegant but this will split the string to lines of a maximum number of characters. The words will be kept together unless they exceed the given length.
public static string[] SplitString(string input, int lineLen)
{
StringBuilder sb = new StringBuilder();
string[] words = input.Split(' ');
string line = string.Empty;
string sp = string.Empty;
foreach (string w in words)
{
string word = w;
while (word != string.Empty)
{
if (line == string.Empty)
{
while (word.Length >= lineLen)
{
sb.Append(word.Substring(0, lineLen) + "~");
word = word.Substring(lineLen);
}
if (word != string.Empty)
line = word;
word = string.Empty;
sp = " ";
}
// Count the separating space as well, otherwise a line can end up one character too long.
else if (line.Length + sp.Length + word.Length <= lineLen)
{
line += sp + word;
sp = " ";
word = string.Empty;
}
else
{
sb.Append(line + "~");
line = string.Empty;
sp = string.Empty;
}
}
}
if (line != string.Empty)
sb.Append(line);
return sb.ToString().Split('~');
}
To test:
string[] lines = SplitString("This is a test of the string splitter KGKGKJGKGHKJHJKJKHGJHGhghsjagsjasgajsgjahs yes!", 20);
foreach (string line in lines)
{
Console.WriteLine(line);
}
Output:
This is a test of
the string splitter
KGKGKJGKGHKJHJKJKHGJ
HGhghsjagsjasgajsgja
hs yes!
I believe it's faster to split only at the places where it needs to be split, instead of at every word. It can also be used with multiline text via lines.SelectMany(x => Split(x, 80)):
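private static IEnumerable<string> Split(string text, int maxLength)
{
var i = 0;
while (i + maxLength < text.Length)
{
// Look backwards for the last space that still fits into this part.
var partIndex = text.LastIndexOf(' ', i + maxLength, maxLength);
// Skip the space we break at; with no space there is nothing to skip.
var next = partIndex + 1;
if (partIndex == -1)
partIndex = next = i + maxLength;
yield return text[i..partIndex];
i = next;
}
yield return text[i..];
}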

Why isn't my bubble sort sorting my array correctly?

So I have this bubble sort, first time trying to create one and this is what I have.
For some reason it's printing out the array in a weird way.
It should sort it by letters as far as I know.
How do I properly do a bubble sort without using LINQ or Array.Sort(); This is for school so I need to do the bubble sort algorithm.
Here is an image of what it prints out.
class Program
{
static string[] animals = new string[] { "cat", "elephant", "tiger", "fish", "dolphin", "giraffe", "hippo", "lion", "rat", "string ray" };
static void Main(string[] args)
{
BubbleSort();
Console.ReadLine();
}
private static void BubbleSort()
{
bool swap;
string temp;
string[] animals = new string[] { "cat", "elephant", "tiger", "fish", "dolphin", "giraffe", "hippo", "lion", "rat", "string ray" };
for (int index = 0; index < (animals.Length - 1); index++)
{
if (string.Compare(animals[index], animals[index + 1], true) < 0) //if first number is greater then second then swap
{
//swap
temp = animals[index];
animals[index] = animals[index + 1];
animals[index + 1] = temp;
swap = true;
}
}
foreach (string item in animals)
{
Console.WriteLine(item);
}
}
}
For bubble sort you need two nested loops, since you pass over the array not once but multiple times.
private static void BubbleSort()
{
string temp;
string[] animals = new string[] { "cat", "elephant", "tiger", "fish", "dolphin", "giraffe", "hippo", "lion", "rat", "string ray" };
for (int i = 1; i < animals.Length; i++)
{
for (int j = 0; j < animals.Length - i; j++)
{
if (string.Compare(animals[j], animals[j + 1], StringComparison.Ordinal) <= 0) continue;
temp = animals[j];
animals[j] = animals[j + 1];
animals[j + 1] = temp;
}
}
foreach (string item in animals)
{
Console.WriteLine(item);
}
}
PS: Next time, use the search a bit longer, the code above is almost 100% taken from http://stackoverflow.com/questions/38624840/bubble-sort-string-array-c-sharp.
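The question declares a swap flag but never uses it. As a rough sketch of what that flag is usually for, the variant below stops as soon as a full pass makes no swaps. The class name Sorting and the choice to take the array as a parameter are assumptions for illustration and are not part of the original answer.

```csharp
using System;

public static class Sorting
{
    // Bubble sort with an early exit: if a whole pass makes no swap,
    // the array is already sorted and the remaining passes are skipped.
    public static void BubbleSort(string[] items)
    {
        for (int pass = 1; pass < items.Length; pass++)
        {
            bool swapped = false;

            // After each pass the largest remaining item has bubbled to the end,
            // so the inner loop can stop one position earlier every time.
            for (int j = 0; j < items.Length - pass; j++)
            {
                if (string.Compare(items[j], items[j + 1], StringComparison.Ordinal) > 0)
                {
                    string temp = items[j];
                    items[j] = items[j + 1];
                    items[j + 1] = temp;
                    swapped = true;
                }
            }

            if (!swapped)
                break;
        }
    }
}
```

Calling Sorting.BubbleSort(animals) sorts the array in place; printing it afterwards works the same way as in the answer above.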

Find the longest substring without any number and at least one upper case character C#

I have to find the longest substring without any number and with at least one upper case character, using C#. If the string is "sdcF01h" then the output should be "sdcF".
My approach.
String testString = "sdcF01F";
//var splitString = testString.Split("[0-9]");
int startIndex = 0;
int longestStartIndex = 0;
int endIndex = 0;
int index = 0;
int longestLength = int.MinValue;
bool foundUpperCase = false;
while (index <= testString.Length)
{
if (index == testString.Length || char.IsDigit(testString[index]))
{
if (foundUpperCase && index > startIndex && index - startIndex > longestLength)
{
longestLength = index - startIndex;
endIndex = index;
longestStartIndex = startIndex;
}
startIndex = index + 1;
foundUpperCase = false;
}
else if (char.IsUpper(testString[index]))
{
foundUpperCase = true;
}
index++;
}
endIndex--;
var res1 = testString.Substring(longestStartIndex, endIndex);
Console.WriteLine(res1);
But this is not the most optimal solution.
There is a problem with the example in your question:
if the string is "sdch01F" then o/p should be "sdcF" My approach.
The correct result for that input would be "F".
I suppose you mean that for "sdcF01F" the result is "sdcF" (as in your code example).
Anyway, this is my solution:
private string GetLongestSubstring(string testString)
{
var longestSubstring = string.Empty;
if (string.IsNullOrEmpty(testString))
{
return longestSubstring;
}
var rg = new Regex("[A-Z]");
var currentSubstring = string.Empty;
for (int i = 0; i < testString.Length; i++)
{
var currentChar = testString[i];
var isValidChar = !char.IsDigit(currentChar);
if (!isValidChar)
{
var newSubstring = currentSubstring;
currentSubstring = string.Empty;
var matches = rg.Match(newSubstring);
var iscurrentSubstringContainsAtLeastOneCapitalLetter = matches.Success;
if (iscurrentSubstringContainsAtLeastOneCapitalLetter)
{
if (longestSubstring.Length < newSubstring.Length)
{
longestSubstring = newSubstring;
}
}
continue;
}
currentSubstring += currentChar.ToString();
}
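// The last run is not followed by a digit, so it is checked here; it also needs a capital letter.
if (currentSubstring.Length > longestSubstring.Length && rg.IsMatch(currentSubstring))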
{
longestSubstring = currentSubstring;
}
return longestSubstring;
}
Note that the function assumes there are no spaces (" ") in the string.
Just split, sort on length, and test for upper
public static string GetSubset(string input = "LKAH8slfsfjlllj9lkjlkjasdf;lk7ljasdflkasdjsfdljk")
{
if (string.IsNullOrEmpty(input))
return string.Empty;
foreach(String s in input.Split(new Char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' }).OrderByDescending(x => x.Length))
{
foreach (char c in s)
if (char.IsUpper(c))
return s;
}
return string.Empty;
}
var abc = Regex.Matches("co12dEname123abP", @"[a-zA-Z]+|\d+")
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
List<string> lst = new List<string>();
for (int i = 0; i < abc.Length; i++)
{
if (abc[i].Any(char.IsDigit))
continue;
if (abc[i].Any(c => char.IsUpper(c)))
lst.Add(abc[i]);
}
var finalOutput =lst.OrderByDescending(x => x.Length).FirstOrDefault();
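Since the question asks for something more optimal, here is a sketch of a single pass that only tracks indices and allocates one string at the end. The names SubstringFinder and LongestWithUpper are hypothetical. Like the answers above, it treats digits as the only separators.

```csharp
using System;

public static class SubstringFinder
{
    // Returns the longest digit-free run that contains at least one
    // upper case character, or an empty string if there is none.
    public static string LongestWithUpper(string s)
    {
        if (string.IsNullOrEmpty(s))
            return string.Empty;

        int bestStart = 0, bestLength = 0;
        int runStart = 0;
        bool runHasUpper = false;

        // Going one step past the end lets the final run be closed
        // by the same code path that a digit triggers.
        for (int i = 0; i <= s.Length; i++)
        {
            if (i == s.Length || char.IsDigit(s[i]))
            {
                int runLength = i - runStart;
                if (runHasUpper && runLength > bestLength)
                {
                    bestStart = runStart;
                    bestLength = runLength;
                }
                runStart = i + 1;
                runHasUpper = false;
            }
            else if (char.IsUpper(s[i]))
            {
                runHasUpper = true;
            }
        }

        return s.Substring(bestStart, bestLength);
    }
}
```

When two runs have the same length, this sketch keeps the first one.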

Sorting a List alphabetically and numerically

I've written code that has a list, and I want to sort the list alphabetically and numerically.
For example the first items in the list are
list[0] = "INPUT 10"
list[1] = "INPUT 5".
I want my list reorganized like this:
list[0] = "INPUT 5"
list[1] = "INPUT 10".
So basically my program gets the checked items from a checked list box and stores them in a list, and I want it to reorganize the list alphabetically.
The checked list box has items like INPUT 1, INPUT 2, INPUT 3... and so forth. Can anyone suggest a way to go about this?
UPDATED CODE
I've updated my code, and it now splits each string into INPUT and the number, e.g. 10. The "q" list holds the checked items from the input box, and the string array "s" gets the split data from the q list. The "numbers" list then gets only the number part of each string; for example, for "INPUT 5" the numbers list only gets "5". Then I want to sort these numbers, build new strings by combining each sorted number with the string "INPUT", and add them to the output checked list box. My code isn't working, though: it should sort the numbers but it doesn't. Does anyone have any suggestions as to why? I also keep getting error messages about the array not being able to handle negative integers and so on.
List<string> q = new List<string>();
List<string> numbers = new List<string>();
private void button_ekle_Click(object sender, EventArgs e)
{
for (int k = clb_input.Items.Count - 1; k >= 0; k--)
{
if (clb_input.GetItemChecked(k) == true)
{
q.Add(clb_input.Items[k].ToString());
//clb_output.Items.Add(clb_input.Items[k]);
clb_input.Items.RemoveAt(k);
}
else { }
}
string[] s = new string[q.Count * 2];
//string[] numbers=new string[q.Count/2];
for (int t = 1; t <= q.Count * 2; t++)
{
if (q != null)
s = q[t - 1].ToString().Split(' ');
else { s[t] = null; }
}
for (int x = 1; x <= q.Count; x++)
{
if (s[2 * x - 1] != null)
{
numbers[x - 1] = s[2 * x - 1];
numbers.Sort();
clb_output.Items.Add("INPUT "+ numbers[x - 1].ToString());
}
else { numbers[x - 1] = null; }
}
}
What you need is Alphanumeric Sorting ( most commonly seen in windows explorer, the way files are sorted)
Code can be found here : http://www.dotnetperls.com/alphanumeric-sorting
Sample
class Program
{
static void Main()
{
string[] highways = new string[]
{
"100F",
"50F",
"SR100",
"SR9"
};
//
// We want to sort a string array called highways in an
// alphanumeric way. Call the static Array.Sort method.
//
Array.Sort(highways, new AlphanumComparatorFast());
//
// Display the results
//
foreach (string h in highways)
{
Console.WriteLine(h);
}
}
}
Output
50F
100F
SR9
SR100
Implementation
public class AlphanumComparatorFast : IComparer
{
public int Compare(object x, object y)
{
string s1 = x as string;
if (s1 == null)
{
return 0;
}
string s2 = y as string;
if (s2 == null)
{
return 0;
}
int len1 = s1.Length;
int len2 = s2.Length;
int marker1 = 0;
int marker2 = 0;
// Walk through two the strings with two markers.
while (marker1 < len1 && marker2 < len2)
{
char ch1 = s1[marker1];
char ch2 = s2[marker2];
// Some buffers we can build up characters in for each chunk.
char[] space1 = new char[len1];
int loc1 = 0;
char[] space2 = new char[len2];
int loc2 = 0;
// Walk through all following characters that are digits or
// characters in BOTH strings starting at the appropriate marker.
// Collect char arrays.
do
{
space1[loc1++] = ch1;
marker1++;
if (marker1 < len1)
{
ch1 = s1[marker1];
}
else
{
break;
}
} while (char.IsDigit(ch1) == char.IsDigit(space1[0]));
do
{
space2[loc2++] = ch2;
marker2++;
if (marker2 < len2)
{
ch2 = s2[marker2];
}
else
{
break;
}
} while (char.IsDigit(ch2) == char.IsDigit(space2[0]));
// If we have collected numbers, compare them numerically.
// Otherwise, if we have strings, compare them alphabetically.
string str1 = new string(space1);
string str2 = new string(space2);
int result;
if (char.IsDigit(space1[0]) && char.IsDigit(space2[0]))
{
int thisNumericChunk = int.Parse(str1);
int thatNumericChunk = int.Parse(str2);
result = thisNumericChunk.CompareTo(thatNumericChunk);
}
else
{
result = str1.CompareTo(str2);
}
if (result != 0)
{
return result;
}
}
return len1 - len2;
}
}
The simplest solution is to just left-pad the numerical values with a space to the same length.
List<string> lst = new List<string>
{
"Item   9",
"Item 999",
"Thing 999",
"Thing   5",
"Thing   1",
"Item  20",
"Item  10",
};
lst.Sort();
Output:
Item   9
Item  10
Item  20
Item 999
Thing   1
Thing   5
Thing 999
And you can always remove the extra white space used for padding after the sorting operation is performed.
You can use Sort with a Comparer like this:
List<string> q = new List<string>();
private void button_ekle_Click(object sender, EventArgs e)
{
for (int k=clb_input.Items.Count-1; k >= 0; k--)
{
if (clb_input.GetItemChecked(k) == true)
{
q.Add(clb_input.Items[k].ToString());
clb_input.Items.RemoveAt(k);
}
else { }
}
// Compare by the number after the space, parsed as an integer.
q.Sort((p1, p2) => int.Parse(p1.Split(' ')[1]).CompareTo(int.Parse(p2.Split(' ')[1])));
for (int t = 0; t < q.Count; t++)
{
clb_output.Items.Add(q[t].ToString());
}
}
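If the list can contain more than one prefix (not only INPUT), the comparison can look at the text first and the number second. The following is only a sketch under that assumption: LabelSorter and its methods are made-up names, and treating an item without a parsable number as number 0 is an arbitrary choice made here.

```csharp
using System;
using System.Collections.Generic;

public static class LabelSorter
{
    // Sorts items such as "INPUT 10" by their text prefix first
    // and by their trailing number second.
    public static void SortByPrefixThenNumber(List<string> items)
    {
        items.Sort((a, b) =>
        {
            SplitLabel(a, out string prefixA, out int numberA);
            SplitLabel(b, out string prefixB, out int numberB);

            int byPrefix = string.Compare(prefixA, prefixB, StringComparison.Ordinal);
            return byPrefix != 0 ? byPrefix : numberA.CompareTo(numberB);
        });
    }

    // Splits "INPUT 10" into "INPUT" and 10, using the last space as the boundary.
    private static void SplitLabel(string item, out string prefix, out int number)
    {
        int space = item.LastIndexOf(' ');
        prefix = space < 0 ? item : item.Substring(0, space);

        // Assumption: items without a parsable number sort as if numbered 0.
        if (space < 0 || !int.TryParse(item.Substring(space + 1), out number))
            number = 0;
    }
}
```

In the click handler above, LabelSorter.SortByPrefixThenNumber(q) would take the place of the q.Sort(...) line.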

Splitting Comma Separated Values (CSV)

How do I split a CSV file in C#? And how do I display the result?
I've been using the TextFieldParser Class in the Microsoft.VisualBasic.FileIO namespace for a C# project I'm working on. It will handle complications such as embedded commas or fields that are enclosed in quotes etc. It returns a string[] and, in addition to CSV files, can also be used for parsing just about any type of structured text file.
Display where? About splitting, the best way is to use a good library to that effect.
This library is pretty good, I can recommend it heartily.
The problem with naïve methods is that they usually fail. There are tons of considerations, even before thinking about performance:
What if the text contains commas
Support for the many existing formats (separated by semicolon, or text surrounded by quotes, or single quotes, etc.)
and many others
Import Microsoft.VisualBasic as a reference (I know, it's not that bad) and use Microsoft.VisualBasic.FileIO.TextFieldParser. It handles CSV files very well and can be used in any .NET language.
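For reference, a minimal sketch of how TextFieldParser is typically used. The wrapper names CsvDemo and ParseCsv are made up for this example. It reads from a TextReader so it can be tried with an in-memory string, and it assumes a project that references the Microsoft.VisualBasic assembly.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvDemo
{
    // Reads every record from the reader and returns one string[] per line.
    public static List<string[]> ParseCsv(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Lets quoted fields contain the delimiter itself.
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

To read an actual file, pass something like new StreamReader(path) instead of a StringReader.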
Read the file one line at a time, then ...
foreach (string field in line.Split(new char[] { ',' }))
Console.WriteLine(field);
This is a CSV parser I use on occasion.
Usage: (dgvMyView is a DataGridView.)
CSVReader reader = new CSVReader(@"C:\MyFile.txt");
reader.DisplayResults(dgvMyView);
Class:
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;
public class CSVReader
{
private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
private string[] FieldNames;
private List<string[]> Records;
private int ReadIndex;
public CSVReader(string File)
{
Records = new List<string[]>();
string[] Record = null;
StreamReader Reader = new StreamReader(File);
int Index = 0;
bool BlankRecord = true;
FieldNames = GetEscapedSVs(Reader.ReadLine());
while (!Reader.EndOfStream)
{
Record = GetEscapedSVs(Reader.ReadLine());
BlankRecord = true;
for (Index = 0; Index <= Record.Length - 1; Index++)
{
if (!string.IsNullOrEmpty(Record[Index])) BlankRecord = false;
}
if (!BlankRecord) Records.Add(Record);
}
ReadIndex = -1;
Reader.Close();
}
private string[] GetEscapedSVs(string Data)
{
return GetEscapedSVs(Data, ",", "\"");
}
private string[] GetEscapedSVs(string Data, string Separator, string Escape)
{
string[] Result = null;
int Index = 0;
int PriorMatchIndex = 0;
MatchCollection Matches = Regex.Matches(Data, string.Format(ESCAPE_SPLIT_REGEX, Separator, Escape));
// One match per separator, so there is one more field than there are matches.
Result = new string[Matches.Count + 1];
for (Index = 0; Index <= Result.Length - 2; Index++)
{
Result[Index] = Data.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
}
Result[Result.Length - 1] = Data.Substring(PriorMatchIndex);
for (Index = 0; Index <= Result.Length - 1; Index++)
{
if (Regex.IsMatch(Result[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) Result[Index] = Result[Index].Substring(1, Result[Index].Length - 2);
Result[Index] = Result[Index].Replace(Escape + Escape, Escape);
if (Result[Index] == null) Result[Index] = "";
}
return Result;
}
public int FieldCount
{
get { return FieldNames.Length; }
}
public string GetString(int Index)
{
return Records[ReadIndex][Index];
}
public string GetName(int Index)
{
return FieldNames[Index];
}
public bool Read()
{
ReadIndex = ReadIndex + 1;
return ReadIndex < Records.Count;
}
public void DisplayResults(DataGridView DataView)
{
DataGridViewColumn col = default(DataGridViewColumn);
DataGridViewRow row = default(DataGridViewRow);
DataGridViewCell cell = default(DataGridViewCell);
DataGridViewColumnHeaderCell header = default(DataGridViewColumnHeaderCell);
int Index = 0;
ReadIndex = -1;
DataView.Rows.Clear();
DataView.Columns.Clear();
for (Index = 0; Index <= FieldCount - 1; Index++)
{
col = new DataGridViewColumn();
col.CellTemplate = new DataGridViewTextBoxCell();
header = new DataGridViewColumnHeaderCell();
header.Value = GetName(Index);
col.HeaderCell = header;
DataView.Columns.Add(col);
}
while (Read())
{
row = new DataGridViewRow();
for (Index = 0; Index <= FieldCount - 1; Index++)
{
cell = new DataGridViewTextBoxCell();
cell.Value = GetString(Index).ToString();
row.Cells.Add(cell);
}
DataView.Rows.Add(row);
}
}
}
I got the result I was after. It is quite simple: I read the file using IO.File so that all the text is stored in a string, and after that I split it with a separator. The code is shown below.
using System;
using System.Collections.Generic;
using System.Text;
namespace CSV
{
class Program
{
static void Main(string[] args)
{
string csv = "user1, user2, user3,user4,user5";
string[] split = csv.Split(new char[] {',',' '});
foreach(string s in split)
{
if (s.Trim() != "")
Console.WriteLine(s);
}
Console.ReadLine();
}
}
}
The following function takes a line from a CSV file and splits it into a List<string>.
Arguments:
string line = the line to split
string textQualifier = the text qualifier, if any (e.g. "" or "\"" or "'")
char delim = the field delimiter (e.g. ',' or ';' or '|' or '\t')
int colCount = the expected number of fields (0 means don't check)
Example usage:
List<string> fields = SplitLine(line, "\"", ',', 5);
// or
List<string> fields = SplitLine(line, "'", '|', 10);
// or
List<string> fields = SplitLine(line, "", '\t', 0);
Function:
private List<string> SplitLine(string line, string textQualifier, char delim, int colCount)
{
List<string> fields = new List<string>();
string origLine = line;
char textQual = '"';
bool hasTextQual = false;
if (!String.IsNullOrEmpty(textQualifier))
{
hasTextQual = true;
textQual = textQualifier[0];
}
if (hasTextQual)
{
while (!String.IsNullOrEmpty(line))
{
if (line[0] == textQual) // field is text qualified so look for next unqualified delimiter
{
int fieldLen = 1;
while (true)
{
if (line.Length == 2) // must be final field (zero length)
{
fieldLen = 2;
break;
}
else if (fieldLen + 1 >= line.Length) // must be final field
{
fieldLen += 1;
break;
}
else if (line[fieldLen] == textQual && line[fieldLen + 1] == textQual) // escaped text qualifier
{
fieldLen += 2;
}
else if (line[fieldLen] == textQual && line[fieldLen + 1] == delim) // must be end of field
{
fieldLen += 1;
break;
}
else // not a delimiter
{
fieldLen += 1;
}
}
string escapedQual = textQual.ToString() + textQual.ToString();
fields.Add(line.Substring(1, fieldLen - 2).Replace(escapedQual, textQual.ToString())); // replace escaped qualifiers
if (line.Length >= fieldLen + 1)
{
line = line.Substring(fieldLen + 1);
if (line == "") // blank final field
{
fields.Add("");
}
}
else
{
line = "";
}
}
else // field is not text qualified
{
int fieldLen = line.IndexOf(delim);
if (fieldLen != -1) // check next delimiter position
{
fields.Add(line.Substring(0, fieldLen));
line = line.Substring(fieldLen + 1);
if (line == "") // final field must be blank
{
fields.Add("");
}
}
else // must be last field
{
fields.Add(line);
line = "";
}
}
}
}
else // if there is no text qualifier, then use existing split function
{
fields.AddRange(line.Split(delim));
}
if (colCount > 0 && colCount != fields.Count) // count doesn't match expected so throw exception
{
throw new Exception("Field count was:" + fields.Count.ToString() + ", expected:" + colCount.ToString() + ". Line:" + origLine);
}
return fields;
}
Problem: Convert a comma separated string into an array where commas in "quoted strings,,," should not be considered as separators but as part of an entry
Input:
String: First,"Second","Even,With,Commas",,Normal,"Sentence,with ""different"" problems",3,4,5
Output:
String-Array: ['First','Second','Even,With,Commas','','Normal','Sentence,with "different" problems','3','4','5']
Code:
string sLine;
sLine = "First,\"Second\",\"Even,With,Commas\",,Normal,\"Sentence,with \"\"different\"\" problems\",3,4,5";
// 1. Split line by separator; do not split if separator is within quotes
string Separator = ",";
string Escape = '"'.ToString();
MatchCollection Matches = Regex.Matches(sLine,
string.Format("({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*", Separator, Escape));
string[] asColumns = new string[Matches.Count + 1];
int PriorMatchIndex = 0;
for (int Index = 0; Index <= asColumns.Length - 2; Index++)
{
asColumns[Index] = sLine.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
}
asColumns[asColumns.Length - 1] = sLine.Substring(PriorMatchIndex);
// 2. Remove quotes
for (int Index = 0; Index <= asColumns.Length - 1; Index++)
{
if (Regex.IsMatch(asColumns[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) // If "Text" is surrounded by quotes (but ignore double quotes => "Leave ""inside"" quotes")
{
asColumns[Index] = asColumns[Index].Substring(1, asColumns[Index].Length - 2); // "Text" => Text
}
asColumns[Index] = asColumns[Index].Replace(Escape + Escape, Escape); // Remove double quotes ('My ""special"" text' => 'My "special" text')
if (asColumns[Index] == null) asColumns[Index] = "";
}
The output array is asColumns
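The same problem can also be approached without a regex by walking the line once and tracking whether the current position is inside quotes. This is a sketch of that alternative, not the code above. CsvLine and Split are hypothetical names, and it assumes a field is either fully quoted or not quoted at all.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvLine
{
    // Splits one CSV line, keeping separators that appear inside quotes
    // and turning doubled quotes ("") inside a quoted field into a single quote.
    public static List<string> Split(string line, char separator = ',', char quote = '"')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != quote)
                {
                    current.Append(c);
                }
                else if (i + 1 < line.Length && line[i + 1] == quote)
                {
                    current.Append(quote); // escaped quote
                    i++;
                }
                else
                {
                    inQuotes = false; // closing quote
                }
            }
            else if (c == quote)
            {
                inQuotes = true; // opening quote
            }
            else if (c == separator)
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing separator
        return fields;
    }
}
```

For the input line from the problem statement this yields the nine entries listed under Output above.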
