Splitting Comma Separated Values (CSV) - c#

How to split the CSV file in c sharp? And how to display this?

I've been using the TextFieldParser Class in the Microsoft.VisualBasic.FileIO namespace for a C# project I'm working on. It will handle complications such as embedded commas or fields that are enclosed in quotes etc. It returns a string[] and, in addition to CSV files, can also be used for parsing just about any type of structured text file.

Display where? About splitting, the best way is to use a good library to that effect.
This library is pretty good, I can recommend it heartily.
The problems using naïve methods is that the usually fail, there are tons of considerations without even thinking about performance:
What if the text contains commas
Support for the many existing formats (separated by semicolon, or text surrounded by quotes, or single quotes, etc.)
and many others

Import Micorosoft.VisualBasic as a reference (I know, its not that bad) and use Microsoft.VisualBasic.FileIO.TextFieldParser - this handles CSV files very well, and can be used in any .Net language.

read the file one line at a time, then ...
foreach (String line in line.Split(new char[] { ',' }))
Console.WriteLine(line);

This is a CSV parser I use on occasion.
Usage: (dgvMyView is a datagrid type.)
CSVReader reader = new CSVReader("C:\MyFile.txt");
reader.DisplayResults(dgvMyView);
Class:
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;
public class CSVReader
{
private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
private string[] FieldNames;
private List<string[]> Records;
private int ReadIndex;
public CSVReader(string File)
{
Records = new List<string[]>();
string[] Record = null;
StreamReader Reader = new StreamReader(File);
int Index = 0;
bool BlankRecord = true;
FieldNames = GetEscapedSVs(Reader.ReadLine());
while (!Reader.EndOfStream)
{
Record = GetEscapedSVs(Reader.ReadLine());
BlankRecord = true;
for (Index = 0; Index <= Record.Length - 1; Index++)
{
if (!string.IsNullOrEmpty(Record[Index])) BlankRecord = false;
}
if (!BlankRecord) Records.Add(Record);
}
ReadIndex = -1;
Reader.Close();
}
private string[] GetEscapedSVs(string Data)
{
return GetEscapedSVs(Data, ",", "\"");
}
private string[] GetEscapedSVs(string Data, string Separator, string Escape)
{
string[] Result = null;
int Index = 0;
int PriorMatchIndex = 0;
MatchCollection Matches = Regex.Matches(Data, string.Format(ESCAPE_SPLIT_REGEX, Separator, Escape));
Result = new string[Matches.Count];
for (Index = 0; Index <= Result.Length - 2; Index++)
{
Result[Index] = Data.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
}
Result[Result.Length - 1] = Data.Substring(PriorMatchIndex);
for (Index = 0; Index <= Result.Length - 1; Index++)
{
if (Regex.IsMatch(Result[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) Result[Index] = Result[Index].Substring(1, Result[Index].Length - 2);
Result[Index] = Result[Index].Replace(Escape + Escape, Escape);
if (Result[Index] == null) Result[Index] = "";
}
return Result;
}
public int FieldCount
{
get { return FieldNames.Length; }
}
public string GetString(int Index)
{
return Records[ReadIndex][Index];
}
public string GetName(int Index)
{
return FieldNames[Index];
}
public bool Read()
{
ReadIndex = ReadIndex + 1;
return ReadIndex < Records.Count;
}
public void DisplayResults(DataGridView DataView)
{
DataGridViewColumn col = default(DataGridViewColumn);
DataGridViewRow row = default(DataGridViewRow);
DataGridViewCell cell = default(DataGridViewCell);
DataGridViewColumnHeaderCell header = default(DataGridViewColumnHeaderCell);
int Index = 0;
ReadIndex = -1;
DataView.Rows.Clear();
DataView.Columns.Clear();
for (Index = 0; Index <= FieldCount - 1; Index++)
{
col = new DataGridViewColumn();
col.CellTemplate = new DataGridViewTextBoxCell();
header = new DataGridViewColumnHeaderCell();
header.Value = GetName(Index);
col.HeaderCell = header;
DataView.Columns.Add(col);
}
while (Read())
{
row = new DataGridViewRow();
for (Index = 0; Index <= FieldCount - 1; Index++)
{
cell = new DataGridViewTextBoxCell();
cell.Value = GetString(Index).ToString();
row.Cells.Add(cell);
}
DataView.Rows.Add(row);
}
}
}

I had got the result for my query. its like simple like i had read a file using io.file. and all the text are stored into a string. After that i splitted with a seperator. The code is shown below.
using System;
using System.Collections.Generic;
using System.Text;
namespace CSV
{
class Program
{
static void Main(string[] args)
{
string csv = "user1, user2, user3,user4,user5";
string[] split = csv.Split(new char[] {',',' '});
foreach(string s in split)
{
if (s.Trim() != "")
Console.WriteLine(s);
}
Console.ReadLine();
}
}
}

The following function takes a line from a CSV file and splits it into a List<string>.
Arguments:
string line = the line to split
string textQualifier = what (if any) text qualifier (i.e. "" or "\"" or "'")
char delim = the field delimiter (i.e. ',' or ';' or '|' or '\t')
int colCount = the expected number of fields (0 means don't check)
Example usage:
List<string> fields = SplitLine(line, "\"", ',', 5);
// or
List<string> fields = SplitLine(line, "'", '|', 10);
// or
List<string> fields = SplitLine(line, "", '\t', 0);
Function:
private List<string> SplitLine(string line, string textQualifier, char delim, int colCount)
{
List<string> fields = new List<string>();
string origLine = line;
char textQual = '"';
bool hasTextQual = false;
if (!String.IsNullOrEmpty(textQualifier))
{
hasTextQual = true;
textQual = textQualifier[0];
}
if (hasTextQual)
{
while (!String.IsNullOrEmpty(line))
{
if (line[0] == textQual) // field is text qualified so look for next unqualified delimiter
{
int fieldLen = 1;
while (true)
{
if (line.Length == 2) // must be final field (zero length)
{
fieldLen = 2;
break;
}
else if (fieldLen + 1 >= line.Length) // must be final field
{
fieldLen += 1;
break;
}
else if (line[fieldLen] == textQual && line[fieldLen + 1] == textQual) // escaped text qualifier
{
fieldLen += 2;
}
else if (line[fieldLen] == textQual && line[fieldLen + 1] == delim) // must be end of field
{
fieldLen += 1;
break;
}
else // not a delimiter
{
fieldLen += 1;
}
}
string escapedQual = textQual.ToString() + textQual.ToString();
fields.Add(line.Substring(1, fieldLen - 2).Replace(escapedQual, textQual.ToString())); // replace escaped qualifiers
if (line.Length >= fieldLen + 1)
{
line = line.Substring(fieldLen + 1);
if (line == "") // blank final field
{
fields.Add("");
}
}
else
{
line = "";
}
}
else // field is not text qualified
{
int fieldLen = line.IndexOf(delim);
if (fieldLen != -1) // check next delimiter position
{
fields.Add(line.Substring(0, fieldLen));
line = line.Substring(fieldLen + 1);
if (line == "") // final field must be blank
{
fields.Add("");
}
}
else // must be last field
{
fields.Add(line);
line = "";
}
}
}
}
else // if there is no text qualifier, then use existing split function
{
fields.AddRange(line.Split(delim));
}
if (colCount > 0 && colCount != fields.Count) // count doesn't match expected so throw exception
{
throw new Exception("Field count was:" + fields.Count.ToString() + ", expected:" + colCount.ToString() + ". Line:" + origLine);
}
return fields;
}

Problem: Convert a comma separated string into an array where commas in "quoted strings,,," should not be considered as separators but as part of an entry
Input:
String: First,"Second","Even,With,Commas",,Normal,"Sentence,with ""different"" problems",3,4,5
Output:
String-Array: ['First','Second','Even,With,Commas','','Normal','Sentence,with "different" problems','3','4','5']
Code:
string sLine;
sLine = "First,\"Second\",\"Even,With,Commas\",,Normal,\"Sentence,with \"\"different\"\" problems\",3,4,5";
// 1. Split line by separator; do not split if separator is within quotes
string Separator = ",";
string Escape = '"'.ToString();
MatchCollection Matches = Regex.Matches(sLine,
string.Format("({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*", Separator, Escape));
string[] asColumns = new string[Matches.Count + 1];
int PriorMatchIndex = 0;
for (int Index = 0; Index <= asColumns.Length - 2; Index++)
{
asColumns[Index] = sLine.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
}
asColumns[asColumns.Length - 1] = sLine.Substring(PriorMatchIndex);
// 2. Remove quotes
for (int Index = 0; Index <= asColumns.Length - 1; Index++)
{
if (Regex.IsMatch(asColumns[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) // If "Text" is sourrounded by quotes (but ignore double quotes => "Leave ""inside"" quotes")
{
asColumns[Index] = asColumns[Index].Substring(1, asColumns[Index].Length - 2); // "Text" => Text
}
asColumns[Index] = asColumns[Index].Replace(Escape + Escape, Escape); // Remove double quotes ('My ""special"" text' => 'My "special" text')
if (asColumns[Index] == null) asColumns[Index] = "";
}
The output array is asColumns

Related

Take only letters from the string and reverse them

I'm preparing for my interview, faced the problem with the task. The case is that we're having a string:
test12pop90java989python
I need to return new string where words will be reversed and numbers will stay in the same place:
test12pop90java989python ==> tset12pop90avaj989nohtyp
What I started with:
Transferring string to char array
Use for loop + Char.IsNumber
??
var charArray = test.ToCharArray();
for (int i = 0; i < charArray.Length; i++)
{
if (!Char.IsNumber(charArray[i]))
{
....
}
}
but currently I'm stuck and don't know how to proceed, any tips how it can be done?
You can't reverse a run of letters until you've observed the entire run; until then, you need to keep track of the pending letters to be reversed and appended to the final output upon encountering a number or the end of the string. By storing these pending characters in a Stack<> they are naturally returned in the reverse order they were added.
static string Transform(string input)
{
StringBuilder outputBuilder = new StringBuilder(input.Length);
Stack<char> pending = new Stack<char>();
foreach (char c in input)
if (char.IsNumber(c))
{
// In the reverse order of which they were added, consume
// and append pending characters as long as they are available
while (pending.Count > 0)
outputBuilder.Append(pending.Pop());
// Alternatively...
//foreach (char p in pending)
// outputBuilder.Append(p);
//pending.Clear();
outputBuilder.Append(c);
}
else
pending.Push(c);
// Handle pending characters when input does not end with a number
while (pending.Count > 0)
outputBuilder.Append(pending.Pop());
return outputBuilder.ToString();
}
A similar but buffer-free way is to do it is to store the index of the start of the current run of letters, then walk back through and append each character when a number is found...
static string Transform(string input)
{
StringBuilder outputBuilder = new StringBuilder(input.Length);
int lettersStartIndex = -1;
for (int i = 0; i < input.Length; i++)
{
char c = input[i];
if (char.IsNumber(c))
{
if (lettersStartIndex >= 0)
{
// Iterate backwards from the previous character to the start of the run
for (int j = i - 1; j >= lettersStartIndex; j--)
outputBuilder.Append(input[j]);
lettersStartIndex = -1;
}
outputBuilder.Append(c);
}
else if (lettersStartIndex < 0)
lettersStartIndex = i;
}
// Handle remaining characters when input does not end with a number
if (lettersStartIndex >= 0)
for (int j = input.Length - 1; j >= lettersStartIndex; j--)
outputBuilder.Append(input[j]);
return outputBuilder.ToString();
}
For both implementations, calling Transform() with...
string[] inputs = new string[] {
"test12pop90java989python",
"123test12pop90java989python321",
"This text contains no numbers",
"1a2b3c"
};
for (int i = 0; i < inputs.Length; i++)
{
string input = inputs[i];
string output = Transform(input);
Console.WriteLine($" Input[{i}]: \"{input }\"");
Console.WriteLine($"Output[{i}]: \"{output}\"");
Console.WriteLine();
}
...produces this output...
Input[0]: "test12pop90java989python"
Output[0]: "tset12pop90avaj989nohtyp"
Input[1]: "123test12pop90java989python321"
Output[1]: "123tset12pop90avaj989nohtyp321"
Input[2]: "This text contains no numbers"
Output[2]: "srebmun on sniatnoc txet sihT"
Input[3]: "1a2b3c"
Output[3]: "1a2b3c"
A possible solution using Regex and Linq:
using System;
using System.Text.RegularExpressions;
using System.Linq;
public class Program
{
public static void Main()
{
var result = "";
var matchList = Regex.Matches("test12pop90java989python", "([a-zA-Z]*)(\\d*)");
var list = matchList.Cast<Match>().SelectMany(o =>o.Groups.Cast<Capture>().Skip(1).Select(c => c.Value));
foreach (var el in list)
{
if (el.All(char.IsDigit))
{
result += el;
}
else
{
result += new string(el.Reverse().ToArray());
}
}
Console.WriteLine(result);
}
}
I've used code from stackoverflow.com/a/21123574/1037948 to create a list of Regex matches on line 11:
var list = matchList.Cast<Match>().SelectMany(o =>o.Groups.Cast<Capture>().Skip(1).Select(c => c.Value));
Hey you can do something like:
string test = "test12pop90java989python", tempStr = "", finalstr = "";
var charArray = test.ToCharArray();
for (int i = 0; i < charArray.Length; i++)
{
if (!Char.IsNumber(charArray[i]))
{
tempStr += charArray[i];
}
else
{
char[] ReverseString = tempStr.Reverse().ToArray();
foreach (char charItem in ReverseString)
{
finalstr += charItem;
}
tempStr = "";
finalstr += charArray[i];
}
}
if(tempStr != "" && tempStr != null)
{
char[] ReverseString = tempStr.Reverse().ToArray();
foreach (char charItem in ReverseString)
{
finalstr += charItem;
}
tempStr = "";
}
I hope this helps

C# - Split at 20 characters but not if it is in between a word

I want to split a string into smaller parts, not exceeding a string length of 20 characters.
The current code is able to split an input string into an array of strings of length 20. However, this could cut a word.
The current code is:
string[] Array;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < input.Length; i++)
{
if (i % 20 == 0 && i != 0) {
sb.Append('~');
}
sb.Append(input[i]);
}
Array = sb.ToString().Split('~');
For an input of this: Hello. This is a string. Goodbye., the output would be ['Hello. This is a str', 'ing. Goodbye.'].
However, I don’t want the string to be cut if it’s a word. That word should move to the next string in the array. How can I get the following output instead?
['Hello. This is a', 'string. Goodbye.']
First split your sentence on word-boundary:
var words = myString.Split();
Now concatenate words as long as not more than 20 characters are within your current line:
var lines = new List<string> { words[0] };
var lineNum = 0;
for(int i = 1; i < words.Length; i++)
{
if(lines[lineNum].Length + words[i].Length + 1 <= 20)
lines[lineNum] += " " + words[i];
else
{
lines.Add(words[i]);
lineNum++;
}
}
Here is a fiddle for testing: https://dotnetfiddle.net/s0LrFC
Could be more elegant but this will split the string to lines of a maximum number of characters. The words will be kept together unless they exceed the given length.
public static string[] SplitString(string input, int lineLen)
{
StringBuilder sb = new StringBuilder();
string[] words = input.Split(' ');
string line = string.Empty;
string sp = string.Empty;
foreach (string w in words)
{
string word = w;
while (word != string.Empty)
{
if (line == string.Empty)
{
while (word.Length >= lineLen)
{
sb.Append(word.Substring(0, lineLen) + "~");
word = word.Substring(lineLen);
}
if (word != string.Empty)
line = word;
word = string.Empty;
sp = " ";
}
else if (line.Length + word.Length <= lineLen)
{
line += sp + word;
sp = " ";
word = string.Empty;
}
else
{
sb.Append(line + "~");
line = string.Empty;
sp = string.Empty;
}
}
}
if (line != string.Empty)
sb.Append(line);
return sb.ToString().Split('~');
}
To test:
string[] lines = SplitString("This is a test of the string splitter KGKGKJGKGHKJHJKJKHGJHGhghsjagsjasgajsgjahs yes!", 20);
foreach (string line in lines)
{
Console.WriteLine(line);
}
Output:
This is a test of the
string splitter
KGKGKJGKGHKJHJKJKHGJ
HGhghsjagsjasgajsgja
hs yes!
I believe it's faster to split it only at places where it needs to be, instead of every word. With lines.SelectMany(x => Split(x, 80) can be used with multiline texts:
private static IEnumerable<string> Split(string text, int maxLength)
{
var i = 0;
while (i + maxLength < text.Length)
{
var partIndex = text.LastIndexOf(' ', i + maxLength, maxLength);
if (partIndex == -1)
partIndex = i + maxLength;
yield return text[i..partIndex];
i = partIndex + 1;
}
yield return text[i..];
}

Splitting text into lines with maximum length

I have a long string and I want to fit that in a small field. To achieve that, I break the string into lines on whitespace. The algorithm goes like this:
public static string BreakLine(string text, int maxCharsInLine)
{
int charsInLine = 0;
StringBuilder builder = new StringBuilder();
for (int i = 0; i < text.Length; i++)
{
char c = text[i];
builder.Append(c);
charsInLine++;
if (charsInLine >= maxCharsInLine && char.IsWhiteSpace(c))
{
builder.AppendLine();
charsInLine = 0;
}
}
return builder.ToString();
}
But this breaks when there's a short word, followed by a longer word. "foo howcomputerwork" with a max length of 16 doesn't break, but I want it to. One thought I has was looking forward to see where the next whitespace occurs, but I'm not sure whether that would result in the fewest lines possible.
Enjoy!
public static string SplitToLines(string text, char[] splitOnCharacters, int maxStringLength)
{
var sb = new StringBuilder();
var index = 0;
while (text.Length > index)
{
// start a new line, unless we've just started
if (index != 0)
sb.AppendLine();
// get the next substring, else the rest of the string if remainder is shorter than `maxStringLength`
var splitAt = index + maxStringLength <= text.Length
? text.Substring(index, maxStringLength).LastIndexOfAny(splitOnCharacters)
: text.Length - index;
// if can't find split location, take `maxStringLength` characters
splitAt = (splitAt == -1) ? maxStringLength : splitAt;
// add result to collection & increment index
sb.Append(text.Substring(index, splitAt).Trim());
index += splitAt;
}
return sb.ToString();
}
Note that splitOnCharacters and maxStringLength could be saved in user settings area of the app.
Check the contents of the character before writing to the string builder and or it with the current count:
public static string BreakLine(string text, int maxCharsInLine)
{
int charsInLine = 0;
StringBuilder builder = new StringBuilder();
for (int i = 0; i < text.Length; i++)
{
char c = text[i];
if (char.IsWhiteSpace(c) || charsInLine >= maxCharsInLine)
{
builder.AppendLine();
charsInLine = 0;
}
else
{
builder.Append(c);
charsInLine++;
}
}
return builder.ToString();
}
update a code a bit, the #dead.rabit goes to loop sometime.
public static string SplitToLines(string text,char[] splitanyOf, int maxStringLength)
{
var sb = new System.Text.StringBuilder();
var index = 0;
var loop = 0;
while (text.Length > index)
{
// start a new line, unless we've just started
if (loop != 0)
{
sb.AppendLine();
}
// get the next substring, else the rest of the string if remainder is shorter than `maxStringLength`
var splitAt = 0;
if (index + maxStringLength <= text.Length)
{
splitAt = text.Substring(index, maxStringLength).LastIndexOfAny(splitanyOf);
}
else
{
splitAt = text.Length - index;
}
// if can't find split location, take `maxStringLength` characters
if (splitAt == -1 || splitAt == 0)
{
splitAt = text.IndexOfAny(splitanyOf, maxStringLength);
}
// add result to collection & increment index
sb.Append(text.Substring(index, splitAt).Trim());
if(text.Length > splitAt)
{
text = text.Substring(splitAt + 1).Trim();
}
else
{
text = string.Empty;
}
loop = loop + 1;
}
return sb.ToString();
}

Changing commas within quotes

I am trying to read the data in a text file which is separated by commas. My problem, is that one of my data pieces has a comma within it. An example of what the text file looks like is:
a, b, "c, d", e, f.
I want to be able to take the comma between c and d and change it to a semicolon so that I can still use the string.Split() method.
using (StreamReader reader = new StreamReader("file.txt"))
{
string line;
while ((line = reader.ReadLine ()) != null) {
bool firstQuote = false;
for (int i = 0; i < line.Length; i++)
{
if (line [i] == '"' )
{
firstQuote = true;
}
else if (firstQuote == true)
{
if (line [i] == '"')
{
break;
}
if ((line [i] == ','))
{
line = line.Substring (0, i) + ";" + line.Substring (i + 1, (line.Length - 1) - i);
}
}
}
Console.WriteLine (line);
}
I am having a problem. Instead of producing
a, b, "c; d", e, f
it is producing
a, b, "c; d"; e; f
It is replacing all of the following commas with semicolons instead of just the comma in the quotes. Can anybody help me fix my existing code?
Basically if you find a closing " you recognize it as it was an opening quote.
Change the line:
firstQuote = true;
to
firstQuote = !firstQuote;
and it should work.
You need to reset firstquote to false after you hit the second quote.
else if (firstQuote == true) {
if (line [i] == '"') {
firstquote = false;
break;
}
Here is a simple application to get the required result
static void Main(string[] args)
{
String str = "a,b,\"c,d\",e,f,\"g,h\",i,j,k,l,\"m,n,o\"";
int firstQuoteIndex = 0;
int secodQuoteIndex = 0;
Console.WriteLine(str);
bool iteration = false;
//String manipulation
//if count is even then count/2 is the number of pairs of double quotes we are having
//so we have to traverse count/2 times.
int count = str.Count(s => s.Equals('"'));
if (count >= 2)
{
firstQuoteIndex = str.IndexOf("\"");
for (int i = 0; i < count / 2; i++)
{
if (iteration)
{
firstQuoteIndex = str.IndexOf("\"", firstQuoteIndex + 1);
}
secodQuoteIndex = str.IndexOf("\"", firstQuoteIndex + 1);
string temp = str.Substring(firstQuoteIndex + 1, secodQuoteIndex - (firstQuoteIndex + 1));
firstQuoteIndex = secodQuoteIndex + 1;
if (count / 2 > 1)
iteration = true;
string temp2= temp.Replace(',', ';');
str = str.Replace(temp, temp2);
Console.WriteLine(temp);
}
}
Console.WriteLine(str);
Console.ReadLine();
}
Please feel free to ask in case of doubt
string line = "a,b,mc,dm,e,f,mk,lm,g,h";
string result =replacestr(line, 'm', ',', ';');
public string replacestr(string line,char seperator,char oldchr,char newchr)
{
int cnt = 0;
StringBuilder b = new StringBuilder();
foreach (char chr in line)
{
if (cnt == 1 && chr == seperator)
{
b[b.ToString().LastIndexOf(oldchr)] = newchr;
b.Append(chr);
cnt = 0;
}
else
{
if (chr == seperator)
cnt = 1;
b.Append(chr);
}
}
return b.ToString();
}

Extract Field Names and max lengths from a text file using C#

I have a file that is a SQL Server result set saved as a text file.
Here is a sample of what the file looks like:
RWS_DMP_ID RV1_DMP_NUM CUS_NAME
3192 3957 THE ACME COMPANY
3192 3957 THE ACME COMPANY
3192 3957 THE ACME COMPANY
I want to create a C# program that reads this file and creates the following table of data:
Field MaxSize
----- -------
RWS_DMP_ID 17
RV1_DMP_NUM 17
CUS_NAME 42
This is a list of the field names and their max length. The max length is the beginning of the field to the space right before the beginning of the next field.
By the way I don't care about code performance. This is seldom used file processing utility.
I solved this with the following code:
objFile = new StreamReader(strPath + strFileName);
strLine = objFile.ReadLine();
intLineCnt = 0;
while (strLine != null)
{
intLineCnt++;
if (intLineCnt <= 3)
{
if (intLineCnt == 1)
{
strWords = SplitWords(strLine);
intNumberOfFields = strWords.Length;
foreach (char c in strLine)
{
if (bolNewField == true)
{
bolFieldEnd = false;
bolNewField = false;
}
if (bolFieldEnd == false)
{
if (c == ' ')
{
bolFieldEnd = true;
}
}
else
{
if (c != ' ')
{
if (intFieldCnt < strWords.Length)
{
strProcessedData[intFieldCnt, 0] = strWords[intFieldCnt];
strProcessedData[intFieldCnt, 1] = (intCharCnt - 1).ToString();
}
intFieldCnt++;
intCharCnt = 1;
bolNewField = true;
}
}
if (bolNewField == false)
{
intCharCnt++;
}
}
strProcessedData[intFieldCnt, 0] = strWords[intFieldCnt];
strProcessedData[intFieldCnt, 1] = intCharCnt.ToString();
}
else if (intLineCnt == 3)
{
intLine2Cnt= 0;
intTotalLength = 0;
while(intLine2Cnt < intNumberOfFields)
{
intSize = Convert.ToInt32(strProcessedData[intLine2Cnt, 1]);
if (intSize + intTotalLength > strLine.Length)
{
intSize = strLine.Length - intTotalLength;
}
strField = strLine.Substring(intTotalLength, intSize);
strField = strField.Trim();
strProcessedData[intLine2Cnt, intLineCnt - 1] = strField;
intTotalLength = intTotalLength + intSize + 1;
intLine2Cnt++;
}
}
}
strLine = objFile.ReadLine();
}`enter code here`
I'm aware that this code is a complete hack job. I'm looking for a better way to solve this problem.
Is there a better way to solve this problem?
THanks
I'm not sure how memory efficient this is, but I think it's a bit cleaner (assuming your fields are tab-delimited):
var COL_DELIMITER = new[] { '\t' };
string[] lines = File.ReadAllLines(strPath + strFileName);
// read the field names from the first line
var fields = lines[0].Split(COL_DELIMITER, StringSplitOptions.RemoveEmptyEntries).ToList();
// get a 2-D array of the columns (excluding the header row)
string[][] columnsArray = lines.Skip(1).Select(l => l.Split(COL_DELIMITER)).ToArray();
// dictionary of columns with max length
var max = new Dictionary<string, int>();
// for each field, select all columns, and take the max string length
foreach (var field in fields)
{
max.Add(field, columnsArray.Select(row => row[fields.IndexOf(field)]).Max(col => col.Trim().Length));
}
// output per requirment
Console.WriteLine(string.Join(Environment.NewLine,
max.Keys.Select(field => field + " " + max[field])
));
void MaximumWidth(StreamReader reader)
{
string[] columns = null;
int[] maxWidth = null;
string line;
while ((line = reader.ReadLine()) != null)
{
string[] cols = line.Split('\t');
if (columns == null)
{
columns = cols;
maxWidth = new int[cols.Length];
}
else
{
for (int i = 0; i < columns.Length; i++)
{
int width = cols[i].Length;
if (maxWidth[i] < width)
{
maxWidth[i] = width;
}
}
}
}
// ...
}
Here is what I came up with. The big takeaway is to use the IndexOf string function.
class Program
{
static void Main(string[] args)
{
String strFilePath;
String strLine;
Int32 intMaxLineSize;
strFilePath = [File path and name];
StreamReader objFile= null;
objFile = new StreamReader(strFilePath);
intMaxLineSize = File.ReadAllLines(strFilePath).Max(line => line.Length);
//Get the first line
strLine = objFile.ReadLine();
GetFieldNameAndFieldLengh(strLine, intMaxLineSize);
Console.WriteLine("Press <enter> to continue.");
Console.ReadLine();
}
public static void GetFieldNameAndFieldLengh(String strLine, Int32 intMaxSize)
{
Int32 x;
string[] fields = null;
string[,] strFieldSizes = null;
Int32 intFieldSize;
fields = SplitWords(strLine);
strFieldSizes = new String[fields.Length, 2];
x = 0;
foreach (string strField in fields)
{
if (x < fields.Length - 1)
{
intFieldSize = strLine.IndexOf(fields[x + 1]) - strLine.IndexOf(fields[x]);
}
else
{
intFieldSize = intMaxSize - strLine.IndexOf(fields[x]);
}
strFieldSizes[x, 0] = fields[x];
strFieldSizes[x, 1] = intFieldSize.ToString();
x++;
}
Console.ReadLine();
}
static string[] SplitWords(string s)
{
return Regex.Split(s, #"\W+");
}
}

Categories