I'm getting an error IndexOutOfRangeException was unhandled at the line int euros = int.Parse(values[1]).
My .csv file looks:
name, 1, 2
name1, 3, 4
name2, 5, 6
public static void ReadData(out Turistai[] tourists, out int amount)
{
amount = 0;
tourists = new Turistai[MaxTourists];
using (StreamReader reader = new StreamReader("C:\\Users\\Andrius\\Desktop\\Mokslams\\C#\\Pratybos\\P2\\P2.1\\turistai.csv"))
{
string line = null;
while( (line = reader.ReadLine()) != null)
{
string[] values = line.Split(';');
string name = values[0];
int euros = int.Parse(values[1]);
int cents = int.Parse(values[2]);
Console.WriteLine(euros);
//Turistai tourists = new Turistai(name, euros, cents);
amount++;
}
}
}
Probably, you have some empty lines in the CSV file.
I suggest using Linq to sum up the euros:
var data = File
.ReadLines("C:\\Users\\Andrius\\Desktop\\Mokslams\\C#\\Pratybos\\P2\\P2.1\\turistai.csv")
.Select(line => line.Split(','))
.Where(items => items.Length >= 2) // filter out empty/incomplete lines
// To debug, let's take euros only
.Select(items => int.Parse(items[1]));
// In the final solution we'll create Touristai instances
// .Select(items => new Touristai(items[0], int.Parse(items[1]), int.Parse(items[2])))
.ToArray();
Console.WriteLine(String.Join(Environment.NewLine, data));
Console.WriteLine(data.Sum());
Final solution will be
public static void ReadData(out Turistai[] tourists, out int amount) {
tourists = File
.ReadLines(#"C:\Users\Andrius\Desktop\Mokslams\C#\Pratybos\P2\P2.1\turistai.csv")
.Select(line => line.Split(','))
.Where(items => items.Length >= 2) // filter out empty/incomplete lines
.Select(items => new Touristai(items[0], int.Parse(items[1]), int.Parse(items[2])))
.ToArray();
//TODO: check syntax (I've sugested Touristai should have Euro property)
amount = tourists.Sum(tourist => tourist.Euro);
}
Your CSV input is comma separated while in the code you're splitting by semicolons. Change the split() parameter to ,:
string[] values = line.Split(',');
You may also want to add input format check to ensure the values array contains at least three items and the numeric fields do actually contain integer values (int.TryParse() may help with this):
while( (line = reader.ReadLine()) != null)
{
string[] values = line.Split(',');
if (values.Length < 3)
{
Console.Error.WriteLine("Invalid input line: " + line);
continue;
}
string name = values[0];
int euros;
if (!int.TryParse(values[1], out euros))
{
Console.Error.WriteLine("Invalid euros value in the line: " + line);
continue;
}
int cents;
if (!int.TryParse(values[2], out cents))
{
Console.Error.WriteLine("Invalid cents value in the line: " + line);
continue;
}
Console.WriteLine(euros);
//Turistai tourists = new Turistai(name, euros, cents);
amount++;
}
Related
I would like to know how to stack the same lines titled to packet and write to next file. For example, I had the following problem:
I read CSV file line by line, and I want to stack lines with the same titles to one packet.
file1:
Test;param1
Test;param2
Test1;param1
Test1;param2
Test1;param3
Test2;param1
result file:
Test;[param1,param2]
Test1;[param1,param2,param3]
Test2;[param1]
It does not have to be identical, but it is a hint on how to do something like that.
My code:
var enumLines = System.IO.File.ReadLines(pathZamowienia, Encoding.UTF8);
int factor = 0;
foreach (var line in enumLines)
{
var tabLine = line.Split(';').ToList();
if (factor == 0)
{
Console.WriteLine();
}
else
{
try
{
Title = tabLine[0];
}
catch (FormatException ex)
{
Console.WriteLine("Failure");
}
try
{
Param = tabLine[1];
}
catch (FormatException ex)
{
Console.WriteLine("Failure");
}
factor++;
}
You can use a LINQ query to group the lines
// Test input
var enumLines = new List<string> {
"Test;param1",
"Test;param2",
"Test1;param1",
"Test1;param2",
"Test1;param3",
"Test2;param1"
};
// Re-group the parameters
var newLines = enumLines
.Select(s => s.Split(';'))
.GroupBy(a => a[0], a => a[1])
.Select(g => g.Key + ";[" + String.Join(",", g) + "]");
// Test output:
foreach (string line in newLines) {
Console.WriteLine(line);
}
Output:
Test;[param1,param2]
Test1;[param1,param2,param3]
Test2;[param1]
Note that the group g itself is an enumeration of the aggregated values and also has a property Key. The first argument of GroupBy selects the Key, the second optional parameter selects the value to be aggregated. If it is omitted, the input (the string array a) is aggregated.
If the input includes misshaped lines, you could also exclude them with an additional Where-clause:
var newLines = enumLines
.Select(s => s.Split(';'))
.Where(a => a.Length >= 2)
.GroupBy(a => a[0], a => a[1])
.Select(g => g.Key + ";[" + String.Join(",", g) + "]");
This is what you can do. Parse your file first and then transform
var data - new Dictionary<string, List<string>>();
string[] lines = File.ReadAllLines(fileName);
foreach(string line in Lines)
{
string parts = line.Split(';');
if (!data.ContainsKey(parts[0]))
data.Add(parts[0], new List<string>());
data[parts[0]].Add(parts[1]);
}
// then you open stream and write this
foreach(var kvp in data)
{
string line = $"{kvp.Key};[{string.Join(',', kvp.Value)}]"
// write line here
}
// close stream
For example, there is a line:
name, tax, company.
To separate them i need a split method.
string[] text = File.ReadAllLines("file.csv", Encoding.Default);
foreach (string line in text)
{
string[] words = line.Split(',');
foreach (string word in words)
{
Console.WriteLine(word);
}
}
Console.ReadKey();
But how to divide if in quotes the text with a comma is indicated:
name, tax, "company, Ariel";<br>
"name, surname", tax, company;<br> and so on.
To make it like this :
Max | 12.3 | company, Ariel
Alex, Smith| 13.1 | Oriflame
It is necessary to take into account that the input data will not always be in an ideal format (as in the example). That is, there may be 3 quotes in a row or a string without commas. The program should not fall in any case. If it is impossible to parse, then issue a message about it.
Split using double quotes first. And Split using comma on the first string.
You can use TextFieldParser from Microsoft.VisualBasic.FileIO
var list = new List<Data>();
var isHeader=true;
using (TextFieldParser parser = new TextFieldParser(filePath))
{
parser.Delimiters = new string[] { "," };
while (true)
{
string[] parts = parser.ReadFields();
if(isHeader)
{
isHeader = false;
continue;
}
if (parts == null)
break;
list.Add(new Data
{
People = parts[0],
Tax = Double.Parse(parts[1]),
Company = parts[2]
});
}
}
Where Data is defined as
public class Data
{
public string People{get;set;}
public double Tax{get;set;}
public string Company{get;set;}
}
Please note you need to include Microsoft.VisualBasic.FileIO
Example Data,
Name,Tax,Company
Max,12.3,"company, Ariel"
Ariel,13.1,"company, Oriflame"
Output
Here's a bit of code that might help, not the most efficient but I use it to 'see' what is going on with the parsing if a particular line is giving trouble.
string[] text = File.ReadAllLines("file.csv", Encoding.Default);
string[] datArr;
string tmpStr;
foreach (string line in text)
{
ParseString(line, ",", "!####!", out datArr, out tmpStr)
foreach(string s in datArr)
{
Console.WriteLine(s);
}
}
Console.ReadKey();
private static void ParseString(string inputString, string origDelim, string newDelim, out string[] retArr, out string retStr)
{
string tmpStr = inputString;
retArr = new[] {""};
retStr = "";
if (!string.IsNullOrWhiteSpace(tmpStr))
{
//If there is only one Quote character in the line, ignore/remove it:
if (tmpStr.Count(f => f == '"') == 1)
tmpStr = tmpStr.Replace("\"", "");
string[] tmpArr = tmpStr.Split(new[] {origDelim}, StringSplitOptions.None);
var inQuote = 0;
StringBuilder lineToWrite = new StringBuilder();
foreach (var s in tmpArr)
{
if (s.Contains("\""))
inQuote++;
switch (inQuote)
{
case 1:
//Begin quoted text
lineToWrite.Append(lineToWrite.Length > 0
? newDelim + s.Replace("\"", "")
: s.Replace("\"", ""));
if (s.Length > 4 && s.Substring(0, 2) == "\"\"" && s.Substring(s.Length - 2, 2) != "\"\"")
{
//if string has two quotes at the beginning and is > 4 characters and the last two characters are NOT quotes,
//inquote needs to be incremented.
inQuote++;
}
else if ((s.Substring(0, 1) == "\"" && s.Substring(s.Length - 1, 1) == "\"" &&
s.Length > 1) || (s.Count(x => x == '\"') % 2 == 0))
{
//if string has more than one character and both begins and ends with a quote, then it's ok and counter should be reset.
//if string has an EVEN number of quotes, it should be ok and counter should be reset.
inQuote = 0;
}
else
{
inQuote++;
}
break;
case 2:
//text between the quotes
//If we are here the origDelim value was found between the quotes
//include origDelim so there is no data loss.
//Example quoted text: "Dr. Mario, Sr, MD";
// ", Sr" would be handled here
// ", MD" would be handled in case 3 end of quoted text.
lineToWrite.Append(origDelim + s);
break;
case 3:
//End quoted text
//If we are here the origDelim value was found between the quotes
//and we are at the end of the quoted text
//include origDelim so there is no data loss.
//Example quoted text: "Dr. Mario, MD"
// ", MD" would be handled here.
lineToWrite.Append(origDelim + s.Replace("\"", ""));
inQuote = 0;
break;
default:
lineToWrite.Append(lineToWrite.Length > 0 ? newDelim + s : s);
break;
}
}
if (lineToWrite.Length > 0)
{
retStr = lineToWrite.ToString();
retArr = tmpLn.Split(new[] {newDelim}, StringSplitOptions.None);
}
}
}
Actually I set four columns using data table and I want this column retrieve value from text file. I used regex for remove the particular line from the text file.
My objective is that I want to show text file on the grid using data table so first I am trying to create data table and remove the line (show at the program) using regex.
Here I post my full code.
namespace class
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
StreamReader sreader = File.OpenText(#"C:\FareSearchRegex.txt");
string line;
DataTable dt = new DataTable();
DataRow dr;
dt.Columns.Add("PTC");
dt.Columns.Add("CUR");
dt.Columns.Add("TAX");
dt.Columns.Add("FARE BASIS");
while ((line = sreader.ReadLine()) != null)
{
var pattern = "---------- RECOMMENDATION 1 OF 3 IN GROUP 1 (USD 168.90)----------";
var result = Regex.Replace(line,pattern," ");
dt.Rows.Add(line);
}
}
}
class Class1
{
string PTC;
string CUR;
float TAX;
public string gsPTC
{
get{ return PTC; }
set{ PTC = value; }
}
public string gsCUR
{
get{ return CUR; }
set{ CUR = value; }
}
public float gsTAX
{
get{ return TAX; }
set{ TAX = value; }
}
}
}
If your format is strict(e.g. always 4 columns) and you want to remove only this complete line i don't see any reason to use regex:
var rows = File.ReadLines(#"C:\FareSearchRegex.txt")
.Where(l => l != "---------- RECOMMENDATION 1 OF 3 IN GROUP 1 (USD 168.90)----------")
.Select(l => new { line = l, items = l.Split(','), row = dt.Rows.Add() });
foreach (var x in rows)
x.row.ItemArray = x.items;
(assumed that the fields are separated by comma)
Edit: This works with your pastebin:
string header = " PTC CUR TAX FARE BASIS";
bool takeNextLine = false;
foreach (string line in File.ReadLines(#"C:\FareSearchRegex.txt"))
{
if (line.StartsWith(header))
takeNextLine = true;
else if (takeNextLine)
{
var tokens = line.Split(new[] { #" " }, StringSplitOptions.RemoveEmptyEntries);
dt.Rows.Add().ItemArray = tokens.Where((t, i) => i != 2).ToArray();
takeNextLine = false;
}
}
(since you have an empty column which you want to exclude from the result i've used the clumsy and possibly error-prone(?) query Where((t, i) => i != 2))
To parse the file you'll need to:
Split the text of the file into data chunks. A chunk, in your case can be identified by the header PTC CUR TAX FARE BASIS and by the TOTAL line. To split the text you'll need to tokenize the input as follows> (i) define a regular expression to match the headers, (ii) define a regular expression to match the Total lines (footers); Using (i) and (ii) you can join them by the order of appearance index and determine the total size of each chunk (see the line with (x,y)=>new{StartIndex = x.Match.Index, EndIndex = y.Match.Index + y.Match.Length}) below). Use String.Substring method to separate the chunks.
Extract the data from each individual chunk. Knowing that data is split by lines you just have to iterate through all lines in a chunk (ignoring header and footer) and process each line.
This code should help:
string file = #"C:\FareSearchRegex.txt";
string text = File.ReadAllText(file);
var headerRegex = new Regex(#"^(\)>)?\s+PTC\s+CUR\s+TAX\s+FARE BASIS$", RegexOptions.IgnoreCase | RegexOptions.Multiline);
var totalRegex = new Regex(#"^\s+TOTAL[\w\s.]+?$",RegexOptions.IgnoreCase | RegexOptions.Multiline);
var lineRegex = new Regex(#"^(?<Num>\d+)?\s+(?<PTC>[A-Z]+)\s+\d+\s(?<Cur>[A-Z]{3})\s+[\d.]+\s+(?<Tax>[\d.]+)",RegexOptions.IgnoreCase | RegexOptions.Multiline);
var dataIndices =
headerRegex.Matches(text).Cast<Match>()
.Select((m, index) => new{ Index = index, Match = m })
.Join(totalRegex.Matches(text).Cast<Match>().Select((m, index) => new{ Index = index, Match = m }),
x => x.Index,
x => x.Index,
(x, y) => new{ StartIndex = x.Match.Index, EndIndex = y.Match.Index + y.Match.Length });
var items = dataIndices
.Aggregate(new List<string>(), (list, x) =>
{
var item = text.Substring(x.StartIndex, x.EndIndex - x.StartIndex);
list.Add(item);
return list;
});
var result = items.SelectMany(x =>
{
var lines = x.Split(new string[]{Environment.NewLine, "\r", "\n"}, StringSplitOptions.RemoveEmptyEntries);
return lines.Skip(1) //Skip header
.Take(lines.Length - 2) // Ignore footer
.Select(line =>
{
var match = lineRegex.Match(line);
return new
{
Ptc = match.Groups["PTC"].Value,
Cur = match.Groups["Cur"].Value,
Tax = Convert.ToDouble(match.Groups["Tax"].Value)
};
});
});
I am facing a problem while executing a sql query in C#.The sql query throws an error when the string contains more than 1000 enteries in the IN CLAUSE .The string has more than 1000 substrings each seperated by ','.
I want to split the string into string array each containing 999 strings seperated by ','.
or
How can i find the nth occurence of ',' in a string.
Pull the string from SQL server into a DataSet using a utilities code like
string strResult = String.Empty;
using (SqlCommand cmd = new SqlCommand())
{
cmd.Connection = conn;
cmd.CommandText = strSQL;
strResult = cmd.ExecuteScalar().ToString();
}
Get the returned string from SQL Server
Split the string on the ','
string[] strResultArr = strResult.Split(',');
then to get the nth string that is seperated by ',' (I think this is what you mean by "How can i find the nth occurence of ',' in a string." use
int n = someInt;
string nthEntry = strResultArr[someInt - 1];
I hope this helps.
You could use a regular expression and the Index property of the Match class:
// Long string of 2000 elements, seperated by ','
var s = String.Join(",", Enumerable.Range(0,2000).Select (e => e.ToString()));
// find all ',' and use '.Index' property to find the position in the string
// to find the first occurence, n has to be 0, etc. etc.
var nth_position = Regex.Matches(s, ",")[n].Index;
To create an array of strings of your requiered size, you could split your string and use LINQ's GroupBy to partition the result, and then joining the resulting groups together:
var result = s.Split(',').Select((x, i) => new {Group = i/1000, Value = x})
.GroupBy(item => item.Group, g => g.Value)
.Select(g => String.Join(",", g));
result now contains two strings, each with 1000 comma seperated elements.
How's this:
int groupSize = 1000;
string[] parts = s.Split(',');
int numGroups = parts.Length / groupSize + (parts.Length % groupSize != 0 ? 1 : 0);
List<string[]> Groups = new List<string[]>();
for (int i = 0; i < numGroups; i++)
{
Groups.Add(parts.Skip(i * groupSize).Take(groupSize).ToArray());
}
Maybe something like this:
string line = "1,2,3,4";
var splitted = line.Split(new[] {','}).Select((x, i) => new {
Element = x,
Index = i
})
.GroupBy(x => x.Index / 1000)
.Select(x => x.Select(y => y.Element).ToList())
.ToList();
After this you should just String.Join each IList<string>.
//initial string of 10000 entries divided by commas
string s = string.Join(", ", Enumerable.Range(0, 10000));
//an array of entries, from the original string
var ss = s.Split(',');
//auxiliary index
int index = 0;
//divide into groups by 1000 entries
var words = ss.GroupBy(w =>
{
try
{
return index / 1000;
}
finally
{
++index;
}
})//join groups into "words"
.Select(g => string.Join(",", g));
//print each word
foreach (var word in words)
Console.WriteLine(word);
Or you may find the indeces in the string and split it into substrings afterwards:
string s = string.Join(", ", Enumerable.Range(0, 100));
int index = 0;
var indeces =
Enumerable.Range(0, s.Length - 1).Where(i =>
{
if (s[i] == ',')
{
if (index < 9)
++index;
else
{
index = 0;
return true;
}
}
return false;
}).ToList();
Console.WriteLine(s.Substring(0, indeces[0]));
for (int i = 0; i < indeces.Count - 1; i++)
{
Console.WriteLine(s.Substring(indeces[i], indeces[i + 1] - indeces[i]));
}
However, I would think over, if it was possible to work with the entries before they are combined into one string. And probably think, if it was possible to prevent the necessity to make a query which needs that great list to pass into the IN statement.
string foo = "a,b,c";
string [] foos = foo.Split(new char [] {','});
foreach(var item in foos)
{
Console.WriteLine(item);
}
I have implemented Cuong's solution here:
C# Processing Fixed Width Files
Here is my code:
var lines = File.ReadAllLines(#fileFull);
var widthList = lines.First().GroupBy(c => c)
.Select(g => g.Count())
.ToList();
var list = new List<KeyValuePair<int, int>>();
int startIndex = 0;
for (int i = 0; i < widthList.Count(); i++)
{
var pair = new KeyValuePair<int, int>(startIndex, widthList[i]);
list.Add(pair);
startIndex += widthList[i];
}
var csvLines = lines.Select(line => string.Join(",",
list.Select(pair => line.Substring(pair.Key, pair.Value))));
File.WriteAllLines(filePath + "\\" + fileName + ".csv", csvLines);
#fileFull = File Path & Name
The issue I have is the first line of the input file also contains digits. So it could be AAAAAABBC111111111DD2EEEEEE etc. For some reason the output from Cuong's code gives me CSV headings like 1111RRRR and 222223333.
Does anyone know why this is and how I would fix it?
Header row example:
AAAAAAAAAAAAAAAABBBBBBBBBBCCCCCCCCDEFCCCCCCCCCGGGGGGGGHHHHHHHHIJJJJJJJJKKKKLLLLMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOPPPPQQQQ1111RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR222222222333333333444444444555555555666666666777777777888888888999999999S00001111TTTTTTTTTTTTUVWXYZ!"£$$$$$$%&
Converted header row:
AAAAAAAAAAAAAAAA BBBBBBBBBB CCCCCCCCDEFCCCCCC C C C GGGGGGGG HHHHHHHH I JJJJJJJJ KKKK LLLL MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM NNNNNNNNNNNNNNNNNNNNNNNNNNNNNN OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO PPPP QQQQ 1111RRRR RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR2222 222223333 333334444 444445555 555556666 666667777 777778888 888889999 99999S000 0 1111 TTTTTTTTTTTT U V W X Y Z ! ",�,$$$$$$,%,&,"
Jodrell - I implemented your suggestion but the header output is like:
BBBBBBBBBBCCCCCC CCCCCCCCD DEFCCCC GGGGGGGG HHHHHHH IJJJJJJ KKKKLLL LLL MMM NNNNNNNNNNNNNNNNNNNNNNNNNNNNN OOOOOOOOOOOOOOOOOOOOOOOOOOOOO PPPPQQQQ1111RRRRRRRRRRRRRRRRR QQQ 111 RRR 33333333 44444444 55555555 66666666 77777777 88888888 99999999 S0000111 111 TTT UVWXYZ!"�$$ %&
As Jodrell already mentioned, your code doesn't work because it assumed that the character representing each column header is distinct. Change the code that parse the header widths would fix it.
Replace:
var widthList = lines.First().GroupBy(c => c)
.Select(g => g.Count())
.ToList();
With:
var widthList = new List<int>();
var header = lines.First().ToArray();
for (int i = 0; i < header.Length; i++)
{
if (i == 0 || header[i] != header[i-1])
widthList.Add(0);
widthList[widthList.Count-1]++;
}
Parsed header columns:
AAAAAAAAAAAAAAAA BBBBBBBBBB CCCCCCCC D E F CCCCCCCCC GGGGGGGG HHHHHHHH I JJJJJJJJ KKKK LLLL MMMMMMMMMMMMMMMMMMMMMMMMMMMMMM NNNNNNNNNNNNNNNNNNNNNNNNNNNNNN OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO PPPP QQQQ 1111 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 222222222 333333333 444444444 555555555 666666666 777777777 888888888 999999999 S 0000 1111 TTTTTTTTTTTT U V W X Y Z ! " £ $$$$$$ % &
EDIT
Because the problem annoyed me I wrote some code that handles " and ,. This code replaces the header row with comma delimited alternating zeros and ones. Any commas or double quotes in the body are appropriately escaped.
static void FixedToCsv(string sourceFile)
{
if (sourceFile == null)
{
// Throw exception
}
var dir = Path.GetDirectory(sourceFile)
var destFile = string.Format(
"{0}{1}",
Path.GetFileNameWithoutExtension(sourceFile),
".csv");
if (dir != null)
{
destFile = Path.Combine(dir, destFile);
}
if (File.Exists(destFile))
{
// Throw Exception
}
var blocks = new List<KeyValuePair<int, int>>();
using (var output = File.OpenWrite(destFile))
{
using (var input = File.OpenText(sourceFile))
{
var outputLine = new StringBuilder();
// Make header
var header = input.ReadLine();
if (header == null)
{
return;
}
var even = false;
var lastc = header.First();
var counter = 0;
var blockCounter = 0;
foreach(var c in header)
{
counter++;
if (c == lastc)
{
blockCounter++;
}
else
{
blocks.Add(new KeyValuePair<int, int>(
counter - blockCounter - 1,
blockCounter));
blockCounter = 1;
outputLine.Append(',');
even = !even;
}
outputLine.Append(even ? '1' : '0');
lastc = c;
}
blocks.Add(new KeyValuePair<int, int>(
counter - blockCounter,
blockCounter));
outputLine.AppendLine();
var lineBytes = Encoding.UTF.GetBytes(outputLine.ToString());
outputLine.Clear();
output.Write(lineBytes, 0, lineBytes.Length);
// Process Body
var inputLine = input.ReadLine();
while (inputLine != null)
{
foreach(var block in block.Select(b =>
inputLine.Substring(b.Key, b.Value)))
{
var sanitisedBlock = block;
if (block.Contains(',') || block.Contains('"'))
{
santitisedBlock = string.Format(
"\"{0}\"",
block.Replace("\"", "\"\""));
}
outputLine.Append(sanitisedBlock);
outputLine.Append(',');
}
outputLine.Remove(outputLine.Length - 1, 1);
outputLine.AppendLine();
lineBytes = Encoding.UTF8.GetBytes(outputLne.ToString());
outputLine.Clear();
output.Write(lineBytes, 0, lineBytes.Length);
inputLine = input.ReadLine();
}
}
}
}
1 is repeated in your header row, so your two fours get counted as one eight and everything goes wrong from there.
(There is a block of four 1s after the Qs and another block of four 1s after the 0s)
Essentialy, your header row is invalid or, at least, doesen't work with the proposed solution.
Okay, you could do somthing like this.
public void FixedToCsv(string fullFile)
{
var lines = File.ReadAllLines(fullFile);
var firstLine = lines.First();
var widths = new List<KeyValuePair<int, int>>();
var innerCounter = 0;
var outerCounter = 0
var firstLineChars = firstLine.ToCharArray();
var lastChar = firstLineChars[0];
foreach(var c in firstLineChars)
{
if (c == lastChar)
{
innerCounter++;
}
else
{
widths.Add(new KeyValuePair<int, int>(
outerCounter
innerCounter);
innerCounter = 0;
lastChar = c;
}
outerCounter++;
}
var csvLines = lines.Select(line => string.Join(",",
widths.Select(pair => line.Substring(pair.Key, pair.Value))));
// Get filePath and fileName from fullFile here.
File.WriteAllLines(filePath + "\\" + fileName + ".csv", csvLines);
}