Copying a CSV file while reordering columns and adding empty ones.
For example, suppose every line of the incoming file (except the first, which is the header with column names) has values for 3 out of 10 columns, in an order different from the output:
col2,col6,col4 // first line - column names
2, 5, 8 // subsequent lines - values for 3 columns
and the output is expected to have the header
col0,col1,col2,col3,col4,col5,col6,col7,col8,col9
then the output should be "" for col0, col1, col3, col5, col7, col8, col9, and the values from col2, col4, col6 of the input file. So for the second line shown (2, 5, 8) the expected output is ",,2,,8,,5,,,".
The code below is what I've tried, and it is slower than I want.
I have two lists.
The first list, filecolumnnames, is created by splitting a delimited string (line); it is recreated for every line in the file.
The second list, list, holds the order in which the first list needs to be rearranged and re-concatenated.
This works:
string fileName = "F:\\temp.csv";
//file data has first row col3,col2,col1,col0;
//second row: 4,3,2,1
//so on
string fileName_recreated = "F:\\temp_1.csv";
int count = 0;
const Int32 BufferSize = 1028;
using (var fileStream = File.OpenRead(fileName))
using (var streamReader = new StreamReader(fileStream, Encoding.UTF8, true, BufferSize))
{
String line;
List<int> list = new List<int>();
string orderedcolumns = "\"\"";
string tableheader = "col0,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10";
List<string> tablecolumnnames = new List<string>();
List<string> filecolumnnames = new List<string>();
while ((line = streamReader.ReadLine()) != null)
{
count = count + 1;
StringBuilder sb = new StringBuilder("");
tablecolumnnames = tableheader.Split(',').ToList();
if (count == 1)
{
string fileheader = line;
//e.g. fileheader = "col3,col2,col1,col0"
filecolumnnames = fileheader.Split(',').ToList();
foreach (string col in tablecolumnnames)
{
int index = filecolumnnames.IndexOf(col);
if (index == -1)
{
sb.Append(",");
// orderedcolumns=orderedcolumns+"+\",\"";
list.Add(-1);
}
else
{
sb.Append(filecolumnnames[index] + ",");
//orderedcolumns = orderedcolumns+ "+filecolumnnames["+index+"]" + "+\",\"";
list.Add(index);
}
// MessageBox.Show(orderedcolumns);
}
}
else
{
filecolumnnames = line.Split(',').ToList();
foreach (int items in list)
{
//MessageBox.Show(items.ToString());
if (items == -1)
{
sb.Append(",");
}
else
{
sb.Append(filecolumnnames[items] + ",");
}
}
//expected format sb.Append(filecolumnnames[3] + "," + filecolumnnames[2] + "," + filecolumnnames[2] + ",");
//sb.Append(orderedcolumns);
var result = String.Join(",", list.Select(index => index == -1 ? "" : filecolumnnames[index]));
}
using (FileStream fs = new FileStream(fileName_recreated, FileMode.Append, FileAccess.Write))
using (StreamWriter sw = new StreamWriter(fs))
{
sw.WriteLine(sb.ToString());
}
}
}
I am trying to make it faster by constructing a string, orderedcolumns, once, and replacing the second foreach loop, which runs for every row, with that constructed string.
So if you uncomment the orderedcolumns construction, orderedcolumns = orderedcolumns+ "+filecolumnnames["+index+"]" + "+\",\"";, and uncomment the append, sb.Append(orderedcolumns);, I expect the values referred to by the constructed string, but appending orderedcolumns appends the literal text, i.e.
""+","+filecolumnnames[3]+","+filecolumnnames[2]+","+filecolumnnames[1]+","+filecolumnnames[0]+","+","+","+","+","+","+","
That is, I want it to take the value stored at filecolumnnames[3], not the text filecolumnnames[3] itself.
Expected value: if a line has 1,2,3,4
I want the output to be 4,3,2,1, as filecolumnnames[3] will hold 4, filecolumnnames[2] will hold 3, and so on.
String.Join is the way to construct comma- or space-delimited strings from a sequence.
var result = String.Join(", ", list.Select(index => filecolumnnames[index]));
Since you are reading only a subset of the columns, and the column order in the input and output don't match, I'd use a dictionary to hold each row of input.
// filecolumnnames must hold the names from the input header line here
var row = filecolumnnames
.Zip(line.Split(','), (Name, Value) => new { Name, Value })
.ToDictionary(x => x.Name, x => x.Value);
For output I'd fill the sequence from defaults or from the input row:
var outputLine = String.Join(",",
tablecolumnnames
.Select(name => row.ContainsKey(name) ? row[name] : ""));
Note: the code was typed in and not compiled.
orderedcolumns = orderedcolumns+ "+filecolumnnames["+index+"]" + "+\",\"";
should be
orderedcolumns = orderedcolumns + filecolumnnames[index] + ",";
You should, however, use String.Join as others have pointed out. Or, if orderedcolumns is a StringBuilder rather than a string:
orderedcolumns.AppendFormat("{0},", filecolumnnames[index]);
Either way you will have to deal with the extra ',' on the end.
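The advice above can be put together as a minimal sketch: compute the column mapping once from the header, build each output line with String.Join, and open the output file only once instead of once per line (reopening the file for every row is a likely cause of the slowness in the question). The class and method names (CsvReorder, BuildIndexMap, ReorderLine, Copy) are hypothetical, and the sketch assumes plain comma-separated values with no quoted fields and header names that match exactly. Unlike the question's code, it writes the full output header as the first line.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CsvReorder
{
    // Maps each output column to its position in the input header (-1 if absent).
    public static int[] BuildIndexMap(string[] outputColumns, string[] inputHeader)
    {
        return outputColumns
            .Select(name => Array.IndexOf(inputHeader, name))
            .ToArray();
    }

    // Rearranges one input row into output order; absent columns become empty fields.
    public static string ReorderLine(string line, int[] indexMap)
    {
        string[] values = line.Split(',');
        return string.Join(",",
            indexMap.Select(i => i >= 0 && i < values.Length ? values[i] : ""));
    }

    // Streams the input to the output, opening each file only once.
    public static void Copy(string inputPath, string outputPath, string[] outputColumns)
    {
        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            string header = reader.ReadLine();
            if (header == null) return; // empty input file

            int[] indexMap = BuildIndexMap(outputColumns, header.Split(','));
            writer.WriteLine(string.Join(",", outputColumns));

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(ReorderLine(line, indexMap));
            }
        }
    }
}
```

Because String.Join only puts separators between items, there is no trailing ',' to trim afterwards.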
Related
I have a text file containing the following data:
Jason,155
Peter,200
May,320
Jack,100
The above text is all in the same txt file.
I need to display the name and the value of the person with the highest value in a separate TextBox. I can't figure out how to find the highest value using StreamReader.
EDIT: This is the code that I used to read the txt file, but I'm not sure how to write the code to pick the person with the highest value to display in a TextBox.
string[] Contestants = File.ReadAllLines(filePath);
foreach (var member in Contestants)
{
string[] first = member.Split(',');
string firstTemp = first[0] + "," + first[1];
}
This approach reads each line and filters it using LINQ:
var maxRow = File.ReadLines("file.txt")
.Where(line => !string.IsNullOrEmpty(line))
.Select(line => line.Split(','))
.Where(words => words.Length == 2)
.Aggregate((i1, i2) => int.Parse(i1[1]) >= int.Parse(i2[1]) ? i1 : i2);
string name = maxRow[0];
int number = int.Parse(maxRow[1]);
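Note that Aggregate without a seed throws an InvalidOperationException when no line passes the filters, and int.Parse throws a FormatException on a non-numeric value. A minimal sketch of a more forgiving variant is shown here, assuming malformed lines should simply be skipped; the names TopScore and FindMax are hypothetical, and the tuple syntax needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;

public static class TopScore
{
    // Returns the name and score of the highest-valued "name,value" line,
    // or null when no line can be parsed. Malformed lines are skipped.
    public static (string Name, int Score)? FindMax(IEnumerable<string> lines)
    {
        (string Name, int Score)? best = null;
        foreach (string line in lines)
        {
            string[] parts = line.Split(',');
            if (parts.Length != 2 || !int.TryParse(parts[1].Trim(), out int score))
                continue; // not a "name,value" pair with an integer value

            if (best == null || score > best.Value.Score)
                best = (parts[0].Trim(), score);
        }
        return best;
    }
}
```

It could be called as TopScore.FindMax(File.ReadLines(filePath)); when the result is not null, its Name and Score can be assigned to the TextBox.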
Would it help to read line by line with a StreamReader, split by "," and parse the integer from the string?
Read like this:
using var fs = new FileStream(path, FileMode.Open, FileAccess.Read);
using var sr = new StreamReader(fs, Encoding.UTF8);
string line = String.Empty;
while ((line = sr.ReadLine()) != null) { Console.WriteLine(line); }
I have a csv file with 2 million rows and file size of 2 GB. But due to a couple of free text form columns, these contain redundant CRLF and cause the file to not load in the SQL Server table. I get an error that the last column does not end with ".
I have the following code, but it gives an OutOfMemoryException when reading from fileName. The line is:
var lines = File.ReadAllLines(fileName);
How can I fix it? Ideally, I would like to split the file into two files, one of good rows and one of bad rows. Or delete rows that do not end with "CRLF.
int goodRow = 0;
int badRow = 0;
String badRowFileName = fileName.Substring(0, fileName.Length - 4) + "BadRow.csv";
String goodRowFileName = fileName.Substring(0, fileName.Length - 4) + "GoodRow.csv";
var charGood = "\"\"";
String lineOut = string.Empty;
String str = string.Empty;
var lines = File.ReadAllLines(fileName);
StringBuilder sbGood = new StringBuilder();
StringBuilder sbBad = new StringBuilder();
foreach (string line in lines)
{
if (line.Contains(charGood))
{
goodRow++;
sbGood.AppendLine(line);
}
else
{
badRow++;
sbBad.AppendLine(line);
}
}
if (badRow > 0)
{
File.WriteAllText(badRowFileName, sbBad.ToString());
}
if (goodRow > 0)
{
File.WriteAllText(goodRowFileName, sbGood.ToString());
}
sbGood.Clear();
sbBad.Clear();
msg = msg + "Good Rows - " + goodRow.ToString() + " Bad Rows - " + badRow.ToString() + " Done.";
You can translate that code like this to be much more efficient:
int goodRow = 0, badRow = 0;
String badRowFileName = fileName.Substring(0, fileName.Length - 4) + "BadRow.csv";
String goodRowFileName = fileName.Substring(0, fileName.Length - 4) + "GoodRow.csv";
var charGood = "\"\"";
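// File.ReadLines returns a lazy IEnumerable<string>, which is not IDisposable,
// so it is enumerated directly instead of being wrapped in a using statement.
using (var swGood = new StreamWriter(goodRowFileName))
using (var swBad = new StreamWriter(badRowFileName))
{
foreach (string line in File.ReadLines(fileName))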
{
if (line.Contains(charGood))
{
goodRow++;
swGood.WriteLine(line);
}
else
{
badRow++;
swBad.WriteLine(line);
}
}
}
msg += $"Good Rows: {goodRow,9} Bad Rows: {badRow,9} Done.";
But I'd also look at using a real csv parser for this. There are plenty on NuGet. That might even let you clean up the data on the fly.
I would not suggest reading the entire file into memory, then processing it, then writing all the modified contents out to the new file.
Instead, use file streams:
using (var rdr = new StreamReader(fileName))
using (var wrtrGood = new StreamWriter(goodRowFileName))
using (var wrtrBad = new StreamWriter(badRowFileName))
{
string line = null;
while ((line = rdr.ReadLine()) != null)
{
if (line.Contains(charGood))
{
goodRow++;
wrtrGood.WriteLine(line);
}
else
{
badRow++;
wrtrBad.WriteLine(line);
}
}
}
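If the goal is to repair the broken rows rather than only separate them, one possible approach is to glue physical lines back together until the double quotes balance. The sketch below is an assumption, not something the answers above prescribe: it assumes fields are quoted in the usual CSV way, with quotes inside a field doubled, so a complete record always contains an even number of quote characters. CsvRecordJoiner and JoinBrokenRecords are hypothetical names, and replacing the embedded line break with a space is an arbitrary choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CsvRecordJoiner
{
    // Yields logical records by gluing physical lines together until the
    // number of double quotes seen so far is even (no quoted field is open).
    public static IEnumerable<string> JoinBrokenRecords(IEnumerable<string> physicalLines)
    {
        var pending = new StringBuilder();
        int quoteCount = 0;

        foreach (string line in physicalLines)
        {
            if (pending.Length > 0)
                pending.Append(' '); // stands in for the embedded line break

            pending.Append(line);
            quoteCount += line.Count(c => c == '"');

            if (quoteCount % 2 == 0)
            {
                yield return pending.ToString();
                pending.Clear();
                quoteCount = 0;
            }
        }

        // An unterminated record at the end of the file is returned as-is.
        if (pending.Length > 0)
            yield return pending.ToString();
    }
}
```

Because the method is an iterator, feeding it File.ReadLines(fileName) and writing each result to a StreamWriter keeps only one record in memory at a time.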
My csv data looks something like this:
Device data for period 30/08/2016 to 30/08/2016
Site ID,Time,INC1_MD
VSI-18,2016-08-30 00:00:00,165.954
VSI-18,2016-08-30 00:01:00,14.524
VSI-18,2016-08-30 00:02:00,32.920
VSI-18,2016-08-30 00:03:00,48.508
VSI-18,2016-08-30 00:04:00,62.418
.....
I am trying to ignore the first two lines and start at "VSI-18...",
extract the third column, which comes after the date & time column,
and export the values into a new csv file, one column per day,
like:
day1,day2,day3
100,200,300
200,123,123
123,222,444
....
and here is my code
o_csv_loc.Text = varFile; //csv data file location
save_file_loc.Text = saveloc; //new csv file location
var reader = new StreamReader(File.OpenRead(varFile));
List <string[]> listA = new List<string[]>();
List<string[]> listB = new List<string[]>();
List<string[]> listC = new List<string[]>();
//I think the two lines below skip the first 2 lines of the csv data
//file, so reading starts at the third line (VSI-18...)
reader.ReadLine();
reader.ReadLine();
while (reader.Peek() > -1)
{
var line = reader.ReadLine();
var values = line.Split(';');
listA.Add(new string[] { values[0] });
listB.Add(new string[] { values[1] });
listC.Add(new string[] { values[2] });
//I think listC is supposed to hold the data after the
//second comma, which is the third column
}
I have not finished the export code yet because I can't figure out how to read the data.
When debugging, a 'System.IndexOutOfRangeException' is thrown on the line
listB.Add(new string[] { values[1] });
Why is there a problem on this line? values[0] causes no problem.
EDIT
I succeeded in exporting the data to a new csv file:
var reader = new StreamReader(File.OpenRead(varFile));
List<string[]> listA = new List<string[]>(); //here is the changed code
List<string[]> listB = new List<string[]>();
List<string[]> listC = new List<string[]>();
reader.ReadLine();
reader.ReadLine();
while (reader.Peek() > -1)
{
var line = reader.ReadLine();
var values = line.Split(',');
listA.Add(new string[] { values[0] });
listB.Add(new string[] { values[1] });
listC.Add(new string[] { values[2]});
}
using (System.IO.TextWriter writer = File.CreateText(saveloc))
{
for (int index = 0; index < listC.Count; index++)
{
writer.WriteLine(string.Join(",", listC[index]) + ',');
}
}
The result is this:
165.954,
14.524,
32.920,
48.508,
62.418,
79.151,
96.982,
I am still figuring out how to detect a new date and put its values into a new column.
First, as one of the comments stated, your code uses ; as the delimiter character, but the csv file uses , so the result of var values = line.Split(';'); is an array with only one element.
Second, I would safeguard my application against incorrect formats or corrupted data. For example:
var line = reader.ReadLine();
if (string.IsNullOrEmpty(line)) //<--empty row
continue; //<--ignore, or else add empty values to your in-memory lists
var values = line.Split(',');
listA.Add(new string[] { values.Length > 0 ? values[0] : string.Empty });
listB.Add(new string[] { values.Length > 1 ? values[1] : string.Empty });
listC.Add(new string[] { values.Length > 2 ? values[2] : string.Empty });
//or simply
if (values.Length < 3)
continue; //<--ignore, or else add empty values to your in-memory lists
A few points.
You are probably getting the exception because the delimiter used (;) doesn't actually split the string, so values[1] throws an IndexOutOfRangeException. Use the correct delimiter (, is what you need).
If your intention is to generate a new csv (with the same content), why do you need to split the string? Can't you write each line directly to the file (assuming it is , delimited)?
var sb = new StringBuilder();
while (reader.Peek() > -1)
{
var line = reader.ReadLine();
sb.AppendLine(line);
}
// Write to file.
File.WriteAllText(filePath, sb.ToString());
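For the part of the question that is still open (detecting a new date and starting a new column), here is a minimal sketch. It assumes the Time column always has the form "yyyy-MM-dd HH:mm:ss", so the text before the first space identifies the day, and it uses the date itself as the column header instead of day1, day2. DayPivot and Pivot are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DayPivot
{
    // Groups the third column by the date part of the second column and
    // returns CSV lines with one column per day, in order of first appearance.
    public static List<string> Pivot(IEnumerable<string> dataLines)
    {
        var days = new List<string>();
        var valuesByDay = new Dictionary<string, List<string>>();

        foreach (string line in dataLines)
        {
            string[] values = line.Split(',');
            if (values.Length < 3) continue; // skip blank or malformed rows

            string day = values[1].Split(' ')[0]; // "2016-08-30 00:00:00" -> "2016-08-30"
            if (!valuesByDay.TryGetValue(day, out List<string> column))
            {
                column = new List<string>();
                valuesByDay[day] = column;
                days.Add(day);
            }
            column.Add(values[2]);
        }

        // Header row first, then one output row per reading; shorter days are padded.
        var output = new List<string> { string.Join(",", days) };
        int rowCount = days.Count == 0 ? 0 : days.Max(d => valuesByDay[d].Count);
        for (int row = 0; row < rowCount; row++)
        {
            output.Add(string.Join(",", days.Select(d =>
                row < valuesByDay[d].Count ? valuesByDay[d][row] : "")));
        }
        return output;
    }
}
```

With the question's variables this could be used as File.WriteAllLines(saveloc, DayPivot.Pivot(File.ReadLines(varFile).Skip(2))). Note that, unlike a pure streaming copy, this holds all values in memory, since each output row needs one value from every day.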
I have data in a CSV file in the pattern
A,B,C,D,E,F,G
C,F,G,L,K,O,F
a,b,c,d,e,f,g
f,t,s,n,e,K,c
B,F,d,e,t,m,A
I want to store this data in the form:
A,B,C,D
B,C,D,E
C,D,E,F
D,E,F,G
.
.
.
When I try it the way shown below, I am missing one pattern in the middle, e.g. C,D,E,F
Here is my code:
static void Main(string[] args)
{
FileStream fs = new FileStream("studentSheet.csv", FileMode.Open);
StreamReader reader = new StreamReader(fs);
List<string> subline = new List<string>();
string line = "";
while ((line = reader.ReadLine()) != null)
{
string[] splitstring = line.Split(';');
string ft = null;
int i =0;
while(i <( splitstring.Length - 3)+1)
{
ft = splitstring[i] + "," + splitstring[i+1]
+ "," + splitstring[i+2] +","+ splitstring[i+3];
subline.Add(ft);
i = i + 1;
}
}
foreach(string s in subline)
Console.WriteLine(s);
Console.ReadLine();
}
Assuming you're OK with reading it all into one big list called input, and you don't need it to be amazingly fast, you can just do:
// a list of N items has N - 3 windows of 4 consecutive items
List<string> output = Enumerable.Range(0, Math.Max(0, input.Count - 3))
.Select(i => String.Join(",", input.Skip(i).Take(4)))
.ToList();
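The same idea can be written as an explicit loop, which makes the window count easy to check: a window starting at index start is valid only while start + size does not run past the end. This is a minimal sketch with hypothetical names (SlidingWindow, Windows); it takes the already split values of one line.

```csharp
using System;
using System.Collections.Generic;

public static class SlidingWindow
{
    // Returns every run of `size` consecutive items, joined with commas.
    public static List<string> Windows(IReadOnlyList<string> items, int size)
    {
        var result = new List<string>();
        for (int start = 0; start + size <= items.Count; start++)
        {
            var window = new string[size];
            for (int k = 0; k < size; k++)
                window[k] = items[start + k];

            result.Add(string.Join(",", window));
        }
        return result;
    }
}
```

For one line of the file it could be called as SlidingWindow.Windows(line.Split(','), 4); note that the sample data is comma-separated, so splitting on ';' as in the question leaves the whole line in a single element.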
I have a StringBuilder that contains email IDs (thousands of them):
StringBuilder sb = new StringBuilder();
foreach (DataRow dr2 in dtResult.Rows)
{
strtxt = dr2[strMailID].ToString()+";";
sb.Append(strtxt);
}
string filepathEmail = Server.MapPath("Email");
using (StreamWriter outfile = new StreamWriter(filepathEmail + "\\" + "Email.txt"))
{
outfile.Write(sb.ToString());
}
Now the data is stored in the text file like this:
abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;
abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;abc#gmail.com;ab#gmail.com;
But I need to store them so that every row has only 10 email IDs, so that it looks good.
Any idea how to format the data like this in the .txt file? Any help would be great.
Just add a counter in your loop and append a line break after every 10 items.
int counter = 0;
StringBuilder sb = new StringBuilder();
foreach (DataRow dr2 in dtResult.Rows)
{
counter++;
strtxt = dr2[strMailID].ToString()+";";
sb.Append(strtxt);
if (counter % 10 == 0)
{
sb.Append(Environment.NewLine);
}
}
Use a counter and add a line break each tenth item:
StringBuilder sb = new StringBuilder();
int cnt = 0;
foreach (DataRow dr2 in dtResult.Rows) {
sb.Append(dr2[strMailID]).Append(';');
if (++cnt == 10) {
cnt = 0;
sb.AppendLine();
}
}
string filepathEmail = Path.Combine(Server.MapPath("Email"), "Email.txt");
File.WriteAllText(filepathEmail, sb.ToString());
Notes:
Concatenate strings using the StringBuilder instead of first concatenating and then appending.
Use Path.Combine to combine the path and file name, this works on any platform.
You can use the File.WriteAllText method to save the string in a single call instead of writing to a StreamWriter.
As already said, you may add a line break. I also suggest adding a '\t' tab after each address, so the file is tab-delimited and you can import it into Excel, for instance.
Use a counter to keep track of the number of mails already written, like this:
int i = 0;
foreach (string mail in mails) {
var strtxt = mail + ";";
sb.Append(strtxt);
i++;
if (i % 10==0)
sb.AppendLine();
}
Every 10 mails written, i modulo 10 equals 0, so you put an end line in the string builder.
Hope this can help.
Here's an alternate method using LINQ if you don't mind any overheads.
string filepathEmail = Server.MapPath("Email");
using (StreamWriter outfile = new StreamWriter(filepathEmail + "\\" + "Email.txt"))
{
var rows = dtResult.Rows.Cast<DataRow>(); //make the rows enumerable
var lines = from ivp in rows.Select((dr2, i) => new {i, dr2})
group ivp.dr2[strMailID] by ivp.i / 10 into line //group every 10 emails
select String.Join(";", line); //put them into a string
foreach (string line in lines)
outfile.WriteLine(line);
}