How can I utilize StreamWriter to write to a csv file? - c#

So here's what I'm working with. I'm trying to take an XML file, pull the info from the attributes, append them together, and write it to a CSV file. I'm still relatively new to programming, and the other programmer is out of the office today, so I could really use some assistance.
My first question regards the StringBuilder. Do I need an AppendLine at the end of my StringBuilder so that each string output from the foreach loop is on a new line? And would I need to do that inside the foreach loop?
My second question regards actually writing my string to the CSV file. Would it look something like this?
swOutputFile.WriteLine(strAppendedJobData)
And I think this would also go inside the foreach loop, but I'm not too sure.
Thanks for the help, I hope I've worded my question in a somewhat easy to understand manner.
//Create a stream writer to write the data from returned XML job ticket to a new CSV
StreamWriter swOutputFile;
string strComma = ",";
swOutputFile = new StreamWriter(new FileStream("C:\\Dev\\AppendedJobData.csv", FileMode.Create, FileAccess.Write, FileShare.Read));
//Get nodes from returned XML ticket
XmlNodeList xmlJobs = xdResults.SelectNodes("/Updates/Jobs/Job");
//Pull out data from XML attributes
foreach (XmlElement xeJobUpdate in xmlJobs)
{
//Break down the job data
string strProjectID = xeJobUpdate.GetAttribute("SharpOwlProjectID");
string strJobNumber = xeJobUpdate.GetAttribute("JobNumber");
string strClientCode = xeJobUpdate.GetAttribute("SharpOwlClientCode");
string strClient = xeJobUpdate.GetAttribute("Client");
string strVCAOffice = xeJobUpdate.GetAttribute("VCAOffice");
string strLoadStatus = xeJobUpdate.GetAttribute("LoadStatus");
//Build the string to be added to the new CSV file
StringBuilder sbConcatJob = new StringBuilder();
sbConcatJob.Append(strProjectID).Append(strComma).Append(strJobNumber)
.Append(strComma).Append(strClientCode).Append(strComma).Append(strClient).Append(strComma)
.Append(strVCAOffice).Append(strComma).Append(strLoadStatus).Append(strComma);
string strAppendedJobData = sbConcatJob.ToString();
}

If you want to do it a bit more elegantly, you could do something like this:
using(StreamWriter swOutputFile = new StreamWriter(new FileStream("C:\\Dev\\AppendedJobData.csv", FileMode.Create, FileAccess.Write, FileShare.Read)))
{
//Get nodes from returned XML ticket
XmlNodeList xmlJobs = xdResults.SelectNodes("/Updates/Jobs/Job");
//Pull out data from XML attributes
foreach (XmlElement xeJobUpdate in xmlJobs)
{
List<string> lineItems = new List<string>();
lineItems.Add(xeJobUpdate.GetAttribute("SharpOwlProjectID"));
//add all the other attributes the same way
swOutputFile.WriteLine(String.Join(",", lineItems));
}
//no need to close the writer explicitly; the using block disposes it
}
The using statement flushes and closes the file for you, which makes the whole thing much easier.

My first question regards the StringBuilder. Do I need to have an AppendLine at the end of my StringBuilder, so that each string output from the foreach loop is on a new line? And would I need to do that inside the foreach loop?
My only advice, since it appears you have not attempted this, would be to try it. It is the only way you will learn.
swOutputFile.WriteLine(strAppendedJobData)
This would write an entire line of text to a file.

You really have two options here:
If you call sbConcatJob.AppendLine() inside the foreach loop, you can build the contents of the file in one StringBuilder, then call swOutputFile.Write(sbConcatJob.ToString()) outside of the foreach loop to write the file all at once.
If you keep your code as it is now, you can add swOutputFile.WriteLine(sbConcatJob.ToString()) inside the foreach loop and write the file one line at a time.
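For example, the first option could look roughly like this (a sketch using the names from your snippet):
StringBuilder sbConcatJob = new StringBuilder();
foreach (XmlElement xeJobUpdate in xmlJobs)
{
    // append the attributes for one job, separated by commas...
    sbConcatJob.Append(xeJobUpdate.GetAttribute("SharpOwlProjectID")).Append(strComma)
        .Append(xeJobUpdate.GetAttribute("JobNumber"));
    // ...and so on for the remaining attributes, then end the line
    sbConcatJob.AppendLine();
}
// one Write call after the loop emits the whole file at once
swOutputFile.Write(sbConcatJob.ToString());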

Related

Confusion in row and column of DataTable

I am following a tutorial for an inventory stock management system written in C#.
The original csv file is a stock list, which contains four columns:
Item Code, Item Description, Item Count, OnOrder
In the tutorial, the code is generating a DataTable object, which will be used in the GridView demo in the application.
Here is the code:
DataTable dataTable = new DataTable();
dataTable.Columns.Add("Item Code");
dataTable.Columns.Add("Item Description");
dataTable.Columns.Add("Current Count");
dataTable.Columns.Add("On Order");
string CSV_FilePath = "C:/Users/xxxxx/Desktop/stocklist.csv";
StreamReader streamReader = new StreamReader(CSV_FilePath);
string[] rawData = new string[File.ReadAllLines(CSV_FilePath).Length];
rawData = streamReader.ReadLine().Split(',');
while(!streamReader.EndOfStream)
{
rawData = streamReader.ReadLine().Split(',');
dataTable.Rows.Add(rawData[0], rawData[1], rawData[2], rawData[3]);
}
dataGridView1.DataSource = dataTable;
I am assuming that rawData = streamReader.ReadLine().Split(','); splits the file into an array object like this:
["A0001", "Horse on Wheels","5","No"]
["A0002","Elephant on Wheels","2","No"]
In the while loop, it iterates through each line (each array) and assigns each rawData[x] value to the corresponding column.
Is this the right way to understand this code snippet?
Another question is, why do I need to run
rawData = streamReader.ReadLine().Split(',');
in a while loop?
Thanks in advance.
Your code should actually look like this:
DataTable dataTable = new DataTable();
dataTable.Columns.Add("Item Code");
dataTable.Columns.Add("Item Description");
dataTable.Columns.Add("Current Count");
dataTable.Columns.Add("On Order");
string CSV_FilePath = "C:/Users/xxxxx/Desktop/stocklist.csv";
using(StreamReader streamReader = new StreamReader(CSV_FilePath))
{
// Skip the header row
streamReader.ReadLine();
while(!streamReader.EndOfStream)
{
string[] rawData = streamReader.ReadLine().Split(','); // read a row and split it into cells
dataTable.Rows.Add(rawData[0], rawData[1], rawData[2], rawData[3]); // add the elements from each cell as a row in the datatable
}
}
dataGridView1.DataSource = dataTable;
Changes I've made:
We've added a using block around StreamReader to ensure that the file handle is only open for as long as we need to read the file.
We now only read the file once, not twice.
Since we only need the rawData in the scope of the while loop, I've moved it into the loop.
Explaining what's wrong:
The following line reads the entire file, and then counts how many rows are in it. With this information, we initialize an array with as many positions as there are rows in the file. This means for a 500 row file, you can access positions rawData[0], rawData[1], ... rawData[499].
string[] rawData = new string[File.ReadAllLines(CSV_FilePath).Length];
With the next line of code you discard that array, and instead take the cells from the top of the file (the headers):
rawData = streamReader.ReadLine().Split(',');
This line states "read a single line from the file, and split it by comma". You then assign that result to rawData, replacing its old value. So the reason you need this again in the loop is because you're interested in more than the first row of the file.
Then the loop goes through each remaining row in the file, replacing rawData with the cells from that row. Finally, you add each row to the DataTable:
rawData = streamReader.ReadLine().Split(',');
dataTable.Rows.Add(rawData[0], rawData[1], rawData[2], rawData[3]);
Note that File.ReadAllLines(...) reads the entire file into memory as an array of strings. You're also using StreamReader to read through the file line-by-line, meaning that you are reading the entire file twice. This is not very efficient and you should avoid this where possible. In this case, we didn't need to do that at all.
Also note that your approach to reading a CSV file is fairly naïve. Depending on the software used to create them, some CSV files have cells that span more than one line, some include quoted sections for text, and sometimes those quoted sections include commas, which would throw off your split code. Your code also doesn't handle the possibility of a badly formatted file where a row has fewer cells than expected, or a trailing empty row at the end of the file. Generally it's better to use a dedicated CSV parser such as CsvHelper rather than trying to roll your own.
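For illustration, a minimal CsvHelper sketch might look like the following. The Stock class and its attribute mappings are assumptions made to match the headers above, and the API shown is from recent CsvHelper versions:
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration.Attributes;

public class Stock
{
    [Name("Item Code")] public string ItemCode { get; set; }
    [Name("Item Description")] public string ItemDescription { get; set; }
    [Name("Item Count")] public int ItemCount { get; set; }
    [Name("OnOrder")] public string OnOrder { get; set; }
}

// CsvHelper handles quoting, embedded commas, and multi-line cells for you.
using (var reader = new StreamReader(CSV_FilePath))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<Stock>().ToList();
}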

Is it possible to efficiently add elements into an XML document on-the-fly?

What I mean by that title is that I am attempting to construct an XML file as records are being read from a database. In a way, I guess you'd say I am doing a manual version of DataSet.WriteToXML(). But I need to control the data as it flows through, hence the manual writing of records.
My problem is that no matter what I've tried to do with XDocument, with and without Streams, I find that the elements I add are simply piling up in memory. Even when I try to "batch" the records and call an occasional XDocument.Save(), I still wind up with an OutOfMemoryException at some point.
So first off, I am wondering if there is any such thing as "read a record, write a record"...a straight-line pass-through from a DataTable row to an XML node/sub-nodes (a record in XML). Though I suspect that any "yes" answer would come with a major performance penalty from all the I/O it would involve.
My latest iteration (there have been too many to list):
XDocument xdoc = null;
var settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;
using (var fs = new FileStream(tmpXml, FileMode.Open, FileAccess.ReadWrite))
xdoc = XDocument.Load(XmlReader.Create(fs));
foreach (DataTable dt in ds.Tables)
{
using (var reader = sql.ExecuteReader(String.Format(@"SELECT * FROM ""{0}""", dt.TableName)))
{
if (reader.HasRows)
{
while (reader.Read())
{
var tableElement = new XElement(dt.TableName);
xdoc.Root.Add(tableElement);
foreach (DataColumn dc in dt.Columns)
{
if (dt.TableName.EndsWith("_Option") && reader.GetOrdinal(dc.ColumnName) == 13)
tableElement.Add(new XElement(dc.ColumnName, ""));
else
tableElement.Add(new XElement(dc.ColumnName, reader[dc.ColumnName]));
}
}
}
}
}
So the basic question is, what's the most efficient way to take the rows I read, and get them into the physical XML file as quickly as possible, preventing the XDocument from piling up data in memory?
EDIT: I should point out that the XML file exists already, with only the table schema written to it. The goal here is to open that file, write the actual data records to it, and save it.
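One forward-only way to get "read a record, write a record" behavior is to stream through an XmlWriter instead of an XDocument, so nothing accumulates in memory. This is only a sketch reusing the names from the snippet above; it writes a fresh file with an assumed root element rather than appending to the existing schema file:
var settings = new XmlWriterSettings { OmitXmlDeclaration = true, Indent = true };
using (var writer = XmlWriter.Create(tmpXml, settings))
{
    writer.WriteStartElement("Root"); // assumed root element name
    foreach (DataTable dt in ds.Tables)
    {
        using (var reader = sql.ExecuteReader(String.Format(@"SELECT * FROM ""{0}""", dt.TableName)))
        {
            while (reader.Read())
            {
                // each record goes straight to the stream instead of piling up in an XDocument
                writer.WriteStartElement(dt.TableName);
                foreach (DataColumn dc in dt.Columns)
                    writer.WriteElementString(dc.ColumnName, Convert.ToString(reader[dc.ColumnName]));
                writer.WriteEndElement();
            }
        }
    }
    writer.WriteEndElement();
}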

Shortest way to save DataTable to Textfile

I found a few answers for this, but they were all horribly long with lots of iterations, so I came up with my own solution:
Convert table to string:
string myTableAsString =
String.Join(Environment.NewLine, myDataTable.Rows.Cast<DataRow>().
Select(r => r.ItemArray).ToArray().
Select(x => String.Join("\t", x.Cast<string>())));
Then simply save string to text file, for example:
StreamWriter myFile = new StreamWriter("fileName.txt");
myFile.WriteLine(myTableAsString);
myFile.Close();
Is there a shorter / better way?
Since your DataTable is named myDataTable, you can add it to a DataSet like this:
var dataSet = new DataSet();
dataSet.Tables.Add(myDataTable);
// Write dataset to xml file or stream
dataSet.WriteXml("filename.xml");
And you can also read from xml file or stream:
dataSet.ReadXml("filename.xml");
@Leonardo, sorry but I can't comment, so I'm posting as an answer.
Sometimes you can ask the DataSet for its rows and work with them directly, like this:
foreach (DataRow row in ds.Tables[0].Rows)
{
foreach (object item in row.ItemArray)
{
myStreamWriter.Write((string)item + "\t");
}
myStreamWriter.WriteLine();
}
That's another way, but I don't know which one will give you better performance.
If you consider XML as text, you can do myDataTable.WriteXml("mydata.xml") and myDataTable.ReadXml("mydata.xml").
You get an error unless you save it with the schema:
myDataTable.WriteXml("myXmlPath.xml", XmlWriteMode.WriteSchema);
myDataTable.ReadXml("myXmlPath.xml");
There is more info on saving/loading with schema here:
DataTable does not support schema inference from Xml.?

Determine if input file is usable by program

I have a C# program that looks through directories for .txt files and loads each into a DataTable.
static IEnumerable<string> ReadAsLines(string fileName)
{
using (StreamReader reader = new StreamReader(fileName))
while (!reader.EndOfStream)
yield return reader.ReadLine();
}
public DataTable GetTxtData()
{
IEnumerable<string> reader = ReadAsLines(this.File);
DataTable txtData = new DataTable();
string[] headers = reader.First().Split('\t');
foreach (string columnName in headers)
txtData.Columns.Add(columnName);
IEnumerable<string> records = reader.Skip(1);
foreach (string rec in records)
txtData.Rows.Add(rec.Split('\t'));
return txtData;
}
This works great for regular tab-delimited files. However, the catch is that not every .txt file in the folders I need to use contains tab-delimited data. Some .txt files are actually SQL queries, notes, etc. that have been saved as plain text files, and I have no way of determining that beforehand. Trying to use the above code on such files clearly won't lead to the expected result.
So my question is this: How can I tell whether a .txt file actually contains tab-delimited data before I try to read it into a DataTable using the above code?
Just searching the file for any tab character won't work because, for example, a SQL query saved as plain text might have tabs for code formatting.
Any guidance here at all would be much appreciated!
If each line should contain the same number of elements, then simply read each line and verify that you get the correct number of fields in each record. If not, error out.
if (headers.Count() != CORRECTNUMBER)
{
// ERROR
}
foreach (string rec in records)
{
string[] recordData = rec.Split('\t');
if (recordData.Count() != headers.Count())
{
// ERROR
}
txtData.Rows.Add(recordData);
}
To do this you need a set of "signature" logic providers which can check a given sample of the file for "signature" content. This is similar to how virus scanners work.
Consider creating an ISignature interface implemented by a set of classes:
interface ISignature
{
    enumFileType Evaluate(IEnumerable<byte> inputStream);
}
class TSVFile : ISignature
{
    public enumFileType Evaluate(IEnumerable<byte> inputStream) { /* check for tab-delimited content */ }
}
class SQLFile : ISignature
{
    public enumFileType Evaluate(IEnumerable<byte> inputStream) { /* check for SQL keywords */ }
}
each one would read an appropriate number of bytes in and return the known file type, if it can be evaluated. Each file parser would need its own logic to determine how many bytes to read and on what basis to make its evaluation.
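As a sketch of what a fuller version of the TSVFile stub above could look like, a tab-delimited check might decode a sample and verify that every complete line carries the same nonzero number of tabs. The enumFileType members used here are assumptions:
class TSVFile : ISignature
{
    public enumFileType Evaluate(IEnumerable<byte> inputStream)
    {
        // Decode a sample of the file and break it into lines.
        string sample = Encoding.UTF8.GetString(inputStream.Take(4096).ToArray());
        string[] lines = sample.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        if (lines.Length < 2)
            return enumFileType.Unknown; // assumed enum member

        // Ignore the last line, which the sample may have cut off mid-row.
        var fullLines = lines.Take(lines.Length - 1).ToArray();
        int tabs = fullLines[0].Count(c => c == '\t');

        // Tab-delimited data should have the same nonzero tab count on every line.
        bool consistent = tabs > 0 && fullLines.All(l => l.Count(c => c == '\t') == tabs);
        return consistent ? enumFileType.TabDelimited : enumFileType.Unknown; // assumed enum members
    }
}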

Print an array/list to excel in c#

I am able to save a single value into excel but I need help to save a full list/array into an excel sheet.
Code I have so far:
var MovieNames = session.Query<Movie>()
.ToArray();
List<string> MovieList = new List<string>();
foreach (var movie in MovieNames)
{
MovieList.Add(movie.MovieName);
}
//If I want to print a single value or a string,
//I can use the following to print/save to excel
// How can I do this if I want to print that entire
//list that's generated in "MovieList"?
return File(new System.Text.UTF8Encoding().GetBytes(MovieList), "text/csv", "demo.csv");
You could use FileHelpers to serialize some strongly typed object into CSV. Just promise me to never roll your own CSV parser.
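As a rough sketch of that suggestion (the MovieRecord type is an assumption, not part of the question; FileHelpers maps public fields via attributes):
using FileHelpers;

[DelimitedRecord(",")]
public class MovieRecord
{
    [FieldQuoted] // quotes values that contain commas or quotes
    public string MovieName;
}

// Build records from the movie names and serialize them in one call.
var engine = new FileHelperEngine<MovieRecord>();
var records = MovieList.Select(m => new MovieRecord { MovieName = m }).ToArray();
string csv = engine.WriteString(records);
return File(new System.Text.UTF8Encoding().GetBytes(csv), "text/csv", "demo.csv");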
If you mean you want to create a .csv file with all movie names in one column so you can open it in Excel, then simply loop over it:
byte[] content;
using (var ms = new MemoryStream())
{
using (var writer = new StreamWriter(ms))
{
foreach (var movieName in MovieList)
writer.WriteLine(movieName);
}
content = ms.ToArray();
}
return File(content, "text/csv", "demo.csv");
Edit
You can add more columns and get fancier with your output, but then you run into the problem that you have to check for special characters which need escaping (like , and "). If you want to do more than just simple output, then I suggest you follow @Darin's suggestion and use the FileHelpers utilities. If you can't or don't want to use them, then this article has an implementation of a csv writer.
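For the simple cases, a small helper along these lines (a sketch of RFC 4180-style quoting, not from the linked article) covers the common escaping rules:
static string EscapeCsvField(string field)
{
    // Quote the field if it contains a delimiter, a quote, or a line break,
    // doubling any embedded quotes.
    if (field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) >= 0)
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    return field;
}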
