I know there is more similar question but I was not able to find the answer to mine. I have two CSV files. Both files contain image metadata for the same images, however, the first file image IDs are outdated. So I need to take the IDs from the second file and replace outdated IDs with new ones. I was thinking to compare image Longitude, Latitude, and Altitude rows values, and where it matches in both files I take image id from the second file. The IDs would be used in the new object. And the sequence of lines in files is different and the first file contains more lines than the second one.
The files structure looks as follows:
First file:
ImgID,Longitude,Latitude,Altitude
01,44.7282372307,27.5786807185,14.1536407471
02,44.7287939869,27.5777060219,13.2340240479
03,44.7254687824,27.582636255,16.5887145996
04,44.7254294913,27.5826908925,16.5794525146
05,44.728785278,27.5777185252,13.2553100586
06,44.7282279311,27.5786933339,14.1576690674
07,44.7253847039,27.5827526969,16.6026000977
08,44.7287777782,27.5777295052,13.2788238525
09,44.7282196988,27.5787045314,14.1649169922
10,44.7253397041,27.5828151049,16.6300048828
11,44.728769439,27.5777417846,13.3072509766
Second file:
ImgID,Longitude,Latitude,Altitude
5702,44.7282372307,27.5786807185,14.1536407471
5703,44.7287939869,27.5777060219,13.2340240479
5704,44.7254687824,27.582636255,16.5887145996
5705,44.7254294913,27.5826908925,16.5794525146
5706,44.728785278,27.5777185252,13.2553100586
5707,44.7282279311,27.5786933339,14.1576690674
How this can be done in C#? Is there is some handy library to work with?
I would use the CSVHelper library for CSV read/write as it is a complete nice library. For this, you should declare a class to hold your data, and its property names must match your CSV file's column names.
public class ImageData
{
public int ImgID { get; set; }
public double Longitude { get; set; }
public double Latitude { get; set; }
public double Altitude { get; set; }
}
Then to see if two lines are equal, what you need to do is see if each property in each line in one file matches the other. You could do this by simply comparing properties, but I'd rather write a comparer for this, like so:
public class ImageDataComparer : IEqualityComparer<ImageData>
{
public bool Equals(ImageData x, ImageData y)
{
return (x.Altitude == y.Altitude && x.Latitude == y.Latitude && x.Longitude == y.Longitude);
}
public int GetHashCode(ImageData obj)
{
unchecked
{
int hash = (int)2166136261;
hash = (hash * 16777619) ^ obj.Altitude.GetHashCode();
hash = (hash * 16777619) ^ obj.Latitude.GetHashCode();
hash = (hash * 16777619) ^ obj.Longitude.GetHashCode();
return hash;
}
}
}
Simple explanation is that we override the Equals() method and dictate that two instances of ImageData class are equal if the three property values are matching. I will show the usage in a bit.
The CSV read/write part is pretty easy (the library's help page has some good examples and tips, please read it). I can write two methods for reading and writing like so:
public static List<ImageData> ReadCSVData(string filePath)
{
List<ImageData> records;
using (var reader = new StreamReader(filePath))
{
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
csv.Configuration.HasHeaderRecord = true;
records = csv.GetRecords<ImageData>().ToList();
}
}
return records;
}
public static void WriteCSVData(string filePath, List<ImageData> records)
{
using (var writer = new StreamWriter(filePath))
{
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteRecords(records);
}
}
}
You can actually write generic <T> read/write methods so the two methods are usable with different classes, if that's something useful for you.
Next is the crucial part. First, read the two files to memory using the methods we just defined.
var oldData = ReadCSVData(Path.Combine(Directory.GetCurrentDirectory(), "OldFile.csv"));
var newData = ReadCSVData(Path.Combine(Directory.GetCurrentDirectory(), "NewFile.csv"));
Now, I can go through each line in the 'old' data, and see if there's a corresponding record in 'new' data. If so, I grab the ID from the new data and replace the ID of old data with it. Notice the usage of the comparer we wrote.
foreach (var line in oldData)
{
var replace = newData.FirstOrDefault(x => new ImageDataComparer().Equals(x, line));
if (replace != null && replace.ImgID != line.ImgID)
{
line.ImgID = replace.ImgID;
}
}
Next, simply overwrite the old data file.
WriteCSVData(Path.Combine(Directory.GetCurrentDirectory(), "OldFile.csv"), oldData);
Results
I'm using a simplified version of your data to easily verify our results.
Old Data
ImgID,Longitude,Latitude,Altitude
1,1,2,3
2,2,3,4
3,3,4,5
4,4,5,6
5,5,6,7
6,6,7,8
7,7,8,9
8,8,9,10
9,9,10,11
10,10,11,12
11,11,12,13
New Data
ImgID,Longitude,Latitude,Altitude
5702,1,2,3
5703,2,3,4
5704,3,4,5
5705,4,5,6
5706,5,6,7
5707,6,7,8
Now our expected results should be that the first 6 lines of the old files should have the ids updated, and that's what we get:
Updated Old Data
ImgID,Longitude,Latitude,Altitude
5702,1,2,3
5703,2,3,4
5704,3,4,5
5705,4,5,6
5706,5,6,7
5707,6,7,8
7,7,8,9
8,8,9,10
9,9,10,11
10,10,11,12
11,11,12,13
An alternate way to do it, if for some reason you didn't want to use the CSVHelper, is to write a method that compares two lines of data and determines if they're equal (by ignoring the first column data):
public static bool DataLinesAreEqual(string first, string second)
{
if (first == null || second == null) return false;
var xParts = first.Split(',');
var yParts = second.Split(',');
if (xParts.Length != 4 || yParts.Length != 4) return false;
return xParts.Skip(1).SequenceEqual(yParts.Skip(1));
}
Then we can read all the lines from both files into arrays, and then we can update our first file lines with those from the second file if our method says they're equal:
var csvPath1 = #"c:\temp\csvData1.csv";
var csvPath2 = #"c:\temp\csvData2.csv";
// Read lines from both files
var first = File.ReadAllLines(csvPath1);
var second = File.ReadAllLines(csvPath2);
// Select the updated line where necessary
var updated = first.Select(f => second.FirstOrDefault(s => DataLinesAreEqual(f, s)) ?? f);
// Write the updated result back to the first file
File.WriteAllLines(csvPath1, updated);
Related
I want to be able to filter out a CSV file and perform data validation on the filtered data. I imagine for loops, but the file has 2 million cells and it would take a long time. I am using Lumenworks CSVReader for accessing the file using C#.
I found this method csvfile.Where<> but I have no idea what to put in the parameters. Sorry I am still new to coding as well.
[EDIT] This is my code for loading the file. Thanks for all the help!
//Creating C# table from CSV data
var csvTable = new DataTable();
var csvReader = new CsvReader(newStreamReader(System.IO.File.OpenRead(filePath[0])), true);
csvTable.Load(csvReader);
//grabs header from the CSV data table
string[] headers = csvReader.GetFieldHeaders(); //this method gets the headers of the CSV file
string filteredData[] = csvReader.Where // this is where I would want to implement the where method, or some sort of way to filter the data
//I can access the rows and columns with this
csvTable.Rows[0][0]
csvTable.Columns[0][0]
//After filtering (maybe even multiple filters) I want to add up all the filtered data (assuming they are integers)
var dataToValidate = 0;
foreach var data in filteredData{
dataToValidate += data;
}
if (dataToValidate == 123)
//data is validated
I would read some of the documentation for the package you are using:
https://github.com/phatcher/CsvReader
https://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader
To specifically answer the filtering question, so it only contains the data you are searching for consider the following:
var filteredData = new List<List<string>>();
using (CsvReader csv = new CsvReader(new StreamReader(System.IO.File.OpenRead(filePath[0])), true));
{
string searchTerm = "foo";
while (csv.ReadNextRecord())
{
var row = new List<string>();
for (int i = 0; i < csv.FieldCount; i++)
{
if (csv[i].Contains(searchTerm))
{
row.Add(csv[i]);
}
}
filteredData.Add(row);
}
}
This will give you a list of a list of string that you can enumerate over to do your validation
int dataToValidate = 0;
foreach (var row in filteredData)
{
foreach (var data in row)
{
// do the thing
}
}
--- Old Answer ---
Without seeing the code you are using to load the file, it might be a bit difficult to give you a full answer, ~2 Million cells may be slow no matter what what.
Your .Where comes from System.Linq
https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.where?view=net-6.0
A simple example using .Where
//Read the file and return a list of strings that match the where clause
public List<string> ReadCSV()
{
List<string> data = File.ReadLines(#"C:\Users\Public\Documents\test.csv");
.Select(line => line.Split(','))
// token[x] where x is the column number, assumes ID is column 0
.Select(tokens => new CsvFileStructure { Id = tokens[0], Value = tokens[1] })
// Where filters based on whatever you are looking for in the CSV
.Where(csvFileStructure => csvFileStructure.Id == "1")
.ToList();
return data;
}
// Map of your data structure
public class CsvFileStructure
{
public long Id { get; set; }
public string Name { get; set; }
public string Value { get; set; }
}
Modified from this answer:
https://stackoverflow.com/a/10332737/7366061
There is no csvreader.Where method. The "where" is part of Linq in C#. The link below shows an example of computing columns in a csv file using Linq:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/how-to-compute-column-values-in-a-csv-text-file-linq
I'm trying to make a basic login for my console app. I store the user data in a .txt file like this:
ID;Name;IsAdmin. The txt has several lines.
In the app I want to store user data in a struct User array. I can't seem to find the method to read the file, split and put the different data to the right place. This is what I have so far:
Loading user data to struct array
public static void LoadIDs()
{
int entries = FileHandling.CountRows(usersPath);
User[] users = new User[entries]; //Length depends on how many lines are in the .txt
for (int i = 0; i < users.Length; i++)
{
users[i] = new User(1,"a",false); //ID(int), name, isAdmin [This is where I want to put the data from the .txt]
}
}
Reading and spliting the text
public static string ReadFileToArray(string path)
{
String input = File.ReadAllText(path);
foreach (var record in input.Split('\n'))
{
foreach (var data in record.Split(';'))
{
return data;
}
}
return null;
}
I know that this doesn't work at all this way but my knowledge is limited yet and I cannot think of other solutions.
You have a better tool to store your users. Instead of an array (that forces you to know the length of the data loaded) you can use a List where you can add your elements while you read them.
Another point to change is the File.ReadAllText in File.ReadLines. This will allow to read line by line your file directly in the loop
public List<User> BuildUserList(string path)
{
List<User> result = new List<User>();
foreach (var record in File.ReadLines(path)
{
string[] data = record.Split(';'))
User current = new User();
current.ID = Convert.ToInt32(data[0]);
current.Name = data[1];
current.IsAdmin = Convert.ToBoolean(data[2]);
result.Add(current);
}
return result;
}
Now you can use the list like an array if you need
List<User> users = BuildUserList("yourfile.txt");
if(users.Count > 0)
{
Console.WriteLine("Name=" + users[0].Name);
}
If I were to assume your file especially each line having Id;Name;Admin values, I would write something like below to extract it. Please note that there are simple syntax out there but following logic will be helpful for beginners to understand how this could be achieved.
List<User> userList = new List<User>();
// Read the file located at c:\test.txt (this might be different in your case)
System.IO.StreamReader file = new System.IO.StreamReader(#"c:\test.txt");
string line;
while ((line = file.ReadLine()) != null)
{
//following logic will read each line and split by the separator before
// creating a new User instance. Remember to add more defensive logic to
// cover all cases
var extract = line.Split(';');
userList.Add(new User()
{
Id = extract[0],
Name = extract[1],
IsAdmin = extract[2]
});
}
file.Close();
//at this stage you will have List of User and converting it to array using following call
var userArray = userList.ToArray();
And just as another variant, a linq-solution could look like this:
var users = (
from string line in System.IO.File.ReadAllLines(#"..filepath..")
let parts = line.Split(';')
where parts.Length == 3
select new User() {
ID = Convert.ToInt32(parts[0]),
Name = parts[1],
IsAdmin = Convert.ToBoolean(parts[2])}
).ToArray();
This can be elegant and short, error handling may be a bit more difficult.
This will read your file lazily, so it can handle extremely huge files with ease (assuming the rest of your code can):
public IEnumerable<User> ReadUsers(string path)
{
return File.ReadLines(path)
.Select(l=>l.Split(';'))
.Select(l=> new User
{
Id = int.Parse(l[0]),
Name = l[1],
IsAdmin = bool.Parse(l[2])
});
}
or
public IEnumerable<User> ReadUsers(string path)
{
return File.ReadLines(path)
.Select(l=>l.Split(';'))
.Select(l=> new User(int.Parse(l[0]), l[1], bool.Parse(l[2])));
}
I'm trying to learn some C# here. My goal is to create and write on multiple custom files which name varies based on a part of the string to be written. Below some examples:
Let's say strings to be written are basically rows of a csv file:
2019-10-28 16:14:14;;15.5;0;;3;false;false;0;111;123;;;10;false;1;2.5;;;;0;
2019-10-28 16:13:11;;18;0;;1;false;false;222;333;123;;;10;false;1;1;;;;0;G
2019-10-29 16:13:11;;18;0;;3;false;false;true;
As you may notice, first field of each string is a date, and that's and that is the key field to choose the name of the file to write to.
First two fields have same date, so both strings will be printed on a single file, the third one in a second file since it has different date.
Expected Result:
First File:
2019-10-28 16:14:14;;15.5;0;;3;false;false;0;111;123;;;10;false;1;2.5;;;;0;
2019-10-28 16:13:11;;18;0;;1;false;false;222;333;123;;;10;false;1;1;;;;0;
Second File:
2019-10-29 16:13:11;;18;0;;3;false;false;true;
Now I have multiple rows like those, and I'd like to print them on different files based on their first value.
I managed to create a class which might represent each row:
class Value {
public DateTime date = DateTime.Now;
public decimal cod = 0;
public decimal quantity = 0;
public decimal price = 0;
//Other irrelevant fields
}
And I also tried to develop a method to write a single Value on given File:
private static void WriteValue(Value content, string folder, string fileName) {
using(StreamWriter writer = new StreamWriter(Path.Combine(folder, fileName), true, Encoding.ASCII)) {
writer.Write(content.dataora.ToString("yyyyMMdd"));
writer.Write("0000");
writer.Write("I");
writer.Write("C");
writer.Write(content.codpro.ToString().PadLeft(14, '0'));
writer.Write(Convert.ToInt64(content.qta * 100).ToString().PadLeft(8, '0'));
writer.WriteLine();
}
}
And a Method to write Values them into files
static void WriteValues(List<Value> fileContent) {
//Once I got all Values of File in a List of Values, I try to write them in files
}
if(fileContent.Count > 0) {
foreach(Value riga in fileContent) {
//Temp Dates, used to compare previous Date in order to know if I have to write Value in new File or it can be written on same File
string dataTemp = riga.dataora.ToString("yyyy-MM-dd");
string lastData = string.Empty;
string FileName = "ordinivoa_999999." + DateTime.Now.ToString("yyMMddHHmmssfff");
//If lastData is Empty we are writing first value
if (string.IsNullOrEmpty(lastData)) {
WriteValue(riga, toLinfaFolder, FileName);
lastData = dataTemp;
}
//Else if lastData is equal as last managed date we write on same file
else if (lastData == dataTemp) {
WriteValue(riga, toLinfaFolder, FileName);
}
else {
//Else current date of Value is new, so we write it in another file
string newFileName = "ordinivoa_999999." + DateTime.Now.AddMilliseconds(1).ToString("yyMMddHHmmssfff");
WriteValue(riga, toLinfaFolder, newFileName);
lastData = dataTemp;
}
}
}
}
My issue is method above has strange behavior, writes first equal dates on a single file, which is good, but writes all other values in a single file, even if we have different dates.
How to make sure each value gets printed on in a single file only if has same date value?
You can group equal dates easily with a LINQ query
private static void WriteValues(List<Value> fileContent)
{
var dateGroups = fileContent
.GroupBy(v => $"ordinivoa_999999.{v.date:yyMMddHHmmssfff}");
foreach (var group in dateGroups) {
string path = Path.Combine(toLinfaFolder, group.Key);
using (var writer = new StreamWriter(path, true, Encoding.ASCII)) {
foreach (Value item in group) {
//TODO: write item to file
writer.WriteLine(...
}
}
}
}
Since a DateTime stores values in units of one ten-millionth of a second, two dates looking equal once formatted, might still be different. So I suggest grouping on the filename to avoid this effect. I used string interpolation to create and format the file name.
Don't open and close the file for each text line.
At the top of your code file you need a
using System.Linq;
You are on the right path declaring a class, but you're also doing a whole bunch of unnecessary stuff. Using LINQ this can be simplified by a great deal.
First I define a class, and since all you want to do is write each record, I would use a DateTime field, and a string field for the entire raw record.
class MyRecordOfSomeType
{
public DateTime Date { get; set; }
public string RawData { get; set; }
}
The DateTime filed is so that it'll come in handy when you're doing LINQ.
Now we iterate through your data, split using ;, then create your class instance list.
var data = new List<string>()
{
"2019-10-28 16:14:14;;15.5;0;;3;false;false;0;111;123;;;10;false;1;2.5;;;;0;",
"2019-10-28 16:13:11;;18;0;;1;false;false;222;333;123;;;10;false;1;1;;;;0;G",
"2019-10-29 16:13:11;;18;0;;3;false;false;true;"
};
var records = new List<MyRecordOfSomeType>();
foreach (var item in data)
{
var parts = item.Split(';');
DateTime.TryParse(parts[0], out DateTime result);
var rec = new MyRecordOfSomeType() { Date = result, RawData = item };
records.Add(rec);
}
Then we group by date. Note that it's important to group by the Date component of the DateTime structure, otherwise it will consider the Time component as well and you'll have more files than you need.
var groups = records.GroupBy(x => x.Date.Date);
Finally, iterate your groups, and write contents of each group to a new file.
foreach (var group in groups)
{
var fileName = string.Format("ordinivoa_999999_{0}.csv", group.Key.ToString("yyMMddHHmmssfff"));
File.WriteAllLines(fileName, group.Select(x => x.RawData));
}
So I've been reading that I shouldn't write my own CSV reader/writer, so I've been trying to use the CsvHelper library installed via nuget. The CSV file is a grey scale image, with the number of rows being the image height and the number columns the width. I would like to read the values row-wise into a single List<string> or List<byte>.
The code I have so far is:
using CsvHelper;
public static List<string> ReadInCSV(string absolutePath)
{
IEnumerable<string> allValues;
using (TextReader fileReader = File.OpenText(absolutePath))
{
var csv = new CsvReader(fileReader);
csv.Configuration.HasHeaderRecord = false;
allValues = csv.GetRecords<string>
}
return allValues.ToList<string>();
}
But allValues.ToList<string>() is throwing a:
CsvConfigurationException was unhandled by user code
An exception of type 'CsvHelper.Configuration.CsvConfigurationException' occurred in CsvHelper.dll but was not handled in user code
Additional information: Types that inherit IEnumerable cannot be auto mapped. Did you accidentally call GetRecord or WriteRecord which acts on a single record instead of calling GetRecords or WriteRecords which acts on a list of records?
GetRecords is probably expecting my own custom class, but I'm just wanting the values as some primitive type or string. Also, I suspect the entire row is being converted to a single string, instead of each value being a separate string.
According to #Marc L's post you can try this:
public static List<string> ReadInCSV(string absolutePath) {
List<string> result = new List<string>();
string value;
using (TextReader fileReader = File.OpenText(absolutePath)) {
var csv = new CsvReader(fileReader);
csv.Configuration.HasHeaderRecord = false;
while (csv.Read()) {
for(int i=0; csv.TryGetField<string>(i, out value); i++) {
result.Add(value);
}
}
}
return result;
}
If all you need is the string values for each row in an array, you could use the parser directly.
var parser = new CsvParser( textReader );
while( true )
{
string[] row = parser.Read();
if( row == null )
{
break;
}
}
http://joshclose.github.io/CsvHelper/#reading-parsing
Update
Version 3 has support for reading and writing IEnumerable properties.
The whole point here is to read all lines of CSV and deserialize it to a collection of objects. I'm not sure why do you want to read it as a collection of strings. Generic ReadAll() would probably work the best for you in that case as stated before. This library shines when you use it for that purpose:
using System.Linq;
...
using (var reader = new StreamReader(path))
using (var csv = new CsvReader(reader))
{
var yourList = csv.GetRecords<YourClass>().ToList();
}
If you don't use ToList() - it will return a single record at a time (for better performance), please read https://joshclose.github.io/CsvHelper/examples/reading/enumerate-class-records
Please try this. This had worked for me.
TextReader reader = File.OpenText(filePath);
CsvReader csvFile = new CsvReader(reader);
csvFile.Configuration.HasHeaderRecord = true;
csvFile.Read();
var records = csvFile.GetRecords<Server>().ToList();
Server is an entity class. This is how I created.
public class Server
{
private string details_Table0_ProductName;
public string Details_Table0_ProductName
{
get
{
return details_Table0_ProductName;
}
set
{
this.details_Table0_ProductName = value;
}
}
private string details_Table0_Version;
public string Details_Table0_Version
{
get
{
return details_Table0_Version;
}
set
{
this.details_Table0_Version = value;
}
}
}
You are close. It isn't that it's trying to convert the row to a string. CsvHelper tries to map each field in the row to the properties on the type you give it, using names given in a header row. Further, it doesn't understand how to do this with IEnumerable types (which string implements) so it just throws when it's auto-mapping gets to that point in testing the type.
That is a whole lot of complication for what you're doing. If your file format is sufficiently simple, which yours appear to be--well known field format, neither escaped nor quoted delimiters--I see no reason why you need to take on the overhead of importing a library. You should be able to enumerate the values as needed with System.IO.File.ReadLines() and String.Split().
//pseudo-code...you don't need CsvHelper for this
IEnumerable<string> GetFields(string filepath)
{
foreach(string row in File.ReadLines(filepath))
{
foreach(string field in row.Split(',')) yield return field;
}
}
static void WriteCsvFile(string filename, IEnumerable<Person> people)
{
StreamWriter textWriter = File.CreateText(filename);
var csvWriter = new CsvWriter(textWriter, System.Globalization.CultureInfo.CurrentCulture);
csvWriter.WriteRecords(people);
textWriter.Close();
}
I have a class as follows :
public class Test
{
public int Id {get;set;}
public string Name { get; set; }
public string CreatedDate {get;set;}
public string DueDate { get; set; }
public string ReferenceNo { get; set; }
public string Parent { get; set; }
}
and I have a list of Test objects
List<Test>testobjs=new List();
Now I would like to convert it into csv in following format:
"1,John Grisham,9/5/2014,9/5/2014,1356,0\n2,Stephen King,9/3/2014,9/9/2014,1367,0\n3,The Rainmaker,4/9/2014,18/9/2014,1";
I searched for "Converting list to csv c#" and I got solutions as follows:
string.Join(",", list.Select(n => n.ToString()).ToArray())
But this will not put the \n as needed i.e for each object
Is there any fastest way other than string building to do this? Please help...
Use servicestack.text
Install-Package ServiceStack.Text
and then use the string extension methods ToCsv(T)/FromCsv()
Examples:
https://github.com/ServiceStack/ServiceStack.Text
Update:
Servicestack.Text is now free also in v4 which used to be commercial. No need to specify the version anymore! Happy serializing!
Because speed was mentioned in the question, my interest was piqued on just what the relative performances might be, and just how fast I could get it.
I know that StringBuilder was excluded, but it still felt like probably the fastest, and StreamWriter has of course the advantage of writing to either a MemoryStream or directly to a file, which makes it versatile.
So I knocked up a quick test.
I built a list half a million objects identical to yours.
Then I serialized with CsvSerializer, and with two hand-rolled tight versions, one using a StreamWriter to a MemoryStream and the other using a StringBuilder.
The hand rolled code was coded to cope with quotes but nothing more sophisticated. This code was pretty tight with the minimum I could manage of intermediate strings, no concatenation... but not production and certainly no points for style or flexibility.
But the output was identical in all three methods.
The timings were interesting:
Serializing half a million objects, five runs with each method, all times to the nearest whole mS:
StringBuilder 703 734 828 671 718 Avge= 730.8
MemoryStream 812 937 874 890 906 Avge= 883.8
CsvSerializer 1,734 1,469 1,719 1,593 1,578 Avge= 1,618.6
This was on a high end i7 with plenty of RAM.
Other things being equal, I would always use the library.
But if a 2:1 performance difference became critical, or if RAM or other issues turned out to exaggerate the difference on a larger dataset, or if the data were arriving in chunks and was to be sent straight to disk, I might just be tempted...
Just in case anyone's interested, the core of the code (for the StringBuilder version) was
private void writeProperty(StringBuilder sb, string value, bool first, bool last)
{
if (! value.Contains('\"'))
{
if (!first)
sb.Append(',');
sb.Append(value);
if (last)
sb.AppendLine();
}
else
{
if (!first)
sb.Append(",\"");
else
sb.Append('\"');
sb.Append(value.Replace("\"", "\"\""));
if (last)
sb.AppendLine("\"");
else
sb.Append('\"');
}
}
private void writeItem(StringBuilder sb, Test item)
{
writeProperty(sb, item.Id.ToString(), true, false);
writeProperty(sb, item.Name, false, false);
writeProperty(sb, item.CreatedDate, false, false);
writeProperty(sb, item.DueDate, false, false);
writeProperty(sb, item.ReferenceNo, false, false);
writeProperty(sb, item.Parent, false, true);
}
If you don't want to load library's than you can create the following method:
private void SaveToCsv<T>(List<T> reportData, string path)
{
var lines = new List<string>();
IEnumerable<PropertyDescriptor> props = TypeDescriptor.GetProperties(typeof(T)).OfType<PropertyDescriptor>();
var header = string.Join(",", props.ToList().Select(x => x.Name));
lines.Add(header);
var valueLines = reportData.Select(row => string.Join(",", header.Split(',').Select(a => row.GetType().GetProperty(a).GetValue(row, null))));
lines.AddRange(valueLines);
File.WriteAllLines(path, lines.ToArray());
}
and than call the method:
SaveToCsv(testobjs, "C:/PathYouLike/FileYouLike.csv")
Your best option would be to use an existing library. It saves you the hassle of figuring it out yourself and it will probably deal with escaping special characters, adding header lines etc.
You could use the CSVSerializer from ServiceStack. But there are several other in nuget.
Creating the CSV will then be as easy as string csv = CsvSerializer.SerializeToCsv(testobjs);
You could use the FileHelpers library to convert a List of objects to CSV.
Consider the given object, add the DelimitedRecord Attribute to it.
[DelimitedRecord(",")]
public class Test
{
public int Id {get;set;}
public string Name { get; set; }
public string CreatedDate {get;set;}
public string DueDate { get; set; }
public string ReferenceNo { get; set; }
public string Parent { get; set; }
}
Once the List is populated, (as per question it is testobjs)
var engine = new FileHelperEngine<Test>();
engine.HeaderText = engine.GetFileHeader();
string dirPath = Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData) + "\\" + ConfigurationManager.AppSettings["MyPath"];
if (!Directory.Exists(dirPath))
{
Directory.CreateDirectory(dirPath);
}
//File location, where the .csv goes and gets stored.
string filePath = Path.Combine(dirPath, "MyTestFile_" + ".csv");
engine.WriteFile(filePath, testobjs);
This will just do the job for you. I'd been using this to generate data reports for a while until I switched to Python.
PS: Too late to answer but hope this helps somebody.
Use Cinchoo ETL
Install-Package ChoETL
or
Install-Package ChoETL.NETStandard
Sample shows how to use it
List<Test> list = new List<Test>();
list.Add(new Test { Id = 1, Name = "Tom" });
list.Add(new Test { Id = 2, Name = "Mark" });
using (var w = new ChoCSVWriter<Test>(Console.Out)
.WithFirstLineHeader()
)
{
w.Write(list);
}
Output CSV:
Id,Name,CreatedDate,DueDate,ReferenceNo,Parent
1,Tom,,,,
2,Mark,,,,
For more information, go to github
https://github.com/Cinchoo/ChoETL
Sample fiddle: https://dotnetfiddle.net/M7v7Hi
LINQtoCSV is the fastest and lightest I've found and is available on GitHub. Lets you specify options via property attributes.
Necromancing this one a bit; ran into the exact same scenario as above, went down the road of using FastMember so we didn't have to adjust the code every time we added a property to the class:
[HttpGet]
public FileResult GetCSVOfList()
{
// Get your list
IEnumerable<MyObject> myObjects =_service.GetMyObject();
//Get the type properties
var myObjectType = TypeAccessor.Create(typeof(MyObject));
var myObjectProperties = myObjectType.GetMembers().Select(x => x.Name);
//Set the first row as your property names
var csvFile = string.Join(',', myObjectProperties);
foreach(var myObject in myObjects)
{
// Use ObjectAccessor in order to maintain column parity
var currentMyObject = ObjectAccessor.Create(myObject);
var csvRow = Environment.NewLine;
foreach (var myObjectProperty in myObjectProperties)
{
csvRow += $"{currentMyObject[myObjectProperty]},";
}
csvRow.TrimEnd(',');
csvFile += csvRow;
}
return File(Encoding.ASCII.GetBytes(csvFile), "text/csv", "MyObjects.csv");
}
Should yield a CSV with the first row being the names of the fields, and rows following. Now... to read in a csv and create it back into a list of objects...
Note: example is in ASP.NET Core MVC, but should be very similar to .NET framework. Also had considered ServiceStack.Text but the license was not easy to follow.
For the best solution, you can read this article: Convert List of Object to CSV File C# - Codingvila
using Codingvila.Models;
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Linq;
using System.Text;
using System.Web;
using System.Web.Mvc;
namespace Codingvila.Controllers
{
public class HomeController : Controller
{
public ActionResult Index()
{
CodingvilaEntities entities = new CodingvilaEntities();
var lstStudents = (from Student in entities.Students
select Student);
return View(lstStudents);
}
[HttpPost]
public FileResult ExportToCSV()
{
#region Get list of Students from Database
CodingvilaEntities entities = new CodingvilaEntities();
List<object> lstStudents = (from Student in entities.Students.ToList()
select new[] { Student.RollNo.ToString(),
Student.EnrollmentNo,
Student.Name,
Student.Branch,
Student.University
}).ToList<object>();
#endregion
#region Create Name of Columns
var names = typeof(Student).GetProperties()
.Select(property => property.Name)
.ToArray();
lstStudents.Insert(0, names.Where(x => x != names[0]).ToArray());
#endregion
#region Generate CSV
StringBuilder sb = new StringBuilder();
foreach (var item in lstStudents)
{
string[] arrStudents = (string[])item;
foreach (var data in arrStudents)
{
//Append data with comma(,) separator.
sb.Append(data + ',');
}
//Append new line character.
sb.Append("\r\n");
}
#endregion
#region Download CSV
return File(Encoding.ASCII.GetBytes(sb.ToString()), "text/csv", "Students.csv");
#endregion
}
}
}