How to save Large Data to file - c#

I have a list of maybe 50,000 entries that are displayed in a DataGrid in WPF. Now I want to save the data in that list to a file, either plain text or preferably CSV. Because the list is so big, my current approach is a problem: whether I write a simple text file directly, or copy the contents of the DataGrid to the clipboard, convert that back to a string, and then write the string to a file with a StreamWriter, it takes roughly 4-5 minutes, even when it runs in a BackgroundWorker.
Is there any way I can save this huge list to a file quickly?
I am using a DataGrid in WPF.
CODE
dataGrid1.SelectAllCells();
dataGrid1.ClipboardCopyMode = DataGridClipboardCopyMode.IncludeHeader;
ApplicationCommands.Copy.Execute(null, dataGrid1);
String result = (string)Clipboard.GetData(DataFormats.CommaSeparatedValue);
// Execution never reaches the steps below; the thread stays stuck on the line above.
dataGrid1.UnselectAllCells();
Clipboard.Clear();
using (StreamWriter file = new StreamWriter(SavePageRankToPDF.FileName))
{
    file.WriteLine(result);
}

Instead of using the clipboard, why not iterate through the DataTable and build the CSV file directly?
Update
Here are some examples:
Convert DataTable to CSV stream
Converting DataSet\DataTable to CSV
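A minimal sketch of that approach, assuming you still have a reference to the DataTable that backs the grid (here called table; values containing commas would need quoting, which is omitted):
using System.Data;
using System.IO;
using System.Linq;
using System.Text;

// "table" is assumed to be the DataTable behind the grid.
var sb = new StringBuilder();

// Header row.
sb.AppendLine(string.Join(",", table.Columns.Cast<DataColumn>().Select(c => c.ColumnName)));

// Data rows.
foreach (DataRow row in table.Rows)
    sb.AppendLine(string.Join(",", row.ItemArray));

// One write call instead of going through the clipboard.
File.WriteAllText(SavePageRankToPDF.FileName, sb.ToString());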

One thing that will help is to not load ALL of your data into the datagrid when using it for display purposes. It'd be a good idea to use paging: only load the data into the datagrid that will be needed for calculations or display purposes. If the user wants to see/use more data, go back to your data source and get more of the data. Not only will your app run faster, you'll use much less memory.
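A minimal paging sketch with LINQ, where allRows, pageSize and pageIndex are placeholder names:
using System.Linq;

int pageSize = 500;
int pageIndex = 0;   // advance this when the user asks for the next page

// Load only the current page of rows into the grid.
dataGrid1.ItemsSource = allRows.Skip(pageIndex * pageSize).Take(pageSize).ToList();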

Related

Realtime updating textfield messages in winforms

I have a service that adds items to a class and then serializes that class to a file.
I then need to create a simple C# form with a multiline textfield that shows, in real time, the content deserialized from that file.
I will have a timer that fires every second, reads a List property of the class, and shows it in the textfield.
My question here is:
Is there a way that, instead of reading the whole file again and again, I can get only the latest rows added and just append them to the textfield?
var file = File.OpenRead("abc.txt");
file.Seek(1000, SeekOrigin.Begin);
When reading the file, use Seek to skip over the data you have already read.
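A minimal sketch of that idea, assuming a plain-text file and a field that remembers how far you have already read (names are placeholders):
using System.IO;

// Remembers how far into the file we have already read.
private long lastPosition = 0;

private void timer_Tick(object sender, EventArgs e)
{
    using (var file = File.Open("abc.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        // Skip everything that was read on previous ticks.
        file.Seek(lastPosition, SeekOrigin.Begin);

        using (var reader = new StreamReader(file))
        {
            // Append only the newly written text.
            textBox1.AppendText(reader.ReadToEnd());
            lastPosition = file.Position;
        }
    }
}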
Just a suggestion:
I think you should rather use some sort of message queue for this purpose.
Also, per your post, instead of fetching directly from the file it would be better to keep the deserialized content in a stack; that way you can always take the newest content from the top of the stack.

c# ssis - script component within data flow

I'm trying to parse a value from a text file using C#, then append it as a column to the existing data set coming from the same text file. Example file data:
As of 1/31/2015
1 data data data
2 data data data
So I want to use a script component within an SSIS data flow to append the 1/31/2015 value as a 4th column. The package iterates through several files, so I would like this to take place within each data flow. I have no issues getting the rest of the data into individual columns in the database, but I parse most of those out with T-SQL after fast-loading everything as one big column.
edit:
Here is the .NET code I used to get started. I know it is probably far from optimal, but I actually want to parse two values from the resulting string, and that would be easy to do with regular expressions:
string[] lines = File.ReadLines(filename).Take(7).ToArray();   // first few lines of the file
string every = string.Join(",", lines);                        // join them into one string to run the regex against
MessageBox.Show(every);
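As a rough sketch of the script-component approach (not the poster's actual package): assuming the package exposes the current file path in a read-only variable named FileName and the script component defines an output column named AsOfDate, the PreExecute/ProcessInputRow pair could look roughly like this:
using System;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public class ScriptMain : UserComponent   // class name comes from the generated template
{
    // Parsed once per file in PreExecute, then stamped onto every row.
    private DateTime asOfDate;

    public override void PreExecute()
    {
        base.PreExecute();

        // "FileName" is an assumed read-only package variable holding the current file path.
        string firstLine = File.ReadLines(Variables.FileName).First();

        // Pull the date out of a line like "As of 1/31/2015".
        Match m = Regex.Match(firstLine, @"\d{1,2}/\d{1,2}/\d{4}");
        asOfDate = DateTime.Parse(m.Value, CultureInfo.InvariantCulture);
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // "AsOfDate" is an assumed output column added in the script component editor.
        Row.AsOfDate = asOfDate;
    }
}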

Best way to save and restore a DataTable c#

I have developed a WinForms C# application and am now adding a recovery option so that if it closes unexpectedly, everything can be recovered on the next run.
I have managed to recover almost everything (lists, ints, strings, etc.).
The only issue I am facing is restoring a DataTable. During a run of my application, records are added to this DataTable and at the end the user can export it to CSV.
I tried to add the DataTable to Properties.Settings.Default, but it does not work; on the new run I always see it as null.
Any suggestion on the best way to save and restore a DataTable, keeping in mind the records can go over 10-15k during a run?
Thank you
Properties.Settings can store string data, so you can serialize your DataTable and store it. Later you can deserialize the string to get the DataTable back. You can use JSON.Net like:
var serializedDt = JsonConvert.SerializeObject(dt);
//store the string
to retrieve back:
DataTable yourDataTable = JsonConvert.DeserializeObject<DataTable>(serializedDt);
One more thing to add: if you are expecting large data, you may look at storing it in a client-side database, such as SQLite.
Serialize the object, but place it in a recovery file. During recovery, just read the file back, and you won't have to worry about space.
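A minimal sketch of that recovery-file approach with JSON.Net (the file path is a placeholder):
using Newtonsoft.Json;
using System.Data;
using System.IO;

string recoveryPath = "recovery.json";

// Whenever records change (or on a timer), write the serialized table to disk.
File.WriteAllText(recoveryPath, JsonConvert.SerializeObject(dt));

// On the next start, restore the table if a recovery file exists.
if (File.Exists(recoveryPath))
{
    DataTable recovered = JsonConvert.DeserializeObject<DataTable>(File.ReadAllText(recoveryPath));
}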

Read csv logfiles with different headers/columns

I need to read multiple csv files and merge them. The merged data is used for generating a chart (with the .NET chart control).
So far I've done this with a simple StreamReader and added everything to one DataTable:
while (sr.Peek() > -1)
{
    strLine = sr.ReadLine();
    strLine = strLine.TrimEnd(';');     // drop the trailing delimiter
    strArray = strLine.Split(delimiter);
    dataTableMergedData.Rows.Add(strArray);
}
But now there is the problem that the logfiles can change: newer logfiles have additional columns.
My current procedure doesn't work any more, and I'm asking for advice on how to handle this. Performance is important, because every logfile contains about 1,500 lines and up to 100 columns, and the logfiles are merged over up to a one-year period (365 files).
I would do it this way: create a DataTable which should contain all the data at the end, and read each logfile into a separate DataTable. After each read operation I would add the separate DataTable to the "big" DataTable, check whether the columns have changed, and add the new columns if they did.
But I'm afraid that using DataTables would hurt performance.
Note: I'm doing this in WinForms, but I don't think that matters anyway.
Edit: I tried CsvReader, but it is about 4 times slower than my current solution.
After hours of testing I did it the way I described in the question:
First I created a DataTable which should contain all the data at the end. Then I go through all logfiles in a foreach loop; for every logfile I create another DataTable and fill it with the CSV data from that logfile. This table then gets merged into the first DataTable, and even if they have different columns, the rows get added properly.
This may cost some performance compared to a simple StreamReader, but it is much easier to extend and still faster than the LumenWorks CsvReader.
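A minimal sketch of that merge, assuming ';'-delimited files whose first line is the header row (the folder path and file pattern are placeholders):
using System.Data;
using System.IO;

var mergedTable = new DataTable();

foreach (string path in Directory.GetFiles(logFolder, "*.csv"))
{
    var fileTable = new DataTable();
    string[] lines = File.ReadAllLines(path);

    // The header row defines this file's columns.
    foreach (string column in lines[0].TrimEnd(';').Split(';'))
        fileTable.Columns.Add(column);

    // Data rows.
    for (int i = 1; i < lines.Length; i++)
        fileTable.Rows.Add(lines[i].TrimEnd(';').Split(';'));

    // Merge adds any columns the merged table doesn't have yet,
    // so files with extra columns are handled automatically.
    mergedTable.Merge(fileTable, false, MissingSchemaAction.Add);
}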

Best way to populate large file list in c# DataGridView

I have a program which fills a DGV with file details such as name, date, etc., and also a few extra custom columns that give info about the files. This works fine until there is a huge number of files, in which case the DGV gets slower and slower as it populates.
From reading about DGV, it seems the best way to fill these with large amounts of data is to bind the contents to a database source.
So, the question is, would the most effective way for me to do this be to parse the files (and fill in my own custom data) then write these to a temp database, then use this to fill the DGV? Or am I making heavy work of something much simpler?
Thanks for any advice.
You can speed up the responsiveness of a DGV by using VirtualMode:
http://msdn.microsoft.com/en-us/library/system.windows.forms.datagridview.virtualmode.aspx
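A minimal VirtualMode sketch, where the grid asks for cell values on demand instead of being populated up front (files stands in for your already-built list of file info):
dataGridView1.VirtualMode = true;
dataGridView1.Columns.Add("Name", "Name");
dataGridView1.Columns.Add("Date", "Date");
dataGridView1.RowCount = files.Count;   // no rows are populated up front

dataGridView1.CellValueNeeded += (s, e) =>
{
    // Called only for cells that actually become visible.
    var file = files[e.RowIndex];
    e.Value = e.ColumnIndex == 0 ? (object)file.Name : file.Date;
};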
If you have a huge number of rows, like 10,000 or more, do the following before data binding to avoid a performance hit:
dataGridView1.RowHeadersWidthSizeMode = DataGridViewRowHeadersWidthSizeMode.EnableResizing; // or even better, .DisableResizing. The most time-consuming value is DataGridViewRowHeadersWidthSizeMode.AutoSizeToAllHeaders
dataGridView1.RowHeadersVisible = false; // set it to false if not needed
After data binding you may enable them again.
