For reasons that don't make a lot of sense (read: not my decision), I need to keep a large number of rows, roughly 90,000, in a DataTable, and I do not have the option of using a database.
I need to be able to search the DataTable efficiently to find rows that match some basic criteria. For example, I might be looking for rows that have the value 2 in two specific columns.
What is the best way to do this?
Edit: Please take a look at https://chat.stackoverflow.com/transcript/message/62648#62648 for more details; after I work on this I will try to summarize the extra details from the chat here, as well as provide my solution.
You could easily use DataTable.Select().
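For example, a minimal sketch (GetBigTable() and the column names are placeholders, not anything from your code):

DataTable table = GetBigTable();   // however the ~90,000 rows arrive
// Select takes a filter expression similar to a SQL WHERE clause.
DataRow[] matches = table.Select("ColA = 2 AND ColB = 2");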
The solution I ended up using for this painfully awkward and inconvenient situation was to use DataTable.Select(), populate a new DataTable with the results, and then run the same operation on the refined DataTable to select the rows I needed.
This solution is clumsy, but then again the constraints on the problem were somewhat unrealistic, and I was on a tight schedule as well.
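Roughly, the two-pass version looked like this (a sketch with made-up column names, not the exact code):

// First pass: narrow the original table down on the first condition.
DataRow[] firstPass = bigTable.Select("ColA = 2");

// Copy the matches into a fresh table with the same schema.
DataTable refined = bigTable.Clone();
foreach (DataRow row in firstPass)
    refined.ImportRow(row);

// Second pass: run Select again on the much smaller table.
DataRow[] result = refined.Select("ColB = 2");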
I have looked at SmartXLS's documentation and noticed that there are not many ways to manipulate entire rows or columns. It provides ways to manipulate ranges of data, but not whole rows or columns.
I have been checking out http://www.smartxls.com/sample-list.htm, but I'm not really having any luck.
Would anybody know how? Knowing how many rows or columns there are would let me use those range methods, but I do not see anything of that sort.
Thanks.
I am now using Workbook.ExportDataTable and then Rows.Count and Columns.Count to find where the data ends.
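In other words, something like this (a sketch; I'm not certain of the exact SmartXLS ExportDataTable overload, so treat that call as an assumption):

WorkBook book = new WorkBook();
book.read("input.xls");                  // load the spreadsheet

// Assumed call: export the sheet's used range into an ADO.NET DataTable.
DataTable dt = book.ExportDataTable();

// Standard ADO.NET from here on: the counts tell you where the data ends.
int rowCount = dt.Rows.Count;
int colCount = dt.Columns.Count;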
If there are better ways, let me know.
I have two tables: Foo and Bar. For each row in Foo, I now want to add a row in Bar which references the respective Foo record. Foo will likely contain several million records.
Normally this answer would have been perfect: linq to sql - loop through table data and set value. But as it says on the tin, using the following line is not particularly ideal for large tables.
List<User> users = dc.Users.ToList();
Since caching the entire table in a List<> is not going to work, what other options do I have? Is there an elegant way to "page through" the records, for instance? Since I am quite sure that this is a relatively common problem, I think it's likely that there is a best practice for this too. I have not been able to find it, however.
If you're talking about several million rows of data, then LINQ is not your friend.
Consider using a stored procedure or, if you like, DataContext.ExecuteCommand.
Both will result in a huge performance gain.
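For example, with LINQ to SQL you can push the whole operation to the server as one set-based statement; a sketch, assuming Bar has a FooId foreign key (the table and column names are guesses at your schema):

// One round trip; the database does all the work.
dc.ExecuteCommand("INSERT INTO Bar (FooId) SELECT Id FROM Foo");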
You can work with predefined batches using the .Skip() and .Take() methods, as shown in the sketch below. Another thing to consider is using a trigger, so that you don't need to worry about the second table at all.
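A batching sketch along those lines (it assumes designer-generated Foos/Bars table properties and a FooId column on Bar):

const int batchSize = 10000;
int processed = 0;
while (true)
{
    // A stable OrderBy is required for Skip/Take paging to be reliable.
    var batch = dc.Foos.OrderBy(f => f.Id)
                       .Skip(processed)
                       .Take(batchSize)
                       .ToList();
    if (batch.Count == 0) break;

    foreach (var foo in batch)
        dc.Bars.InsertOnSubmit(new Bar { FooId = foo.Id });

    dc.SubmitChanges();        // flush each batch instead of holding millions in memory
    processed += batch.Count;
}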
Let's say I have two fields (one under another). By default I want to have both of them rendered, but when the first field is empty, I would like the second one to take its place. Is there any convenient way to achieve this?
Note: My problem is much more complex, and a "scalable" solution would be highly appreciated.
I've found a cumbersome method, but I'm still looking for something better.
I finally came up with a better solution: create a table in a DataSet and insert a subreport that uses the table as its data source. You can then add rows to the table dynamically, depending on your conditions.
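The DataSet side of that is plain ADO.NET; a minimal sketch (the table, column, and field names are invented for illustration):

DataTable table = new DataTable("ReportFields");
table.Columns.Add("Value", typeof(string));

// Add a row only for the fields that actually have content, so an
// empty first field simply never produces a row in the subreport.
if (!string.IsNullOrEmpty(firstField))
    table.Rows.Add(firstField);
if (!string.IsNullOrEmpty(secondField))
    table.Rows.Add(secondField);

DataSet ds = new DataSet();
ds.Tables.Add(table);
// The subreport then uses this table as its data source.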
I have a WPF DataGrid which gets its data from a web service. The end user has the ability to customize the visible columns in the DataGrid.
1st approach:
I get the data as XML, convert the XML to a DataTable, and assign it as the ItemsSource of the DataGrid.
2nd approach:
I can also get the data from the service as a typed array (for example, Customer[]).
Problem:
I use the 1st approach, with extra steps, so as not to get redundant data from the service.
With the 2nd approach, if the user sees only two columns in the DataGrid (one column per property in the class), he still gets the whole class with all its properties filled in (redundant data). With the 1st approach, he gets only the XML data that will actually be visible in the DataGrid in the UI.
But I use the MVVM pattern in my project, and I don't want to use the XML and DataTable approach. I think I have to use the 2nd approach, but in that case I get redundant data.
With the 2nd approach, if the user sees only two columns in the DataGrid (one column per property in the class), he still gets the whole class with all its properties filled in (redundant data).
If the above is the only thing that is stopping you from taking your second approach, then C# 4.0 has the Named and Optional Arguments feature, which works like this:
Console.WriteLine(Calculate(weight: 123, height: 64));
even if the actual Calculate() has 99 parameters, in any order.
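A hypothetical Calculate illustrating this (every parameter is optional, so callers can pass any subset, by name, in any order):

static int Calculate(int weight = 0, int height = 0, int age = 0)
{
    return weight * height + age;   // placeholder logic
}

// Both calls compile; the argument order does not matter.
Console.WriteLine(Calculate(weight: 123, height: 64));
Console.WriteLine(Calculate(height: 64, age: 30, weight: 123));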
Please note: I assume that by redundant you mean unwanted data.
I would take the second approach, even though it may transport a little more data. If you really want control over which fields are fetched, that will probably make your application more complex than necessary.
Have you verified you have performance problems with the second approach?
It is just another trade-off of the kind we have always faced when developing software.
In your specific case,
The first approach has a performance advantage, transferring much less (not sure if really that much) data over the network, and offers flexibility by not using a strongly typed data approach.
The second approach looks better for manageability and ease of development in the long term.
To choose the right approach, you should consider and weigh non-functional requirements such as performance, extensibility, manageability, etc.
I have two datasets each with one data table pulled from different sources and I need to know if there are any differences in the data contained in the data tables. I'm trying to avoid looping and comparing each individual record or column, although there may be no other way. All I need to know is if there is a difference in the data, I do not need to know the details of any difference.
I have tried the below code, but it appears that dataset.Merge does not update rowstatus so dataset.HasChanges() always returns false. Any help is appreciated:
var currentDataSet = GetSomeData();
var historicalDataSet = GetSomeHistoricalData();
historicalDataSet.Merge(currentDataSet);
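// Merge keeps the incoming rows' row state, so the merged rows stay Unchanged
// and HasChanges() below never sees a difference.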
if (historicalDataSet.HasChanges()) DoSomeStuff();
I don't know of any built-in support for this and I wouldn't expect it either. So you'll have to do this by yourself in some way.
The most obvious way would be a brute force, table by table and row by row approach.
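A brute-force sketch along those lines (it assumes both DataSets have matching table, column, and row ordering, and it stops at the first difference, which is all you need):

static bool HasDifferences(DataSet a, DataSet b)
{
    if (a.Tables.Count != b.Tables.Count) return true;
    for (int t = 0; t < a.Tables.Count; t++)
    {
        DataTable ta = a.Tables[t], tb = b.Tables[t];
        if (ta.Rows.Count != tb.Rows.Count ||
            ta.Columns.Count != tb.Columns.Count) return true;
        for (int r = 0; r < ta.Rows.Count; r++)
            for (int c = 0; c < ta.Columns.Count; c++)
                if (!Equals(ta.Rows[r][c], tb.Rows[r][c]))
                    return true;   // first difference found, no details needed
    }
    return false;
}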
If you can rely on certain factors being the same, i.e. exactly the same naming, ordering of records, etc., then you could test whether saving both as XML and comparing the results is an efficient trick.
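For example (a sketch; it relies on identical schemas and row ordering in both sets):

// StringWriter is in System.IO; DataSet is in System.Data.
static bool AreEqual(DataSet a, DataSet b)
{
    using (var writerA = new StringWriter())
    using (var writerB = new StringWriter())
    {
        a.WriteXml(writerA);
        b.WriteXml(writerB);
        // One string comparison instead of a manual row-by-row walk.
        return writerA.ToString() == writerB.ToString();
    }
}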