C#: doing a custom sort on a DataTable

I have data in a DataTable that needs to be sorted on the first column:
A02 BLANK0010
D02 BLANK0007
B04 BLANK0011
G05 BLANK0012
C06 BLANK0014
E08 BLANK0013
F10 BLANK0016
H12 BLANK0015
B02 G112486
C02 G125259
E02 G125257
F02 G112492
G02 G125095
H02 G112489
A03 G125090
B03 G112499
C03 G125256
D03 G002007
E03 G112494
F03 G002005
G03 G112495
H03 G002008
A04 G115717
If I do a regular sort, it will just sort like this: A02, A03, A04. But I need A02, B02, C02, etc.
How can I do this? Here is my code so far:
DataView view = dt.DefaultView;
view.Sort = "position";

You'll want to do a custom sort. See the following question for hints: DataView.Sort - more than just asc/desc (need custom sort)
You might want to break the first column into two separate columns.
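A minimal sketch of the "two separate parts" idea using LINQ to DataSet rather than a second column. It assumes every value has the shape of the sample data (one letter followed by digits, e.g. "A02"); the class and method names are illustrative, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class PositionSort
{
    // Assumption: each value is one letter followed by digits, e.g. "A02".
    public static DataTable SortByNumberThenLetter(DataTable dt, string column = "position")
    {
        IEnumerable<DataRow> ordered = dt.AsEnumerable()
            .OrderBy(r => int.Parse(r.Field<string>(column).Substring(1))) // numeric part: 02, 03, ...
            .ThenBy(r => r.Field<string>(column)[0]);                      // letter part: A, B, ...

        // CopyToDataTable throws on an empty sequence, so fall back to an empty clone.
        return ordered.Any() ? ordered.CopyToDataTable() : dt.Clone();
    }
}
```

Calling `PositionSort.SortByNumberThenLetter(dt)` returns a new table; the original `dt` is left untouched.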

This may need some refactoring, but it solves the problem.
//group the rows by the letter part of the "position" column
var groupBy = table.AsEnumerable()
    .GroupBy(o =>
        Regex.Replace(o["position"].ToString(), @"[0-9]", ""));
var dataRows = Sort(groupBy);
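And here is the Sort method:
//interleave the groups: yield the j-th row of every letter group in turn
private static IEnumerable<DataRow> Sort(IEnumerable<IGrouping<string, DataRow>> groupByCollection)
{
    //sort each letter group (e.g. A, B) by the numeric part of its values,
    //and order the groups themselves by letter so the output starts A02, B02, C02...
    var groupings =
        groupByCollection.Select(
            o =>
                new
                {
                    o.Key,
                    Value = o.OrderBy(a => Regex.Replace(a["position"].ToString(), "[a-z]", "", RegexOptions.IgnoreCase)).ToArray()
                }).OrderBy(g => g.Key).ToArray();
    //groups can differ in length, so loop up to the longest one and skip exhausted groups
    int longest = groupings.Select(g => g.Value.Length).DefaultIfEmpty(0).Max();
    for (int j = 0; j < longest; j++)
        for (int i = 0; i < groupings.Length; i++)
        {
            if (j < groupings[i].Value.Length)
                yield return groupings[i].Value[j];
        }
}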

One possible way: add an additional column that holds the first letter of the first column, and then sort by that column and the first column.

A little primitive, but effective way (in this case):
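dt.Columns.Add("sortKey", typeof(string), "SUBSTRING(position, 2, 2) + SUBSTRING(position, 1, 1)"); //"sortKey" is an illustrative expression column
dv.Sort = "sortKey"; //DataView.Sort takes column names, not expressions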

Related

How do I index an array of type IEnumerable<DataRow>?

I want to index an array of IEnumerable<DataRow> and print the data out into a table. I get the error below and I'm not sure how to overcome it.
cannot convert type System.Data.DataRow to string
IEnumerable<DataRow> query = from result in
                             DtSet.Tables["Results"].AsEnumerable()
                             where result.Field<string>("test").Contains("50")
                             select result;
var queryArray = query.ToArray();
for (int i = 0; i < queryArray.Count(); i++)
{
    table.Rows[i + 1].Cells[0].Paragraphs.First().Append(queryArray[i]);
}
Consider:
var queryArray = query.ToArray();
(You also have a small problem in that you're trying to stuff a DataRow into your destination paragraph; this might just append "System.Data.DataRow" to your paragraph.)
But really you could just delete that line and:
int i = 1;
foreach(var q in query)
table.Rows[i++].Cells[0].Paragraphs.First().Append(q["your column name"].ToString());
That is to say: enumerate the IEnumerable, using a separate index variable to keep track of where you are in the destination (the Excel sheet?).
Side note: I put in a call to extract a single column from the DataRow; you could alternatively make this part of your LINQ select statement, converting the DataRows to an enumerable of strings instead.
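The side note about moving the column extraction into the query could look roughly like this. It is only a sketch: the column name "test" comes from the question, while the class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ResultText
{
    // Returns the "test" column of every row whose value contains the given text.
    // Null (DBNull) cells are skipped instead of throwing.
    public static List<string> MatchingValues(DataTable results, string text)
    {
        return results.AsEnumerable()
            .Select(row => row.Field<string>("test"))
            .Where(value => value != null && value.Contains(text))
            .ToList();
    }
}
```

Each element of the returned list is already a string, so it can be passed straight to `Append` in the loop.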
It looks like using a dataview rowfilter might save you some effort here, something like:
DataView dv = new DataView(DtSet.Tables["Results"]);
dv.RowFilter = "test LIKE '%50%'";
foreach (DataRowView drv in dv)
{
//do the stuff...
}
Microsoft Documentation

filtering large DataTable in for loop

I have a DataTable with Rows.Count = 2,000,000 and two columns containing integer values.
What I need is to filter the DataTable in a loop, efficiently.
I'm doing it with:
for (int i= 0; i< HugeDataTable.Rows.Count; i++)
{
tempIp= int.Parse(HugeDataTable.Rows[i]["col1"].ToString());
var filteredUsers = tumu.Select("col1= " + tempIp.ToString()).Select(dr => dr.Field<int>("col2")).ToList();
HashSet<int> filtered = new HashSet<int>(filteredUsers);
Boolean[] userVector2 = userVectorBase
.Select(item => filtered.Contains(item))
.ToArray();
...
}
What should I do to improve performance? I need every little trick. DataTable indexes and LINQ searches are what I came up with from a Google search. I'd like to hear your suggestions.
Thank you.
You may use Parallel.For
Parallel.For(0, table.Rows.Count, rowIndex => {
    var row = table.Rows[rowIndex];
    // put your per-row calculation here
});
Please have a look at this post
You're using a double for loop. If your tumu contains a lot of rows it will be very slow.
Fix: make a dictionary with all users before your for loop. In your for loop check the dictionary.
Something like this:
//col1 value -> col2 values; fill it in once, before the loop
var usersByCode = new Dictionary<int, List<int>>();
for (int i = 0; i < HugeDataTable.Rows.Count; i++)
{
    tempIp = int.Parse(HugeDataTable.Rows[i]["col1"].ToString());
    if (usersByCode.ContainsKey(tempIp))
    {
        //Do something
    }
}
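One way to build such an index in a single pass is `ToLookup`. This is a sketch under the assumption that both `col1` and `col2` are stored as `int` columns (the question only says they contain integer values); the class and method names are illustrative.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class Col1Index
{
    // Groups the col2 values of a table by their col1 value.
    // Assumption: both columns have DataType int.
    public static ILookup<int, int> Build(DataTable table)
    {
        return table.AsEnumerable()
            .ToLookup(r => r.Field<int>("col1"), r => r.Field<int>("col2"));
    }
}
```

Inside the loop, `new HashSet<int>(lookup[tempIp])` would then replace the `Select` call; indexing a lookup with a missing key yields an empty sequence rather than throwing.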

How to compare two DataSet columns values in C#?

In the code below I want to compare the values of two DataSet columns, but the condition is true even when the values do not match. How can I really compare them?
if (dsEmp.Tables[0].Columns["EmpName"].ToString() == dsAllTables.Tables[2].Columns["EmpName"].ToString())
{
}
You are comparing two column-names, so "EmpName" with "EmpName" which is always true. Tables[0].Columns["EmpName"] returns a DataColumn with that name and ToString returns the name of the column which is "EmpName". So that's pointless.
If you instead want to know if two tables contain the same EmpName value in one of their rows you can use LINQ:
var empRowsEmpName = dsEmp.Tables[0].AsEnumerable().Select(r => r.Field<string>("EmpName"));
var allRowsEmpName = dsAllTables.Tables[2].AsEnumerable().Select(r => r.Field<string>("EmpName"));
IEnumerable<string> allIntersectingEmpNames = empRowsEmpName.Intersect(allRowsEmpName);
if (allIntersectingEmpNames.Any())
{
}
Now you even know which EmpName values are contained in both tables. You could use a foreach-loop:
foreach(string empName in allIntersectingEmpNames)
Console.WriteLine(empName);
If you want to find out if a specific value is contained in both:
bool containsName = allIntersectingEmpNames.Contains("SampleName");
If you just want to get the first matching:
string firstIntersectingEmpName = allIntersectingEmpNames.FirstOrDefault();
if(firstIntersectingEmpName != null){
// yes, there was at least one EmpName that was in both tables
}
If you have a single row, this should work:
if (dsEmp.Tables[0].Rows[0]["EmpName"].ToString() == dsAllTables.Tables[2].Rows[0]["EmpName"].ToString())
{
}
For multiple rows you have to iterate through table:
for (int i = 0; i < dsEmp.Tables[0].Rows.Count; i++)
{
    for (int j = 0; j < dsAllTables.Tables[2].Rows.Count; j++)
    {
        if (dsEmp.Tables[0].Rows[i]["EmpName"].ToString() == dsAllTables.Tables[2].Rows[j]["EmpName"].ToString())
        {
        }
    }
}
I have two datatables - dtbl and mtbl, and I use this to return records that have a difference, as another DataTable.
//compare the two datatables and output any differences into a new datatable, to return
var differences = dtbl.AsEnumerable().Except(mtbl.AsEnumerable(), DataRowComparer.Default);
return differences.Any() ? differences.CopyToDataTable() : new DataTable();

Count positive values in datatable

Is there an elegant way to count how many values are positive in a datatable without having to go through every element and check it? I've looked at DataTable.Compute method and some LINQ examples too but they all require a column name and I need it for the whole table.
Try this :
No need (for the code) to know the column names :
dt.AsEnumerable().Select(row1 => dt.Columns.Cast<DataColumn>()
.ToDictionary(column => column.ColumnName, column => row1[column.ColumnName]))
.SelectMany(f=>f.Values)
.Count(f=>decimal.Parse(f.ToString())>0);
Have you considered using DataTable.Select?
int TotalPositiveValues = 0;
foreach (DataColumn NextColumn in MyDataTable.Columns)
{
    //"> 0" so that zeros are not counted as positive
    DataRow[] PositiveRows = MyDataTable.Select(NextColumn.ColumnName + " > 0");
    TotalPositiveValues += PositiveRows.Length;
}
EDIT: provided all values in your DataTable qualify as numbers.
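For comparison, a shorter sketch that needs no column names at all. It still visits every cell (a DataTable keeps no whole-table aggregate that would avoid that), and it assumes every non-null cell holds a numeric value; the class name is illustrative.

```csharp
using System;
using System.Data;
using System.Linq;

public static class PositiveCounter
{
    // Counts cells greater than zero across every column; DBNull cells are skipped.
    // Assumption: all non-null cells are convertible to decimal.
    public static int Count(DataTable table)
    {
        return table.AsEnumerable()
            .SelectMany(row => row.ItemArray)
            .Count(cell => !(cell is DBNull) && Convert.ToDecimal(cell) > 0);
    }
}
```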

sorting List<string[]> by many columns

I have a List<string[]> which I would like to sort by many columns. For example, each string[] has 5 elements (5 columns) and the List has 10 elements (10 rows). I would like to sort by the 1st column, then by the 3rd, and then by the 4th.
How could it be done in the easiest way with C#?
I thought about such algorithm:
1. Delete values corresponding to those columns that I don't want to use for sorting
2. Find, for each of the columns that are left, the longest string that can be used to store their value
3. Change each row to a string, where each cell occupies as many characters as the maximum number of characters for the value in the given column
4. Assign an int index to each of those string values
5. Sort these string values
6. Sort the real data with the help of the already sorted indices
But I think this algorithm is very bad. Could you suggest me any better way, if possible, that uses already existing features of C# and .NET?
List<string[]> list = .....
var newList = list.OrderBy(x => x[1]).ThenBy(x => x[3]).ThenBy(x => x[4]).ToList();
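The chained `OrderBy`/`ThenBy` call can be generalized to any list of columns. The sketch below is one possible shape, not the answerer's code: `RowSorter` and `SortBy` are illustrative names, the indices are zero-based (so the 1st, 3rd and 4th columns are 0, 2 and 3), and ordinal string comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowSorter
{
    // Returns a new list ordered by the given zero-based column indices, in priority order.
    public static List<string[]> SortBy(IEnumerable<string[]> rows, params int[] columns)
    {
        if (columns.Length == 0) return rows.ToList();

        IOrderedEnumerable<string[]> ordered =
            rows.OrderBy(r => r[columns[0]], StringComparer.Ordinal);
        foreach (int c in columns.Skip(1))
        {
            int col = c; // explicit local copy for the lambda to capture
            ordered = ordered.ThenBy(r => r[col], StringComparer.Ordinal);
        }
        return ordered.ToList();
    }
}
```

For the question's example this would be called as `RowSorter.SortBy(list, 0, 2, 3)`. LINQ ordering is stable, so rows that tie on every key keep their original relative order.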
Something like this:
var rows = new List<string[]>();
var sortColumnIndex = 2;
rows.Sort((a, b) => a[sortColumnIndex].CompareTo(b[sortColumnIndex]));
This will perform an in-place sort -- that is, it will sort the contents of the list.
Sorting on multiple columns is possible, but requires more logic in your comparer delegate.
If you're happy to create another collection, you can use the Linq approach given in another answer.
EDIT here's the multi-column, in-place sorting example:
var rows = new List<string[]>();
var sortColumnIndices = new[] { 1, 3, 4 };
rows.Sort((a, b) => {
foreach (var index in sortColumnIndices)
{
var result = a[index].CompareTo(b[index]);
if (result != 0)
return result;
}
return 0;
});
