JArray GroupBy using multiple columns - C#

I have this issue that I am currently stuck with in C#.
I have about 31 columns of data in each JObject inside a JArray (JArray tableJson = new JArray();)
I would like to group them by three of the columns.
So far I can only group by one of the columns
eg :
var tableJsonGroup = tableJson.GroupBy(x => x["FirstColumn"]).ToList();
I want to do something like this (it does not work) :
var tableJsonGroup = tableJson.GroupBy(x => new {x["FirstColumn"], x["SecondColumn"], x["FifthColumn"]}).ToList();
How do I do this?
Thank you.

The overload of the GroupBy extension used here takes a delegate, which is called for each item in tableJson to generate a key for that item.
Items that generate the same key will be grouped together and returned as an IGrouping. There is no special treatment based on the type of the source. This doesn't do anything different, whether you have an array of ints or an array of complex objects.
So, if you want to group by a combination of columns, you need to provide a function that returns a unique, repeatable key for that combination of columns.
This can be achieved simply by combining those columns into an anonymous type, which has a built-in implementation of equality and hashing that suits our purposes:
var groupedTableJson = tableJson.GroupBy(x =>
    new {
        // Anonymous type members are assigned with '=', not ':'
        FirstColumn = x["FirstColumn"],
        SecondColumn = x["SecondColumn"],
        FifthColumn = x["FifthColumn"]
    });
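Your attempt is almost right, but you don't provide names for the properties of your anonymous type. The compiler can only infer a member name from a simple name or a member access, not from an indexer expression such as x["FirstColumn"]. However, since you don't explain what "does not work", it is hard to be sure.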
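For reference, here is a minimal runnable sketch of the idea above. The sample rows are made up, and casting each token to string is my own assumption (it makes key equality plain string equality); adjust the casts if your columns hold other types.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

class Program
{
    static void Main()
    {
        // Hypothetical sample data; only the three grouping columns are shown.
        var tableJson = JArray.Parse(@"[
            { ""FirstColumn"": ""a"", ""SecondColumn"": ""1"", ""FifthColumn"": ""x"" },
            { ""FirstColumn"": ""a"", ""SecondColumn"": ""1"", ""FifthColumn"": ""x"" },
            { ""FirstColumn"": ""b"", ""SecondColumn"": ""2"", ""FifthColumn"": ""y"" }
        ]");

        // The anonymous type compares all three members, so rows land in the
        // same group only when every one of the three values matches.
        var groups = tableJson.GroupBy(x => new
        {
            FirstColumn = (string)x["FirstColumn"],
            SecondColumn = (string)x["SecondColumn"],
            FifthColumn = (string)x["FifthColumn"]
        });

        foreach (var g in groups)
        {
            Console.WriteLine($"{g.Key.FirstColumn}/{g.Key.SecondColumn}/{g.Key.FifthColumn}: {g.Count()} row(s)");
        }
    }
}
```

Each group's Key exposes the three named properties, and the group itself enumerates the original JToken rows.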

Related

Dynamic Linq "OrderBy" on runtime values

I've been looking at Dynamic Linq today (installed into VS via NuGet)...but all the examples I have found so far assume OrderBy is to be done on a known property or column name; however I am trying to OrderBy a field which is not strongly typed; but actually a key value of a row object which is derived from a Dictionary; e.g.
class RowValues : Dictionary<string, string>
{
...
}
So the list to be ordered is specifically a list of RowValues objects, filled with Name,Value pairs. For a given list of RowValues, the OrderBy field could be any of the keys of the name/value pair entries (fyi: I want the orderby field to be specified in an xml config file ultimately so the ordering can be changed without re-deployment of binaries etc).
I've got a hunch the solution lies in writing a custom ordering function passed to the OrderBy??? This function would obviously know how to get a specific value from the RowValues object given a field name from the xml config....?? The answers I have seen so far show passing a string which contains a custom order by clause into the OrderBy, which is close to where I want to be, but how in my case would the runtime know where to find the fields referred to in the OrderBy string??
Input will be very much appreciated, or have I completely misunderstood the Dynamic Linq functions?
If you're using dynamic LINQ, it would just be:
var sortColumn = GetConfigValue(...);
var sorted = RowValues.OrderBy(sortColumn);
You could of course use a concatenated string to create a multiple sort ("column1, column2 DESC"). As far as I'm aware, there's no custom sort function unless you're using regular LINQ.
Also, I would make sure you know the performance characteristics of Dynamic LINQ.
Edit:
Is this what you're looking for? This will order it based on the value of the "Key" entry in the dictionary. If you need multiple sort by-s, you can use it in a loop with .ThenBy()
void Main()
{
List<RowValues> v = new List<RowValues>();
var key = "Key"; //GetFromConfig();
var v1 = new RowValues();
v1.Add("Key", "1");
v1.Add("3", "5");
var v2 = new RowValues();
v2.Add("Key", "3");
v2.Add("2", "2");
var v3 = new RowValues();
v3.Add("Key", "2");
v3.Add("2", "2");
v.Add(v1);
v.Add(v2);
v.Add(v3);
v.OrderBy(r => r[key]).Dump();
}
class RowValues : Dictionary<string, string>
{
}
Kyle, thanks again. Apologies for the late reply, I have moved on from this issue now but out of interest and courtesy I wanted to come back and agree your code is much closer to where I wanted to get to, but we have lost the dynamic linq aspect. So, where you are calling the OrderBy and ordering on the key, I would want to pass a string containing the order command e.g. "r[key] desc". The reason being I would want to leave the determination as to which direction to order until runtime. I suspect this would be accomplished using an expression tree, possibly?
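For what it's worth, the direction can be picked at runtime with regular LINQ, without Dynamic LINQ or expression trees. The sketch below is only one way to do it: RowSorter, SortBy and the "Key desc" spec format are names I made up for illustration, and the spec would come from your config file. It assumes key names contain no spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class RowValues : Dictionary<string, string> { }

static class RowSorter
{
    // Hypothetical helper: orderSpec is "<key>" or "<key> asc" or "<key> desc".
    public static IEnumerable<RowValues> SortBy(IEnumerable<RowValues> rows, string orderSpec)
    {
        var parts = orderSpec.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var key = parts[0];
        var descending = parts.Length > 1 &&
                         parts[1].Equals("desc", StringComparison.OrdinalIgnoreCase);

        // Values are strings, so this is a string comparison ("10" sorts before "2").
        return descending
            ? rows.OrderByDescending(r => r[key])
            : rows.OrderBy(r => r[key]);
    }
}

class Program
{
    static void Main()
    {
        var rows = new List<RowValues>
        {
            new RowValues { ["Key"] = "1" },
            new RowValues { ["Key"] = "3" },
            new RowValues { ["Key"] = "2" }
        };

        foreach (var row in RowSorter.SortBy(rows, "Key desc"))
            Console.WriteLine(row["Key"]); // 3, then 2, then 1
    }
}
```

A row that lacks the key will throw KeyNotFoundException when the sequence is enumerated, so you may want TryGetValue if the rows are not uniform.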

How to skip one (maybe more) columns of data in a list in linq

I'm reading input from two excel worksheets (using Linq-To-Excel) into two lists. One of the worksheets has an unwanted column of data (column name is known). The other columns in both worksheets however contain exactly the same type of data.
First part of my question is:
How can I exclude only that unwanted column of data in the select statement (without having to write select.column names for the other 25 or so columns)? I intend to do this for the following purposes:
Make both the lists of the same type
Merge the two lists
Possibly move this block of code to a call procedure, as eventually I'll have to read from many more worksheets
ExcelQueryFactory excel = new ExcelQueryFactory(FilePath);
List<STC> stResults = (from s
in excel.Worksheet<STC>("StaticResults")
select s)
.ToList();
List<DYN> dynResults = (from s
in excel.Worksheet<DYN>("DynamicResults")
select s) //how can I EXCLUDE just one of the columns here??
.ToList();
I'm new to c# and linq. So please pardon my ignorance :-)
The second part of my question is:
The above data that I'm extracting is a bit on the fat side (varying from 100,000 to 300,000 rows). I have to run repeated linq queries on the lists above (in the range of 1000 to 4000 times) using a for loop. Is there a better way to implement this, as it's taking a huge toll on the performance?
EDIT_1:
Regarding the input files:
StaticResults file has 28 Columns (STC Class has 28 properties)
DynamicResults file has 29 Columns (28 columns with the same properties/column names as static plus one additional property, which is not required) (DYN is a derived class from STC)
Use an anonymous type when selecting the result from the linq query.
ExcelQueryFactory excel = new ExcelQueryFactory(FilePath);
List<STC> stResults = (from s
in excel.Worksheet<STC>("StaticResults")
select s)
.ToList();
// var is required here: the list holds an anonymous type, not DYN
var dynResults = (from s
in excel.Worksheet<DYN>("DynamicResults")
select new { Property1 = s.xxx, Property2 = s.yyy }) //get the props based on the type of s
.ToList();
Accidentally figured out the solution to my first question. Probably nothing great about it, but nevertheless thought would share it on here.
Got rid of the second class DYN
Made the second list of type STC
This way both lists extract only the properties/columns that are required (that is, the properties declared in the class). The extra column(s) not required are skipped, as I didn't define them as properties in the class. This is, I think, courtesy of linq-to-excel. I'd like to know more about that, if someone can offer some more insight into it.

Determine which elements in a list are NOT in another list

I have two IList<CustomObject>, where CustomObject has a Name property that's a string. Call the first one set, and the second one subset. set contains a list of things that I just displayed to the user in a multiselect list box. The ones the user selected have been placed in subset (so subset is guaranteed to be a subset of set, hence the clever names ;) )
What is the most straightforward way to generate a third IList<CustomObject>, inverseSubset, containing all the CustomObjects the user DIDN'T select, from these two sets?
I've been trying LINQ things like this
IEnumerable<CustomObject> inverseSubset = set.Select<CustomObject,CustomObject>(
sp => !subset.ConvertAll<string>(p => p.Name).Contains(sp.Name));
...based on answers to vaguely similar questions, but so far nothing is even compiling, much less working :P
Use the LINQ Except for this:
Produces the set difference of two sequences.
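A minimal sketch of Except for this case. By default Except uses the elements' own equality, which for a plain class means reference equality; that is fine if subset holds the very same instances as set. The NameComparer below is a hypothetical addition for the case where the objects are separate instances that should match by Name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class CustomObject
{
    public string Name { get; set; }
}

// Hypothetical comparer: two CustomObjects are equal when their Names match.
class NameComparer : IEqualityComparer<CustomObject>
{
    public bool Equals(CustomObject a, CustomObject b) => a?.Name == b?.Name;
    public int GetHashCode(CustomObject o) => o?.Name?.GetHashCode() ?? 0;
}

class Program
{
    static void Main()
    {
        IList<CustomObject> set = new List<CustomObject>
        {
            new CustomObject { Name = "A" },
            new CustomObject { Name = "B" },
            new CustomObject { Name = "C" }
        };
        // A separate instance with the same Name, to exercise the comparer.
        IList<CustomObject> subset = new List<CustomObject> { new CustomObject { Name = "B" } };

        IList<CustomObject> inverseSubset = set.Except(subset, new NameComparer()).ToList();

        foreach (var item in inverseSubset)
            Console.WriteLine(item.Name); // A, then C
    }
}
```

Note that Except is a set operation, so it also drops duplicates from the first sequence.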
Aha, too much SQL recently - I didn't want Select, I wanted Where:
List<string> subsetNames = subset.ConvertAll<string>(p => p.Name);
IEnumerable<CustomObject> inverseSubset =
set.Where<CustomObject>(p => !subsetNames.Contains(p.Name));

Collecting metadata into table

I have tabular data that passes through a C# program that I need to collect some metadata on before finishing. The metadata is always counts based on fields of the data. Also, I need them all grouped by one field in the data. Periodically, I need to add new counts to this collection of metadata.
I've been researching it for a little while, and I think what makes sense is to rework my program to store the data as a DataTable, then run LINQ queries on the table. The problem I'm having is being able to put the different counts into one table-like structure and then write that out.
I might run a query like this:
var query01 =
from record in records.AsEnumerable()
group record by record.Field<String>("Association Key") into associationsGroup
select new { AssociationKey = associationsGroup.Key, Count = associationsGroup.Count<DataRow>() };
To get a count of all of the records grouped by the field Association Key. I'm going to want another count, grouped in the same way:
var query02 =
from record in records.AsEnumerable()
where record.Field<String>("Number 9") == "yes"
group record by record.Field<String>("Association Key") into associationsGroup
select new { AssociationKey = associationsGroup.Key, Number9Count = associationsGroup.Count<DataRow>() };
And so on.
I thought about trying to chain the queries with Union, but I was having trouble getting them to union since I'm projecting into anonymous types. I couldn't figure out how to do it differently to make a union work better.
So, how can I collect my metadata into one table-like structure?
The queries are not going to union because they project into different types. Add both Number9Count and Count to each anonymous type and try the union again.
I ended up solving the problem by creating a class that holds the set of records I need as a DataTable. A user can add queries through a method, taking an argument Func<DataRow, bool>. The method constructs the query supplying that argument as the where clause, maintaining the same grouping and properties in the resulting anonymous-typed object.
When retrieving the results, the class iterates over each query stored and enters the results into a new DataTable.
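Roughly, the class looks like the sketch below. This is a reconstruction under assumptions, not the exact code: GroupedCounts, AddCount and BuildTable are illustrative names, and instead of storing whole queries it stores each filter and applies it per group, so a group with no matching rows shows up with a count of 0 rather than being absent.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

class GroupedCounts
{
    private readonly DataTable _records;
    private readonly string _groupColumn;
    // Each entry pairs an output column name with the filter that defines the count.
    private readonly List<KeyValuePair<string, Func<DataRow, bool>>> _counts =
        new List<KeyValuePair<string, Func<DataRow, bool>>>();

    public GroupedCounts(DataTable records, string groupColumn)
    {
        _records = records;
        _groupColumn = groupColumn;
    }

    public void AddCount(string columnName, Func<DataRow, bool> filter)
    {
        _counts.Add(new KeyValuePair<string, Func<DataRow, bool>>(columnName, filter));
    }

    // Builds one result row per group, with one int column per registered count.
    public DataTable BuildTable()
    {
        var result = new DataTable();
        result.Columns.Add(_groupColumn, typeof(string));
        foreach (var c in _counts)
            result.Columns.Add(c.Key, typeof(int));

        foreach (var group in _records.AsEnumerable().GroupBy(r => r.Field<string>(_groupColumn)))
        {
            var row = result.NewRow();
            row[_groupColumn] = (object)group.Key ?? DBNull.Value;
            foreach (var c in _counts)
                row[c.Key] = group.Count(c.Value);
            result.Rows.Add(row);
        }
        return result;
    }
}

class Program
{
    static void Main()
    {
        // Made-up sample records using the column names from the question.
        var records = new DataTable();
        records.Columns.Add("Association Key", typeof(string));
        records.Columns.Add("Number 9", typeof(string));
        records.Rows.Add("A", "yes");
        records.Rows.Add("A", "no");
        records.Rows.Add("B", "yes");

        var counts = new GroupedCounts(records, "Association Key");
        counts.AddCount("Count", r => true);
        counts.AddCount("Number9Count", r => r.Field<string>("Number 9") == "yes");

        foreach (DataRow row in counts.BuildTable().Rows)
            Console.WriteLine($"{row["Association Key"]}: {row["Count"]}, {row["Number9Count"]}");
    }
}
```

Adding a new count later is then a single AddCount call rather than a new query plus a union.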

Linq operations against a List of Hashtables?

I'm working with a set of legacy DAO code that returns an IList<Hashtable>, where each Hashtable represents the row of a dynamically executed SQL query. For example, the List might contain the following records/hashtables:
Hashtable1:
Key:Column15, Value:"Jack"
Key:Column16, Value:"Stevens"
Key:Column18, Value:"7/23/1973"
Key:Column25, Value:"Active"
Hashtable2:
Key:Column15, Value:"Melanie"
Key:Column16, Value:"Teal"
Key:Column18, Value:"null"
Key:Column25, Value:"Inactive"
Hashtable3:
Key:Column15, Value:"Henry"
Key:Column16, Value:"Black"
Key:Column18, Value:"3/16/1913"
Key:Column25, Value:"Active"
Use of a static type instead of a Hashtable is out of the question because the result of the query is unknown at run time; both the number of columns and the nature of those columns is completely dynamic.
I'd like to be able to perform Linq based operations on this data set (grouping, ordering etc), but I absolutely can't get my head around what the syntax might look like. As a simple example, let's say I want to sort the list by Column15 descending. The best syntax I've come up with is:
var rawGridData = (List<Hashtable>) _listDao.GetListGridContents(listID, null, null);
var sortedGridData = rawGridData.OrderBy(s => s.Keys.Cast<string>().Where(k => k == "Column15"));
However, this yields an exception when sortedGridData is enumerated: "At least one object must implement IComparable."
I've been struggling with this problem for days and am near my wit's end...please help!
This should get you started:
var sortedGridData = rawGridData.OrderBy(r => r["Column15"]);
This maps each "record" to the value in "Column15" and then orders the resulting projection. This is easily generalizable.
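To show how this generalizes, here is a small sketch with the descending sort from the question plus a grouping. The sample rows are trimmed to two columns, and the casts to string assume those columns really hold strings; a Hashtable returns object, so mixed or non-comparable values would need their own handling.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        // Sample rows based on the question, reduced to two columns.
        var rawGridData = new List<Hashtable>
        {
            new Hashtable { ["Column15"] = "Jack", ["Column25"] = "Active" },
            new Hashtable { ["Column15"] = "Melanie", ["Column25"] = "Inactive" },
            new Hashtable { ["Column15"] = "Henry", ["Column25"] = "Active" }
        };

        // Sort by Column15, descending.
        var sortedGridData = rawGridData.OrderByDescending(r => (string)r["Column15"]);
        foreach (var row in sortedGridData)
            Console.WriteLine(row["Column15"]);

        // Group by Column25 and count the rows in each group.
        var byStatus = rawGridData.GroupBy(r => (string)r["Column25"]);
        foreach (var g in byStatus)
            Console.WriteLine($"{g.Key}: {g.Count()}");
    }
}
```

Since the column name is just a string, it can come from a variable, which is what makes this work for columns only known at run time.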
