Given:
class C
{
public string Field1;
public string Field2;
}
template = new [] { "str1", "str2", ... }.ToList() // presents allowed values for C.Field1 as well as order
list = new List<C> { ob1, ob2, ... }
Question:
How can I perform Linq's
list.OrderBy(x => x.Field1)
which will use the template above for ordering (so objects with Field1 == "str1" come first, then objects with "str2" and so on)?
In LINQ to Objects, use Array.IndexOf:
var ordered = list
.Select(x => new { Obj = x, Index = Array.IndexOf(template, x.Field1)})
.OrderBy(p => p.Index < 0 ? 1 : 0) // Items with missing text go to the end
.ThenBy(p => p.Index) // The actual ordering happens here
.Select(p => p.Obj); // Drop the index from the result
This won't work in EF or LINQ to SQL, so you would need to bring the objects into memory before sorting.
Note: The above assumes that the template is not exhaustive, i.e. that some Field1 values may be missing from it. If every value is guaranteed to be in the template, a simpler query is sufficient:
var ordered = list.OrderBy(x => Array.IndexOf(template, x.Field1));
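For reference, here is a self-contained sketch of the first query. The class C comes from the question; the sample values and the TemplateOrderDemo wrapper are made up for illustration. Note that Array.IndexOf needs an actual array, so the sketch assumes template is a string[]; if it is a List<string> as in the question, template.IndexOf(x.Field1) does the same job.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class C
{
    public string Field1;
    public string Field2;
}

// Hypothetical wrapper class, only here to make the sketch runnable.
public static class TemplateOrderDemo
{
    // Orders items by the position of Field1 in template; values missing
    // from the template (index -1) are pushed to the end.
    public static List<C> OrderByTemplate(IEnumerable<C> list, string[] template)
    {
        return list
            .Select(x => new { Obj = x, Index = Array.IndexOf(template, x.Field1) })
            .OrderBy(p => p.Index < 0 ? 1 : 0)
            .ThenBy(p => p.Index)
            .Select(p => p.Obj)
            .ToList();
    }

    public static void Main()
    {
        var template = new[] { "str1", "str2", "str3" };
        var list = new List<C>
        {
            new C { Field1 = "str3", Field2 = "a" },
            new C { Field1 = "other", Field2 = "b" },
            new C { Field1 = "str1", Field2 = "c" },
            new C { Field1 = "str2", Field2 = "d" },
        };

        // Prints str1, str2, str3, other (one per line).
        foreach (var item in OrderByTemplate(list, template))
            Console.WriteLine(item.Field1);
    }
}
```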
I think IndexOf might work here:
list.OrderBy(_ => Array.IndexOf(template, _.Field1))
Please note that it returns -1 when the value is not present in template at all, which means those objects will come first. You'll have to handle this case. If every Field1 value is guaranteed to be in the template, it's fine.
As others have said, Array.IndexOf should do the job just fine. However, if template and/or list is long, it might be worthwhile transforming your template into a dictionary. Something like:
var templateDict = template.Select((item,idx) => new { item, idx })
.ToDictionary(k => k.item, v => v.idx);
(or you could just start by creating a dictionary instead of an array in the first place - it's more flexible when you need to reorder stuff)
This will give you a dictionary keyed off the string from template with the index in the original array as your value. Then you can sort like this:
var ordered = list.OrderBy(x => templateDict[x.Field1]);
Since dictionary lookups are O(1) on average, this will scale better as template and list grow.
Note: The above code assumes all values of Field1 are present in template. If they are not, you would have to handle the case where x.Field1 isn't in templateDict.
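One possible way to handle the missing-value case is TryGetValue with a fallback position. This is only a sketch: the TemplateLookupDemo class and the sample data are hypothetical, and it assumes the template has no duplicate entries (ToDictionary throws on duplicates) and that the values are non-null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper class, only here to make the sketch runnable.
public static class TemplateLookupDemo
{
    // Builds a value -> position map once, then sorts using dictionary lookups.
    // Values missing from the template get int.MaxValue and therefore sort last.
    public static List<string> OrderByTemplate(IEnumerable<string> values, IEnumerable<string> template)
    {
        Dictionary<string, int> positions = template
            .Select((item, idx) => new { item, idx })
            .ToDictionary(p => p.item, p => p.idx);

        return values
            .OrderBy(v => positions.TryGetValue(v, out int idx) ? idx : int.MaxValue)
            .ToList();
    }

    public static void Main()
    {
        var template = new[] { "a", "b" };
        var values = new[] { "b", "x", "a", "b" };

        // Prints: a, b, b, x
        Console.WriteLine(string.Join(", ", OrderByTemplate(values, template)));
    }
}
```

OrderBy is a stable sort, so items with the same position (including all the unknown ones) keep their original relative order.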
var orderedList = list.OrderBy(d => Array.IndexOf(template, d.MatchingColumnFromTemplate) < 0 ? int.MaxValue : Array.IndexOf(template, d.MatchingColumnFromTemplate)).ToList();
I've actually written a method to do this before. Here's the source:
public static IOrderedEnumerable<T> OrderToMatch<T, TKey>(this IEnumerable<T> source, Func<T, TKey> sortKeySelector, IEnumerable<TKey> ordering)
{
var orderLookup = ordering
.Select((x, i) => new { key = x, index = i })
.ToDictionary(k => k.key, v => v.index);
if (!orderLookup.Any())
{
throw new ArgumentException("Ordering collection cannot be empty.", nameof(ordering));
}
T[] sourceArray = source.ToArray();
return sourceArray
.OrderBy(x =>
{
int index;
if (orderLookup.TryGetValue(sortKeySelector(x), out index))
{
return index;
}
return Int32.MaxValue;
})
.ThenBy(x => Array.IndexOf(sourceArray, x));
}
You can use it like this:
var ordered = list.OrderToMatch(x => x.Field1, template);
If you want to see the source, the unit tests, or the library it lives in, you can find it on GitHub. It's also available as a NuGet package.
Basically I have an object with 2 different properties, both int and I want to get one list with all values from both properties. As of now I have a couple of linq queries to do this for me, but I am wondering if this could be simplified somehow -
var componentsWithDynamicApis = result
.Components
.Where(c => c.DynamicApiChoicesId.HasValue ||
c.DynamicApiSubmissionsId.HasValue);
var choiceApis = componentsWithDynamicApis
.Select(c => c.DynamicApiChoicesId.Value);
var submissionApis = componentsWithDynamicApis
.Select(c => c.DynamicApiSubmissionsId.Value);
var dynamicApiIds = choiceApis
.Union(submissionApis)
.Distinct();
Not every component will have both Choices and Submissions.
By simplify, I assume you want to combine into fewer statements. You can also simplify in terms of execution by reducing the number of times you iterate the collection (the current code does it 3 times).
One way is to use a generator function (assuming the type of items in your result.Components collection is Component):
IEnumerable<int> GetIds(IEnumerable<Component> components)
{
foreach (var component in components)
{
if (component.DynamicApiChoicesId.HasValue) yield return component.DynamicApiChoicesId.Value;
if (component.DynamicApiSubmissionsId.HasValue) yield return component.DynamicApiSubmissionsId.Value;
}
}
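To show how the generator would be called, here is a small runnable sketch. The Component class below is a hypothetical stand-in with just the two nullable properties from the question, and the sample data is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's component type.
public class Component
{
    public int? DynamicApiChoicesId { get; set; }
    public int? DynamicApiSubmissionsId { get; set; }
}

public static class DynamicApiIdDemo
{
    // Walks the components once and yields every ID that is actually set.
    public static IEnumerable<int> GetIds(IEnumerable<Component> components)
    {
        foreach (var component in components)
        {
            if (component.DynamicApiChoicesId.HasValue) yield return component.DynamicApiChoicesId.Value;
            if (component.DynamicApiSubmissionsId.HasValue) yield return component.DynamicApiSubmissionsId.Value;
        }
    }

    public static void Main()
    {
        var components = new List<Component>
        {
            new Component { DynamicApiChoicesId = 1, DynamicApiSubmissionsId = 2 },
            new Component { DynamicApiChoicesId = 2 },      // choices only
            new Component { DynamicApiSubmissionsId = 3 },  // submissions only
            new Component(),                                // neither
        };

        // Distinct() removes the repeated 2.
        var ids = GetIds(components).Distinct().ToList();
        Console.WriteLine(string.Join(", ", ids)); // 1, 2, 3
    }
}
```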
Another option is to use SelectMany. The trick there is to create a temporary enumerable holding the appropriate values of DynamicApiChoicesId and DynamicApiSubmissionsId. I can't think of a one-liner for this, but here is one option:
var dynamicApiIds = result
.Components
.SelectMany(c => {
var temp = new List<int>();
if (c.DynamicApiChoicesId.HasValue) temp.Add(c.DynamicApiChoicesId.Value);
if (c.DynamicApiSubmissionsId.HasValue) temp.Add(c.DynamicApiSubmissionsId.Value);
return temp;
})
.Distinct();
@Eldar's answer gave me an idea for an improvement on option #2:
var dynamicApiIds = result
.Components
.SelectMany(c => new[] { c.DynamicApiChoicesId, c.DynamicApiSubmissionsId })
.Where(c => c.HasValue)
.Select(c => c.Value)
.Distinct();
Similar to some of the other answers, but I think this covers all your bases with a very minimal amount of code.
var dynamicApiIds = result.Components
.SelectMany(c => new[] { c.DynamicApiChoicesId, c.DynamicApiSubmissionsId}) // combine
.OfType<int>() // remove nulls
.Distinct();
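In case it is not obvious why OfType<int>() removes the nulls: a null int? boxes to a null reference, which is not an int, so OfType skips it. A minimal sketch with made-up data (NullableFilterDemo is a hypothetical name):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper class, only here to make the sketch runnable.
public static class NullableFilterDemo
{
    // OfType<int>() keeps only the elements that hold an actual int,
    // so null entries are dropped and the result is IEnumerable<int>.
    public static List<int> DistinctNonNull(IEnumerable<int?> ids)
    {
        return ids.OfType<int>().Distinct().ToList();
    }

    public static void Main()
    {
        var ids = new int?[] { 1, null, 2, 1, null };
        Console.WriteLine(string.Join(", ", DistinctNonNull(ids))); // 1, 2
    }
}
```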
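To map each element in the source list onto more than one element in the destination list, you can use SelectMany. Note that calling .Value on a null ID throws an InvalidOperationException, so the version below is only safe when every component has both IDs set.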
var combined = componentsWithDynamicApis
.SelectMany(x => new[] { x.DynamicApiChoicesId.Value, x.DynamicApiSubmissionsId.Value })
.Distinct();
I have not tested it, but you can use SelectMany and filter out the null values like below:
var componentsWithDynamicApis = result
.Components
.Select(r=> new [] {r.DynamicApiChoicesId,r.DynamicApiSubmissionsId})
.SelectMany(r=> r.Where(p=> p!=null).Cast<int>()).Distinct();
I would like to sort a List<string> in a particular way. Below is a unit test showing the input, the specific way (which I am calling a "hierarchy" - feel free to correct my terminology so that I may learn), and the desired output. The code should be self-explanatory.
[Test]
public void CustomSortByHierarchy()
{
List<string> input = new List<string>{"TJ", "DJ", "HR", "HR", "TJ"};
List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };
List<string> sorted = input.Sort(hierarchy); // <-- does not compile. How do I sort by the hierarchy?
// ...and if the sort worked as desired, these assert statements would return true:
Assert.AreEqual("HR", sorted[0]);
Assert.AreEqual("HR", sorted[1]);
Assert.AreEqual("TJ", sorted[2]);
Assert.AreEqual("TJ", sorted[3]);
Assert.AreEqual("DJ", sorted[4]);
}
Another way to do it:
var hierarchy = new Dictionary<string, int>{
{ "HR", 1},
{ "TJ", 2},
{ "DJ", 3} };
var sorted = input.OrderBy(s => hierarchy[s]).ToList();
There are so many ways to do this.
It's not great to create a static dictionary - especially when you have a static list of the values already in the order that you want (i.e. List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };). The problem with a static dictionary is that it is static - to change it you must recompile your program - and also it's prone to errors - you might mistype a number. It's best to dynamically create the dictionary. That way you can adjust your hierarchy at run-time and use it to order your input.
Here's the basic way to create the dictionary:
Dictionary<string, int> indices =
hierarchy
.Select((value, index) => new { value, index })
.ToDictionary(x => x.value, x => x.index);
Then it's an easy sort:
List<string> sorted = input.OrderBy(x => indices[x]).ToList();
However, if the input contains a value that is missing from the hierarchy, this will blow up with a KeyNotFoundException.
Try with this input:
List<string> input = new List<string> { "TJ", "DJ", "HR", "HR", "TJ", "XX" };
You need to decide if you are removing missing items from the list or concatenating them at the end of the list.
To remove you'd do this:
List<string> sorted =
input
.Where(x => indices.ContainsKey(x))
.OrderBy(x => indices[x])
.ToList();
Or to sort to the end you'd do this:
List<string> sorted =
input
.OrderBy(x => indices.ContainsKey(x) ? indices[x] : int.MaxValue)
.ThenBy(x => x) // groups missing items together and is optional
.ToList();
If you simply want to remove items from input that aren't in hierarchy then there are a couple of other options that might be appealing.
Try this:
List<string> sorted =
(
from x in input
join y in hierarchy.Select((value, index) => new { value, index })
on x equals y.value
orderby y.index
select x
).ToList();
Or this:
ILookup<string, string> lookup = input.ToLookup(x => x);
List<string> sorted = hierarchy.SelectMany(x => lookup[x]).ToList();
Personally, I like this last one. It's a two-liner and it doesn't rely on indices at all.
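Here is the lookup version as a runnable sketch, using the input from the question plus an extra "XX" that is not in the hierarchy. The HierarchySortDemo class is a hypothetical wrapper, and the sketch assumes the hierarchy contains no duplicates (a duplicated entry would emit its items twice).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper class, only here to make the sketch runnable.
public static class HierarchySortDemo
{
    // Groups the input by value, then emits the groups in hierarchy order.
    // Values that are not in the hierarchy are never looked up, so they are dropped.
    public static List<string> SortByHierarchy(IEnumerable<string> input, IEnumerable<string> hierarchy)
    {
        ILookup<string, string> lookup = input.ToLookup(x => x);
        return hierarchy.SelectMany(x => lookup[x]).ToList();
    }

    public static void Main()
    {
        var input = new List<string> { "TJ", "DJ", "HR", "HR", "TJ", "XX" };
        var hierarchy = new List<string> { "HR", "TJ", "DJ" };

        // Prints: HR, HR, TJ, TJ, DJ
        Console.WriteLine(string.Join(", ", SortByHierarchy(input, hierarchy)));
    }
}
```

A lookup returns an empty sequence for a key it does not contain, which is why a hierarchy entry with no matching input is harmless here.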
I have a JSON file that contains orders as arrays under the same key:
[
{
"order":["Order1"]
},
{
"order":["Order2"]
},
{
"order":["Order2","Order3"]
},
{
"order":["Order1","Order2"]
},
{
"order":["Order2","Order3"]
}
]
I want to order it by the most frequently occurring order combinations.
Kindly help me out with this.
NOTE: It is not a simple array of strings; kindly look at the JSON before you mark it as a probable duplicate.
This can be done as follows. First, introduce a data model for your orders:
public class Order
{
public string[] order { get; set; }
}
Next, define the following equality comparer for enumerables:
public class IEnumerableComparer<TEnumerable, TElement> : IEqualityComparer<TEnumerable> where TEnumerable : IEnumerable<TElement>
{
//Adapted from IEqualityComparer for SequenceEqual
//https://stackoverflow.com/questions/14675720/iequalitycomparer-for-sequenceequal
//Answer https://stackoverflow.com/a/14675741 By Cédric Bignon https://stackoverflow.com/users/1284526/c%C3%A9dric-bignon
public bool Equals(TEnumerable x, TEnumerable y)
{
return Object.ReferenceEquals(x, y) || (x != null && y != null && x.SequenceEqual(y));
}
public int GetHashCode(TEnumerable obj)
{
// Will not throw an OverflowException
unchecked
{
return obj.Where(e => e != null).Select(e => e.GetHashCode()).Aggregate(17, (a, b) => 23 * a + b);
}
}
}
Now you can deserialize the JSON containing the orders listed above and sort the unique orders by descending frequency as follows:
var items = JsonConvert.DeserializeObject<List<Order>>(jsonString);
//Adapted from LINQ: Order By Count of most common value
//https://stackoverflow.com/questions/20046563/linq-order-by-count-of-most-common-value
//Answer https://stackoverflow.com/a/20046812 by King King https://stackoverflow.com/users/1679602/king-king
var query = items
//If order items aren't already sorted, you need to do so first.
//use StringComparer.OrdinalIgnoreCase or StringComparer.Ordinal or StringComparer.CurrentCulture as required.
.Select(i => i.order.OrderBy(s => s, StringComparer.Ordinal).ToArray())
//Adapted from writing a custom comparer for linq groupby
//https://stackoverflow.com/questions/37733773/writing-a-custom-comparer-for-linq-groupby
//Answer https://stackoverflow.com/a/37734601 by Gert Arnold https://stackoverflow.com/users/861716/gert-arnold
.GroupBy(s => s, new IEnumerableComparer<string [], string>())
.OrderByDescending(g => g.Count())
.Select(g => new Order { order = g.Key } );
var sortedItems = query.ToList();
Demo fiddle here.
Alternatively, if you want to preserve duplicates rather than merging them, you can do:
var query = items
//If order items aren't already sorted, you may need to do so first.
//use StringComparer.OrdinalIgnoreCase or StringComparer.Ordinal or StringComparer.CurrentCulture as required.
.Select(i => i.order.OrderBy(s => s, StringComparer.Ordinal).ToArray())
//Adapted from writing a custom comparer for linq groupby
//https://stackoverflow.com/questions/37733773/writing-a-custom-comparer-for-linq-groupby
//Answer https://stackoverflow.com/a/37734601 by Gert Arnold https://stackoverflow.com/users/861716/gert-arnold
.GroupBy(s => s, new IEnumerableComparer<string [], string>())
.OrderByDescending(g => g.Count())
.SelectMany(g => g)
.Select(a => new Order { order = a });
Demo fiddle #2 here.
Notes:
I define the equality comparer with two generic type parameters, IEnumerableComparer<TEnumerable, TElement> : IEqualityComparer<TEnumerable> where TEnumerable : IEnumerable<TElement>, rather than just IEnumerableComparer<string> as shown in this answer to IEqualityComparer for SequenceEqual by Cédric Bignon. This prevents the string [] grouping key from being upcast to IEnumerable<string> via type inference in the .GroupBy(s => s, new IEnumerableComparer<string>()) call.
If you are sure the orders are already sorted, or if ["Order3", "Order1"] should be treated as different from ["Order1", "Order3"], then replace i.order.OrderBy(s => s, StringComparer.Ordinal).ToArray() with just i.order.
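To try the grouping logic without the JSON step, here is a cut-down sketch that works on in-memory arrays. It uses a simplified, hypothetical StringArrayComparer (fixed to string[]) instead of the generic comparer above, and the data is the five orders from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified comparer: two string arrays are equal when they
// contain the same elements in the same order.
public sealed class StringArrayComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        return ReferenceEquals(x, y) || (x != null && y != null && x.SequenceEqual(y));
    }

    public int GetHashCode(string[] obj)
    {
        unchecked // overflow simply wraps around
        {
            return obj.Where(e => e != null)
                      .Aggregate(17, (hash, e) => 23 * hash + e.GetHashCode());
        }
    }
}

public static class OrderFrequencyDemo
{
    // Returns the distinct combinations, most frequent first.
    public static List<string[]> ByFrequency(IEnumerable<string[]> orders)
    {
        return orders
            .Select(o => o.OrderBy(s => s, StringComparer.Ordinal).ToArray()) // normalize element order
            .GroupBy(o => o, new StringArrayComparer())
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        var orders = new List<string[]>
        {
            new[] { "Order1" },
            new[] { "Order2" },
            new[] { "Order2", "Order3" },
            new[] { "Order1", "Order2" },
            new[] { "Order2", "Order3" },
        };

        // The first line printed is Order2,Order3, the only combination that occurs twice.
        foreach (var combo in ByFrequency(orders))
            Console.WriteLine(string.Join(",", combo));
    }
}
```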
I have a simple class:
class Balls
{
public int BallType;
}
And I have a really simple list:
var balls = new List<Balls>()
{
new Balls() { BallType = 1},
new Balls() { BallType = 1},
new Balls() { BallType = 1},
new Balls() { BallType = 2}
};
I've used GroupBy on this list and I want to get back the key which has the highest count/amount:
After I used x.GroupBy(q => q.BallType) I tried to use .Max(), but it returns 3 and I need the key which is 1.
I also tried to use Console.WriteLine(x.GroupBy(q => q.BallType).Max().Key); but it throws System.ArgumentException.
Here's what I came up with:
var mostCommonBallType = balls
.GroupBy(k => k.BallType)
.OrderBy(g => g.Count())
.Last().Key;
You group by the BallType, order by the count of items in each group, take the last group (since OrderBy sorts in ascending order, the most common value is last) and then return its key.
Someone suggested ordering the sequence:
var mostCommonBallType = balls
.GroupBy(k => k.BallType)
.OrderBy(g => g.Count())
.Last().Key;
Apart from the fact that it is more efficient to use OrderByDescending and then take FirstOrDefault, you also get into trouble if your collection of Balls is empty, because Last() throws on an empty sequence.
If you use a different overload of GroupBy, you won't have these problems:
var mostCommonBallType = balls.GroupBy(
// KeySelector:
k => k.BallType,
// ResultSelector:
(ballType, ballsWithThisBallType) => new
{
BallType = ballType,
Count = ballsWithThisBallType.Count(),
})
.OrderByDescending(group => group.Count)
.Select(group => group.BallType)
.FirstOrDefault();
This solves the previously mentioned problems. However, if you only need the 1st element, why would you order the 2nd and the 3rd element? Using Aggregate instead of OrderByDescending will enumerate only once:
Assuming your collection is not empty:
var result = ... GroupBy(...)
.Aggregate( (groupWithHighestBallCount, nextGroup) =>
(groupWithHighestBallCount.Count >= nextGroup.Count) ?
groupWithHighestBallCount : nextGroup)
// Aggregate returns a single group, so read its BallType directly
.BallType;
Aggregate takes the first element of your non-empty sequence and assigns it to groupWithHighestBallCount. Then it iterates over the rest of the sequence and compares each nextGroup.Count with groupWithHighestBallCount.Count. It keeps the one with the highest value as the next groupWithHighestBallCount. The return value is the final groupWithHighestBallCount.
See that Aggregate only enumerates once?
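Spelled out in full, the Aggregate approach could look like the sketch below. The MostCommonDemo class and the choice to return null for an empty collection are my own assumptions; Aggregate without a seed throws on an empty sequence, so the empty case has to be handled somehow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Balls
{
    public int BallType;
}

// Hypothetical wrapper class, only here to make the sketch runnable.
public static class MostCommonDemo
{
    // Returns the most common BallType, or null when there are no balls.
    public static int? MostCommonBallType(IEnumerable<Balls> balls)
    {
        var groups = balls
            .GroupBy(b => b.BallType,
                     (ballType, items) => new { BallType = ballType, Count = items.Count() })
            .ToList();

        if (groups.Count == 0)
            return null; // Aggregate without a seed would throw here

        // Single pass over the groups, keeping the one with the highest count.
        return groups
            .Aggregate((best, next) => best.Count >= next.Count ? best : next)
            .BallType;
    }

    public static void Main()
    {
        var balls = new List<Balls>
        {
            new Balls { BallType = 1 },
            new Balls { BallType = 1 },
            new Balls { BallType = 1 },
            new Balls { BallType = 2 },
        };

        Console.WriteLine(MostCommonBallType(balls)); // 1
    }
}
```

Because of the >= comparison, a tie is resolved in favor of the group that appears first.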
I want to access the first, second, third elements in a list. I can use built in .First() method for accessing first element.
My code is as follows:
Dictionary<int, Tuple<int, int>> pList = new Dictionary<int, Tuple<int, int>>();
var categoryGroups = pList.Values.GroupBy(t => t.Item1);
var highestCount = categoryGroups
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.First();
var 2ndHighestCount = categoryGroups
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.GetNth(1);
var 3rdHighestCount = categoryGroups
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.GetNth(2);
twObjClus.WriteLine("--------------------Cluster Label------------------");
twObjClus.WriteLine("\n");
twObjClus.WriteLine("Category:{0} Count:{1}",
highestCount.Category, highestCount.Count);
twObjClus.WriteLine("\n");
twObjClus.WriteLine("Category:{0} Count:{1}",
2ndHighestCount.Category, 2ndHighestCount.Count);
// Error here i.e. "Can't use 2ndHighestCount.Category here"
twObjClus.WriteLine("\n");
twObjClus.WriteLine("Category:{0} Count:{1}",
3rdHighestCount.Category, 3rdHighestCount.Count);
// Error here i.e. "Can't use 3rdHighestCount.Category here"
twObjClus.WriteLine("\n");
I have written an extension method GetNth() as follows:
public static IEnumerable<T> GetNth<T>(this IEnumerable<T> list, int n)
{
if (n < 0)
throw new ArgumentOutOfRangeException("n");
if (n > 0){
int c = 0;
foreach (var e in list){
if (c % n == 0)
yield return e;
c++;
}
}
}
Can I write extension methods such as .Second() and .Third(), similar to the built-in .First() method, to access the second and third elements?
If what you're looking for is a single object, you don't need to write it yourself, because a built-in method for that already exists.
foo.ElementAt(1)
will get you the second element, etc. It works similarly to First and returns a single object.
Your GetNth method seems to be returning every Nth element, instead of just the element at index N. I'm assuming that's not what you want since you said you wanted something similar to First.
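If you still want the .Second() and .Third() spelling, thin wrappers around ElementAt would do. This is just a sketch; the extension method names are hypothetical (they are not part of LINQ) and the sample list is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PositionalExtensions
{
    // Hypothetical helpers. Like First(), they throw when the sequence is
    // too short (ElementAt throws ArgumentOutOfRangeException).
    public static T Second<T>(this IEnumerable<T> source)
    {
        return source.ElementAt(1);
    }

    public static T Third<T>(this IEnumerable<T> source)
    {
        return source.ElementAt(2);
    }
}

public static class PositionalDemo
{
    public static void Main()
    {
        var letters = new List<string> { "a", "b", "c" };

        Console.WriteLine(letters.Second()); // b
        Console.WriteLine(letters.Third());  // c

        // ElementAtOrDefault returns default(T) instead of throwing
        // when the index is out of range.
        Console.WriteLine(letters.ElementAtOrDefault(5) == null); // True
    }
}
```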
Since @Eser gave up and doesn't want to post the correct way as an answer, here goes:
You should rather do the transforms once, collect the results into an array, and then get the three elements from that. The way you're doing it right now results in code duplication as well as grouping and ordering being done multiple times, which is inefficient.
var highestCounts = pList.Values
.GroupBy(t => t.Item1)
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.Take(3)
.ToArray();
// highestCounts[0] is the first count
// highestCounts[1] is the second
// highestCounts[2] is the third
// make sure to handle cases where there are fewer than 3 items!
As an FYI, if you some day need just the Nth value and not the top three, you can use .ElementAt to access values at an arbitrary index.
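One way to deal with the "fewer than 3 items" case is to loop over whatever Take(3) returned instead of indexing into the array. A rough sketch, with a hypothetical TopCategoriesDemo class and invented sample data; it returns KeyValuePair<int, int> (category, count) rather than an anonymous type so the result can leave the method:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper class, only here to make the sketch runnable.
public static class TopCategoriesDemo
{
    // Returns up to 'count' (category, occurrences) pairs, most frequent first.
    public static List<KeyValuePair<int, int>> Top(Dictionary<int, Tuple<int, int>> pList, int count)
    {
        return pList.Values
            .GroupBy(t => t.Item1)
            .OrderByDescending(g => g.Count())
            .Select(g => new KeyValuePair<int, int>(g.Key, g.Count()))
            .Take(count)
            .ToList();
    }

    public static void Main()
    {
        var pList = new Dictionary<int, Tuple<int, int>>
        {
            { 1, Tuple.Create(10, 0) },
            { 2, Tuple.Create(10, 0) },
            { 3, Tuple.Create(10, 0) },
            { 4, Tuple.Create(20, 0) },
            { 5, Tuple.Create(20, 0) },
            { 6, Tuple.Create(30, 0) },
        };

        // The loop simply runs fewer times when there are fewer than 3 categories.
        foreach (var entry in Top(pList, 3))
            Console.WriteLine("Category:{0} Count:{1}", entry.Key, entry.Value);
        // Category:10 Count:3
        // Category:20 Count:2
        // Category:30 Count:1
    }
}
```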