I'm new to lambda expressions and I think they look great in code, but I'm having trouble understanding how to convert a foreach loop into a lambda. I can't follow the other examples I've seen here.
The code I'm trying to convert is:
{
var a = arr.ToList();
var b = a.Distinct().ToList();
if(a.SequenceEqual(b)){return -1;}
foreach(int i in b)
{
a.Remove(i);
}
return a.First();
}
Specifically the
foreach(int i in b)
{
a.Remove(i);
}
Also, while I'm here, is if(a.SequenceEqual(b)){return -1;} an okay thing to do? I felt bad using 4 lines when it could be only 1.
If the goal is to find the first item in the collection that has a duplicate, or to return -1 when there are no duplicates, then it can be done in two lines:
int FindFirstDuplicate(IEnumerable<int> arr)
{
HashSet<int> seen = new();
return arr.SkipWhile(e => seen.Add(e)).DefaultIfEmpty(-1).First();
}
But these two lines cannot be written as an expression-bodied function.
If the goal is to write an expression-bodied function, then the easiest way is to use GroupBy:
int FindFirstDuplicate(IEnumerable<int> arr) =>
arr.GroupBy(e => e)
.Where(g => g.Count() > 1)
.Select(g => g.First())
.DefaultIfEmpty(-1)
.First();
But this option uses considerably more memory, because GroupBy has to buffer the whole collection before it yields the first group.
If the collection always has a very small number of elements, then the function can be written without GroupBy:
int FindFirstDuplicate(IEnumerable<int> arr) =>
arr.SkipWhile((e, i) => arr.Skip(i + 1).All(x => x != e))
.DefaultIfEmpty(-1)
.First();
But this implementation has quadratic complexity, because it contains a nested loop. It will be extremely slow for thousands of records.
There is one more option that has similar performance and memory consumption to the very first function:
int FindFirstDuplicate(IEnumerable<int> arr) =>
Enumerable.Repeat(new HashSet<int>(), arr.Count())
.Zip(arr)
.SkipWhile(e => e.First.Add(e.Second))
.Select(e => e.Second)
.DefaultIfEmpty(-1)
.First();
Repeat and Zip are needed to pass the HashSet object to the SkipWhile method. Note that arr.Count() enumerates arr once before Zip enumerates it again.
And here is the most straightforward way, with a foreach loop, for comparison with the other options:
int FindFirstDuplicate(IEnumerable<int> arr)
{
HashSet<int> seen = new();
foreach (int item in arr)
{
if (!seen.Add(item))
{
return item;
}
}
return -1;
}
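The options above do not all agree on what "first duplicate" means, which may matter when choosing between them. The sketch below runs the HashSet and GroupBy versions on an input where they differ. The class and method names are made up for illustration and are not part of the original answers. The HashSet version returns the first element that repeats an earlier one, which is what the question's original code returns, while the GroupBy version returns the earliest value that occurs more than once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FirstDuplicateDemo
{
    // HashSet-based version: stops at the first element already seen.
    public static int WithHashSet(IEnumerable<int> arr)
    {
        HashSet<int> seen = new();
        return arr.SkipWhile(e => seen.Add(e)).DefaultIfEmpty(-1).First();
    }

    // GroupBy-based version: groups come out in order of first occurrence.
    public static int WithGroupBy(IEnumerable<int> arr) =>
        arr.GroupBy(e => e)
           .Where(g => g.Count() > 1)
           .Select(g => g.First())
           .DefaultIfEmpty(-1)
           .First();

    static void Main()
    {
        int[] sample = { 1, 2, 2, 1 };
        Console.WriteLine(WithHashSet(sample));            // 2
        Console.WriteLine(WithGroupBy(sample));            // 1
        Console.WriteLine(WithHashSet(new[] { 1, 2, 3 })); // -1
    }
}
```

On inputs such as { 4, 7, 1, 3, 7, 4, 8, 7 } the two agree (both return 7), so the difference only shows up when a later value repeats before an earlier one does.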
Just to restate what your code does in English: you want to remove one of each unique value from a list, leaving duplicate values in place with the order unchanged.
For example, if your input list was [ 4, 7, 1, 3, 7, 4, 8, 7 ], you expect the contents of a to be [ 7, 4, 7 ], and therefore the result would be 7.
This is probably bad form because it uses a mutable accumulator, but it would work:
arr.Aggregate(
new { Seen = new HashSet<Int32>(), Result = new List<Int32>() },
(acc, val) => {
if (!acc.Seen.Add(val)) {
// If we've seen this value before, add it to the list
acc.Result.Add(val);
}
return acc;
}).Result
Note that this does not remove elements from the original list. That is deliberate, because it is not valid to remove elements from a list while you are enumerating it.
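For comparison, the same result can be produced lazily with an iterator method instead of an Aggregate accumulator. This is only a sketch and not part of the original answer; `RepeatedValues` and `DuplicatesInOrder` are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RepeatedValues
{
    // Yields every element that was already seen earlier in the sequence,
    // i.e. everything except the first occurrence of each value.
    public static IEnumerable<T> DuplicatesInOrder<T>(IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        foreach (T item in source)
        {
            // HashSet.Add returns false when the value is already present.
            if (!seen.Add(item))
            {
                yield return item;
            }
        }
    }

    static void Main()
    {
        int[] arr = { 4, 7, 1, 3, 7, 4, 8, 7 };
        Console.WriteLine(string.Join(", ", DuplicatesInOrder(arr))); // 7, 4, 7
    }
}
```

Because the method is lazy, calling `.DefaultIfEmpty(-1).First()` on its result stops enumerating as soon as the first repeated value is found.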
If you don't care about the original order of the items then this is super simple:
List<string> result = arr.GroupBy(x => x).SelectMany(xs => xs.Skip(1)).ToList();
Done. That removes one instance of each value from the list.
If the original order is important then this is what you need:
List<string> result =
arr
.Select((x, n) => new { x, n })
.GroupBy(y => y.x)
.SelectMany(ys => ys.Skip(1))
.OrderBy(y => y.n)
.Select(y => y.x)
.ToList();
If I run that on var arr = new string[] { "x", "y", "z", "z", "x", }; then I get "z", "x" as expected.
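Well, I think the Except operator looks more suitable here:
return a.Except(b).FirstOrDefault(-1);
Be aware, though, that Except is a set difference: because b is a.Distinct(), a.Except(b) is always empty, so this always returns -1.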
Try this (the null-conditional operator covers the case where there are no duplicates):
a.GroupBy(v => v).Where(c => c.Count() > 1).FirstOrDefault()?.Key ?? -1;
I would like to sort a List<string> in a particular way. Below is a unit test showing the input, the specific way (which I am calling a "hierarchy" - feel free to correct my terminology so that I may learn), and the desired output. The code should be self explanatory.
[Test]
public void CustomSortByHierarchy()
{
List<string> input = new List<string>{"TJ", "DJ", "HR", "HR", "TJ"};
List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };
List<string> sorted = input.Sort(hierarchy); // <-- does not compile. How do I sort by the hierarchy?
// ...and if the sort worked as desired, these assert statements would return true:
Assert.AreEqual("HR", sorted[0]);
Assert.AreEqual("HR", sorted[1]);
Assert.AreEqual("TJ", sorted[2]);
Assert.AreEqual("TJ", sorted[3]);
Assert.AreEqual("DJ", sorted[4]);
}
Another way to do it:
var hierarchy = new Dictionary<string, int>{
{ "HR", 1},
{ "TJ", 2},
{ "DJ", 3} };
var sorted = input.OrderBy(s => hierarchy[s]).ToList();
There are so many ways to do this.
It's not ideal to hard-code a dictionary, especially when you already have a list of the values in the order that you want (i.e. List<string> hierarchy = new List<string>{"HR", "TJ", "DJ" };). A hard-coded dictionary can only be changed by recompiling your program, and it's prone to errors, since you might mistype a number. It's better to build the dictionary dynamically. That way you can adjust your hierarchy at run-time and use it to order your input.
Here's the basic way to create the dictionary:
Dictionary<string, int> indices =
hierarchy
.Select((value, index) => new { value, index })
.ToDictionary(x => x.value, x => x.index);
Then it's an easy sort:
List<string> sorted = input.OrderBy(x => indices[x]).ToList();
However, if the input contains a value that is missing from the hierarchy, then this will blow up with a KeyNotFoundException.
Try with this input:
List<string> input = new List<string> { "TJ", "DJ", "HR", "HR", "TJ", "XX" };
You need to decide if you are removing missing items from the list or concatenating them at the end of the list.
To remove you'd do this:
List<string> sorted =
input
.Where(x => indices.ContainsKey(x))
.OrderBy(x => indices[x])
.ToList();
Or to sort to the end you'd do this:
List<string> sorted =
input
.OrderBy(x => indices.ContainsKey(x) ? indices[x] : int.MaxValue)
.ThenBy(x => x) // groups missing items together and is optional
.ToList();
If you simply want to remove items from input that aren't in hierarchy then there are a couple of other options that might be appealing.
Try this:
List<string> sorted =
(
from x in input
join y in hierarchy.Select((value, index) => new { value, index })
on x equals y.value
orderby y.index
select x
).ToList();
Or this:
ILookup<string, string> lookup = input.ToLookup(x => x);
List<string> sorted = hierarchy.SelectMany(x => lookup[x]).ToList();
Personally, I like this last one. It's a two liner and it doesn't rely on indices at all.
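To make the behaviour of the lookup version concrete, here is a small sketch run against the input that contains "XX". The wrapper class and method names are made up for illustration. It relies on the fact that indexing an ILookup with a key it does not contain yields an empty sequence rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HierarchySort
{
    // Walks the hierarchy in order and emits every input item equal to
    // the current hierarchy entry.
    public static List<string> SortByHierarchy(
        IEnumerable<string> input, IEnumerable<string> hierarchy)
    {
        ILookup<string, string> lookup = input.ToLookup(x => x);
        // A key that is absent from the lookup yields an empty sequence.
        return hierarchy.SelectMany(x => lookup[x]).ToList();
    }

    static void Main()
    {
        var input = new List<string> { "TJ", "DJ", "HR", "HR", "TJ", "XX" };
        var hierarchy = new List<string> { "HR", "TJ", "DJ" };
        Console.WriteLine(string.Join(", ", SortByHierarchy(input, hierarchy)));
        // HR, HR, TJ, TJ, DJ   ("XX" is not in the hierarchy, so it is dropped)
    }
}
```

One thing to watch for: if the hierarchy itself contains the same value twice, the matching input items are emitted twice as well.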
I have two lists of strings:
List<string> a;
List<string> b;
With a.RemoveAll() I remove empty elements or specific values from list a.
Now I want to remove the elements of list b at the same indexes that a.RemoveAll() removed from list a.
How can I accomplish that?
I have already coded a workaround with for ()... loops, but the code looks more than awkward. There should be a better solution.
Let me try to rephrase your question. You want to remove the elements at certain indices of b. Which indices exactly? Those indices of a that would have been removed if a.RemoveAll(somePredicate) were called. Did I understand correctly?
You can use LINQ:
a.Select((x, y) => (element: x, index: y))
.Where(x => somePredicate(x.element) && x.index < b.Count)
.OrderByDescending(x => x.index)
.ToList().ForEach(x => b.RemoveAt(x.index));
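If you don't mind creating a new list, it can be shorter (run this before calling a.RemoveAll, so that the indices of a and b still line up):
var newList = b.Select((x, y) => (element: x, index: y))
.Where(x => x.index >= a.Count || !somePredicate(a[x.index]))
.Select(x => x.element)
.ToList();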
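If both lists should be updated together, another possibility is to pair the elements up first, filter the pairs, and then write the survivors back. The following is a rough sketch under the assumption that a and b have the same length; `ParallelRemove` and `RemoveAllInBoth` are hypothetical names, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ParallelRemove
{
    // Removes from both lists every index i at which shouldRemove(a[i]) is true.
    // Assumption: a and b have the same length.
    public static void RemoveAllInBoth<TA, TB>(
        List<TA> a, List<TB> b, Func<TA, bool> shouldRemove)
    {
        if (a.Count != b.Count)
            throw new ArgumentException("Lists must have the same length.");

        // Decide what to keep before mutating either list.
        var kept = a.Zip(b, (x, y) => (First: x, Second: y))
                    .Where(p => !shouldRemove(p.First))
                    .ToList();

        a.Clear();
        b.Clear();
        a.AddRange(kept.Select(p => p.First));
        b.AddRange(kept.Select(p => p.Second));
    }

    static void Main()
    {
        var a = new List<string> { "one", "", "three", "" };
        var b = new List<string> { "1", "2", "3", "4" };
        RemoveAllInBoth(a, b, string.IsNullOrEmpty);
        Console.WriteLine(string.Join(", ", a)); // one, three
        Console.WriteLine(string.Join(", ", b)); // 1, 3
    }
}
```

This replaces the call to a.RemoveAll() rather than following it, which avoids having to remember which indexes were removed.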
var myList = new List<string> { "A", "x", "x", "B" };
If you want to find the indexes of "x":
var indexes = Enumerable.Range(0, myList.Count).Where(x => myList[x] == "x").ToList();
Given:
class C
{
public string Field1;
public string Field2;
}
template = new [] { "str1", "str2", ... }.ToList() // presents allowed values for C.Field1 as well as order
list = new List<C> { ob1, ob2, ... }
Question:
How can I perform Linq's
list.OrderBy(x => x.Field1)
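which will use the template above for the order (so objects with Field1 == "str1" come first, then objects with "str2", and so on)?
In LINQ to Objects, use Array.IndexOf (this assumes template is an array; for a List<string>, use template.IndexOf(x.Field1) instead):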
var ordered = list
.Select(x => new { Obj = x, Index = Array.IndexOf(template, x.Field1)})
.OrderBy(p => p.Index < 0 ? 1 : 0) // Items with missing text go to the end
.ThenBy(p => p.Index) // The actual ordering happens here
.Select(p => p.Obj); // Drop the index from the result
This wouldn't work in EF or LINQ to SQL, so you would need to bring objects into memory for sorting.
Note: The above assumes that the list is not exhaustive. If it is, a simpler query would be sufficient:
var ordered = list.OrderBy(x => Array.IndexOf(template, x.Field1));
I think IndexOf might work here:
list.OrderBy(_ => Array.IndexOf(template, _.Field1))
Please note that it will return -1 when the value is not present in template at all, which means such objects will come first. You'll have to handle this case. If your field is guaranteed to be there, it's fine.
As others have said, Array.IndexOf should do the job just fine. However, if template and/or list is long, it might be worthwhile transforming your template into a dictionary. Something like:
var templateDict = template.Select((item,idx) => new { item, idx })
.ToDictionary(k => k.item, v => v.idx);
(or you could just start by creating a dictionary instead of an array in the first place - it's more flexible when you need to reorder stuff)
This will give you a dictionary keyed off the string from template with the index in the original array as your value. Then you can sort like this:
var ordered = list.OrderBy(x => templateDict[x.Field1]);
Since lookups in a dictionary are O(1), this will scale better as template and list grow.
Note: The above code assumes all values of Field1 are present in template. If they are not, you would have to handle the case where x.Field1 isn't in templateDict.
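One way to handle values of Field1 that are missing from templateDict is TryGetValue with a fallback sort key. The sketch below sends unknown values to the end. The `TemplateOrdering` wrapper and the sample data are invented for illustration, and it assumes that Field1 is never null and that template contains no duplicate strings (ToDictionary would throw on duplicates).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class C
{
    public string Field1;
    public string Field2;
}

static class TemplateOrdering
{
    public static List<C> OrderByTemplate(IEnumerable<C> list, IEnumerable<string> template)
    {
        // Map each template string to its position.
        Dictionary<string, int> templateDict = template
            .Select((item, idx) => new { item, idx })
            .ToDictionary(k => k.item, v => v.idx);

        // Unknown values get int.MaxValue, so they sort after all known ones.
        // OrderBy is stable, so ties keep their original relative order.
        return list
            .OrderBy(x => templateDict.TryGetValue(x.Field1, out int idx) ? idx : int.MaxValue)
            .ToList();
    }

    static void Main()
    {
        var template = new[] { "str1", "str2" };
        var list = new List<C>
        {
            new C { Field1 = "str2" },
            new C { Field1 = "zzz" },
            new C { Field1 = "str1" },
            new C { Field1 = "str2" },
        };
        var ordered = OrderByTemplate(list, template);
        Console.WriteLine(string.Join(", ", ordered.Select(c => c.Field1)));
        // str1, str2, str2, zzz
    }
}
```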
var orderedList = list.OrderBy(d => Array.IndexOf(template, d.MachingColumnFromTempalate) < 0 ? int.MaxValue : Array.IndexOf(template, d.MachingColumnFromTempalate)).ToList();
I've actually written a method to do this before. Here's the source:
public static IOrderedEnumerable<T> OrderToMatch<T, TKey>(this IEnumerable<T> source, Func<T, TKey> sortKeySelector, IEnumerable<TKey> ordering)
{
var orderLookup = ordering
.Select((x, i) => new { key = x, index = i })
.ToDictionary(k => k.key, v => v.index);
if (!orderLookup.Any())
{
throw new ArgumentException("Ordering collection cannot be empty.", nameof(ordering));
}
T[] sourceArray = source.ToArray();
return sourceArray
.OrderBy(x =>
{
int index;
if (orderLookup.TryGetValue(sortKeySelector(x), out index))
{
return index;
}
return Int32.MaxValue;
})
.ThenBy(x => Array.IndexOf(sourceArray, x));
}
You can use it like this:
var ordered = list.OrderToMatch(x => x.Field1, template);
If you want to see the source, the unit tests, or the library it lives in, you can find it on GitHub. It's also available as a NuGet package.
I want to access the first, second, and third elements in a list. I can use the built-in .First() method to access the first element.
My code is as follows:
Dictionary<int, Tuple<int, int>> pList = new Dictionary<int, Tuple<int, int>>();
var categoryGroups = pList.Values.GroupBy(t => t.Item1);
var highestCount = categoryGroups
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.First();
var 2ndHighestCount = categoryGroups
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.GetNth(1);
var 3rdHighestCount = categoryGroups
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.GetNth(2);
twObjClus.WriteLine("--------------------Cluster Label------------------");
twObjClus.WriteLine("\n");
twObjClus.WriteLine("Category:{0} Count:{1}",
highestCount.Category, highestCount.Count);
twObjClus.WriteLine("\n");
twObjClus.WriteLine("Category:{0} Count:{1}",
2ndHighestCount.Category, 2ndHighestCount.Count);
// Error here i.e. "Can't use 2ndHighestCount.Category here"
twObjClus.WriteLine("\n");
twObjClus.WriteLine("Category:{0} Count:{1}",
3rdHighestCount.Category, 3rdHighestCount.Count);
// Error here i.e. "Can't use 3rdHighestCount.Category here"
twObjClus.WriteLine("\n");
I have written extension method GetNth() as:
public static IEnumerable<T> GetNth<T>(this IEnumerable<T> list, int n)
{
if (n < 0)
throw new ArgumentOutOfRangeException("n");
if (n > 0){
int c = 0;
foreach (var e in list){
if (c % n == 0)
yield return e;
c++;
}
}
}
Can I write extension methods such as .Second() and .Third(), similar to the built-in method .First(), to access the second and third elements?
If what you're looking for is a single object, you don't need to write it yourself, because a built-in method for that already exists.
foo.ElementAt(1)
will get you the second element, etc. It works similarly to First and returns a single object.
Your GetNth method seems to be returning every Nth element, instead of just the element at index N. I'm assuming that's not what you want since you said you wanted something similar to First.
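If the .Second() and .Third() names are still wanted, they can be thin wrappers around ElementAt. This is a minimal sketch; `Second` and `Third` are hypothetical helpers and are not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SequenceExtensions
{
    // Hypothetical helpers that return a single element, like First().
    // Both throw ArgumentOutOfRangeException if the sequence is too short.
    public static T Second<T>(this IEnumerable<T> source) => source.ElementAt(1);
    public static T Third<T>(this IEnumerable<T> source) => source.ElementAt(2);
}

static class SecondThirdDemo
{
    static void Main()
    {
        var values = new List<int> { 10, 20, 30 };
        Console.WriteLine(values.First());  // 10
        Console.WriteLine(values.Second()); // 20
        Console.WriteLine(values.Third());  // 30

        // ElementAtOrDefault returns default(T) instead of throwing
        // when the index is out of range.
        Console.WriteLine(values.ElementAtOrDefault(5)); // 0
    }
}
```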
Since @Eser gave up and doesn't want to post the correct way as an answer, here goes:
You should rather do the transforms once, collect the results into an array, and then get the three elements from that. The way you're doing it right now results in code duplication as well as grouping and ordering being done multiple times, which is inefficient.
var highestCounts = pList.Values
.GroupBy(t => t.Item1)
.OrderByDescending(g => g.Count())
.Select(g => new { Category = g.Key, Count = g.Count() })
.Take(3)
.ToArray();
// highestCounts[0] is the first count
// highestCounts[1] is the second
// highestCounts[2] is the third
// make sure to handle cases where there are fewer than 3 items!
As an FYI, if you some day need just the Nth value and not the top three, you can use .ElementAt to access values at an arbitrary index.
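One way to cope with fewer than three groups is to loop over whatever Take(3) returns instead of indexing into the array. The sketch below uses invented sample data and a hypothetical `TopCategories.Top` helper, and writes to the console rather than to twObjClus.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TopCategories
{
    // Returns up to `count` (category, occurrences) pairs, most frequent first.
    public static List<(int Category, int Count)> Top(
        IEnumerable<Tuple<int, int>> values, int count) =>
        values.GroupBy(t => t.Item1)
              .Select(g => (Category: g.Key, Count: g.Count()))
              .OrderByDescending(x => x.Count)
              .Take(count)
              .ToList();

    static void Main()
    {
        var pList = new Dictionary<int, Tuple<int, int>>
        {
            { 1, Tuple.Create(7, 0) },
            { 2, Tuple.Create(7, 1) },
            { 3, Tuple.Create(9, 2) },
        };

        // Only two categories exist, so only two lines are printed.
        foreach (var entry in Top(pList.Values, 3))
        {
            Console.WriteLine("Category:{0} Count:{1}", entry.Category, entry.Count);
        }
        // Category:7 Count:2
        // Category:9 Count:1
    }
}
```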
I have a list of bool, and a list of strings. I want to use IEnumerable.Zip to combine the lists, so if the value at each index of the first list is true, the result contains the corresponding item from the second list.
In other words:
List<bool> listA = new List<bool> {true, false, true, false};
List<string> listB = new List<string> {"alpha", "beta", "gamma", "delta"};
IEnumerable<string> result = listA.Zip(listB, [something]);
//result contains "alpha", "gamma"
The simplest solution I could come up with is:
listA.Zip(listB, (a, b) => a ? b : null).Where(a => a != null);
...but I suspect there's a simpler way to do this. Is there?
I think this is simpler:
listA
.Zip(listB, (a, b) => new { a, b } )
.Where(pair => pair.a)
.Select(pair => pair.b);
That logically separates the steps. First, combine the lists. Next, filter. No funky conditionals, just read it top to bottom and immediately get it.
You can even name it properly:
listA
.Zip(listB, (shouldIncludeValue, value) => new { shouldIncludeValue, value } )
.Where(pair => pair.shouldIncludeValue)
.Select(pair => pair.value);
I love self-documenting, obvious code.
This is as short as I could get it:
var items = listB.Where((item, index) => listA[index]);
Where has an overload that provides the index. You can use that to pull the corresponding item in the bool list.
listA.Zip(listB, (a, b) => new { a, b }).Where(x => x.a).Select(x => x.b);
It uses an anonymous type to hold the intermediate results of Zip.
You don't need to use Zip if you can index into listA:
var res = listB.Where((a, idx) => listA[idx]);
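The Zip and indexed Where approaches give the same result when both lists have the same length, but they behave differently otherwise. The sketch below puts the two side by side; `MaskFilter`, `ByZip` and `ByIndex` are made-up names. Zip silently stops at the end of the shorter sequence, whereas the indexed version throws if listB is longer than listA.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MaskFilter
{
    // Keeps the values whose flag at the same position is true.
    // Zip stops at the end of the shorter sequence.
    public static List<string> ByZip(IEnumerable<bool> flags, IEnumerable<string> values) =>
        flags.Zip(values, (flag, value) => (Include: flag, Value: value))
             .Where(pair => pair.Include)
             .Select(pair => pair.Value)
             .ToList();

    // Same filter via the indexed Where overload.
    // Throws if values has more elements than flags.
    public static List<string> ByIndex(IList<bool> flags, IEnumerable<string> values) =>
        values.Where((value, index) => flags[index]).ToList();

    static void Main()
    {
        var listA = new List<bool> { true, false, true, false };
        var listB = new List<string> { "alpha", "beta", "gamma", "delta" };

        Console.WriteLine(string.Join(", ", ByZip(listA, listB)));   // alpha, gamma
        Console.WriteLine(string.Join(", ", ByIndex(listA, listB))); // alpha, gamma

        // With an extra value, Zip simply ignores the unmatched tail.
        listB.Add("epsilon");
        Console.WriteLine(string.Join(", ", ByZip(listA, listB)));   // alpha, gamma
    }
}
```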