Use Linq to find consecutively repeating elements - c#

Let's assume I have a list with objects of type Value. Value has a Name property:
private List<Value> values = new List<Value> {
new Value { Id = 0, Name = "Hello" },
new Value { Id = 1, Name = "World" },
new Value { Id = 2, Name = "World" },
new Value { Id = 3, Name = "Hello" },
new Value { Id = 4, Name = "a" },
new Value { Id = 5, Name = "a" },
};
Now I want to get a list of all "repeating" values (elements where the name property was identical with the name property of the previous element).
In this example I want a list with the two elements "world" and "a" (id = 2 and 5) to be returned.
Is this even possible with LINQ?
Of course I could do something like this:
List<Value> tempValues = new List<Value>();
String lastName = String.Empty;
foreach (var v in values)
{
if (v.Name == lastName) tempValues.Add(v);
lastName = v.Name;
}
but since I want to use this query in a more complex context, maybe there is a "linqish" solution.

There won't be anything built in along those lines, but if you need this frequently you could roll something bespoke but fairly generic:
static IEnumerable<TSource> WhereRepeated<TSource>(
this IEnumerable<TSource> source)
{
return WhereRepeated<TSource,TSource>(source, x => x);
}
static IEnumerable<TSource> WhereRepeated<TSource, TValue>(
this IEnumerable<TSource> source, Func<TSource, TValue> selector)
{
using (var iter = source.GetEnumerator())
{
if (iter.MoveNext())
{
var comparer = EqualityComparer<TValue>.Default;
TValue lastValue = selector(iter.Current);
while (iter.MoveNext())
{
TValue currentValue = selector(iter.Current);
if (comparer.Equals(lastValue, currentValue))
{
yield return iter.Current;
}
lastValue = currentValue;
}
}
}
}
Usage:
foreach (Value value in values.WhereRepeated(x => x.Name))
{
Console.WriteLine(value.Name);
}
You might want to think about what to do with triplets etc - currently everything except the first will be yielded (which matches your description), but that might not be quite right.
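To make the triplet question concrete, here is a hedged sketch of a variant that yields only one element per run of repeats. The name `WhereRepeatedOnce` is made up for this illustration; it is not part of LINQ or of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RepeatExtensions
{
    // Hypothetical variant: yields the second element of each run of equal
    // keys, and nothing more for that run, however long it is.
    public static IEnumerable<TSource> WhereRepeatedOnce<TSource, TValue>(
        this IEnumerable<TSource> source, Func<TSource, TValue> selector)
    {
        var comparer = EqualityComparer<TValue>.Default;
        bool hasLast = false;
        bool alreadyYielded = false;
        TValue lastValue = default(TValue);
        foreach (TSource item in source)
        {
            TValue currentValue = selector(item);
            if (hasLast && comparer.Equals(lastValue, currentValue))
            {
                if (!alreadyYielded)
                {
                    alreadyYielded = true;
                    yield return item;
                }
            }
            else
            {
                alreadyYielded = false; // a new run starts here
            }
            lastValue = currentValue;
            hasLast = true;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var names = new[] { "a", "a", "a", "b", "c", "c" };
        // WhereRepeated from the answer would yield a, a, c for this input;
        // this variant yields one element per run.
        Console.WriteLine(string.Join(", ", names.WhereRepeatedOnce(x => x)));
        // prints "a, c"
    }
}
```

Which behaviour is right depends on whether each extra repeat counts as a separate hit in your context.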

You could implement a Zip extension, then Zip your list with .Skip(1) and then Select the rows that match.
This should work and be fairly easy to maintain:
values
.Skip(1)
.Zip(values, (first,second) => first.Name==second.Name?first:null)
.Where(i => i != null);
The slight disadvantage of this method is that you iterate through the list twice.
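On .NET 4.0 and later, `Enumerable.Zip` is built in, so no custom extension is needed. A minimal sketch of the same idea follows. The `Value` class is assumed to look like the one in the question, and projecting to an anonymous pair is a small variation that avoids the `null` placeholder; `ZipDemo` and `Repeated` are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the question's Value class.
public class Value
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ZipDemo
{
    // Pairs each element with its successor and keeps the successor
    // whenever both names match.
    public static List<Value> Repeated(IEnumerable<Value> values)
    {
        return values
            .Zip(values.Skip(1), (previous, current) => new { previous, current })
            .Where(pair => pair.previous.Name == pair.current.Name)
            .Select(pair => pair.current)
            .ToList();
    }

    public static void Main()
    {
        var values = new List<Value>
        {
            new Value { Id = 0, Name = "Hello" },
            new Value { Id = 1, Name = "World" },
            new Value { Id = 2, Name = "World" },
            new Value { Id = 3, Name = "Hello" },
            new Value { Id = 4, Name = "a" },
            new Value { Id = 5, Name = "a" },
        };
        foreach (var v in Repeated(values))
            Console.WriteLine(v.Id + " " + v.Name);
        // prints "2 World" then "5 a"
    }
}
```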

I think this would work (untested). It gives you both the repeated element and its index in the original list. For multiple repeats you could traverse this list and check for consecutive indices.
var query = values.Select( (v,i) => new { Value = v, Index = i } )
.Where( x => x.Index > 0 && x.Value.Name == values[x.Index - 1].Name );

Here's another simple approach that should work if the IDs are always sequential as in your sample:
var data = from v2 in values
join v1 in values on v2.Id equals v1.Id + 1
where v1.Name == v2.Name
select v2;

I know this question is ancient but I was just working on the same thing so ....
static class utils
{
public static IEnumerable<T> FindConsecutive<T>(this IEnumerable<T> data, Func<T,T,bool> comparison)
{
return Enumerable.Range(0, data.Count() - 1)
.Select( i => new { a=data.ElementAt(i), b=data.ElementAt(i+1)})
.Where(n => comparison(n.a, n.b)).Select(n => n.a);
}
}
Should work for anything - just provide a function to compare the elements

You could use the GroupBy extension to do this.

Something like this
var dupsNames =
from v in values
group v by v.Name into g
where g.Count() > 1 // If a group has only one element, just ignore it
select g.Key;
should work. You can then use the results in a second query:
dupsNames.Select( d => values.Where( v => v.Name == d ) )
This should return a grouping with key=name, values = { elements with name }
Disclaimer: I did not test the above, so I may be way off.

Related

First found duplicate element using LINQ (Not contiguous)

How do I print the first duplicate elements from an array?
var arr = new int[]{ 3, 2, 5, 1, 5, 4, 2, 15 };
Currently this method prints 2 instead of 5.
public int FirstDuplicate(int[] arr)
{
var firstDup = arr
.GroupBy(x => x)
.Where(grp => grp.Count() == 2)
.Select(grp => grp.Key)
.FirstOrDefault();
if (firstDup > 0) return firstDup;
return -1;
}
You can write an extension method that will return all duplicates from an IEnumerable<T> like
public static class EnumerableExtensions
{
public static IEnumerable<T> Duplicates<T>( this IEnumerable<T> source )
{
var hashset = new HashSet<T>();
foreach ( var item in source )
{
if ( hashset.Contains(item) )
yield return item;
else
hashset.Add(item);
}
}
}
and then use it
var arr = new int[]{ 3, 2, 5, 5, 4, 2, 15 };
var firstDuplicate = arr.Duplicates().First();
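The reason the GroupBy version prints 2 is that groups are produced in the order in which each key first appears, and 2 first appears before 5. The HashSet approach reports a value at the moment its second copy is reached, and the second 5 comes before the second 2. A small sketch comparing the two on the question's array (the class and method names here are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstDuplicateDemo
{
    // Groups come out in order of each key's first appearance.
    public static int ByGroupBy(int[] arr)
    {
        return arr.GroupBy(x => x)
                  .Where(g => g.Count() > 1)
                  .Select(g => g.Key)
                  .DefaultIfEmpty(-1)
                  .First();
    }

    // Reports a value as soon as its second copy is reached.
    public static int BySeenSet(int[] arr)
    {
        var seen = new HashSet<int>();
        foreach (int item in arr)
        {
            if (!seen.Add(item))
                return item;
        }
        return -1;
    }

    public static void Main()
    {
        var arr = new[] { 3, 2, 5, 1, 5, 4, 2, 15 };
        Console.WriteLine(ByGroupBy(arr)); // prints 2
        Console.WriteLine(BySeenSet(arr)); // prints 5
    }
}
```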
This worked for me. I took advantage of comparing values with array indexes, the Distinct() method, and the first element of the resulting array.
var arr = new int[] { 3, 2, 5, 5, 4, 2, 15 };
var adjacentDuplicate = arr.Skip(1) // Skip first
.Where((value,index) => value == arr[index])
.Distinct()
.ToArray(); // Convert to array
if (adjacentDuplicate.Any())
{
Console.WriteLine(adjacentDuplicate[0]); // Print first duplicate
}
else
{
// No duplicates found.
}
Based on Sir Rufo's answer, I would make two extensions
public static IEnumerable<T> Duplicates<T>(this IEnumerable<T> source)
{
var hashset = new HashSet<T>();
foreach (var item in source)
{
if (!hashset.Add(item))
{
yield return item;
}
}
}
public static IEnumerable<T?> AsNullable<T>(this IEnumerable<T> source) where T : struct
{
return source.Select(x => (T?)x);
}
You can use it like
var duplicate = arr
.Duplicates()
.AsNullable()
.FirstOrDefault();
AsNullable converts int into int? without hard-coding the type. When the result is null, there is no duplicate. You can use it in more situations, such as calculating the Max of a potentially empty sequence of non-nullable values (you can define it for IQueryable too). The advantage of this extension is that when you use it, you know for sure that null is not a valid value in the source, so you will not shoot yourself in the foot if null suddenly becomes a possible value.

How do I get previous element of a List<T>

I have a list like
var myList = new List<object> {
new { Day = "Sunday", ID = 15 },
new { Day = "Monday", ID = 20 },
new { Day = "Tuesday", ID = 80 }
};
Now I would like to get the previous day of a given ID.
E.g. 80 leads to Monday, and Sunday should be the result for 20. The list is ordered by ID!
Is there an easy way to determine the day value? Ideally it would be a LINQ solution.
var result = myList.LastOrDefault(item => item.ID < 80);
A quick and dirty way:
myList.Where(c => c.ID < currentId).OrderByDescending(c => c.ID)
.Select(c => c.Day).FirstOrDefault();
myList.TakeWhile(o => o.ID < givenID).Select(o => o.Day).LastOrDefault();
This is assuming myList is ordered. Otherwise you can just throw in an OrderBy()
(Based on Calculate difference from previous item with LINQ)
for a pure linq way to find the prev value
var value = myList.SelectWithPrevious((prev, cur) => new { prev = prev, cur= cur}).Where((w) => w.cur.ID == 80).First();
Console.WriteLine(value.prev.Day);
Extension
public static IEnumerable<TResult> SelectWithPrevious<TSource, TResult> (this IEnumerable<TSource> source,Func<TSource, TSource, TResult> projection)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
TSource previous = iterator.Current;
while (iterator.MoveNext())
{
yield return projection(previous, iterator.Current);
previous = iterator.Current;
}
}
}
This does give more than you needed. But, it may help in future.
var item = myList.Single(l => l.ID == 80);
var i = myList.IndexOf(item);
var j = (i - 1 + myList.Count) % myList.Count;
var prevItem = myList[j];
This will get the item that you're searching for, find its index in the list, subtract one (adding Count before the modulo lets the index wrap around to the end of the list, since -1 % Count is -1 in C#) and then get the previous item from the list by index.
Edit: The use of the index rather than something like FindLast allows you to not need the ID values to be in ascending order with the days. e.g. Monday could be 100 and Tuesday 15 and it wouldn't make a difference.
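Because C#'s `%` operator keeps the sign of the left operand, wrap-around index arithmetic needs the count added before taking the remainder. A tiny sketch of that detail; the helper name `PreviousIndex` is hypothetical:

```csharp
using System;

public static class IndexMath
{
    // Hypothetical helper: index of the previous element, wrapping
    // from 0 back to count - 1. Assumes 0 <= index < count.
    public static int PreviousIndex(int index, int count)
    {
        return (index - 1 + count) % count;
    }

    public static void Main()
    {
        Console.WriteLine(-1 % 3);              // prints -1
        Console.WriteLine(PreviousIndex(0, 3)); // prints 2
        Console.WriteLine(PreviousIndex(2, 3)); // prints 1
    }
}
```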
Do not use var and object. Try this:
private class ItemList
{
/// <summary>
/// Day
/// </summary>
public string Day { get; set; }
/// <summary>
/// ID
/// </summary>
public int ID { get; set; }
}
And:
List<ItemList> myList = new List<ItemList> {
new ItemList() { Day="S", ID=10 },
new ItemList() { Day="M", ID=20 },
new ItemList() { Day="T", ID=30 },
};
foreach(ItemList key in myList)
{
ItemList result = myList.FindLast(x=>x.ID<key.ID);
if(result!=null)
{
MessageBox.Show("Prev ID: " + result.ID.ToString());
}
}

How to Order an out of sequence sequence

Consider this list of elements which has three properties, Value and Sequence Number and Group:
Value Sequence Number Group
Header, 1, null
Invoice, 2, 1
Invoice Line, 3, 1
Trailer, 4, null
The goal is only to sort by the sequence number. The value is irrelevant.
In the above, the obvious answer is to order by sequence number.
However, elements can repeat:
Header, 1, null
InvoiceA, 2, 1
Line Item, 3, 1
InvoiceB, 2, 2
Line Item, 3, 2
Trailer, 4, null
The above is the desired sequence. What Linq statement will produce the above?
Sorting by Sequence no longer works. Sorting by Group, then Sequence does not work.
The application of this is in EDI where the order of the data is significant.
So the first "trick" here is that you want all items with a null group to be separate groups, rather than having all null items combined into a single group.
This is actually fairly easy. We can just create an IEqualityComparer that compares items based on some other comparer, but that always considers two null items to be different, instead of being the same (typically two null items would be considered "equal").
public class SeparateNullComparer<T> : IEqualityComparer<T>
{
private IEqualityComparer<T> comparer;
public SeparateNullComparer(IEqualityComparer<T> comparer = null)
{
this.comparer = comparer ?? EqualityComparer<T>.Default;
}
public bool Equals(T x, T y)
{
if (x == null || y == null)
return false;
return comparer.Equals(x, y);
}
public int GetHashCode(T obj)
{
return comparer.GetHashCode(obj);
}
}
We can now group the items using this comparer so that all non-null items will be grouped together, whereas all of the null items will have their own groups.
Now how do we order the groups? We need to order these groups based on their sequence numbers, but we have a sequence of them, not just one, so we need a way of comparing two sequences to see which sequence comes first. We do this by checking the first item in each sequence, and then continually checking the next until one comes first or one ends and the other doesn't:
public class SequenceComparer<T> : IComparer<IEnumerable<T>>
{
private IComparer<T> comparer;
public SequenceComparer(IComparer<T> comparer = null)
{
this.comparer = comparer ?? Comparer<T>.Default;
}
public int Compare(IEnumerable<T> x, IEnumerable<T> y)
{
using (var first = x.GetEnumerator())
using (var second = y.GetEnumerator())
{
while (true)
{
var firstHasMore = first.MoveNext();
var secondHasMore = second.MoveNext();
if (!firstHasMore && !secondHasMore)
return 0;
var lengthComparison = firstHasMore.CompareTo(secondHasMore);
if (lengthComparison != 0)
return lengthComparison;
var nextComparison = comparer.Compare(first.Current, second.Current);
if (nextComparison != 0)
return nextComparison;
}
}
}
}
Combine that with flattening all of the groups back out when we're done, and we just need to put it all together:
var query = data.GroupBy(item => item.Group, new SeparateNullComparer<int?>())
.Select(group => group.OrderBy(item => item.SequenceNumber)
.ToList())
.OrderBy(group => group, new SequenceComparer<Foo>())
.ThenBy(group => group.First().Group)
.SelectMany(x => x);
You can also rely on the fact that GroupBy maintains the original order of items within groups, allowing you to order the data by SequenceNumber before grouping, instead of after. It'll do basically the same thing. It turns out to be a prettier query, but you just need to "know" that GroupBy maintains the proper ordering:
var query = data.OrderBy(item => item.SequenceNumber)
.GroupBy(item => item.Group, new SeparateNullComparer<int?>())
.OrderBy(group => group, new SequenceComparer<Foo>())
.ThenBy(group => group.Key)
.SelectMany(x => x);
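For comparison, here is a hedged sketch of a simpler variation that needs no custom comparers. It is not the approach above: it gives every null-group item its own synthetic key and positions each group only by its lowest sequence number, which is enough for the sample data but is an assumption about real EDI input. The `Segment` class and all member names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; the question does not name its class.
public class Segment
{
    public string Value { get; set; }
    public int SequenceNumber { get; set; }
    public int? Group { get; set; }
}

public static class SegmentOrdering
{
    public static List<Segment> Order(IEnumerable<Segment> data)
    {
        return data
            .Select((item, index) => new { item, index })
            // Null-group items get a unique key so each forms its own group.
            .GroupBy(x => x.item.Group.HasValue
                ? "g" + x.item.Group.Value
                : "n" + x.index)
            // Place each group by its lowest sequence number, then by group id.
            .OrderBy(g => g.Min(x => x.item.SequenceNumber))
            .ThenBy(g => g.First().item.Group)
            .SelectMany(g => g.OrderBy(x => x.item.SequenceNumber))
            .Select(x => x.item)
            .ToList();
    }

    public static void Main()
    {
        var data = new List<Segment>
        {
            new Segment { Value = "Trailer", SequenceNumber = 4, Group = null },
            new Segment { Value = "Line Item", SequenceNumber = 3, Group = 2 },
            new Segment { Value = "InvoiceA", SequenceNumber = 2, Group = 1 },
            new Segment { Value = "Header", SequenceNumber = 1, Group = null },
            new Segment { Value = "InvoiceB", SequenceNumber = 2, Group = 2 },
            new Segment { Value = "Line Item", SequenceNumber = 3, Group = 1 },
        };
        Console.WriteLine(string.Join(", ", Order(data).Select(s => s.Value)));
        // prints "Header, InvoiceA, Line Item, InvoiceB, Line Item, Trailer"
    }
}
```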
If it doesn't have to be a linq query, you could write a single comparer that looks like this:
public class ValSeqGroupComparer : IComparer<ValSeqGroup>
{
public int Compare(ValSeqGroup x, ValSeqGroup y)
{
if (x == y) return 0;
// If only one has a group or there is no group in either
if (x.Group.HasValue ^ y.Group.HasValue || !x.Group.HasValue)
return x.Seq.CompareTo(y.Seq);
if (x.Group.Value != y.Group.Value)
return x.Group.Value.CompareTo(y.Group.Value);
return x.Seq.CompareTo(y.Seq);
}
}
Then using it like this:
[TestMethod]
public void One()
{
List<ValSeqGroup> items = new List<ValSeqGroup>()
{
new ValSeqGroup("x", 1, null),
new ValSeqGroup("x", 4, null),
new ValSeqGroup("x", 2, 1),
new ValSeqGroup("x", 2, 2),
new ValSeqGroup("x", 3, 1),
new ValSeqGroup("x", 3, 2)
};
items.Sort(new ValSeqGroupComparer());
foreach (var item in items)
{
Console.WriteLine("{0} {1} {2}", item.Value, item.Seq,item.Group);
}
}
You can achieve this by first sorting the elements by sequence number (OrderBy(x => x.SequenceNumber)).
After that, sort the elements that have a group number by group (.Where(x => x.Group != null).OrderBy(x => x.Group)).
Finally, insert the elements with a null group back into the list at the corresponding index.
var elements = new List<Element>
{
new Element{SequenceNumber = 1, Group = null},
new Element{SequenceNumber = 4, Group = null},
new Element{SequenceNumber = 3, Group = 1},
new Element{SequenceNumber = 3, Group = 3},
new Element{SequenceNumber = 3, Group = 2},
new Element{SequenceNumber = 2, Group = 3},
new Element{SequenceNumber = 2, Group = 1},
new Element{SequenceNumber = 2, Group = 2}
};
// first sort
var sortedElements = elements.OrderBy(x => x.SequenceNumber).ToList();
// save null elements
var elementsWithNull = sortedElements
.Where(x => x.Group == null).ToList();
// group sorting
sortedElements = sortedElements
.Where(x => x.Group != null)
.OrderBy(x => x.Group).ToList();
// insert elements with null in sorted list
foreach (var element in elementsWithNull)
{
var firstIndexOfSequence = 0;
for (firstIndexOfSequence = 0;firstIndexOfSequence < sortedElements.Count && sortedElements[firstIndexOfSequence].SequenceNumber < element.SequenceNumber; firstIndexOfSequence++)
{
// just to get index of the element with null group to know where to insert
}
sortedElements.Insert(firstIndexOfSequence, element);
}

All possible collections of present giving permutations

So Christmas is coming up and every year my family pulls names from a hat for who should buy present for who, and invariably there are concerns, mostly around spouses buying presents for each other.
Assume the families are like so:
List<List<string>> families = new List<List<string>> ()
{
new List<string>() { "A1", "A2" },
new List<string>() { "B1", "B2" },
new List<string>() { "C1", "C2" },
new List<string>() { "D1", "D2" }
};
People in family A can't buy for the others in their family, likewise for families B, C, D.
We can easily get a family from a given person with:
public static IEnumerable<string> FamilyOf(this List<List<string>> families, string person)
{
return families.Where(family => family.Contains(person)).First();
}
... and we can get all valid pairs with:
var everyone = families.SelectMany(family => family);
var pairs = from giver in everyone
from receiver in everyone
where !families.FamilyOf(giver).Contains(receiver)
select Tuple.Create(giver, receiver);
How can I turn this into the possible collections of permutations of valid givers/receivers that includes everyone? From that I'll just select a random collection.
I wrote a bit of code to solve your problem, but it can sometimes throw an exception, when it gets a bit "unlucky" with picking the pairs. For example if the algorithm pairs A1B2 B1C2 C1A2 -> so only D1 and D2 are left, which causes an exception since it doesn't meet your pairing requirement anymore.
Anyway here is the code, which you might want to expand to prevent it from throwing an exception:
var everyone = families.SelectMany(family => family).ToList();
everyone.Shuffle();
var randPairs = families.SelectMany(family => family)
.Select(p => new {
Giver = p,
Receiver = everyone.PopRandom(x => !p.Contains(x[0]))
});
And the two extension methods for IList:
public static T PopRandom<T>(this IList<T> list, Func<T, bool> predicate)
{
var predicatedList = list.Where(x => predicate(x));
int count = predicatedList.Count();
if (count == 0)
{
throw new Exception();
}
T item = predicatedList.ElementAt(Rand.Next(count));
while (item != null && !predicate(item))
{
item = predicatedList.ElementAt(Rand.Next(list.Count));
}
list.Remove(item);
return item;
}
public static void Shuffle<T>(this IList<T> list)
{
int n = list.Count;
while (n > 1)
{
n--;
int k = Rand.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
}
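One way to avoid the "unlucky" exception is to retry: shuffle the receivers and accept the shuffle only if nobody is paired with a member of their own family. The following is a hedged sketch of that rejection-sampling idea. It picks one random valid assignment and does not enumerate all permutations. The names `SecretSanta` and `Draw` and the `maxAttempts` parameter are invented for this example, and it assumes every person's name is unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SecretSanta
{
    public static Dictionary<string, string> Draw(
        List<List<string>> families, Random random, int maxAttempts = 10000)
    {
        // Map every person to the index of their family.
        var familyOf = new Dictionary<string, int>();
        for (int f = 0; f < families.Count; f++)
        {
            foreach (string person in families[f])
                familyOf[person] = f;
        }

        List<string> givers = familyOf.Keys.ToList();
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            // Fisher-Yates shuffle of a copy of the givers.
            List<string> receivers = givers.ToList();
            for (int n = receivers.Count - 1; n > 0; n--)
            {
                int k = random.Next(n + 1);
                string tmp = receivers[k];
                receivers[k] = receivers[n];
                receivers[n] = tmp;
            }

            // Reject the shuffle if any pair shares a family
            // (this also rules out giving to yourself).
            bool valid = true;
            for (int i = 0; i < givers.Count && valid; i++)
                valid = familyOf[givers[i]] != familyOf[receivers[i]];

            if (valid)
            {
                return givers.Zip(receivers, (g, r) => new { g, r })
                             .ToDictionary(p => p.g, p => p.r);
            }
        }
        throw new InvalidOperationException("No valid assignment found.");
    }

    public static void Main()
    {
        var families = new List<List<string>>
        {
            new List<string> { "A1", "A2" },
            new List<string> { "B1", "B2" },
            new List<string> { "C1", "C2" },
            new List<string> { "D1", "D2" },
        };
        foreach (var pair in Draw(families, new Random()))
            Console.WriteLine(pair.Key + " -> " + pair.Value);
    }
}
```

The attempt limit is there so the method fails loudly when no valid assignment exists, for example when one family makes up more than half of the group.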

Find the List index of the object containing the closest property value

How can I find the List index of the object containing the closest property value?
Sample, class MyData contains a property Position. class MyDataHandler has a List of MyData and the positions are: 1, 3, 14, 15, 22.
MyDataHandler has a method called GetClosestIndexAt. If the input value is 13, the method must return index 2.
Sample code:
public class MyData
{
public double Position { get; set; }
public string Name { get; set; }
}
public class MyDataHandler
{
private List<MyData> myDataList = new List<MyData>();
public MyDataHandler()
{
FillMyData(myDataList);
}
public int GetClosestIndexAt(double position)
{
int index = -1;
//How to get the index of the closest MyDataList.Position to position value.
//index = ?????
return index;
}
private void FillMyData(List<MyData> MyDataList)
{
//fill the data...
}
}
You can do it using LINQ, like this:
var res = myDataList
.Select((v, i) => new {Position = v.Position, Index = i}) // Pair up the position and the index
.OrderBy(p => Math.Abs(p.Position - position)) // Order by the distance
.First().Index; // Grab the index of the first item
The idea is to pair the position with its index in the list, order by the distance from the specific position, grab the first item, and get its index.
You need to deal with the situation when there are no elements in myDataList separately.
Use overloaded Enumerable.Select method which projects each element of a sequence into a new form by incorporating the element's index:
myDataList.Select((d,i) => new { Position = d.Position, Index = i })
.OrderBy(x => Math.Abs(x.Position - position))
.Select(x => x.Index)
.DefaultIfEmpty(-1) // return -1 if there is no data in myDataList
.First();
Better solution with MinBy operator of MoreLinq (available from NuGet):
public int GetClosestIndexAt(double position)
{
if (!myDataList.Any())
return -1;
return myDataList.Select((d,i) => new { Position = d.Position, Index = i })
.MinBy(x => Math.Abs(x.Position - position))
.Index;
}
You can create your own MinBy extension if you don't want to use a library:
public static TSource MinBy<TSource, TKey>(
this IEnumerable<TSource> source, Func<TSource, TKey> selector)
{
using (IEnumerator<TSource> sourceIterator = source.GetEnumerator())
{
if (!sourceIterator.MoveNext())
throw new InvalidOperationException("Empty sequence");
var comparer = Comparer<TKey>.Default;
TSource min = sourceIterator.Current;
TKey minKey = selector(min);
while (sourceIterator.MoveNext())
{
TSource current = sourceIterator.Current;
TKey currentKey = selector(current);
if (comparer.Compare(currentKey, minKey) >= 0)
continue;
min = current;
minKey = currentKey;
}
return min;
}
}
As I said in the comments, I believe the most efficient way is to avoid unnecessarily sorting the whole data set just to get the first element. We can select it by searching for the element with the minimum difference, calculated separately. It requires two list iterations but no sorting. Given:
var myDataList = new List<MyData>()
{
new MyData() { Name = "Name1", Position = 1.0 },
new MyData() { Name = "Name3", Position = 3.0 },
new MyData() { Name = "Name14", Position = 14.0 },
new MyData() { Name = "Name15", Position = 15.0 },
new MyData() { Name = "Name22", Position = 22.0 },
};
double position = 13.0;
you can write:
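// Compute the minimum distance once, then find the first element that has it.
var minDiff = myDataList.Min(md => Math.Abs(md.Position - position));
var result =
myDataList.Select((md, index) => new
{
Index = index,
Diff = Math.Abs(md.Position - position)
})
.Where(a => a.Diff == minDiff)
.First()
.Index;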
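If LINQ is not a requirement, a plain loop finds the closest index in a single pass without sorting. This is only a sketch; `ClosestIndexDemo` and `ClosestIndex` are hypothetical names, and the method works on raw positions rather than on the `MyData` objects from the question.

```csharp
using System;
using System.Collections.Generic;

public static class ClosestIndexDemo
{
    // Returns the index of the position nearest to target, or -1 for an
    // empty list. On ties, the earliest index wins.
    public static int ClosestIndex(IReadOnlyList<double> positions, double target)
    {
        int best = -1;
        double bestDiff = double.PositiveInfinity;
        for (int i = 0; i < positions.Count; i++)
        {
            double diff = Math.Abs(positions[i] - target);
            if (diff < bestDiff)
            {
                bestDiff = diff;
                best = i;
            }
        }
        return best;
    }

    public static void Main()
    {
        var positions = new double[] { 1, 3, 14, 15, 22 };
        Console.WriteLine(ClosestIndex(positions, 13)); // prints 2
    }
}
```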
