How to Order an out of sequence sequence - c#

Consider this list of elements, each of which has three properties: Value, Sequence Number, and Group.
Value, Sequence Number, Group
Header, 1, null
Invoice, 2, 1
Invoice Line, 3, 1
Trailer, 4, null
The goal is only to sort by the sequence number. The value is irrelevant.
In the above, the obvious answer is to order by sequence number.
However, elements can repeat:
Header, 1, null
InvoiceA, 2, 1
Line Item, 3, 1
InvoiceB, 2, 2
Line Item, 3, 2
Trailer, 4, null
The above is the desired sequence. What Linq statement will produce the above?
Sorting by Sequence no longer works. Sorting by Group, then Sequence does not work.
The application of this is in EDI where the order of the data is significant.

So the first "trick" here is that you want all items with a null group to be separate groups, rather than having all null items combined into a single group.
This is actually fairly easy. We can just create an IEqualityComparer that compares items based on some other comparer, but that always considers two null items to be different, instead of being the same (typically two null items would be considered "equal").
public class SeparateNullComparer<T> : IEqualityComparer<T>
{
    private IEqualityComparer<T> comparer;
    public SeparateNullComparer(IEqualityComparer<T> comparer = null)
    {
        this.comparer = comparer ?? EqualityComparer<T>.Default;
    }
    public bool Equals(T x, T y)
    {
        // A null never equals anything, not even another null.
        if (x == null || y == null)
            return false;
        return comparer.Equals(x, y);
    }
    public int GetHashCode(T obj)
    {
        return comparer.GetHashCode(obj);
    }
}
We can now group the items using this comparer, so that items sharing the same non-null group are grouped together, whereas each item with a null group ends up in a group of its own.
Now how do we order the groups? We need to order these groups based on their sequence numbers, but we have a sequence of them, not just one, so we need a way of comparing two sequences to see which sequence comes first. We do this by checking the first item in each sequence, and then continually checking the next until one comes first or one ends and the other doesn't:
public class SequenceComparer<T> : IComparer<IEnumerable<T>>
{
    private IComparer<T> comparer;
    public SequenceComparer(IComparer<T> comparer = null)
    {
        this.comparer = comparer ?? Comparer<T>.Default;
    }
    public int Compare(IEnumerable<T> x, IEnumerable<T> y)
    {
        using (var first = x.GetEnumerator())
        using (var second = y.GetEnumerator())
        {
            while (true)
            {
                var firstHasMore = first.MoveNext();
                var secondHasMore = second.MoveNext();
                // Both sequences ended together: they are equal.
                if (!firstHasMore && !secondHasMore)
                    return 0;
                // Exactly one ended: the shorter sequence sorts first.
                var lengthComparison = firstHasMore.CompareTo(secondHasMore);
                if (lengthComparison != 0)
                    return lengthComparison;
                var nextComparison = comparer.Compare(first.Current, second.Current);
                if (nextComparison != 0)
                    return nextComparison;
            }
        }
    }
}
Combine that with flattening all of the groups back out when we're done, and we just need to put it all together:
var query = data.GroupBy(item => item.Group, new SeparateNullComparer<int?>())
    .Select(group => group.OrderBy(item => item.SequenceNumber)
        .ToList())
    // Compare groups by their sequence numbers, not by the items themselves.
    .OrderBy(group => group.Select(item => item.SequenceNumber), new SequenceComparer<int>())
    .ThenBy(group => group.First().Group)
    .SelectMany(x => x);
You can also rely on the fact that GroupBy maintains the original order of items within groups, allowing you to order the data by SequenceNumber before grouping, instead of after. It'll do basically the same thing. It turns out to be a prettier query, but you just need to "know" that GroupBy maintains the proper ordering:
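var query = data.OrderBy(item => item.SequenceNumber)
    .GroupBy(item => item.Group, new SeparateNullComparer<int?>())
    // Compare groups by their sequence numbers, not by the items themselves.
    .OrderBy(group => group.Select(item => item.SequenceNumber), new SequenceComparer<int>())
    .ThenBy(group => group.Key)
    .SelectMany(x => x);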
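To see the pieces working together, here is a minimal end-to-end sketch run against the data from the question. The Foo type, its property names, and the EdiOrdering class are assumptions made for illustration, and the two comparers are condensed variants of the ones above rather than exact copies. The groups are compared by their sequence numbers, so Foo itself does not need to be comparable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; Foo and its property names are assumptions.
public class Foo
{
    public string Value { get; set; }
    public int SequenceNumber { get; set; }
    public int? Group { get; set; }
}

// Never reports a null key as equal, so each null-group item forms its own group.
public class SeparateNullComparer<T> : IEqualityComparer<T>
{
    private readonly IEqualityComparer<T> inner = EqualityComparer<T>.Default;
    public bool Equals(T x, T y) => x != null && y != null && inner.Equals(x, y);
    public int GetHashCode(T obj) => obj == null ? 0 : inner.GetHashCode(obj);
}

// Compares two sequences element by element; a shorter prefix sorts first.
public class SequenceComparer<T> : IComparer<IEnumerable<T>>
{
    private readonly IComparer<T> inner = Comparer<T>.Default;
    public int Compare(IEnumerable<T> x, IEnumerable<T> y)
    {
        using (var first = x.GetEnumerator())
        using (var second = y.GetEnumerator())
        {
            while (true)
            {
                bool firstHasMore = first.MoveNext();
                bool secondHasMore = second.MoveNext();
                if (!firstHasMore || !secondHasMore)
                    return firstHasMore.CompareTo(secondHasMore);
                int result = inner.Compare(first.Current, second.Current);
                if (result != 0)
                    return result;
            }
        }
    }
}

public static class EdiOrdering
{
    public static List<Foo> Order(IEnumerable<Foo> data)
    {
        return data.OrderBy(item => item.SequenceNumber)
            .GroupBy(item => item.Group, new SeparateNullComparer<int?>())
            .OrderBy(g => g.Select(item => item.SequenceNumber), new SequenceComparer<int>())
            .ThenBy(g => g.Key)
            .SelectMany(g => g)
            .ToList();
    }

    public static void Main()
    {
        // The question's data, deliberately shuffled.
        var data = new List<Foo>
        {
            new Foo { Value = "Trailer", SequenceNumber = 4, Group = null },
            new Foo { Value = "Line Item", SequenceNumber = 3, Group = 2 },
            new Foo { Value = "InvoiceB", SequenceNumber = 2, Group = 2 },
            new Foo { Value = "Line Item", SequenceNumber = 3, Group = 1 },
            new Foo { Value = "InvoiceA", SequenceNumber = 2, Group = 1 },
            new Foo { Value = "Header", SequenceNumber = 1, Group = null },
        };
        foreach (var item in Order(data))
            Console.WriteLine($"{item.Value}, {item.SequenceNumber}, {item.Group}");
    }
}
```

With this input the items come out as Header, InvoiceA, Line Item, InvoiceB, Line Item, Trailer, which is the order the question asks for.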

If it doesn't have to be a LINQ query, you could write a single comparer that looks like this:
public class ValSeqGroupComparer : IComparer<ValSeqGroup>
{
    public int Compare(ValSeqGroup x, ValSeqGroup y)
    {
        if (x == y) return 0;
        // If only one has a group or there is no group in either
        if (x.Group.HasValue ^ y.Group.HasValue || !x.Group.HasValue)
            return x.Seq.CompareTo(y.Seq);
        if (x.Group.Value != y.Group.Value)
            return x.Group.Value.CompareTo(y.Group.Value);
        return x.Seq.CompareTo(y.Seq);
    }
}
Then use it like this:
[TestMethod]
public void One()
{
    List<ValSeqGroup> items = new List<ValSeqGroup>()
    {
        new ValSeqGroup("x", 1, null),
        new ValSeqGroup("x", 4, null),
        new ValSeqGroup("x", 2, 1),
        new ValSeqGroup("x", 2, 2),
        new ValSeqGroup("x", 3, 1),
        new ValSeqGroup("x", 3, 2)
    };
    items.Sort(new ValSeqGroupComparer());
    foreach (var item in items)
    {
        Console.WriteLine("{0} {1} {2}", item.Value, item.Seq, item.Group);
    }
}
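The ValSeqGroup type itself is not shown in this answer. The sketch below assumes a shape for it inferred from how it is used (a three-argument constructor and Value, Seq, and Group properties) and repeats the comparison rules so that it compiles on its own. The GroupSortDemo class is a hypothetical name for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of ValSeqGroup, inferred from how the answer uses it.
public class ValSeqGroup
{
    public string Value { get; }
    public int Seq { get; }
    public int? Group { get; }

    public ValSeqGroup(string value, int seq, int? group)
    {
        Value = value;
        Seq = seq;
        Group = group;
    }
}

// Same comparison rules as the comparer above, written out again for a standalone sketch.
public class ValSeqGroupComparer : IComparer<ValSeqGroup>
{
    public int Compare(ValSeqGroup x, ValSeqGroup y)
    {
        if (ReferenceEquals(x, y)) return 0;
        // Fall back to the sequence number unless both items carry a group.
        if (!x.Group.HasValue || !y.Group.HasValue)
            return x.Seq.CompareTo(y.Seq);
        int byGroup = x.Group.Value.CompareTo(y.Group.Value);
        return byGroup != 0 ? byGroup : x.Seq.CompareTo(y.Seq);
    }
}

public static class GroupSortDemo
{
    public static void Main()
    {
        var items = new List<ValSeqGroup>
        {
            new ValSeqGroup("x", 1, null),
            new ValSeqGroup("x", 4, null),
            new ValSeqGroup("x", 2, 1),
            new ValSeqGroup("x", 2, 2),
            new ValSeqGroup("x", 3, 1),
            new ValSeqGroup("x", 3, 2)
        };
        items.Sort(new ValSeqGroupComparer());
        foreach (var item in items)
            Console.WriteLine("{0} {1} {2}", item.Value, item.Seq, item.Group);
    }
}
```

One caveat worth keeping in mind: these rules only form a consistent ordering when the null-group items sort entirely before or after the grouped ones, as they do in this data. A null-group item whose sequence number falls between those of grouped items can make the comparisons contradict each other.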

You can achieve this by first sorting the elements by Sequence Number (OrderBy(x => x.SequenceNumber)).
After that, sort the elements that have an existing group number (.Where(x => x.Group != null).OrderBy(x => x.Group)).
Finally, insert the elements with a null group into the list at the corresponding index.
var elements = new List<Element>
{
    new Element{SequenceNumber = 1, Group = null},
    new Element{SequenceNumber = 4, Group = null},
    new Element{SequenceNumber = 3, Group = 1},
    new Element{SequenceNumber = 3, Group = 3},
    new Element{SequenceNumber = 3, Group = 2},
    new Element{SequenceNumber = 2, Group = 3},
    new Element{SequenceNumber = 2, Group = 1},
    new Element{SequenceNumber = 2, Group = 2}
};
// first sort
var sortedElements = elements.OrderBy(x => x.SequenceNumber).ToList();
// save null elements
var elementsWithNull = sortedElements
    .Where(x => x.Group == null).ToList();
// group sorting
sortedElements = sortedElements
    .Where(x => x.Group != null)
    .OrderBy(x => x.Group).ToList();
// insert elements with null in sorted list
foreach (var element in elementsWithNull)
{
    // skip the leading elements whose sequence number is lower than the null-group element's
    var firstIndexOfSequence = 0;
    while (firstIndexOfSequence < sortedElements.Count
        && sortedElements[firstIndexOfSequence].SequenceNumber < element.SequenceNumber)
    {
        firstIndexOfSequence++;
    }
    sortedElements.Insert(firstIndexOfSequence, element);
}

Related

First found duplicate element using LINQ (Not contiguous)

How do I print the first duplicate element from an array?
var arr = new int[]{ 3, 2, 5, 1, 5, 4, 2, 15 };
Currently this method prints 2 instead of 5.
public int FirstDuplicate(int[] arr)
{
    var firstDup = arr
        .GroupBy(x => x)
        .Where(grp => grp.Count() == 2)
        .Select(grp => grp.Key)
        .FirstOrDefault();
    if (firstDup > 0) return firstDup;
    return -1;
}
You can write an extension method that returns all duplicates from an IEnumerable<T>, like this:
public static class EnumerableExtensions
{
    public static IEnumerable<T> Duplicates<T>( this IEnumerable<T> source )
    {
        // remembers every item seen so far
        var hashset = new HashSet<T>();
        foreach ( var item in source )
        {
            if ( hashset.Contains(item) )
                yield return item;
            else
                hashset.Add(item);
        }
    }
}
and then use it:
var arr = new int[]{ 3, 2, 5, 5, 4, 2, 15 };
var firstDuplicate = arr.Duplicates().First();
This worked for me. I took advantage of comparing values with array indexes, the Distinct() method, and the first element of the resulting array. Note that it only detects duplicates that sit next to each other.
var arr = new int[] { 3, 2, 5, 5, 4, 2, 15 };
var adjacentDuplicate = arr.Skip(1) // Skip first
    .Where((value,index) => value == arr[index]) // arr[index] is the previous element
    .Distinct()
    .ToArray(); // Convert to array
if (adjacentDuplicate.Any())
{
    Console.WriteLine(adjacentDuplicate[0]); // Print first duplicate
}
else
{
    // No duplicates found.
}
Based on Sir Rufo's answer, I would make two extensions:
public static IEnumerable<T> Duplicates<T>(this IEnumerable<T> source)
{
    var hashset = new HashSet<T>();
    foreach (var item in source)
    {
        // Add returns false when the item is already in the set
        if (!hashset.Add(item))
        {
            yield return item;
        }
    }
}
public static IEnumerable<T?> AsNullable<T>(this IEnumerable<T> source) where T : struct
{
    return source.Select(x => (T?)x);
}
You can use it like this:
var duplicate = arr
    .Duplicates()
    .AsNullable()
    .FirstOrDefault();
AsNullable converts int into int? without hard-coding the type. When the result is null, there is no duplicate. You can use it in other situations too, such as calculating the Max of a potentially empty sequence of non-nullable values (you can define it for IQueryable as well). The advantage of this extension is that, when you use it, you know for sure that null is not a valid value in the source, so you won't shoot yourself in the foot when null suddenly becomes a possible value.
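The Max case mentioned above can be sketched as follows. This is a small illustration rather than part of the original answer; the class names NullableExtensions and NullableDemo are made up for it. It relies on the fact that Enumerable.Max over a sequence of int? returns null for an empty sequence, whereas Max over plain int throws.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullableExtensions
{
    // Same idea as AsNullable above: lift each value into its nullable form.
    public static IEnumerable<T?> AsNullable<T>(this IEnumerable<T> source) where T : struct
    {
        return source.Select(x => (T?)x);
    }
}

public static class NullableDemo
{
    public static void Main()
    {
        var empty = new int[0];
        var values = new[] { 3, 7, 5 };

        // Max over int? yields null for an empty sequence instead of throwing.
        int? maxOfEmpty = empty.AsNullable().Max();
        int? maxOfValues = values.AsNullable().Max();

        Console.WriteLine(maxOfEmpty.HasValue ? maxOfEmpty.ToString() : "no values");
        Console.WriteLine(maxOfValues);
    }
}
```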

Return TRUE for ALL when specific property is smaller than other property for both lists

It's hard to explain, so I made a simplified data sample here.
I have two lists of different complex types:
list1:
{ Id = 1 , Value = 1 }; {Id = 2 , Value = 2 }; { Id = 3 , Value = 1.5}
list2
{ Id = 1 , Value = 1 }; {Id = 2 , Value = 2 }; { Id = 3 , Value = 1.5}
A comparison of both lists should return TRUE, as each value of the Value property is equal in both lists.
If just one of the Value properties differs, then the whole result must be FALSE.
How can I do that, preferably with LINQ?
Try this, with LINQ's Zip method:
var result = list1.Zip(list2, (l1, l2) => l1.Value == l2.Value).All(x => x);
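One thing to be aware of is that Zip pairs items by position and stops at the end of the shorter sequence, so lists of different lengths can still compare as equal. Below is a minimal sketch that adds a count check. The Item type and the ZipComparison class are assumptions for illustration; a single type stands in for the question's two complex types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type standing in for the question's two complex types.
public class Item
{
    public int Id { get; set; }
    public double Value { get; set; }
}

public static class ZipComparison
{
    // Zip pairs items by position and stops at the shorter list,
    // so the counts are compared first.
    public static bool ValuesMatch(IList<Item> list1, IList<Item> list2)
    {
        return list1.Count == list2.Count
            && list1.Zip(list2, (a, b) => a.Value == b.Value).All(equal => equal);
    }

    public static void Main()
    {
        var list1 = new List<Item>
        {
            new Item { Id = 1, Value = 1 },
            new Item { Id = 2, Value = 2 },
            new Item { Id = 3, Value = 1.5 }
        };
        var list2 = new List<Item>
        {
            new Item { Id = 1, Value = 1 },
            new Item { Id = 2, Value = 2 },
            new Item { Id = 3, Value = 1.5 }
        };
        Console.WriteLine(ValuesMatch(list1, list2));

        // Changing a single value makes the whole comparison fail.
        list2[1].Value = 2.5;
        Console.WriteLine(ValuesMatch(list1, list2));
    }
}
```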
If you need to perform this check by Id property, then GroupJoin is what you are looking for.
It lets you group two different collections by selector and then join them:
bool ComplexCollectionValuesAreEqual(List<ComplexItem1> list1, List<ComplexItem2> list2)
{
    try
    {
        var grouped = list1.GroupJoin(list2, x => x.Id, x => x.Id,
            (outer, inners) => outer.Value == inners.Single().Value);
        return grouped.All(x => x);
    }
    catch (InvalidOperationException) // for .Single() fail case
    {
        return false;
    }
}
You can apply other comparison logic in the last lambda of GroupJoin, for example outer.Value <= inners.Single().Value to check whether all Values in list1 are less than or equal to the corresponding values in list2.
Note that this implementation returns false if there is no object with a matching Id in list2. You may want to throw an exception instead if you always expect it to exist.
If both collections are in the same order:
bool isTrue = list1.Where((z, i) => z.Id != list2[i].Id || z.Value != list2[i].Value).Count() == 0;
If the collections are not sorted:
bool isTrue = list1.Where(x => list2.First(y => y.Id == x.Id).Value != x.Value ).Count() == 0;
If the items should not have duplicates and the counts should be equal:
bool isTrue = list1.Where(x => list2.First(y => y.Id == x.Id).Value == x.Value).Count() == list2.Count && list2.Count == list1.Count;

LINQ Distinct Count returns 1

I'm making a Func delegate inside my method to check whether schedualCode fits in a certain place in a list, where the limit is 3.
I want to count the distinct values of schedualCode in my list. My problem is that schedualCodeCount returns 1 when it should return 2.
This is my code:
Func<string, bool> CheckTimeLimit = delegate (string schedualCode)
{
    // check enrolled period count (where limit is 3)
    //int periodCount = currentEnrollments.GroupBy(t => t.Times)
    //.Select(t => t.Key.Select(key => key.PeriodCode == time.PeriodCode).Distinct()).Count();
    var allTimes = currentEnrollments.SelectMany(key => key.Times).ToList();
    List<string> schedualCodes = allTimes.Where(key => key.SchedualCode == schedualCode && key.ViewOnSchedual)
        .Select(key => key.SchedualCode).ToList();
    //schedualCodes List returns a list of count = 2 , and 2 strings exactly the same of value = "A1"
    // Getting the distinct count of "A1"
    int schedualCodeCount = schedualCodes.Distinct().Count();
    // schedualCodeCount gets the value = 1, where it should be 2
    // time fits if true
    return schedualCodeCount < 3;
};
You are misunderstanding what Distinct does. You have two identical items, so Distinct removes the duplicate and leaves you with 1. What you probably want to do is group the items and then get the count of each group.
For example:
var list = new List<string>() { "A1", "A1" };
Console.WriteLine(list.Count); // 2, obviously
var distinct = list.Distinct(); // select only the *distinct* values
Console.WriteLine(distinct.Count()); // 1 - because there is only 1 distinct value
var groups = list.GroupBy(s => s); // group your list (there will only be one group in this case)
foreach (var g in groups) // for each group
{
    // Display the number of items with the same key
    Console.WriteLine(g.Key + ":" + g.Count());
}

How can I group by the difference between rows in a column with linq and c#?

I want to create a new group whenever the difference between the values in consecutive rows is greater than five.
Example:
int[] list = {5,10,15,40,45,50,70,75};
should give me 3 groups:
1,[ 5,10,15 ]
2,[40,45,50]
3,[70,75]
Is it possible to use Linq here?
Thx!
Exploiting side effects (here, the group counter) is not good practice, but it can be helpful:
int[] list = { 5, 10, 15, 40, 45, 50, 70, 75 };
int step = 5;
int group = 1;
var result = list
    .Select((item, index) => new {
        prior = index == 0 ? item : list[index - 1],
        item = item,
    })
    // the key selector bumps the counter whenever the gap exceeds step
    .GroupBy(pair => Math.Abs(pair.prior - pair.item) <= step ? group : ++group,
        pair => pair.item);
Test:
string report = string.Join(Environment.NewLine, result
    .Select(chunk => String.Format("{0}: [{1}]", chunk.Key, String.Join(", ", chunk))));
Outcome:
1: [5, 10, 15]
2: [40, 45, 50]
3: [70, 75]
Assuming the collection has an indexer defined, it can be something like this:
const int step = 5;
int currentGroup = 1;
var groups = list.Select((item, index) =>
{
    // start a new group when the gap to the previous item exceeds step
    if (index > 0 && item - step > list[index - 1])
    {
        currentGroup++;
    }
    return new {Group = currentGroup, Item = item};
}).GroupBy(i => i.Group).ToList();
In my opinion, just write a function to do it. This is easier to understand and more readable than the LINQ examples given in other answers.
public static List<List<int>> Group(this IEnumerable<int> sequence, int groupDiff) {
    var groups = new List<List<int>>();
    List<int> currGroup = null;
    int? lastItem = null;
    foreach (var item in sequence) {
        if (lastItem == null || item - lastItem.Value > groupDiff) {
            // gap too large (or first item): start a new group
            currGroup = new List<int>{ item };
            groups.Add(currGroup);
        } else {
            // add item to current group
            currGroup.Add(item);
        }
        lastItem = item;
    }
    return groups;
}
And call it like this:
List<List<int>> groups = Group(list, 5);
Assumption: list is sorted. If it is not sorted, just sort it first and use the above code.
Also: if you need groups to be an int[][] just use the Linq Method ToArray() to your liking.
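The two notes above (sorting first and converting to int[][]) can be sketched together as follows. This is a self-contained illustration under assumptions: GroupByGap is a hypothetical, compact variant of the function above, and the GapGrouping class name is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GapGrouping
{
    // Starts a new group whenever the gap to the previous item exceeds maxGap.
    public static List<List<int>> GroupByGap(this IEnumerable<int> sequence, int maxGap)
    {
        var groups = new List<List<int>>();
        foreach (var item in sequence)
        {
            var current = groups.Count > 0 ? groups[groups.Count - 1] : null;
            if (current == null || item - current[current.Count - 1] > maxGap)
                groups.Add(new List<int> { item });
            else
                current.Add(item);
        }
        return groups;
    }

    public static void Main()
    {
        // Same numbers as in the question, but out of order.
        int[] unsorted = { 45, 5, 70, 15, 40, 10, 75, 50 };

        // Sort first, group, then convert each group to an array.
        int[][] groups = unsorted.OrderBy(x => x)
            .GroupByGap(5)
            .Select(g => g.ToArray())
            .ToArray();

        foreach (var g in groups)
            Console.WriteLine("[" + string.Join(", ", g) + "]");
    }
}
```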

Use Linq to find consecutively repeating elements

Let's assume I have a list with objects of type Value. Value has a Name property:
private List<Value> values = new List<Value> {
new Value { Id = 0, Name = "Hello" },
new Value { Id = 1, Name = "World" },
new Value { Id = 2, Name = "World" },
new Value { Id = 3, Name = "Hello" },
new Value { Id = 4, Name = "a" },
new Value { Id = 5, Name = "a" },
};
Now I want to get a list of all "repeating" values (elements where the name property was identical with the name property of the previous element).
In this example I want a list with the two elements "world" and "a" (id = 2 and 5) to be returned.
Is this even possible with LINQ?
Of course I could do something like this:
List<Value> tempValues = new List<Value>();
String lastName = String.Empty;
foreach (var v in values)
{
    if (v.Name == lastName) tempValues.Add(v);
    lastName = v.Name;
}
but since I want to use this query in a more complex context, maybe there is a "linqish" solution.
There won't be anything built in along those lines, but if you need this frequently you could roll something bespoke but fairly generic:
static IEnumerable<TSource> WhereRepeated<TSource>(
    this IEnumerable<TSource> source)
{
    return WhereRepeated<TSource,TSource>(source, x => x);
}
static IEnumerable<TSource> WhereRepeated<TSource, TValue>(
    this IEnumerable<TSource> source, Func<TSource, TValue> selector)
{
    using (var iter = source.GetEnumerator())
    {
        if (iter.MoveNext())
        {
            var comparer = EqualityComparer<TValue>.Default;
            TValue lastValue = selector(iter.Current);
            while (iter.MoveNext())
            {
                TValue currentValue = selector(iter.Current);
                // yield the element when it matches its predecessor
                if (comparer.Equals(lastValue, currentValue))
                {
                    yield return iter.Current;
                }
                lastValue = currentValue;
            }
        }
    }
}
Usage:
foreach (Value value in values.WhereRepeated(x => x.Name))
{
    Console.WriteLine(value.Name);
}
You might want to think about what to do with triplets etc - currently everything except the first will be yielded (which matches your description), but that might not be quite right.
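To make the triplet behaviour concrete, here is a small sketch under assumptions. WhereRepeated below is a compact foreach-based variant of the method above, and WhereRepeatedOnce is a hypothetical alternative (not from the original answer) that reports each run of repeats only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RepeatExtensions
{
    // Compact variant of WhereRepeated: yields every element equal to its predecessor.
    public static IEnumerable<T> WhereRepeated<T>(this IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        bool hasPrevious = false;
        T previous = default(T);
        foreach (var item in source)
        {
            if (hasPrevious && comparer.Equals(previous, item))
                yield return item;
            previous = item;
            hasPrevious = true;
        }
    }

    // Hypothetical alternative: yields only the first repeat of each run.
    public static IEnumerable<T> WhereRepeatedOnce<T>(this IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        bool hasPrevious = false;
        bool reported = false;
        T previous = default(T);
        foreach (var item in source)
        {
            bool same = hasPrevious && comparer.Equals(previous, item);
            if (same && !reported)
                yield return item;
            reported = same; // stays true for the rest of the current run
            previous = item;
            hasPrevious = true;
        }
    }
}

public static class RepeatDemo
{
    public static void Main()
    {
        var names = new[] { "a", "a", "a", "b", "b" };
        // The triplet of "a" contributes two items here...
        Console.WriteLine(string.Join(", ", names.WhereRepeated()));
        // ...and a single item here.
        Console.WriteLine(string.Join(", ", names.WhereRepeatedOnce()));
    }
}
```

Which of the two behaviours is right depends on whether a run of three counts as one repeat or two.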
You could implement a Zip extension, then Zip your list with .Skip(1) and then Select the rows that match.
This should work and be fairly easy to maintain:
values
    .Skip(1)
    .Zip(values, (first,second) => first.Name==second.Name?first:null)
    .Where(i => i != null);
The slight disadvantage of this method is that you iterate through the list twice.
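A runnable sketch of the Zip approach against the question's data might look like this. The Value class is reconstructed from the question, ZipRepeats is a hypothetical class name, and the sketch pairs elements in an anonymous type instead of using null as a marker, which is a small departure from the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reconstructed from the question: only Id and Name are needed here.
public class Value
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ZipRepeats
{
    // Pairs each element (from Skip(1)) with its predecessor and keeps the matches.
    public static List<Value> Repeated(IList<Value> values)
    {
        return values.Skip(1)
            .Zip(values, (current, previous) => new { current, previous })
            .Where(pair => pair.current.Name == pair.previous.Name)
            .Select(pair => pair.current)
            .ToList();
    }

    public static void Main()
    {
        var values = new List<Value>
        {
            new Value { Id = 0, Name = "Hello" },
            new Value { Id = 1, Name = "World" },
            new Value { Id = 2, Name = "World" },
            new Value { Id = 3, Name = "Hello" },
            new Value { Id = 4, Name = "a" },
            new Value { Id = 5, Name = "a" },
        };
        foreach (var v in Repeated(values))
            Console.WriteLine(v.Id + ": " + v.Name);
    }
}
```

For the sample list this yields the elements with Id 2 and Id 5, matching what the question asks for.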
I think this would work (untested). It gives you each value whose successor has the same name, together with its index in the original list. For multiple repeats you could traverse this list and check for consecutive indices.
var query = values.Select( (v,i) => new { Value = v, Index = i } )
    .Where( x => values.Count > x.Index+1 && x.Value.Name == values[x.Index+1].Name );
Here's another simple approach that should work if the IDs are always sequential as in your sample:
var data = from v2 in values
join v1 in values on v2.Id equals v1.Id + 1
where v1.Name == v2.Name
select v2;
I know this question is ancient but I was just working on the same thing so ....
static class utils
{
    public static IEnumerable<T> FindConsecutive<T>(this IEnumerable<T> data, Func<T,T,bool> comparison)
    {
        // pair every element with its successor and keep the first of each matching pair
        return Enumerable.Range(0, data.Count() - 1)
            .Select( i => new { a=data.ElementAt(i), b=data.ElementAt(i+1)})
            .Where(n => comparison(n.a, n.b)).Select(n => n.a);
    }
}
Should work for anything - just provide a function to compare the elements
You could use the GroupBy extension to do this.
Something like this
var dupsNames =
    from v in values
    group v by v.Name into g
    where g.Count() > 1 // If a group has only one element, just ignore it
    select g.Key;
should work. You can then use the results in a second query:
dupsNames.Select( d => values.Where( v => v.Name == d ) )
This should return a grouping with key=name, values = { elements with name }
Disclaimer: I did not test the above, so I may be way off.
