How to split a list using LINQ (C#)

I have a List<int> containing the values 0,0,0,1,2,3,4,0,0. I would like to split it into three lists: list A with 0,0,0, list B with 1,2,3,4, and list C with 0,0. I know how to do this with if and for, but how can I do it using LINQ? The data always has the same shape: some zeros at the start, some non-zero values in the middle, and some zeros at the end. I need the leading zeros in one list, the middle values in a second list, and the trailing zeros in a third, as in the example above. I would also like to get the indices of those values.

First list:
myList.TakeWhile(x => x == 0)
Second list:
myList.SkipWhile(x => x == 0).TakeWhile(x => x != 0)
Third list:
myList.SkipWhile(x => x == 0).SkipWhile(x => x != 0)
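Since the question also asks for indices, the three parts can be materialized once and the start index of each part derived from the counts. This is only a minimal sketch, assuming the input always has the shape zeros / non-zeros / zeros; the names ZeroSplit and SplitThree are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZeroSplit
{
    // Splits a sequence shaped like 0..0, x..x, 0..0 into its three runs.
    // Each run is returned together with the index at which it starts.
    public static List<(int Start, List<int> Items)> SplitThree(IList<int> source)
    {
        var leading = source.TakeWhile(x => x == 0).ToList();
        var middle = source.Skip(leading.Count).TakeWhile(x => x != 0).ToList();
        var trailing = source.Skip(leading.Count + middle.Count).ToList();

        return new List<(int Start, List<int> Items)>
        {
            (0, leading),
            (leading.Count, middle),
            (leading.Count + middle.Count, trailing),
        };
    }
}
```

For the list in the question, the three parts start at indices 0, 3 and 7.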

If you want to split the list into runs of zeros and non-zeros, try this code:
static void Main(string[] argv)
{
    var list = new[] { 0, 0, 0, 1, 2, 3, 4, 0, 0 };
    int groupIndex = 0;
    var result = list.Select(
        (e, i) =>
        {
            if (i == 0)
            {
                return new {val = e, group = groupIndex};
            }
            else
            {
                // start a new group whenever we cross a zero/non-zero boundary
                groupIndex =
                    (e != 0 && list[i - 1] == 0) || (e == 0 && list[i - 1] != 0)
                        ? groupIndex + 1
                        : groupIndex;
                return new {val = e, group = groupIndex};
            }
        }
    ).GroupBy(e => e.group).Select(e => e.Select(o => o.val).ToList()).ToList();
    foreach (var item in result)
    {
        foreach (var val in item)
        {
            Console.Write(val + ";");
        }
        Console.WriteLine();
        Console.WriteLine("Count:" + item.Count);
        Console.WriteLine();
    }
    Console.ReadLine();
}
Output is:
0;0;0;
Count:3
1;2;3;4;
Count:4
0;0;
Count:2
It is not really clear from your question what the split criterion is. If this answers the wrong question, please clarify it.
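The same run splitting can also be written without a captured counter, as a small iterator that starts a new list whenever a key changes between neighbouring items. This is a sketch under the assumption that "split by zero sequence" means consecutive runs; SplitIntoRuns is a made-up name, not a LINQ method.

```csharp
using System;
using System.Collections.Generic;

public static class RunExtensions
{
    // Yields consecutive runs of items that share the same key.
    // A new run starts every time the key differs from the previous item's key.
    public static IEnumerable<List<T>> SplitIntoRuns<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> run = null;
        TKey runKey = default(TKey);

        foreach (var item in source)
        {
            var key = keySelector(item);
            if (run == null || !comparer.Equals(key, runKey))
            {
                if (run != null)
                    yield return run;   // the previous run is complete
                run = new List<T>();
                runKey = key;
            }
            run.Add(item);
        }

        if (run != null)
            yield return run;           // emit the final run
    }
}
```

With the key x => x == 0, the list from the question splits into runs of 3, 4 and 2 items.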

If the positions are known in advance, you can use the Skip and Take methods of LINQ to Objects to grab specific elements of a sequence.
var myList = new int[] {0,0,0,1,2,3,4,0,0};
var list1 = myList.Take(3);
var list2 = myList.Skip(3).Take(4);
var list3 = myList.Skip(7);

You can use Take(n) and Skip(n) in LINQ:
List<int> list = new List<int>();
list.Add(0);
list.Add(0);
list.Add(0);
list.Add(1);
list.Add(2);
list.Add(3);
list.Add(4);
list.Add(0);
list.Add(0);
var listOne = list.Take(3);
var listSecond = list.Skip(3).Take(4);
var listThird = list.Skip(7);

Related

Remove the repeating items and return the order number

I want to remove the repeated items from a list. I can do that easily with Distinct(), but I also need the positions of the items that were removed. I couldn't find a LINQ function for this, so I ended up with the following code:
public List<string> Repeat(List<string> str)
{
    var Dlist = str.Distinct();
    List<string> repeat = new List<string>();
    foreach (string aa in Dlist)
    {
        int num = 0;
        string re = "";
        for (int i = 1; i <= str.LongCount(); i++)
        {
            if (aa == str[i - 1])
            {
                num = num + 1;
                re = re + " - " + i;
            }
        }
        if (num > 1)
        {
            repeat.Add(re.Substring(3));
        }
    }
    return repeat;
}
Is there a simpler way to solve this? Or is there a LINQ function I missed? Any advice will be appreciated.
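This query produces the same output format as your function, if I'm not mistaken. One difference: group.Any() is true for every group, so items that occur only once are listed too, whereas your function skips them (num > 1). Use group.Count() > 1 in the Where clause to match it exactly: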
var repeated = str.GroupBy(s => s).Where(group => group.Any())
    .Select(group =>
    {
        var indices = Enumerable.Range(1, str.Count).Where(i => str[i-1] == group.Key).ToList();
        return string.Join(" - ", group.Select((s, i) => indices[i]));
    });
It first groups the items of the original list so that equal items end up in the same group. For each group it then finds the indices of all matching items in the original list, and finally joins those indices into a string, so the resulting format is similar to the one you requested. You could also turn this statement lambda into an expression lambda:
var repeated = str.GroupBy(s => s).Where(group => group.Any())
    .Select(group => string.Join(" - ",
        group.Select((s, i) =>
            Enumerable.Range(1, str.Count).Where(i2 => str[i2 - 1] == group.Key).ToList()[i])));
However, this significantly reduces performance, because the index list is rebuilt for every item in the group.
I tested this with the following code:
public static void Main()
{
    var str = new List<string>
    {
        "bla",
        "bla",
        "baum",
        "baum",
        "nudel",
        "baum",
    };
    var copy = new List<string>(str);
    var repeated = str.GroupBy(s => s).Where(group => group.Any())
        .Select(group => string.Join(" - ",
            group.Select((s, i) =>
                Enumerable.Range(1, str.Count).Where(i2 => str[i2 - 1] == group.Key).ToList()[i])));
    var repeated2 = Repeat(str);
    var repeated3 = str.GroupBy(s => s).Where(group => group.Any())
        .Select(group =>
        {
            var indices = Enumerable.Range(1, str.Count).Where(i => str[i-1] == group.Key).ToList();
            return string.Join(" - ", group.Select((s, i) => indices[i]));
        });
    Console.WriteLine(string.Join("\n", repeated) + "\n");
    Console.WriteLine(string.Join("\n", repeated2) + "\n");
    Console.WriteLine(string.Join("\n", repeated3));
    Console.ReadLine();
}
public static List<string> Repeat(List<string> str)
{
    var distinctItems = str.Distinct();
    var repeat = new List<string>();
    foreach (var item in distinctItems)
    {
        var added = false;
        var reItem = "";
        for (var index = 0; index < str.LongCount(); index++)
        {
            if (item != str[index])
                continue;
            added = true;
            reItem += " - " + (index + 1);
        }
        if (added)
            repeat.Add(reItem.Substring(3));
    }
    return repeat;
}
Which has the following output:
1 - 2
3 - 4 - 6
5
1 - 2
3 - 4 - 6
5
1 - 2
3 - 4 - 6
5
Inside your Repeat method, you can get the repeated items (the values, not their positions) like this:
var repeated = str.GroupBy(s => s)
    .Where(grp => grp.Count() > 1)
    .Select(y => y.Key)
    .ToList();
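If the 1-based positions are needed as well, they can be captured before grouping so that the list is scanned only once. This is a sketch rather than a drop-in replacement; DuplicatePositions and Find are illustrative names, and it keeps only values that occur more than once, like the original Repeat method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicatePositions
{
    // Returns one "p1 - p2 - ..." string per value that occurs more than once,
    // where p1, p2, ... are the 1-based positions of that value in the input.
    public static List<string> Find(IEnumerable<string> items)
    {
        return items
            .Select((value, index) => new { value, position = index + 1 })
            .GroupBy(x => x.value, x => x.position)   // key: value, elements: positions
            .Where(g => g.Count() > 1)                // keep repeated values only
            .Select(g => string.Join(" - ", g))
            .ToList();
    }
}
```

For the test data bla, bla, baum, baum, nudel, baum this gives "1 - 2" and "3 - 4 - 6"; the single nudel is left out.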

How can I group by the difference between rows in a column with linq and c#?

I want to start a new group whenever the difference between the values in consecutive rows is greater than five.
Example:
int[] list = {5,10,15,40,45,50,70,75};
should give me 3 groups:
1,[ 5,10,15 ]
2,[40,45,50]
3,[70,75]
Is it possible to use Linq here?
Thx!
Relying on side effects (the captured group variable) is not good practice, but it can be helpful here:
int[] list = { 5, 10, 15, 40, 45, 50, 70, 75 };
int step = 5;
int group = 1;
var result = list
    .Select((item, index) => new {
        prior = index == 0 ? item : list[index - 1],
        item = item,
    })
    .GroupBy(pair => Math.Abs(pair.prior - pair.item) <= step ? group : ++group,
             pair => pair.item);
Test:
string report = string.Join(Environment.NewLine, result
.Select(chunk => String.Format("{0}: [{1}]", chunk.Key, String.Join(", ", chunk))));
Outcome:
1: [5, 10, 15]
2: [40, 45, 50]
3: [70, 75]
Assuming the collection has an indexer, it can be something like this:
const int step = 5;
int currentGroup = 1;
var groups = list.Select((item, index) =>
{
    // a gap larger than step starts the next group
    if (index > 0 && item - step > list[index - 1])
    {
        currentGroup++;
    }
    return new {Group = currentGroup, Item = item};
}).GroupBy(i => i.Group).ToList();
In my opinion, just write a function to do it. This is easier to understand and more readable than the Linq examples given in other answers.
public static List<List<int>> Group(this IEnumerable<int> sequence, int groupDiff) {
    var groups = new List<List<int>>();
    List<int> currGroup = null;
    int? lastItem = null;
    foreach (var item in sequence) {
        if (lastItem == null || item - lastItem.Value > groupDiff) {
            // start a new group
            currGroup = new List<int>{ item };
            groups.Add(currGroup);
        } else {
            // add item to current group
            currGroup.Add(item);
        }
        lastItem = item;
    }
    return groups;
}
And call it like this
List<List<int>> groups = Group(list, 5);
Assumption: list is sorted. If it is not sorted, just sort it first and use the above code.
Also: if you need groups to be an int[][] just use the Linq Method ToArray() to your liking.

Combine TakeWhile and SkipWhile to partition collection

I would like to partition a collection at the first item that matches a specific condition. I can do that using TakeWhile and SkipWhile, which is pretty easy to understand:
public static bool IsNotSeparator(int value) => value != 3;
var collection = new [] { 1, 2, 3, 4, 5 };
var part1 = collection.TakeWhile(IsNotSeparator);
var part2 = collection.SkipWhile(IsNotSeparator);
But this iterates from the start of the collection twice, and if IsNotSeparator is slow, that might be a performance issue.
A faster way would be something like:
var part1 = new List<int>();
var index = 0;
for (var max = collection.Length; index < max; ++index) {
    if (IsNotSeparator(collection[index]))
        part1.Add(collection[index]);
    else
        break;
}
var part2 = collection.Skip(index);
But that is much less readable than the first example.
So my question is: what would be the best way to partition a collection at a specific element?
What I thought of, combining the two approaches above, is:
var collection = new [] { 1, 2, 3, 4, 5 };
var part1 = collection.TakeWhile(IsNotSeparator).ToList();
var part2 = collection.Skip(part1.Count);
This is a quick example of how you would do the more general method (multiple splits, as mentioned in the comments), without LINQ (it's possible to convert it to LINQ, but I am not sure if it will be any more readable, and I am in a slight hurry right now):
public static IEnumerable<IEnumerable<T>> Split<T>(this IList<T> list, Predicate<T> match)
{
    if (list.Count == 0)
        yield break;
    var chunkStart = 0;
    for (int i = 1; i < list.Count; i++)
    {
        if (match(list[i]))
        {
            yield return new ListSegment<T>(list, chunkStart, i - 1);
            chunkStart = i;
        }
    }
    yield return new ListSegment<T>(list, chunkStart, list.Count - 1);
}
The code presumes a class named ListSegment<T> : IEnumerable<T> which simply iterates from a start index to an end index (both inclusive) over the original list, without copying. This is similar to how ArraySegment<T> works, which is unfortunately limited to arrays.
So the code returns one chunk more than there are matches (a match at index 0 does not start a new chunk, because the loop begins at index 1), i.e. this code:
var collection = new[] { "A", "B", "-", "C", "D", "-", "E" };
foreach (var chunk in collection.Split(i => i == "-"))
Console.WriteLine(string.Join(", ", chunk));
would print:
A, B
-, C, D
-, E
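ListSegment<T> is not a framework type and the answer does not show it, so the following is only one possible shape for it: a read-only view with inclusive bounds, matching the way Split calls its constructor. The field and parameter names are assumptions.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper assumed by Split: a view over list[start..end]
// (both bounds inclusive) that enumerates the original list without copying.
public sealed class ListSegment<T> : IEnumerable<T>
{
    private readonly IList<T> _list;
    private readonly int _start;
    private readonly int _end;

    public ListSegment(IList<T> list, int start, int end)
    {
        _list = list;
        _start = start;
        _end = end;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Reads straight from the underlying list, so later changes to the
        // list are visible through the segment.
        for (int i = _start; i <= _end; i++)
            yield return _list[i];
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because nothing is copied, a segment is cheap to create, but it is only valid while the underlying list keeps its length and order.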
How about using Array.Copy (note that Array.IndexOf returns -1 when the separator is not found, which this snippet does not handle):
var separator = 3;
var collection = new [] { 1, 2, 3, 4, 5 };
var i = Array.IndexOf(collection,separator);
int[] part1 = new int[i];
int[] part2 = new int[collection.Length - i];
Array.Copy(collection, 0, part1, 0, i );
Array.Copy(collection, i, part2, 0, collection.Length - i );
Alternatively, to avoid copying the elements, use ArraySegment:
var i = Array.IndexOf(collection,separator);
var part1 = new ArraySegment<int>( collection, 0, i );
var part2 = new ArraySegment<int>( collection, i, collection.Length - i );
ArraySegment is a wrapper around an array that delimits a range of elements in that array. Multiple ArraySegment instances can refer to the same original array and can overlap.
Edit - add combination of original question with ArraySegment so as not to iterate collection twice.
public static bool IsNotSeparator(int value) => value != 3;
var collection = new [] { 1, 2, 3, 4, 5 };
var index = collection.TakeWhile(IsNotSeparator).Count();
var part1 = new ArraySegment<int>( collection, 0, index );
var part2 = new ArraySegment<int>( collection, index, collection.Length - index );
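For sources that are not arrays, a single foreach can fill both parts while calling the predicate at most once per element. This is a sketch of that idea, not something taken from the answers above; PartitionAt is a made-up name, and unlike ArraySegment it copies the elements into two lists.

```csharp
using System;
using System.Collections.Generic;

public static class PartitionExtensions
{
    // Splits source at the first element for which isSeparator returns true.
    // The separator itself goes into the second part, as with SkipWhile.
    // The source is enumerated once; the predicate stops being called after the first match.
    public static (List<T> Before, List<T> FromSeparator) PartitionAt<T>(
        this IEnumerable<T> source, Func<T, bool> isSeparator)
    {
        var before = new List<T>();
        var rest = new List<T>();
        var found = false;

        foreach (var item in source)
        {
            if (!found && isSeparator(item))
                found = true;

            if (found)
                rest.Add(item);
            else
                before.Add(item);
        }

        return (before, rest);
    }
}
```

For { 1, 2, 3, 4, 5 } with the separator test x => x == 3, the first part is 1, 2 and the second is 3, 4, 5.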

Iterating through c# array & placing objects sequentially into other arrays

Okay, so this seems simple, but I can't think of a straightforward solution;
Basically I have an object array in C# that contains, say, 102 elements. I then also have 4 other empty arrays. I want to iterate through the original array and distribute the 100 elements evenly, then distribute 101 and 102 to the 1st and 2nd new arrays respectively.
int i = 1, a = 0, b = 0, c = 0, d = 0;
foreach (ReviewStatus data in routingData)
{
    if (i == 1)
    {
        threadOneWork[a] = data;
        a++;
    }
    if (i == 2)
    {
        threadTwoWork[b] = data;
        b++;
    }
    if (i == 3)
    {
        threadThreeWork[c] = data;
        c++;
    }
    if (i == 4)
    {
        threadFourWork[d] = data;
        d++;
        i = 0;
    }
    i++;
}
Now the above code definitely works, but I was curious: does anybody know of a better way to do this?
var workArrays = new[] {
    threadOneWork,
    threadTwoWork,
    threadThreeWork,
    threadFourWork,
};
for (int i = 0; i < routingData.Length; i++) {
    workArrays[i % 4][i / 4] = routingData[i];
}
Put the four arrays into an array of arrays, and use i%4 as an index. Assuming that thread###Work arrays have enough space to store the data, you can do this:
var tw = new[] {threadOneWork, threadTwoWork, threadThreeWork, threadFourWork};
var i = 0;
foreach (ReviewStatus data in routingData) {
    tw[i % 4][i / tw.Length] = data;
    i++;
}
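The "enough space" assumption can be made explicit by sizing the target arrays from the input length. This is a sketch under the assumption that the work arrays may be allocated by the helper rather than supplied by the caller; RoundRobin and Distribute are illustrative names.

```csharp
public static class RoundRobin
{
    // Distributes items over bucketCount arrays in round-robin order.
    // Bucket k receives items k, k + bucketCount, k + 2 * bucketCount, ...
    public static T[][] Distribute<T>(T[] items, int bucketCount)
    {
        var buckets = new T[bucketCount][];
        for (int k = 0; k < bucketCount; k++)
        {
            // The first (items.Length % bucketCount) buckets get one extra item.
            int size = items.Length / bucketCount
                       + (k < items.Length % bucketCount ? 1 : 0);
            buckets[k] = new T[size];
        }

        for (int i = 0; i < items.Length; i++)
            buckets[i % bucketCount][i / bucketCount] = items[i];

        return buckets;
    }
}
```

With 102 items and 4 buckets, the bucket sizes are 26, 26, 25 and 25, and items 101 and 102 land in the first and second bucket as the question describes.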
LINQ is your friend! Group the items by their index modulo the number of arrays (4 in your case).
For example, this code splits them into four groups:
var Items = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Items.Select( ( i, index ) => new {
        category = index % 4,
        value = i
    } )
    .GroupBy( itm => itm.category, itm => itm.value )
    .ToList()
    .ForEach( gr => Console.WriteLine("Group {0} : {1}", gr.Key, string.Join(",", gr)));
/* output
Group 0 : 1,5,9
Group 1 : 2,6,10
Group 2 : 3,7
Group 3 : 4,8
*/

Merge elements in IEnumerable according to a condition

I am looking for a fast and efficient way to merge items in an array. This is my scenario: the collection is sorted by From. Adjacent elements do not necessarily differ by 1, that is, there can be gaps between the last To and the next From, but they never overlap.
var list = new List<Range>();
list.Add(new Range() { From = 0, To = 1, Category = "AB" });
list.Add(new Range() { From = 2, To = 3, Category = "AB" });
list.Add(new Range() { From = 4, To = 5, Category = "AB" });
list.Add(new Range() { From = 6, To = 8, Category = "CD" });
list.Add(new Range() { From = 9, To = 11, Category = "AB" }); // 12 is missing, this is ok
list.Add(new Range() { From = 13, To = 15, Category = "AB" });
I would like the collection above to be merged in such a way that the first three elements become one element (the number can vary, from two elements to as many as satisfy the condition). Elements with different categories cannot be merged.
new Range() { From = 0, To = 5, Category = "AB" };
So that the resulting collection would have 4 elements total.
0 - 5 AB
6 - 8 CD
9 - 11 AB // no merging here, 12 is missing
13 - 15 AB
I have a very large collection with over 2,000,000 items and I would like to do this as efficiently as possible.
Here's a generic, reusable solution rather than an ad hoc, specific solution.
(Updated based on comments)
// Extension method: must be declared inside a static class.
public static IEnumerable<T> Merge<T>(this IEnumerable<T> coll,
    Func<T, T, bool> canBeMerged, Func<T, T, T> mergeItems)
{
    using (IEnumerator<T> iter = coll.GetEnumerator())
    {
        if (iter.MoveNext())
        {
            T lhs = iter.Current;
            while (iter.MoveNext())
            {
                T rhs = iter.Current;
                if (canBeMerged(lhs, rhs))
                    lhs = mergeItems(lhs, rhs); // extend the current run
                else
                {
                    yield return lhs;
                    lhs = rhs;
                }
            }
            yield return lhs; // emit the last run
        }
    }
}
You will have to provide methods to determine whether two items can be merged and to merge them.
These really should be part of your Range class, so it would be called like this:
list.Merge((l,r)=> l.IsFollowedBy(r), (l,r)=> l.CombineWith(r));
If you don't have these methods, you would have to call it like this:
list.Merge((l,r)=> l.Category==r.Category && l.To +1 == r.From,
(l,r)=> new Range(){From = l.From, To=r.To, Category = l.Category});
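Neither the Range class nor IsFollowedBy and CombineWith are shown anywhere in the question, so the following is only a guess at what they might look like, built from the From, To and Category properties used in the sample data.

```csharp
// Hypothetical Range class; the property names come from the question,
// the two methods are assumptions made for illustration.
public class Range
{
    public int From { get; set; }
    public int To { get; set; }
    public string Category { get; set; }

    // True when other starts immediately after this range ends
    // and both ranges share the same category.
    public bool IsFollowedBy(Range other) =>
        Category == other.Category && To + 1 == other.From;

    // Returns a new range spanning from the start of this range to the end of other.
    public Range CombineWith(Range other) =>
        new Range { From = From, To = other.To, Category = Category };
}
```

CombineWith returns a new object instead of changing To in place, so the items of the original list stay untouched while they are being merged.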
Well, from the statement of the problem I think it is obvious that you cannot avoid iterating through the original collection of 2 million items:
var output = new List<Range>();
var currentFrom = list[0].From;
var currentTo = list[0].To;
var currentCategory = list[0].Category;
for (int i = 1; i < list.Count; i++)
{
    var item = list[i];
    if (item.Category == currentCategory && item.From == currentTo + 1)
        currentTo = item.To;
    else
    {
        output.Add(new Range { From = currentFrom, To = currentTo,
            Category = currentCategory });
        currentFrom = item.From;
        currentTo = item.To;
        currentCategory = item.Category;
    }
}
output.Add(new Range { From = currentFrom, To = currentTo,
    Category = currentCategory });
I’d be interested to see if there is a solution more optimised for performance.
Edit: I assumed that the input list is sorted. If it is not, I recommend sorting it first instead of trying to fiddle this into the algorithm. Sorting is only O(n log n), but if you tried to fiddle it in, you easily get O(n²), which is worse.
list.Sort((a, b) => a.From < b.From ? -1 : a.From > b.From ? 1 : 0);
As an aside, I wrote this solution because you asked for one that is performance-optimised. To this end, I didn’t make it generic, I didn’t use delegates, I didn’t use Linq extension methods, and I used local variables of primitive types and tried to avoid accessing object fields as much as possible.
Here's another one :
IEnumerable<Range> Merge(IEnumerable<Range> input)
{
    input = input.OrderBy(r => r.Category).ThenBy(r => r.From).ThenBy(r => r.To).ToArray();
    var ignored = new HashSet<Range>();
    foreach (Range r1 in input)
    {
        if (ignored.Contains(r1))
            continue;
        Range tmp = r1;
        foreach (Range r2 in input)
        {
            if (tmp == r2 || ignored.Contains(r2))
                continue;
            Range merged;
            if (TryMerge(tmp, r2, out merged))
            {
                tmp = merged;
                ignored.Add(r1);
                ignored.Add(r2);
            }
        }
        yield return tmp;
    }
}
bool TryMerge(Range r1, Range r2, out Range merged)
{
    merged = null;
    if (r1.Category != r2.Category)
        return false;
    if (r1.To + 1 < r2.From || r2.To + 1 < r1.From)
        return false;
    merged = new Range
    {
        From = Math.Min(r1.From, r2.From),
        To = Math.Max(r1.To, r2.To),
        Category = r1.Category
    };
    return true;
}
You could use it directly:
var mergedList = Merge(list);
But that would be very inefficient if you have many items, as the complexity is O(n²). However, since only items in the same category can be merged, you can group them by category, merge each group, and then flatten the result:
var mergedList = list.GroupBy(r => r.Category)
.Select(g => Merge(g))
.SelectMany(g => g);
Assuming that the list is sorted -and- the ranges are non overlapping, as you have stated in the question, this will run in O(n) time:
var flattenedRanges = new List<Range>{new Range(list.First())};
foreach (var range in list.Skip(1))
{
    if (flattenedRanges.Last().To + 1 == range.From && flattenedRanges.Last().Category == range.Category)
        flattenedRanges.Last().To = range.To;
    else
        flattenedRanges.Add(new Range(range));
}
This assumes that Range has a copy constructor.
EDIT:
Here's an in-place algorithm:
for (int i = 1; i < list.Count(); i++)
{
    if (list[i].From == list[i - 1].To+1 && list[i-1].Category == list[i].Category)
    {
        list[i - 1].To = list[i].To;
        list.RemoveAt(i--);
    }
}
EDIT:
Added the category check, and fixed the inplace version.
