SortedSet.Remove() does not remove anything - c#

I am currently implementing Dijkstra's algorithm and I am using the C# SortedSet as a priority queue.
However, in order to keep track of what vertices I have already visited, I want to remove the first item from the priority queue.
Here is my code:
static int shortestPath(int start, int target)
{
SortedSet<int> PQ = new SortedSet<int>(new compareByVertEstimate());
for (int i = 0; i < n; i++)
{
if (i == start - 1)
vertices[i].estimate = 0;
else
vertices[i].estimate = int.MaxValue;
PQ.Add(i);
}
int noOfVisited = 0;
while (noOfVisited < n)
{
int u = PQ.First();
noOfVisited++;
foreach (Edge e in vertices[u].neighbours)
{
if (vertices[e.target.Item1].estimate > vertices[u].estimate + e.length)
{
vertices[e.target.Item1].estimate = vertices[u].estimate + e.length;
}
}
PQ.Remove(u);
}
return vertices[target - 1].estimate;
}
And this is the comparer:
public class compareByVertEstimate : IComparer<int>
{
public int Compare(int a, int b)
{
if (Program.vertices[a].estimate >= Program.vertices[b].estimate) return 1;
else return -1;
}
}
My priority queue does not explicitly hold vertices, instead I have an array of vertices and the priority queue holds the indices.
So the priority queue is sorted based on the 'estimate' integer that each vertex holds.
Now my problem is, I can easily retrieve the first element from the SortedSet using .First(), or .Min, but when I try to remove that element with .Remove(), the method returns false, and nothing gets removed. The SortedSet remains unchanged.
Any ideas on how to fix this?
Thanks in advance!
EDIT
I changed the Comparer to this:
public class compareByVertEstimate : IComparer<int>
{
public int Compare(int a, int b)
{
if (Program.vertices[a].estimate == Program.vertices[b].estimate) return 0;
else if (Program.vertices[a].estimate >= Program.vertices[b].estimate) return 1;
else return -1;
}
}
But now the priority queue doesn't contain all the right elements anymore.
(Note, the priority queue will contain pointers to vertices that have the same estimate value)

Your compare function never compares two elements as equal (return 0;).
Your collection will not be able to remove an element that is not equal to any element it holds.
Example:
public class compareByVertEstimate : IComparer<int>
{
public int Compare(int a, int b)
{
if (a == b) return 0;
if (Program.vertices[a].estimate >= Program.vertices[b].estimate) return 1;
return -1;
}
}
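@hvd is correct, of course: while the above version works, it's quite broken, because for two different vertices with equal estimates both Compare(a, b) and Compare(b, a) return 1, so the ordering is inconsistent. A better comparer might be: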
public class compareByVertEstimate : IComparer<int>
{
public int Compare(int a, int b)
{
var ae = Program.vertices[a].estimate;
var be = Program.vertices[b].estimate;
var result = ae.CompareTo(be);
if (result == 0) return a.CompareTo(b);
return result;
}
}
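Even with a consistent comparer, one caveat remains for the Dijkstra loop in the question: SortedSet<T> does not re-sort itself when a key changes, so an estimate that is modified while its index is still in the set can leave the set unable to find that element later. A minimal sketch of the usual workaround (remove the index, change the estimate, add the index back) follows. The names EstimateComparer, DijkstraSketch, Relax and the plain int[] estimate array are hypothetical stand-ins for the question's vertices array and comparer:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders vertex indices by estimate, ties broken by index.
public class EstimateComparer : IComparer<int>
{
    private readonly int[] estimate;
    public EstimateComparer(int[] estimate) { this.estimate = estimate; }

    public int Compare(int a, int b)
    {
        int result = estimate[a].CompareTo(estimate[b]);
        return result != 0 ? result : a.CompareTo(b);
    }
}

public static class DijkstraSketch
{
    // Lowers the estimate of vertex v while keeping the sorted set consistent.
    public static void Relax(SortedSet<int> pq, int[] estimate, int v, int newEstimate)
    {
        if (newEstimate >= estimate[v]) return;
        bool wasQueued = pq.Remove(v);   // remove while the old key is still in place
        estimate[v] = newEstimate;       // now it is safe to change the key
        if (wasQueued) pq.Add(v);        // re-insert under the new key
    }
}
```

In the question's loop this would replace the direct assignment to vertices[e.target.Item1].estimate.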


c# - Binary Search Algorithm for Custom Class String List

Basically, I need help adapting my binary search algorithm to work with my string list, as seen below. Note: I have to use a hand-written binary search algorithm, with no use of built-in C# functions like .BinarySearch.
I will now show you how the list is formatted and the list itself:
// This class formats the list, might be useful to know
public class Demo
{
public string Col;
public string S1;
public string S2;
public string S3;
public override string ToString()
{
return string.Format("Col: {0}, S1: {1}, S2: {2}, S3: {3}", Col, S1, S2, S3);
}
}
// The list itself
var list = new List<Demo>
{
new Demo {Col = "Blue", S1 ="88", S2 ="Yes"},
new Demo {Col = "Green", S1 ="43", S2 ="Yes"},
new Demo {Col = "Red", S1 ="216", S2 ="No"},
new Demo {Col = "Yellow", S1 ="100", S2 ="No"}
};
The list is already sorted into alphabetical order of the 'Col' string values, hence why Blue is first and Yellow is last. The 'Col' is the part of the list that needs to be searched. Below I have inserted my current Binary Search that can search int arrays.
public static int BinarySearch_R(int key, int[] array, int low, int high)
{
if (low > high) return -1;
int mid = (low + high) / 2;
if (key == array[mid])
{
return mid;
}
if (key < array[mid]) {
return BinarySearch_R(key, array, low, mid - 1);
} else {
return BinarySearch_R(key, array, mid + 1, high);
}
}
I need help adapting my BinarySearch algorithm to work for the list above. If you guys have any questions, or need to see more of my code, just ask.
Concrete answer: Adapting your method for the specific case is quite easy.
Let's first update your existing method to use a more general mechanism (IComparable<T>.CompareTo) for comparing, rather than the int operators:
public static int BinarySearch_R(int key, int[] array, int low, int high)
{
if (low > high) return -1;
int mid = (low + high) / 2;
int compare = key.CompareTo(array[mid]);
if (compare == 0)
{
return mid;
}
if (compare < 0)
{
return BinarySearch_R(key, array, low, mid - 1);
}
else {
return BinarySearch_R(key, array, mid + 1, high);
}
}
Then all you need is to copy/paste the above method, replace int key with string key, int[] array with List<Demo> array and array[mid] with array[mid].Col:
public static int BinarySearch_R(string key, List<Demo> array, int low, int high)
{
if (low > high) return -1;
int mid = (low + high) / 2;
int compare = key.CompareTo(array[mid].Col);
if (compare == 0)
{
return mid;
}
if (compare < 0)
{
return BinarySearch_R(key, array, low, mid - 1);
}
else {
return BinarySearch_R(key, array, mid + 1, high);
}
}
Extended answer: While you can do the above, you would have to repeat it for every other property or class that needs this capability.
A much better approach would be to generalize the code. For instance, int[] and List<Demo> can be generalized as IReadOnlyList<T>, int/string key as TKey key, Demo.Col as Func<T, TKey>, CompareTo as IComparer<TKey>.Compare, so the final generic method could be like this:
public static class MyAlgorithms
{
public static int BinarySearch<T, TKey>(this IReadOnlyList<T> source, Func<T, TKey> keySelector, TKey key, IComparer<TKey> keyComparer = null)
{
return source.BinarySearch(0, source.Count, keySelector, key, keyComparer);
}
public static int BinarySearch<T, TKey>(this IReadOnlyList<T> source, int start, int count, Func<T, TKey> keySelector, TKey key, IComparer<TKey> keyComparer = null)
{
// Argument validations skipped
if (keyComparer == null) keyComparer = Comparer<TKey>.Default;
int lo = start, hi = start + count - 1;
while (lo <= hi)
{
int mid = lo + (hi - lo) / 2;
int compare = keyComparer.Compare(key, keySelector(source[mid]));
if (compare < 0)
hi = mid - 1;
else if (compare > 0)
lo = mid + 1;
else
return mid;
}
return -1;
}
}
Now you can use that single method for any data structure. For instance, searching your List<Demo> by Col would be like this:
int index = list.BinarySearch(e => e.Col, "Red");
I've only done the most basic things in C#, so this might be completely useless. I had an assignment for a CS 2 class that sounds somewhat similar to what you want, but we used Java. So I'm going to assume you want your list of items sorted by some keyword ("Blue", "Green", etc.). I used a LinkedList, but it doesn't matter.
class Node {
String keyword;
LinkedList<String> records = new LinkedList<>();
Node left;
Node right;
public Node(String keyword, LinkedList<String> records) {
this.keyword = keyword;
this.records = records;
}
}
Now, the only real difference, as far as I can tell, between a BST sorted by strings and one sorted by numbers is that you need some kind of comparison method to decide whether one word comes before or after another alphabetically. So here's how I did the insert function:
/**
 * insert node
 * @param keyword compare it to other strings
 */
public void insert(String keyword, LinkedList<String> records) {
//create a new Node
Node n = new Node(keyword, records);
int result;
Node current = root;
Node parent = null;
//cont. until NULL
while (current != null) {
result = current.keyword.compareTo(n.keyword);
if (result == 0) return;
else if (result > 0) {
parent = current;
current = current.left;
}
else if (result < 0) {
parent = current;
current = current.right;
}
}
if (parent == null) root = n;
else {
result = parent.keyword.compareTo(n.keyword);
if (result > 0) parent.left = n;
else if (result < 0) parent.right = n;
}
}
So the method compareTo(...) returns a positive number if the string comes later in the alphabet, 0 if it is the same, and a negative number if it comes earlier. So, if I'm at all close to understanding what you're asking, I would get the C# version of this method and implement the BST as you normally would.
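For the C# side, a minimal sketch of an equivalent comparison: string.CompareOrdinal likewise returns a negative number, zero, or a positive number. The wrapper name KeywordOrder.Compare is hypothetical, and ordinal comparison is an assumption here; a culture-aware comparison may be preferable for user-facing text.

```csharp
using System;

public static class KeywordOrder
{
    // Returns a negative number, zero, or a positive number, like Java's compareTo.
    public static int Compare(string a, string b)
    {
        return string.CompareOrdinal(a, b);
    }
}
```

Only the sign of the result should be tested (result > 0, result < 0), never equality with 1 or -1.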
Just make the class implement IComparable<Demo> and write a custom CompareTo() method. Standard methods like Sort will work automatically once the class implements IComparable<Demo>.
public class Demo : IComparable<Demo>
{
    public string Color;
    public int value;
    public Boolean truth;

    // Orders by Color first, then by value, then by truth.
    public int CompareTo(Demo other)
    {
        int results = 0;
        if (this.Color == other.Color)
        {
            if (this.value == other.value)
            {
                results = this.truth.CompareTo(other.truth);
            }
            else
            {
                results = this.value.CompareTo(other.value);
            }
        }
        else
        {
            results = this.Color.CompareTo(other.Color);
        }
        return results;
    }
}

(Dynamic programming) How to maximize room utilization with a list of meeting?

I am trying this problem using dynamic programming
Problem:
Given a meeting room and a list of intervals (represent the meeting), for e.g.:
interval 1: 1.00-2.00
interval 2: 2.00-4.00
interval 3: 14.00-16.00
...
etc.
Question:
How to schedule the meeting to maximize the room utilization, and NO meeting should overlap with each other?
Attempted solution
Below is my initial attempt in C# (knowing it is a modified Knapsack problem with constraints). However I had difficulty in getting the result correctly.
bool ContainsOverlapped(List<Interval> list)
{
var sortedList = list.OrderBy(x => x.Start).ToList();
for (int i = 0; i < sortedList.Count; i++)
{
for (int j = i + 1; j < sortedList.Count; j++)
{
if (sortedList[i].IsOverlap(sortedList[j]))
return true;
}
}
return false;
}
public bool Optimize(List<Interval> intervals, int limit, List<Interval> itemSoFar){
if (intervals == null || intervals.Count == 0)
return true; //no more choice
if (Sum(itemSoFar) > limit) //over limit
return false;
var arrInterval = intervals.ToArray();
//try all choices
for (int i = 0; i < arrInterval.Length; i++){
List<Interval> remaining = new List<Interval>();
for (int j = i + 1; j < arrInterval.Length; j++) {
remaining.Add(arrInterval[j]);
}
var partialChoice = new List<Interval>();
partialChoice.AddRange(itemSoFar);
partialChoice.Add(arrInterval[i]);
//should not schedule overlap
if (ContainsOverlapped(partialChoice))
partialChoice.Remove(arrInterval[i]);
if (Optimize(remaining, limit, partialChoice))
return true;
else
partialChoice.Remove(arrInterval[i]); //undo
}
//try all solution
return false;
}
public class Interval
{
public bool IsOverlap(Interval other)
{
return (other.Start < this.Start && this.Start < other.End) || //other < this
(this.Start < other.Start && other.End < this.End) || // this covers other
(other.Start < this.Start && this.End < other.End) || // other covers this
(this.Start < other.Start && other.Start < this.End); //this < other
}
public override bool Equals(object obj){
var i = (Interval)obj;
return base.Equals(obj) && i.Start == this.Start && i.End == this.End;
}
public int Start { get; set; }
public int End { get; set; }
public Interval(int start, int end){
Start = start;
End = end;
}
public int Duration{
get{
return End - Start;
}
}
}
Edit 1
Room utilization = amount of time the room is occupied. Sorry for confusion.
Edit 2
for simplicity: the duration of each interval is integer, and the start/end time start at whole hour (1,2,3..24)
I'm not sure how you are relating this to a knapsack problem. To me it seems more of a vertex cover problem.
First sort the intervals as per their start times and form a graph representation in the form of adjacency matrix or list.
The vertices shall be the interval numbers. There shall be an edge between two vertices if the corresponding intervals overlap with each other. Also, each vertex shall be associated with a value equal to the interval's duration.
The problem then becomes choosing the independent vertices in such a way that the total value is maximum.
This can be done through dynamic programming. The recurrence relation for each vertex shall be as follows:
V[i] = max{ V[j] | j < i and i->j is an edge,
V[k] + value[i] | k < i and there is no edge between i and k }
Base Case V[1] = value[1]
Note:
The vertices should be numbered in increasing order of their start times. Then if there are three vertices:
i < j < k, and if there is no edge between vertex i and vertex j, then there cannot be any edge between vertex i and vertex k.
A good approach is to create a class that handles this for you.
First, I create a helper class for storing intervals:
public class FromToDateTime
{
private DateTime _start;
public DateTime Start
{
get
{
return _start;
}
set
{
_start = value;
}
}
private DateTime _end;
public DateTime End
{
get
{
return _end;
}
set
{
_end = value;
}
}
public FromToDateTime(DateTime start, DateTime end)
{
Start = start;
End = end;
}
}
And then here is the class Room, which holds all the intervals and has a method "addInterval" that returns true if the interval does not overlap any existing one and was added, and false otherwise.
By the way, I got the overlap check from here: Algorithm to detect overlapping periods
public class Room
{
private List<FromToDateTime> _intervals;
public List<FromToDateTime> Intervals
{
get
{
return _intervals;
}
set
{
_intervals = value;
}
}
public Room()
{
Intervals = new List<FromToDateTime>();
}
public bool addInterval(FromToDateTime newInterval)
{
foreach (FromToDateTime interval in Intervals)
{
if (newInterval.Start < interval.End && interval.Start < newInterval.End)
{
return false;
}
}
Intervals.Add(newInterval);
return true;
}
}
The more general problem (with multiple meeting rooms) is indeed NP-hard, and is known as the interval scheduling problem.
Optimal solution for 1-d problem with one classroom:
For the 1-d problem, choosing the (still valid) earliest deadline first solves the problem optimally.
Proof: by induction. The base case is the empty case: the algorithm optimally solves a problem with zero meetings.
The induction hypothesis is that the algorithm solves the problem optimally for any number k < n of tasks.
The step: given a problem with n meetings, choose the earliest deadline, and remove all invalid meetings after choosing it. Let the chosen earliest-deadline task be T.
You will get a new problem of smaller size, and by invoking the algorithm on the remainder, you will get the optimal solution for it (induction hypothesis).
Now, note that given that optimal solution, you can add at most one of the discarded tasks: you can add either T or another discarded task, but all of them overlap T (otherwise they wouldn't have been discarded). Thus, you can add at most one of the discarded tasks, the same as the suggested algorithm does.
Conclusion: For 1 meeting room, this algorithm is optimal.
QED
high level pseudo code of the solution:
findOptimal(list<tasks>):
    res = [] //empty list
    sort(list) //according to deadline/meeting end
    while (list.IsEmpty() == false):
        res = res.append(list.first())
        end = list.first().endTime()
        //remove all overlaps with the chosen meeting (including the chosen meeting itself)
        while (list.IsEmpty() == false and list.first().startTime() < end):
            list.removeFirst()
    return res
Clarification: This answer assumes "Room Utilization" means maximize number of meetings placed in the room.
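The greedy pseudocode in this answer could be sketched in C# roughly as follows. This is only an illustration under assumptions: meetings are modelled as (Start, End) tuples rather than the question's Interval class, the name GreedyScheduler.FindOptimal is hypothetical, and a meeting that starts exactly when the previous one ends is treated as non-overlapping. As the clarification says, it maximizes the number of meetings, not the occupied time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GreedyScheduler
{
    // Picks a maximum-size set of non-overlapping meetings (earliest end time first).
    public static List<(int Start, int End)> FindOptimal(IEnumerable<(int Start, int End)> meetings)
    {
        var result = new List<(int Start, int End)>();
        int lastEnd = int.MinValue;
        foreach (var m in meetings.OrderBy(x => x.End))
        {
            // A meeting is still valid if it starts no earlier than the last chosen one ends.
            if (m.Start >= lastEnd)
            {
                result.Add(m);
                lastEnd = m.End;
            }
        }
        return result;
    }
}
```

Instead of physically removing overlapping meetings from the list, the sketch skips them while scanning the list sorted by end time, which has the same effect.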
Thanks all, here is my solution based on this Princeton note on dynamic programming.
Algorithm:
Sort all events by end time.
For each event, find p[n] - the latest event (by end time) which does not overlap with it.
Compute the optimization values: choose the best between including/not including the event.
Optimize(n) {
opt(0) = 0;
for j = 1 to n-th {
opt(j) = max(length(j) + opt[p(j)], opt[j-1]);
}
}
The complete source-code:
using System;
using System.Collections.Generic;
using System.Linq;
namespace CommonProblems.Algorithm.DynamicProgramming {
public class Scheduler {
#region init & test
public List<Event> _events { get; set; }
public List<Event> Init() {
if (_events == null) {
_events = new List<Event>();
_events.Add(new Event(8, 11));
_events.Add(new Event(6, 10));
_events.Add(new Event(5, 9));
_events.Add(new Event(3, 8));
_events.Add(new Event(4, 7));
_events.Add(new Event(0, 6));
_events.Add(new Event(3, 5));
_events.Add(new Event(1, 4));
}
return _events;
}
public void DemoOptimize() {
this.Init();
this.DynamicOptimize(this._events);
}
#endregion
#region Dynamic Programming
public void DynamicOptimize(List<Event> events) {
events.Add(new Event(0, 0));
events = events.SortByEndTime();
int[] eventIndexes = getCompatibleEvent(events);
int[] utilization = getBestUtilization(events, eventIndexes);
List<Event> schedule = getOptimizeSchedule(events, events.Count - 1, utilization, eventIndexes);
foreach (var e in schedule) {
Console.WriteLine("Event: [{0}- {1}]", e.Start, e.End);
}
}
/*
Algo to get optimization value:
1) Sort all events by end time, give each of the an index.
2) For each event, find p[n] - the latest event (by end time) which does not overlap with it.
3) Compute the optimization values: choose the best between including/not including the event.
Optimize(n) {
opt(0) = 0;
for j = 1 to n-th {
opt(j) = max(length(j) + opt[p(j)], opt[j-1]);
}
display opt();
}
*/
int[] getBestUtilization(List<Event> sortedEvents, int[] compatibleEvents) {
int[] optimal = new int[sortedEvents.Count];
int n = optimal.Length;
optimal[0] = 0;
for (int j = 1; j < n; j++) {
var thisEvent = sortedEvents[j];
//pick between 2 choices:
optimal[j] = Math.Max(thisEvent.Duration + optimal[compatibleEvents[j]], //Include this event
optimal[j - 1]); //Not include
}
return optimal;
}
/*
Show the optimized events:
sortedEvents: events sorted by end time.
index: event index to start with.
optimal: optimal[n] = the optimized schedule at n-th event.
compatibleEvents: compatibleEvents[n] = the latest event before n-th
*/
List<Event> getOptimizeSchedule(List<Event> sortedEvents, int index, int[] optimal, int[] compatibleEvents) {
List<Event> output = new List<Event>();
if (index == 0) {
//base case: no more event
return output;
}
//it's better to choose this event
else if (sortedEvents[index].Duration + optimal[compatibleEvents[index]] >= optimal[index]) {
output.Add(sortedEvents[index]);
//recursive go back
output.AddRange(getOptimizeSchedule(sortedEvents, compatibleEvents[index], optimal, compatibleEvents));
return output;
}
//it's better NOT choose this event
else {
output.AddRange(getOptimizeSchedule(sortedEvents, index - 1, optimal, compatibleEvents));
return output;
}
}
//compatibleEvents[n] = the latest event which do not overlap with n-th.
int[] getCompatibleEvent(List<Event> sortedEvents) {
int[] compatibleEvents = new int[sortedEvents.Count];
for (int i = 0; i < sortedEvents.Count; i++) {
for (int j = 0; j <= i; j++) {
if (!sortedEvents[j].IsOverlap(sortedEvents[i])) {
compatibleEvents[i] = j;
}
}
}
return compatibleEvents;
}
#endregion
}
public class Event {
public int EventId { get; set; }
public bool IsOverlap(Event other) {
return !(this.End <= other.Start ||
this.Start >= other.End);
}
public override bool Equals(object obj) {
var i = (Event)obj;
return base.Equals(obj) && i.Start == this.Start && i.End == this.End;
}
public int Start { get; set; }
public int End { get; set; }
public Event(int start, int end) {
Start = start;
End = end;
}
public int Duration {
get {
return End - Start;
}
}
}
public static class ListExtension {
public static bool ContainsOverlapped(this List<Event> list) {
var sortedList = list.OrderBy(x => x.Start).ToList();
for (int i = 0; i < sortedList.Count; i++) {
for (int j = i + 1; j < sortedList.Count; j++) {
if (sortedList[i].IsOverlap(sortedList[j]))
return true;
}
}
return false;
}
public static List<Event> SortByEndTime(this List<Event> events) {
if (events == null) return new List<Event>();
return events.OrderBy(x => x.End).ToList();
}
}
}

What does ParallelQuery's Count count?

I'm testing a self written element generator (ICollection<string>) and compare the calculated count to the actual count to get an idea if there's an error or not in my algorithm.
As this generator can generate lots of elements on demand I'm looking in Partitioner<string> and I have implemented a basic one which seems to also produce valid enumerators which together give the same amount of strings as calculated.
Now I want to test how this behaves if run parallel (again first testing for correct count):
MyGenerator generator = new MyGenerator();
MyPartitioner partitioner = new MyPartitioner(generator);
int isCount = partitioner.AsParallel().Count();
int shouldCount = generator.Count;
bool same = isCount == shouldCount; // false
I don't get why this count is not equal! What is the ParallelQuery<string> doing?
generator.Count() == generator.Count // true
partitioner.GetPartitions(xyz).Select(enumerator =>
{
int count = 0;
while (enumerator.MoveNext())
{
count++;
}
return count;
}).Sum() == generator.Count // true
So, I'm currently not seeing an error in my code. Next I tried to manually count that ParallelQuery<string>:
int count = 0;
partitioner.AsParallel().ForAll(e => Interlocked.Increment(ref count));
count == generator.Count // true
Summed up: everyone counts my enumerable correctly, and ParallelQuery.ForAll enumerates exactly generator.Count elements. But what does ParallelQuery.Count() do?
If the correct count is something about 10k, ParallelQuery sees 40k.
internal sealed class PartialWordEnumerator : IEnumerator<string>
{
private object sync = new object();
private readonly IEnumerable<char> characters;
private readonly char[] limit;
private char[] buffer;
private IEnumerator<char>[] enumerators;
private int position = 0;
internal PartialWordEnumerator(IEnumerable<char> characters, char[] state, char[] limit)
{
this.characters = new List<char>(characters);
this.buffer = (char[])state.Clone();
if (limit != null)
{
this.limit = (char[])limit.Clone();
}
this.enumerators = new IEnumerator<char>[this.buffer.Length];
for (int i = 0; i < this.buffer.Length; i++)
{
this.enumerators[i] = SkipTo(state[i]);
}
}
private IEnumerator<char> SkipTo(char c)
{
IEnumerator<char> first = this.characters.GetEnumerator();
IEnumerator<char> second = this.characters.GetEnumerator();
while (second.MoveNext())
{
if (second.Current == c)
{
return first;
}
first.MoveNext();
}
throw new InvalidOperationException();
}
private bool ReachedLimit
{
get
{
if (this.limit == null)
{
return false;
}
for (int i = 0; i < this.buffer.Length; i++)
{
if (this.buffer[i] != this.limit[i])
{
return false;
}
}
return true;
}
}
public string Current
{
get
{
if (this.buffer == null)
{
throw new ObjectDisposedException(typeof(PartialWordEnumerator).FullName);
}
return new string(this.buffer);
}
}
object IEnumerator.Current
{
get { return this.Current; }
}
public bool MoveNext()
{
lock (this.sync)
{
if (this.position == this.buffer.Length)
{
this.position--;
}
if (this.position == -1)
{
return false;
}
IEnumerator<char> enumerator = this.enumerators[this.position];
if (enumerator.MoveNext())
{
this.buffer[this.position] = enumerator.Current;
this.position++;
if (this.position == this.buffer.Length)
{
return !this.ReachedLimit;
}
else
{
return this.MoveNext();
}
}
else
{
this.enumerators[this.position] = this.characters.GetEnumerator();
this.position--;
return this.MoveNext();
}
}
}
public void Dispose()
{
this.position = -1;
this.buffer = null;
}
public void Reset()
{
throw new NotSupportedException();
}
}
public override IList<IEnumerator<string>> GetPartitions(int partitionCount)
{
IEnumerator<string>[] enumerators = new IEnumerator<string>[partitionCount];
List<char> characters = new List<char>(this.generator.Characters);
int length = this.generator.Length;
int characterCount = this.generator.Characters.Count;
int steps = Math.Min(characterCount, partitionCount);
int skip = characterCount / steps;
for (int i = 0; i < steps; i++)
{
char c = characters[i * skip];
char[] state = new string(c, length).ToCharArray();
char[] limit = null;
if ((i + 1) * skip < characterCount)
{
c = characters[(i + 1) * skip];
limit = new string(c, length).ToCharArray();
}
if (i == steps - 1)
{
limit = null;
}
enumerators[i] = new PartialWordEnumerator(characters, state, limit);
}
for (int i = steps; i < partitionCount; i++)
{
enumerators[i] = Enumerable.Empty<string>().GetEnumerator();
}
return enumerators;
}
EDIT: I believe I have found the solution. According to the documentation on IEnumerator.MoveNext (emphasis mine):
If MoveNext passes the end of the collection, the enumerator is
positioned after the last element in the collection and MoveNext
returns false. When the enumerator is at this position, subsequent
calls to MoveNext also return false until Reset is called.
According to the following logic:
private bool ReachedLimit
{
get
{
if (this.limit == null)
{
return false;
}
for (int i = 0; i < this.buffer.Length; i++)
{
if (this.buffer[i] != this.limit[i])
{
return false;
}
}
return true;
}
}
The call to MoveNext() will return false only one time - when the buffer is exactly equal to the limit. Once you have passed the limit, the return value from ReachedLimit will start to become false again, making return !this.ReachedLimit return true, so the enumerator will continue past the end of the limit all the way until it runs out of characters to enumerate. Apparently, in the implementation of ParallelQuery.Count(), MoveNext() is called multiple times when it has reached the end, and since it starts to return a true value again, the enumerator happily continues returning more elements (this is not the case in your custom code that walks the enumerator manually, and apparently also is not the case for the ForAll call, so they "accidentally" return the correct results).
The simplest fix to this is to remember the return value from MoveNext() once it becomes false:
private bool _canMoveNext = true;
public bool MoveNext()
{
if (!_canMoveNext) return false;
...
if (this.position == this.buffer.Length)
{
if (this.ReachedLimit) _canMoveNext = false;
...
}
Now once it begins returning false, it will return false for every future call and this returns the correct result from AsParallel().Count(). Hope this helps!
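To show the "latched" MoveNext pattern in isolation, here is a minimal self-contained sketch. BoundedCounter is a hypothetical enumerator, not the PartialWordEnumerator from the question; like that class, it only detects the limit when the current value is exactly equal to it, so without the latch a further call would step past the limit and start returning true again.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator: yields start, start + 1, ... and stops when it hits 'limit'.
public sealed class BoundedCounter : IEnumerator<int>
{
    private readonly int limit;
    private int next;
    private bool finished;   // latched once MoveNext has returned false

    public BoundedCounter(int start, int limit)
    {
        this.next = start;
        this.limit = limit;
    }

    public int Current { get; private set; }
    object IEnumerator.Current { get { return Current; } }

    public bool MoveNext()
    {
        if (finished) return false;   // every later call keeps returning false
        Current = next++;
        if (Current == limit)
        {
            // Without this latch, the next call would move past the limit and return true.
            finished = true;
            return false;
        }
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

Calling MoveNext any number of times after the end now keeps returning false, which is what the MoveNext documentation quoted above requires.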
The documentation on Partitioner notes (emphasis mine):
The static methods on Partitioner are all thread-safe and may
be used concurrently from multiple threads. However, while a created
partitioner is in use, the underlying data source should not be
modified, whether from the same thread that is using a partitioner or
from a separate thread.
From what I can understand of the code you have given, it would seem that ParallelQuery.Count() is most likely to have thread-safety issues because it may possibly be iterating multiple enumerators at the same time, whereas all the other solutions would require the enumerators to be run synchronized. Without seeing the code you are using for MyGenerator and MyPartitioner, it is difficult to determine whether thread-safety issues could be the culprit.
To demonstrate, I have written a simple enumerator that returns the first hundred numbers as strings. Also, I have a partitioner, that distributes the elements in the underlying enumerator over a collection of numPartitions separate lists. Using all the methods you described above on our 12-core server (when I output numPartitions, it uses 12 by default on this machine), I get the expected result of 100 (this is LINQPad-ready code):
void Main()
{
var partitioner = new SimplePartitioner(GetEnumerator());
GetEnumerator().Count().Dump();
partitioner.GetPartitions(10).Select(enumerator =>
{
int count = 0;
while (enumerator.MoveNext())
{
count++;
}
return count;
}).Sum().Dump();
var theCount = 0;
partitioner.AsParallel().ForAll(e => Interlocked.Increment(ref theCount));
theCount.Dump();
partitioner.AsParallel().Count().Dump();
}
// Define other methods and classes here
public IEnumerable<string> GetEnumerator()
{
for (var i = 1; i <= 100; i++)
yield return i.ToString();
}
public class SimplePartitioner : Partitioner<string>
{
private IEnumerable<string> input;
public SimplePartitioner(IEnumerable<string> input)
{
this.input = input;
}
public override IList<IEnumerator<string>> GetPartitions(int numPartitions)
{
var list = new List<string>[numPartitions];
for (var i = 0; i < numPartitions; i++)
list[i] = new List<string>();
var index = 0;
foreach (var s in input)
list[(index = (index + 1) % numPartitions)].Add(s);
IList<IEnumerator<string>> result = new List<IEnumerator<string>>();
foreach (var l in list)
result.Add(l.GetEnumerator());
return result;
}
}
Output:
100
100
100
100
This clearly works. Without more information it is impossible to tell you what is not working in your particular implementation.

IComparer for integers and force empty strings to end

I've written the following IComparer but I need some help. I'm trying to sort a list of numbers but some of the numbers may not have been filled in. I want these numbers to be sent to the end of the list at all times.. for example...
[EMPTY], 1, [EMPTY], 3, 2
would become...
1, 2, 3, [EMPTY], [EMPTY]
and reversed this would become...
3, 2, 1, [EMPTY], [EMPTY]
Any ideas?
public int Compare(ListViewItem x, ListViewItem y)
{
int comparison = int.MinValue;
ListViewItem.ListViewSubItem itemOne = x.SubItems[subItemIndex];
ListViewItem.ListViewSubItem itemTwo = y.SubItems[subItemIndex];
if (!string.IsNullOrEmpty(itemOne.Text) && !string.IsNullOrEmpty(itemTwo.Text))
{
uint itemOneComparison = uint.Parse(itemOne.Text);
uint itemTwoComparison = uint.Parse(itemTwo.Text);
comparison = itemOneComparison.CompareTo(itemTwoComparison);
}
else
{
// ALWAYS SEND TO BOTTOM/END OF LIST.
}
// Calculate correct return value based on object comparison.
if (OrderOfSort == SortOrder.Descending)
{
// Descending sort is selected, return negative result of compare operation.
comparison = (-comparison);
}
else if (OrderOfSort == SortOrder.None)
{
// Return '0' to indicate they are equal.
comparison = 0;
}
return comparison;
}
Cheers.
Your logic is slightly off: your else will be entered if either of them are empty, but you only want the empty one to go to the end of the list, not the non-empty one. Something like this should work:
public int Compare(ListViewItem x, ListViewItem y)
{
ListViewItem.ListViewSubItem itemOne = x.SubItems[subItemIndex];
ListViewItem.ListViewSubItem itemTwo = y.SubItems[subItemIndex];
// if they're both empty, return 0
if (string.IsNullOrEmpty(itemOne.Text) && string.IsNullOrEmpty(itemTwo.Text))
return 0;
// if itemOne is empty, it comes second
if (string.IsNullOrEmpty(itemOne.Text))
return 1;
// if itemTwo is empty, it comes second
if (string.IsNullOrEmpty(itemTwo.Text))
return -1;
uint itemOneComparison = uint.Parse(itemOne.Text);
uint itemTwoComparison = uint.Parse(itemTwo.Text);
// Calculate correct return value based on object comparison.
int comparison = itemOneComparison.CompareTo(itemTwoComparison);
if (OrderOfSort == SortOrder.Descending)
comparison = (-comparison);
return comparison;
}
(I might've got the "1" and "-1" for when they're empty back to front, I can never remember :)
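To check the sign convention, here is a small sketch of the same idea over plain strings instead of ListViewItem objects. EmptyLastComparer and its descending flag are hypothetical names used only for illustration. Returning 1 when x is empty means "x sorts after y", and because the empty checks return before the result is negated, empty entries stay at the end in both directions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer over plain strings: numeric order, empty entries always last.
public class EmptyLastComparer : IComparer<string>
{
    private readonly bool descending;

    public EmptyLastComparer(bool descending = false)
    {
        this.descending = descending;
    }

    public int Compare(string x, string y)
    {
        bool xEmpty = string.IsNullOrEmpty(x);
        bool yEmpty = string.IsNullOrEmpty(y);
        if (xEmpty && yEmpty) return 0;
        if (xEmpty) return 1;    // x sorts after y
        if (yEmpty) return -1;   // y sorts after x

        int result = uint.Parse(x).CompareTo(uint.Parse(y));
        return descending ? -result : result;
    }
}
```

With the list from the question, sorting with new EmptyLastComparer() should give 1, 2, 3 followed by the two empty entries, and new EmptyLastComparer(true) should give 3, 2, 1 followed by the two empty entries.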
I'd actually approach this a completely different way, remove the empty slots, sort the list, then add the empty ones to the end of the list
static void Main(string[] args)
{
List<string> ints = new List<string> { "3", "1", "", "5", "", "2" };
CustomIntSort(ints, (x, y) => int.Parse(x) - int.Parse(y)); // Ascending
ints.ForEach(i => Console.WriteLine("[{0}]", i));
CustomIntSort(ints, (x, y) => int.Parse(y) - int.Parse(x)); // Descending
ints.ForEach(i => Console.WriteLine("[{0}]", i));
}
private static void CustomIntSort(List<string> ints, Comparison<string> Comparer)
{
int emptySlots = CountAndRemove(ints);
ints.Sort(Comparer);
for (int i = 0; i < emptySlots; i++)
ints.Add("");
}
private static int CountAndRemove(List<string> ints)
{
int emptySlots = 0;
int i = 0;
while (i < ints.Count)
{
if (string.IsNullOrEmpty(ints[i]))
{
emptySlots++;
ints.RemoveAt(i);
}
else
i++;
}
return emptySlots;
}
This question caught my attention recently; this comparer will do it too:
class CustomComparer
: IComparer<string>
{
private bool isAscending;
public CustomComparer(bool isAscending = true)
{
this.isAscending = isAscending;
}
public int Compare(string x, string y)
{
long ix = CustomParser(x) * (isAscending ? 1 : -1);
long iy = CustomParser(y) * (isAscending ? 1 : -1);
return ix.CompareTo(iy) ;
}
private long CustomParser(string s)
{
if (string.IsNullOrEmpty(s))
return isAscending ? int.MaxValue : int.MinValue;
else
return int.Parse(s);
}
}
Your // ALWAYS SEND TO BOTTOM/END OF LIST. branch is being executed when either the x or y parameters are empty, i.e. a non-empty value will be sorted according to this rule if it is being compared to an empty value. You probably want something more like this:
if (!string.IsNullOrEmpty(itemOne.Text) && !string.IsNullOrEmpty(itemTwo.Text))
{
uint itemOneComparison = uint.Parse(itemOne.Text);
uint itemTwoComparison = uint.Parse(itemTwo.Text);
comparison = itemOneComparison.CompareTo(itemTwoComparison);
}
else if (!string.IsNullOrEmpty(itemOne.Text))
{
comparison = -1;
}
else
{
comparison = 1;
}
Always return 1 for your empty x values and -1 for your empty y values. This will mean that the comparer sees empty values as the greater value in all cases so they should end up at the end of the sorted list.
Of course, if both are empty, you should return 0 as they are equal.
else
{
//ALWAYS SEND TO BOTTOM/END OF LIST.
if (string.IsNullOrEmpty(itemOne.Text) && string.IsNullOrEmpty(itemTwo.Text))
{
return 0;
}
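else if (string.IsNullOrEmpty(itemOne.Text))
{
// empty x is treated as the greater value, so it sorts last
return 1;
}
else if (string.IsNullOrEmpty(itemTwo.Text))
{
// empty y is treated as the greater value, so x sorts first
return -1;
}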
}

Split a collection into `n` parts with LINQ? [duplicate]

Is there a nice way to split a collection into n parts with LINQ?
Not necessarily evenly of course.
That is, I want to divide the collection into sub-collections, which each contains a subset of the elements, where the last collection can be ragged.
A pure LINQ solution, and the simplest one, is shown below.
static class LinqExtensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
{
int i = 0;
var splits = from item in list
group item by i++ % parts into part
select part.AsEnumerable();
return splits;
}
}
EDIT: Okay, it looks like I misread the question. I read it as "pieces of length n" rather than "n pieces". Doh! Considering deleting answer...
(Original answer)
I don't believe there's a built-in way of partitioning, although I intend to write one in my set of additions to LINQ to Objects. Marc Gravell has an implementation here although I would probably modify it to return a read-only view:
public static IEnumerable<IEnumerable<T>> Partition<T>
(this IEnumerable<T> source, int size)
{
T[] array = null;
int count = 0;
foreach (T item in source)
{
if (array == null)
{
array = new T[size];
}
array[count] = item;
count++;
if (count == size)
{
yield return new ReadOnlyCollection<T>(array);
array = null;
count = 0;
}
}
if (array != null)
{
Array.Resize(ref array, count);
yield return new ReadOnlyCollection<T>(array);
}
}
static class LinqExtensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
{
return list.Select((item, index) => new {index, item})
.GroupBy(x => x.index % parts)
.Select(x => x.Select(y => y.item));
}
}
Ok, I'll throw my hat in the ring. The advantages of my algorithm:
No expensive multiplication, division, or modulus operators
All operations are O(1) (see note below)
Works for IEnumerable<> source (no Count property needed)
Simple
The code:
public static IEnumerable<IEnumerable<T>>
Section<T>(this IEnumerable<T> source, int length)
{
if (length <= 0)
throw new ArgumentOutOfRangeException("length");
var section = new List<T>(length);
foreach (var item in source)
{
section.Add(item);
if (section.Count == length)
{
yield return section.AsReadOnly();
section = new List<T>(length);
}
}
if (section.Count > 0)
yield return section.AsReadOnly();
}
As pointed out in the comments below, this approach doesn't actually address the original question which asked for a fixed number of sections of approximately equal length. That said, you can still use my approach to solve the original question by calling it this way:
myEnum.Section(myEnum.Count() / number_of_sections + 1)
When used in this manner, the approach is no longer O(1) as the Count() operation is O(N).
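As a rough check of the sizing arithmetic above, the sketch below counts how many chunks a given chunk length produces. The class and method names are hypothetical, not part of the answer. It suggests that `Count() / number_of_sections + 1` can yield fewer sections than requested when the count divides evenly, and that even a ceiling division does not always give exactly the requested number.

```csharp
using System;

public static class SectionSizing
{
    // Number of chunks produced when `count` items are cut into chunks of `length` items.
    public static int ChunkCount(int count, int length) => (count + length - 1) / length;

    // Chunk length used in the answer above: Count() / number_of_sections + 1.
    public static int LengthFromAnswer(int count, int sections) => count / sections + 1;

    // Alternative (an assumption, not from the answer): ceiling of count / sections.
    public static int CeilingLength(int count, int sections) => (count + sections - 1) / sections;

    public static void Run()
    {
        // 6 items, 3 sections requested: length 3 gives only 2 chunks, length 2 gives 3.
        Console.WriteLine(ChunkCount(6, LengthFromAnswer(6, 3)));
        Console.WriteLine(ChunkCount(6, CeilingLength(6, 3)));
        // 7 items, 5 sections requested: ceiling length 2 still gives only 4 chunks.
        Console.WriteLine(ChunkCount(7, CeilingLength(7, 5)));
    }
}
```

If an exact number of sections matters, a count-based split that distributes the remainder explicitly is probably the safer route.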
This is same as the accepted answer, but a much simpler representation:
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items,
int numOfParts)
{
int i = 0;
return items.GroupBy(x => i++ % numOfParts);
}
The above method splits an IEnumerable<T> into N number of chunks of equal sizes or close to equal sizes.
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> items,
int partitionSize)
{
int i = 0;
return items.GroupBy(x => i++ / partitionSize).ToArray();
}
The above method splits an IEnumerable<T> into chunks of desired fixed size with total number of chunks being unimportant - which is not what the question is about.
The problem with the Split method, besides being slower, is that it scrambles the output in the sense that the grouping will be done on the basis of i'th multiple of N for each position, or in other words you don't get the chunks in the original order.
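To make the ordering issue concrete, here is a small sketch of a modulo-based split (the class and method names are made up for illustration). Element k lands in group k % parts, so each group interleaves items from across the whole sequence instead of holding a contiguous run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitOrderDemo
{
    // Modulo-based split: the element at position k goes to group k % parts.
    public static List<List<T>> SplitByModulo<T>(IEnumerable<T> items, int parts)
    {
        return items.Select((item, index) => new { item, index })
                    .GroupBy(x => x.index % parts)
                    .Select(g => g.Select(x => x.item).ToList())
                    .ToList();
    }

    public static void Run()
    {
        foreach (var group in SplitByModulo(Enumerable.Range(0, 7), 3))
            Console.WriteLine(string.Join(",", group));
        // prints:
        // 0,3,6
        // 1,4
        // 2,5
    }
}
```

Concatenating the groups gives 0,3,6,1,4,2,5 rather than 0..6, which is the scrambling described above.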
Almost every answer here either doesn't preserve order, or is about partitioning and not splitting, or is plainly wrong. Try this which is faster, preserves order but a lil' more verbose:
public static IEnumerable<IEnumerable<T>> Split<T>(this ICollection<T> items,
int numberOfChunks)
{
if (numberOfChunks <= 0 || numberOfChunks > items.Count)
throw new ArgumentOutOfRangeException("numberOfChunks");
int sizePerPacket = items.Count / numberOfChunks;
int extra = items.Count % numberOfChunks;
for (int i = 0; i < numberOfChunks - extra; i++)
yield return items.Skip(i * sizePerPacket).Take(sizePerPacket);
int alreadyReturnedCount = (numberOfChunks - extra) * sizePerPacket;
int toReturnCount = sizePerPacket + 1; // each of the remaining 'extra' chunks takes one additional item
for (int i = 0; i < extra; i++)
yield return items.Skip(alreadyReturnedCount + i * toReturnCount).Take(toReturnCount);
}
The equivalent method for a Partition operation here
I have been using the Partition function I posted earlier quite often. The only bad thing about it was that it wasn't completely streaming. This is not a problem if you work with few elements in your sequence. I needed a new solution when I started working with 100.000+ elements in my sequence.
The following solution is a lot more complex (and more code!), but it is very efficient.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
namespace LuvDaSun.Linq
{
public static class EnumerableExtensions
{
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerable, int partitionSize)
{
/*
return enumerable
.Select((item, index) => new { Item = item, Index = index, })
.GroupBy(item => item.Index / partitionSize)
.Select(group => group.Select(item => item.Item) )
;
*/
return new PartitioningEnumerable<T>(enumerable, partitionSize);
}
}
class PartitioningEnumerable<T> : IEnumerable<IEnumerable<T>>
{
IEnumerable<T> _enumerable;
int _partitionSize;
public PartitioningEnumerable(IEnumerable<T> enumerable, int partitionSize)
{
_enumerable = enumerable;
_partitionSize = partitionSize;
}
public IEnumerator<IEnumerable<T>> GetEnumerator()
{
return new PartitioningEnumerator<T>(_enumerable.GetEnumerator(), _partitionSize);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class PartitioningEnumerator<T> : IEnumerator<IEnumerable<T>>
{
IEnumerator<T> _enumerator;
int _partitionSize;
public PartitioningEnumerator(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public void Dispose()
{
_enumerator.Dispose();
}
IEnumerable<T> _current;
public IEnumerable<T> Current
{
get { return _current; }
}
object IEnumerator.Current
{
get { return _current; }
}
public void Reset()
{
_current = null;
_enumerator.Reset();
}
public bool MoveNext()
{
bool result;
if (_enumerator.MoveNext())
{
_current = new PartitionEnumerable<T>(_enumerator, _partitionSize);
result = true;
}
else
{
_current = null;
result = false;
}
return result;
}
}
class PartitionEnumerable<T> : IEnumerable<T>
{
IEnumerator<T> _enumerator;
int _partitionSize;
public PartitionEnumerable(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public IEnumerator<T> GetEnumerator()
{
return new PartitionEnumerator<T>(_enumerator, _partitionSize);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class PartitionEnumerator<T> : IEnumerator<T>
{
IEnumerator<T> _enumerator;
int _partitionSize;
int _count;
public PartitionEnumerator(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public void Dispose()
{
}
public T Current
{
get { return _enumerator.Current; }
}
object IEnumerator.Current
{
get { return _enumerator.Current; }
}
public void Reset()
{
if (_count > 0) throw new InvalidOperationException();
}
public bool MoveNext()
{
bool result;
if (_count < _partitionSize)
{
if (_count > 0)
{
result = _enumerator.MoveNext();
}
else
{
result = true;
}
_count++;
}
else
{
result = false;
}
return result;
}
}
}
Enjoy!
Interesting thread. To get a streaming version of Split/Partition, one can use enumerators and yield sequences from the enumerator using extension methods. Converting imperative code to functional code using yield is a very powerful technique indeed.
First an enumerator extension that turns a count of elements into a lazy sequence:
public static IEnumerable<T> TakeFromCurrent<T>(this IEnumerator<T> enumerator, int count)
{
while (count > 0)
{
yield return enumerator.Current;
if (--count > 0 && !enumerator.MoveNext()) yield break;
}
}
And then an enumerable extension that partitions a sequence:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> seq, int partitionSize)
{
var enumerator = seq.GetEnumerator();
while (enumerator.MoveNext())
{
yield return enumerator.TakeFromCurrent(partitionSize);
}
}
The end result is a highly efficient, streaming and lazy implementation that relies on very simple code.
Enjoy!
I use this:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> instance, int partitionSize)
{
return instance
.Select((value, index) => new { Index = index, Value = value })
.GroupBy(i => i.Index / partitionSize)
.Select(i => i.Select(i2 => i2.Value));
}
As of .NET 6 you can use Enumerable.Chunk<TSource>(IEnumerable<TSource>, Int32).
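A minimal sketch of that call, assuming a .NET 6 or later target (the wrapper class and method names are hypothetical). Note that Chunk takes a chunk size, so like the Partition answers it produces pieces of a fixed length rather than a fixed number of pieces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkDemo
{
    // Thin wrapper around Enumerable.Chunk; each chunk is returned as an array.
    public static List<int[]> InChunksOf(IEnumerable<int> source, int size)
    {
        return source.Chunk(size).ToList();
    }

    public static void Run()
    {
        foreach (int[] chunk in InChunksOf(Enumerable.Range(1, 7), 3))
            Console.WriteLine(string.Join(",", chunk));
        // prints:
        // 1,2,3
        // 4,5,6
        // 7
    }
}
```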
This is memory efficient and defers execution as much as possible (per batch) and operates in linear time O(n)
public static IEnumerable<IEnumerable<T>> InBatchesOf<T>(this IEnumerable<T> items, int batchSize)
{
List<T> batch = new List<T>(batchSize);
foreach (var item in items)
{
batch.Add(item);
if (batch.Count >= batchSize)
{
yield return batch;
batch = new List<T>();
}
}
if (batch.Count != 0)
{
//can't be batch size or would've yielded above
batch.TrimExcess();
yield return batch;
}
}
There are lots of great answers for this question (and its cousins). I needed this myself and had created a solution that is designed to be efficient and error tolerant in a scenario where the source collection can be treated as a list. It does not use any lazy iteration so it may not be suitable for collections of unknown size that may apply memory pressure.
static public IList<T[]> GetChunks<T>(this IEnumerable<T> source, int batchsize)
{
IList<T[]> result = null;
if (source != null && batchsize > 0)
{
var list = source as List<T> ?? source.ToList();
if (list.Count > 0)
{
result = new List<T[]>();
for (var index = 0; index < list.Count; index += batchsize)
{
var rangesize = Math.Min(batchsize, list.Count - index);
result.Add(list.GetRange(index, rangesize).ToArray());
}
}
}
return result ?? Enumerable.Empty<T[]>().ToList();
}
static public void TestGetChunks()
{
var ids = Enumerable.Range(1, 163).Select(i => i.ToString());
foreach (var chunk in ids.GetChunks(20))
{
Console.WriteLine("[{0}]", String.Join(",", chunk));
}
}
I have seen a few answers across this family of questions that use GetRange and Math.Min. But I believe that overall this is a more complete solution in terms of error checking and efficiency.
protected List<List<int>> MySplit(int MaxNumber, int Divider)
{
List<List<int>> lst = new List<List<int>>();
int ListCount = 0;
int d = MaxNumber / Divider;
lst.Add(new List<int>());
for (int i = 1; i <= MaxNumber; i++)
{
lst[ListCount].Add(i);
if (i != 0 && i % d == 0)
{
ListCount++;
d += MaxNumber / Divider;
lst.Add(new List<int>());
}
}
return lst;
}
Great answers. For my scenario I tested the accepted answer, and it seems it does not keep order. There is also a great answer by Nawfal that does keep order.
But in my scenario I wanted to spread the remainder in a normalized way;
all the answers I saw put the remainder either at the beginning or at the end.
My answer spreads the remainder more evenly across the lists.
static class Program
{
static void Main(string[] args)
{
var input = new List<String>();
for (int k = 0; k < 18; ++k)
{
input.Add(k.ToString());
}
var result = splitListIntoSmallerLists(input, 15);
int i = 0;
foreach(var resul in result){
Console.WriteLine("------Segment:" + i.ToString() + "--------");
foreach(var res in resul){
Console.WriteLine(res);
}
i++;
}
Console.ReadLine();
}
private static List<List<T>> splitListIntoSmallerLists<T>(List<T> i_bigList,int i_numberOfSmallerLists)
{
if (i_numberOfSmallerLists <= 0)
throw new ArgumentOutOfRangeException("Illegal value of numberOfSmallLists");
int normalizedSpreadRemainderCounter = 0;
int normalizedSpreadNumber = 0;
// e.g. 7 / 5 > 0 ==> output size is 5; 2 / 5 == 0 ==> output size is 2
int minimumNumberOfPartsInEachSmallerList = i_bigList.Count / i_numberOfSmallerLists;
int remainder = i_bigList.Count % i_numberOfSmallerLists;
int outputSize = minimumNumberOfPartsInEachSmallerList > 0 ? i_numberOfSmallerLists : remainder;
//In case remainder > 0 we want to spread the remainder equally between the others
if (remainder > 0)
{
if (minimumNumberOfPartsInEachSmallerList > 0)
{
normalizedSpreadNumber = (int)Math.Floor((double)i_numberOfSmallerLists / remainder);
}
else
{
normalizedSpreadNumber = 1;
}
}
List<List<T>> retVal = new List<List<T>>(outputSize);
int inputIndex = 0;
for (int i = 0; i < outputSize; ++i)
{
retVal.Add(new List<T>());
if (minimumNumberOfPartsInEachSmallerList > 0)
{
retVal[i].AddRange(i_bigList.GetRange(inputIndex, minimumNumberOfPartsInEachSmallerList));
inputIndex += minimumNumberOfPartsInEachSmallerList;
}
//If we have remainder take one from it, if our counter is equal to normalizedSpreadNumber.
if (remainder > 0)
{
if (normalizedSpreadRemainderCounter == normalizedSpreadNumber-1)
{
retVal[i].Add(i_bigList[inputIndex]);
remainder--;
inputIndex++;
normalizedSpreadRemainderCounter=0;
}
else
{
normalizedSpreadRemainderCounter++;
}
}
}
return retVal;
}
}
If order in these parts is not very important you can try this:
int[] array = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int n = 3;
var result =
array.Select((value, index) => new { Value = value, Index = index }).GroupBy(i => i.Index % n, i => i.Value);
// or
var result2 =
from i in array.Select((value, index) => new { Value = value, Index = index })
group i.Value by i.Index % n into g
select g;
However, these can't be cast to IEnumerable<IEnumerable<int>> for some reason...
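For what it's worth, on C# 4 / .NET 4 and later the conversion should work implicitly, because IGrouping<TKey, TElement> implements IEnumerable<TElement> and IEnumerable<T> is covariant. The sketch below assumes such a compiler; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GroupingCovariance
{
    public static IEnumerable<IEnumerable<int>> SplitByIndex(int[] array, int n)
    {
        IEnumerable<IGrouping<int, int>> groups = array
            .Select((value, index) => new { Value = value, Index = index })
            .GroupBy(i => i.Index % n, i => i.Value);

        // Implicit reference conversion via generic covariance; no cast is needed.
        return groups;
    }
}
```

On an older compiler without variance support, appending `.Select(g => g.AsEnumerable())` would likely achieve the same result.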
This is my code, nice and short.
<Extension()> Public Function Chunk(Of T)(ByVal this As IList(Of T), ByVal size As Integer) As List(Of List(Of T))
Dim result As New List(Of List(Of T))
For i = 0 To CInt(Math.Ceiling(this.Count / size)) - 1
result.Add(New List(Of T)(this.GetRange(i * size, Math.Min(size, this.Count - (i * size)))))
Next
Return result
End Function
This is my way, listing items and breaking row by columns
int repat_count=4;
arrItems.ForEach((x, i) => {
if (i % repat_count == 0)
row = tbo.NewElement(el_tr, cls_min_height);
var td = row.NewElement(el_td);
td.innerHTML = x.Name;
});
I was looking for a split like the one for strings, where the whole List is split according to some rule, not only the first part. This is my solution:
List<int> sequence = new List<int>();
for (int i = 0; i < 2000; i++)
{
sequence.Add(i);
}
int splitIndex = 900;
List<List<int>> splitted = new List<List<int>>();
while (sequence.Count != 0)
{
splitted.Add(sequence.Take(splitIndex).ToList() );
sequence.RemoveRange(0, Math.Min(splitIndex, sequence.Count));
}
Here is a little tweak for the number of items instead of the number of parts:
public static class MiscExctensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int nbItems)
{
return (
list
.Select((o, n) => new { o, n })
.GroupBy(g => (int)(g.n / nbItems))
.Select(g => g.Select(x => x.o))
);
}
}
The code below returns the given number of chunks while keeping the data in its original order:
static IEnumerable<IEnumerable<T>> SplitSequentially<T>(int chunkParts, List<T> inputList)
{
List<int> Splits = split(inputList.Count, chunkParts);
var skipNumber = 0;
List<List<T>> list = new List<List<T>>();
foreach (var count in Splits)
{
var internalList = inputList.Skip(skipNumber).Take(count).ToList();
list.Add(internalList);
skipNumber += count;
}
return list;
}
static List<int> split(int x, int n)
{
List<int> list = new List<int>();
if (x % n == 0)
{
for (int i = 0; i < n; i++)
list.Add(x / n);
}
else
{
// upto n-(x % n) the values
// will be x / n
// after that the values
// will be x / n + 1
int zp = n - (x % n);
int pp = x / n;
for (int i = 0; i < n; i++)
{
if (i >= zp)
list.Add((pp + 1));
else
list.Add(pp);
}
}
return list;
}
int[] items = new int[] { 0,1,2,3,4,5,6,7,8,9, 10 };
int itemIndex = 0;
int groupSize = 2;
int nextGroup = groupSize;
var seqItems = from aItem in items
group aItem by
(itemIndex++ < nextGroup)
?
nextGroup / groupSize
:
(nextGroup += groupSize) / groupSize
into itemGroup
select itemGroup.AsEnumerable();
Just came across this thread, and most of the solutions here involve adding items to collections, effectively materialising each page before returning it. This is bad for two reasons - firstly if your pages are large there's a memory overhead to filling the page, secondly there are iterators which invalidate previous records when you advance to the next one (for example if you wrap a DataReader within an enumerator method).
This solution uses two nested enumerator methods to avoid any need to cache items into temporary collections. Since the outer and inner iterators are traversing the same enumerable, they necessarily share the same enumerator, so it's important not to advance the outer one until you're done with processing the current page. That said, if you decide not to iterate all the way through the current page, when you move to the next page this solution will iterate forward to the page boundary automatically.
using System.Collections.Generic;
public static class EnumerableExtensions
{
/// <summary>
/// Partitions an enumerable into individual pages of a specified size, still scanning the source enumerable just once
/// </summary>
/// <typeparam name="T">The element type</typeparam>
/// <param name="enumerable">The source enumerable</param>
/// <param name="pageSize">The number of elements to return in each page</param>
/// <returns></returns>
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerable, int pageSize)
{
var enumerator = enumerable.GetEnumerator();
while (enumerator.MoveNext())
{
var indexWithinPage = new IntByRef { Value = 0 };
yield return SubPartition(enumerator, pageSize, indexWithinPage);
// Continue iterating through any remaining items in the page, to align with the start of the next page
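// After this loop the enumerator sits on the last item of the current page
for (; indexWithinPage.Value + 1 < pageSize; indexWithinPage.Value++)
{
if (!enumerator.MoveNext())
{
yield break;
}
}
}
}
private static IEnumerable<T> SubPartition<T>(IEnumerator<T> enumerator, int pageSize, IntByRef index)
{
// index.Value is the position of the shared enumerator within the current page
while (true)
{
yield return enumerator.Current;
// Stop at the page boundary without advancing, so the outer MoveNext lands on the next page's first item
if (index.Value + 1 >= pageSize || !enumerator.MoveNext())
{
yield break;
}
index.Value++;
}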
}
private class IntByRef
{
public int Value { get; set; }
}
}
