Pagination of multiple sorted lists - c#

I have an unknown number of ordered lists that I need to do paging on.
For example, the pages for these 3 lists should look like this when the page size is 6.
List1: 01,02,03,04,05,06,07,08,09,10
List2: 11,12,13,14,15
List3: 16,17,18,19,20,21,22,23,24,25,26,27,28
Result Pages:
Page1: 01,11,16,02,12,17
Page2: 03,13,18,04,14,19
Page3: 05,15,20,06,21,07
Page4: 22,08,23,09,24,10
page5: 25,26,27,28
What will be the most efficient way to get which items should I take from each list (start index and number of items) when given the page number?
Take in consideration that each list can have a few hundred thousand of items so iterating through all of them will not be efficient.

I can't say if it's the most efficient way or not, but here is an algorithm with O(M*Log2(M)) time complexity where M is the number of the lists. It works as follows. The input set is grouped and sorted in ascending order by the item Count, which is iterated until the effective start index fits into current range, skipping the previous ranges. This is possible because at every step we know that it is the minimum count, hence all the remaining lists have items in that range. Once we are done with that, we emit the page items from the remaining lists.
Here is the function:
static IEnumerable<T> GetPageItems<T>(List<List<T>> itemLists, int pageSize, int pageIndex)
{
int start = pageIndex * pageSize;
var counts = new int[itemLists.Count];
for (int i = 0; i < counts.Length; i++)
counts[i] = itemLists[i].Count;
Array.Sort(counts);
int listCount = counts.Length;
int itemIndex = 0;
for (int i = 0; i < counts.Length; i++)
{
int itemCount = counts[i];
if (itemIndex < itemCount)
{
int rangeLength = listCount * (itemCount - itemIndex);
if (start < rangeLength) break;
start -= rangeLength;
itemIndex = itemCount;
}
listCount--;
}
if (listCount > 0)
{
var listQueue = new List<T>[listCount];
listCount = 0;
foreach (var list in itemLists)
if (itemIndex < list.Count) listQueue[listCount++] = list;
itemIndex += start / listCount;
int listIndex = 0;
int skipCount = start % listCount;
int nextCount = 0;
int yieldCount = 0;
while (true)
{
var list = listQueue[listIndex];
if (skipCount > 0)
skipCount--;
else
{
yield return list[itemIndex];
if (++yieldCount >= pageSize) break;
}
if (itemIndex + 1 < list.Count)
{
if (nextCount != listIndex)
listQueue[nextCount] = list;
nextCount++;
}
if (++listIndex < listCount) continue;
if (nextCount == 0) break;
itemIndex++;
listIndex = 0;
listCount = nextCount;
nextCount = 0;
}
}
}
and test:
static void Main(string[] args)
{
var data = new List<List<int>>
{
new List<int> { 01, 02, 03, 04, 05, 06, 07, 08, 09, 10 },
new List<int> { 11, 12, 13, 14, 15 },
new List<int> { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 },
};
int totalCount = data.Sum(list => list.Count);
int pageSize = 6;
int pageCount = 1 + (totalCount - 1) / pageSize;
for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
Console.WriteLine("Page #{0}: {1}", pageIndex + 1, string.Join(", ", GetPageItems(data, pageSize, pageIndex)));
Console.ReadLine();
}

I think it could be done nicely in two steps:
Flatten your lists to a single list (ordered in the way you describe).
Take items from that list for the desired page.
To accomplish step 1, I'd do something like what was suggested here: Merging multiple lists
So, (assuming your page items are ints, as in your example), here's a nice method that finds exactly the ones you want:
static IEnumerable<int> GetPageItems(IEnumerable<List<int>> itemLists, int pageSize, int page)
{
var mergedOrderedItems = itemLists.SelectMany(x => x.Select((s, index) => new { s, index }))
.GroupBy(x => x.index)
.SelectMany(x => x.Select(y => y.s));
// assuming that the first page is page 1, not page 0:
var startingIndex = pageSize * (page - 1);
var pageItems = mergedOrderedItems.Skip(startingIndex)
.Take(pageSize);
return pageItems;
}
Note - you don't have to worry about passing in a page# that exceeds the total number of pages that could exist given the total item count... Thanks to the magic of Linq, this method will simply return an empty IEnumerable. Likewise, if Take(pageSize) results in less than "pageSize" items, it simply returns the items that it did find.

I'll submit another implementation, based on Bear.S' feedback on my first answer. This one's pretty low-level and very performant. There are two major parts to it:
Figure out which item should appear first on the page (specifically what is the index of the list that contains it, and what is the item's index within that list).
Take items from all lists, in the correct order, as needed (until we have all that we need or run out of items).
This implementation doesn't iterate the individual lists during step 1. It does use List.Count property, but that is an O(1) operation.
Since we're going for performance here, the code isn't necessarily as self-descriptive as I'd like, so I put in some comments to help explain the logic:
static IEnumerable<T> GetPageItems<T>(List<List<T>> itemLists, int pageSize, int page)
{
if (page < 1)
{
return new List<T>();
}
// a simple copy so that we don't change the original (the individual Lists inside are untouched):
var lists = itemLists.ToList();
// Let's find the starting indexes for the first item on this page:
var currItemIndex = 0;
var currListIndex = 0;
var itemsToSkipCount = pageSize * (page - 1); // <-- assuming that the first page is page 1, not page 0
// I'll just break out of this loop manually, because I think this configuration actually makes
// the logic below a little easier to understand. Feel free to change it however you see fit :)
while (true)
{
var listsCount = lists.Count;
if (listsCount == 0)
{
return new List<T>();
}
// Let's consider a horizontal section of items taken evenly from all lists (based on the length of
// the shortest list). We don't need to iterate any items in the lists; Rather, we'll just count
// the total number of items we could get from this horizontal portion, and set our indexes accordingly...
var shortestListCount = lists.Min(x => x.Count);
var itemsWeAreConsideringCount = listsCount * (shortestListCount - currItemIndex);
// Does this horizontal section contain at least as many items as we must skip?
if (itemsWeAreConsideringCount >= itemsToSkipCount)
{ // Yes: So mathematically find the indexes of the first page item, and we're done.
currItemIndex += itemsToSkipCount / listsCount;
currListIndex = itemsToSkipCount % listsCount;
break;
}
else
{ // No: So we need to keep going. Let's increase currItemIndex to the end of this horizontal
// section, remove the shortest list(s), and the loop will continue with the remaining lists:
currItemIndex = shortestListCount;
lists.RemoveAll(x => x.Count == shortestListCount);
itemsToSkipCount -= itemsWeAreConsideringCount;
}
}
// Ok, we've got our starting indexes, and the remaining lists that still have items in the index range.
// Let's get our items from those lists:
var pageItems = new List<T>();
var largestListCount = lists.Max(x => x.Count);
// Loop until we have enough items to fill the page, or we run out of items:
while (pageItems.Count < pageSize && currItemIndex < largestListCount)
{
// Taking from one list at a time:
var currList = lists[currListIndex];
// If the list has an element at this index, get it:
if (currItemIndex < currList.Count)
{
pageItems.Add(currList[currItemIndex]);
}
// else... this list has no more elements.
// We could throw away this list, since it's pointless to iterate over it any more, but that might
// change the indices of other lists... for simplicity, I'm just gonna let it be... since the above
// logic simply ignores an empty list.
currListIndex++;
if (currListIndex == lists.Count)
{
currListIndex = 0;
currItemIndex++;
}
}
return pageItems;
}
Here's some test code, using three lists. I can grab 6 items off of page 1,000,000 in just a few milliseconds :)
var list1 = Enumerable.Range(0, 10000000).ToList();
var list2 = Enumerable.Range(10000000, 10000000).ToList();
var list3 = Enumerable.Range(20000000, 10000000).ToList();
var lists = new List<List<int>> { list1, list2, list3 };
var timer = new Stopwatch();
timer.Start();
var items = GetPageItems(lists, 6, 1000000).ToList();
var count = items.Count();
timer.Stop();

Related

LINQ get x amount of elements from a list

I have a query which I get as:
var query = Data.Items
.Where(x => criteria.IsMatch(x))
.ToList<Item>();
This works fine.
However now I want to break up this list into x number of lists, for example 3. Each list will therefore contain 1/3 the amount of elements from query.
Can it be done using LINQ?
You can use PLINQ partitioners to break the results into separate enumerables.
var partitioner = Partitioner.Create<Item>(query);
var partitions = partitioner.GetPartitions(3);
You'll need to reference the System.Collections.Concurrent namespace. partitions will be a list of IEnumerable<Item> where each enumerable returns a portion of the query.
I think something like this could work, splitting the list into IGroupings.
const int numberOfGroups = 3;
var groups = query
.Select((item, i) => new { item, i })
.GroupBy(e => e.i % numberOfGroups);
You can use Skip and Take in a simple for to accomplish what you want
var groupSize = (int)Math.Ceiling(query.Count() / 3d);
var result = new List<List<Item>>();
for (var j = 0; j < 3; j++)
result.Add(query.Skip(j * groupSize).Take(groupSize).ToList());
If the order of the elements doesn't matter using an IGrouping as suggested by Daniel Imms is probably the most elegant way (add .Select(gr => gr.Select(e => e.item)) to get an IEnumerable<IEnumerable<T>>).
If however you want to preserve the order you need to know the total number of elements. Otherwise you wouldn't know when to start the next group. You can do this with LINQ but it requires two enumerations: one for counting and another for returning the data (as suggested by Esteban Elverdin).
If enumerating the query is expensive you can avoid the second enumeration by turning the query into a list and then use the GetRange method:
public static IEnumerable<List<T>> SplitList<T>(List<T> list, int numberOfRanges)
{
int sizeOfRanges = list.Count / numberOfRanges;
int remainder = list.Count % numberOfRanges;
int startIndex = 0;
for (int i = 0; i < numberOfRanges; i++)
{
int size = sizeOfRanges + (remainder > 0 ? 1 : 0);
yield return list.GetRange(startIndex, size);
if (remainder > 0)
{
remainder--;
}
startIndex += size;
}
}
static void Main()
{
List<int> list = Enumerable.Range(0, 10).ToList();
IEnumerable<List<int>> result = SplitList(list, 3);
foreach (List<int> values in result)
{
string s = string.Join(", ", values);
Console.WriteLine("{{ {0} }}", s);
}
}
The output is:
{ 0, 1, 2, 3 }
{ 4, 5, 6 }
{ 7, 8, 9 }
You can create an extension method:
public static IList<List<T>> GetChunks<T>(this IList<T> items, int numOfChunks)
{
if (items.Count < numOfChunks)
throw new ArgumentException("The number of elements is lower than the number of chunks");
int div = items.Count / numOfChunks;
int rem = items.Count % numOfChunks;
var listOfLists = new List<T>[numOfChunks];
for (int i = 0; i < numOfChunks; i++)
listOfLists[i] = new List<T>();
int currentGrp = 0;
int currRemainder = rem;
foreach (var el in items)
{
int currentElementsInGrp = listOfLists[currentGrp].Count;
if (currentElementsInGrp == div && currRemainder > 0)
{
currRemainder--;
}
else if (currentElementsInGrp >= div)
{
currentGrp++;
}
listOfLists[currentGrp].Add(el);
}
return listOfLists;
}
then use it like this :
var chunks = query.GetChunks(3);
N.B.
in case of number of elements not divisible by the number of groups, the first groups will be bigger. e.g. [0,1,2,3,4] --> [0,1] - [2,3] - [4]

inserting element into a list, if condition is met with C#

How to insert some number into the middle of the list, if there is no such number present?
In the example below I'm trying to insert number 4
List<int> list1 = new List<int>(){ 0, 1, 2, 3, 5, 6 };
int must_enter = 4;
if (!list1.Contains(must_enter))
{
list1.Add(must_enter);
}
As the result number will be entered at the end of the list, but I want it right after 3 (before 5).
please note that due to project's specifics I can't use sorted list, but all numbers in the list are guaranteed to be in ascending order (0,2,6,9,10,...)
EDIT: I knew about an error and that's what I did:
List<int> list0 = new List<int>() { 1, 2, 3, 5, 6 };
int must_enter = 7;
if (!list0.Contains(must_enter))
{
if (must_enter < list0.Max())
{
int result = list0.FindIndex(item => item > must_enter || must_enter > list0.Max());
list0.Insert(result, must_enter);
}
else
{
list0.Add(must_enter);
}
}
edit2: anyway I've switched to BinarySearch method due to several factors. Everyone thanks for your help!
You could do something like this:
int index = list1.BinarySearch(must_enter);
if (index < 0)
list1.Insert(~index, must_enter);
This way you will keep the list sorted with the best possible performance.
You can do:
list1.Add(must_enter);
And then order the list:
list1 = list1.OrderBy(n => n).ToList();
The result will be:
0, 1, 2, 3, 4, 5, 6
EDIT:
Or use an extesion method:
static class Utility
{
public static void InsertElement(this List<int> list, int n)
{
if(!list.Contains(n))
{
for(int i = 0; i < list.Count; i++)
{
if(list[i] > n)
{
list.Insert(i-1, n);
break;
}
if(i == list.Count - 1)
list.Add(n);
}
}
}
}
And then:
list1.InsertElement(must_enter);
You are looking for
list1.Insert(index, must_enter);
To insert an element at a specific index rather than at the end of the list.
You'll have to find the index to insert at first which is easily done with a binary search. Start with the value in the middle of the list and compare it to your number to insert. If it's greater, search the lower half of the list, if it's more, search the upper half of the list. Repeat the process, dividing the list in half each time until you find the spot where the item before is less than the one you are inserting and the item after is more than the one you are inserting. (edit: of course, if you list is always very small, it's probably less hassle just to iterate through the list from the beginning to find the right spot!)
List<int> list1 = new List<int>() { 0, 1, 2, 3, 5, 6 };
int must_enter = 4;
for (int i = 0; i < list1.Count; i++)
{
if (must_enter >= list1[i])
{
list1.Insert(i + 1, must_enter);
}
}
Edit: I like sarwar026, implementation better.
list1.Insert(4, 4)
List<T>.Insert Method - Inserts an element into the List at the specified index.
Quick Note-
the Insert instance method on the List type does not have good performance in many cases. because for Insert, list has to adjust the following elements.
here is the original post from where i got this answer try it out may help you : Finding best position for element in list
List<int> list = new List<int>{0,2,6,9,10};
for (int i = 0; i < list1.Count; i++)
{
int index = list.BinarySearch(i);
if( i < 0)
{
int insertIndex = ~index;
list.Insert(insertIndex, i);
}
}
just for one missing element as op needs
int index = list.BinarySearch(4);
if( index < 0)
{
int insertIndex = ~index;
list.Insert(insertIndex, 4);
}
or
List<int> list1 = new List<int>() { 0,2,6,9,10 };
int must_enter = 4;
for (int i = 0; i < list1.Count; i++)
{
if (!list1.Contains(i))
{
list1.Insert(i , i);
}
}
just for one element as op needs
if (!list1.Contains(4))
{
list1.Insert(4 , 4);
}
List<int> list1 = new List<int>(){ 0, 1, 2, 3, 5, 6 };
int must_enter = 4;
if (!list1.Contains(must_enter))
{
int result = list.FindIndex(item => item > must_enter);
if(result!=-1)
list1.Insert(result, must_enter);
else // must_enter is not found
{
if(must_enter > list.Max()) // must_enter > max value of list
list1.Add(must_enter);
else if(must_enter < list.Min()) // must_enter < min value of list
list1.Insert(0, must_enter);
}
}
First, find the index of the number which is greater than must_enter(4) and then insert the must_enter to that position
if (!list1.Contains(must_enter))
{
SortedSet<int> sorted = new SortedSet<int>( list1 );
sorted.Add( must_enter );
list1 = sorted.ToList();
}

Find the smallest window of input array that contains all the elements of query array

Problem: Given an input array of integers of size n, and a query array of integers of size k, find the smallest window of input array that contains all the elements of query array and also in the same order.
I have tried below approach.
int[] inputArray = new int[] { 2, 5, 2, 8, 0, 1, 4, 7 };
int[] queryArray = new int[] { 2, 1, 7 };
Will find the position of all query array element in inputArray.
public static void SmallestWindow(int[] inputArray, int[] queryArray)
{
Dictionary<int, HashSet<int>> dict = new Dictionary<int, HashSet<int>>();
int index = 0;
foreach (int i in queryArray)
{
HashSet<int> hash = new HashSet<int>();
foreach (int j in inputArray)
{
index++;
if (i == j)
hash.Add(index);
}
dict.Add(i, hash);
index = 0;
}
// Need to perform action in above dictionary.??
}
I got following dictionary
int 2--> position {1, 3}
int 1 --> position {6}
int 7 --> position {8}
Now I want to perform following step to findout minimum window
Compare int 2 position to int 1 position. As (6-3) < (6-1)..So I will store 3, 6 in a hashmap.
Will compare the position of int 1 and int 7 same like above.
I cannot understand how I will compare two consecutive value of a dictionary. Please help.
The algorithm:
For each element in the query array, store in a map M (V → (I,P)), V is the element, I is an index into the input array, P is the position in the query array. (The index into the input array for some P is the largest such that query[0..P] is a subsequence of input[I..curr])
Iterate through the array.
If the value is the first term in the query array: Store the current index as I.
Else: Store the value of the index of the previous element in the query array, e.g. M[currVal].I = M[query[M[currVal].P-1]].I.
If the value is the last term: Check if [I..curr] is a new best.
Complexity
The complexity of this is O(N), where N is the size of the input array.
N.B.
This code expects that no elements are repeated in the query array. To cater for this, we can use a map M (V → listOf((I,P))). This is O(NhC(Q)), where hC(Q) is the count of the mode for the query array..
Even better would be to use M (V → listOf((linkedList(I), P))). Where repeated elements occur consecutively in the query array, we use a linked list. Updating those values then becomes O(1). The complexity is then O(NhC(D(Q))), where D(Q) is Q with consecutive terms merged.
Implementation
Sample java implementation is available here. This does not work for repeated elements in the query array, nor do error checking, etc.
I don't see how using HashSet and Dictionary will help you in this. Were I faced with this problem, I'd go about it quite differently.
One way to do it (not the most efficient way) is shown below. This code makes the assumption that queryArray contains at least two items.
int FindInArray(int[] a, int start, int value)
{
for (int i = start; i < a.Length; ++i)
{
if (a[i] == value)
return i;
}
return -1;
}
struct Pair
{
int first;
int last;
}
List<Pair> foundPairs = new List<Pair>();
int startPos = 0;
bool found = true;
while (found)
{
found = false;
// find next occurrence of queryArray[0] in inputArray
startPos = FindInArray(inputArray, startPos, queryArray[0]);
if (startPos == -1)
{
// no more occurrences of the first item
break;
}
Pair p = new Pair();
p.first = startPos;
++startPos;
int nextPos = startPos;
// now find occurrences of remaining items
for (int i = 1; i < queryArray.Length; ++i)
{
nextPos = FindInArray(inputArray, nextPos, queryArray[i]);
if (nextPos == -1)
{
break; // didn't find it
}
else
{
p.last = nextPos++;
found = (i == queryArray.Length-1);
}
}
if (found)
{
foundPairs.Add(p);
}
}
// At this point, the foundPairs list contains the (start, end) of all
// sublists that contain the items in order.
// You can then iterate through that list, subtract (last-first), and take
// the item that has the smallest value. That will be the shortest sublist
// that matches the criteria.
With some work, this could be made more efficient. For example, if 'queryArray' contains [1, 2, 3] and inputArray contains [1, 7, 4, 9, 1, 3, 6, 4, 1, 8, 2, 3], the above code will find three matches (starting at positions 0, 4, and 8). Slightly smarter code could determine that when the 1 at position 4 is found, since no 2 was found prior to it, that any sequence starting at the first position would be longer than the sequence starting at position 4, and therefore short-circuit the first sequence and start over at the new position. That complicates the code a bit, though.
You want not a HashSet but a (sorted) tree or array as the value in the dictionary; the dictionary contains mappings from values you find in the input array to the (sorted) list of indices where that value appears.
Then you do the following
Look up the first entry in the query. Pick the lowest index where it appears.
Look up the second entry; pick the lowest entry greater than the index of the first.
Look up the third; pick the lowest greater than the second. (Etc.)
When you reach the last entry in the query, (1 + last index - first index) is the size of the smallest match.
Now pick the second index of the first query, repeat, etc.
Pick the smallest match found from any of the starting indices.
(Note that the "lowest entry greater" is an operation supplied with sorted trees, or can be found via binary search on a sorted array.)
The complexity of this is approximately O(M*n*log n) where M is the length of the query and n is the average number of indices at which a given value appears in the input array. You can modify the strategy by picking that query array value that appears least often for the starting point and going up and down from there; if there are k of those entries (k <= n) then the complexity is O(M*k*log n).
After you got all the positions(indexes) in the inputArray:
2 --> position {0,2} // note: I change them to 0-based array
1 --> position {5,6} // I suppose it's {5,6} to make it more complex, in your code it's only {5}
7 --> position {7}
I use a recursion to get all possible paths. [0->5->7] [0->6->7] [2->5->7] [2->6->7]. The total is 2*2*1=4 possible paths. Obviously the one who has Min(Last-First) is the shortest path(smallest window), those numbers in the middle of the path don't matter. Here comes the code.
struct Pair
{
public int Number; // the number in queryArray
public int[] Indexes; // the positions of the number
}
static List<int[]> results = new List<int[]>(); //store all possible paths
static Stack<int> currResult = new Stack<int>(); // the container of current path
static int[] inputArray, queryArray;
static Pair[] pairs;
After the data structures, here is the Main.
inputArray = new int[] { 2, 7, 1, 5, 2, 8, 0, 1, 4, 7 }; //my test case
queryArray = new int[] { 2, 1, 7 };
pairs = (from n in queryArray
select new Pair { Number = n, Indexes = inputArray.FindAllIndexes(i => i == n) }).ToArray();
Go(0);
FindAllIndexes is an extension method to help find all the indexes.
public static int[] FindAllIndexes<T>(this IEnumerable<T> source, Func<T,bool> predicate)
{
//do necessary check here, then
Queue<int> indexes = new Queue<int>();
for (int i = 0;i<source.Count();i++)
if (predicate(source.ElementAt(i))) indexes.Enqueue(i);
return indexes.ToArray();
}
The recursion method:
static void Go(int depth)
{
if (depth == pairs.Length)
{
results.Add(currResult.Reverse().ToArray());
}
else
{
var indexes = pairs[depth].Indexes;
for (int i = 0; i < indexes.Length; i++)
{
if (depth == 0 || indexes[i] > currResult.Last())
{
currResult.Push(indexes[i]);
Go(depth + 1);
currResult.Pop();
}
}
}
}
At last, a loop of results can find the Min(Last-First) result(shortest window).
Algorithm:
get all indexes into the inputArray
of all queryArray values
order them ascending by index
using each index (x) as a starting
point find the first higher index
(y) such that the segment
inputArray[x-y] contains all
queryArray values
keep only those segments that have the queryArray items in order
order the segments by their lengths,
ascending
c# implementation:
First get all indexes into the inputArray of all queryArray values and order them ascending by index.
public static int[] SmallestWindow(int[] inputArray, int[] queryArray)
{
var indexed = queryArray
.SelectMany(x => inputArray
.Select((y, i) => new
{
Value = y,
Index = i
})
.Where(y => y.Value == x))
.OrderBy(x => x.Index)
.ToList();
Next, using each index (x) as a starting point find the first higher index (y) such that the segment inputArray[x-y] contains all queryArray values.
var segments = indexed
.Select(x =>
{
var unique = new HashSet<int>();
return new
{
Item = x,
Followers = indexed
.Where(y => y.Index >= x.Index)
.TakeWhile(y => unique.Count != queryArray.Length)
.Select(y =>
{
unique.Add(y.Value);
return y;
})
.ToList(),
IsComplete = unique.Count == queryArray.Length
};
})
.Where(x => x.IsComplete);
Now keep only those segments that have the queryArray items in order.
var queryIndexed = segments
.Select(x => x.Followers.Select(y => new
{
QIndex = Array.IndexOf(queryArray, y.Value),
y.Index,
y.Value
}).ToArray());
var queryOrdered = queryIndexed
.Where(item =>
{
var qindex = item.Select(x => x.QIndex).ToList();
bool changed;
do
{
changed = false;
for (int i = 1; i < qindex.Count; i++)
{
if (qindex[i] <= qindex[i - 1])
{
qindex.RemoveAt(i);
changed = true;
}
}
} while (changed);
return qindex.Count == queryArray.Length;
});
Finally, order the segments by their lengths, ascending. The first segment in the result is the smallest window into inputArray that contains all queryArray values in the order of queryArray.
var result = queryOrdered
.Select(x => new[]
{
x.First().Index,
x.Last().Index
})
.OrderBy(x => x[1] - x[0]);
var best = result.FirstOrDefault();
return best;
}
test it with
public void Test()
{
var inputArray = new[] { 2, 1, 5, 6, 8, 1, 8, 6, 2, 9, 2, 9, 1, 2 };
var queryArray = new[] { 6, 1, 2 };
var result = SmallestWindow(inputArray, queryArray);
if (result == null)
{
Console.WriteLine("no matching window");
}
else
{
Console.WriteLine("Smallest window is indexes " + result[0] + " to " + result[1]);
}
}
output:
Smallest window is indexes 3 to 8
Thank you everyone for your inputs. I have changed my code a bit and find it working. Though it might not be very efficient but I'm happy to solve using my head :). Please give your feedback
Here is my Pair class with having number and position as variable
public class Pair
{
public int Number;
public List<int> Position;
}
Here is a method which will return the list of all Pairs.
public static Pair[] GetIndex(int[] inputArray, int[] query)
{
Pair[] pairList = new Pair[query.Length];
int pairIndex = 0;
foreach (int i in query)
{
Pair pair = new Pair();
int index = 0;
pair.Position = new List<int>();
foreach (int j in inputArray)
{
if (i == j)
{
pair.Position.Add(index);
}
index++;
}
pair.Number = i;
pairList[pairIndex] = pair;
pairIndex++;
}
return pairList;
}
Here is the line of code in Main method
Pair[] pairs = NewCollection.GetIndex(array, intQuery);
List<int> minWindow = new List<int>();
for (int i = 0; i <pairs.Length - 1; i++)
{
List<int> first = pairs[i].Position;
List<int> second = pairs[i + 1].Position;
int? temp = null;
int? temp1 = null;
foreach(int m in first)
{
foreach (int n in second)
{
if (n > m)
{
temp = m;
temp1 = n;
}
}
}
if (temp.HasValue && temp1.HasValue)
{
if (!minWindow.Contains((int)temp))
minWindow.Add((int)temp);
if (!minWindow.Contains((int)temp1))
minWindow.Add((int)temp1);
}
else
{
Console.WriteLine(" Bad Query array");
minWindow.Clear();
break;
}
}
if(minWindow.Count > 0)
{
Console.WriteLine("Minimum Window is :");
foreach(int i in minWindow)
{
Console.WriteLine(i + " ");
}
}
It is worth noting that this problem is related to the longest common subsequence problem, so coming up with algorithms that run in better than O(n^2) time in the general case with duplicates would be challenging.
Just in case someone is interested in C++ implementation with O(nlog(k))
void findMinWindow(const vector<int>& input, const vector<int>& query) {
map<int, int> qtree;
for(vector<int>::const_iterator itr=query.begin(); itr!=query.end(); itr++) {
qtree[*itr] = 0;
}
int first_ptr=0;
int begin_ptr=0;
int index1 = 0;
int queptr = 0;
int flip = 0;
while(true) {
//check if value is in query
if(qtree.find(input[index1]) != qtree.end()) {
int x = qtree[input[index1]];
if(0 == x) {
flip++;
}
qtree[input[index1]] = ++x;
}
//remove all nodes that are not required and
//yet satisfy the all query condition.
while(query.size() == flip) {
//done nothing more
if(queptr == input.size()) {
break;
}
//check if queptr is pointing to node in the query
if(qtree.find(input[queptr]) != qtree.end()) {
int y = qtree[input[queptr]];
//more nodes and the queue is pointing to deleteable node
//condense the nodes
if(y > 1) {
qtree[input[queptr]] = --y;
queptr++;
} else {
//cant condense more just keep that memory
if((!first_ptr && !begin_ptr) ||
((first_ptr-begin_ptr)>(index1-queptr))) {
first_ptr=index1;
begin_ptr=queptr;
}
break;
}
} else {
queptr++;
}
}
index1++;
if(index1==input.size()) {
break;
}
}
cout<<"["<<begin_ptr<<" - "<<first_ptr<<"]"<<endl;
}
here the main for calling it.
#include <iostream>
#include <vector>
#include <map>
using namespace std;
int main() {
vector<int> input;
input.push_back(2);
input.push_back(5);
input.push_back(2);
input.push_back(8);
input.push_back(0);
input.push_back(1);
input.push_back(4);
input.push_back(7);
vector<int> query1;
query1.push_back(2);
query1.push_back(8);
query1.push_back(0);
vector<int> query2;
query2.push_back(2);
query2.push_back(1);
query2.push_back(7);
vector<int> query3;
query3.push_back(1);
query3.push_back(4);
findMinWindow(input, query1);
findMinWindow(input, query2);
findMinWindow(input, query3);
}

Merging two lists into one and sorting the items

Is there a way to merge(union without dupes) two given lists into one and store the items in sorted way by using ONE for loop?
Also, i am looking for a solution which does not makes use of API methods ( like, union, sort etc).
Sample Code.
private static void MergeAndOrder()
{
var listOne = new List<int> {3, 4, 1, 2, 7, 6, 9, 11};
var listTwo = new List<int> {1, 7, 8, 3, 5, 10, 15, 12};
//Without Using C# helper methods...
//ToDo.............................
//Using C# APi.
var expectedResult = listOne.Union(listTwo).ToList();
expectedResult.Sort();//Output: 1,2,3,4,5,6,7,8,9,10,11,12,15
//I need the same result without using API methods, and that too by iterating over items only once.
}
PS: I have been asked this question in an interview, but couldn't find answer as yet.
Why can't you use the api methods? Re-inventing the wheel is dumb. Also, it's the .ToList() call that's killing you. Never call .ToList() or .ToArray() until you absolutely have to, because they break your lazy evaluation.
Do it like this and you'll enumerate the lists with the minimum amount necessary:
var expectedResult = listOne.Union(listTwo).OrderBy(i => i);
This will do the union in one loop using a hashset, and lazy execution means the base-pass for the sort will piggyback on the union. But I don't think it's possible finish the sort in a single iteration, because sorting is not a O(n) operation.
Without the precondition that both lists are sorted before the merge + sort operation, you can't do this in O(n) time (or "using one loop").
Add that precondition and the problem is very easy.
Keep two iterators, one for each list. On each loop, compare the element from each list and choose the smaller. Increment that list's iterator. If the element you are about to insert in the final list is already the last element in that list, skip the insert.
In pseudocode:
List a = { 1, 3, 5, 7, 9 }
List b = { 2, 4, 6, 8, 10 }
List result = { }
int i=0, j=0, lastIndex=0
while(i < a.length || j < b.length)
// If we're done with a, just gobble up b (but don't add duplicates)
if(i >= a.length)
if(result[lastIndex] != b[j])
result[++lastIndex] = b[j]
j++
continue
// If we're done with b, just gobble up a (but don't add duplicates)
if(j >= b.length)
if(result[lastIndex] != a[i])
result[++lastIndex] = a[i]
i++
continue
int smallestVal
// Choose the smaller of a or b
if(a[i] < b[j])
smallestVal = a[i++]
else
smallestVal = b[j++]
// Don't insert duplicates
if(result[lastIndex] != smallestVal)
result[++lastIndex] = smallestVal
end while
private static void MergeTwoSortedArray(int[] first, int[] second)
{
//throw new NotImplementedException();
int[] result = new int[first.Length + second.Length];
int i=0 , j=0 , k=0;
while(i < first.Length && j <second.Length)
{
if(first[i] < second[j])
{
result[k++] = first[i++];
}
else
{
result[k++] = second[j++];
}
}
if (i < first.Length)
{
for (int a = i; a < first.Length; a++)
result[k] = first[a];
}
if (j < second.Length)
{
for (int a = j; a < second.Length; a++)
result[k++] = second[a];
}
foreach (int a in result)
Console.Write(a + " ");
Console.WriteLine();
}
Using iterators and streaming interface the task is not that complicated:
class MergeTwoSortedLists
{
static void Main(string[] args) {
var list1 = new List<int?>() {
1,3,5,9,11
};
var list2 = new List<int?>() {
2,5,6,11,15,17,19,29
};
foreach (var c in SortedAndMerged(list1.GetEnumerator(), list2.GetEnumerator())) {
Console.Write(c+" ");
}
Console.ReadKey();
}
private static IEnumerable<int> SortedAndMerged(IEnumerator<int?> e1, IEnumerator<int?> e2) {
e2.MoveNext();
e1.MoveNext();
do {
while (e1.Current < e2.Current) {
if (e1.Current != null) yield return e1.Current.Value;
e1.MoveNext();
}
if (e2.Current != null) yield return e2.Current.Value;
e2.MoveNext();
} while (!(e1.Current == null && e2.Current == null));
}
}
Try this:
public static IEnumerable<T> MergeWith<T>(IEnumerable<T> collection1, IEnumerable<T> collection2,
IComparer<T> comparer)
{
using (var enumerator1 = collection1.GetEnumerator())
using (var enumerator2 = collection2.GetEnumerator())
{
var isMoveNext1 = enumerator1.MoveNext();
var isMoveNext2 = enumerator2.MoveNext();
do
{
while (comparer.Compare(enumerator1.Current, enumerator2.Current) < 0 || !isMoveNext2)
{
if (isMoveNext1)
yield return enumerator1.Current;
else
break;
isMoveNext1 = enumerator1.MoveNext();
}
if (isMoveNext2)
yield return enumerator2.Current;
isMoveNext2 = enumerator2.MoveNext();
} while (isMoveNext1 || isMoveNext2);
}
}
You could write a loop that merges and de-dups the lists and uses a binary-search approach to insert new values into the destination list.
var listOne = new List<int> { 3, 4, 1, 2, 7, 6, 9, 11 };
var listTwo = new List<int> { 1, 7, 8, 3, 5, 10, 15, 12 };
var result = listOne.ToList();
foreach (var n in listTwo)
{
if (result.IndexOf(n) == -1)
result.Add(n);
}
The closest solution I see would be to allocate an array knowing that integers are bounded to some value.
int[] values = new int[ Integer.MAX ]; // initialize with 0
int size1 = list1.size();
int size2 = list2.size();
for( int pos = 0; pos < size1 + size2 ; pos++ )
{
int val = pos > size1 ? list2[ pos-size1 ] : list1[ pos ] ;
values[ val ]++;
}
Then you can argue that you have the sorted array in a "special" form :-) To get a clean sorted array, you need to traverse the values array, skip all position with 0 count, and build the final list.
This will only work for lists of integers, but happily that is what you have!
List<int> sortedList = new List<int>();
foreach (int x in listOne)
{
sortedList<x> = x;
}
foreach (int x in listTwo)
{
sortedList<x> = x;
}
This is using the values in each list as the index position at which to store the value. Any duplicate values will overwrite the previous entry at that index position. It meets the requirement of only one iteration over the values.
It does of course mean that there will be 'empty' positions in the list.
I suspect the job position has been filled by now though.... :-)

C#: Cleanest way to divide a string array into N instances N items long

I know how to do this in an ugly way, but am wondering if there is a more elegant and succinct method.
I have a string array of e-mail addresses. Assume the string array is of arbitrary length -- it could have a few items or it could have a great many items. I want to build another string consisting of say, 50 email addresses from the string array, until the end of the array, and invoke a send operation after each 50, using the string of 50 addresses in the Send() method.
The question more generally is what's the cleanest/clearest way to do this kind of thing. I have a solution that's a legacy of my VBScript learnings, but I'm betting there's a better way in C#.
You want elegant and succinct, I'll give you elegant and succinct:
var fifties = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/50;
foreach(var fifty in fifties)
Send(string.Join(";", fifty.ToArray());
Why mess around with all that awful looping code when you don't have to? You want to group things by fifties, then group them by fifties.
That's what the group operator is for!
UPDATE: commenter MoreCoffee asks how this works. Let's suppose we wanted to group by threes, because that's easier to type.
var threes = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/3;
Let's suppose that there are nine addresses, indexed zero through eight
What does this query mean?
The Enumerable.Range is a range of nine numbers starting at zero, so 0, 1, 2, 3, 4, 5, 6, 7, 8.
Range variable index takes on each of these values in turn.
We then go over each corresponding addresses[index] and assign it to a group.
What group do we assign it to? To group index/3. Integer arithmetic rounds towards zero in C#, so indexes 0, 1 and 2 become 0 when divided by 3. Indexes 3, 4, 5 become 1 when divided by 3. Indexes 6, 7, 8 become 2.
So we assign addresses[0], addresses[1] and addresses[2] to group 0, addresses[3], addresses[4] and addresses[5] to group 1, and so on.
The result of the query is a sequence of three groups, and each group is a sequence of three items.
Does that make sense?
Remember also that the result of the query expression is a query which represents this operation. It does not perform the operation until the foreach loop executes.
Seems similar to this question: Split a collection into n parts with LINQ?
A modified version of Hasan Khan's answer there should do the trick:
public static IEnumerable<IEnumerable<T>> Chunk<T>(
this IEnumerable<T> list, int chunkSize)
{
int i = 0;
var chunks = from name in list
group name by i++ / chunkSize into part
select part.AsEnumerable();
return chunks;
}
Usage example:
var addresses = new[] { "a#example.com", "b#example.org", ...... };
foreach (var chunk in Chunk(addresses, 50))
{
SendEmail(chunk.ToArray(), "Buy V14gr4");
}
It sounds like the input consists of separate email address strings in a large array, not several email address in one string, right? And in the output, each batch is a single combined string.
string[] allAddresses = GetLongArrayOfAddresses();
const int batchSize = 50;
for (int n = 0; n < allAddresses.Length; n += batchSize)
{
string batch = string.Join(";", allAddresses, n,
Math.Min(batchSize, allAddresses.Length - n));
// use batch somehow
}
Assuming you are using .NET 3.5 and C# 3, something like this should work nicely:
string[] s = new string[] {"1", "2", "3", "4"....};
for (int i = 0; i < s.Count(); i = i + 50)
{
string s = string.Join(";", s.Skip(i).Take(50).ToArray());
DoSomething(s);
}
I would just loop through the array and using StringBuilder to create the list (I'm assuming it's separated by ; like you would for email). Just send when you hit mod 50 or the end.
void Foo(string[] addresses)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i < addresses.Length; i++)
{
sb.Append(addresses[i]);
if ((i + 1) % 50 == 0 || i == addresses.Length - 1)
{
Send(sb.ToString());
sb = new StringBuilder();
}
else
{
sb.Append("; ");
}
}
}
void Send(string addresses)
{
}
I think we need to have a little bit more context on what exactly this list looks like to give a definitive answer. For now I'm assuming that it's a semicolon delimeted list of email addresses. If so you can do the following to get a chunked up list.
public IEnumerable<string> DivideEmailList(string list) {
var last = 0;
var cur = list.IndexOf(';');
while ( cur >= 0 ) {
yield return list.SubString(last, cur-last);
last = cur + 1;
cur = list.IndexOf(';', last);
}
}
public IEnumerable<List<string>> ChunkEmails(string list) {
using ( var e = DivideEmailList(list).GetEnumerator() ) {
var list = new List<string>();
while ( e.MoveNext() ) {
list.Add(e.Current);
if ( list.Count == 50 ) {
yield return list;
list = new List<string>();
}
}
if ( list.Count != 0 ) {
yield return list;
}
}
}
I think this is simple and fast enough.The example below divides the long sentence into 15 parts,but you can pass batch size as parameter to make it dynamic.Here I simply divide using "/n".
private static string Concatenated(string longsentence)
{
const int batchSize = 15;
string concatanated = "";
int chanks = longsentence.Length / batchSize;
int currentIndex = 0;
while (chanks > 0)
{
var sub = longsentence.Substring(currentIndex, batchSize);
concatanated += sub + "/n";
chanks -= 1;
currentIndex += batchSize;
}
if (currentIndex < longsentence.Length)
{
int start = currentIndex;
var finalsub = longsentence.Substring(start);
concatanated += finalsub;
}
return concatanated;
}
This show result of split operation.
var parts = Concatenated(longsentence).Split(new string[] { "/n" }, StringSplitOptions.None);
Extensions methods based on Eric's answer:
public static IEnumerable<IEnumerable<T>> SplitIntoChunks<T>(this T[] source, int chunkSize)
{
var chunks = from index in Enumerable.Range(0, source.Length)
group source[index] by index / chunkSize;
return chunks;
}
public static T[][] SplitIntoArrayChunks<T>(this T[] source, int chunkSize)
{
var chunks = from index in Enumerable.Range(0, source.Length)
group source[index] by index / chunkSize;
return chunks.Select(e => e.ToArray()).ToArray();
}

Categories