Copy subarray based on index array - c#

There are a bunch of answers on how to get a subarray with a start and end index, or a start index and length. But I'm looking for a way to get a subarray based on an index array.
Here is what I have (which works fine but seems clunky).
//sample arrays (could be unordered)
double[] expiry= { 0.99, 0.9, 0.75, 0.60, 0.5, 0.4, ...};
double[] values = { 0.245, 0.24, 0.235, 0.22, 0.21, 0.20, ... };
//find index of all elements meeting criteria in array expiry
int[] expind = expiry
.Select((b, i) => b == 0.75? i : -1)
.Where(i => i != -1).ToArray();
//create new arrays of appropriate size
double[] newvalues = new double[expind.Length];
//populate new arrays based on index
for (int i = 0; i < expind.Length; i++)
newvalues[i] = values[expind[i]];
//check values
foreach (var item in newvalues)
Console.WriteLine(item);
Is there a more efficient and general way of doing this please?
UPDATE
next attempt (Still not super efficient, but at least loopless):
Array.Sort(expiry, values);
double criteria = 0.75;
int start = Array.IndexOf(expiry, criteria);
int last = Array.LastIndexOf(expiry, criteria);
int length = last - start + 1;
double[] newvalues2 = new double[length];
Array.Copy(values, start, newvalues2, 0, length);

You can find the values this way, using a lambda expression:
double[] newVals = values.Where((t, i) => expiry[i] == 0.75).ToArray();

This is a bit more concise. There is no need to actually put the indexes into an expind array; just use the indices directly with the Where() overload that takes an index:
double[] newpoints = points.Where((p, i) => (expiry[i] == 0.75)).ToArray();
double[] newvalues = values.Where((v, i) => (expiry[i] == 0.75)).ToArray();
See deeper discussion.
Now, if for some reason you already have an array of expind indices, but not the original array of expiry it came from, you can do this:
double[] newpoints = expind.Select(ind => values[ind]).ToArray();
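For the case where the index array already exists, the same projection can be wrapped in a small generic helper. This is only a sketch: the name PickByIndex and the sample data are assumptions for illustration, and it is written as C# 9 top-level statements so it can be run as a single file.

```csharp
using System;

// Hypothetical helper: returns source[indices[0]], source[indices[1]], ...
T[] PickByIndex<T>(T[] source, int[] indices)
{
    var result = new T[indices.Length];
    for (int k = 0; k < indices.Length; k++)
        result[k] = source[indices[k]];
    return result;
}

// Sample data (assumed): the indices were computed earlier from some other array.
double[] values = { 0.245, 0.24, 0.235, 0.22, 0.21, 0.20 };
int[] expind = { 2, 4 };

double[] picked = PickByIndex(values, expind);
Console.WriteLine(picked.Length);  // 2
```

An index that is out of range for the source array throws IndexOutOfRangeException, the same as the plain loop in the question.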

Depending on the circumstances, this might work for you.
private static IEnumerable<double> GetByCondition(List<double> expiry, List<double> value)
{
for(int i = 0; i < expiry.Count; i++)
if(expiry[i] == 0.75)
yield return value[i];
}
Furthermore, I'd make it an extension method if it is used frequently on your arrays/lists.
public static IEnumerable<double> GetValuesByExpiry(
this List<double> self, List<double> values)
{
return GetByCondition(self, values);
}
As @Corak mentioned, the problem might be eliminated altogether if you merge those two arrays into a single one consisting of tuples, if that is appropriate in your case, of course. You can probably zip them together.
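A minimal sketch of the zip idea, assuming the two arrays are parallel (same length, same ordering). The sample data and the property names Expiry and Value are made up for illustration.

```csharp
using System;
using System.Linq;

double[] expiry = { 0.99, 0.9, 0.75, 0.60, 0.75, 0.4 };
double[] values = { 0.245, 0.24, 0.235, 0.22, 0.21, 0.20 };

// Pair each expiry with its value, then filter on the pair instead of on an index.
var pairs = expiry.Zip(values, (e, v) => new { Expiry = e, Value = v });
double[] matching = pairs.Where(p => p.Expiry == 0.75)
                         .Select(p => p.Value)
                         .ToArray();

Console.WriteLine(matching.Length);  // 2
```

Zip stops at the end of the shorter sequence, so a length mismatch is silently truncated rather than reported.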

Related

Why do the elements of the array change when they shouldn't?

I am writing code for merge sort. I use object arrays holding lists, which are then sorted and merged; I know it's a bit strange and there is probably a better way to do it. When I recurse back into the function in the code below, there are more elements than there should be, and I just don't understand why it happens.
public static void RecurseSort(Array arr)
{
Array ForWork = arr;
if (ForWork.Length == 1)
{
MessageBox.Show("recurs finish");
}
else
{
List<object> ForRecurse = new List<object>();
Array arrCopy = new object[ForWork.Length / 2];
for (int i = 0; i < ForWork.Length - 1; i = i + 2)
{
List<int> r1 = (List<int>)ForWork.GetValue(i);
List<int> r2 = (List<int>)ForWork.GetValue(i + 1);
if (i == ForWork.Length - 3)
{
List<int> r3 =
(List<int>)ForWork.GetValue(ForWork.Length - 1);
r2.Add(r3[0]);
}
ForRecurse.Add(CompareAndMerge(r1, r2));
}
arrCopy = ForRecurse.ToArray();
RecurseSort(arrCopy);
}
}
So arrCopy has the correct number of elements, but literally when I press 'continue' in the Visual Studio debugger, arr[3] has a count of 3 when it should have been 2.
Divide and conquer - split it to smaller problems and solve them.
Copy
How do you copy data from array A to array B, for example what will be the result of:
int[] src = { 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270 };
int[] dest = { 17, 18, 19, 20 };
int length = 2;
Array.Copy(src, 4, dest, 2, length);
Arithmetics
How do you divide an array in two? Divide its length by 2, but if the size is odd, e.g. 7, what will be the result of:
var length = 7;
var result = length / 2;
Type constraints
var length = 7d;
var result = length / 2;
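For reference, the three snippets above can be run together as below. The variable names intHalf and realHalf are introduced here only to avoid reusing length and result.

```csharp
using System;

int[] src = { 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270 };
int[] dest = { 17, 18, 19, 20 };

// Copy 2 elements starting at src[4] into dest starting at dest[2].
Array.Copy(src, 4, dest, 2, 2);
Console.WriteLine(string.Join(", ", dest));  // 17, 18, 262, 263

int intHalf = 7 / 2;        // integer division truncates toward zero
double realHalf = 7d / 2;   // one double operand makes it floating-point division
Console.WriteLine(intHalf);                 // 3
Console.WriteLine(realHalf > intHalf);      // True
```

So splitting an odd-length array with integer division leaves one half a single element longer than the other, which the merge step has to allow for.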
Merging 2 sorted arrays
In merge sort you use another sort, for example insertion sort, once only a few elements are left, e.g. 20. So if you are given the following insertion sort, how do you split an array of 37 random numbers into 2 partitions, sort them and merge them?
static class InsertionSort<T> where T : IComparable {
public static void Sort(T[] entries, Int32 first, Int32 last) {
for (var index = first + 1; index <= last; index++)
insert(entries, first, index);
}
private static void insert(T[] entries, Int32 first, Int32 index) {
var entry = entries[index];
while (index > first && entries[index - 1].CompareTo(entry) > 0)
entries[index] = entries[--index];
entries[index] = entry;
}
}
In addition, debugger is your friend.
It is difficult to say, as you did not provide the full text of a working example. But I suspect that you are keeping references to objects in the array when you should not.
But most importantly I suggest you rewrite your code so it does not use casts. Try using List of List - List<List<int>> - it will be strongly typed and clear your algorithm. Also, I suggest writing it for concrete type to sort - int and later moving to Generic implementation with your method that will accept type <T> to sort.
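To illustrate the point about references with the strongly typed List<List<int>> suggested above: assigning an array or list to another variable does not copy the inner lists, so an Add through one name is visible through the other. This is a sketch of one plausible cause, not a diagnosis of the question's code; all names are made up.

```csharp
using System;
using System.Collections.Generic;

var original = new List<List<int>>
{
    new List<int> { 1 },
    new List<int> { 2 },
    new List<int> { 3 },
};

// Assignment copies the reference, not the lists: both names see the same objects.
List<List<int>> alias = original;
alias[1].Add(original[2][0]);          // also changes the list stored in 'original'
Console.WriteLine(original[1].Count);  // 2

// Copying each inner list first leaves the caller's data untouched.
var copy = original.ConvertAll(inner => new List<int>(inner));
copy[0].Add(99);
Console.WriteLine(original[0].Count);  // 1
```

In the question's loop, r2.Add(r3[0]) has the same effect as the alias line: it grows a list that the caller's array still points to.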

getting the index of an array, multiple times, then storing them in an array c#

I want to get the index of a value in an array, which I have done with Array.IndexOf(array, value). This works well for one occurrence, but I want to get every occurrence of the value and store the indexes in another array. For example, the name 'tom' appears 5 times in an array, and I want to find the index position of each one.
Maybe something like this? This uses a list rather than an array, but it follows the same idea.
List<int> Indexes = new List<int>();
for (int i = 0; i < array.Count(); i++)
{
if (array[i] == "tom")
{
Indexes.Add(i);
}
}
This solution is like the previous one, but uses LINQ:
string value = "tom";
// Compare inside the indexed Select: filtering out nulls first would shift the indices.
int[] indices = stringArray
    .Select((s, i) => value.Equals(s) ? i : -1)
    .Where(i => i >= 0)
    .ToArray();
If I'm not mistaken, you can add another parameter to IndexOf(), which will let you specify where in the array to start. This should give you more or less what you need:
var indices = new List<int>();
int i = Array.IndexOf(array, val);
while(i > -1){
indices.Add(i);
i = Array.IndexOf(array, val, i+1);
}
// ... and if it is important to have the result as an array:
int[] result = indices.ToArray();
Practical example:
var array = new int[]{ 1, 2, 3, 3, 4, 5, 3, 6, 7, 8, 3};
int val = 3;
var indices = new List<int>();
int i = Array.IndexOf(array, val);
while(i > -1){
indices.Add(i);
i = Array.IndexOf(array, val, i+1);
}
// ... and if it is important to have the result as an array:
int[] result = indices.ToArray();
Edit: Just realized a while-loop may well be a lot cleaner than a for-loop for this.
Edit 2: Due to popular demand (see comment below), here's the original beautiful non-basic for-loop, re-introduced just for your reading pleasure:
for(int i = Array.IndexOf(array, val); i > -1; i = Array.IndexOf(array, val, i+1)){
indices.Add(i);
}
Could create an extension method to do it
namespace Extensions
{
public static class ArrayExtension
{
public static IEnumerable<int> GetIndicesOf<T>(this T[] target, T val, int start = 0)
{
EqualityComparer<T> comparer = EqualityComparer<T>.Default;
for (int i = start; i < target.Length; i++)
{
if (comparer.Equals(target[i], val))
{
yield return i;
}
}
}
}
}
Add the using statement for your namespace with the extension method using Extensions; in the file you want to call it in.
Then to call it just do the following to get the indices.
IEnumerable<int> indices = array.GetIndicesOf(value);
or to get an array just do
int[] indicies = array.GetIndicesOf(value).ToArray();
You can use LINQ's Select overload which uses elements index as well, like:
var indices = stringArray.Select((s, i) => new {Value = s, Index = i})
.Where(r => r.Value == "tom")
.Select(r => r.Index);
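Another way to write the same idea is to enumerate the positions themselves with Enumerable.Range and filter on them. A small sketch with made-up sample data:

```csharp
using System;
using System.Linq;

string[] names = { "tom", "ann", null, "tom", "bob", "tom" };

// Enumerate positions directly; == on strings is null-safe, so the null entry is simply skipped.
int[] indices = Enumerable.Range(0, names.Length)
                          .Where(i => names[i] == "tom")
                          .ToArray();

Console.WriteLine(string.Join(", ", indices));  // 0, 3, 5
```

Because the positions are generated before any filtering, the reported indices always refer to the original array.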

Efficient algorithm for removing an array from another array

I'm wondering if anyone knows a better (as in faster) algorithm/solution to solve my problem:
In my program I have an array of uints, from which I want to remove the entries contained in another uint array. However, I cannot use set operations, because I need to keep duplicate values. That is a badly worded explanation, but the example should make it a bit clearer:
uint[] array_1 = new uint[7] { 1, 1, 1, 2, 3, 4, 4};
uint[] array_2 = new uint[4] { 1, 2, 3, 4 };
uint[] result = array_1.RemoveRange(array_2);
// result should be: { 1, 1, 4 }
This is my current best idea; but it's fairly slow:
public static uint[] RemoveRange(this uint[] source_array, uint[] entries_to_remove)
{
int current_source_length = source_array.Length;
for (int i = 0; i < entries_to_remove.Length; i++)
{
for (int j = 0; j < current_source_length; j++)
{
if (entries_to_remove[i] == source_array[j])
{
// Shifts the entries in the source_array.
Buffer.BlockCopy(source_array, (j + 1) * 4, source_array, j * 4, (current_source_length - j - 1) * 4);
current_source_length--;
break;
}
}
}
uint[] new_array = new uint[current_source_length];
Buffer.BlockCopy(source_array, 0, new_array, 0, current_source_length * 4);
return new_array;
}
So again, can someone come up with a more clever approach to achieve what I want?
Thanks!
What about using a Dictionary<uint,int> using the uint number as the key and the number of times the number occurs as the value?
var source = new Dictionary<uint,int>();
source.Add(1,3);
source.Add(2,1);
source.Add(3,1);
source.Add(4,2);
var remove = new uint[]{ 1, 2, 3, 4 };
for (int i = 0; i<remove.Length; i++) {
int occurences;
if (source.TryGetValue(remove[i], out occurences)) {
if (occurences>1) {
source[remove[i]] = occurences-1;
} else {
source.Remove(remove[i]);
}
}
}
This would do what you want as far as I understand it. The key is reference counting the number of occurrences and then using the remaining reference count (if > 0) as the number of times a number has to be emitted:
public static uint[] RemoveRange(this uint[] source_array, uint[] entries_to_remove)
{
var referenceCount = new Dictionary<uint, int>();
foreach (uint n in source_array)
{
if (!referenceCount.ContainsKey(n))
referenceCount[n] = 1;
else
referenceCount[n]++;
}
foreach (uint n in entries_to_remove)
{
if (referenceCount.ContainsKey(n))
referenceCount[n]--;
}
return referenceCount.Where(x => x.Value > 0)
.Select(x => Enumerable.Repeat(x.Key, x.Value))
.SelectMany( x => x)
.ToArray();
}
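The dictionary-based version above emits values in dictionary order, not in the order they had in the source array. If the original order matters, a variation is to count the pending removals instead and walk the source once. This is a sketch under that assumption; the name RemoveEach is hypothetical.

```csharp
using System;
using System.Collections.Generic;

uint[] RemoveEach(uint[] source, uint[] toRemove)
{
    // Count how many copies of each value still have to be removed.
    var pending = new Dictionary<uint, int>();
    foreach (uint n in toRemove)
    {
        pending.TryGetValue(n, out int count);   // count is 0 when the key is missing
        pending[n] = count + 1;
    }

    var kept = new List<uint>(source.Length);
    foreach (uint n in source)
    {
        if (pending.TryGetValue(n, out int left) && left > 0)
            pending[n] = left - 1;   // consume one pending removal
        else
            kept.Add(n);             // nothing left to remove for this value: keep it
    }
    return kept.ToArray();
}

uint[] remaining = RemoveEach(new uint[] { 1, 1, 1, 2, 3, 4, 4 }, new uint[] { 1, 2, 3, 4 });
Console.WriteLine(string.Join(", ", remaining));  // 1, 1, 4
```

Both passes are linear, so the cost is O(N + M) with hashing, and the earliest occurrences are the ones removed.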
EDIT: This won't help you, since you want to keep duplicates.
I'm leaving it here for people who don't want duplicates.
Create a HashSet<T> from the second list, then call List<T>.RemoveAll with the hashset's Contains method.
var unwanted = new HashSet<uint>(...);
list.RemoveAll(unwanted.Contains);
If you don't want to remove them in-place, you can use LINQ:
list.Except(unwanted);
Except will build two hashsets and return items one at a time (deferred execution).
If the arrays aren't sorted, sort them. Initialize 3 indexes to 0. 's'(source) and 'd' (dest) index the big array A, 'r' indexes the "toRemove" array B.
While r<B.length,
While B[r] > A[s], A[d++]= A[s++].
If B[r]==A[s], s++.
r++.
Endwhile.
While s<A.length, A[d++]= A[s++].
A.length = d.
This takes no extra space, and runs in O(N) (or N lg N if they are initially unsorted), compared to the N^2 in your original solution.
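The pseudocode above could be sketched in C# roughly as follows. Two liberties are taken and should be read as assumptions: the loop is driven by the source index with explicit bounds checks, and the inputs are cloned before sorting so the caller's arrays are not modified (which gives up the "no extra space" property). The name RemoveSorted is hypothetical.

```csharp
using System;

uint[] RemoveSorted(uint[] a, uint[] b)
{
    // Work on sorted copies so the caller's arrays are left alone.
    uint[] src = (uint[])a.Clone();
    uint[] rem = (uint[])b.Clone();
    Array.Sort(src);
    Array.Sort(rem);

    int s = 0, d = 0, r = 0;
    while (s < src.Length)
    {
        // Removal entries smaller than the current source value match nothing; skip them.
        while (r < rem.Length && rem[r] < src[s]) r++;

        if (r < rem.Length && rem[r] == src[s])
        {
            s++; r++;             // drop one source element per removal entry
        }
        else
        {
            src[d++] = src[s++];  // keep the element, compacting to the left
        }
    }
    Array.Resize(ref src, d);
    return src;
}

uint[] leftover = RemoveSorted(new uint[] { 4, 1, 3, 1, 2, 4, 1 }, new uint[] { 4, 3, 2, 1 });
Console.WriteLine(string.Join(", ", leftover));  // 1, 1, 4
```

The result comes back sorted, so this only fits when the original ordering of the big array does not need to be preserved.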
You can try using Linq here,
var resultarray = array1.Except(array2);

Find the smallest window of input array that contains all the elements of query array

Problem: Given an input array of integers of size n, and a query array of integers of size k, find the smallest window of input array that contains all the elements of query array and also in the same order.
I have tried below approach.
int[] inputArray = new int[] { 2, 5, 2, 8, 0, 1, 4, 7 };
int[] queryArray = new int[] { 2, 1, 7 };
Will find the position of all query array element in inputArray.
public static void SmallestWindow(int[] inputArray, int[] queryArray)
{
Dictionary<int, HashSet<int>> dict = new Dictionary<int, HashSet<int>>();
int index = 0;
foreach (int i in queryArray)
{
HashSet<int> hash = new HashSet<int>();
foreach (int j in inputArray)
{
index++;
if (i == j)
hash.Add(index);
}
dict.Add(i, hash);
index = 0;
}
// Need to perform action in above dictionary.??
}
I got following dictionary
int 2--> position {1, 3}
int 1 --> position {6}
int 7 --> position {8}
Now I want to perform the following steps to find out the minimum window:
Compare the positions of int 2 to the position of int 1. As (6-3) < (6-1), I will store 3, 6 in a hashmap.
Then compare the positions of int 1 and int 7 in the same way.
I cannot work out how to compare two consecutive values of a dictionary. Please help.
The algorithm:
For each element in the query array, store in a map M (V → (I,P)), V is the element, I is an index into the input array, P is the position in the query array. (The index into the input array for some P is the largest such that query[0..P] is a subsequence of input[I..curr])
Iterate through the array.
If the value is the first term in the query array: Store the current index as I.
Else: Store the value of the index of the previous element in the query array, e.g. M[currVal].I = M[query[M[currVal].P-1]].I.
If the value is the last term: Check if [I..curr] is a new best.
Complexity
The complexity of this is O(N), where N is the size of the input array.
N.B.
This code expects that no elements are repeated in the query array. To cater for this, we can use a map M (V → listOf((I,P))). This is O(NhC(Q)), where hC(Q) is the count of the mode for the query array.
Even better would be to use M (V → listOf((linkedList(I), P))). Where repeated elements occur consecutively in the query array, we use a linked list. Updating those values then becomes O(1). The complexity is then O(NhC(D(Q))), where D(Q) is Q with consecutive terms merged.
Implementation
A sample Java implementation is available here. It does not work for repeated elements in the query array, nor does it do error checking, etc.
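Since the rest of this thread is in C#, here is a rough C# sketch of the O(N) algorithm described above. It is not a port of the linked Java code; the function name and the tuple return type are assumptions, and like the description it expects no repeated values in the query array.

```csharp
using System;
using System.Collections.Generic;

// Returns the (first, last) input indices of the smallest window, or null if there is none.
(int First, int Last)? SmallestOrderedWindow(int[] input, int[] query)
{
    var position = new Dictionary<int, int>();   // value -> position P in the query
    for (int p = 0; p < query.Length; p++) position[query[p]] = p;

    // start[P] = latest index I such that query[0..P] is a subsequence of input[I..current]
    var start = new int[query.Length];
    Array.Fill(start, -1);

    (int First, int Last)? best = null;
    for (int i = 0; i < input.Length; i++)
    {
        if (!position.TryGetValue(input[i], out int q)) continue;

        if (q == 0) start[0] = i;                             // first term: window starts here
        else if (start[q - 1] >= 0) start[q] = start[q - 1];  // inherit start from previous term
        else continue;                                        // previous term not seen yet

        if (q == query.Length - 1 &&
            (best == null || i - start[q] < best.Value.Last - best.Value.First))
            best = (start[q], i);
    }
    return best;
}

var window = SmallestOrderedWindow(new[] { 2, 5, 2, 8, 0, 1, 4, 7 }, new[] { 2, 1, 7 });
Console.WriteLine(window);  // (2, 7)
```

Each input element does a constant amount of work (one dictionary lookup and at most one array write), which is where the O(N) bound comes from.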
I don't see how using HashSet and Dictionary will help you in this. Were I faced with this problem, I'd go about it quite differently.
One way to do it (not the most efficient way) is shown below. This code makes the assumption that queryArray contains at least two items.
int FindInArray(int[] a, int start, int value)
{
for (int i = start; i < a.Length; ++i)
{
if (a[i] == value)
return i;
}
return -1;
}
struct Pair
{
public int first;
public int last;
}
List<Pair> foundPairs = new List<Pair>();
int startPos = 0;
bool found = true;
while (found)
{
found = false;
// find next occurrence of queryArray[0] in inputArray
startPos = FindInArray(inputArray, startPos, queryArray[0]);
if (startPos == -1)
{
// no more occurrences of the first item
break;
}
Pair p = new Pair();
p.first = startPos;
++startPos;
int nextPos = startPos;
// now find occurrences of remaining items
for (int i = 1; i < queryArray.Length; ++i)
{
nextPos = FindInArray(inputArray, nextPos, queryArray[i]);
if (nextPos == -1)
{
break; // didn't find it
}
else
{
p.last = nextPos++;
found = (i == queryArray.Length-1);
}
}
if (found)
{
foundPairs.Add(p);
}
}
// At this point, the foundPairs list contains the (start, end) of all
// sublists that contain the items in order.
// You can then iterate through that list, subtract (last-first), and take
// the item that has the smallest value. That will be the shortest sublist
// that matches the criteria.
With some work, this could be made more efficient. For example, if 'queryArray' contains [1, 2, 3] and inputArray contains [1, 7, 4, 9, 1, 3, 6, 4, 1, 8, 2, 3], the above code will find three matches (starting at positions 0, 4, and 8). Slightly smarter code could determine that when the 1 at position 4 is found, since no 2 was found prior to it, that any sequence starting at the first position would be longer than the sequence starting at position 4, and therefore short-circuit the first sequence and start over at the new position. That complicates the code a bit, though.
You want not a HashSet but a (sorted) tree or array as the value in the dictionary; the dictionary contains mappings from values you find in the input array to the (sorted) list of indices where that value appears.
Then you do the following
Look up the first entry in the query. Pick the lowest index where it appears.
Look up the second entry; pick the lowest entry greater than the index of the first.
Look up the third; pick the lowest greater than the second. (Etc.)
When you reach the last entry in the query, (1 + last index - first index) is the size of the smallest match.
Now pick the second index of the first query, repeat, etc.
Pick the smallest match found from any of the starting indices.
(Note that the "lowest entry greater" is an operation supplied with sorted trees, or can be found via binary search on a sorted array.)
The complexity of this is approximately O(M*n*log n) where M is the length of the query and n is the average number of indices at which a given value appears in the input array. You can modify the strategy by picking that query array value that appears least often for the starting point and going up and down from there; if there are k of those entries (k <= n) then the complexity is O(M*k*log n).
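A sketch of the basic strategy (starting from each occurrence of the first query value, without the least-frequent-value refinement), using sorted index lists and List<T>.BinarySearch for the "lowest entry greater" step. The function and variable names are assumptions.

```csharp
using System;
using System.Collections.Generic;

(int First, int Last)? SmallestWindowBySearch(int[] input, int[] query)
{
    if (query.Length == 0) return null;

    // value -> ascending list of the input indices where it occurs
    var occurrences = new Dictionary<int, List<int>>();
    for (int i = 0; i < input.Length; i++)
    {
        if (!occurrences.TryGetValue(input[i], out var list))
            occurrences[input[i]] = list = new List<int>();
        list.Add(i);
    }

    if (!occurrences.TryGetValue(query[0], out var starts)) return null;

    (int First, int Last)? best = null;
    foreach (int first in starts)
    {
        int last = first;
        bool complete = true;
        for (int k = 1; k < query.Length && complete; k++)
        {
            complete = false;
            if (!occurrences.TryGetValue(query[k], out var candidates)) break;
            // Lowest index strictly greater than 'last': search for last + 1.
            int at = candidates.BinarySearch(last + 1);
            if (at < 0) at = ~at;               // not found: complement is the insertion point
            if (at == candidates.Count) break;  // no later occurrence
            last = candidates[at];
            complete = true;
        }
        if (complete && (best == null || last - first < best.Value.Last - best.Value.First))
            best = (first, last);
    }
    return best;
}

var found = SmallestWindowBySearch(new[] { 2, 5, 2, 8, 0, 1, 4, 7 }, new[] { 2, 1, 7 });
Console.WriteLine(found);  // (2, 7)
```

The window size mentioned above is then 1 + Last - First, which is 6 for this input.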
After you got all the positions(indexes) in the inputArray:
2 --> position {0,2} // note: I change them to 0-based array
1 --> position {5,6} // I suppose it's {5,6} to make it more complex, in your code it's only {5}
7 --> position {7}
I use a recursion to get all possible paths. [0->5->7] [0->6->7] [2->5->7] [2->6->7]. The total is 2*2*1=4 possible paths. Obviously the one who has Min(Last-First) is the shortest path(smallest window), those numbers in the middle of the path don't matter. Here comes the code.
struct Pair
{
public int Number; // the number in queryArray
public int[] Indexes; // the positions of the number
}
static List<int[]> results = new List<int[]>(); //store all possible paths
static Stack<int> currResult = new Stack<int>(); // the container of current path
static int[] inputArray, queryArray;
static Pair[] pairs;
After the data structures, here is the Main.
inputArray = new int[] { 2, 7, 1, 5, 2, 8, 0, 1, 4, 7 }; //my test case
queryArray = new int[] { 2, 1, 7 };
pairs = (from n in queryArray
select new Pair { Number = n, Indexes = inputArray.FindAllIndexes(i => i == n) }).ToArray();
Go(0);
FindAllIndexes is an extension method to help find all the indexes.
public static int[] FindAllIndexes<T>(this IEnumerable<T> source, Func<T,bool> predicate)
{
//do necessary check here, then
Queue<int> indexes = new Queue<int>();
for (int i = 0;i<source.Count();i++)
if (predicate(source.ElementAt(i))) indexes.Enqueue(i);
return indexes.ToArray();
}
The recursion method:
static void Go(int depth)
{
if (depth == pairs.Length)
{
results.Add(currResult.Reverse().ToArray());
}
else
{
var indexes = pairs[depth].Indexes;
for (int i = 0; i < indexes.Length; i++)
{
// Peek() is the most recently pushed index; Last() on a Stack would return the oldest one.
if (depth == 0 || indexes[i] > currResult.Peek())
{
currResult.Push(indexes[i]);
Go(depth + 1);
currResult.Pop();
}
}
}
}
At last, a loop of results can find the Min(Last-First) result(shortest window).
Algorithm:
get all indexes into the inputArray of all queryArray values
order them ascending by index
using each index (x) as a starting point, find the first higher index (y) such that the segment inputArray[x-y] contains all queryArray values
keep only those segments that have the queryArray items in order
order the segments by their lengths, ascending
c# implementation:
First get all indexes into the inputArray of all queryArray values and order them ascending by index.
public static int[] SmallestWindow(int[] inputArray, int[] queryArray)
{
var indexed = queryArray
.SelectMany(x => inputArray
.Select((y, i) => new
{
Value = y,
Index = i
})
.Where(y => y.Value == x))
.OrderBy(x => x.Index)
.ToList();
Next, using each index (x) as a starting point find the first higher index (y) such that the segment inputArray[x-y] contains all queryArray values.
var segments = indexed
.Select(x =>
{
var unique = new HashSet<int>();
return new
{
Item = x,
Followers = indexed
.Where(y => y.Index >= x.Index)
.TakeWhile(y => unique.Count != queryArray.Length)
.Select(y =>
{
unique.Add(y.Value);
return y;
})
.ToList(),
IsComplete = unique.Count == queryArray.Length
};
})
.Where(x => x.IsComplete);
Now keep only those segments that have the queryArray items in order.
var queryIndexed = segments
.Select(x => x.Followers.Select(y => new
{
QIndex = Array.IndexOf(queryArray, y.Value),
y.Index,
y.Value
}).ToArray());
var queryOrdered = queryIndexed
.Where(item =>
{
var qindex = item.Select(x => x.QIndex).ToList();
bool changed;
do
{
changed = false;
for (int i = 1; i < qindex.Count; i++)
{
if (qindex[i] <= qindex[i - 1])
{
qindex.RemoveAt(i);
changed = true;
}
}
} while (changed);
return qindex.Count == queryArray.Length;
});
Finally, order the segments by their lengths, ascending. The first segment in the result is the smallest window into inputArray that contains all queryArray values in the order of queryArray.
var result = queryOrdered
.Select(x => new[]
{
x.First().Index,
x.Last().Index
})
.OrderBy(x => x[1] - x[0]);
var best = result.FirstOrDefault();
return best;
}
test it with
public void Test()
{
var inputArray = new[] { 2, 1, 5, 6, 8, 1, 8, 6, 2, 9, 2, 9, 1, 2 };
var queryArray = new[] { 6, 1, 2 };
var result = SmallestWindow(inputArray, queryArray);
if (result == null)
{
Console.WriteLine("no matching window");
}
else
{
Console.WriteLine("Smallest window is indexes " + result[0] + " to " + result[1]);
}
}
output:
Smallest window is indexes 3 to 8
Thank you everyone for your input. I have changed my code a bit and found it working. It might not be very efficient, but I'm happy to have solved it using my own head :). Please give your feedback.
Here is my Pair class, with number and position as fields:
public class Pair
{
public int Number;
public List<int> Position;
}
Here is a method which will return the list of all Pairs.
public static Pair[] GetIndex(int[] inputArray, int[] query)
{
Pair[] pairList = new Pair[query.Length];
int pairIndex = 0;
foreach (int i in query)
{
Pair pair = new Pair();
int index = 0;
pair.Position = new List<int>();
foreach (int j in inputArray)
{
if (i == j)
{
pair.Position.Add(index);
}
index++;
}
pair.Number = i;
pairList[pairIndex] = pair;
pairIndex++;
}
return pairList;
}
Here is the line of code in Main method
Pair[] pairs = NewCollection.GetIndex(array, intQuery);
List<int> minWindow = new List<int>();
for (int i = 0; i <pairs.Length - 1; i++)
{
List<int> first = pairs[i].Position;
List<int> second = pairs[i + 1].Position;
int? temp = null;
int? temp1 = null;
foreach(int m in first)
{
foreach (int n in second)
{
if (n > m)
{
temp = m;
temp1 = n;
}
}
}
if (temp.HasValue && temp1.HasValue)
{
if (!minWindow.Contains((int)temp))
minWindow.Add((int)temp);
if (!minWindow.Contains((int)temp1))
minWindow.Add((int)temp1);
}
else
{
Console.WriteLine(" Bad Query array");
minWindow.Clear();
break;
}
}
if(minWindow.Count > 0)
{
Console.WriteLine("Minimum Window is :");
foreach(int i in minWindow)
{
Console.WriteLine(i + " ");
}
}
It is worth noting that this problem is related to the longest common subsequence problem, so coming up with algorithms that run in better than O(n^2) time in the general case with duplicates would be challenging.
Just in case someone is interested in C++ implementation with O(nlog(k))
void findMinWindow(const vector<int>& input, const vector<int>& query) {
map<int, int> qtree;
for(vector<int>::const_iterator itr=query.begin(); itr!=query.end(); itr++) {
qtree[*itr] = 0;
}
int first_ptr=0;
int begin_ptr=0;
int index1 = 0;
int queptr = 0;
int flip = 0;
while(true) {
//check if value is in query
if(qtree.find(input[index1]) != qtree.end()) {
int x = qtree[input[index1]];
if(0 == x) {
flip++;
}
qtree[input[index1]] = ++x;
}
//remove all nodes that are not required and
//yet satisfy the all query condition.
while(query.size() == flip) {
//done nothing more
if(queptr == input.size()) {
break;
}
//check if queptr is pointing to node in the query
if(qtree.find(input[queptr]) != qtree.end()) {
int y = qtree[input[queptr]];
//more nodes and the queue is pointing to deleteable node
//condense the nodes
if(y > 1) {
qtree[input[queptr]] = --y;
queptr++;
} else {
//cant condense more just keep that memory
if((!first_ptr && !begin_ptr) ||
((first_ptr-begin_ptr)>(index1-queptr))) {
first_ptr=index1;
begin_ptr=queptr;
}
break;
}
} else {
queptr++;
}
}
index1++;
if(index1==input.size()) {
break;
}
}
cout<<"["<<begin_ptr<<" - "<<first_ptr<<"]"<<endl;
}
Here is the main function for calling it.
#include <iostream>
#include <vector>
#include <map>
using namespace std;
int main() {
vector<int> input;
input.push_back(2);
input.push_back(5);
input.push_back(2);
input.push_back(8);
input.push_back(0);
input.push_back(1);
input.push_back(4);
input.push_back(7);
vector<int> query1;
query1.push_back(2);
query1.push_back(8);
query1.push_back(0);
vector<int> query2;
query2.push_back(2);
query2.push_back(1);
query2.push_back(7);
vector<int> query3;
query3.push_back(1);
query3.push_back(4);
findMinWindow(input, query1);
findMinWindow(input, query2);
findMinWindow(input, query3);
}

C#: Cleanest way to divide a string array into N instances N items long

I know how to do this in an ugly way, but am wondering if there is a more elegant and succinct method.
I have a string array of e-mail addresses. Assume the string array is of arbitrary length -- it could have a few items or it could have a great many items. I want to build another string consisting of say, 50 email addresses from the string array, until the end of the array, and invoke a send operation after each 50, using the string of 50 addresses in the Send() method.
The question more generally is what's the cleanest/clearest way to do this kind of thing. I have a solution that's a legacy of my VBScript learnings, but I'm betting there's a better way in C#.
You want elegant and succinct, I'll give you elegant and succinct:
var fifties = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/50;
foreach(var fifty in fifties)
Send(string.Join(";", fifty.ToArray()));
Why mess around with all that awful looping code when you don't have to? You want to group things by fifties, then group them by fifties.
That's what the group operator is for!
UPDATE: commenter MoreCoffee asks how this works. Let's suppose we wanted to group by threes, because that's easier to type.
var threes = from index in Enumerable.Range(0, addresses.Length)
group addresses[index] by index/3;
Let's suppose that there are nine addresses, indexed zero through eight
What does this query mean?
The Enumerable.Range is a range of nine numbers starting at zero, so 0, 1, 2, 3, 4, 5, 6, 7, 8.
Range variable index takes on each of these values in turn.
We then go over each corresponding addresses[index] and assign it to a group.
What group do we assign it to? To group index/3. Integer arithmetic rounds towards zero in C#, so indexes 0, 1 and 2 become 0 when divided by 3. Indexes 3, 4, 5 become 1 when divided by 3. Indexes 6, 7, 8 become 2.
So we assign addresses[0], addresses[1] and addresses[2] to group 0, addresses[3], addresses[4] and addresses[5] to group 1, and so on.
The result of the query is a sequence of three groups, and each group is a sequence of three items.
Does that make sense?
Remember also that the result of the query expression is a query which represents this operation. It does not perform the operation until the foreach loop executes.
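A small runnable version of the walkthrough, with ten placeholder addresses instead of nine so that the last group comes out short. The sample strings are made up.

```csharp
using System;
using System.Linq;

string[] addresses = { "a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9" };

// Nothing runs yet: 'threes' only describes the grouping.
var threes = from index in Enumerable.Range(0, addresses.Length)
             group addresses[index] by index / 3;

// Enumerating the query executes it; ten items give groups of 3, 3, 3 and 1.
foreach (var batch in threes)
    Console.WriteLine(batch.Key + ": " + string.Join(";", batch));
// 0: a0;a1;a2
// 1: a3;a4;a5
// 2: a6;a7;a8
// 3: a9
```

Because the query is deferred, every new enumeration of threes regroups the array from scratch; call ToList() on it if the groups are needed more than once.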
Seems similar to this question: Split a collection into n parts with LINQ?
A modified version of Hasan Khan's answer there should do the trick:
public static IEnumerable<IEnumerable<T>> Chunk<T>(
this IEnumerable<T> list, int chunkSize)
{
int i = 0;
var chunks = from name in list
group name by i++ / chunkSize into part
select part.AsEnumerable();
return chunks;
}
Usage example:
var addresses = new[] { "a@example.com", "b@example.org", ...... };
foreach (var chunk in Chunk(addresses, 50))
{
SendEmail(chunk.ToArray(), "Buy V14gr4");
}
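On .NET 6 or later, LINQ ships a built-in Enumerable.Chunk that does this batching, so a hand-written helper may not be needed there. A sketch with made-up addresses and a batch size of 2 to keep the output short:

```csharp
using System;
using System.Linq;

string[] addresses =
{
    "a@example.com", "b@example.org", "c@example.net", "d@example.com", "e@example.org"
};

// Enumerable.Chunk (available from .NET 6) yields arrays of at most 'size' items.
foreach (string[] chunk in addresses.Chunk(2))
    Console.WriteLine(string.Join(";", chunk));
// a@example.com;b@example.org
// c@example.net;d@example.com
// e@example.org
```

On those framework versions a user-defined extension method that is also named Chunk can become ambiguous with the built-in one when called with extension syntax, so a different name may be safer.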
It sounds like the input consists of separate email address strings in a large array, not several email address in one string, right? And in the output, each batch is a single combined string.
string[] allAddresses = GetLongArrayOfAddresses();
const int batchSize = 50;
for (int n = 0; n < allAddresses.Length; n += batchSize)
{
string batch = string.Join(";", allAddresses, n,
Math.Min(batchSize, allAddresses.Length - n));
// use batch somehow
}
Assuming you are using .NET 3.5 and C# 3, something like this should work nicely:
string[] s = new string[] {"1", "2", "3", "4"....};
for (int i = 0; i < s.Count(); i = i + 50)
{
// Renamed from 's', which clashed with the array declared above.
string batch = string.Join(";", s.Skip(i).Take(50).ToArray());
DoSomething(batch);
}
I would just loop through the array and using StringBuilder to create the list (I'm assuming it's separated by ; like you would for email). Just send when you hit mod 50 or the end.
void Foo(string[] addresses)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i < addresses.Length; i++)
{
sb.Append(addresses[i]);
if ((i + 1) % 50 == 0 || i == addresses.Length - 1)
{
Send(sb.ToString());
sb = new StringBuilder();
}
else
{
sb.Append("; ");
}
}
}
void Send(string addresses)
{
}
I think we need a little more context on what exactly this list looks like to give a definitive answer. For now I'm assuming that it's a semicolon-delimited list of email addresses. If so, you can do the following to get a chunked-up list.
public IEnumerable<string> DivideEmailList(string list) {
var last = 0;
var cur = list.IndexOf(';');
while ( cur >= 0 ) {
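yield return list.Substring(last, cur-last);
last = cur + 1;
cur = list.IndexOf(';', last);
}
// Emit the final address, which has no trailing ';'.
if ( last < list.Length ) {
yield return list.Substring(last);
}
}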
public IEnumerable<List<string>> ChunkEmails(string list) {
using ( var e = DivideEmailList(list).GetEnumerator() ) {
// Named 'chunk' because a local called 'list' would clash with the parameter.
var chunk = new List<string>();
while ( e.MoveNext() ) {
chunk.Add(e.Current);
if ( chunk.Count == 50 ) {
yield return chunk;
chunk = new List<string>();
}
}
if ( chunk.Count != 0 ) {
yield return chunk;
}
}
}
I think this is simple and fast enough. The example below divides the long sentence into chunks of 15 characters, but you can pass the batch size as a parameter to make it dynamic. Here I simply use "/n" as the delimiter.
private static string Concatenated(string longsentence)
{
const int batchSize = 15;
string concatanated = "";
int chanks = longsentence.Length / batchSize;
int currentIndex = 0;
while (chanks > 0)
{
var sub = longsentence.Substring(currentIndex, batchSize);
concatanated += sub + "/n";
chanks -= 1;
currentIndex += batchSize;
}
if (currentIndex < longsentence.Length)
{
int start = currentIndex;
var finalsub = longsentence.Substring(start);
concatanated += finalsub;
}
return concatanated;
}
This shows the result of the split operation.
var parts = Concatenated(longsentence).Split(new string[] { "/n" }, StringSplitOptions.None);
Extension methods based on Eric's answer:
public static IEnumerable<IEnumerable<T>> SplitIntoChunks<T>(this T[] source, int chunkSize)
{
var chunks = from index in Enumerable.Range(0, source.Length)
group source[index] by index / chunkSize;
return chunks;
}
public static T[][] SplitIntoArrayChunks<T>(this T[] source, int chunkSize)
{
var chunks = from index in Enumerable.Range(0, source.Length)
group source[index] by index / chunkSize;
return chunks.Select(e => e.ToArray()).ToArray();
}
