I need to find how many times the Anagrams are contained in a String like in this example:(the anagrams and the string itself)
Input 1(String) = thegodsanddogsweredogged
Input 2(String) = dog
Output(int) = 3
the output will be 3 because of these - (thegodsanddogsweredogged)
So far i managed to check how many times the word "dog" is contained in the string:
public ActionResult FindAnagram (string word1, string word2)
{
int ?count = Regex.Matches(word1, word2).Count;
return View(count);
}
This works to check how many times the word is contained but i still get an error: Cannot convert null to 'int' because it is a non-nulable value type.
So i need to check for how many times input 2 and the anagrams of input 2(dog,god,dgo,ogd etc) are contained in input 1?(int this case its 3 times-thegodsanddogsweredogged)
Thank you
I wanted to post a variation which is more readable at the cost of some runtime performance. Anton's answer is probably more performant, but IMHO less readable than it could be.
The nice thing about anagrams is that you know their exact length, and you can figure out all possible anagram locations quite easily. For a 3 letter anagram in a 100 letter haystack, you know that there are 98 possible locations:
0..2
1..3
2..4
...
96..98
97..99
These indexes can be generated quite easily:
var amountOfPossibleAnagramLocations = haystack.Length - needle.Length + 1;
var substringIndexes = Enumerable.Range(0, amountOfPossibleAnagramLocations);
At this point, you simply take every listed substring and test if it's an anagram.
var anagramLength = needle.Length;
int count = 0;
foreach(var index in substringIndexes)
{
var substring = haystack.Substring(index, anagramLength);
if(substring.IsAnagramOf(needle))
count++;
}
Note that a lot of this can be condensed into a single LINQ chain:
var amountOfPossibleAnagramLocations = haystack.Length - needle.Length + 1;
var anagramLength = needle.Length;
var anagramCount = Enumerable
.Range(0, amountOfPossibleAnagramLocations)
.Select(x => haystack.Substring(x, anagramLength))
.Count(substring => substring.IsAnagramOf(needle));
Whether it's more readable or not depends on how comfortable you are with LINQ. I personally prefer it (up to a reasonable size, of course).
To check for an anagram, simply sort the characters and check for equality. I used an extension method for the readability bonus:
public static bool IsAnagramOf(this string word1, string word2)
{
var word1Sorted = String.Concat(word1.OrderBy(c => c));
var word2Sorted = String.Concat(word2.OrderBy(c => c));
return word1Sorted == word2Sorted;
}
I've omitted things like case insensitivity or ignoring whitespace for the sake of brevity.
It would be better not to try to use Regex but write your own logic.
You can use a dictionary with key char - a letter of the word and value int - number of letter occurrences. And build such a dictionary for the word.
Anagrams will have similar dictionaries, so you can build a temp dictionary for each temp string built using the Windowing method over your str and compare it with the dictionary built for your word.
Here is my code:
using System;
using System.Collections.Generic;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
var str = "thegodsanddogsweredogged";
var word = "dog";
Console.WriteLine("Word: " + word);
Console.WriteLine("Str: " + str);
Console.WriteLine();
var count = CountAnagrams(str, word);
Console.WriteLine("Count: " + count);
Console.ReadKey();
}
private static int CountAnagrams(string str, string word) {
var charDict = BuildCharDict(word);
int count = 0;
for (int i = 0; i < str.Length - word.Length + 1; i++) {
string tmp = "";
for (int j = i; j < str.Length; j++) {
tmp += str[j];
if (tmp.Length == word.Length)
break;
}
var tmpCharDict = BuildCharDict(tmp);
if (CharDictsEqual(charDict, tmpCharDict)) {
count++;
Console.WriteLine("Anagram: " + tmp);
Console.WriteLine("Index: " + i);
Console.WriteLine();
}
}
return count;
}
private static Dictionary<char, int> BuildCharDict(string word) {
var charDict = new Dictionary<char, int>();
foreach (var ch in word)
{
if (charDict.ContainsKey(ch))
{
charDict[ch] += 1;
}
else
{
charDict[ch] = 1;
}
}
return charDict;
}
private static bool CharDictsEqual(Dictionary<char, int> dict1, Dictionary<char, int> dict2)
{
if (dict1.Count != dict2.Count)
return false;
foreach (var kv in dict1) {
if (!dict2.TryGetValue(kv.Key, out var val) || val != kv.Value)
{
return false;
}
}
return true;
}
}
}
Possibly there is a better solution, but mine works.
P.S. About your error. You should change int? to int, because your View might expect non-nullable int type
Is there a way to remove characters from a current character array and then save it to a new character array. Following is the code:
string s1 = "move";
string s2 = "remove";
char[] c1 = s1.ToCharArray();
char[] c2 = s2.ToCharArray();
for (int i = 0; i < s2.Length; i++)
{
for (int p = 0; p < s1.Length; p++)
{
if (c2[i] == c1[p])
{
// REMOVE LETTER FROM C2
}
// IN THE END I SHOULD JUST HAVE c3 = re (ALL THE MATCHING CHARACTERS M-O-V-E SHOULD BE
DELETED)
Would appreciate your help
You can create a third array, c3, where you will add characters from c2 that are not to be removed.
You may also use Replace.
string s3 = s2.Replace(s1,"");
The original O(N^2) approach is wasteful. And I don't see how the other two answers actually perform the work you seem to be trying to accomplish. I hope this example, which has O(N) performance, works better for you:
string s1 = "move";
string s2 = "remove";
HashSet<char> excludeCharacters = new HashSet<char>(s1);
StringBuilder sb = new StringBuilder();
// Copy every character from the original string, except those to be excluded
foreach (char ch in s2)
{
if (!excludeCharacters.Contains(ch))
{
sb.Append(ch);
}
}
return sb.ToString();
Granted, for short strings the performance isn't likely to matter. But IMHO this is also easier to comprehend than the alternatives.
EDIT:
It is still not entirely clear to me what the OP is trying to do here. The most obvious task would be to remove whole words, but neither of his descriptions seem to say that's what he really wants. So, on the assumption that the above is not addressing his needs, but that he also does not want to remove whole words, here are a couple of other options...
1) O(N), the best approach for strings of non-trivial length, but is somewhat more complicated:
string s1 = "move";
string s2 = "remove";
Dictionary<char, int> excludeCharacters = new Dictionary<char, int>();
foreach (char ch in s1)
{
int count;
excludeCharacters.TryGetValue(ch, out count);
excludeCharacters[ch] = ++count;
}
StringBuilder sb = new StringBuilder();
foreach (char ch in s2)
{
int count;
if (!excludeCharacters.TryGetValue(ch, out count) || count == 0)
{
sb.Append(ch);
}
else
{
excludeCharacters[ch] = --count;
}
}
return sb.ToString();
2) An O(N^2) implementation which at least minimizes other unnecessary inefficiencies and which would suffice if all the inputs are relatively short:
StringBuilder sb = new StringBuilder(s2);
foreach (char ch in s1)
{
for (int i = 0; i < sb.Length; i++)
{
if (sb[i] == ch)
{
sb.Remove(i, 1);
break;
}
}
}
return sb.ToString();
This isn't particularly efficient, but it will probably be fast enough for short strings:
string s1 = "move";
string s2 = "remove";
foreach (char charToRemove in s1)
{
int index = s2.IndexOf(charToRemove);
if (index >= 0)
s2 = s2.Remove(index, 1);
}
// Result is now in s2.
Console.WriteLine(s2);
This avoids converting to a char array.
However, just to emphasize: This will be VERY slow for large strings.
[EDIT]
I have done some testing, and it turns out that this code is in fact pretty fast.
Here I'm comparing the code with the optimized code from another answer. However, note that we're not comparing entirely fairly, since the code here correctly implements the OP's requirement while the other code doesn't. However, it does demonstrate that the use of HashSet doesn't help as much as one might think.
I tested this code on a release build, not run inside a debugger (if you run it in a debugger, it does a debug build not a release build which will give incorrect timings).
This test uses target strings of length 1024 and chars to remove == "SKFPBPENAALDKOWJKFPOSKLW".
My results, where test1() is the incorrect but supposedly optimized solution from another answer, and test2() is my unoptimized but correct solution:
test1() took 00:00:00.2891665
test2() took 00:00:00.1004743
test1() took 00:00:00.2720192
test2() took 00:00:00.0993898
test1() took 00:00:00.2753971
test2() took 00:00:00.0997268
test1() took 00:00:00.2754325
test2() took 00:00:00.1026486
test1() took 00:00:00.2785548
test2() took 00:00:00.1039417
test1() took 00:00:00.2818029
test2() took 00:00:00.1029695
test1() took 00:00:00.2727377
test2() took 00:00:00.0995654
test1() took 00:00:00.2711982
test2() took 00:00:00.1009849
As you can see, test2() consistently outperforms test1(). This remains true even if the strings are increased to length 8192.
The test code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;
namespace Demo
{
public static class Program
{
private static void Main(string[] args)
{
var sw = new Stopwatch();
string text = randomString(8192, 27367);
string charsToRemove = "SKFPBPENAALDKOWJKFPOSKLW";
int dummyLength = 0;
int iters = 10000;
for (int trial = 0; trial < 8; ++trial)
{
sw.Restart();
for (int i = 0; i < iters; ++i)
dummyLength += test1(text, charsToRemove).Length;
Console.WriteLine("test1() took " + sw.Elapsed);
sw.Restart();
for (int i = 0; i < iters; ++i)
dummyLength += test2(text, charsToRemove).Length;
Console.WriteLine("test2() took " + sw.Elapsed);
Console.WriteLine();
}
}
private static string randomString(int length, int seed)
{
var rng = new Random(seed);
var sb = new StringBuilder(length);
for (int i = 0; i < length; ++i)
sb.Append((char) rng.Next(65, 65 + 26*2));
return sb.ToString();
}
private static string test1(string text, string charsToRemove)
{
HashSet<char> excludeCharacters = new HashSet<char>(charsToRemove);
StringBuilder sb = new StringBuilder();
foreach (char ch in text)
{
if (!excludeCharacters.Contains(ch))
{
sb.Append(ch);
}
}
return sb.ToString();
}
private static string test2(string text, string charsToRemove)
{
foreach (char charToRemove in charsToRemove)
{
int index = text.IndexOf(charToRemove);
if (index >= 0)
text = text.Remove(index, 1);
}
return text;
}
}
}
[EDIT 2]
Here's a much more optimized solution:
public static string RemoveChars(string text, string charsToRemove)
{
char[] result = new char[text.Length];
char[] targets = charsToRemove.ToCharArray();
int n = 0;
int m = targets.Length;
foreach (char ch in text)
{
if (m == 0)
{
result[n++] = ch;
}
else
{
int index = findFirst(targets, ch, m);
if (index < 0)
{
result[n++] = ch;
}
else
{
if (m > 1)
{
--m;
targets[index] = targets[m];
}
else
{
m = 0;
}
}
}
}
return new string(result, 0, n);
}
private static int findFirst(char[] chars, char target, int n)
{
for (int i = 0; i < n; ++i)
if (chars[i] == target)
return i;
return -1;
}
Plugging that into my test program above shows that it runs around 3 times faster than test2().
I have a sanctions api that i need to call, passing in a string of values. these values are constructed as follows:
string searchString = string.Join(" ", myList.ToArray());
// remove any numbers and return complete words
MatcCollection strMatch = Regex.Matches(searchString, #"[^\W\d]+");
var values = strMatch.Cast<Group>().Select(g => g.Value).ToArray();
var combinations = values.Permutations();
Now, that i have the array i need, i call the Permutations method below:
public static IEnumerable<IEnumerable<T>> Permutations<T>(this IEnumerable<T> source)
{
if (source == null)
throw new ArgumentException("source");
return permutations(source.ToArray());
}
the permutations method is:
private static IEnumerable<IEnumerable<T>> permutations<T>(IEnumerable<T> source)
{
var c = source.Count();
if (c == 1)
yield return source;
else
for (int i = 0; i < c; i++)
foreach (var p in permutations(source.Take(i).Concat(source.Skip(i + 1))))
yield return source.Skip(i).Take(1).Concat(p);
}
With a example list of 7 items {one,two,three,four,five,six,seven} this code returns numerous list of 7 elements in lenght.
What I need to create is the following:
First iteration:
return result = one
Second iteration
return result = one + ' ' + two
so on and so
I got the above exmple code from a post on SO, so don't know how to change it properly to get what i need.
So do I get right that not only you want all permutations of the 7 items, but also any subsets of them enumerated (something like all combinations)?
I guess the simplest way to get that behaviour would be adding some sort of length-parameter to the permutations method:
private static IEnumerable<IEnumerable<T>> permutations<T>(IEnumerable<T> source, int length)
{
var c = source.Count();
if (length == 1 || c == 1)
foreach(var x in source)
yield return new T[] { x };
else
for (int i = 0; i < c; i++)
foreach (var p in permutations(source.Take(i).Concat(source.Skip(i + 1)), length - 1))
yield return source.Skip(i).Take(1).Concat(p);
}
and then calling this method with parameters from 1 to n:
public static IEnumerable<IEnumerable<T>> Permutations<T>(this IEnumerable<T> source)
{
if (source == null)
throw new ArgumentException("source");
var src = source.ToArray();
for (int i = 1; i <= src.Length; i++)
foreach (var result in permutations(src, i))
yield return result;
}
Hope I didn't make any typos...
i have a piece of code like this
class Program
{
static IEnumerable<string> GetSequences(string a)
{
yield return a;
GetSequences(a + ">");
}
static void Main(string[] args)
{
foreach (var n in GetSequences(">"))
Console.Write(n + ",");
}
}
i was expecting an output like this
,>>,>>>
but it doesn't. it printed only ">,". Does anyone knows what i am missing ?
Use the same foreach in function itself:
static IEnumerable<string> GetSequences(string a)
{
yield return a;
foreach (var n in GetSequences(a + ">"))
yield return n;
}
and don't forget to quit recursion.
The foreach loop only works with the yield return, and you don't have a yield return on your GetSequences() command in the GetSequences() method; whatever it returns doesn't get stored or returned. It is like you are doing this:
It's like you are doing this:
static IEnumerable<string> GetSequences(string a)
{
GetSequences(a + ">");
}
which of course doesn't have a return statement (it wouldn't compile, but you knew that).
After toying around with this for a little while, if you want to use recursion I would recommend you don't use an enumerable, particularly because loops and recursion are meant to serve separate purposes, and the foreach with IEnumerable works best with collections that have already been enumerated. The loop suggested above, as is, just enables endless recursion, and implementing an escape of the recursion like this:
static IEnumerable<string> GetSequences(string a)
{
if(a.Length > 100)
yield return a;
else
foreach (var n in GetSequences(a + ">"))
yield return n;
}
yields this output:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>, (maybe to scale, I didn't want to count)
By implementing it like this I was able to get the output you wanted:
static string GetSequences(string a, int len)
{
if (a.Length < len)
{
return GetSequences(a + ">", len);
}
else
return a;
}
static void Main(string[] args)
{
for (int i = 1; i < 5; i++)
{
Console.Write(GetSequences(">", i) + ",");
}
Console.Read();
}
Of course, my integers are arbitrary, it will work with any length.
EDIT: I knew there was a way to do it the way abatishchev was saying, but I couldn't figure it out. After racking my brain, here is what I got:
static IEnumerable<string> GetSequences(string a)
{
if (a.Length < 100)
{
yield return a;
foreach (var n in GetSequences(a + ">"))
yield return n;
}else
yield break;
}
This has the output you want, although I still think using recursion with loops is a little funny.
This question already has answers here:
Split List into Sublists with LINQ
(34 answers)
Closed 1 year ago.
Is there a nice way to split a collection into n parts with LINQ?
Not necessarily evenly of course.
That is, I want to divide the collection into sub-collections, which each contains a subset of the elements, where the last collection can be ragged.
A pure linq and the simplest solution is as shown below.
static class LinqExtensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
{
int i = 0;
var splits = from item in list
group item by i++ % parts into part
select part.AsEnumerable();
return splits;
}
}
EDIT: Okay, it looks like I misread the question. I read it as "pieces of length n" rather than "n pieces". Doh! Considering deleting answer...
(Original answer)
I don't believe there's a built-in way of partitioning, although I intend to write one in my set of additions to LINQ to Objects. Marc Gravell has an implementation here although I would probably modify it to return a read-only view:
public static IEnumerable<IEnumerable<T>> Partition<T>
(this IEnumerable<T> source, int size)
{
T[] array = null;
int count = 0;
foreach (T item in source)
{
if (array == null)
{
array = new T[size];
}
array[count] = item;
count++;
if (count == size)
{
yield return new ReadOnlyCollection<T>(array);
array = null;
count = 0;
}
}
if (array != null)
{
Array.Resize(ref array, count);
yield return new ReadOnlyCollection<T>(array);
}
}
static class LinqExtensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
{
return list.Select((item, index) => new {index, item})
.GroupBy(x => x.index % parts)
.Select(x => x.Select(y => y.item));
}
}
Ok, I'll throw my hat in the ring. The advantages of my algorithm:
No expensive multiplication, division, or modulus operators
All operations are O(1) (see note below)
Works for IEnumerable<> source (no Count property needed)
Simple
The code:
public static IEnumerable<IEnumerable<T>>
Section<T>(this IEnumerable<T> source, int length)
{
if (length <= 0)
throw new ArgumentOutOfRangeException("length");
var section = new List<T>(length);
foreach (var item in source)
{
section.Add(item);
if (section.Count == length)
{
yield return section.AsReadOnly();
section = new List<T>(length);
}
}
if (section.Count > 0)
yield return section.AsReadOnly();
}
As pointed out in the comments below, this approach doesn't actually address the original question which asked for a fixed number of sections of approximately equal length. That said, you can still use my approach to solve the original question by calling it this way:
myEnum.Section(myEnum.Count() / number_of_sections + 1)
When used in this manner, the approach is no longer O(1) as the Count() operation is O(N).
This is same as the accepted answer, but a much simpler representation:
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items,
int numOfParts)
{
int i = 0;
return items.GroupBy(x => i++ % numOfParts);
}
The above method splits an IEnumerable<T> into N number of chunks of equal sizes or close to equal sizes.
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> items,
int partitionSize)
{
int i = 0;
return items.GroupBy(x => i++ / partitionSize).ToArray();
}
The above method splits an IEnumerable<T> into chunks of desired fixed size with total number of chunks being unimportant - which is not what the question is about.
The problem with the Split method, besides being slower, is that it scrambles the output in the sense that the grouping will be done on the basis of i'th multiple of N for each position, or in other words you don't get the chunks in the original order.
Almost every answer here either doesn't preserve order, or is about partitioning and not splitting, or is plainly wrong. Try this which is faster, preserves order but a lil' more verbose:
public static IEnumerable<IEnumerable<T>> Split<T>(this ICollection<T> items,
int numberOfChunks)
{
if (numberOfChunks <= 0 || numberOfChunks > items.Count)
throw new ArgumentOutOfRangeException("numberOfChunks");
int sizePerPacket = items.Count / numberOfChunks;
int extra = items.Count % numberOfChunks;
for (int i = 0; i < numberOfChunks - extra; i++)
yield return items.Skip(i * sizePerPacket).Take(sizePerPacket);
int alreadyReturnedCount = (numberOfChunks - extra) * sizePerPacket;
int toReturnCount = extra == 0 ? 0 : (items.Count - numberOfChunks) / extra + 1;
for (int i = 0; i < extra; i++)
yield return items.Skip(alreadyReturnedCount + i * toReturnCount).Take(toReturnCount);
}
The equivalent method for a Partition operation here
I have been using the Partition function I posted earlier quite often. The only bad thing about it was that is wasn't completely streaming. This is not a problem if you work with few elements in your sequence. I needed a new solution when i started working with 100.000+ elements in my sequence.
The following solution is a lot more complex (and more code!), but it is very efficient.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
namespace LuvDaSun.Linq
{
public static class EnumerableExtensions
{
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerable, int partitionSize)
{
/*
return enumerable
.Select((item, index) => new { Item = item, Index = index, })
.GroupBy(item => item.Index / partitionSize)
.Select(group => group.Select(item => item.Item) )
;
*/
return new PartitioningEnumerable<T>(enumerable, partitionSize);
}
}
class PartitioningEnumerable<T> : IEnumerable<IEnumerable<T>>
{
IEnumerable<T> _enumerable;
int _partitionSize;
public PartitioningEnumerable(IEnumerable<T> enumerable, int partitionSize)
{
_enumerable = enumerable;
_partitionSize = partitionSize;
}
public IEnumerator<IEnumerable<T>> GetEnumerator()
{
return new PartitioningEnumerator<T>(_enumerable.GetEnumerator(), _partitionSize);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class PartitioningEnumerator<T> : IEnumerator<IEnumerable<T>>
{
IEnumerator<T> _enumerator;
int _partitionSize;
public PartitioningEnumerator(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public void Dispose()
{
_enumerator.Dispose();
}
IEnumerable<T> _current;
public IEnumerable<T> Current
{
get { return _current; }
}
object IEnumerator.Current
{
get { return _current; }
}
public void Reset()
{
_current = null;
_enumerator.Reset();
}
public bool MoveNext()
{
bool result;
if (_enumerator.MoveNext())
{
_current = new PartitionEnumerable<T>(_enumerator, _partitionSize);
result = true;
}
else
{
_current = null;
result = false;
}
return result;
}
}
class PartitionEnumerable<T> : IEnumerable<T>
{
IEnumerator<T> _enumerator;
int _partitionSize;
public PartitionEnumerable(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public IEnumerator<T> GetEnumerator()
{
return new PartitionEnumerator<T>(_enumerator, _partitionSize);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class PartitionEnumerator<T> : IEnumerator<T>
{
IEnumerator<T> _enumerator;
int _partitionSize;
int _count;
public PartitionEnumerator(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public void Dispose()
{
}
public T Current
{
get { return _enumerator.Current; }
}
object IEnumerator.Current
{
get { return _enumerator.Current; }
}
public void Reset()
{
if (_count > 0) throw new InvalidOperationException();
}
public bool MoveNext()
{
bool result;
if (_count < _partitionSize)
{
if (_count > 0)
{
result = _enumerator.MoveNext();
}
else
{
result = true;
}
_count++;
}
else
{
result = false;
}
return result;
}
}
}
Enjoy!
Interesting thread. To get a streaming version of Split/Partition, one can use enumerators and yield sequences from the enumerator using extension methods. Converting imperative code to functional code using yield is a very powerful technique indeed.
First an enumerator extension that turns a count of elements into a lazy sequence:
public static IEnumerable<T> TakeFromCurrent<T>(this IEnumerator<T> enumerator, int count)
{
while (count > 0)
{
yield return enumerator.Current;
if (--count > 0 && !enumerator.MoveNext()) yield break;
}
}
And then an enumerable extension that partitions a sequence:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> seq, int partitionSize)
{
var enumerator = seq.GetEnumerator();
while (enumerator.MoveNext())
{
yield return enumerator.TakeFromCurrent(partitionSize);
}
}
The end result is a highly efficient, streaming and lazy implementation that relies on very simple code.
Enjoy!
I use this:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> instance, int partitionSize)
{
return instance
.Select((value, index) => new { Index = index, Value = value })
.GroupBy(i => i.Index / partitionSize)
.Select(i => i.Select(i2 => i2.Value));
}
As of .NET 6 you can use Enumerable.Chunk<TSource>(IEnumerable<TSource>, Int32).
This is memory efficient and defers execution as much as possible (per batch) and operates in linear time O(n)
public static IEnumerable<IEnumerable<T>> InBatchesOf<T>(this IEnumerable<T> items, int batchSize)
{
List<T> batch = new List<T>(batchSize);
foreach (var item in items)
{
batch.Add(item);
if (batch.Count >= batchSize)
{
yield return batch;
batch = new List<T>();
}
}
if (batch.Count != 0)
{
//can't be batch size or would've yielded above
batch.TrimExcess();
yield return batch;
}
}
There are lots of great answers for this question (and its cousins). I needed this myself and had created a solution that is designed to be efficient and error tolerant in a scenario where the source collection can be treated as a list. It does not use any lazy iteration so it may not be suitable for collections of unknown size that may apply memory pressure.
static public IList<T[]> GetChunks<T>(this IEnumerable<T> source, int batchsize)
{
IList<T[]> result = null;
if (source != null && batchsize > 0)
{
var list = source as List<T> ?? source.ToList();
if (list.Count > 0)
{
result = new List<T[]>();
for (var index = 0; index < list.Count; index += batchsize)
{
var rangesize = Math.Min(batchsize, list.Count - index);
result.Add(list.GetRange(index, rangesize).ToArray());
}
}
}
return result ?? Enumerable.Empty<T[]>().ToList();
}
static public void TestGetChunks()
{
var ids = Enumerable.Range(1, 163).Select(i => i.ToString());
foreach (var chunk in ids.GetChunks(20))
{
Console.WriteLine("[{0}]", String.Join(",", chunk));
}
}
I have seen a few answers across this family of questions that use GetRange and Math.Min. But I believe that overall this is a more complete solution in terms of error checking and efficiency.
protected List<List<int>> MySplit(int MaxNumber, int Divider)
{
List<List<int>> lst = new List<List<int>>();
int ListCount = 0;
int d = MaxNumber / Divider;
lst.Add(new List<int>());
for (int i = 1; i <= MaxNumber; i++)
{
lst[ListCount].Add(i);
if (i != 0 && i % d == 0)
{
ListCount++;
d += MaxNumber / Divider;
lst.Add(new List<int>());
}
}
return lst;
}
Great Answers, for my scenario i tested the accepted answer , and it seems it does not keep order. there is also great answer by Nawfal that keeps order.
But in my scenario i wanted to split the remainder in a normalized way,
all answers i saw spread the remainder or at the beginning or at the end.
My answer also takes the remainder spreading in more normalized way.
static class Program
{
static void Main(string[] args)
{
var input = new List<String>();
for (int k = 0; k < 18; ++k)
{
input.Add(k.ToString());
}
var result = splitListIntoSmallerLists(input, 15);
int i = 0;
foreach(var resul in result){
Console.WriteLine("------Segment:" + i.ToString() + "--------");
foreach(var res in resul){
Console.WriteLine(res);
}
i++;
}
Console.ReadLine();
}
private static List<List<T>> splitListIntoSmallerLists<T>(List<T> i_bigList,int i_numberOfSmallerLists)
{
if (i_numberOfSmallerLists <= 0)
throw new ArgumentOutOfRangeException("Illegal value of numberOfSmallLists");
int normalizedSpreadRemainderCounter = 0;
int normalizedSpreadNumber = 0;
//e.g 7 /5 > 0 ==> output size is 5 , 2 /5 < 0 ==> output is 2
int minimumNumberOfPartsInEachSmallerList = i_bigList.Count / i_numberOfSmallerLists;
int remainder = i_bigList.Count % i_numberOfSmallerLists;
int outputSize = minimumNumberOfPartsInEachSmallerList > 0 ? i_numberOfSmallerLists : remainder;
//In case remainder > 0 we want to spread the remainder equally between the others
if (remainder > 0)
{
if (minimumNumberOfPartsInEachSmallerList > 0)
{
normalizedSpreadNumber = (int)Math.Floor((double)i_numberOfSmallerLists / remainder);
}
else
{
normalizedSpreadNumber = 1;
}
}
List<List<T>> retVal = new List<List<T>>(outputSize);
int inputIndex = 0;
for (int i = 0; i < outputSize; ++i)
{
retVal.Add(new List<T>());
if (minimumNumberOfPartsInEachSmallerList > 0)
{
retVal[i].AddRange(i_bigList.GetRange(inputIndex, minimumNumberOfPartsInEachSmallerList));
inputIndex += minimumNumberOfPartsInEachSmallerList;
}
//If we have remainder take one from it, if our counter is equal to normalizedSpreadNumber.
if (remainder > 0)
{
if (normalizedSpreadRemainderCounter == normalizedSpreadNumber-1)
{
retVal[i].Add(i_bigList[inputIndex]);
remainder--;
inputIndex++;
normalizedSpreadRemainderCounter=0;
}
else
{
normalizedSpreadRemainderCounter++;
}
}
}
return retVal;
}
}
If order in these parts is not very important you can try this:
int[] array = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int n = 3;
var result =
array.Select((value, index) => new { Value = value, Index = index }).GroupBy(i => i.Index % n, i => i.Value);
// or
var result2 =
from i in array.Select((value, index) => new { Value = value, Index = index })
group i.Value by i.Index % n into g
select g;
However these can't be cast to IEnumerable<IEnumerable<int>> by some reason...
This is my code, nice and short.
<Extension()> Public Function Chunk(Of T)(ByVal this As IList(Of T), ByVal size As Integer) As List(Of List(Of T))
Dim result As New List(Of List(Of T))
For i = 0 To CInt(Math.Ceiling(this.Count / size)) - 1
result.Add(New List(Of T)(this.GetRange(i * size, Math.Min(size, this.Count - (i * size)))))
Next
Return result
End Function
This is my way, listing items and breaking row by columns
int repat_count=4;
arrItems.ForEach((x, i) => {
if (i % repat_count == 0)
row = tbo.NewElement(el_tr, cls_min_height);
var td = row.NewElement(el_td);
td.innerHTML = x.Name;
});
I was looking for a split like the one with string, so the whole List is splitted according to some rule, not only the first part, this is my solution
List<int> sequence = new List<int>();
for (int i = 0; i < 2000; i++)
{
sequence.Add(i);
}
int splitIndex = 900;
List<List<int>> splitted = new List<List<int>>();
while (sequence.Count != 0)
{
splitted.Add(sequence.Take(splitIndex).ToList() );
sequence.RemoveRange(0, Math.Min(splitIndex, sequence.Count));
}
Here is a little tweak for the number of items instead of the number of parts:
public static class MiscExctensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int nbItems)
{
return (
list
.Select((o, n) => new { o, n })
.GroupBy(g => (int)(g.n / nbItems))
.Select(g => g.Select(x => x.o))
);
}
}
below code returns both given number of chunks also with sorted data
static IEnumerable<IEnumerable<T>> SplitSequentially<T>(int chunkParts, List<T> inputList)
{
List<int> Splits = split(inputList.Count, chunkParts);
var skipNumber = 0;
List<List<T>> list = new List<List<T>>();
foreach (var count in Splits)
{
var internalList = inputList.Skip(skipNumber).Take(count).ToList();
list.Add(internalList);
skipNumber += count;
}
return list;
}
static List<int> split(int x, int n)
{
List<int> list = new List<int>();
if (x % n == 0)
{
for (int i = 0; i < n; i++)
list.Add(x / n);
}
else
{
// upto n-(x % n) the values
// will be x / n
// after that the values
// will be x / n + 1
int zp = n - (x % n);
int pp = x / n;
for (int i = 0; i < n; i++)
{
if (i >= zp)
list.Add((pp + 1));
else
list.Add(pp);
}
}
return list;
}
int[] items = new int[] { 0,1,2,3,4,5,6,7,8,9, 10 };
int itemIndex = 0;
int groupSize = 2;
int nextGroup = groupSize;
var seqItems = from aItem in items
group aItem by
(itemIndex++ < nextGroup)
?
nextGroup / groupSize
:
(nextGroup += groupSize) / groupSize
into itemGroup
select itemGroup.AsEnumerable();
Just came across this thread, and most of the solutions here involve adding items to collections, effectively materialising each page before returning it. This is bad for two reasons - firstly if your pages are large there's a memory overhead to filling the page, secondly there are iterators which invalidate previous records when you advance to the next one (for example if you wrap a DataReader within an enumerator method).
This solution uses two nested enumerator methods to avoid any need to cache items into temporary collections. Since the outer and inner iterators are traversing the same enumerable, they necessarily share the same enumerator, so it's important not to advance the outer one until you're done with processing the current page. That said, if you decide not to iterate all the way through the current page, when you move to the next page this solution will iterate forward to the page boundary automatically.
using System.Collections.Generic;
public static class EnumerableExtensions
{
/// <summary>
/// Partitions an enumerable into individual pages of a specified size, still scanning the source enumerable just once
/// </summary>
/// <typeparam name="T">The element type</typeparam>
/// <param name="enumerable">The source enumerable</param>
/// <param name="pageSize">The number of elements to return in each page</param>
/// <returns></returns>
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerable, int pageSize)
{
var enumerator = enumerable.GetEnumerator();
while (enumerator.MoveNext())
{
var indexWithinPage = new IntByRef { Value = 0 };
yield return SubPartition(enumerator, pageSize, indexWithinPage);
// Continue iterating through any remaining items in the page, to align with the start of the next page
for (; indexWithinPage.Value < pageSize; indexWithinPage.Value++)
{
if (!enumerator.MoveNext())
{
yield break;
}
}
}
}
private static IEnumerable<T> SubPartition<T>(IEnumerator<T> enumerator, int pageSize, IntByRef index)
{
for (; index.Value < pageSize; index.Value++)
{
yield return enumerator.Current;
if (!enumerator.MoveNext())
{
yield break;
}
}
}
private class IntByRef
{
public int Value { get; set; }
}
}