how can i consume this method of split List<>
private List<List<T>> SplitPartition<T>(this IEnumerable<T> collection, int size)
{
var chunks = new List<List<T>>();
var count = 0;
var temp = new List<T>();
foreach (var element in collection)
{
if (count++ == size)
{
chunks.Add(temp);
temp = new List<T>();
count = 1;
}
temp.Add(element);
}
chunks.Add(temp);
return chunks;
}
I wanna implement this list partition to my existing code:
public void ExportToCsv()
{
List<GenerateModel> members = getDataTop5000(); // I got data from my List of Data with return List<>
int offset = 0;
const int numberPerBatch = 500000; // count parameter.
double LoopMax = Math.Ceiling(members.Count / (double)numberPerBatch);
var PartitionMembers = members.SplitPartition(numberPerBatch); //error here
while (offset < LoopMax)
{
//do the index of partion here PartitionMembers
offset++;
}
}
any suggest or example how to comsume those Method? this is really what i need partition to my List. When i tried consume that method i got error like this:
List' does not contain a definition for 'SplitPartition' and no accessible extension method 'SplitPartition' accepting a first argument of type 'List' could be found (are you missing a using directive or an assembly reference?)
I think you'd be better off rolling your own solution to this. Say you had downloaded 5000 members and wanted to write them to file in 50 member chunks (100 files), you can simply do:
StringBuilder sb = new StringBuilder(10000);
int x= 0;
foreach(var m in members){
if(++x%50 == 0){
File.WriteAllText(sb.ToString(), $#"c:\temp\{x%50}.csv");
sb.Length = 0;
}
sb.AppendLine(m.ToCsvRepresentationEtc());
}
The point I'm making is not about writing to file, it's about knowing what you want to do (eg write to file) with your chunks and making a single pass of the enumerable and the cutting into chunks being done in that pass by changing what action you take every now and then. In this example changing the action is a simple modulo, that empties the buffer of the StringBuilder and writes out to a filename based on the modulo. This is preferable to burning a boatload of memory (the performance of that split routine could well be horrific depending on the numbers involved; it makes no attempt to provision any suitably sized list based on the numbers) on pre-chunking it
At the very least consider rewriting the chunking so that it uses straight 2d (jaggy) arrays or pre-capacity-provisioned lists; you know what sizes they need to be from how big the passed in List is and the chunk size:
public static class ListExtensions{
public List<List<T>> SplitPartition<T>(this IEnumerable<T> collection, int size)
{
var chunks = new List<List<T>>(collection.Count/size + 1);
var temp = new List<T>(size);
foreach (var element in collection)
{
if (temp.Count == size)
{
chunks.Add(temp);
temp = new List<T>(size);
}
temp.Add(element);
}
chunks.Add(temp);
return chunks;
}
}
Can't see your full code, but:
extension methods need to belong to static classes, and
this class and method need to be visible from the calling code.
In particular, I can see that your ExportToCsv is not static, so it doesn't belong to a static class, so I can deduce that your private extension method either:
doesn't belong to a static class, or
belongs to a separate class than your ExportToCsv method, and hence can't be seen from it
So make a public static class to hold the extension method, mark the method itself public static, and you should be in business.
More details: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/how-to-implement-and-call-a-custom-extension-method
Related
Given a class:
class clsPerson { public int x, y; }
Is there some way to create an array of these classes with each element initialized to a (default) constructed instance, without doing it manually in a for loop like:
clsPerson[] objArr = new clsPerson[1000];
for (int i = 0; i < 1000; ++i)
objArr[i] = new clsPerson();
Can I shorten the declaration and instantiation of an array of N objects?
The constructor must be run for every item in the array in this scenario. Whether or not you use a loop, collection initializers or a helper method every element in the array must be visited.
If you're just looking for a handy syntax though you could use the following
public static T[] CreateArray<T>(int count) where T : new() {
var array = new T[count];
for (var i = 0; i < count; i++) {
array[i] = new T();
}
return array;
}
clsPerson[] objArary = CreateArray<clsPerson>(1000);
You must invoke the constructor for each item. There is no way to allocate an array and invoke your class constructors on the items without constructing each item.
You could shorten it (a tiny bit) from a loop using:
clsPerson[] objArr = Enumerable.Range(0, 1000).Select(i => new clsPerson()).ToArray();
Personally, I'd still allocate the array and loop through it (and/or move it into a helper routine), though, as it's very clear and still fairly simple:
clsPerson[] objArr = new clsPerson[1000];
for (int i=0;i<1000;++i)
clsPerson[i] = new clsPerson();
If it would make sense to do so, you could change class clsPerson to struct Person. structs always have a default value.
I am trying to split collection into equal number of batches.below is the code.
public static List<List<T>> SplitIntoBatches<T>(List<T> collection, int size)
{
var chunks = new List<List<T>>();
var count = 0;
var temp = new List<T>();
foreach (var element in collection)
{
if (count++ == size)
{
chunks.Add(temp);
temp = new List<T>();
count = 1;
}
temp.Add(element);
}
chunks.Add(temp);
return chunks;
}
can we do it using Parallel.ForEach() for better performance as we have around 1 Million items in the list?
Thanks!
If performance is the concern, my thoughts (in increasing order of impact):
right-sizing the lists when you create them would save a lot of work, i.e. figure out the output batch sizes before you start copying, i.e. temp = new List<T>(thisChunkSize)
working with arrays would be more effective than working with lists - new T[thisChunkSize]
especially if you use BlockCopy (or CopyTo, which uses that internally) rather than copying individual elements one by one
once you've calculated the offsets for each of the chunks, the individual block-copies could probably be executed in parallel, but I wouldn't assume it will be faster - memory bandwidth will be the limiting factor at that point
but the ultimate fix is: don't copy the data at all, but instead just create ranges over the existing data; for example, if using arrays, ArraySegment<T> would help; if you're open to using newer .NET features, this is a perfect fit for Memory<T>/Span<T> - creating memory/span ranges over an existing array is essentially free and instant - i.e. take a T[] and return List<Memory<T>> or similar.
Even if you can't switch to ArraySegment<T> / Memory<T> etc, returning something like that could still be used - i.e. List<ListSegment<T>> where ListSegment<T> is something like:
readonly struct ListSegment<T> { // like ArraySegment<T>, but for List<T>
public List<T> List {get;}
public int Offset {get;}
public int Count {get;}
}
and have your code work with ListSegment<T> by processing the Offset and Count appropriately.
I have a List of Lists.
To do some Opertations with each of those lists, i separate the Lists by a property and set a temp List with its value;
The list can be sometimes empty.
That is why i use this function for assignment.
EDIT:
My current solution is this simple method.
It should be easily adaptable.
private List<string> setList(List<string> a, int count)
{
List < string > retr;
if(a.Capacity == 0)
{
retr = new List<string>();
for(int counter = 0; counter < count; counter++)
{
retr.Add(string.empty);
}
}
else
{
retr = a;
}
return retr;
}
Is there a better way to either take a list as values or initialize a list with element count?
Or should I implement my own "List" class that has this behavior?
You could use Enumerable.Repeat<T> if you wanted to avoid the loop:
var list = Enumerable.Repeat<string>("", count).ToList();
But there are several things that are problematic with your code:
If Capacity is not 0, it doesn't mean it's equal to your desired count. Even if it is equal to the specified count, it doesn't mean that the actual List.Count is equal to count. A safer way would be to do:
static List<string> PreallocateList(List<string> a, int count)
{
// reuse the existing list?
if (a.Count >= count)
return a;
return Enumerable.Repeat("", count).ToList();
}
Preallocating a List<T> is unusual. It's usually common to use arrays when you have a fixed length known in advance.
// this would (perhaps) make more sense
var array = new string[count];
And keep in mind, as mentioned in 1., that list's Capacity is not the same as Count:
var list = new List<string>(10);
// this will print 10
Console.WriteLine("Capacity is {0}", list.Capacity);
// but this will throw an exception
list[0] = "";
Most likely, however, this method is unnecessary and there is a better way to accomplish what you're doing. If nothing else, I would play the safe card and simply instantiate a new list each time (presuming that you have an algorithm which depends on a preallocated list):
static List<string> PreallocateList(int count)
{
return Enumerable.Repeat("", count).ToList();
}
Or, if you are only interested in having the right capacity (not count), then just use the appropriate constructor:
static List<string> PreallocateList(int count)
{
// this will prevent internal array resizing, if that's your concern
return new List<string>(count);
}
Your method is meaningless but equivalent to
static List<string> setList(List<string> a, int count) =>
a.Capacity == 0 ? Enumerable.Repeat("", count).ToList() : a;
if you want Linq.
As input i have object that implements IDataRecord(row of some abstract table), so it have indexer, and by giving it some integer i can retrive object of some type. As output my code must get some range of cells in that row as array of given type objects.
So I've written this method(yes, i know, it can be easly converted to extension method, but i don't need this, and also i don't really want to have this method visible outside of my class):
private static T[] GetRange<T>(IDataRecord row, int start, int length)
{
var result = new List<T>();
for (int i = start; i < (start + length); i++)
{
result.Add((T)row[i]);
}
return result.ToArray();
}
It works fine, but this method logic seems like something very common. So, is there any method that can give same(or almost same) result in .NET Framework FCL/BCL?
Use Skip and Take.
var rangeList = result.Skip(start - 1).Take(length);
No, it is not in the BCL.
You should however not create a List<> first and then copy that to the array. Either return the List<> itself (and construct it with the appropriate initial capacity), or create the array immediately like this:
private static T[] GetRange<T>(IDataRecord row, int start, int length)
{
var result = new T[length];
for (int i = 0; i < length; i++)
{
result[i] = (T)row[start + i];
}
return result;
}
Here is an alternative (for all you LINQ lovers):
// NB! Lazy enumeration
private static IEnumerable<T> GetRange<T>(IDataRecord row, int start, int length)
{
return Enumerable.Range(start, length).Select(i => (T)row[i]);
}
We repeat here what was stated in the comments to the question: The interface System.Data.IDataRecord (in System.Data.dll assembly) does not inherit IEnumerable<> or IEnumerable.
If you want to 'TakeARange' you should have a collection as input parameter.
Here you don't have one.
You just have a IDataRecord (eg. a single row) that has an indexer.
You should expose a property called Cells that return the list you work with in the indexer implementation.
Your method should look like this:
private static T[] TakeRange<T>(IEnumerable cells, int start, int length)
{
return cells.Skip(start - 1).Take(length)
}
Well.. Seem's like there is no such method in FCL/BCL
(edit: Slight tidy of the code.)
Using foreach like this works fine.
var a = new List<Vector2>();
a.ForEach(delegate(Vector2 b) {
b.Normalize(); });
The following however causes "No overload for method 'ForEach' takes 1 arguments".
byte[,,] a = new byte[2, 10, 10];
a.ForEach(delegate(byte b) {
b = 1; });
I would recommend you just use a normal foreach loop to transform the data. You're using a method that exists only on the List<T> implementation, but not on arrays.
Using a method for foreach really gains you nothing, unless for some reason you were wanting to do data mutation in a method chain. In that case, you may as well write your own extension method for IEnumerable<T>. I would recommend against that, though.
Having a separate foreach loop makes it clear to the reader that data mutation is occurring. It also removes the overhead of calling a delegate for each iteration of the loop. It will also work regardless of the collection type as long as it is an IEnumerable (not entirely true, you can write your own enumerators and enumerables, but that's a different question).
If you're looking to just do data transformations (i.e. projections and the like) then use LINQ.
Also keep in mind that with the array, you're getting a copy of the byte not a reference. You'll be modifying just that byte not the original. Here's an example with the output:
int[] numbers = new int[] { 1, 2, 3, 4, 5 };
Array.ForEach(numbers, number => number += 1);
foreach(int number in numbers)
{
Console.WriteLine(number);
}
Which yields the output:
1
2
3
4
5
As you can see, the number += 1 in the lambda had no effect. In fact, if you tried this in a normal foreach loop, you would get a compiler error.
You're using two different ForEach'es.
Array.ForEach in the byte[,,] example (though you're using it incorrectly) and List.ForEach in the List<...> example.
You've used syntax of List.ForEach() method for the array, but Array.ForEach() syntax is:
public static void ForEach<T>(
T[] array,
Action<T> action
)
One important point that array should be one-dimensional in order to use it in Array.ForEach(). Considering this I would suggest using simple for loop
// first dimension
for (int index = 0; index < bytesArray.GetLength(0); index++)
// second dimension
for (int index = 0; index < bytesArray.GetLength(1); index++)
// third dimension
for (int index = 0; index < bytesArray.GetLength(2); index++)
I don't know of a ForEach method that takes multi-dimensional arrays.
If you want one, i think you will have to create it yourself.
Here is how to do it:
private static void ForEach<T>(T[, ,] a, Action<T> action)
{
foreach (var item in a)
{
action(item);
}
}
Sample program, using the new ForEach method:
static class Program
{
static void Main()
{
byte[, ,] a = new byte[2, 10, 10];
ForEach(a, delegate(byte b)
{
Console.WriteLine(b);
});
}
private static void ForEach<T>(T[, ,] a, Action<T> action)
{
foreach (var item in a)
{
action(item);
}
}
}
Also, the ForEach method in the Array version, is not an instance method, it is statis method. You call it like this:
Array.ForEach(array, delegate);
raw arrays have much less instance methods than generic collections because they are not templated. These methods, such as ForEach() or Sort() are usually implemented as static methods which are themselves templated.
In this case, Array.Foreach(a, action) will do the trick for the array.
Of course, the classical foreach(var b in a) would work for both List and Array since it only requires an enumerator.
However:
I'm not sure how you'd loop over a multidimensional array.
Your assignment (b=1) won't work. Because you receive a value, not a reference.
List has the first instance method. Arrays do not.