How to page an array using LINQ? - C#

If I have an array like this:
string[] mobile_numbers = plst.Where(r => !string.IsNullOrEmpty(r.Mobile))
                              .Select(r => r.Mobile.ToString())
                              .ToArray();
I want to page this array and loop over those pages.
Say the array count is 400 and I want to take the first 20 items, then the next 20, and so on until the end of the array, processing each set of 20.
How can I do this with LINQ?

Use the Skip and Take methods for paging (but keep in mind that this will iterate the collection from the start for each page you take):
int pageSize = 20;
int pageNumber = 2;
var result = mobile_numbers.Skip(pageNumber * pageSize).Take(pageSize);
If you just need to split the array into 'pages', then consider using MoreLINQ's Batch method (available from NuGet):
var pages = mobile_numbers.Batch(pageSize);
If you don't want to pull in the whole library, take a look at the Batch method implementation, or use this extension method:
public static IEnumerable<IEnumerable<T>> Batch<T>(
    this IEnumerable<T> source, int size)
{
    T[] bucket = null;
    var count = 0;
    foreach (var item in source)
    {
        if (bucket == null)
            bucket = new T[size];
        bucket[count++] = item;
        if (count != size)
            continue;
        // Bucket is full: hand it out and start a fresh one
        yield return bucket;
        bucket = null;
        count = 0;
    }
    // The final bucket may be only partially filled
    if (bucket != null && count > 0)
        yield return bucket.Take(count).ToArray();
}
Usage:
int pageSize = 20;
foreach (var page in mobile_numbers.Batch(pageSize))
{
    foreach (var item in page)
    {
        // use items
    }
}

You need a batching operator.
There is one in MoreLINQ that you can use.
You would use it like this (for your example):
foreach (var batch in mobile_numbers.Batch(20))
    process(batch);
batch in the loop above will be an IEnumerable of at most 20 items (the last batch may be smaller than 20; all the others will contain exactly 20).

You can use .Skip(n).Take(x) to skip to the current index and take the number of items required.
Take will only return the number of items available, i.e. whatever is left, when fewer items remain than were requested.
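For instance, a paging loop built on Skip/Take might look like the sketch below (ProcessBatch is a hypothetical stand-in for whatever per-page processing you need):
int pageSize = 20;
// Ceiling division so a final partial page is not lost
int pageCount = (mobile_numbers.Length + pageSize - 1) / pageSize;
for (int page = 0; page < pageCount; page++)
{
    // Each iteration re-scans the array up to the current offset
    var batch = mobile_numbers.Skip(page * pageSize).Take(pageSize);
    ProcessBatch(batch); // hypothetical processing method
}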

Related

C# Delete one of two successive and same lines in a list

How can I delete one of two identical successive lines in a list?
For example:
load
testtest
cd /abc
cd /abc
testtest
exit
cd /abc
In this case, ONLY line three OR four. The lists have about 50,000 lines, so it is also about speed.
Do you have an idea?
Thank you!
Homeros
You just have to look at the last element added to the second list:
var secondList = new List<string>(firstList.Count) { firstList[0] };
foreach (string next in firstList.Skip(1))
    if (secondList.Last() != next)
        secondList.Add(next);
Since you wanted to delete the duplicates, you have to assign this new list to the old variable:
firstList = secondList;
This approach is more efficient than deleting from a list.
Side note: since Enumerable.Last is optimized for collections with an indexer (IList<T>), it is as efficient as secondList[secondList.Count - 1], but more readable.
Use a reverse for loop and check adjacent elements:
List<string> list = new List<string>(); // assume this holds the lines
for (int i = list.Count - 1; i > 0; i--)
{
    if (list[i] == list[i - 1])
    {
        list.RemoveAt(i);
    }
}
The reverse version is advantageous here because the list shrinks with every removed element, which would otherwise shift the indices still to be visited.
I would first split the text into a list, then use LINQ to select only the items that differ from their predecessor:
string[] source = text.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
var list = source.Select((l, idx) => new { Line = l, Index = idx })
                 .Where(x => x.Index == 0 || source[x.Index - 1] != x.Line)
                 .Select(x => x.Line)
                 .ToList(); // materialize
O(n), as an extension method:
public static IEnumerable<string> RemoveSameSuccessiveItems(this IEnumerable<string> items)
{
    string previousItem = null;
    foreach (var item in items)
    {
        // string.Equals handles null items safely
        if (!string.Equals(item, previousItem))
        {
            previousItem = item;
            yield return item;
        }
    }
}
Then use it
lines = lines.RemoveSameSuccessiveItems();

How can I divide each element in a list by an integer? C#

So I have created a list which holds doubles; is it possible to divide every element in this list by an integer variable?
List<Double> amount = new List<Double>();
Just create a new list with the modified contents:
var newAmounts = amount.Select(x => x / 10).ToList();
Creating new data is less error-prone than modifying existing data.
foreach
You can iterate over each item with foreach:
foreach (var item in amount)
{
    var result = item / 3;
}
If you want to store the results in a new list you can do it inside the loop...
var newList = new List<double>(amount.Count); // <-- set capacity for performance
foreach (var item in amount)
{
    newList.Add(item / 3);
}
LINQ
... or use LINQ to get an IEnumerable<double>:
var newList = from item in amount select item / 3;
You can also use Linq extension methods:
var newList = amount.Select(item => item / 3);
Or if you want a List<double> from Linq, you can do it with ToList():
var newList = (from item in amount select item / 3).ToList();
... or ...
var newList = amount.Select(item => item / 3).ToList();
for
As an alternative you can use a simple for:
for (int index = 0; index < amount.Count; index++)
{
    var result = amount[index] / 3;
}
This approach will allow you to do the modifications in place:
for (int index = 0; index < amount.Count; index++)
{
    amount[index] = amount[index] / 3;
}
PLINQ
You may also consider using Parallel LINQ (with AsParallel):
var newList = amount.AsParallel().Select(item => item / 3).ToList();
Warning: The result may be out of order.
This will take advantage of multicore CPUs by running the operation for each item in parallel. It is particularly good for large lists and for operations that are independent for each item.
Comparison
foreach: Easy to read and write, easy to remember. Also allows for some optimizations.
LINQ: Better if you are used to SQL; also allows for lazy execution.
for: Doing the operation in place requires less memory. Allows for more control.
PLINQ: All you love from LINQ, optimized for multiple cores. Although some caution is needed.
In case you want to modify the same instance (rather than creating a new collection), do:
for (int i = 0; i < amount.Count; ++i)
    amount[i] /= yourInt32Divisor;
Of course the simple way is to iterate the list and divide each number:
foreach (var d in amount) {
    var result = d / 3;
}
You can store the result in a new list.

Is the order of execution of Linq the reason for this catch?

I have this function to repeat a sequence:
public static List<T> Repeat<T>(this IEnumerable<T> lst, int count)
{
    if (count < 0)
        throw new ArgumentOutOfRangeException("count");
    var ret = Enumerable.Empty<T>();
    for (var i = 0; i < count; i++)
        ret = ret.Concat(lst);
    return ret.ToList();
}
Now if I do:
var d = Enumerable.Range(1, 100);
var f = d.Select(t => new Person()).Repeat(10);
int i = f.Distinct().Count();
I expect i to be 100, but it's giving me 1000! My question, strictly, is why this is happening. Shouldn't LINQ be smart enough to figure out that it's the first 100 selected persons I need to concatenate with the variable ret? I get the feeling that the Concat is being given preference over the Select when everything is executed at ret.ToList().
Edit:
If I do this I get the correct result as expected:
var f = d.Select(t => new Person()).ToList().Repeat(10);
int i = f.Distinct().Count(); //prints 100
Edit again:
I have not overridden Equals. I'm just trying to get 100 unique persons (unique by reference, of course). My question is: can someone explain why LINQ is not doing the select operation first and then the concatenation (at the time of execution, of course)?
The problem is that unless you call ToList, the d.Select(t => new Person()) is re-enumerated each time Repeat goes through the list, creating duplicate Persons. This behaviour is known as deferred execution.
In general, LINQ does not assume that each time it enumerates a sequence it will get the same sequence, or even a sequence of the same length. If this effect is not desirable, you can always "materialize" the sequence inside your Repeat method by calling ToList right away, like this:
public static List<T> Repeat<T>(this IEnumerable<T> lstEnum, int count) {
    if (count < 0)
        throw new ArgumentOutOfRangeException("count");
    var lst = lstEnum.ToList(); // Enumerate only once
    var ret = Enumerable.Empty<T>();
    for (var i = 0; i < count; i++)
        ret = ret.Concat(lst);
    return ret.ToList();
}
I can break my problem down to something less trivial:
var d = Enumerable.Range(1, 100);
var f = d.Select(t => new Person());
Now essentially I am doing this:
f = f.Concat(f);
Mind you, the query hasn't been executed up to this point. At the time of execution f is still the unexecuted d.Select(t => new Person()). So the last statement can, at execution time, be broken down to:
f = f.Concat(f);
// which is
f = d.Select(t => new Person()).Concat(d.Select(t => new Person()));
which obviously creates 100 + 100 = 200 new instances of Person. So
f.Distinct().ToList(); //yields 200, not 100
which is the correct behaviour.
Edit: I could rewrite the extension method as simply as:
public static IEnumerable<T> Repeat<T>(this IEnumerable<T> source, int times)
{
    source = source.ToArray(); // materialize once so every repetition reuses the same items
    return Enumerable.Range(0, times).SelectMany(_ => source);
}
I used dasblinkenlight's suggestion to fix the issue.
Each Person object is a separate object. All 1000 are distinct.
What is the definition of equality for the Person type? If you don't override it, that definition will be reference equality, meaning all 1000 objects are distinct.
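For illustration, a minimal sketch of what value-based equality could look like, if you did want Distinct to treat equal-valued Persons as duplicates (Name is an assumed example member):
public class Person
{
    public string Name { get; set; } // assumed example member

    public override bool Equals(object obj) =>
        obj is Person other && Name == other.Name;

    // Equals and GetHashCode must agree for Distinct to behave correctly
    public override int GetHashCode() => Name?.GetHashCode() ?? 0;
}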

Processing collection in sets

I have a C# generic list collection of customer Ids [customerIdsList]. Let's say its count is 25.
I need to pass these Ids in sets of 10 [a value which would be configurable and read from app.config]
to another method, ProcessCustomerIds(), which would process the customer Ids one by one.
I.e. the first iteration will pass 10, the next will pass the next 10 customer Ids, and the last one will pass the remaining 5 Ids... and so on and so forth.
How do I achieve this using LINQ?
Should I be using Math.DivRem to do this?
int result = 0;
int quotient = Math.DivRem(customerIdsList.Count, 10, out result);
Output:
quotient=2
result=5
So I will iterate customerIdsList 2 times and invoke ProcessCustomerIds() in each step.
And if the result value is greater than 0, then I will do customerIdsList.Skip(25 - result) to get the last 5 customerIds from the collection.
Is there any other cleaner, more efficient way to do this? Please advise.
In our project, we have an extension method "Slice" which does exactly what you ask. It looks like this:
public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> list, int size)
{
    var slice = new List<T>();
    foreach (T item in list)
    {
        slice.Add(item);
        if (slice.Count >= size)
        {
            yield return slice;
            slice = new List<T>();
        }
    }
    // Hand out the final, possibly smaller slice
    if (slice.Count > 0) yield return slice;
}
You use it like this:
customerIdsList.Slice(10).ToList().ForEach(ProcessCustomerIds);
An important feature of this implementation is that it supports deferred execution (contrary to an approach using GroupBy). Granted, this doesn't matter most of the time, but sometimes it does.
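Since the question mentions reading the set size from app.config, here is a minimal sketch of that part, assuming an appSettings key named "BatchSize", the Slice method above, and a reference to System.Configuration:
using System.Configuration;

// Assumed app.config entry: <add key="BatchSize" value="10" />
int batchSize = int.Parse(ConfigurationManager.AppSettings["BatchSize"] ?? "10");
customerIdsList.Slice(batchSize).ToList().ForEach(ProcessCustomerIds);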
You could always use this to group the collection:
var n = 10;
var groups = customerIdsList
    .Select((id, index) => new { id, index = index / n })
    .GroupBy(x => x.index);
Then just run through the groups and issue the members of the group to the server one group at a time.
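For instance, a sketch of consuming those groups, assuming ProcessCustomerIds accepts a list of ids:
foreach (var grp in groups)
{
    // Each group holds up to n consecutive ids; Select strips the index wrapper
    ProcessCustomerIds(grp.Select(x => x.id).ToList());
}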
Yes, you can use Skip and Take methods.
For example:
List<MyObject> list = ...;
int pageSize = 10;
// Round up so the final partial page is included
int pageCount = (list.Count + pageSize - 1) / pageSize;
for (int i = 0; i < pageCount; i++)
{
    int currentItem = i * pageSize;
    var query = (from obj in list orderby obj.Id select obj).Skip(currentItem).Take(pageSize);
    // call method
}
Remember to order the list if you want to use Skip and Take.
A simple extension:
public static class Extensions
{
    public static IEnumerable<IEnumerable<T>> Chunks<T>(this List<T> source, int size)
    {
        for (int i = 0; i < source.Count; i += size)
        {
            // Take clamps to what is left, so the last chunk may be smaller than size
            yield return source.Skip(i).Take(size);
        }
    }
}
}
And then use it like:
var chunks = customerIdsList.Chunks(10);
foreach (var c in chunks)
{
    ProcessCustomerIds(c);
}

fastest way to remove an item in a list

I have a list of User objects, and I have to remove ONE item from the list with a specific UserID.
This method has to be as fast as possible. Currently I am looping through each item and checking whether the ID matches the UserID; if not, I add the row to my filteredItems collection.
List<User> allItems = GetItems();
for (int x = 0; x < allItems.Count; x++)
{
    if (specialUserID == allItems[x].ID)
        continue;
    else
        filteredItems.Add(allItems[x]);
}
If it really has to be as fast as possible, use a different data structure. List isn't known for efficiency of deletion. How about a Dictionary that maps ID to User?
Well, if you want to create a new collection to leave the original untouched, you have to loop through all the items.
Create the new list with the right capacity from the start, that minimises allocations.
Your program logic with the continue seems a bit backwards... just use the != operator instead of the == operator:
List<User> allItems = GetItems();
List<User> filteredItems = new List<User>(allItems.Count - 1);
foreach (User u in allItems) {
    if (u.ID != specialUserID) {
        filteredItems.Add(u);
    }
}
If you want to change the original collection instead of creating a new one, storing the items in a Dictionary<int, User> would be the fastest option. Both locating the item and removing it are close to O(1) operations, so that would make the whole operation close to O(1) instead of O(n).
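A minimal sketch of that approach, assuming User has an int ID property and the IDs are unique:
// Build the dictionary once, keyed by ID
Dictionary<int, User> usersById = GetItems().ToDictionary(u => u.ID);

// Removal by key is close to O(1)
usersById.Remove(specialUserID);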
Use a hashtable. Lookup time is O(1) for everything assuming a good hash algorithm with minimal collision potential. I would recommend something that implements IDictionary
If you must transfer from one list to another, here is the fastest approach I've found:
var filtered = new List<SomeClass>(allItems);
for (int i = 0; i < filtered.Count; i++)
    if (filtered[i].id == 9999)
        filtered.RemoveAt(i);
I tried comparing your method, the method above, and a LINQ Where statement:
var allItems = new List<SomeClass>();
for (int i = 0; i < 10000000; i++)
    allItems.Add(new SomeClass() { id = i });

Console.WriteLine("Tests Started");
var timer = new Stopwatch();

timer.Start();
var filtered = new List<SomeClass>();
foreach (var item in allItems)
    if (item.id != 9999)
        filtered.Add(item);
var y = filtered.Last();
timer.Stop();
Console.WriteLine("Transfer to filtered list: {0}", timer.Elapsed.TotalMilliseconds);

timer.Reset();
timer.Start();
filtered = new List<SomeClass>(allItems);
for (int i = 0; i < filtered.Count; i++)
    if (filtered[i].id == 9999)
        filtered.RemoveAt(i);
var s = filtered.Last();
timer.Stop();
Console.WriteLine("Removal from filtered list: {0}", timer.Elapsed.TotalMilliseconds);

timer.Reset();
timer.Start();
var linqresults = allItems.Where(x => (x.id != 9999));
var m = linqresults.Last();
timer.Stop();
Console.WriteLine("linq list: {0}", timer.Elapsed.TotalMilliseconds);
The results were as follows:
Tests Started
Transfer to filtered list: 610.5473
Removal from filtered list: 207.5675
linq list: 379.4382
using the "Add(someCollection)" and using a ".RemoveAt" was a good deal faster.
Also, subsequent .RemoveAt calls are pretty cheap.
I know it's not the fastest, but what about the generic list's Remove() method (see MSDN)? Does anybody know how it performs compared to, e.g., the example in the question?
Here's a thought: how about you don't remove it per se? What I mean is something like this:
public static IEnumerable<T> LoopWithExclusion<T>(this IEnumerable<T> list, Func<T, bool> excludePredicate)
{
    foreach (var item in list)
    {
        if (excludePredicate(item))
        {
            continue;
        }
        yield return item;
    }
}
The point being: whenever you need a "filtered" list, just call this extension method, which loops through the original list and returns all of the items EXCEPT the ones you don't want.
Something like this:
List<User> users = GetUsers();
//later in the code when you need the filtered list:
foreach (var user in users.LoopWithExclusion(u => u.Id == myIdToExclude))
{
    //do what you gotta do
}
Assuming the count of the list is even, I would:
(a) get the number of processors,
(b) divide the list into equal chunks, one per processor,
(c) spawn a thread per processor over those chunks, terminating all of them as soon as one thread finds a match for the predicate (signalled via a boolean flag).
public static void RemoveSingle<T>(this List<T> items, Predicate<T> match)
{
    int i = items.FindIndex(match);
    if (i >= 0)
    {
        // Swap-remove: overwrite the match with the last element, then trim
        // the tail. O(1) removal, but the list order is not preserved.
        items[i] = items[items.Count - 1];
        items.RemoveAt(items.Count - 1);
    }
}
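A quick usage sketch, with names assumed from the question:
var users = GetItems();
// Removes the single matching user; note that the list order is not preserved
users.RemoveSingle(u => u.ID == specialUserID);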
I cannot understand why the easiest, most straightforward and obvious solution (also the fastest among the List-based ones) wasn't given by anyone.
This code removes ONE item with a matching ID.
for (int i = 0; i < items.Count; i++) {
    if (items[i].ID == specialUserID) {
        items.RemoveAt(i);
        break;
    }
}
If you have a list and you want to mutate it in place to remove an item matching a condition the following is faster than any of the alternatives posted so far:
for (int i = allItems.Count - 1; i >= 0; i--)
    if (allItems[i].id == 9999)
        allItems.RemoveAt(i);
A Dictionary may be faster for some uses, but don't discount a List. For small collections it will likely be faster, and for large collections it may save memory, which may in turn make your application faster overall. Profiling is the only way to determine which is faster in a real application.
Here is some code that is efficient if you have hundreds or thousands of items:
List<User> allItems = GetItems();
//Choose the correct loop here: the unrolled version requires the
//count to be at least 5 and divisible by 5
if ((allItems.Count % 5) == 0 && (allItems.Count >= 5))
{
    for (int x = 0; x < allItems.Count; x = x + 5)
    {
        if (specialUserID != allItems[x].ID)
            filteredItems.Add(allItems[x]);
        if (specialUserID != allItems[x + 1].ID)
            filteredItems.Add(allItems[x + 1]);
        if (specialUserID != allItems[x + 2].ID)
            filteredItems.Add(allItems[x + 2]);
        if (specialUserID != allItems[x + 3].ID)
            filteredItems.Add(allItems[x + 3]);
        if (specialUserID != allItems[x + 4].ID)
            filteredItems.Add(allItems[x + 4]);
    }
}
Start by testing whether the size of the list is divisible by the largest unroll factor, and work down to the smallest. If you want 10 if statements in the loop, test whether the size of the list is greater than ten and divisible by ten, then go down from there. For example, if you have 99 items, you can use 9 if statements in the loop; the loop will iterate 11 times instead of 99 times.
"if" statements are cheap and fast.
