Parallel iteration in C#?

Is there a way to do foreach style iteration over parallel enumerables in C#? For subscriptable lists, I know one could use a regular for loop iterating an int over the index range, but I really prefer foreach to for for a number of reasons.
Bonus points if it works in C# 2.0

.NET 4's BlockingCollection makes this pretty easy. Create a BlockingCollection and return its GetConsumingEnumerable() from the method that exposes the enumerable. The producer's foreach then simply adds items to the blocking collection.
E.g.
private BlockingCollection<T> m_data = new BlockingCollection<T>();
public IEnumerable<T> GetData( IEnumerable<IEnumerable<T>> sources )
{
Task.Factory.StartNew( () => ParallelGetData( sources ) );
return m_data.GetConsumingEnumerable();
}
private void ParallelGetData( IEnumerable<IEnumerable<T>> sources )
{
foreach( var source in sources )
{
foreach( var item in source )
{
m_data.Add( item );
}
}
//Adding complete, the enumeration can stop now
m_data.CompleteAdding();
}
Hope this helps.
BTW I posted a blog about this last night
Andre

Short answer, no. foreach works on only one enumerable at a time.
However, if you combine your parallel enumerables into a single one, you can foreach over the combined enumerable. I am not aware of any easy, built-in method of doing this, but the following should work (though I have not tested it):
public IEnumerable<TSource[]> Combine<TSource>(params object[] sources)
{
foreach(var o in sources)
{
// Choose your own exception
if(!(o is IEnumerable<TSource>)) throw new Exception();
}
var enums =
sources.Select(s => ((IEnumerable<TSource>)s).GetEnumerator())
.ToArray();
while(enums.All(e => e.MoveNext()))
{
yield return enums.Select(e => e.Current).ToArray();
}
}
Then you can foreach over the returned enumerable:
foreach(var v in Combine(en1, en2, en3))
{
// Remembering that v is an array of the type contained in en1,
// en2 and en3.
}

Zooba's answer is good, but you might also want to look at the answers to "How to iterate over two arrays at once".

I wrote an implementation of EachParallel() from the .NET4 Parallel library. It is compatible with .NET 3.5: Parallel ForEach Loop in C# 3.5
Usage:
string[] names = { "cartman", "stan", "kenny", "kyle" };
names.EachParallel(name =>
{
try
{
Console.WriteLine(name);
}
catch { /* handle exception */ }
});
Implementation:
/// <summary>
/// Enumerates through each item in a list in parallel
/// </summary>
public static void EachParallel<T>(this IEnumerable<T> list, Action<T> action)
{
// enumerate the list so it can't change during execution
list = list.ToArray();
var count = list.Count();
if (count == 0)
{
return;
}
else if (count == 1)
{
// if there's only one element, just execute it
action(list.First());
}
else
{
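// Launch each method on its own thread-pool thread
const int MaxHandles = 64;
// Step through the list in chunks; this also covers lists shorter than
// MaxHandles and the final partial chunk
for (var offset = 0; offset * MaxHandles < count; offset++)
{
// break up the list into 64-item chunks because of a limitation
// in WaitHandle.WaitAll (at most 64 handles per call)
var chunk = list.Skip(offset * MaxHandles).Take(MaxHandles);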
// Initialize the reset events to keep track of completed threads
var resetEvents = new ManualResetEvent[chunk.Count()];
// spawn a thread for each item in the chunk
int i = 0;
foreach (var item in chunk)
{
resetEvents[i] = new ManualResetEvent(false);
ThreadPool.QueueUserWorkItem(new WaitCallback((object data) =>
{
int methodIndex = (int)((object[])data)[0];
// Execute the method and pass in the enumerated item
action((T)((object[])data)[1]);
// Tell the calling thread that we're done
resetEvents[methodIndex].Set();
}), new object[] { i, item });
i++;
}
// Wait for all threads to execute
WaitHandle.WaitAll(resetEvents);
}
}
}

If you want to stick to the basics - I rewrote the currently accepted answer in a simpler way:
public static IEnumerable<TSource[]> Combine<TSource> (this IEnumerable<IEnumerable<TSource>> sources)
{
var enums = sources
.Select (s => s.GetEnumerator ())
.ToArray ();
while (enums.All (e => e.MoveNext ())) {
yield return enums.Select (e => e.Current).ToArray ();
}
}
public static IEnumerable<TSource[]> Combine<TSource> (params IEnumerable<TSource>[] sources)
{
return sources.Combine ();
}

Would this work for you?
public static class Parallel
{
public static void ForEach<T>(IEnumerable<T>[] sources,
Action<T> action)
{
foreach (var enumerable in sources)
{
ThreadPool.QueueUserWorkItem(source => {
foreach (var item in (IEnumerable<T>)source)
action(item);
}, enumerable);
}
}
}
// sample usage:
static void Main()
{
string[] s1 = { "1", "2", "3" };
string[] s2 = { "4", "5", "6" };
IEnumerable<string>[] sources = { s1, s2 };
Parallel.ForEach(sources, s => Console.WriteLine(s));
Thread.Sleep(0); // allow background threads to work
}
For C# 2.0, you need to convert the lambda expressions above to delegates.
Note: This utility method uses background threads. You may want to modify it to use foreground threads, and probably you'll want to wait till all threads finish. If you do that, I suggest you create sources.Length - 1 threads, and use the current executing thread for the last (or first) source.
(I wish I could include waiting for threads to finish in my code, but I'm sorry that I don't know how to do that yet. I guess you should use a WaitHandle or Thread.Join().)
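One way the missing wait could be added is sketched below, using dedicated threads and Thread.Join() rather than the thread pool. The names ParallelSources and ForEachAndWait are made up for this illustration and are not part of the answer above or of any library. The sketch assumes the action is safe to call from several threads at once and does not throw.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ParallelSources
{
    // Hypothetical helper: runs one foreground thread per source and
    // returns only after every source has been fully enumerated.
    public static void ForEachAndWait<T>(IEnumerable<T>[] sources, Action<T> action)
    {
        var threads = new List<Thread>();
        foreach (var enumerable in sources)
        {
            // Copy the loop variable so each thread captures its own source
            // (required before C# 5, harmless afterwards).
            var source = enumerable;
            var thread = new Thread(() =>
            {
                foreach (var item in source)
                    action(item);
            });
            threads.Add(thread);
            thread.Start();
        }

        // Join blocks until the corresponding thread has finished.
        foreach (var thread in threads)
            thread.Join();
    }
}
```

With this variant the Thread.Sleep(0) in the sample usage is no longer needed, because the call returns only once all items have been processed.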

Related

Permutation algorithm Optimization

I have this permutation code working perfectly, but it does not generate the output fast enough. I need help optimizing the code to run faster. It is important that the result remains the same: I have seen other algorithms, but they don't take into consideration the output length and repetition of the same character, all of which are valid output. If this could be converted into a for loop over 28 alphanumeric characters, that would be awesome. Below is the current code I am looking to optimize.
namespace CSharpPermutations
{
public interface IPermutable<T>
{
ISet<T> GetRange();
}
public class Digits : IPermutable<int>
{
public ISet<int> GetRange()
{
ISet<int> set = new HashSet<int>();
for (int i = 0; i < 10; ++i)
set.Add(i);
return set;
}
}
public class AlphaNumeric : IPermutable<char>
{
public ISet<char> GetRange()
{
ISet<char> set = new HashSet<char>();
set.Add('0');
set.Add('1');
set.Add('2');
set.Add('3');
set.Add('4');
set.Add('5');
set.Add('6');
set.Add('7');
set.Add('8');
set.Add('9');
set.Add('a');
set.Add('b');
return set;
}
}
public class PermutationGenerator<T,P> : IEnumerable<string>
where P : IPermutable<T>, new()
{
public PermutationGenerator(int number)
{
this.number = number;
this.range = new P().GetRange();
}
public IEnumerator<string> GetEnumerator()
{
foreach (var item in Permutations(0,0))
{
yield return item.ToString();
}
}
IEnumerator IEnumerable.GetEnumerator()
{
foreach (var item in Permutations(0,0))
{
yield return item;
}
}
private IEnumerable<StringBuilder> Permutations(int n, int k)
{
if (n == number)
yield return new StringBuilder();
foreach (var element in range.Skip(k))
{
foreach (var result in Permutations(n + 1, k + 1))
{
yield return new StringBuilder().Append(element).Append(result);
}
}
}
private int number;
private ISet<T> range;
}
class MainClass
{
public static void Main(string[] args)
{
foreach (var element in new PermutationGenerator<char, AlphaNumeric>(2))
{
Console.WriteLine(element);
}
}
}
}
Thanks for your effort in advance.

What you're outputting there is the cartesian product of two sets; the first set is the characters "0123456789ab" and the second set is the characters "123456789ab".
Eric Lippert wrote a well-known article demonstrating how to use Linq to solve this.
We can apply this to your problem like so:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Demo;
static class Program
{
static void Main(string[] args)
{
char[][] source = new char[2][];
source[0] = "0123456789ab".ToCharArray();
source[1] = "0123456789ab".ToCharArray();
foreach (var perm in Combine(source))
{
Console.WriteLine(string.Concat(perm));
}
}
public static IEnumerable<IEnumerable<T>> Combine<T>(IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item }));
}
}
You can extend this to 28 characters by modifying the source data:
source[0] = "0123456789abcdefghijklmnopqr".ToCharArray();
source[1] = "0123456789abcdefghijklmnopqr".ToCharArray();
If you want to know how this works, read Eric Lippert's excellent article, which I linked above.

Consider
foreach (var result in Permutations(n + 1, k + 1))
{
yield return new StringBuilder().Append(element).Append(result);
}
Permutations is a recursive function implemented as an iterator. So each call to MoveNext() on the outer iterator advances one step of its loop, which in turn calls MoveNext() on the nested iterator, and so on, resulting in N calls to MoveNext(), new StringBuilder(), Append(), etc. for every result. This is quite inefficient.
I also cannot see that StringBuilder gives any advantage here. It is a benefit if you concatenate many strings, but as far as I can see you only add two strings together.
The first thing you should do is add code to measure the performance, or even better, use a profiler. That way you can tell whether any change actually improves the situation.
The second change I would try is to rewrite the recursion as an iterative implementation. This probably means that you need to keep track of an explicit stack of the numbers to process. Or, if this is too difficult, stop using iterator blocks and let the recursive method take a list that it adds results to.
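As a rough illustration of the iterative rewrite, here is a minimal sketch that counts through the index positions like an odometer instead of recursing. It assumes the full Cartesian product is wanted (every position may use every character, as in the LINQ answer above) and that length is not negative. The names OdometerProduct and Words are invented for this example.

```csharp
using System.Collections.Generic;

public static class OdometerProduct
{
    // Yields every string of the given length over the alphabet,
    // with the rightmost position changing fastest.
    public static IEnumerable<string> Words(string alphabet, int length)
    {
        // Nothing can be produced from an empty alphabet.
        if (length > 0 && alphabet.Length == 0) yield break;

        var indices = new int[length];   // current index into alphabet per position
        var buffer = new char[length];   // current word, reused between results
        for (int i = 0; i < length; i++) buffer[i] = alphabet[0];

        while (true)
        {
            yield return new string(buffer);

            // Advance the odometer: roll over trailing positions that are
            // already at the last character, then bump the next one.
            int pos = length - 1;
            while (pos >= 0 && indices[pos] == alphabet.Length - 1)
            {
                indices[pos] = 0;
                buffer[pos] = alphabet[0];
                pos--;
            }
            if (pos < 0) yield break;    // every position rolled over: done
            indices[pos]++;
            buffer[pos] = alphabet[indices[pos]];
        }
    }
}
```

Each result costs one string allocation and a few array writes, with no nested iterators. Whether that is actually faster for your inputs is something to confirm by measuring.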

Best Way to compare 1 million List of object with another 1 million List of object in c#

I am computing the difference between one list of 1 million objects and another list of 1 million objects.
I am using for and foreach, but it takes too much time to iterate over those lists.
Can anyone suggest the best way to do this?
var SourceList = new List<object>(); //one million
var TargetList = new List<object>(); // one million
//getting data from database here
//SourceList with List of one million
//TargetList with List of one million
var DifferentList = new List<object>();
//ForEach
SourceList.ToList().ForEach(m =>
{
if (!TargetList.Any(s => s.Name == m.Name))
DifferentList.Add(m);
});
//for
for (int i = 0; i < SourceList.Count; i++)
{
if (!TargetList.Any(s => s.Name == SourceList[i].Name))
DifferentList.Add(SourceList[i]);
}

I think this seems like a bad idea, but a little IEnumerable magic will help you.
For starters, simplify your expression. It looks like this:
var result = sourceList.Where(s => !targetList.Any(t => t.Equals(s)));
I recommend making a comparison in the Equals method:
public class CompareObject
{
public string prop { get; set; }
public new bool Equals(object o)
{
if (o.GetType() == typeof(CompareObject))
return this.prop == ((CompareObject)o).prop;
return this.GetHashCode() == o.GetHashCode();
}
}
Next add AsParallel. This can both speed up and slow down your program. In your case, you can add ...
var result = sourceList.AsParallel().Where(s => !targetList.Any(t => t.Equals(s)));
The CPU will be 100% loaded if you try to enumerate everything at once, like this:
var cnt = result.Count();
But it is quite tolerable if you fetch the results in small portions.
result.Skip(10000).Take(10000).ToList();
Full code:
static Random random = new Random();
public class CompareObject
{
public string prop { get; private set; }
public CompareObject()
{
prop = random.Next(0, 100000).ToString();
}
public new bool Equals(object o)
{
if (o.GetType() == typeof(CompareObject))
return this.prop == ((CompareObject)o).prop;
return this.GetHashCode() == o.GetHashCode();
}
}
void Main()
{
var sourceList = new List<CompareObject>();
var targetList = new List<CompareObject>();
for (int i = 0; i < 10000000; i++)
{
sourceList.Add(new CompareObject());
targetList.Add(new CompareObject());
}
var stopWatch = new Stopwatch();
stopWatch.Start();
var result = sourceList.AsParallel().Where(s => !targetList.Any(t => t.Equals(s)));
var lr = result.Skip(10000).Take(10000).ToList();
stopWatch.Stop();
Console.WriteLine(stopWatch.Elapsed);
}
Update
I remembered that you can use a Hashtable. Collect the unique values from targetList, then go through the unique values of sourceList and add to the result those that are not in targetList.
Example:
static Random random = new Random();
public class CompareObject
{
public string prop { get; private set; }
public CompareObject()
{
prop = random.Next(0, 1000000).ToString();
}
public new int GetHashCode() {
return prop.GetHashCode();
}
}
void Main()
{
var sourceList = new List<CompareObject>();
var targetList = new List<CompareObject>();
for (int i = 0; i < 10000000; i++)
{
sourceList.Add(new CompareObject());
targetList.Add(new CompareObject());
}
var stopWatch = new Stopwatch();
stopWatch.Start();
var sourceHashtable = new Hashtable();
var targetHashtable = new Hashtable();
foreach (var element in targetList)
{
var hash = element.GetHashCode();
if (!targetHashtable.ContainsKey(hash))
targetHashtable.Add(element.GetHashCode(), element);
}
var result = new List<CompareObject>();
foreach (var element in sourceList)
{
var hash = element.GetHashCode();
if (!sourceHashtable.ContainsKey(hash))
{
sourceHashtable.Add(hash, element);
if(!targetHashtable.ContainsKey(hash)) {
result.Add(element);
}
}
}
stopWatch.Stop();
Console.WriteLine(stopWatch.Elapsed);
}

Scanning the target list to match the name is an O(n) operation, thus your loop is O(n^2). If you build a HashSet<string> of all the distinct names in the target list, you can check whether a name exists in the set in O(1) time using the Contains method.
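A minimal sketch of that idea follows. The Item class is a hypothetical stand-in for the question's objects (the question only shows that they have a Name), and ListDiff.NotInTarget is a name invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the objects loaded from the database.
public class Item
{
    public string Name { get; set; }
}

public static class ListDiff
{
    // Returns the items of source whose Name does not occur in target.
    public static List<Item> NotInTarget(List<Item> source, List<Item> target)
    {
        // One pass over target builds the set of names: O(n).
        var targetNames = new HashSet<string>(StringComparer.Ordinal);
        foreach (var t in target)
            targetNames.Add(t.Name);

        // One pass over source with O(1) expected lookups: O(n) overall.
        var different = new List<Item>();
        foreach (var s in source)
        {
            if (!targetNames.Contains(s.Name))
                different.Add(s);
        }
        return different;
    }
}
```

The ordinal comparer is an assumption; if names should match case-insensitively, StringComparer.OrdinalIgnoreCase could be passed instead.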

//getting data from database here
You are getting the data out of a system that specializes in matching and sorting and filtering data, into your RAM that by default cannot yet do that task at all. And then you try to sort, filter and match yourself.
That will fail. No matter how hard you try, it is extremely unlikely that your computer with a single programmer working at a matching algorithm will outperform your specialized piece of hardware called a database server at the one operation this software is supposed to be really good at that was programmed by teams of experts and optimized for years.
You don't go into a fancy restaurant and ask them to give you huge bags of raw ingredients so you can throw them into a big bowl unpeeled and microwave them at home. No. You order a nice dish because it will be way better than anything you could do yourself.
The simple answer is: Do not do that. Do not take the raw data and rummage around in it for hours. Leave that job to the database. It's the one thing it's supposed to be good at. Use its power. Write a query that will give you the result, don't get the raw data and then play database yourself.

foreach over a List<T> does a little extra work on each iteration (the enumerator checks that the list has not been modified), so a standard for loop may provide slightly better performance.
If it is taking too long, can you break down the collection into smaller sets and/or process them in parallel?
You could also look at PLINQ (Parallel LINQ) using .AsParallel().
Other areas to improve are the actual comparison logic that you are using and how the data is stored in memory. Depending on your problem, you may not have to load the entire object into memory for every iteration.
Please provide a code example so that we can assist further. When such large amounts of data are involved, performance degradation is to be expected.
Again depending on the time that we are talking about here, you could upload the data into a database and use that for the comparison rather than trying to do it natively in C#, this type of solution is better suited to data sets that are already in a database or where the data changes much less frequently than the times you need to perform the comparison.

How do I order this list of site URLs in C#?

I have a list of site URLs,
/node1
/node1/sub-node1
/node2
/node2/sub-node1
The list is given to me in a random order. I need to order it so that the top level is first, followed by sub-levels and so on (because I cannot create /node2/sub-node1 without /node2 existing). Is there a clean way to do this?
Right now I'm just making a recursive call, saying if I can't create sub-node1 because node2 doesn't exist, create node2. I'd like to have the order of the list determine the creation and get rid of my recursive call.

My first thought was ordering by length of the string... but then I thought of a list like this, that might include something like aliases for short names:
/longsitename/
/a
/a/b/c/
/a
/a/b/
/otherlongsitename/
... and I thought a better option was to order by the number of level-separator characters first:
IEnumerable<string> SortURLs(IEnumerable<string> urls)
{
return urls.OrderBy(s => s.Count(c => c == '/')).ThenBy(s => s);
}
Then I thought about it some more and I saw this line in your question:
I cannot create /node2/sub-node1 without /node2 existing
Aha! The order of sections or within a section does not really matter, as long as children are always listed after parents. With that in mind, my original thought was okay and ordering by length of the string alone should be just fine:
IEnumerable<string> SortURLs(IEnumerable<string> urls)
{
return urls.OrderBy(s => s.Length);
}
Which led me at last to wondering why I cared about the length at all. If I just sort the strings, regardless of length, strings with the same beginning will always sort the shorter string first. Thus, at last:
IEnumerable<string> SortURLs(IEnumerable<string> urls)
{
return urls.OrderBy(s => s);
}
I'll leave the first sample up because it may be useful if, at some point in the future, you need a more lexical or logical sort order.

Is there a clean way to do this?
Just sorting the list of URIs using a standard string sort should get you what you need. In general, "a" will order before "aa" in a string sort, so "/node1" should end up before "/node1/sub-node".
For example:
List<string> test = new List<string> { "/node1/sub-node1", "/node2/sub-node1", "/node1", "/node2" };
foreach(var uri in test.OrderBy(s => s))
Console.WriteLine(uri);
This will print:
/node1
/node1/sub-node1
/node2
/node2/sub-node1

Perhaps this works for you:
var nodes = new[] { "/node1", "/node1/sub-node1", "/node2", "/node2/sub-node1" };
var orderedNodes = nodes
.Select(n => new { Levels = Path.GetFullPath(n).Split('\\').Length, Node = n })
.OrderBy(p => p.Levels).ThenBy(p => p.Node);
Result:
foreach(var nodeInfo in orderedNodes)
{
Console.WriteLine("Path:{0} Depth:{1}", nodeInfo.Node, nodeInfo.Levels);
}
Path:/node1 Depth:2
Path:/node2 Depth:2
Path:/node1/sub-node1 Depth:3
Path:/node2/sub-node1 Depth:3

var values = new string[]{"/node1", "/node1/sub-node1" ,"/node2", "/node2/sub-node1"};
foreach(var val in values.OrderBy(e => e))
{
Console.WriteLine(val);
}

The best option is natural sorting, since your strings mix text and numbers. With a plain string sort, if you have a list like this example:
List<string> test = new List<string> { "/node1/sub-node1" ,"/node13","/node10","/node2/sub-node1", "/node1", "/node2" };
the output will be:
/node1
/node1/sub-node1
/node10
/node13
/node2
/node2/sub-node1
which is not in numeric order (node10 and node13 come before node2).
You can look at this Implementation
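Since the linked implementation is not reproduced here, the following is only a small sketch of what a natural-order comparer might look like, not that implementation. NaturalComparer is a name chosen for this example. It assumes ASCII digits only, compares runs of digits by numeric value, and compares everything else character by character.

```csharp
using System.Collections.Generic;

// Hypothetical comparer: digit runs compare by numeric value,
// all other characters compare by their ordinal value.
public sealed class NaturalComparer : IComparer<string>
{
    private static bool IsAsciiDigit(char c) { return c >= '0' && c <= '9'; }

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            if (IsAsciiDigit(x[i]) && IsAsciiDigit(y[j]))
            {
                // Skip leading zeros, then find the end of each digit run.
                int si = i; while (si < x.Length && x[si] == '0') si++;
                int sj = j; while (sj < y.Length && y[sj] == '0') sj++;
                int ei = si; while (ei < x.Length && IsAsciiDigit(x[ei])) ei++;
                int ej = sj; while (ej < y.Length && IsAsciiDigit(y[ej])) ej++;

                // More significant digits means a larger number.
                int lengthDiff = (ei - si) - (ej - sj);
                if (lengthDiff != 0) return lengthDiff;

                // Same number of digits: compare digit by digit.
                for (int k = 0; k < ei - si; k++)
                {
                    int d = x[si + k] - y[sj + k];
                    if (d != 0) return d;
                }
                i = ei;
                j = ej;
            }
            else
            {
                if (x[i] != y[j]) return x[i] - y[j];
                i++;
                j++;
            }
        }
        // The string with characters left over sorts last.
        return (x.Length - i) - (y.Length - j);
    }
}
```

It can be passed to List<T>.Sort or OrderBy, for example test.OrderBy(s => s, new NaturalComparer()). A parent path is a prefix of its children, so it still sorts before them.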

If you mean you need all the first level nodes before all the second level nodes, sort by the number of slashes /:
string[] array = {"/node1","/node1/sub-node1", "/node2", "/node2/sub-node1"};
array = array.OrderBy(s => s.Count(c => c == '/')).ToArray();
foreach(string s in array)
System.Console.WriteLine(s);
Result:
/node1
/node2
/node1/sub-node1
/node2/sub-node1
If you just need parent nodes before child nodes, it doesn't get much simpler than
Array.Sort(array);
Result:
/node1
/node1/sub-node1
/node2
/node2/sub-node1

Recursion is actually exactly what you should use, since this is most easily represented by a tree structure.
public class PathNode {
public readonly string Name;
private readonly IDictionary<string, PathNode> _children;
public PathNode(string name) {
Name = name;
_children = new Dictionary<string, PathNode>(StringComparer.InvariantCultureIgnoreCase);
}
public PathNode AddChild(string name) {
PathNode child;
if (_children.TryGetValue(name, out child)) {
return child;
}
child = new PathNode(name);
_children.Add(name, child);
return child;
}
public void Traverse(Action<PathNode> action) {
action(this);
foreach (var pathNode in _children.OrderBy(kvp => kvp.Key)) {
pathNode.Value.Traverse(action);
}
}
}
Which you can then use like this:
var root = new PathNode(String.Empty);
var links = new[] { "/node1/sub-node1", "/node1", "/node2/sub-node-2", "/node2", "/node2/sub-node-1" };
foreach (var link in links) {
if (String.IsNullOrWhiteSpace(link)) {
continue;
}
var node = root;
var lastIndex = link.IndexOf("/", StringComparison.InvariantCultureIgnoreCase);
if (lastIndex < 0) {
node.AddChild(link);
continue;
}
while (lastIndex >= 0) {
lastIndex = link.IndexOf("/", lastIndex + 1, StringComparison.InvariantCultureIgnoreCase);
node = node.AddChild(lastIndex > 0
? link.Substring(0, lastIndex) // Still inside the link
: link // No more slashies
);
}
}
var orderedLinks = new List<string>();
root.Traverse(pn => orderedLinks.Add(pn.Name));
foreach (var orderedLink in orderedLinks.Where(l => !String.IsNullOrWhiteSpace(l))) {
Console.Out.WriteLine(orderedLink);
}
Which should print:
/node1
/node1/sub-node1
/node2
/node2/sub-node-1
/node2/sub-node-2

How to iterate through two collections of the same length using a single foreach

I know this question has been asked many times before but I tried out the answers and they don't seem to work.
I have two lists of the same length but not the same type, and I want to iterate through both of them at the same time as list1[i] is connected to list2[i].
Eg:
Assuming that I have list1 (as List<string>) and list2 (as List<int>)
I want to do something like
foreach( var listitem1, listitem2 in list1, list2)
{
// do stuff
}
Is this possible?

This is possible using the .NET 4 LINQ Zip() operator, or using the open source MoreLINQ library, which provides a Zip() operator as well so you can use it in earlier .NET versions.
Example from MSDN:
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
// The following example concatenates corresponding elements of the
// two input sequences.
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
foreach (var item in numbersAndWords)
{
Console.WriteLine(item);
}
// OUTPUT:
// 1 one
// 2 two
// 3 three
Useful links:
Source code of the MoreLINQ Zip() implementation: MoreLINQ Zip.cs

Edit - Iterating whilst positioning at the same index in both collections
If the requirement is to move through both collections in a 'synchronized' fashion, i.e. to use the 1st element of the first collection with the 1st element of the second collection, then 2nd with 2nd, and so on, without needing to perform any side-effecting code, then see @sll's answer and use .Zip() to project out pairs of elements at the same index, until one of the collections runs out of elements.
More Generally
Instead of the foreach, you can access the IEnumerator from the IEnumerable of both collections using the GetEnumerator() method and then call MoveNext() on the collection when you need to move on to the next element in that collection. This technique is common when processing two or more ordered streams, without needing to materialize the streams.
var stream1Enumerator = stream1.GetEnumerator();
var stream2Enumerator = stream2.GetEnumerator();
var currentGroupId = -1; // Initial value
// i.e. until stream1Enumerator runs out of elements
while (stream1Enumerator.MoveNext())
{
// Now you can iterate the collections independently
if (stream1Enumerator.Current.Id != currentGroupId)
{
stream2Enumerator.MoveNext();
currentGroupId = stream2Enumerator.Current.Id;
}
// Do something with stream1Enumerator.Current and stream2Enumerator.Current
}
As others have pointed out, if the collections are materialized and support indexing, such as through the IList<T> interface, you can also use the subscript [] operator, although this feels rather clumsy nowadays:
var smallestUpperBound = Math.Min(collection1.Count, collection2.Count);
for (var index = 0; index < smallestUpperBound; index++)
{
// Do something with collection1[index] and collection2[index]
}
Finally, there is also an overload of Linq's .Select() which provides the index ordinal of the element returned, which could also be useful.
e.g. the below will pair up all elements of collection1 alternately with the first two elements of collection2:
var alternatePairs = collection1.Select(
(item1, index1) => new
{
Item1 = item1,
Item2 = collection2[index1 % 2]
});

Short answer is no you can't.
The longer answer is that foreach is syntactic sugar: it gets an enumerator from the collection and calls MoveNext on it. There is no syntax for doing this with two collections at the same time.
If you just want to have a single loop, you can use a for loop and use the same index value for both collections.
for(int i = 0; i < collectionsLength; i++)
{
// use list1[i] and list2[i] here
}
An alternative is to merge both collections into one using the LINQ Zip operator (new to .NET 4.0) and iterate over the result.

foreach(var tup in list1.Zip(list2, (i1, i2) => Tuple.Create(i1, i2)))
{
var listItem1 = tup.Item1;
var listItem2 = tup.Item2;
/* The "do stuff" from your question goes here */
}
That said, much of your "do stuff" may be able to go in the lambda that creates the tuple here, which would be even better.
If the collections can be accessed by index, a for() loop is probably simpler still.
Update: Now with the built-in support for ValueTuple in C#7.0 we can use:
foreach ((var listitem1, var listitem2) in list1.Zip(list2, (i1, i2) => (i1, i2)))
{
/* The "do stuff" from your question goes here */
}

You can wrap the two IEnumerable<> in helper class:
var nums = new []{1, 2, 3};
var strings = new []{"a", "b", "c"};
ForEach(nums, strings).Do((n, s) =>
{
Console.WriteLine(n + " " + s);
});
//-----------------------------
public static TwoForEach<A, B> ForEach<A, B>(IEnumerable<A> a, IEnumerable<B> b)
{
return new TwoForEach<A, B>(a, b);
}
public class TwoForEach<A, B>
{
private IEnumerator<A> a;
private IEnumerator<B> b;
public TwoForEach(IEnumerable<A> a, IEnumerable<B> b)
{
this.a = a.GetEnumerator();
this.b = b.GetEnumerator();
}
public void Do(Action<A, B> action)
{
while (a.MoveNext() && b.MoveNext())
{
action.Invoke(a.Current, b.Current);
}
}
}

Instead of a foreach, why not use a for()? for example...
int length = list1.Count;
for(int i = 0; i < length; i++)
{
// do stuff with list1[i] and list2[i] here.
}

What is the least amount of code needed to update one list with another list?

Suppose I have one list:
IList<int> originalList = new List<int>();
originalList.Add(1);
originalList.Add(5);
originalList.Add(10);
And another list...
IList<int> newList = new List<int>();
newList.Add(1);
newList.Add(5);
newList.Add(7);
newList.Add(11);
How can I update originalList so that:
If the int appears in newList, keep
If the int does not appear in newList, remove
Add any ints from newList into originalList that aren't there already
Thus - making the contents of originalList:
{ 1, 5, 7, 11 }
The reason I'm asking is because I have an object with a collection of children. When the user updates this collection, instead of just deleting all children, then inserting their selections, I think it would be more efficient if I just acted on the children that were added or removed, rather than tearing down the whole collection, and inserting the newList children as if they are all new.
EDIT - Sorry - I wrote a horrible title... I should have written 'least amount of code' instead of 'efficient'. I think that threw off a lot of the answers I've gotten. They are all great... thank you!

originalList = newList;
Or if you prefer them being distinct lists:
originalList = new List<int>(newList);
But, either way does what you want. By your rules, after updating, originalList will be identical to newList.
UPDATE: I thank you all for the support of this answer, but after a closer reading of the question, I believe my other response (below) is the correct one.

If originalList is declared as List<int> (RemoveAll and AddRange are List<T> methods, not part of IList<T>), a little LINQ lets you do it in two lines:
originalList.RemoveAll(x => !newList.Contains(x));
originalList.AddRange(newList.Where(x => !originalList.Contains(x)));
This assumes (as do other people's solutions) that you've overridden Equals in your original object. But if you can't override Equals for some reason, you can create an IEqualityComparer like this:
class EqualThingTester : IEqualityComparer<Thing>
{
public bool Equals(Thing x, Thing y)
{
return x.ParentID.Equals(y.ParentID);
}
public int GetHashCode(Thing obj)
{
return obj.ParentID.GetHashCode();
}
}
Then the above lines become:
originalList.RemoveAll(x => !newList.Contains(x, new EqualThingTester()));
originalList.AddRange(newList.Where(x => !originalList.Contains(x, new EqualThingTester())));
And if you're passing in an IEqualityComparer anyway, you can make the second line even shorter:
originalList.RemoveAll(x => !newList.Contains(x, new EqualThingTester()));
originalList.AddRange(newList.Except(originalList, new EqualThingTester()));

Sorry, wrote my first response before I saw your last paragraph.
for(int i = originalList.Count-1; i >=0; --i)
{
if (!newList.Contains(originalList[i]))
originalList.RemoveAt(i);
}
foreach(int n in newList)
{
if (!originalList.Contains(n))
originalList.Add(n);
}

If you are not worried about the eventual ordering, a Hashtable/HashSet will likely be the fastest.
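A sketch of how a HashSet could drive the update is shown below. ListSync.Sync is a name invented for this example, and the order in which missing values are appended is not guaranteed, in line with the caveat above.

```csharp
using System.Collections.Generic;

public static class ListSync
{
    // Hypothetical helper: makes 'original' contain exactly the distinct
    // values of 'updated', touching only the items that were added or removed.
    public static void Sync(IList<int> original, IEnumerable<int> updated)
    {
        var wanted = new HashSet<int>(updated);

        // Walk backwards so RemoveAt does not shift items not yet visited.
        for (int i = original.Count - 1; i >= 0; i--)
        {
            // Remove returns false when the value is not wanted (or was
            // already matched by a later duplicate), so the item is dropped.
            if (!wanted.Remove(original[i]))
                original.RemoveAt(i);
        }

        // Whatever is still in 'wanted' was missing from 'original'.
        foreach (int value in wanted)
            original.Add(value);
    }
}
```

For the lists in the question, Sync(originalList, newList) removes 10 and appends 7 and 11, leaving 1 and 5 untouched.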

LINQ solution:
originalList = new List<int>(
from x in newList
join y in originalList on x equals y into z
from y in z.DefaultIfEmpty()
select x);

My initial thought was that you could call originalList.AddRange(newList) and then remove the duplicates - but I'm not sure if that would be any more efficient than clearing the list and repopulating it.

List<int> firstList = new List<int>() {1, 2, 3, 4, 5};
List<int> secondList = new List<int>() {1, 3, 5, 7, 9};
List<int> newList = new List<int>();
foreach (int i in firstList)
{
newList.Add(i);
}
foreach (int i in secondList)
{
if (!newList.Contains(i))
{
newList.Add(i);
}
}
Not very clean -- but it works.

There is no built in way of doing this, the closest I can think of is the way DataTable handles new and deleted items.
What @James Curran suggests merely replaces the originalList object with the newList object. It will dump the old list, but keep the variable (i.e. the pointer is still there).
Regardless, you should consider whether optimising this is time well spent. If the majority of the run time is spent copying values from one list to the next, it might be worth it. If it's not, and this is just premature optimisation, you should ignore it.
My $.02: spend the time polishing the GUI, or profile the application before you start optimising.

This is a common problem developers encounter when writing UIs to maintain many-to-many database relationships. I don't know how efficient this is, but I wrote a helper class to handle this scenario:
public class IEnumerableDiff<T>
{
private delegate bool Compare(T x, T y);
private List<T> _inXAndY;
private List<T> _inXNotY;
private List<T> _InYNotX;
/// <summary>
/// Compare two IEnumerables.
/// </summary>
/// <param name="x"></param>
/// <param name="y"></param>
/// <param name="compareKeys">True to compare objects by their keys using Data.GetObjectKey(); false to use object.Equals comparison.</param>
public IEnumerableDiff(IEnumerable<T> x, IEnumerable<T> y, bool compareKeys)
{
_inXAndY = new List<T>();
_inXNotY = new List<T>();
_InYNotX = new List<T>();
Compare comparer = null;
bool hit = false;
if (compareKeys)
{
comparer = CompareKeyEquality;
}
else
{
comparer = CompareObjectEquality;
}
foreach (T xItem in x)
{
hit = false;
foreach (T yItem in y)
{
if (comparer(xItem, yItem))
{
_inXAndY.Add(xItem);
hit = true;
break;
}
}
if (!hit)
{
_inXNotY.Add(xItem);
}
}
foreach (T yItem in y)
{
hit = false;
foreach (T xItem in x)
{
if (comparer(yItem, xItem))
{
hit = true;
break;
}
}
if (!hit)
{
_InYNotX.Add(yItem);
}
}
}
/// <summary>
/// Adds and removes items from the x (current) list so that the contents match the y (new) list.
/// </summary>
/// <param name="x"></param>
/// <param name="y"></param>
/// <param name="compareKeys"></param>
public static void SyncXList(IList<T> x, IList<T> y, bool compareKeys)
{
var diff = new IEnumerableDiff<T>(x, y, compareKeys);
foreach (T item in diff.InXNotY)
{
x.Remove(item);
}
foreach (T item in diff.InYNotX)
{
x.Add(item);
}
}
public IList<T> InXAndY
{
get { return _inXAndY; }
}
public IList<T> InXNotY
{
get { return _inXNotY; }
}
public IList<T> InYNotX
{
get { return _InYNotX; }
}
public bool ContainSameItems
{
get { return _inXNotY.Count == 0 && _InYNotX.Count == 0; }
}
private bool CompareObjectEquality(T x, T y)
{
return x.Equals(y);
}
private bool CompareKeyEquality(T x, T y)
{
object xKey = Data.GetObjectKey(x);
object yKey = Data.GetObjectKey(y);
return xKey.Equals(yKey);
}
}

If you're using .NET 3.5:
var List3 = List1.Intersect(List2);
This produces a new sequence that contains the intersection of the two lists, which is what I believe you are shooting for here.
