I have created long chains of classes, with each link(class) knowing only the next and previous link.
I see great advantage over Arrays when the index is not important.
public class ChainLink
{
public ChainLink previousLink, nextLink;
// Accessors here
}
Questions :
What is this technique actually called? (I don't know what to search)
Is there a .Net class that does the same thing?
Is there a noticeable performance impact vs. Array or List?
Example of the Accessors I use
Assign to the chain :
public ChainLink NextLink {
get{ return _nextLink;}
set {
_nextLink = value;
if (value != null)
value._previousLink = this;
}
}
public void InsertNext (ChainLink link)
{
link.NextLink = _nextLink;
link.PreviousLink = this;
}
Shortening the chain :
If I un-assign the next link of a chain, leaving the remaining links un-referenced by the main program, the garbage collector will dispose of the data for me.
Testing circular referencing :
public bool IsCircular ()
{
ChainLink link = this;
while (link != null) {
link = link._nextLink;
if (link == this)
return true;
}
return false;
}
Offset index :
public ChainLink this [int offset] {
get {
if (offset > 0 && _nextLink != null)
return _nextLink [offset - 1];
if (offset < 0 && _previousLink != null)
return _previousLink [offset + 1];
return this;
}
}
1) This structure is called a double linked list
2) This implementation exists in C# through LinkedList
3) There is a lot of articles on this topic :
here or
this SO post
Others have answered the question about what this is called and the equivalent .Net class (a LinkedList), but I thought I'd quickly look at the speed of your ChainLink vs an array and a List.
I added a method Foo() to your ChainLink class so that there was something to access for each object instance:
public class ChainLink
{
public ChainLink previousLink, nextLink;
// Accessors here
public void Foo()
{ }
}
The first method creates an array and then times how long it takes to access each item in the array:
private void TestArray()
{
// Setup the Array
ChainLink[] Test = new ChainLink[1000000];
for (int i = 0; i < 1000000; i++)
Test[i] = new ChainLink();
// Use a Stopwatch to measure time
Stopwatch SW;
SW = new Stopwatch();
SW.Start();
// Go through items in the array
for (int i = 0; i < Test.Length; i++)
Test[i].Foo();
// Stop timer and report results
SW.Stop();
Console.WriteLine(SW.Elapsed);
}
Next, I created a method to use a List<T> and time how long it takes to access each item in it:
private void TestList()
{
// Setup the list
List<ChainLink> Test = new List<ChainLink>();
for (int i = 0; i < 1000000; i++)
Test.Add(new ChainLink());
// Use a Stopwatch to measure time
Stopwatch SW;
SW = new Stopwatch();
SW.Start();
// Go through items in the list
for (int i = 0; i < Test.Count; i++)
Test[i].Foo();
// Stop timer and report results
SW.Stop();
Console.WriteLine(SW.Elapsed);
}
Finally, I created a method to use your ChainLink and move through the next item until there are no more:
private void TestChainLink()
{
// Setup the linked list
ChainLink Test = new ChainLink();
for (int i = 0; i < 1000000; i++)
{
Test.nextLink = new ChainLink();
Test = Test.nextLink;
}
// Use a Stopwatch to measure time
Stopwatch SW;
SW = new Stopwatch();
SW.Start();
// Go through items in the linked list
while (Test != null)
{
Test.Foo();
Test = Test.nextLink;
}
// Stop timer and report results
SW.Stop();
Console.WriteLine(SW.Elapsed);
}
Running each of these yields some revealing results:
TestArray(): 00:00:00.0058576
TestList(): 00:00:00.0103650
TestChainLink(): 00:00:00.0000014
Multiple iterations reveal similar figures.
In conclusion, your ChainLink is about 4,100 times faster than an array, and about 7,400 times faster than a List.
Related
I have a very specific and demanding workload I am trying to multithreaded. This is very new to me so I am struggling to find an effective solution.
Program Description: UpdateEquations() is cycling through a list of mathematical functions to update the coordinates of rendered lines. By default, func.Count = 3, so this will call CordCalc() 1500 times every frame. I am using NClac to parse a function string and write the result to the Function list, which will later be used before the end of the frame (irrelevant).
Goal: I want to put each cycle of the for(int a) loop inside its own thread. Since for(int a) will only loop 3 times, I just need to start three threads. I cannot continue the for(int i) loop until for(int a) is fully calculated. I am calculating a very large about of small tasks so it would be too expensive to assign each task to the thread.
What I am currently trying to do: I am trying to use a ThreadPool queue, however I'm not sure how to wait for them all to finish before continuing onto the next for(int i) iteration. Furthermore, while the program compiles and executes, the performance is disastrous. Probably %5 of my original performance. I am not sure if creating a "new WaitCallback" is expensive or not. I was looking for a way to predefined threads somehow so that I don't have to reinitialize them 1500 times a frame. (Which is what I suspect the issue is).
Other things I've tried: I tried using new Thread(() => CordCalc(a, i)); however this seemed to have much worse performance. I saw online somewhere that using a ThreadPool would be less expensive.
(This code is shortened for readability and relevance)
public List<Function> func;
private Expression[] exp;
private int lines_i;
private int lines_a;
public void Start()
{
func = new List<Function>();
exp = new Expression[func.Count];
for (int i = 0; i < func.Count; i++) exp[i] = new Expression(func[i].function);
}
//Calculate
public void CordCalc(object state)
{
for (int b = 0; b < func.Count; b++)
exp[lines_a].Parameters[func[b].name] = func[b].mainCords[lines_i - 1];
exp[lines_a].Parameters["t"] = t;
try
{
func[lines_a].mainCords[lines_i] = Convert.ToSingle(exp[lines_a].Evaluate());
}
catch
{
Debug.Log("input Error");
func[lines_a].mainCords[lines_i] = 0;
}
}
private void UpdateEquations()
{
//Initialize equations
for (int a = 0; a < func.Count; a++)
{
func[a].mainCords[0] = t;
}
lines_i = 1;
for (int i = 1; i < 500; i++)
{
lines_a = 0;
for (int a = 0; a < func.Count; a++)
{
//Calculate
ThreadPool.QueueUserWorkItem(new WaitCallback(CordCalc));
//This was something else that I tried, which gave worse results:
//threads[a] = new Thread(() => CordCalc(a, i));
//threads[a].Start();
//t.Join();
//This was my original method call without multithreading
//func[a].mainCords[i] = CordCalc(a, i);
lines_a++;
}
lines_i++;
}
private void FixedUpdate()
{
t += step * (2 + step) * 0.05f;
UpdateEquations();
}
//Function List
public class Function
{
public string name;
public string function;
public float[] mainCords;
//Constructor
public Function(string nameIn, string funcIn)
{
name = nameIn;
function = funcIn;
}
public void SetSize(int len)
{
mainCords = new float[len];
}
}
I am trying to dynamically add and remove subscribers to an Observable BlockingCollection using the ThreadPoolScheduler.
I am not sure if this is a problem with my code or in RX itself, but I had assumed in this case I should be able to subscribe/unsubscribe as needed.
I have reduced the issue down to the test pasted below.
The code works correctly until I call Dispose on subscribers and then add new subscribers.
Essentially I seem to get old threads still de-queing the Observables but never doing anything with them.
Here is the unit test, it sets up 32 subscribers, adds 64 objects, then unsubscribes and repeats the same test. I have removed any additional code (and added some sleeps just for the test to make sure the threads are def done before I unsubscribe)
The first 64 are processed correctly, but the second set only 32 objects are passed to my subscriber.
[TestClass]
public class RxTests
{
public class ObservTest
{
public BlockingCollection<ObserverTests.UnitTestObservable> mBlockingCollection = new BlockingCollection<ObserverTests.UnitTestObservable>();
public IObservable<ObserverTests.UnitTestObservable> mObservableBlockingCollection;
private static readonly object ObservableLock = new object();
private static volatile ObservTest ObservableInstance;
public static ObservTest Instance
{
get
{
if (ObservableInstance != null)
return ObservableInstance;
lock (ObservableLock)
{
if (ObservableInstance == null)
{
ObservTest observable = new ObservTest();
observable.mObservableBlockingCollection = observable.mBlockingCollection.GetConsumingEnumerable().ToObservable(ThreadPoolScheduler.Instance);
ObservableInstance = observable;
}
return ObservableInstance;
}
}
}
private int count = 0;
public void Release()
{
Interlocked.Increment(ref count);
Console.WriteLine("Release {0} : {1}", count, Thread.CurrentThread.ManagedThreadId);
}
public void LogCount()
{
Console.WriteLine("Total :{0}", count);
}
}
[TestMethod]
public void TestMethod1()
{
IList<IDisposable> subscribers = new List<IDisposable>();
for (int count = 0; count < 32; count++)
{
IDisposable disposable = ObservTest.Instance.mObservableBlockingCollection.Subscribe(Observe);
subscribers.Add(disposable);
}
for (int count = 0; count < 64; count++)
{
ObserverTests.UnitTestObservable observable = new ObserverTests.UnitTestObservable
{
Name = string.Format("{0}", count)
};
ObservTest.Instance.mBlockingCollection.Add(observable);
}
Thread.Sleep(5000);
foreach (IDisposable disposable in subscribers)
{
disposable.Dispose();
}
subscribers.Clear();
for (int count = 0; count < 32; count++)
{
IDisposable disposable = ObservTest.Instance.mObservableBlockingCollection.Subscribe(Observe);
subscribers.Add(disposable);
}
for (int count = 0; count < 64; count++)
{
ObserverTests.UnitTestObservable observable = new ObserverTests.UnitTestObservable
{
Name = string.Format("{0}", count)
};
ObservTest.Instance.mBlockingCollection.Add(observable);
}
Thread.Sleep(3000);
ObservTest.Instance.LogCount();
}
public static void Observe(ObserverTests.UnitTestObservable observable)
{
Console.WriteLine("Observe {0} : {1}", observable.Name, Thread.CurrentThread.ManagedThreadId);
ObservTest.Instance.Release();
}
}
So the final count in the output is 96 when I would expect it to be 128
If I reduce the number of initial subscribers from 32 the processed count increases.
e.g. if I reduce the count from 32 to 16 in the first loop I get a count of 112.
if I reduce it to 8 I get 120
I am aiming for a system where as the number of tasks being executed is increased so too are the number of subscribers available to process them.
I've searched for the differences between the * operator and the Math.BigMul method, and found nothing. So I've decided I would try and test their efficiency against each other. Consider the following code :
public class Program
{
static void Main()
{
Stopwatch MulOperatorWatch = new Stopwatch();
Stopwatch MulMethodWatch = new Stopwatch();
MulOperatorWatch.Start();
// Creates a new MulOperatorClass to perform the start method 100 times.
for (int i = 0; i < 100; i++)
{
MulOperatorClass mOperator = new MulOperatorClass();
mOperator.start();
}
MulOperatorWatch.Stop();
MulMethodWatch.Start();
for (int i = 0; i < 100; i++)
{
MulMethodClass mMethod = new MulMethodClass();
mMethod.start();
}
MulMethodWatch.Stop();
Console.WriteLine("Operator = " + MulOperatorWatch.ElapsedMilliseconds.ToString());
Console.WriteLine("Method = " + MulMethodWatch.ElapsedMilliseconds.ToString());
Console.ReadLine();
}
public class MulOperatorClass
{
public void start()
{
List<long> MulOperatorList = new List<long>();
for (int i = 0; i < 15000000; i++)
{
MulOperatorList.Add(i * i);
}
}
}
public class MulMethodClass
{
public void start()
{
List<long> MulMethodList = new List<long>();
for (int i = 0; i < 15000000; i++)
{
MulMethodList.Add(Math.BigMul(i,i));
}
}
}
}
To sum it up : I've created two classes - MulMethodClass and MulOperatorClass that performs both the start method, which fills a varible of type List<long with the values of i multiply by i many times. The only difference between these methods are the use of the * operator in the operator class, and the use of the Math.BigMul in the method class.
I'm creating 100 instances of each of these classes, just to prevent and overflow of the lists (I can't create a 1000000000 items list).
I then measure the time it takes for each of the 100 classes to execute. The results are pretty peculiar : I've did this process about 15 times and the average results were (in milliseconds) :
Operator = 20357
Method = 24579
That about 4.5 seconds difference, which I think is a lot. I've looked at the source code of the BigMul method - it uses the * operator, and practically does the same exact thing.
So, for my quesitons :
Why such method even exist? It does exactly the same thing.
If it does exactly the same thing, why there is a huge efficiency difference between these two?
I'm just curious :)
Microbenchmarking is art. You are right the method is around 10% slower on x86. Same speed on x64. Note that you have to multiply two longs, so ((long)i) * ((long)i), because it is BigMul!
Now, some easy rules if you want to microbenchmark:
A) Don't allocate memory in the benchmarked code... You don't want the GC to run (you are enlarging the List<>)
B) Preallocate the memory outside the timed zone (create the List<> with the right capacity before running the code)
C) Run at least once or twice the methods before benchmarking it.
D) Try to not do anything but what you are benchmarking, but to force the compiler to run your code. For example checking for an always true condition based on the result of the operation, and throwing an exception if it is false is normally good enough to fool the compiler.
static void Main()
{
// Check x86 or x64
Console.WriteLine(IntPtr.Size == 4 ? "x86" : "x64");
// Check Debug/Release
Console.WriteLine(IsDebug() ? "Debug, USELESS BENCHMARK" : "Release");
// Check if debugger is attached
Console.WriteLine(System.Diagnostics.Debugger.IsAttached ? "Debugger attached, USELESS BENCHMARK!" : "Debugger not attached");
// High priority
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
Stopwatch MulOperatorWatch = new Stopwatch();
Stopwatch MulMethodWatch = new Stopwatch();
// Prerunning of the benchmarked methods
MulMethodClass.start();
MulOperatorClass.start();
{
// No useless method allocation here
MulMethodWatch.Start();
for (int i = 0; i < 100; i++)
{
MulMethodClass.start();
}
MulMethodWatch.Stop();
}
{
// No useless method allocation here
MulOperatorWatch.Start();
for (int i = 0; i < 100; i++)
{
MulOperatorClass.start();
}
MulOperatorWatch.Stop();
}
Console.WriteLine("Operator = " + MulOperatorWatch.ElapsedMilliseconds.ToString());
Console.WriteLine("Method = " + MulMethodWatch.ElapsedMilliseconds.ToString());
Console.ReadLine();
}
public class MulOperatorClass
{
// The method is static. No useless memory allocation
public static void start()
{
for (int i = 2; i < 15000000; i++)
{
// This condition will always be false, but the compiler
// won't be able to remove the code
if (((long)i) * ((long)i) == ((long)i))
{
throw new Exception();
}
}
}
}
public class MulMethodClass
{
public static void start()
{
// The method is static. No useless memory allocation
for (int i = 2; i < 15000000; i++)
{
// This condition will always be false, but the compiler
// won't be able to remove the code
if (Math.BigMul(i, i) == i)
{
throw new Exception();
}
}
}
}
private static bool IsDebug()
{
// Taken from http://stackoverflow.com/questions/2104099/c-sharp-if-then-directives-for-debug-vs-release
object[] customAttributes = Assembly.GetExecutingAssembly().GetCustomAttributes(typeof(DebuggableAttribute), false);
if ((customAttributes != null) && (customAttributes.Length == 1))
{
DebuggableAttribute attribute = customAttributes[0] as DebuggableAttribute;
return (attribute.IsJITOptimizerDisabled && attribute.IsJITTrackingEnabled);
}
return false;
}
E) If you are really sure your code is ok, try changing the order of the tests
F) Put your program in higher priority
but be happy :-)
at least another persons had the same question, and wrote a blog article: http://reflectivecode.com/2008/10/mathbigmul-exposed/
He did the same errors you did.
I was reviewing some code, and I found something that looked like this:
public class MyClass
{
public bool IsEditable { get; set; }
public void HandleInput()
{
if (IsEditable.Equals(false))
{
//do stuff
}
}
}
As far as I know, (IsEditable.Equals(false)) is identical to (IsEditable == false) (and also the same as (!IsEditable)).
Besides personal preference, is there any difference at all between .Equals() and ==, specifically when used to compare bools?
This is mostly a readability issue. I'd normally use == because that's what I'm used to looking at.
Specifically with bools, you don't have to compare them at all
if(!IsEditable)
will suffice
although, Sometimes I myself do write things like if (val == false) just to be extra sure that i don't misread it when i have to modify the code.
In fact, for basic types such as int, bool etc. there is a difference between calling Equals() and == due to the fact that the CIL has instructions for handling such types. Calling Equals() forces boxing of the value and making a virtual method call, whereas usage of == leads to usage of a single CIL instruction.
!value and value == false is actually the same, at least in Microsoft's C# compiler bundled with .NET 4.0.
Hence, the comparisons within the following methods
public static int CompareWithBoxingAndVirtualMethodCall(bool value)
{
if (value.Equals(false)) { return 0; } else { return 1; }
}
public static int CompareWithCILInstruction(bool value)
{
if (value == false) { return 0; } else { return 1; }
if (!value) { return 0; } else { return 1; } // comparison same as line above
}
will compile to to the following CIL instructions:
// CompareWithBoxingAndVirtualMethodCall
ldarga.s 'value'
ldc.i4.0
call instance bool [mscorlib]System.Boolean::Equals(bool) // virtual method call
brfalse.s IL_000c // additional boolean comparison, jump for if statement
// CompareWithCILInstruction
ldarg.0
brtrue.s IL_0005 // actual single boolean comparison, jump for if statement
The Equals way appears to be significantly slower - roughly 2.7 times in debug mode, and more than seven times in release mode.
Here is my quick and dirty benchmark:
public static void Main() {
bool a = bool.Parse("false");
bool b = bool.Parse("true");
bool c = bool.Parse("true");
var sw = new Stopwatch();
const int Max = 1000000000;
int count = 0;
sw.Start();
// The loop will increment count Max times; let's measure how long it takes
for (int i = 0; i != Max; i++) {
count++;
}
sw.Stop();
var baseTime = sw.ElapsedMilliseconds;
sw.Start();
count = 0;
for (int i = 0; i != Max; i++) {
if (a.Equals(c)) count++;
if (b.Equals(c)) count++;
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds - baseTime);
sw.Reset();
count = 0;
sw.Start();
for (int i = 0; i != Max; i++) {
if (a==c) count++;
if (b==c) count++;
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds - baseTime);
sw.Reset();
count = 0;
sw.Start();
for (int i = 0; i != Max; i++) {
if (!a) count++;
if (!b) count++;
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds - baseTime);
}
Running this produces the following results:
In debug mode
8959
2950
1874
In release mode
5348
751
7
Equals appears to be the slowest. There appears to be little difference between == and !=. However, if (!boolExpr) appears to be the clear winner.
If you decompile System.Boolean and look at it, It's Equals overloads are defined thus:
public override bool Equals(object obj)
{
if (!(obj is bool))
return false;
else
return this == (bool) obj;
}
public bool Equals(bool obj)
{
return this == obj;
}
I would like to think the C# compiler's optimizer and the .Net JIT compiler would be smart enough to inline these, at least for release/optimized compilations, making them exactly the same.
There is a difference - at least in .NET 4.8 - I believe the reason is because of the boxing described in Oliver Hanappi's answer:
static void Main(string[] args)
{
object lhs = true;
object rhs = true;
Console.WriteLine($"Are Equal - {(lhs == rhs ? "Yes" : "No")}"); // Outputs no
Console.WriteLine($"Are Equal - {(lhs.Equals(rhs) ? "Yes" : "No")}"); // Outputs yes
Console.ReadLine();
}
Take a look at the following quote Taken from here:
The Equals method is just a virtual one defined in System.Object, and
overridden by whichever classes choose to do so. The == operator is an
operator which can be overloaded by classes, but which usually has
identity behaviour.
For reference types where == has not been overloaded, it compares
whether two references refer to the same object - which is exactly
what the implementation of Equals does in System.Object.
So in short, Equals is really just doing a == anyways.
In this case, with bools, it doesn't make any difference, however with other built-in non reference types it can.
== allows for conversions of types if it can .Equals won't
== is always better than .Equals. In the case of integer comparison, == runs faster than .Equals. In the below test, the elapsed time using == 157, while for .Equals the elapsed time is 230.
class Program
{
static void Main(string[] args)
{
Program programObj = new Program();
programObj.mymethod();
programObj.mynextmethod();
}
void mynextmethod()
{
var watch = Stopwatch.StartNew();
for (int i = 0; i < 60000000; i++)
{
int j = 0;
if (i.Equals(j))
j = j + 1;
}
watch.Stop();
var elapsedMs = watch.ElapsedMilliseconds;
Console.WriteLine("Time take in method" + elapsedMs);
Console.ReadLine();
}
void mymethod()
{
var watch = Stopwatch.StartNew();
for (int i = 0; i < 60000000; i++)
{
int j = 0;
if (i == j)
j = j + 1;
}
watch.Stop();
var elapsedMs = watch.ElapsedMilliseconds;
Console.WriteLine("Time take in method" + elapsedMs);
}
}
I'm testing a self written element generator (ICollection<string>) and compare the calculated count to the actual count to get an idea if there's an error or not in my algorithm.
As this generator can generate lots of elements on demand I'm looking in Partitioner<string> and I have implemented a basic one which seems to also produce valid enumerators which together give the same amount of strings as calculated.
Now I want to test how this behaves if run parallel (again first testing for correct count):
MyGenerator generator = new MyGenerator();
MyPartitioner partitioner = new MyPartitioner(generator);
int isCount = partitioner.AsParallel().Count();
int shouldCount = generator.Count;
bool same = isCount == shouldCount; // false
I don't get why this count is not equal! What is the ParallelQuery<string> doing?
generator.Count() == generator.Count // true
partitioner.GetPartitions(xyz).Select(enumerator =>
{
int count = 0;
while (enumerator.MoveNext())
{
count++;
}
return count;
}).Sum() == generator.Count // true
So, I'm currently not seeing an error in my code. Next I tried to manualy count that ParallelQuery<string>:
int count = 0;
partitioner.AsParallel().ForAll(e => Interlocked.Increment(ref count));
count == generator.Count // true
Summed up: Everyone counts my enumerable correct, ParallelQuery.ForAll enumerates exactly generator.Count elements. But what does ParallelQuery.Count()?
If the correct count is something about 10k, ParallelQuery sees 40k.
internal sealed class PartialWordEnumerator : IEnumerator<string>
{
private object sync = new object();
private readonly IEnumerable<char> characters;
private readonly char[] limit;
private char[] buffer;
private IEnumerator<char>[] enumerators;
private int position = 0;
internal PartialWordEnumerator(IEnumerable<char> characters, char[] state, char[] limit)
{
this.characters = new List<char>(characters);
this.buffer = (char[])state.Clone();
if (limit != null)
{
this.limit = (char[])limit.Clone();
}
this.enumerators = new IEnumerator<char>[this.buffer.Length];
for (int i = 0; i < this.buffer.Length; i++)
{
this.enumerators[i] = SkipTo(state[i]);
}
}
private IEnumerator<char> SkipTo(char c)
{
IEnumerator<char> first = this.characters.GetEnumerator();
IEnumerator<char> second = this.characters.GetEnumerator();
while (second.MoveNext())
{
if (second.Current == c)
{
return first;
}
first.MoveNext();
}
throw new InvalidOperationException();
}
private bool ReachedLimit
{
get
{
if (this.limit == null)
{
return false;
}
for (int i = 0; i < this.buffer.Length; i++)
{
if (this.buffer[i] != this.limit[i])
{
return false;
}
}
return true;
}
}
public string Current
{
get
{
if (this.buffer == null)
{
throw new ObjectDisposedException(typeof(PartialWordEnumerator).FullName);
}
return new string(this.buffer);
}
}
object IEnumerator.Current
{
get { return this.Current; }
}
public bool MoveNext()
{
lock (this.sync)
{
if (this.position == this.buffer.Length)
{
this.position--;
}
if (this.position == -1)
{
return false;
}
IEnumerator<char> enumerator = this.enumerators[this.position];
if (enumerator.MoveNext())
{
this.buffer[this.position] = enumerator.Current;
this.position++;
if (this.position == this.buffer.Length)
{
return !this.ReachedLimit;
}
else
{
return this.MoveNext();
}
}
else
{
this.enumerators[this.position] = this.characters.GetEnumerator();
this.position--;
return this.MoveNext();
}
}
}
public void Dispose()
{
this.position = -1;
this.buffer = null;
}
public void Reset()
{
throw new NotSupportedException();
}
}
public override IList<IEnumerator<string>> GetPartitions(int partitionCount)
{
IEnumerator<string>[] enumerators = new IEnumerator<string>[partitionCount];
List<char> characters = new List<char>(this.generator.Characters);
int length = this.generator.Length;
int characterCount = this.generator.Characters.Count;
int steps = Math.Min(characterCount, partitionCount);
int skip = characterCount / steps;
for (int i = 0; i < steps; i++)
{
char c = characters[i * skip];
char[] state = new string(c, length).ToCharArray();
char[] limit = null;
if ((i + 1) * skip < characterCount)
{
c = characters[(i + 1) * skip];
limit = new string(c, length).ToCharArray();
}
if (i == steps - 1)
{
limit = null;
}
enumerators[i] = new PartialWordEnumerator(characters, state, limit);
}
for (int i = steps; i < partitionCount; i++)
{
enumerators[i] = Enumerable.Empty<string>().GetEnumerator();
}
return enumerators;
}
EDIT: I believe I have found the solution. According to the documentation on IEnumerable.MoveNext (emphasis mine):
If MoveNext passes the end of the collection, the enumerator is
positioned after the last element in the collection and MoveNext
returns false. When the enumerator is at this position, subsequent
calls to MoveNext also return false until Reset is called.
According to the following logic:
private bool ReachedLimit
{
get
{
if (this.limit == null)
{
return false;
}
for (int i = 0; i < this.buffer.Length; i++)
{
if (this.buffer[i] != this.limit[i])
{
return false;
}
}
return true;
}
}
The call to MoveNext() will return false only one time - when the buffer is exactly equal to the limit. Once you have passed the limit, the return value from ReachedLimit will start to become false again, making return !this.ReachedLimit return true, so the enumerator will continue past the end of the limit all the way until it runs out of characters to enumerate. Apparently, in the implementation of ParallelQuery.Count(), MoveNext() is called multiple times when it has reached the end, and since it starts to return a true value again, the enumerator happily continues returning more elements (this is not the case in your custom code that walks the enumerator manually, and apparently also is not the case for the ForAll call, so they "accidentally" return the correct results).
The simplest fix to this is to remember the return value from MoveNext() once it becomes false:
private bool _canMoveNext = true;
public bool MoveNext()
{
if (!_canMoveNext) return false;
...
if (this.position == this.buffer.Length)
{
if (this.ReachedLimit) _canMoveNext = false;
...
}
Now once it begins returning false, it will return false for every future call and this returns the correct result from AsParallel().Count(). Hope this helps!
The documentation on Partitioner notes (emphasis mine):
The static methods on Partitioner are all thread-safe and may
be used concurrently from multiple threads. However, while a created
partitioner is in use, the underlying data source should not be
modified, whether from the same thread that is using a partitioner or
from a separate thread.
From what I can understand of the code you have given, it would seem that ParallelQuery.Count() is most likely to have thread-safety issues because it may possibly be iterating multiple enumerators at the same time, whereas all the other solutions would require the enumerators to be run synchronized. Without seeing the code you are using for MyGenerator and MyPartitioner is it difficult to determine if thread-safety issues could be the culprit.
To demonstrate, I have written a simple enumerator that returns the first hundred numbers as strings. Also, I have a partitioner, that distributes the elements in the underlying enumerator over a collection of numPartitions separate lists. Using all the methods you described above on our 12-core server (when I output numPartitions, it uses 12 by default on this machine), I get the expected result of 100 (this is LINQPad-ready code):
void Main()
{
var partitioner = new SimplePartitioner(GetEnumerator());
GetEnumerator().Count().Dump();
partitioner.GetPartitions(10).Select(enumerator =>
{
int count = 0;
while (enumerator.MoveNext())
{
count++;
}
return count;
}).Sum().Dump();
var theCount = 0;
partitioner.AsParallel().ForAll(e => Interlocked.Increment(ref theCount));
theCount.Dump();
partitioner.AsParallel().Count().Dump();
}
// Define other methods and classes here
public IEnumerable<string> GetEnumerator()
{
for (var i = 1; i <= 100; i++)
yield return i.ToString();
}
public class SimplePartitioner : Partitioner<string>
{
private IEnumerable<string> input;
public SimplePartitioner(IEnumerable<string> input)
{
this.input = input;
}
public override IList<IEnumerator<string>> GetPartitions(int numPartitions)
{
var list = new List<string>[numPartitions];
for (var i = 0; i < numPartitions; i++)
list[i] = new List<string>();
var index = 0;
foreach (var s in input)
list[(index = (index + 1) % numPartitions)].Add(s);
IList<IEnumerator<string>> result = new List<IEnumerator<string>>();
foreach (var l in list)
result.Add(l.GetEnumerator());
return result;
}
}
Output:
100
100
100
100
This clearly works. Without more information it is impossible to tell you what is not working in your particular implementation.