Using ImmutableSortedSet<T> for a thread safe cache - c#

I have a method that takes a DateTime and returns the date marking the end of that quarter. Because of some complexity involving business days and holiday calendars, I want to cache the result to speed up subsequent calls. I'm using a SortedSet<DateTime> to maintain a cache of data, and I use the GetViewBetween method in order to do cache lookups as follows:
private static SortedSet<DateTime> quarterEndCache = new SortedSet<DateTime>();
public static DateTime GetNextQuarterEndDate(DateTime date)
{
var oneDayLater = date.AddDays(1.0);
var fiveMonthsLater = date.AddMonths(5);
var range = quarterEndCache.GetViewBetween(oneDayLater, fiveMonthsLater);
if (range.Count > 0)
{
return range.Min;
}
// Perform expensive calc here
}
Now I want to make my cache threadsafe. Rather than use a lock everywhere which would incur a performance hit on every lookup, I'm exploring the new ImmutableSortedSet<T> collection which would allow me to avoid locks entirely. The problem is that ImmutableSortedSet<T> doesn't have the method GetViewBetween. Is there any way to get similar functionality from the ImmutableSortedSet<T>?
[EDIT]
Servy has convinced me just using a lock with a normal SortedSet<T> is the easiest solution. I'll leave the question open though just because I'm interested to know whether the ImmutableSortedSet<T> can handle this scenario efficiently.

Let's divide the question into two parts:
How to get a functionality similar to GetViewBetween with ImmutableSortedSet<T>? I'd suggest using the IndexOf method. In the snippet below, I created an extension method GetRangeBetween which should do the job.
How to implement lock-free, thread-safe updates with data immutable data structures? Despite this is not the original question, there are some skeptical comments with respect to this issue.
The immutables framework implements a method for exactly that purpose: System.Collections.Immutable.Update<T>(ref T location, Func<T, T> transformer) where T : class; The method internally relies on atomic compare/exchange operations. If you want to do this by hand, you'll find an alternative implementation below which should behave the same like Immutable.Update.
So here is the code:
public static class ImmutableExtensions
{
public static IEnumerable<T> GetRangeBetween<T>(
this ImmutableSortedSet<T> set, T min, T max)
{
int i = set.IndexOf(min);
if (i < 0) i = ~i;
while (i < set.Count)
{
T x = set[i++];
if (set.KeyComparer.Compare(x, min) >= 0 &&
set.KeyComparer.Compare(x, max) <= 0)
{
yield return x;
}
else
{
break;
}
}
}
public static void LockfreeUpdate<T>(ref T item, Func<T, T> fn)
where T: class
{
T x, y;
do
{
x = item;
y = fn(x);
} while (Interlocked.CompareExchange(ref item, y, x) != x);
}
}
Usage:
private static volatile ImmutableSortedSet<DateTime> quarterEndCache =
ImmutableSortedSet<DateTime>.Empty;
private static volatile int counter; // test/verification purpose only
public static DateTime GetNextQuarterEndDate(DateTime date)
{
var oneDayLater = date.AddDays(1.0);
var fiveMonthsLater = date.AddMonths(5);
var range = quarterEndCache.GetRangeBetween(oneDayLater, fiveMonthsLater);
if (range.Any())
{
return range.First();
}
// Perform expensive calc here
// -> Meaningless dummy computation for verification purpose only
long x = Interlocked.Increment(ref counter);
DateTime test = DateTime.FromFileTime(x);
ImmutableExtensions.LockfreeUpdate(
ref quarterEndCache,
c => c.Add(test));
return test;
}
[TestMethod]
public void TestIt()
{
var tasks = Enumerable
.Range(0, 100000)
.Select(x => Task.Factory.StartNew(
() => GetNextQuarterEndDate(DateTime.Now)))
.ToArray();
Task.WaitAll(tasks);
Assert.AreEqual(100000, counter);
}

Related

C# LINQ performance when extension method called inside where clause

I have a LINQ query like this
public static bool CheckIdExists(int searchId)
{
return itemCollection.Any(item => item.Id.Equals(searchId.ConvertToString()));
}
item.Id is a string while searchId is an int. .ConvertToString() is an extension which which converts int to string
Code for ConvertToString:
public static string ConvertToString(this object input)
{
return Convert.ToString(input, CultureInfo.InvariantCulture);
}
Now my query is, does searchId.ConvertToString() gets executed for each item in itemCollection?
Is computing searchId.ConvertToString() beforehand and calling the method like below improves performance?
public static bool CheckIdExists(int searchId)
{
string sId=searchId.ConvertToString();
return itemCollection.Any(item => item.Id.Equals(sId));
}
How to debug these two scenarios and observe their performances?
I re-generated the scenarios you talked about in your question. I tried following code and got this output.
But this is how you can debug this.
static List<string> itemCollection = new List<string>();
static void Main(string[] args)
{
for (int i = 0; i < 10000000; i++)
{
itemCollection.Add(i.ToString());
}
var watch = new Stopwatch();
watch.Start();
Console.WriteLine(CheckIdExists(580748));
watch.Stop();
Console.WriteLine($"Took {watch.ElapsedMilliseconds}");
var watch1 = new Stopwatch();
watch1.Start();
Console.WriteLine(CheckIdExists1(580748));
watch1.Stop();
Console.WriteLine($"Took {watch1.ElapsedMilliseconds}");
Console.ReadLine();
}
public static bool CheckIdExists(int searchId)
{
return itemCollection.Any(item => item.Equals(ConvertToString(searchId)));
}
public static bool CheckIdExists1(int searchId)
{
string sId =ConvertToString(searchId);
return itemCollection.Any(item => item.Equals(sId));
}
public static string ConvertToString(int input)
{
return Convert.ToString(input, CultureInfo.InvariantCulture);
}
OUTPUT:
True
Took 170
True
Took 11
How long it takes is the ultimate guide. You can create a stopwatch to log the performance of any code. Just use the ElapsedMilliseconds to see how long has been taken. For very short operations I suggest using very long loops to get a more accurate length of time.
var watch = new Stopwatch();
watch.Start();
/// CODE HERE (IDEALLY IN A LONG LOOP)
Debub.WriteLine($"Took {watch.ElapsedMilliseconds}");
Yes, it should be faster to get the string once. But I guess that compiler does optimize that thing for you (I just suspect this, don't ave anything to back it up. I just remeber that compilers are very good at detecting things that are not changing).
And no, it's not computed for every item, since LINQ method Any does not necessarily check all items. It return true for the first matching item. The only scenario when it checks all items, is where for none the lambda returns true.
If you want to test the speed difference,make sure to have more data - otherwise the difference may be too small.
Just do:
itemCollection = Enumerable.Range(0, 1000).SelectMany(x => itemCollection).ToList() // or array or whatever the type of collection you have
Than measure the times with StopWatch, just like #RobSedgwick said
I think you have two solution:
1- make log and meke inside this log datetime.now
2- you can use the diagnostic Tools tab
hopefully this help you

Buffering a LINQ query

FINAL EDIT:
I've chosen Timothy's answer but if you want a cuter implementation that leverages the C# yield statement check Eamon's answer: https://stackoverflow.com/a/19825659/145757
By default LINQ queries are lazily streamed.
ToArray/ToList give full buffering but first they're eager and secondly it may take quite some time to complete with an infinite sequence.
Is there any way to have a combination of both behaviors : streaming and buffering values on the fly as they are generated, so that the next querying won't trigger the generation of the elements that have already been queried.
Here is a basic use-case:
static IEnumerable<int> Numbers
{
get
{
int i = -1;
while (true)
{
Console.WriteLine("Generating {0}.", i + 1);
yield return ++i;
}
}
}
static void Main(string[] args)
{
IEnumerable<int> evenNumbers = Numbers.Where(i => i % 2 == 0);
foreach (int n in evenNumbers)
{
Console.WriteLine("Reading {0}.", n);
if (n == 10) break;
}
Console.WriteLine("==========");
foreach (int n in evenNumbers)
{
Console.WriteLine("Reading {0}.", n);
if (n == 10) break;
}
}
Here is the output:
Generating 0.
Reading 0.
Generating 1.
Generating 2.
Reading 2.
Generating 3.
Generating 4.
Reading 4.
Generating 5.
Generating 6.
Reading 6.
Generating 7.
Generating 8.
Reading 8.
Generating 9.
Generating 10.
Reading 10.
==========
Generating 0.
Reading 0.
Generating 1.
Generating 2.
Reading 2.
Generating 3.
Generating 4.
Reading 4.
Generating 5.
Generating 6.
Reading 6.
Generating 7.
Generating 8.
Reading 8.
Generating 9.
Generating 10.
Reading 10.
The generation code is triggered 22 times.
I'd like it to be triggered 11 times, the first time the enumerable is iterated.
Then the second iteration would benefit from the already generated values.
It would be something like:
IEnumerable<int> evenNumbers = Numbers.Where(i => i % 2 == 0).Buffer();
For those familiar with Rx it's a behavior similar to a ReplaySubject.
IEnumerable<T>.Buffer() extension method
public static EnumerableExtensions
{
public static BufferEnumerable<T> Buffer(this IEnumerable<T> source)
{
return new BufferEnumerable<T>(source);
}
}
public class BufferEnumerable<T> : IEnumerable<T>, IDisposable
{
IEnumerator<T> source;
List<T> buffer;
public BufferEnumerable(IEnumerable<T> source)
{
this.source = source.GetEnumerator();
this.buffer = new List<T>();
}
public IEnumerator<T> GetEnumerator()
{
return new BufferEnumerator<T>(source, buffer);
}
public void Dispose()
{
source.Dispose()
}
}
public class BufferEnumerator<T> : IEnumerator<T>
{
IEnumerator<T> source;
List<T> buffer;
int i = -1;
public BufferEnumerator(IEnumerator<T> source, List<T> buffer)
{
this.source = source;
this.buffer = buffer;
}
public T Current
{
get { return buffer[i]; }
}
public bool MoveNext()
{
i++;
if (i < buffer.Count)
return true;
if (!source.MoveNext())
return false;
buffer.Add(source.Current);
return true;
}
public void Reset()
{
i = -1;
}
public void Dispose()
{
}
}
Usage
using (var evenNumbers = Numbers.Where(i => i % 2 == 0).Buffer())
{
...
}
Comments
The key point here is that the IEnumerable<T> source given as input to the Buffer method only has GetEnumerator called once, regardless of how many times the result of Buffer is enumerated. All enumerators for the result of Buffer share the same source enumerator and internal list.
You can use the Microsoft.FSharp.Collections.LazyList<> type from the F# power pack (yep, from C# without F# installed - no problem!) for this. It's in Nuget package FSPowerPack.Core.Community.
In particular, you want to call LazyListModule.ofSeq(...) which returns a LazyList<T> that implements IEnumerable<T> and is lazy and cached.
In your case, usage is just a matter of...
var evenNumbers = LazyListModule.ofSeq(Numbers.Where(i => i % 2 == 0));
var cachedEvenNumbers = LazyListModule.ofSeq(evenNumbers);
Though I personally prefer var in all such cases, note that this does mean the compile-time type will be more specific than just IEnumerable<> - not that this is likely to ever be a downside. Another advantage of the F# non-interface types is that they expose some efficient operations you can't do efficienly with plain IEnumerables, such as LazyListModule.skip.
I'm not sure whether LazyList is thread-safe, but I suspect it is.
Another alternative pointed out in the comments below (if you have F# installed) is SeqModule.Cache (namespace Microsoft.FSharp.Collections, it'll be in GACed assembly FSharp.Core.dll) which has the same effective behavior. Like other .NET enumerables, Seq.cache doesn't have a tail (or skip) operator you can efficiently chain.
Thread-safe: unlike other solutions to this question Seq.cache is thread-safe in the sense that you can have multiple enumerators running in parallel (each enumerator is not thread safe).
Performance I did a quick benchmark, and the LazyList enumerable has at least 4 times more overhead than the SeqModule.Cache variant, which has at least three times more overhead than the custom implementation answers. So, while the F# variants work, they're not quite as fast. Note that 3-12 times slower still isn't very slow compared to an enumerable that does (say) I/O or any non-trivial computation, so this probably won't matter most of the time, but it's good to keep in mind.
TL;DR If you need an efficient, thread-safe cached enumerable, just use SeqModule.Cache.
Building upon Eamon's answer above, here's another functional solution (no new types) that works also with simultaneous evaluation. This demonstrates that a general pattern (iteration with shared state) underlies this problem.
First we define a very general helper method, meant to allow us to simulate the missing feature of anonymous iterators in C#:
public static IEnumerable<T> Generate<T>(Func<Func<Tuple<T>>> generator)
{
var tryGetNext = generator();
while (true)
{
var result = tryGetNext();
if (null == result)
{
yield break;
}
yield return result.Item1;
}
}
Generate is like an aggregator with state. It accepts a function that returns initial state, and a generator function that would have been an anonymous with yield return in it, if it were allowed in C#. The state returned by initialize is meant to be per-enumeration, while a more global state (shared between all enumerations) can be maintained by the caller to Generate e.g. in closure variables as we'll show below.
Now we can use this for the "buffered Enumerable" problem:
public static IEnumerable<T> Cached<T>(IEnumerable<T> enumerable)
{
var cache = new List<T>();
var enumerator = enumerable.GetEnumerator();
return Generate<T>(() =>
{
int pos = -1;
return () => {
pos += 1;
if (pos < cache.Count())
{
return new Tuple<T>(cache[pos]);
}
if (enumerator.MoveNext())
{
cache.Add(enumerator.Current);
return new Tuple<T>(enumerator.Current);
}
return null;
};
});
}
I hope this answer combines the brevity and clarity of sinelaw's answer and the support for multiple enumerations of Timothy's answer:
public static IEnumerable<T> Cached<T>(this IEnumerable<T> enumerable) {
return CachedImpl(enumerable.GetEnumerator(), new List<T>());
}
static IEnumerable<T> CachedImpl<T>(IEnumerator<T> source, List<T> buffer) {
int pos=0;
while(true) {
if(pos == buffer.Count)
if (source.MoveNext())
buffer.Add(source.Current);
else
yield break;
yield return buffer[pos++];
}
}
Key ideas are to use the yield return syntax to make for a short enumerable implementation, but you still need a state-machine to decide whether you can get the next element from the buffer, or whether you need to check the underlying enumerator.
Limitations: This makes no attempt to be thread-safe, nor does it dispose the underlying enumerator (which, in general, is quite tricky to do as the underlying uncached enumerator must remain undisposed as long as any cached enumerabl might still be used).
As far as I know there is no built-in way to do this, which - now that you mention it - is slightly surprising (my guess is, given the frequency with which one would want to use this option, it was probably not worth the effort needed to analyse the code to make sure that the generator gives the exact same sequence every time).
You can however implement it yourself. The easy way would be on the call-site, as
var evenNumbers = Numbers.Where(i => i % 2 == 0).
var startOfList = evenNumbers.Take(10).ToList();
// use startOfList instead of evenNumbers in the loop.
More generally and accurately, you could do it in the generator: create a List<int> cache and every time you generate a new number add it to the cache before you yield return it. Then when you loop through again, first serve up all the cached numbers. E.g.
List<int> cachedEvenNumbers = new List<int>();
IEnumerable<int> EvenNumbers
{
get
{
int i = -1;
foreach(int cached in cachedEvenNumbers)
{
i = cached;
yield return cached;
}
// Note: this while loop now starts from the last cached value
while (true)
{
Console.WriteLine("Generating {0}.", i + 1);
yield return ++i;
}
}
}
I guess if you think about this long enough you could come up with a general implementation of a IEnumerable<T>.Buffered() extension method - again, the requirement is that the enumeration doesn't change between calls and the question is if it is worth it.
Here's an incomplete yet compact 'functional' implementation (no new types defined).
The bug is that it does not allow simultaneous enumeration.
Original description:
The first function should have been an anonymous lambda inside the second, but C# does not allow yield in anonymous lambdas:
// put these in some extensions class
private static IEnumerable<T> EnumerateAndCache<T>(IEnumerator<T> enumerator, List<T> cache)
{
while (enumerator.MoveNext())
{
var current = enumerator.Current;
cache.Add(current);
yield return current;
}
}
public static IEnumerable<T> ToCachedEnumerable<T>(this IEnumerable<T> enumerable)
{
var enumerator = enumerable.GetEnumerator();
var cache = new List<T>();
return cache.Concat(EnumerateAndCache(enumerator, cache));
}
Usage:
var enumerable = Numbers.ToCachedEnumerable();
Full credit to Eamon Nerbonne and sinelaw for their answers, just a couple of tweaks! First, to release the enumerator when it is completed. Secondly to protect the underlying enumerator with a lock so the enumerable can be safely used on multiple threads.
// This is just the same as #sinelaw's Generator but I didn't like the name
public static IEnumerable<T> AnonymousIterator<T>(Func<Func<Tuple<T>>> generator)
{
var tryGetNext = generator();
while (true)
{
var result = tryGetNext();
if (null == result)
{
yield break;
}
yield return result.Item1;
}
}
// Cached/Buffered/Replay behaviour
public static IEnumerable<T> Buffer<T>(this IEnumerable<T> self)
{
// Rows are stored here when they've been fetched once
var cache = new List<T>();
// This counter is thread-safe in that it is incremented after the item has been added to the list,
// hence it will never give a false positive. It may give a false negative, but that falls through
// to the code which takes the lock so it's ok.
var count = 0;
// The enumerator is retained until it completes, then it is discarded.
var enumerator = self.GetEnumerator();
// This lock protects the enumerator only. The enumerable could be used on multiple threads
// and the enumerator would then be shared among them, but enumerators are inherently not
// thread-safe so a) we must protect that with a lock and b) we don't need to try and be
// thread-safe in our own enumerator
var lockObject = new object();
return AnonymousIterator<T>(() =>
{
int pos = -1;
return () =>
{
pos += 1;
if (pos < count)
{
return new Tuple<T>(cache[pos]);
}
// Only take the lock when we need to
lock (lockObject)
{
// The counter could have been updated between the check above and this one,
// so now we have the lock we must check again
if (pos < count)
{
return new Tuple<T>(cache[pos]);
}
// Enumerator is set to null when it has completed
if (enumerator != null)
{
if (enumerator.MoveNext())
{
cache.Add(enumerator.Current);
count += 1;
return new Tuple<T>(enumerator.Current);
}
else
{
enumerator = null;
}
}
}
}
return null;
};
});
}
I use the following extension method.
This way, the input is read at maximum speed, and the consumer processes at maximum speed.
public static IEnumerable<T> Buffer<T>(this IEnumerable<T> input)
{
var blockingCollection = new BlockingCollection<T>();
//read from the input
Task.Factory.StartNew(() =>
{
foreach (var item in input)
{
blockingCollection.Add(item);
}
blockingCollection.CompleteAdding();
});
foreach (var item in blockingCollection.GetConsumingEnumerable())
{
yield return item;
}
}
Example Usage
This example has a fast producer (find files), and a slow consumer (upload files).
long uploaded = 0;
long total = 0;
Directory
.EnumerateFiles(inputFolder, "*.jpg", SearchOption.AllDirectories)
.Select(filename =>
{
total++;
return filename;
})
.Buffer()
.ForEach(filename =>
{
//pretend to do something slow, like upload the file.
Thread.Sleep(1000);
uploaded++;
Console.WriteLine($"Uploaded {uploaded:N0}/{total:N0}");
});

How to populate an IEnumerable with consecutive uints

Is this a clean and correct way to generate a list of consecutive uints?
The cast looks kind of ugly, but I'm a beginner...might be there is a method without casting around?
public class Test
{
static readonly IEnumerable<uint> AvailableChannels
= (IEnumerable<uint>)Enumerable.Range(1,1000);
}
static readonly IEnumerable<uint> AvailableChannels
= Enumerable.Range(1,1000)
.Select(i => (uint)i)
.ToList();
It's still a cast though ...
EDIT
The .ToList() is so the full list doesn't need to be recreated every time you loop over it. (OK, a 1000 uints isn't much, but it's the principle of it - if they were classes you would create new ones every time and get unexpected results, like lost changes)
EDIT2
The Cast<uint>() doesn't work at runtime ("Specified cast is not valid"). Changed to a .Select to perform the cast.
You can write your own enumerable method :
public static IEnumerable<uint> Foo(
uint startValue =0,
uint maxValue = uint.MaxValue
)
{
uint index = startValue;
while(index < maxValue) {
yield return index++;
}
}
public static void Main()
{
var myUints = Foo().Take(100);
var myUints2 = Foo(startValue:0, maxValue:1000);
var myUints3 = Foo(0, 1000);
foreach(uint x in myUints) {
Console.WriteLine(x);
}
}
A side note: If performance is a critical point in your application, you may read this question : Why is Enumerable.Range faster than a direct yield loop? (especially the answer marked as the answer)

Closures and java anonymous inner classes

Would anyone be so kind to post the equivalent Java code for a closure like this one (obtained using C#) with anonymous inner classes?
public static Func<int, int> IncrementByN()
{
int n = 0; // n is local to the method
Func<int, int> increment = delegate(int x)
{
n++;
return x + n;
};
return increment;
}
static void Main(string[] args)
{
var v = IncrementByN();
Console.WriteLine(v(5)); // output 6
Console.WriteLine(v(6)); // output 8
}
Furthermore, can anyone explain how partial applications can be obtained if lexical closures are available and viceversa? For this second question, C# would be appreciated but it's your choice.
Thanks so much.
There is no closure yet in Java. Lambda expressions are coming in java 8. However, the only issue with what you're trying to translate is that it has state, which not something that lamba expressions will support i don't think. Keep in mind, it's really just a shorthand so that you can easily implement single method interfaces. You can however still simulate this I believe:
final AtomicInteger n = new AtomicInteger(0);
IncrementByN v = (int x) -> x + n.incrementAndGet();
System.out.println(v.increment(5));
System.out.println(v.increment(6));
I have not tested this code though, it's just meant as an example of what might possibly work in java 8.
Think of the collections api. Let's say they have this interface:
public interface CollectionMapper<S,T> {
public T map(S source);
}
And a method on java.util.Collection:
public interface Collection<K> {
public <T> Collection<T> map(CollectionMapper<K,T> mapper);
}
Now, let's see that without closures:
Collection<Long> mapped = coll.map(new CollectionMapper<Foo,Long>() {
public Long map(Foo foo) {
return foo.getLong();
}
}
Why not just write this:
Collection<Long> mapped = ...;
for (Foo foo : coll) {
mapped.add(foo.getLong());
}
Much more concise right?
Now introduce lambdas:
Collection<Long> mapped = coll.map( (Foo foo) -> foo.getLong() );
See how much nicer the syntax is? And you can chain it too (we'll assume there's an interface to do filtering which which returns boolean values to determine whether to filter out a value or not):
Collection<Long> mappedAndFiltered =
coll.map( (Foo foo) -> foo.getLong() )
.filter( (Long val) -> val.longValue() < 1000L );
This code is equivalent I believe (at least it produces the desired output):
public class Test {
static interface IncrementByN {
int increment(int x);
}
public static void main(String[] args) throws InterruptedException {
IncrementByN v = new IncrementByN() { //anonymous class
int n = 0;
#Override
public int increment(int x) {
n++;
return x + n;
}
};
System.out.println(v.increment(5)); // output 6
System.out.println(v.increment(6)); // output 8
}
}
Assuming we have a generic function interface:
public interface Func<A, B> {
B call A();
}
Then we can write it like this:
public class IncrementByN {
public static Func<Integer, Integer> IncrementByN()
{
final int n_outer = 0; // n is local to the method
Func<Integer, Integer> increment = new Func<Integer, Integer>() {
int n = n_outer; // capture it into a non-final instance variable
// we can really just write int n = 0; here
public Integer call(Integer x) {
n++;
return x + n;
}
};
return increment;
}
public static void main(String[] args) {
Func<Integer, Integer> v = IncrementByN();
System.out.println(v.call(5)); // output 6
System.out.println(v.call(6)); // output 8
}
}
Some notes:
In your program, you capture the variable n by reference from the enclosing scope, and can modify that variable from the closure. In Java, you can only capture final variables (thus capture is only by value).
What I did here is capture the final variable from the outside, and then assign it into a non-final instance variable inside the anonymous class. This allows "passing info" into the closure and at the same time having it be assignable inside the closure. However, this information flow only works "one way" -- changes to n inside the closure is not reflected in the enclosing scope. This is appropriate for this example because that local variable in the method is not used again after being captured by the closure.
If, instead, you want to be able to pass information "both ways", i.e. have the closure also be able to change things in the enclosing scope, and vice versa, you will need to instead capture a mutable data structure, like an array, and then make changes to elements inside that. That is uglier, and is rarer to need to do.

How do I create an IComparer for a Nunit CollectionAssert test?

I wish to create the following test in NUnit for the following scenario: we wish to test the a new calculation method being created yields results similar to that of an old system. An acceptable difference (or rather a redefinition of equality) between all values has been defined as
abs(old_val - new_val) < 0.0001
I know that I can loop through every value from the new list and compare to values from the old list and test the above condition.
How would achieve this using Nunit's CollectionAssert.AreEqual method (or some CollectionAssert method)?
The current answers are outdated. Since NUnit 2.5, there is an overload of CollectionAssert.AreEqual that takes a System.Collections.IComparer.
Here is a minimal implementation:
public class Comparer : System.Collections.IComparer
{
private readonly double _epsilon;
public Comparer(double epsilon)
{
_epsilon = epsilon;
}
public int Compare(object x, object y)
{
var a = (double)x;
var b = (double)y;
double delta = System.Math.Abs(a - b);
if (delta < _epsilon)
{
return 0;
}
return a.CompareTo(b);
}
}
[NUnit.Framework.Test]
public void MyTest()
{
var a = ...
var b = ...
NUnit.Framework.CollectionAssert.AreEqual(a, b, new Comparer(0.0001));
}
Well there is method from the NUnit Framework that allows me to do tolerance checks on collections. Refer to the Equal Constraint. One uses the AsCollection and Within extension methods. On that note though I am not 100% sure regarding the implications of this statement made
If you want to treat the arrays being compared as simple collections,
use the AsCollection modifier, which causes the comparison to be made
element by element, without regard for the rank or dimensions of the
array.
[Test]
//[ExpectedException()]
public void CheckLists_FailsAt0()
{
var expected = new[] { 0.0001, 0.4353245, 1.3455234, 345345.098098 };
var result1 = new[] { -0.0004, 0.43520, 1.3454, 345345.0980 };
Assert.That(result1, Is.EqualTo(expected).AsCollection.Within(0.0001), "fail at [0]"); // fail on [0]
}
[Test]
//[ExpectedException()]
public void CheckLists_FailAt1()
{
var expected = new[] { 0.0001, 0.4353245, 1.3455234, 345345.098098 };
var result1a = new[] { 0.0001000000 , 0.4348245000 , 1.3450234000 , 345345.0975980000 };
Assert.That(result1a, Is.EqualTo(expected).AsCollection.Within(0.0001), "fail at [1]"); // fail on [3]
}
[Test]
public void CheckLists_AllPass_ForNegativeDiff_of_1over10001()
{
var expected = new[] { 0.0001, 0.4353245, 1.3455234, 345345.098098 };
var result2 = new[] { 0.00009900 , 0.43532350 , 1.34552240 , 345345.09809700 };
Assert.That(result2, Is.EqualTo(expected).AsCollection.Within(0.0001)); // pass
}
[Test]
public void CheckLists_StillPass_ForPositiveDiff_of_1over10001()
{
var expected = new[] { 0.0001, 0.4353245, 1.3455234, 345345.098098 };
var result3 = new[] { 0.00010100 , 0.43532550 , 1.34552440 , 345345.09809900 };
Assert.That(result3, Is.EqualTo(expected).AsCollection.Within(0.0001)); // pass
}
NUnit does not define any delegate object or interface to perform custom checks to lists, and determine that a expected result is valid.
But I think that the best and simplest option is writing a small static method that achieve your checks:
private const float MIN_ACCEPT_VALUE = 0.0001f;
public static void IsAcceptableDifference(IList collection, IList oldCollection)
{
if (collection == null)
throw new Exception("Source collection is null");
if (oldCollection == null)
throw new Exception("Old collection is null");
if (collection.Count != oldCollection.Count)
throw new Exception("Different lenghts");
for (int i = 0; i < collection.Count; i++)
{
float newValue = (float)collection[i];
float oldValue = (float)oldCollection[i];
float difference = Math.Abs(oldValue - newValue);
if (difference < MIN_ACCEPT_VALUE)
{
throw new Exception(
string.Format(
"Found a difference of {0} at index {1}",
difference,
i));
}
}
}
You've asked how to achieve your desired test using a CollectionAssert method without looping through the list. I'm sure this is obvious, but looping is exactly what such a method would do...
The short answer to your exact question is that you can't use CollectionAssert methods to do what you want. However, if what you really want is an easy way to compare lists of floating point numbers and assert their equality, then read on.
The method Assert.AreEqual( double expected, double actual, double tolerance ) releases you from the need to write the individual item assertions yourself. Using LINQ, you could do something like this:
double delta = 0.0001;
IEnumerable<double> expectedValues;
IEnumerable<double> actualValues;
// code code code
foreach (var pair in expectedValues.Zip(actualValues, Tuple.Create))
{
Assert.AreEqual(pair.Item1, pair.Item2, delta, "Collections differ.");
}
If you wanted to get fancier, you could pull this out into a method of its own, catch the AssertionException, massage it and rethrow it for a cleaner interface.
If you don't care about which items differ:
var areEqual = expectedValues
.Zip(actualValues, Tuple.Create)
.Select(tup => Math.Abs(tup.Item1 - tup.Item2) < delta)
.All(b => b);
Assert.IsTrue(areEqual, "Collections differ.");

Categories