Clean way to reduce many TimeSpans into fewer, average TimeSpans? - c#

I have a C# Queue<TimeSpan> containing 500 elements.
I need to reduce those into 50 elements by taking groups of 10 TimeSpans and selecting their average.
Is there a clean way to do this? I'm thinking LINQ will help, but I can't figure out a clean way. Any ideas?

I would use the Chunk function and a loop.
foreach(var set in source.ToList().Chunk(10)){
target.Enqueue(TimeSpan.FromMilliseconds(
set.Average(t => t.TotalMilliseconds)));
}
Chunk is part of my standard helper library.
http://clrextensions.codeplex.com/
Source for Chunk
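On .NET 6 and later, a comparable Chunk method ships in System.Linq (Enumerable.Chunk), so the helper library may not be needed. A minimal sketch under that assumption; TimeSpanReducer and AverageChunks are illustrative names, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimeSpanReducer
{
    // Averages each consecutive group of `size` TimeSpans.
    // A final, shorter group (if any) is averaged over its own length.
    public static Queue<TimeSpan> AverageChunks(IEnumerable<TimeSpan> source, int size)
    {
        var target = new Queue<TimeSpan>();
        foreach (TimeSpan[] set in source.Chunk(size)) // Enumerable.Chunk, .NET 6+
        {
            // Average() over long ticks returns a double, so cast back to long
            target.Enqueue(TimeSpan.FromTicks((long)set.Average(t => t.Ticks)));
        }
        return target;
    }
}
```

Calling TimeSpanReducer.AverageChunks(queue, 10) on the 500-element queue should yield 50 averages without modifying the original queue.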

Take a look at the .Skip() and .Take() extension methods to partition your queue into sets. You can then use .Average(t => t.Ticks) to get the new TimeSpan that represents the average. Just jam each of those 50 averages into a new Queue and you are good to go.
Queue<TimeSpan> allTimeSpans = GetQueueOfTimeSpans();
Queue<TimeSpan> averages = new Queue<TimeSpan>(50);
int partitionSize = 10;
for (int i = 0; i < 50; i++) {
    // Average() over long ticks returns a double, so cast back before building the TimeSpan
    double avg = allTimeSpans.Skip(i * partitionSize).Take(partitionSize).Average(t => t.Ticks);
    averages.Enqueue(new TimeSpan((long)avg));
}
I'm a VB.NET guy, so there may be some syntax that isn't 100% right in that example. Let me know and I'll fix it!

Probably nothing beats a good old procedural execution in a method call in this case. It's not fancy, but it's easy, and it can be maintained by Jr. level devs.
public static Queue<TimeSpan> CompressTimeSpan(Queue<TimeSpan> original, int interval)
{
    Queue<TimeSpan> newQueue = new Queue<TimeSpan>();
    int current = 0;
    TimeSpan runningTotal = TimeSpan.Zero;
    // Dequeue inside the loop so that the last element is also counted
    while (original.Count > 0)
    {
        runningTotal += original.Dequeue();
        if (++current >= interval)
        {
            newQueue.Enqueue(TimeSpan.FromTicks(runningTotal.Ticks / interval));
            runningTotal = TimeSpan.Zero;
            current = 0;
        }
    }
    // Average whatever is left over as a final, shorter group
    if (current > 0)
        newQueue.Enqueue(TimeSpan.FromTicks(runningTotal.Ticks / current));
    return newQueue;
}

You could just use
static public TimeSpan[] Reduce(TimeSpan[] spans, int blockLength)
{
TimeSpan[] avgSpan = new TimeSpan[spans.Length / blockLength];
int currentIndex = 0;
for (int outputIndex = 0;
outputIndex < avgSpan.Length;
outputIndex++)
{
long totalTicks = 0;
for (int sampleIndex = 0; sampleIndex < blockLength; sampleIndex++)
{
totalTicks += spans[currentIndex].Ticks;
currentIndex++;
}
avgSpan[outputIndex] =
TimeSpan.FromTicks(totalTicks / blockLength);
}
return avgSpan;
}
It's a little more verbose (it doesn't use LINQ), but it's pretty easy to see what it's doing. (You can convert a Queue to and from an array pretty easily.)

I'd use a loop, but just for fun:
IEnumerable<TimeSpan> AverageClumps(Queue<TimeSpan> lots, int clumpSize)
{
while (lots.Any())
{
var portion = Math.Min(clumpSize, lots.Count);
yield return Enumerable.Range(1, portion).Aggregate(TimeSpan.Zero,
(t, x) => t.Add(lots.Dequeue()),
(t) => new TimeSpan(t.Ticks / portion));
}
}
That only examines each element once, so the performance is a lot better than the other LINQ offerings. Unfortunately, it mutates the queue, but maybe it's a feature and not a bug?
It does have the nice bonus of being an iterator, so it gives you the averages one at a time.

Zipping it together with the integers (0..n) and grouping by the sequence number div 10?
I'm not a linq user, but I believe it would look something like this:
for (n,item) from Enumerable.Range(0, queue.length).zip(queue) group by n/10
The take(10) solution is probably better.
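The zip-and-group idea can be written in real LINQ roughly as follows. This is only a sketch: ZipGrouping and ZipGroupAverage are made-up names, and the input is assumed to be any counted collection of TimeSpan values, such as the queue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipGrouping
{
    public static List<TimeSpan> ZipGroupAverage(IReadOnlyCollection<TimeSpan> queue, int groupSize)
    {
        return Enumerable.Range(0, queue.Count)
            // Pair each item with its position in the sequence
            .Zip(queue, (n, item) => new { n, item })
            // Integer division: with groupSize 10, positions 0..9 map to key 0, 10..19 to key 1, ...
            .GroupBy(p => p.n / groupSize, p => p.item.Ticks)
            .Select(g => TimeSpan.FromTicks((long)g.Average()))
            .ToList();
    }
}
```

Unlike the dequeuing approaches, this enumerates the queue without removing anything from it.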

How is the grouping going to be performed?
Assuming something very simple (take 10 at a time ), you can start with something like:
List<TimeSpan> input = Enumerable.Range(0, 500)
.Select(i => new TimeSpan(0, 0, i))
.ToList();
var res = input.Select((t, i) => new { time=t.Ticks, index=i })
.GroupBy(v => v.index / 10, v => v.time)
.Select(g => new TimeSpan((long)g.Average()));
int n = 0;
foreach (var t in res) {
Console.WriteLine("{0,3}: {1}", ++n, t);
}
Notes:
Uses the overload of Select that supplies the index, then integer division on that index to pick up groups of 10. Modulus could be used instead to put every 10th element into one group, every (10th + 1) into another, and so on.
The result of the grouping is a sequence of groups, each with a Key property, but only the grouped sequences themselves are needed here.
There is no Enumerable.Average overload for IEnumerable<TimeSpan> so use Ticks (a long).
EDIT: Take groups of 10 to fit better with question.
EDIT2: Now with tested code.

Related

Parallel.ForEach slower than normal foreach

I'm playing around with Parallel.ForEach in a C# console application, but can't seem to get it right. I'm creating an array of random numbers, and I have a sequential foreach and a Parallel.ForEach that each find the largest value in the array. With approximately the same code in C++, I started to see a payoff from using several threads at 3M values in the array. But the Parallel.ForEach is twice as slow even at 100M values. What am I doing wrong?
class Program
{
static void Main(string[] args)
{
dostuff();
}
static void dostuff() {
Console.WriteLine("How large do you want the array to be?");
int size = int.Parse(Console.ReadLine());
int[] arr = new int[size];
Random rand = new Random();
for (int i = 0; i < size; i++)
{
arr[i] = rand.Next(0, int.MaxValue);
}
var watchSeq = System.Diagnostics.Stopwatch.StartNew();
var largestSeq = FindLargestSequentially(arr);
watchSeq.Stop();
var elapsedSeq = watchSeq.ElapsedMilliseconds;
Console.WriteLine("Finished sequential in: " + elapsedSeq + "ms. Largest = " + largestSeq);
var watchPar = System.Diagnostics.Stopwatch.StartNew();
var largestPar = FindLargestParallel(arr);
watchPar.Stop();
var elapsedPar = watchPar.ElapsedMilliseconds;
Console.WriteLine("Finished parallel in: " + elapsedPar + "ms Largest = " + largestPar);
dostuff();
}
static int FindLargestSequentially(int[] arr) {
int largest = arr[0];
foreach (int i in arr) {
if (largest < i) {
largest = i;
}
}
return largest;
}
static int FindLargestParallel(int[] arr) {
int largest = arr[0];
Parallel.ForEach<int, int>(arr, () => 0, (i, loop, subtotal) =>
{
if (i > subtotal)
subtotal = i;
return subtotal;
},
(finalResult) => {
Console.WriteLine("Thread finished with result: " + finalResult);
if (largest < finalResult) largest = finalResult;
}
);
return largest;
}
}
This is the performance cost of having a very small delegate body: the overhead of invoking the delegate for every element outweighs the comparison itself.
We can achieve better performance by partitioning, so that each invocation of the body delegate processes a whole range of elements.
static int FindLargestParallelRange(int[] arr)
{
object locker = new object();
int largest = arr[0];
Parallel.ForEach(Partitioner.Create(0, arr.Length), () => arr[0], (range, loop, subtotal) =>
{
for (int i = range.Item1; i < range.Item2; i++)
if (arr[i] > subtotal)
subtotal = arr[i];
return subtotal;
},
(finalResult) =>
{
lock (locker)
if (largest < finalResult)
largest = finalResult;
});
return largest;
}
Remember to synchronize the localFinally delegate. Also note the need for proper initialization in localInit: () => arr[0] instead of () => 0.
Partitioning with PLINQ:
static int FindLargestPlinqRange(int[] arr)
{
return Partitioner.Create(0, arr.Length)
.AsParallel()
.Select(range =>
{
int largest = arr[0];
for (int i = range.Item1; i < range.Item2; i++)
if (arr[i] > largest)
largest = arr[i];
return largest;
})
.Max();
}
I highly recommend the free book Patterns of Parallel Programming by Stephen Toub.
As the other answerers have mentioned, the action you're trying to perform against each item here is so insignificant that there are a variety of other factors which end up carrying more weight than the actual work you're doing. These may include:
JIT optimizations
CPU branch prediction
I/O (outputting thread results while the timer is running)
the cost of invoking delegates
the cost of task management
the system incorrectly guessing what thread strategy will be optimal
memory/cpu caching
memory pressure
environment (debugging)
etc.
Running each approach a single time is not an adequate way to test, because it enables a number of the above factors to weigh more heavily on one iteration than on another. You should start with a more robust benchmarking strategy.
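One way to make such a measurement more robust is to warm up first, repeat the run several times, and report the middle sample. A rough sketch, not a substitute for a dedicated benchmarking library; Bench and MedianMilliseconds are hypothetical names:

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Runs `action` a few times untimed (so JIT compilation and cache warm-up
    // are not measured), then times `runs` executions and returns the middle
    // sample of the sorted timings (the median when `runs` is odd).
    public static double MedianMilliseconds(Action action, int warmups = 3, int runs = 11)
    {
        for (int i = 0; i < warmups; i++)
            action();

        var samples = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            samples[i] = watch.Elapsed.TotalMilliseconds;
        }
        Array.Sort(samples);
        return samples[runs / 2];
    }
}
```

For example, Bench.MedianMilliseconds(() => FindLargestSequentially(arr)) could be compared against the same call for the parallel version, ideally in a release build without the debugger attached and with any Console output kept out of the timed action.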
Furthermore, your implementation is actually dangerously incorrect. The documentation specifically says:
The localFinally delegate is invoked once per task to perform a final action on each task’s local state. This delegate might be invoked concurrently on multiple tasks; therefore, you must synchronize access to any shared variables.
You have not synchronized your final delegate, so your function is prone to race conditions that would make it produce incorrect results.
As in most cases, the best approach to this one is to take advantage of work done by people smarter than we are. In my testing, the following approach appears to be the fastest overall:
return arr.AsParallel().Max();
The Parallel.ForEach loop runs slower because the algorithm used is not parallel, and a lot more work is being done to run it.
On a single thread, to find the max value, we take the first number as our max and compare it to every other number in the array. If one of the numbers is larger than our current max, we swap and continue. This way we access each number in the array once, for a total of N comparisons.
In the Parallel.ForEach loop above, the algorithm creates overhead because each operation is wrapped inside a function call with a return value. So in addition to doing the comparisons, it pays the overhead of pushing and popping these calls on the call stack. In addition, since each call depends on the value returned by the call before it, it needs to run in sequence.
In the Parallel.For loop below, the array is divided into an explicit number of partitions determined by the variable threadNumber. This limits the overhead of function calls to a low number.
Note that for small arrays the parallel loop performs slower. However, for 100M values there is a decrease in elapsed time.
static int FindLargestParallel(int[] arr)
{
var answers = new ConcurrentBag<int>();
int threadNumber = 4;
int partitionSize = arr.Length/threadNumber;
Parallel.For(0, /* starting number */
threadNumber+1, /* Adding 1 to threadNumber in case array.Length not evenly divisible by threadNumber */
i =>
{
if (i*partitionSize < arr.Length) /* check in case # in array is divisible by # threads */
{
var max = arr[i*partitionSize];
for (var x = i*partitionSize;
x < (i + 1)*partitionSize && x < arr.Length;
++x)
{
if (arr[x] > max)
max = arr[x];
}
answers.Add(max);
}
});
/* note the shortcut in finding max in the bag */
return answers.Max(i=>i);
}
Some thoughts here: In the parallel case, there is thread management logic involved that determines how many threads to use. This thread management logic presumably runs on your main thread. Every time a thread returns with the new maximum value, the management logic kicks in and determines the next work item (the next number to process in your array). I'm pretty sure that this requires some kind of locking. In any case, determining the next item may even cost more than performing the comparison operation itself.
That sounds like an order of magnitude more work (overhead) to me than a single thread that processes one number after the other. In the single-threaded case there are a number of optimizations at play: no bounds checks, the CPU can load data into its first-level cache, etc. I'm not sure which of these optimizations apply in the parallel case.
Keep in mind that on a typical desktop machine there are only 2 to 4 physical CPU cores available, so you will never have more than that actually doing work. So if the parallel processing overhead is more than 2-4 times that of a single-threaded operation, the parallel version will inevitably be slower, which is what you are observing.
Have you attempted to run this on a 32 core machine? ;-)
A better solution would be to determine non-overlapping ranges (start + stop index) covering the entire array and let each parallel task process one range. This way, each parallel task can internally do a tight single-threaded loop and only return once the entire range has been processed. You could probably even determine a near-optimal number of ranges based on the number of logical cores of the machine. I haven't tried this, but I'm pretty sure you will see an improvement over the single-threaded case.
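A sketch of that range idea, with the range size derived from Environment.ProcessorCount. Whether one range per logical core is actually near optimal is an assumption here, not a measured result, and RangeMax and FindLargest are made-up names:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class RangeMax
{
    public static int FindLargest(int[] arr)
    {
        if (arr == null || arr.Length == 0)
            throw new ArgumentException("Array must not be empty.", nameof(arr));

        // Assumption: roughly one range per logical core (rounded up, at least 1 element)
        int cores = Environment.ProcessorCount;
        int rangeSize = Math.Max(1, (arr.Length + cores - 1) / cores);

        int largest = arr[0];
        object locker = new object();

        Parallel.ForEach(Partitioner.Create(0, arr.Length, rangeSize), range =>
        {
            // Tight single-threaded loop over this task's own range
            int localMax = arr[range.Item1];
            for (int i = range.Item1 + 1; i < range.Item2; i++)
                if (arr[i] > localMax)
                    localMax = arr[i];

            // Shared state is touched only once per range, under a lock
            lock (locker)
            {
                if (localMax > largest)
                    largest = localMax;
            }
        });
        return largest;
    }
}
```

Because each range takes the lock only once, contention on the shared maximum should be negligible compared with the per-element version.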
Try splitting the set into batches and running the batches in parallel, where the number of batches corresponds to your number of CPU cores.
I ran some equations 1K, 10K and 1M times using the following methods:
A "for" loop.
A "Parallel.For" from the System.Threading.Tasks lib, across the entire set.
A "Parallel.For" across 4 batches.
A "Parallel.ForEach" from the System.Threading.Tasks lib, across the entire set.
A "Parallel.ForEach" across 4 batches.
Results: (Measured in seconds)
Conclusion:
Processing batches in parallel using the "Parallel.ForEach" has the best outcome in cases above 10K records. I believe the batching helps because it utilizes all CPU cores (4 in this example), but also minimizes the amount of threading overhead associated with parallelization.
Here is my code:
public void ParallelSpeedTest()
{
var rnd = new Random(56);
int range = 1000000;
int numberOfCores = 4;
int batchSize = range / numberOfCores;
int[] rangeIndexes = Enumerable.Range(0, range).ToArray();
double[] inputs = rangeIndexes.Select(n => rnd.NextDouble()).ToArray();
double[] weights = rangeIndexes.Select(n => rnd.NextDouble()).ToArray();
double[] outputs = new double[rangeIndexes.Length];
/// Series "for"...
var startTimeSeries = DateTime.Now;
for (var i = 0; i < range; i++)
{
outputs[i] = Math.Sqrt(Math.Pow(inputs[i] * weights[i], 2));
}
var durationSeries = DateTime.Now - startTimeSeries;
/// "Parallel.For"...
var startTimeParallel = DateTime.Now;
Parallel.For(0, range, (i) => {
outputs[i] = Math.Sqrt(Math.Pow(inputs[i] * weights[i], 2));
});
var durationParallelFor = DateTime.Now - startTimeParallel;
/// "Parallel.For" in Batches...
var startTimeParallel2 = DateTime.Now;
Parallel.For(0, numberOfCores, (c) => {
var endValue = (c == numberOfCores - 1) ? range : (c + 1) * batchSize;
var startValue = c * batchSize;
for (var i = startValue; i < endValue; i++)
{
outputs[i] = Math.Sqrt(Math.Pow(inputs[i] * weights[i], 2));
}
});
var durationParallelForBatches = DateTime.Now - startTimeParallel2;
/// "Parallel.ForEach"...
var startTimeParallelForEach = DateTime.Now;
Parallel.ForEach(rangeIndexes, (i) => {
outputs[i] = Math.Sqrt(Math.Pow(inputs[i] * weights[i], 2));
});
var durationParallelForEach = DateTime.Now - startTimeParallelForEach;
/// Parallel.ForEach in Batches...
List<Tuple<int,int>> ranges = new List<Tuple<int, int>>();
for (var i = 0; i < numberOfCores; i++)
{
int start = i * batchSize;
int end = (i == numberOfCores - 1) ? range : (i + 1) * batchSize;
ranges.Add(new Tuple<int,int>(start, end));
}
var startTimeParallelBatches = DateTime.Now;
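Parallel.ForEach(ranges, (r) => {
    // Named r because a lambda parameter cannot reuse the name of the outer local `range`;
    // the loop runs from Item1 up to (but not including) Item2
    for (var i = r.Item1; i < r.Item2; i++) {
        outputs[i] = Math.Sqrt(Math.Pow(inputs[i] * weights[i], 2));
    }
});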
var durationParallelForEachBatches = DateTime.Now - startTimeParallelBatches;
Debug.Print($"=================================================================");
Debug.Print($"Given: Set-size: {range}, number-of-batches: {numberOfCores}, batch-size: {batchSize}");
Debug.Print($".................................................................");
Debug.Print($"Series For: {durationSeries}");
Debug.Print($"Parallel For: {durationParallelFor}");
Debug.Print($"Parallel For Batches: {durationParallelForBatches}");
Debug.Print($"Parallel ForEach: {durationParallelForEach}");
Debug.Print($"Parallel ForEach Batches: {durationParallelForEachBatches}");
Debug.Print($"");
}

Calculate number of hours filtered by jobNum

I'm creating a string[] array from an XML file that lists every job number assigned to a timekeeping application. I can successfully return string[] from the XML files. I'm trying to match the array for time with the array position for job number, and I'm having difficulty. I'm hoping someone here has the patience to help and/or direct a newbie to good information already posted somewhere.
Here is what I have so far. I just can't seem to sew the pieces together. Get the max occurrences of a job number:
public static string totalHoursbyJob()
{
int count = 0;
if (jobNum.Length > count)
{
var items = jobNum.Distinct();
count = items.Count();
foreach (string value in items)
{
string[,] itemArr = new string[count, Convert.ToInt32(jobNum)];
}
}
}
This gets the time component and calculates the values, but it does not filter by job number. It accurately calculates the values found in the .innerText of the nodes in the xml file.
for (i = 0; i < ticks.Length; i++)
{
ticksInt = double.Parse(ticks[i]);
if (ticksInt > 1)
{
double small = ((((ticksInt / 10000) / 1000) / 60) / 60);
sum2 += small;
}
}
Can somebody please point me to what I'm doing wrong here? Thank you in advance. I really appreciate you stopping by to look and help. :-D Have a Great Day
EDIT1 Cleared an error!! Yay
EDIT2 Thank you user910683. I removed the code that does nothing at the moment and modified the code that creates the comparative array. Please see next.
if (jobNum.Length > count)
{
string[] items = jobNum.Distinct().ToArray();//This change made to clear
//error at items.length below
count = items.Count();
//itemArr.Join(++items);
foreach (string value in items)
{
string[,] itemArr = new string[count, items.Length];
}
}
for (jn = 0; jn < jobNum.Length; jn++)
{
string test = jobNum.ToString();
}
You seem to have a logic error here. You're not using jn within the loop, so each time the loop is executed you're setting test to jobNum.ToString(). You're doing the same thing over and over again, and you're not using the 'test' string for anything. Is this what you want?
Also, consider this line from the previous block:
string[,] itemArr = new string[count, Convert.ToInt32(jobNum)];
You have removed the exception here by converting jobNum to an Int32, but is this what you want? I would guess that you actually want the number of strings in jobNum, in which case you should use jobNum.Length.
UPDATE
I think I have a better sense of what you want now. Do you want a multidimensional array matching a job number string to the amount of time spent on that job? If so, change:
foreach (string value in items)
{
string[,] itemArr = new string[count, Convert.ToInt32(jobNum)];
}
to something like:
string[] items = jobNum.Distinct().ToArray();
string[,] itemArr = new string[items.Length, 2];
for (int x = 0; x < items.Length; x++)
{
    itemArr[x, 0] = items[x]; // column 0: job number; column 1 is left for the hours
}
Then with the ticks, change the ticks conversion from this:
double small = ((((ticksInt / 10000) / 1000) / 60) / 60);
sum2 += small;
to something like:
sum2 += TimeSpan.FromTicks(long.Parse(ticks[i])).TotalHours;
I'd have to see the declaration and initialisation of jobNum and ticks to explain how to put sum2 into your job/time array.
You might consider using something like XmlSerializer instead, it seems like you're doing a lot of manual work with your XML-derived data, you might be able to simplify it for yourself.
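Putting the pieces together, the totals per job number could be collected in a dictionary rather than a two-dimensional string array. This is a sketch under assumptions that the question does not confirm: jobNum and ticks are parallel string[] arrays of equal length, and every ticks entry parses as a whole number of ticks. JobHours and HoursByJob are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class JobHours
{
    public static Dictionary<string, double> HoursByJob(string[] jobNum, string[] ticks)
    {
        if (jobNum.Length != ticks.Length)
            throw new ArgumentException("jobNum and ticks must have the same length.");

        var totals = new Dictionary<string, double>();
        for (int i = 0; i < jobNum.Length; i++)
        {
            // Position i in ticks belongs to the job number at position i
            double hours = TimeSpan.FromTicks(long.Parse(ticks[i])).TotalHours;

            // soFar is 0 when this job number has not been seen yet
            totals.TryGetValue(jobNum[i], out double soFar);
            totals[jobNum[i]] = soFar + hours;
        }
        return totals;
    }
}
```

Each key of the returned dictionary is a distinct job number, and its value is the sum of the hours recorded at the matching positions.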

What's the most concise way to pick a random element by weight in C#?

Let's assume a List<Element>, where Element is:
public class Element {
    public int Weight { get; set; }
}
What I want to achieve is to select an element randomly, by weight.
For example:
Element_1.Weight = 100;
Element_2.Weight = 50;
Element_3.Weight = 200;
So
the chance Element_1 got selected is 100/(100+50+200)=28.57%
the chance Element_2 got selected is 50/(100+50+200)=14.29%
the chance Element_3 got selected is 200/(100+50+200)=57.14%
I know I can create a loop, calculate total, etc...
What I want to learn is: what's the best way to do this with LINQ in ONE line (or as short as possible)? Thanks.
UPDATE
I found my answer below. The first thing I learned is: LINQ is NOT magic; it's slower than a well-designed loop.
So my question becomes: how to find a random element by weight (without the "as short as possible" requirement :)
If you want a generic version (useful with a singleton randomizer helper; consider whether you need a constant seed or not):
usage:
randomizer.GetRandomItem(items, x => x.Weight)
code:
public T GetRandomItem<T>(IEnumerable<T> itemsEnumerable, Func<T, int> weightKey)
{
var items = itemsEnumerable.ToList();
var totalWeight = items.Sum(x => weightKey(x));
var randomWeightedIndex = _random.Next(totalWeight);
var itemWeightedIndex = 0;
foreach(var item in items)
{
itemWeightedIndex += weightKey(item);
if(randomWeightedIndex < itemWeightedIndex)
return item;
}
throw new ArgumentException("Collection count and weights must be greater than 0");
}
// assuming rnd is an already instantiated instance of the Random class
var max = list.Sum(y => y.Weight);
var rand = rnd.Next(max);
var res = list
.FirstOrDefault(x => rand >= (max -= x.Weight));
This is a fast solution with precomputation. The precomputation takes O(n), the search O(log(n)).
Precompute:
int[] lookup=new int[elements.Length];
lookup[0]=elements[0].Weight-1;
for(int i=1;i<lookup.Length;i++)
{
lookup[i]=lookup[i-1]+elements[i].Weight;
}
To generate:
int total=lookup[lookup.Length-1]+1; // lookup holds inclusive upper bounds, so the count is last+1
int chosen=random.Next(total);
int index=Array.BinarySearch(lookup,chosen);
if(index<0)
index=~index;
return elements[index];
But if the list changes between each search, you can instead use a simple O(n) linear search:
int total=elements.Sum(e=>e.Weight);
int chosen=random.Next(total);
int runningSum=0;
foreach(var element in elements)
{
runningSum+=element.Weight;
if(chosen<runningSum)
return element;
}
This could work:
int weightsSum = list.Sum(element => element.Weight);
int start = 1;
var partitions = list.Select(element =>
{
var oldStart = start;
start += element.Weight;
return new { Element = element, End = oldStart + element.Weight - 1};
});
var randomWeight = random.Next(weightsSum);
var randomElement = partitions.First(partition => (partition.End > randomWeight)).Element;
Basically, for each element a partition is created with an end weight.
In your example, Element1 would be associated with (1-->100), Element2 with (101-->150) and so on...
Then a random weight sum is calculated and we look for the range which is associated to it.
You could also compute the sum in the method group but that would introduce another side effect...
Note that I'm not saying this is elegant or fast. But it does use linq (not in one line...)
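.NET 6 introduced MaxBy, which makes this much shorter.
It could now be written as the following one-liner (rng is an instance of Random):
list.MaxBy(x => rng.Next(x.Weight));
This works best if the weights are large or floating-point numbers; otherwise there will be collisions, which can be reduced by multiplying the weight by some factor. Note that taking the largest of several independent uniform draws tends to favor heavy elements somewhat more than their exact share of the total weight, so treat this as an approximation.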
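For selection exactly proportional to the weights, a known variant of this trick (the key used in Efraimidis and Spirakis style weighted sampling) draws u uniformly from [0, 1) and uses u^(1/w) as the key instead of a draw scaled by the weight. A sketch assuming .NET 6+ for MaxBy and strictly positive weights; WeightedPick and PickByWeight are made-up names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightedPick
{
    public static T PickByWeight<T>(IEnumerable<T> items, Func<T, double> weight, Random rng)
    {
        // Key u^(1/w): a larger weight pushes the key towards 1, and the element
        // with the largest key is chosen with probability w / (sum of all weights).
        return items.MaxBy(x => Math.Pow(rng.NextDouble(), 1.0 / weight(x)));
    }
}
```

With the question's Element class this would be called as WeightedPick.PickByWeight(list, e => e.Weight, rng). It still costs one random draw per element, so the running-sum loops above remain the cheaper option for long lists.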

What is the best way in linq to calculate the percentage from a list?

I have a list of 1s and 0s, where 1 means the person achieved the goal and 0 means they did not, and I need to calculate the percentage achieved. For example:
{1,1,0,0,0}
So if the list has 5 items and 2 of them are ones, the percentage is 40%. Is there a function or way in LINQ to do this easily, maybe in one line? I am sure LINQ experts have a suave way of doing it.
What about
var list = new List<int>{1,1,0,0,0};
var percentage = ((double)list.Sum())/list.Count*100;
or if you want to get the percentage of a specific element
var percentage = ((double)list.Count(i=>i==1))/list.Count*100;
EDIT
Note BrokenGlass's solution and use the Average extension method for the first case as in
var percentage = list.Average() * 100;
In this special case you can also use Average() :
var list = new List<int> {1,1,0,0,0};
double percent = list.Average() * 100;
If you're working with any ICollection<T> (such as List<T>) the Count property will probably be O(1); but in the more general case of any sequence the Count() extension method is going to be O(N), making it less than ideal. Thus for the most general case you might consider something like this which counts elements matching a specified predicate and all elements in one go:
public static double Percent<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
int total = 0;
int count = 0;
foreach (T item in source)
{
++count;
if (predicate(item))
{
total += 1;
}
}
return (100.0 * total) / count;
}
Then you'd just do:
var list = new List<int> { 1, 1, 0, 0, 0 };
double percent = list.Percent(i => i == 1);
Output:
40
Best way to do it:
var percentage = ((double)list.Count(i=>i==1))/list.Count*100;
or
var percentage = ((double)list.Count(i=>i <= yourValueHere))/list.Count*100;
If you
want to do it in one line
don't want to maintain an extension method
can't take advantage of list.Sum() because your list data isn't 1s and 0s
you can do something like this:
percentAchieved = (int)
((double)(from MyClass myClass
in myList
where myClass.SomeProperty == "SomeValue"
select myClass).ToList().Count /
(double)myList.Count *
100.0
);

What is the simplest way to initialize an Array of N numbers following a simple pattern?

Let's say the first N integers divisible by 3 starting with 9.
I'm sure there is some one-line solution using lambdas; I just don't know that area of the language well enough yet.
Just to be different (and to avoid using a where statement) you could also do:
var numbers = Enumerable.Range(0, n).Select(i => i * 3 + 9);
Update This also has the benefit of not running out of numbers.
Using Linq:
int[] numbers =
Enumerable.Range(9,10000)
.Where(x => x % 3 == 0)
.Take(20)
.ToArray();
Also easily parallelizeable using PLinq if you need:
int[] numbers =
Enumerable.Range(9,10000)
.AsParallel() //added this line
.Where(x => x % 3 == 0)
.Take(20)
.ToArray();
const int __N = 100;
const int __start = 9;
const int __divisibleBy = 3;
var array = Enumerable.Range(__start, __N * __divisibleBy).Where(x => x % __divisibleBy == 0).Take(__N).ToArray();
int n = 10; // Take first 10 that meet criteria
int[] ia = Enumerable
.Range(0,999)
.Where(a => a % 3 == 0 && a.ToString()[0] == '9')
.Take(n)
.ToArray();
I want to see how this solution stacks up against the LINQ solutions above. The trick here is to replace the predicate with a direct formula: the multiples of m at or after a starting value s are s + ((m - s % m) % m) + m*n, where n is the zero-based position in the sequence. In our case s = 9 and m = 3, so the offset term is 0.
The only problem with this solution is that it makes your implementation depend on the specific pattern you choose (and not all patterns have a suitable formula). But it has these advantages:
Always running in exactly n iterations
Never failing like the solutions proposed above (with respect to the limited Range).
Besides, no matter what pattern you choose, you will always need to modify the predicate, so you might as well make it mathematically efficient:
static int[] givemeN(int n)
{
    const int baseVal = 9;
    const int modVal = 3;
    int i = 0;
    return Array.ConvertAll<int, int>(
        new int[n],
        new Converter<int, int>(
            // first multiple of modVal at or after baseVal, plus i steps of modVal
            x => baseVal + ((modVal - baseVal % modVal) % modVal) +
                 ((i++) * modVal)
        ));
}
edit: I just want to illustrate how you could use this method with a delegate to improve code re-use:
static int[] givemeN(int n, Func<int, int> func)
{
int i = 0;
return Array.ConvertAll<int, int>(new int[n], new Converter<int, int>(a => func(i++)));
}
You can use it with givemeN(5, i => 9 + 3 * i). Again note that I modified the predicate, but you can do this with most simple patterns too.
I can't say this is any good, I'm not a C# expert and I just whacked it out, but I think it's probably a canonical example of the use of yield.
internal IEnumerable<int> Answer(int N)
{
    int n = 0;
    int i = 9;
    // Stop once N values have been produced
    while (n < N)
    {
        if (i % 3 == 0)
        {
            n++;
            yield return i;
        }
        i++;
    }
}
You have to iterate from 0 or 1 to N and add them by hand. Or you could just create your function f(int n) and, in that function, cache the results inside session or a global hashtable or dictionary.
A sketch, where ht is a global Hashtable or Dictionary (I strongly recommend the latter, because it is strongly typed) and Calculate is a placeholder for whatever computation you need:
static Dictionary<int, int> ht = new Dictionary<int, int>();
public int f(int n)
{
    if (ht.TryGetValue(n, out int cached))
        return cached;
    int result = Calculate(n); // do calculation (Calculate is a placeholder)
    ht[n] = result;
    return result;
}
Just a side note. If you do this type of functional programming all the time, you might want to check out F#, or maybe even Iron Ruby or Python.
