I'm currently working on an exponential backoff (http://www.acodersjourney.com/2016/02/26-handle-transient-errors-in-c/) implementation.
I'm calculating the delay to wait, such that:
var delay = (int) Math.Round(Math.Pow(timeBetweenAttempts, attempt), MidpointRounding.AwayFromZero);
Obviously, this delay starts to get very large, very quickly, even with the time between attempts being 10 milliseconds.
I would like to be able to do something akin to:
var maxTimeBetweenAttempts = 5000; // 5 seconds is the hard limit
var nominalTimeBetweenAttempts = 10;
var maxNumberOfAttempts = // calculate the maximum number of raises that would hold below 5000.
Obviously this could be calculated using a loop, but I was wondering if there was a more elegant way to do this?
The inverse of exponentiation is logarithm so you can use that.
You would use
if (attempt < Math.Log(maxTimeBetweenAttempts) / Math.Log(nominalTimeBetweenAttempts))
{
Retry(++attempt);
}
This checks whether the attempt number is still below the point at which the delay would reach the limit.
However, as this requires an extra calculation and is a little less obvious than just doing the exponential (it uses logarithmic maths that most people haven't even seen since school), I would personally recommend just calculating the power and testing that.
ie.
if(Math.Pow(nominalTimeBetweenAttempts, attempt) < maxTimeBetweenAttempts)
{
Retry(++attempt);
}
Edit:
Re-reading your question, you explicitly state that you want to know the maximum number of retries before they reach a certain length. That would be:
var maximumNumberOfAttempts = (int)Math.Floor(Math.Log(maxTimeBetweenAttempts) / Math.Log(nominalTimeBetweenAttempts));
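As a rough sketch of the logarithm approach, the helper below computes the largest attempt number whose delay does not exceed the limit, and also caps the delay itself. The class and method names (Backoff, MaxAttempts, DelayMs) are made up for illustration. Because a quotient of logarithms can land just under a whole number when the limit is an exact power of the base, the sketch corrects the result with an exact integer check; that correction is my addition, not part of the answer above.

```csharp
using System;

static class Backoff
{
    // Exact integer power; avoids floating-point rounding in the checks below.
    static long IntPow(long b, int e)
    {
        long r = 1;
        for (int i = 0; i < e; i++) r *= b;
        return r;
    }

    // Largest n such that baseDelayMs^n <= maxDelayMs.
    public static int MaxAttempts(int baseDelayMs, int maxDelayMs)
    {
        if (baseDelayMs < 2 || maxDelayMs < 1)
            throw new ArgumentOutOfRangeException();

        // First guess from the logarithm, as in the answer above.
        int n = (int)Math.Floor(Math.Log(maxDelayMs) / Math.Log(baseDelayMs));

        // The guess can be off by one near exact powers, so adjust it exactly.
        while (n > 0 && IntPow(baseDelayMs, n) > maxDelayMs) n--;
        while (IntPow(baseDelayMs, n + 1) <= maxDelayMs) n++;
        return n;
    }

    // Delay for a given attempt, clamped to the hard limit.
    public static int DelayMs(int baseDelayMs, int attempt, int maxDelayMs)
    {
        double raw = Math.Pow(baseDelayMs, attempt);
        return (int)Math.Min(raw, maxDelayMs);
    }
}
```

With the numbers from the question, Backoff.MaxAttempts(10, 5000) is 3, because 10^3 = 1000 is within the limit and 10^4 = 10000 is not. Clamping with DelayMs is an alternative to stopping: later attempts simply keep waiting 5 seconds.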
I am trying to find a way to output a 2 Hz square wave to an LED (basically toggle between high and low at 2 Hz). I have a general sense of how I think it should go, but I'm not really sure what to do as I am new to C#. Any help would be greatly appreciated!
Here's my general thought process:
While(Programruns){
read input
(I feel like there should be a for loop here to keep the square wave going forever until I stop it)
if(button is pressed){
output square wave
}
else {
off}
}
You want a loop with a sleep operation in it.
You didn't tell us how you turn on and off that LED, so I will assume you have a method called SetLED(val). A val of 0 turns the LED off, and 1 turns it on. And, let's have a method called ButtonPressed() that comes back true when the button is pressed.
To get a 2Hz square wave you want to flip val every 250 milliseconds. This code will do that for you.
var ledState = 1;
while (ButtonPressed()) {
SetLED(ledState);
Thread.Sleep(250); //milliseconds
ledState = 1 - ledState; //flip the value
}
SetLED(0); // turn the LED off when done.
That will basically work. But there's a complication: C# and its various underlying operating systems are not hard real-time systems. Therefore Thread.Sleep() sometimes wakes up a bit late. These late wakeups can accumulate and make your square wave a little jittery and, cumulatively, less than 2Hz. You can't do much about the jitter.
But you can avoid the frequency degradation by looking at the time-of-day clock and computing how long to sleep in each loop.
The expression DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond gets you the present time in milliseconds (since some long-ago epoch).
So you can compute the number of milliseconds for the next sleep in your code. That corrects for .Sleep() oversleeping.
var ledState = 1;
long nextTickTime = DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond;
while (ButtonPressed()) {
SetLED(ledState);
nextTickTime += 250;
long nowTime = DateTime.Now.Ticks / TimeSpan.TicksPerMillisecond;
int sleepInterval = (int)(nextTickTime - nowTime); // explicit cast: long does not convert to int implicitly
if (sleepInterval > 0)
Thread.Sleep(sleepInterval);
ledState = 1 - ledState; //flip the value
}
SetLED(0); // turn the LED off when done.
That's a fairly robust blinker. You can also raise the priority of the thread while the loop is running, but that's a question and answer for another day.
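For comparison, here is a sketch of the same drift-corrected loop using System.Diagnostics.Stopwatch instead of DateTime.Now, since a stopwatch is not affected by clock adjustments. That swap is my own substitution, not something the answer above prescribes. The LED and button are passed in as delegates (setLed and buttonPressed are hypothetical stand-ins for SetLED and ButtonPressed), so the loop can be tried without hardware.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class Blinker
{
    // Toggles the LED every halfPeriodMs while buttonPressed() returns true.
    // 250 ms per half period gives a 2 Hz square wave.
    public static void Blink(Func<bool> buttonPressed, Action<int> setLed, int halfPeriodMs)
    {
        var clock = Stopwatch.StartNew();
        long nextTickMs = 0;
        int ledState = 1;

        while (buttonPressed())
        {
            setLed(ledState);

            // Schedule against the ideal timeline, not against "now",
            // so a late wakeup shortens the next sleep instead of accumulating.
            nextTickMs += halfPeriodMs;
            long sleepMs = nextTickMs - clock.ElapsedMilliseconds;
            if (sleepMs > 0)
                Thread.Sleep((int)sleepMs);

            ledState = 1 - ledState; // flip the value
        }

        setLed(0); // turn the LED off when done
    }
}
```

A call such as Blinker.Blink(ButtonPressed, SetLED, 250) would reproduce the behaviour of the loop above, assuming those two methods exist with matching signatures.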
This is a mathematical exercise rather than a program meant to be useful.
I want to compute factorials of very big numbers (10^n where n>6).
I moved to arbitrary precision, which is very helpful for tasks like 1000!. But it obviously dies (StackOverflowException :) ) at much higher values. I'm not looking for a direct answer, but some clues on how to proceed further.
static BigInteger factorial(BigInteger i)
{
if (i < 1)
return 1;
else
return i * factorial(i - 1);
}
static void Main(string[] args)
{
long z = (long)Math.Pow(10, 12);
Console.WriteLine(factorial(z));
Console.Read();
}
Would I have to give up System.Numerics.BigInteger? I was thinking of some way of storing the necessary data in files, since RAM will obviously run out. Optimization is very important at this point. So what would you recommend?
Also, I need values to be as precise as possible. I forgot to mention that I don't need all of the digits, just about the last 20.
As other answers have shown, the recursion is easily removed. Now the question is: can you store the result in a BigInteger, or are you going to have to go to some sort of external storage?
The number of bits you need to store n! is roughly proportional to n log n. (This is a weak form of Stirling's Approximation.) So let's look at some sizes: (Note that I made some arithmetic errors in an earlier version of this post, which I am correcting here.)
(10^6)! takes order of 2 x 10^6 bytes = a few megabytes
(10^12)! takes order of 3 x 10^12 bytes = a few terabytes
(10^21)! takes order of 10^22 bytes = ten billion terabytes
A few megs will fit into memory. A few terabytes is easily within your grasp but you'll need to write a memory manager probably. Ten billion terabytes will take the combined resources of all the technology companies in the world, but it is doable.
Now consider the computation time. Suppose we can perform a million multiplications per second per machine and that we can parallelize the work out to multiple machines somehow.
(10^6)! takes order of one second on one machine
(10^12)! takes order of 10^6 seconds on one machine =
10 days on one machine =
a few minutes on a thousand machines.
(10^21)! takes order of 10^15 seconds on one machine =
30 million years on one machine =
3 years on 10 million machines
1 day on 10 billion machines (each with a TB drive.)
So (10^6)! is within your grasp. (10^12)! you are going to have to write your own memory manager and math library, and it will take you some time to get an answer. (10^21)! you will need to organize all the resources of the world to solve this problem, but it is doable.
Or you could find another approach.
The solution is easy: Calculate the factorials without using recursion, and you won't blow out your stack.
I.e. you're not getting this error because the numbers are too large, but because you have too many levels of function calls. And fortunately, for factorials there's no reason to calculate them recursively.
Once you've solved your stack problem, you can worry about whether your number format can handle your "very big" factorials. Since you don't need the exact values, use one of the many efficient numeric approximations (which you can count on to get all of the most significant digits right). The most common one is Stirling's approximation:
n! \sim n^n e^{-n} \sqrt{2 \pi n}
The image is from this page, where you'll find discussion and a second, more accurate formula (although "in most cases the difference is quite small", they say). Of course this number is still too large for you to store, but now you can work with logarithms and drop the unimportant digits before you extract the number. Or use the Wikipedia version of the approximation, which is already expressed as a logarithm.
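To illustrate working with logarithms, the sketch below evaluates Stirling's formula in log form, ln n! ≈ n ln n − n + ½ ln(2πn), and derives the number of decimal digits and the leading digits of n! from it. The class and method names are invented for this example, and double precision limits how many leading digits can be trusted, so treat it as an estimate rather than an exact method.

```csharp
using System;

static class Stirling
{
    // log10(n!) from Stirling's approximation in logarithmic form.
    public static double Log10Factorial(double n)
    {
        double ln = n * Math.Log(n) - n + 0.5 * Math.Log(2 * Math.PI * n);
        return ln / Math.Log(10);
    }

    // Number of decimal digits of n!.
    public static long DigitCount(double n)
    {
        return (long)Math.Floor(Log10Factorial(n)) + 1;
    }

    // Leading digits of n! as a value in [1, 10): n! ≈ mantissa * 10^(DigitCount - 1).
    public static double Mantissa(double n)
    {
        double log10 = Log10Factorial(n);
        return Math.Pow(10, log10 - Math.Floor(log10));
    }
}
```

For example, Stirling.DigitCount(1e6) reports that (10^6)! has 5565709 decimal digits without ever forming the number itself.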
Unroll recursion:
static BigInteger factorial(BigInteger n)
{
BigInteger res = 1;
for (BigInteger i = 2; i <= n; ++i)
res *= i;
return res;
}
I have a few inequalities in {x,y}, which must satisfy the following:
x>=0
y>=0
f(x,y)=x^2+y^2>=100
g(x,y)=x^2+y^2<=200
Note that x and y must be integer.
Graphically it can be represented as follows, the blue region is the region that satisfies the above inequalities:
The question now is, is there any function in Matlab that finds every admissible pair of {x,y}? If there is an algorithm to do this kind of thing I would be glad to hear about it as well.
Of course, one approach we can always use is brute force approach where we test every possible combination of {x,y} to see whether the inequalities are satisfied. But this is the last resort, because it's time consuming. I'm looking for a clever algorithm that does this, or in the best case, an existing library that I can use straight-away.
The x^2+y^2>=100 and x^2+y^2<=200 are just examples; in reality f and g can be any polynomial functions of any degree.
Edit: C# code is welcome as well.
This is surely not possible to do in general for a general set of polynomial inequalities, by any method other than enumerative search, even if there are a finite number of solutions. (Perhaps I should say not trivial, as it is possible. Enumerative search will work, subject to floating point issues.) Note that the domain of interest need not be simply connected for higher order inequalities.
Edit: The OP has asked about how one might proceed to do a search.
Consider the problem
x^3 + y^3 >= 1e12
x^4 + y^4 <= 1e16
x >= 0, y >= 0
Solve for all integer solutions of this system. Note that integer programming in ANY form will not suffice here, since ALL integer solutions are requested.
Use of meshgrid here would force us to look at points in the domain (0:10000)X(0:10000). So it would force us to sample a set of 1e8 points, testing every point to see if they satisfy the constraints.
A simple loop can potentially be more efficient than that, although it will still require some effort.
% Note that I will store these in a cell array,
% since I cannot preallocate the results.
tic
xmax = 10000;
xy = cell(1,xmax+1); % one cell per x in 0:xmax
for x = 0:xmax
% solve for y, given x. This requires us to
% solve for those values of y such that
% y^3 >= 1e12 - x.^3
% y^4 <= 1e16 - x.^4
% These are simple expressions to solve for.
y = ceil((1e12 - x.^3).^(1/3)):floor((1e16 - x.^4).^0.25);
n = numel(y);
if n > 0
xy{x+1} = [repmat(x,1,n);y];
end
end
% flatten the cell array
xy = cell2mat(xy);
toc
The time required was...
Elapsed time is 0.600419 seconds.
Of the 100020001 combinations that we might have tested for, how many solutions did we find?
size(xy)
ans =
2 4371264
Admittedly, the exhaustive search is simpler to write.
tic
[x,y] = meshgrid(0:10000);
k = (x.^3 + y.^3 >= 1e12) & (x.^4 + y.^4 <= 1e16);
xy = [x(k),y(k)];
toc
I ran this on a 64 bit machine, with 8 gig of ram. But even so the test itself was a CPU hog.
Elapsed time is 50.182385 seconds.
Note that floating point considerations will sometimes cause a different number of points to be found, depending on how the computations are done.
Finally, if your constraint equations are more complex, you might need to use roots in the expression for the bounds on y, to help identify where the constraints are satisfied. The nice thing here is it still works for more complicated polynomial bounds.
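Since the question also welcomes C#, here is a sketch of the same per-x idea applied to the question's own example, lower <= x^2 + y^2 <= upper with x, y >= 0. For each x it solves the two bounds for y and emits only the y values in between, so no rejected points are ever tested. The names (Annulus, Solve) are invented for illustration, and the integer square-root helpers are there to sidestep the floating-point issues mentioned above.

```csharp
using System;
using System.Collections.Generic;

static class Annulus
{
    // All integer (x, y) with x, y >= 0 and lower <= x^2 + y^2 <= upper.
    public static List<(int X, int Y)> Solve(int lower, int upper)
    {
        var result = new List<(int X, int Y)>();
        for (int x = 0; (long)x * x <= upper; x++)
        {
            long x2 = (long)x * x;
            long lo = lower - x2; // need y^2 >= lo
            long hi = upper - x2; // need y^2 <= hi (hi >= 0 by the loop condition)

            int yMin = lo <= 0 ? 0 : CeilSqrt(lo);
            int yMax = FloorSqrt(hi);
            for (int y = yMin; y <= yMax; y++)
                result.Add((x, y));
        }
        return result;
    }

    // Largest r with r*r <= v, corrected for any rounding in Math.Sqrt.
    static int FloorSqrt(long v)
    {
        int r = (int)Math.Sqrt(v);
        while ((long)r * r > v) r--;
        while ((long)(r + 1) * (r + 1) <= v) r++;
        return r;
    }

    // Smallest r with r*r >= v.
    static int CeilSqrt(long v)
    {
        int r = FloorSqrt(v);
        return (long)r * r == v ? r : r + 1;
    }
}
```

Annulus.Solve(100, 200) returns the admissible pairs for the example in the question. For general polynomials f and g the bounds on y would have to come from root finding rather than a square root, as noted above.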
I have a program that needs to repeatedly compute the approximate percentile (order statistic) of a dataset in order to remove outliers before further processing. I'm currently doing so by sorting the array of values and picking the appropriate element; this is doable, but it's a noticeable blip on the profiles despite being a fairly minor part of the program.
More info:
The data set contains on the order of up to 100000 floating point numbers, and assumed to be "reasonably" distributed - there are unlikely to be duplicates nor huge spikes in density near particular values; and if for some odd reason the distribution is odd, it's OK for an approximation to be less accurate since the data is probably messed up anyhow and further processing dubious. However, the data isn't necessarily uniformly or normally distributed; it's just very unlikely to be degenerate.
An approximate solution would be fine, but I do need to understand how the approximation introduces error to ensure it's valid.
Since the aim is to remove outliers, I'm computing two percentiles over the same data at all times: e.g. one at 95% and one at 5%.
The app is in C# with bits of heavy lifting in C++; pseudocode or a preexisting library in either would be fine.
An entirely different way of removing outliers would be fine too, as long as it's reasonable.
Update: It seems I'm looking for an approximate selection algorithm.
Although this is all done in a loop, the data is (slightly) different every time, so it's not easy to reuse a datastructure as was done for this question.
Implemented Solution
Using the wikipedia selection algorithm as suggested by Gronim reduced this part of the run-time by about a factor 20.
Since I couldn't find a C# implementation, here's what I came up with. It's faster even for small inputs than Array.Sort; and at 1000 elements it's 25 times faster.
public static double QuickSelect(double[] list, int k) {
return QuickSelect(list, k, 0, list.Length);
}
public static double QuickSelect(double[] list, int k, int startI, int endI) {
while (true) {
// Assume startI <= k < endI
int pivotI = (startI + endI) / 2; //arbitrary, but good if sorted
int splitI = partition(list, startI, endI, pivotI);
if (k < splitI)
endI = splitI;
else if (k > splitI)
startI = splitI + 1;
else //if (k == splitI)
return list[k];
}
//when this returns, all elements of list[i] <= list[k] iif i <= k
}
static int partition(double[] list, int startI, int endI, int pivotI) {
double pivotValue = list[pivotI];
list[pivotI] = list[startI];
list[startI] = pivotValue;
int storeI = startI + 1;//no need to store # pivot item, it's good already.
//Invariant: startI < storeI <= endI
while (storeI < endI && list[storeI] <= pivotValue) ++storeI; //fast if sorted
//now storeI == endI || list[storeI] > pivotValue
//so elem #storeI is either irrelevant or too large.
for (int i = storeI + 1; i < endI; ++i)
if (list[i] <= pivotValue) {
list.swap_elems(i, storeI);
++storeI;
}
int newPivotI = storeI - 1;
list[startI] = list[newPivotI];
list[newPivotI] = pivotValue;
//now [startI, newPivotI] are <= to pivotValue && list[newPivotI] == pivotValue.
return newPivotI;
}
static void swap_elems(this double[] list, int i, int j) {
double tmp = list[i];
list[i] = list[j];
list[j] = tmp;
}
Thanks, Gronim, for pointing me in the right direction!
The histogram solution from Henrik will work. You can also use a selection algorithm to efficiently find the k largest or smallest elements in an array of n elements in O(n). To use this for the 95th percentile set k=0.05n and find the k largest elements.
Reference:
http://en.wikipedia.org/wiki/Selection_algorithm#Selecting_k_smallest_or_largest_elements
According to its creator a SoftHeap can be used to:
compute exact or approximate medians
and percentiles optimally. It is also
useful for approximate sorting...
I used to identify outliers by calculating the standard deviation. Everything more than 2 (or 3) standard deviations away from the average is an outlier; for normally distributed data, 2 standard deviations cover about 95%.
Since you are already calculating the average, the standard deviation is also very fast and easy to calculate.
You could also use only a subset of your data to calculate the numbers.
You could estimate your percentiles from just a part of your dataset, like the first few thousand points.
The Glivenko–Cantelli theorem ensures that this would be a fairly good estimate, if you can assume your data points to be independent.
Divide the interval between the minimum and maximum of your data into (say) 1000 bins and calculate a histogram. Then build partial sums and see where they first exceed 5% and 95% of the point count (5000 and 95000 for 100000 points).
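The histogram idea can be sketched as follows. The names are invented for this example, and returning the upper edge of the bin where the running count first reaches the target is one arbitrary choice among several; with it, the estimate is off by at most about one bin width, which speaks to the question's requirement of understanding the error.

```csharp
using System;
using System.Linq;

static class HistogramPercentile
{
    // Approximates the p-quantile (p in 0..1) with equal-width bins between min and max.
    public static double Estimate(double[] data, double p, int bins = 1000)
    {
        double min = data.Min();
        double max = data.Max();
        if (max == min) return min; // all values identical

        var counts = new int[bins];
        double width = (max - min) / bins;
        foreach (double v in data)
        {
            int b = (int)((v - min) / width);
            if (b >= bins) b = bins - 1; // v == max falls into the last bin
            counts[b]++;
        }

        // Partial sums: stop at the first bin where the running count reaches p * n.
        double target = p * data.Length;
        long running = 0;
        for (int b = 0; b < bins; b++)
        {
            running += counts[b];
            if (running >= target)
                return min + (b + 1) * width; // upper edge of that bin
        }
        return max;
    }
}
```

Both limits can be read from one histogram, for example Estimate(data, 0.05) and Estimate(data, 0.95), although this sketch rebuilds the histogram on each call for simplicity.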
There are a couple basic approaches I can think of. First is to compute the range (by finding the highest and lowest values), project each element to a percentile ((x - min) / range) and throw out any that evaluate to lower than .05 or higher than .95.
The second is to compute the mean and standard deviation. A span of 2 standard deviations from the mean (in both directions) will enclose about 95% of a normally distributed sample, meaning your outliers would be below roughly the 2.5th and above the 97.5th percentile. Calculating the mean of a series is linear, as is the standard deviation (the square root of the mean of the squared differences between each element and the mean). Then subtract 2 sigmas from the mean, and add 2 sigmas to the mean, and you've got your outlier limits.
Both of these will compute in roughly linear time; the first one requires two passes, the second one takes three (once you have your limits you still have to discard the outliers). Since this is a list-based operation, I do not think you will find anything with logarithmic or constant complexity; any further performance gains would require either optimizing the iteration and calculation, or introducing error by performing the calculations on a sub-sample (such as every third element).
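A minimal sketch of the mean and standard deviation approach, using the population standard deviation and a configurable multiplier k. The names are made up for illustration, and the 95% figure only holds to the extent that the data is roughly normal.

```csharp
using System;
using System.Linq;

static class SigmaFilter
{
    // Keeps the values within k standard deviations of the mean.
    public static double[] RemoveOutliers(double[] data, double k = 2.0)
    {
        // Pass 1: mean.
        double mean = data.Average();

        // Pass 2: population standard deviation.
        double variance = data.Sum(v => (v - mean) * (v - mean)) / data.Length;
        double sigma = Math.Sqrt(variance);

        // Pass 3: discard everything outside [mean - k*sigma, mean + k*sigma].
        return data.Where(v => Math.Abs(v - mean) <= k * sigma).ToArray();
    }
}
```

Note that a single extreme value inflates both the mean and sigma, which is one reason the percentile-based limits asked for in the question can behave better on skewed data.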
A good general answer to your problem seems to be RANSAC.
Given a model, and some noisy data, the algorithm efficiently recovers the parameters of the model.
You will have to choose a simple model that can describe your data. Anything smooth should be fine, say a mixture of a few Gaussians. RANSAC will set the parameters of your model and estimate a set of inliers at the same time. Then throw away whatever doesn't fit the model properly.
You could filter out 2 or 3 standard deviation even if the data is not normally distributed; at least, it will be done in a consistent manner, that should be important.
As you remove the outliers, the standard deviation will change, so you could do this in a loop until the change in standard deviation is minimal. Whether or not you want to do this depends on why you are manipulating the data this way. Some statisticians have major reservations about removing outliers, but others remove them to show that the data is fairly normally distributed.
Not an expert, but my memory suggests:
to determine percentile points exactly you need to sort and count
taking a sample from the data and calculating the percentile values sounds like a good plan for decent approximation if you can get a good sample
if not, as suggested by Henrik, you can avoid the full sort if you do the buckets and count them
One set of data of 100k elements takes almost no time to sort, so I assume you have to do this repeatedly. If the data set is the same set just updated slightly, you're best off building a tree (O(N log N)) and then removing and adding new points as they come in (O(K log N) where K is the number of points changed). Otherwise, the kth largest element solution already mentioned gives you O(N) for each dataset.
We were having a performance issue in a C# while loop. The loop was super slow doing only one simple math calc. Turns out that parmIn can be a huge number anywhere from 999999999 to MaxInt. We hadn't anticipated the giant value of parmIn. We have fixed our code using a different methodology.
The loop, coded for simplicity below, did one math calc. I am just curious as to what the actual execution time for a single iteration of a while loop containing one simple math calc is?
int v1=0;
while(v1 < parmIn) {
v1+=parmIn2;
}
There is something else going on here. The following will complete in ~100ms for me. You say that the parmIn can approach MaxInt. If this is true, and the ParmIn2 is > 1, you're not checking to see if your int + the new int will overflow. If ParmIn >= MaxInt - parmIn2, your loop might never complete as it will roll back over to MinInt and continue.
static void Main(string[] args)
{
int i = 0;
int x = int.MaxValue - 50;
int z = 42;
System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();
st.Start();
while (i < x)
{
i += z;
}
st.Stop();
Console.WriteLine(st.ElapsedMilliseconds.ToString()); // total elapsed ms, not just the milliseconds component
Console.ReadLine();
}
Assuming an optimal compiler, it should be one operation to check the while condition, and one operation to do the addition.
The time, small as it is, to execute just one iteration of the loop shown in your question is ... surprise ... small.
However, it depends on the actual CPU speed and whatnot exactly how small it is.
It should be just a few machine instructions, so not many cycles to pass once through the iteration, but there could be a few cycles to loop back up, especially if branch prediction fails.
In any case, the code as shown either suffers from:
Premature optimization (in that you're asking about timing for it)
Incorrect assumptions. You can probably get a much faster code if parmIn is big by just calculating how many loop iterations you would have to perform, and do a multiplication. (note again that this might be an incorrect assumption, which is why there is only one sure way to find performance issues, measure measure measure)
What is your real question?
It depends on the processor you are using and the calculation it is performing. (For example, even on some modern architectures, an add may take only one clock cycle, but a divide may take many clock cycles. There is a comparison to determine if the loop should continue, which is likely to be around one clock cycle, and then a branch back to the start of the loop, which may take any number of cycles depending on pipeline size and branch prediction)
IMHO the best way to find out more is to put the code you are interested into a very large loop (millions of iterations), time the loop, and divide by the number of iterations - this will give you an idea of how long it takes per iteration of the loop. (on your PC). You can try different operations and learn a bit about how your PC works. I prefer this "hands on" approach (at least to start with) because you can learn so much more from physically trying it than just asking someone else to tell you the answer.
The while loop is a couple of instructions, plus one instruction for the math operation. You're really looking at a minimal execution time for one iteration. It's the sheer number of iterations you're doing that is killing you.
Note that a tight loop like this has implications on other things as well, as it bogs down one CPU and it blocks the UI thread (if it's running on it). Thus, not only it is slow due to the number of operations, it also adds a perceived perf impact due to making the whole machine look unresponsive.
If you're interested in the actual execution time, why not time it for yourself and find out?
int parmIn = 10 * 1000 * 1000; // 10 million
int parmIn2 = 1; // step of 1, so the loop runs parmIn iterations
int v1=0;
Stopwatch sw = Stopwatch.StartNew();
while(v1 < parmIn) {
v1+=parmIn2;
}
sw.Stop();
double opsPerSec = (double)parmIn / sw.Elapsed.TotalSeconds;
And, of course, the time for one iteration is 1/opsPerSec.
Whenever someone asks how fast control structures are in any language, you know they are trying to optimize the wrong thing. If you find yourself changing all your i++ to ++i or changing all your switch to if...else for speed, you are micro-optimizing. And micro-optimizations almost never give you the speed you want. Instead, think a bit more about what you are really trying to do and devise a better way to do it.
I'm not sure if the code you posted is really what you intend to do or if it is simply the loop stripped down to what you think is causing the problem. If it is the former then what you are trying to do is find the largest value of a number that is smaller than another number. If this is really what you want then you don't really need a loop:
// assuming v1, parmIn and parmIn2 are integers,
// and you want the largest number (v1) that is
// smaller than parmIn but is a multiple of parmIn2.
// AGAIN, assuming INTEGER MATH:
v1 = (parmIn/parmIn2)*parmIn2;
EDIT: I just realized that the code as originally written gives the smallest multiple of parmIn2 that is greater than or equal to parmIn. So the correct code (assuming parmIn >= 0 and parmIn2 > 0) is:
v1 = (parmIn/parmIn2)*parmIn2;
if (v1 < parmIn) v1 += parmIn2;
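One way to check a closed form like this is to compare it against the original loop over a range of inputs. The sketch below does that for the smallest multiple of parmIn2 that is greater than or equal to parmIn; the class and method names are invented, and it assumes parmIn2 > 0. The checked addition makes an overflow near int.MaxValue throw instead of wrapping silently.

```csharp
using System;

static class NextMultiple
{
    // Reference behaviour: the loop from the question.
    public static int ByLoop(int parmIn, int parmIn2)
    {
        int v1 = 0;
        while (v1 < parmIn)
            v1 += parmIn2;
        return v1;
    }

    // Same result with one division instead of parmIn / parmIn2 iterations.
    public static int ByDivision(int parmIn, int parmIn2)
    {
        if (parmIn <= 0) return 0; // the loop body never runs in this case

        int v1 = (parmIn / parmIn2) * parmIn2; // largest multiple <= parmIn
        if (v1 < parmIn)
            v1 = checked(v1 + parmIn2);        // step up to the next multiple
        return v1;
    }
}
```

Looping over small values of parmIn and parmIn2 and asserting that ByLoop and ByDivision agree is a cheap way to catch off-by-one mistakes in the formula.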
If this is not what you really want then my advice remains the same: think a bit about what you are really trying to do (or ask on Stack Overflow) instead of trying to find out whether while or for is faster. Of course, you won't always find a mathematical solution to the problem. In that case there are other strategies to lower the number of loops taken. Here's one based on your current problem: keep doubling the incrementer until it is too large and then back off until it is just right:
int v1=0;
int incrementer=parmIn2;
// keep doubling the incrementer to
// speed up the loop:
while(v1 < parmIn) {
v1+=incrementer;
incrementer=incrementer*2;
}
// now v1 is too big, back off
// and resume normal loop:
v1-=incrementer/2; // incrementer was doubled after the last add, so undo only that last add
while(v1 < parmIn) {
v1+=parmIn2;
}
Here's yet another alternative that speeds up the loop:
// First count at 100x speed
while(v1 < parmIn) {
v1+=parmIn2*100;
}
// back off and count at 50x speed
v1-=parmIn2*100;
while(v1 < parmIn) {
v1+=parmIn2*50;
}
// back off and count at 10x speed
v1-=parmIn2*50;
while(v1 < parmIn) {
v1+=parmIn2*10;
}
// back off and count at normal speed
v1-=parmIn2*10;
while(v1 < parmIn) {
v1+=parmIn2;
}
In my experience, especially with graphics programming where you have millions of pixels or polygons to process, speeding up code usually involves adding even more code, which translates to more processor instructions, rather than trying to find the fewest instructions possible for the task at hand. The trick is to avoid processing what you don't have to.