I've seen the following statement in several applications I'm supporting:
Random rnd = new Random();
if (rnd.NextDouble() < 1)
{
    // Do stuff
}
What could be the purpose of this? rnd.NextDouble() is always going to return a value below 1. The only thing I can think of is that if you mock Random, you would be able to deactivate some sections of the code.
What do you think? Have you found things like this?
EDIT: The thing is that these statements are located in different but related sections of code, and they always cover entire features. That's why I tend to think that it was coded on purpose. The code also seems to have a certain degree of quality; if this were a mistake, I would be surprised, given all the other code.
I am not sure, but this check does not seem to be required. rnd.NextDouble() will always return a value in the range
0.0 (inclusive) to 1.0 (exclusive), so the condition is always true.
Random.NextDouble Method - MSDN
A double-precision floating point number greater than or equal to 0.0,
and less than 1.0.
Assuming that the Random in question is indeed System.Random, I can't see any functional reason for having this.
All I could speculate is that perhaps the writer wanted a code block that they could easily (while debugging / developing) run conditionally (by changing the 1 to 0) or only some of the time (by changing the 1 to some value between 0 and 1). But really, this isn't a well-known idiom, so you'd have to ask the person who wrote it, hence my vote to close as too localized.
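To make that speculation concrete, here is a minimal Python sketch. Python's random.random() has the same documented range as NextDouble, [0.0, 1.0). The names should_run and threshold are made up for illustration, and this only shows how such a toggle could behave, not what the original author intended.

```python
import random

def should_run(threshold, rng=random):
    """Return True when a draw from [0.0, 1.0) falls below `threshold`.

    threshold = 1   -> always True  (the block always runs)
    threshold = 0   -> never True   (the block is switched off)
    threshold = 0.3 -> True roughly 30% of the time
    """
    return rng.random() < threshold

if __name__ == "__main__":
    # With a threshold of 1 the guarded block runs on every single call.
    runs = sum(should_run(1) for _ in range(10_000))
    print(runs)  # prints 10000
```

Changing the single literal is all it takes to disable the feature or to run it only some of the time, which fits the observation that these checks always wrap entire features.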
I suspect the answer is 'Because of Math', but I was hoping someone could give a little more insight at a basic level...
I was poking around in the BCL source code today, having a look at how some of the classes I've used before were actually implemented. I'd never thought about how to generate (pseudo) random numbers before, so I decided to see how it was done.
Full source here: http://referencesource.microsoft.com/#mscorlib/system/random.cs#29
private const int MSEED = 161803398;
This MSEED value is used every time a Random instance is seeded.
Anyway, I saw this 'magic number' - 161803398 - and I don't have the foggiest idea of why that number was selected. It's not a prime number or a power of 2. It's not 'half way' to a number that seemed more significant. I looked at it in binary and hex and well, it just looked like a number to me.
I tried searching for the number in Google, but I found nothing.
No, but it's based on Phi (the "golden ratio").
161803398 = 1.61803398 * 10^8 ≈ φ * 10^8
More about the golden ratio here.
And a really good read for the casual mathematician here.
And I found a research paper on random number generators that agrees with this assertion. (See page 53.)
This number is taken from the golden ratio: 1.61803398 * 10^8. Matt gave a nice answer about what this number is, so I will just explain a little bit about the algorithm.
This is not a special number for this algorithm. The algorithm is Knuth's subtractive random number generator algorithm and the main points of it are:
store a circular list of 56 random numbers
initialization is the process of filling the list, then randomizing those values with a specific deterministic algorithm
two indices are kept which are 31 apart
new random number is the difference of the two values at the two indices
store new random number in the list
The generator is based on the following recursion: X_n = (X_{n-55} - X_{n-24}) mod m, where n ≥ 0. This is a special case of the lagged Fibonacci generator: X_n = (X_{n-j} # X_{n-k}) mod m, where 0 < k < j and # is any binary operation (subtraction, addition, xor).
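The recursion can be sketched in a few lines of Python. This is only an illustration of the subtractive lagged Fibonacci idea, not the .NET implementation: the way the first 55 values are filled (a simple linear congruential step) and the names subtractive_generator, M_BIG and M_SEED are assumptions made for this sketch.

```python
from collections import deque

M_BIG = 1_000_000_000   # modulus m (assumed; any large value works for the sketch)
M_SEED = 161_803_398    # the golden-ratio-derived constant from the question

def subtractive_generator(seed, m=M_BIG):
    """Yield X_n = (X_{n-55} - X_{n-24}) mod m indefinitely."""
    state = deque(maxlen=55)          # the last 55 values, oldest first
    x = (M_SEED - abs(seed)) % m
    for _ in range(55):
        # Hypothetical filler for the initial values, NOT Knuth's scheme.
        x = (x * 69069 + 1) % m
        state.append(x)
    while True:
        # state[0] is X_{n-55}; state[31] is X_{n-24} (31 positions later).
        new = (state[0] - state[31]) % m
        state.append(new)             # maxlen drops the oldest value
        yield new

gen = subtractive_generator(42)
sample = [next(gen) for _ in range(5)]
```

The two lags 55 and 24 are why the two indices end up 31 apart, and the same seed always reproduces the same sequence.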
There are several implementations of this generator. Knuth offers an implementation in
FORTRAN in his book. I found the following code, with the following comment:
PARAMETER (MBIG=1000000000,MSEED=161803398,MZ=0,FAC=1.E-9)
According to Knuth, any large MBIG, and any smaller (but still large) MSEED can
be substituted for the above values.
A little bit more can be found here. Note that this is not actually a research paper (as stated by Matt); it is just a master's degree thesis.
People in cryptography like to use irrational numbers (pi, e, sqrt(5)) because there is a conjecture that the digits of such numbers appear with equal frequency and thus have high entropy. You can find this related question on Security Stack Exchange to learn more about such numbers. Here is a quote:
"If the constants are chosen at random, then with high probability, no
attacker will be able to break it." But cryptographers, being a
paranoid lot, are skeptical when someone says, "Let's use this set of
constants. I picked them at random, I swear." So as a compromise,
they'll use constants like, say, the binary expansion of π. While we
no longer have the mathematical benefit of having chosen them at
random from some large pool of numbers, we can at least be more
confident there was no sabotage.
Is there a generally accepted best approach to coding complex math? For example:
double someNumber = .123 + .456 * Math.Pow(Math.E, .789 * Math.Pow((homeIndex + .22), .012));
Is this a point where hard-coding the numbers is okay? Or should each number have a constant associated with it? Or is there even another way, like storing the calculations in config and invoking them somehow?
There will be a lot of code like this, and I'm trying to keep it maintainable.
Note: The example shown above is just one line. There would be tens or hundreds of these lines of code. And not only could the numbers change, but the formula could as well.
Generally, there are two kinds of constants: ones with meaning to the implementation, and ones with meaning to the business logic.
It is OK to hard-code the constants of the first kind: they are private to understanding your algorithm. For example, if you are using a ternary search and need to divide the interval in three parts, dividing by a hard-coded 3 is the right approach.
Constants with the meaning outside the code of your program, on the other hand, should not be hard-coded: giving them explicit names gives someone who maintains your code after you leave the company non-zero chances of making correct modifications without having to rewrite things from scratch or e-mailing you for help.
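A small Python sketch of the distinction, under stated assumptions: STANDARD_TAX_RATE and tax_due are hypothetical names standing in for any business-level value, while the literal 3 inside the ternary search belongs to the algorithm itself.

```python
# Hypothetical business constant: named, easy to find, easy to change.
STANDARD_TAX_RATE = 0.20

def tax_due(taxable_amount):
    return taxable_amount * STANDARD_TAX_RATE

def ternary_search_min(f, lo, hi, iterations=100):
    """Approximate the minimizer of a unimodal function f on [lo, hi]."""
    for _ in range(iterations):
        # The 3 is an implementation detail of ternary search, so a
        # hard-coded literal is the clearest way to write it.
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2   # the minimum cannot lie in (m2, hi]
        else:
            lo = m1   # the minimum cannot lie in [lo, m1)
    return (lo + hi) / 2
```

Nobody maintaining the search will ever need to change the 3, whereas the tax rate is exactly the kind of value a maintainer will go looking for.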
"Is it okay"? Sure. As far as I know, there's no paramilitary police force rounding up those who sin against the one true faith of programming. (Yet.)
Is it wise?
Well, there are all sorts of ways of deciding that - performance, scalability, extensibility, maintainability etc.
On the maintainability scale, this is pure evil. It makes extensibility very hard; performance and scalability are probably not a huge concern.
If you left behind a single method with loads of lines similar to the above, your successor would have no chance maintaining the code. He'd be right to recommend a rewrite.
If you broke it down like
public float calculateTax(Person person) // Person is a placeholder type
{
    float taxFreeAmount = calcTaxFreeAmount(person);
    float taxableAmount = calcTaxableAmount(person, taxFreeAmount);
    float taxAmount = calcTaxAmount(person, taxableAmount);
    return taxAmount;
}
and each of the inner methods is a few lines long, but you left some hardcoded values in there - well, not brilliant, but not terrible.
However, if some of those hardcoded values are likely to change over time (like the tax rate), leaving them as hardcoded values is not okay. It's awful.
The best advice I can give is:
Spend an afternoon with Resharper, and use its automatic refactoring tools.
Assume the guy picking this up from you is an axe-wielding maniac who knows where you live.
I usually ask myself whether I can maintain and fix the code at 3 AM being sleep deprived six months after writing the code. It has served me well. Looking at your formula, I'm not sure I can.
Ages ago I worked in the insurance industry. Some of my colleagues were tasked with converting the actuarial formulas into code, first FORTRAN and later C. Mathematical and programming skills varied from colleague to colleague. Here is what I learned from reviewing their code:
document the actual formula in code; without it, years later you'll have trouble remembering the actual formula. External documentation goes missing, becomes dated or simply may not be accessible.
break the formula into discrete components that can be documented, reused and tested.
use constants to document equations; magic numbers have very little context and often require existing knowledge for other developers to understand.
rely on the compiler to optimize code where possible. A good compiler will inline methods, reduce duplication and optimize the code for the particular architecture. In some cases it may duplicate portions of the formula for better performance.
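Applied to the formula from the question, these points might look like the following Python sketch. The constant and function names (BASE, SCALE, GROWTH, OFFSET, EXPONENT, adjusted_index, score) are invented for illustration, since the question does not say what the numbers mean; only the values come from the question.

```python
import math

# Formula, documented next to the code that implements it:
#   score = BASE + SCALE * exp(GROWTH * (home_index + OFFSET) ** EXPONENT)
# All names are hypothetical; the values are from the question's example.
BASE = 0.123
SCALE = 0.456
GROWTH = 0.789
OFFSET = 0.22
EXPONENT = 0.012

def adjusted_index(home_index):
    """Inner term of the formula, small enough to test on its own."""
    return (home_index + OFFSET) ** EXPONENT

def score(home_index):
    """Full formula, built from the documented components."""
    return BASE + SCALE * math.exp(GROWTH * adjusted_index(home_index))
```

Each piece can now be unit-tested and changed independently, and the comment keeps the formula readable even if the external documentation goes missing.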
That said, there are times when hard-coding simply simplifies things, especially if those values are well understood within a particular context. For example, dividing (or multiplying) something by 100 or 1000 because you're converting a value to dollars. Another one is to multiply something by 3600 when you'd like to convert hours to seconds. Their meaning is often implied by the greater context. The following doesn't say much about the magic number 100:
public static double a(double b, double c)
{
return (b - c) * 100;
}
but the following may give you a better hint:
public static double calculateAmountInCents(double amountDue, double amountPaid)
{
return (amountDue - amountPaid) * 100;
}
As the above comment states, this is far from complex.
You can, however, store the magic numbers in constants or app.config values, to make it easier for the next developer to maintain your code.
When storing such constants, make sure to explain to the next developer (read: yourself in one month) what your thoughts were, and what they need to keep in mind.
Also explain what the actual calculation is for and what it is doing.
Do not leave in-line like this.
Use a constant so you can reuse it, find it easily and change it easily; it also makes maintenance easier when someone looks at your code for the first time.
You can do a config if it can/should be customized. What is the impact of a customer altering the value(s)? Sometimes it is best to not give them that option. They could change it on their own then blame you when things don't work. Then again, maybe they have it in flux more often than your release schedules.
It's worth noting that the JIT compiler in the CLR (rather than the C# compiler) can automatically inline small methods, so if you can extract certain formulas into one-liners, you can usually extract them as methods with little or no performance loss.
EDIT:
Whether to use constants more or less depends on the team and how often a value is used. Obviously, if you're using the same hard-coded number more than once, make it a constant. However, if you're writing a formula that it's likely only you will ever edit (small team), then hard-coding the values is fine. It all depends on your team's views on documentation and maintenance.
If the calculation in your line explains something to the next developer, then you can leave it; otherwise it's better to have a calculated constant value in your code or configuration files.
I found one line in production code which was like:
int interval = 1 * 60 * 60 * 1000;
Even without any comment, it wasn't hard to see that the original developer meant 1 hour in milliseconds, which is clearer than seeing a value of 3600000.
IMO, leaving the calculation spelled out may be better in scenarios like that.
Names can be added for documentation purposes. The amount of documentation needed depends largely on the purpose.
Consider following code:
float e = m * 8.98755179e16;
And contrast it with the following one:
const float c = 299792458;
float e = m * c * c;
Even though the variable names are not very 'descriptive' in the latter, you'll have a much better idea of what the code is doing than with the first one. Arguably there is no need to rename c to speedOfLight, m to mass and e to energy, as the names are self-explanatory in their domain.
const float speedOfLight = 299792458;
float energy = mass * speedOfLight * speedOfLight;
I would argue that the second version is the clearest one, especially if the programmer can expect to find STR (the special theory of relativity) in the code (an LHC simulator or something similar). To sum up: you need to find an optimal point. The more verbose the code, the more context you provide, which might both help in understanding the meaning (what e and c are vs. we do something with mass and the speed of light) and obscure the big picture (we square c and multiply by m vs. needing to scan the whole line to get the equation).
Most constants have some deeper meaning and/or established notation, so I would consider at least naming them by convention (c for the speed of light, R for the gas constant, sPerH for seconds per hour). If the notation is not clear, longer names should be used (sPerH in a class named Date or Time is probably fine, while it is not in Paginator). The really obvious constants can be hard-coded (say, division by 2 when calculating the new array length in merge sort).
In my project I face a scenario where I have a function with numerous inputs. At a certain point I am provided with a result, and I need to find one combination of inputs that generates that result.
Here is some pseudocode that illustrates the problem:
Double y = f(x_0,..., x_n)
I am provided with y and I need to find any combination of inputs that produces it.
I tried several things on paper that could generate something, but each parameter has a range of 6.5 x 10^9 possible values, so I would like to get an optimal execution time.
Can someone name an algorithm or a topic that will be useful for me, so I can read up on how other people solved similar problems?
I was thinking along the lines of creating a vector from the inputs and judging how well that vector fits the problem. This sounds an awful lot like a neural network, but there is no training phase available.
Edit:
Thank you all for the feedback. The comments sum up the problems I have, and I will try something along the lines of hill climbing.
The general case for your problem might be impossible to solve, but for some cases there are numerical methods that can help you solve your problem.
For example, in 1D space, if you can find an input whose result is smaller than y and one whose result is higher than y, you can use the numerical method regula falsi to find the "root" numerically (which is y in your case, by simply invoking the method on f(x) - y).
Another numerical method for finding roots is Newton-Raphson.
I admit, I am not familiar with how to apply these methods on multi dimensional space - but it could be a starter. I'd search the literature for these if I were you.
Note: using such a method almost always requires some knowledge on the function.
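For the 1D case, regula falsi applied to f(x) - y can be sketched as follows. This is a minimal illustration that assumes f is continuous and that the two starting inputs bracket y; the function name and parameters are chosen for this example.

```python
def regula_falsi(f, y, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with f(x) close to y, for a continuous f.

    Requires f(lo) - y and f(hi) - y to have opposite signs.
    """
    def g(x):
        return f(x) - y           # the root of g is the input we want

    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0:
        return lo
    if g_hi == 0:
        return hi
    if g_lo * g_hi > 0:
        raise ValueError("y is not bracketed by f(lo) and f(hi)")

    x = lo
    for _ in range(max_iter):
        # Where the straight line through (lo, g_lo), (hi, g_hi) crosses zero.
        x = hi - g_hi * (hi - lo) / (g_hi - g_lo)
        g_x = g(x)
        if abs(g_x) < tol:
            return x
        # Keep the sub-interval that still brackets the root.
        if g_lo * g_x < 0:
            hi, g_hi = x, g_x
        else:
            lo, g_lo = x, g_x
    return x
```

For example, regula_falsi(lambda x: x**3 + x, 10, 0, 3) converges towards 2, since 2**3 + 2 = 10.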
Another possible solution is to take g(X) = |f(X) - y| and use some heuristic algorithm to find a minimal value of g. The problem with heuristic methods is that they will get you "close enough", but will seldom get you exactly to the target (unless the function is convex).
Some optimization algorithms are: genetic algorithms, hill climbing, and gradient descent (where you can find the gradient numerically).
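As a rough sketch of the hill-climbing option on integer inputs, the following Python function minimizes |f(X) - y| by trying one coordinate at a time and halving the step when nothing improves. The step schedule, the shared bounds for all parameters and the function name are assumptions for illustration; like any local search, it can stall in a local minimum.

```python
def hill_climb(f, y, start, lower, upper, max_steps=10_000):
    """Greedy local search over integer vectors minimizing |f(X) - y|.

    Returns (best_vector, best_error).
    """
    current = list(start)
    best = abs(f(current) - y)
    step = max(1, (upper - lower) // 4)   # start coarse, refine later
    for _ in range(max_steps):
        if best == 0:
            break                          # exact match found
        improved = False
        for i in range(len(current)):
            for delta in (step, -step):
                candidate = current.copy()
                # Move one coordinate, staying inside the allowed range.
                candidate[i] = min(upper, max(lower, candidate[i] + delta))
                err = abs(f(candidate) - y)
                if err < best:
                    current, best, improved = candidate, err, True
        if not improved:
            if step == 1:
                break                      # local minimum at finest step
            step //= 2
    return current, best
```

With f = lambda v: v[0], y = 37, start [0] and bounds 0 to 100, the search returns ([37], 0).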
On a whim, I've decided to go back and seek certification, starting with 98-361, Fundamentals of Software Development. (I'm doing this more for myself than anything else. I want to fill in gaps in my knowledge.)
In the very early course of the book, they present this interesting scenario in the Proficient Assessment section:
You are developing a library of utility functions for your
application. You need to write a method that takes an integer and
counts the number of significant digits in it. You need to create a recursive program
to solve this problem. How would you write such a
program?
I find myself gaping at this scenario in befuddlement. If I understand "significant digits" correctly, there's no need whatsoever for a function that counts an integer's significant digits to be recursive. And, any architect who insisted that it be recursive should have his head examined.
Or am I not getting it? Did I completely miss something here? From what I understand, the significant digits are the digits of a number, starting from the left, and proceeding right, excluding any leading zeroes.
Under what conditions would this need to be recursive? (The whole point of this exercise for me is to learn new things. Someone throw me a bone.)
EDIT: I don't want an answer to the problem question. I can figure that out on my own. It just seems to me that this "problem" could be solved far more easily with a simple foreach loop over the characters in a string.
Final Edit
Given the sage advice of the awesome posters below, this was the simple solution I came up with to solve the problem. (Despite what misgivings I may have.)
using System;

class Program
{
    static void Main(string[] args)
    {
        var values = new[] { 5, 15, 150, 250, 2500, 25051, 255500005, -10, -1005 };
        foreach (var value in values)
        {
            Console.WriteLine("Significant digits for {0} is {1}.", value, SignificantDigits(value));
        }
    }

    public static int SignificantDigits(int n)
    {
        // Base case: any single-digit value (including 0) has one digit.
        if (n > -10 && n < 10)
        {
            return 1;
        }
        // Integer division truncates toward zero, so this also works for negatives.
        return 1 + SignificantDigits(n / 10);
    }
}
There's no need for such an algorithm to be recursive. But the intent here is not to write real-world code, it's to ensure you understand recursion.
Since you stated you weren't after code, I'll be careful here, but I need to provide something to compare the complexity of the solutions, so I'll use pseudo-code. A recursive solution may be something like:
def sigDigits(n):
    # Handle negative numbers.
    if n < 0:
        return sigDigits(-n)
    # 0..9 is one significant digit.
    if n < 10:
        return 1
    # Otherwise it's one plus the count in n/10 (truncated).
    return 1 + sigDigits(n // 10)
And you're right, it's equally doable as iteration.
def sigDigits(n):
    # Handle negative numbers.
    if n < 0:
        n = -n
    # All numbers have at least one significant digit.
    digits = 1
    # Then we add one and divide by ten (truncated), until we get low enough.
    while n > 9:
        n = n // 10
        digits = digits + 1
    return digits
There are some (usually of a mathematical bent, and including myself) that consider recursive algorithms much more elegant where they're suitable (such as where the "solution search space" reduces very quickly so as to not blow out your stack).
I question the suitability in this particular case since the iterative solution is not too complex, but the questioner had to provide some problem and this one is relatively easy to solve.
And, as per your edit:
... could be solved far more easily with a simple foreach loop over the characters in a string
You don't have a string, you have an integer. I don't doubt that you could turn that into a string and then count characters but that seems a roundabout way of doing it.
It doesn't need to be recursive. It's simply that the question is asking you to write a recursive implementation, presumably to test your understanding of how a recursive function works.
That seems like a pretty forced example. The problem can be solved with a simpler iterative algorithm.
A lot of teaching resources really struggle to provide useful examples of when to use recursion. Technically you never need to use it, but for a large class of (mostly algorithmic) problems, it can really simplify things.
For example, consider any operation on a binary tree. Because the physical structure of a binary tree is recursive, the algorithms that operate on it are also naturally recursive. You can also write imperative algorithms to operate on binary trees, but the recursive ones are simpler to write and understand.
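To illustrate with a small Python sketch (the Node class and function names are invented for this example), here is the height of a binary tree computed recursively and then with an explicit stack.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def height_recursive(node):
    """Height of the tree rooted at node (an empty tree has height 0)."""
    if node is None:
        return 0
    # The code mirrors the structure: a node plus its two subtrees.
    return 1 + max(height_recursive(node.left), height_recursive(node.right))

def height_iterative(root):
    """Same result, using an explicit stack instead of the call stack."""
    best = 0
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best
```

Both return the same value, but the recursive version reads almost like the definition of height, while the iterative one has to manage its own bookkeeping.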
I was wondering if anyone had any suggestions for minimizing a function, f(x,y), where x and y are integers. I have researched lots of minimization and optimization techniques, like BFGS and others out of GSL, and things out of Numerical Recipes. So far, I have tried implementing a couple of different schemes. The first works by picking the direction of largest descent among f(x+1,y), f(x-1,y), f(x,y+1), f(x,y-1), and following that direction with line minimization. I have also tried using a downhill simplex (Nelder-Mead) method. Both methods get stuck far away from a minimum. They both appear to work on simpler functions, like finding the minimum of a paraboloid, but I think that both, and especially the former, are designed for functions where x and y are real-valued (doubles). One more problem is that I need to call f(x,y) as few times as possible. It talks to external hardware, and takes a couple of seconds for each call. Any ideas for this would be greatly appreciated.
Here's an example of the error function. Sorry I didn't post this before. This function takes a couple of seconds to evaluate. Also, the information we query from the device does not add to the error if it is below our desired value, only if it is above.
double Error(int x, int y)
{
    SetDeviceParams(x, y);
    double a = QueryParamA();
    double b = QueryParamB();
    double c = QueryParamC();
    double _fReturnable = 0;
    // Only values at or above the desired level contribute to the error.
    if (a >= A_desired)
    {
        _fReturnable += (A_desired - a) * (A_desired - a);
    }
    if (b >= B_desired)
    {
        _fReturnable += (B_desired - b) * (B_desired - b);
    }
    if (c >= C_desired)
    {
        _fReturnable += (C_desired - c) * (C_desired - c);
    }
    return Math.Sqrt(_fReturnable);
}
There are many, many solutions here. In fact, there are entire books and academic disciplines based on the subject. I am reading an excellent one right now: How to Solve It: Modern Heuristics.
There is no one solution that is correct - different solutions have different advantages based on specific knowledge of your function. It has even been proven that there is no one heuristic that performs the best at all optimization tasks.
If you know that your function is quadratic, you can use Newton-Gauss to find the minimum in one step. A genetic algorithm can be a great general-purpose tool, or you can try simulated annealing, which is less complicated.
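As a rough idea of what simulated annealing on an integer grid could look like, here is a minimal Python sketch. The cooling schedule, the default parameters and the name anneal are assumptions chosen for illustration, not tuned recommendations. Because each evaluation is expensive in the question, results are cached so that no point is ever evaluated twice.

```python
import math
import random

def anneal(f, start, steps=200, temp=10.0, cooling=0.97, seed=0):
    """Simulated annealing over integer (x, y) points.

    Returns (best_point, best_value, number_of_f_evaluations).
    """
    rng = random.Random(seed)
    cache = {}

    def cost(p):
        # Each point is evaluated at most once (f is assumed expensive).
        if p not in cache:
            cache[p] = f(*p)
        return cache[p]

    current = best = tuple(start)
    for _ in range(steps):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        candidate = (current[0] + dx, current[1] + dy)
        delta = cost(candidate) - cost(current)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if cost(current) < cost(best):
                best = current
        temp = max(temp * cooling, 1e-9)
    return best, cost(best), len(cache)
```

The occasional uphill move is what lets the search leave a local minimum, and the cache guarantees at most steps + 1 evaluations of f.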
Have you looked at genetic algorithms? They are very, very good at finding minimums and maximums, while avoiding local minimum/maximums.
How do you define f(x,y) ? Minimisation is a hard problem, depending on the complexity of your function.
Genetic Algorithms could be a good candidate.
Resources:
Genetic Algorithms in Search, Optimization, and Machine Learning
Implementing a Genetic Algorithms in C#
Simple C# GA
If it's an arbitrary function, there's no neat way of doing this.
Suppose we have a function defined as:
f(x, y) = 0 for x==100, y==100
100 otherwise
How could any algorithm realistically find (100, 100) as the minimum? It could be any possible combination of values.
Do you know anything about the function you're testing?
What you are generally looking for is called an optimisation technique in mathematics. In general, they apply to real-valued functions, but many can be adapted for integral-valued functions.
In particular, I would recommend looking into non-linear programming and gradient descent. Both would seem quite suitable for your application.
If you could perhaps provide any more details, I might be able to suggest something a little more specific.
Jon Skeet's answer is correct. You really do need information about f and its derivatives, even if f is everywhere continuous.
The easiest way to appreciate the difficulties of what you ask (minimization of f at integer values only) is just to think about an f: R->R (f is a real-valued function of the reals) of one variable that makes large excursions between individual integers. You can easily construct such a function so that there is NO correlation between the local minimums on the real line and the minimums at the integers, as well as no relationship to the first derivative.
For an arbitrary function I see no way except brute force.
So let's look at your problem in math-speak. This is all assuming I understand
your problem fully. Feel free to correct me if I am mistaken.
we want to minimize the following:
\sqrt((a-a_desired)^2 + (b-b_desired)^2 + (c-c_desired)^2)
or in other notation
||Pos(x - x_desired)||_2
where x = (a,b,c) and Pos(y) = max(y, 0) means we want the "positive part" (this accounts
for your if statements). Finally, we wish to restrict ourselves
to solutions where x is integer valued.
Unlike the above posters, I don't think genetic algorithms are what you want at all.
In fact, I think the solution is much easier (assuming I am understanding your problem).
1) Run any optimization routine on the function above. This will give you
the solution x^* = (a^*, b^*,c^*). As this function is increasing with respect
to the variables, the best integer solution you can hope for is
(ceil(a^*),ceil(b^*),ceil(c^*)).
Now you say that your function is possibly hard to evaluate. There exist tools
for this which are not based on heuristics. They go under the name Derivative-Free
Optimization. People use these tools to optimize objectives based on simulations (I have
even heard of a case where the objective function is based on crop growing yields!)
Each of these methods has different properties, but in general they attempt to
minimize not only the objective, but also the number of objective function evaluations.