.NET 4, VS2010, WinForms, C#
I added some points using
chart1.Series[0].Points.AddXY(x, y);
When I click on the chart, the cursor may not fall exactly on any point.
Is there a built-in function to return the nearest point? (Forget y; just the x distance.)
Or do I have to write my own binary search function?
private void Chart_MouseClick(object sender, MouseButtonEventArgs e)
{
    LineSeries line = (LineSeries)mychart.Series[0];
    Point point = e.GetPosition(line);
    Int32? selectIndex = FindNearestPointIndex(line.Points, point);
    // ...
}

private Int32? FindNearestPointIndex(PointCollection points, Point point)
{
    if ((points == null) || (points.Count == 0))
        return null;

    // Euclidean distance: C^2 = A^2 + B^2
    Func<Point, Point, Double> getLength = (p1, p2) => Math.Sqrt(Math.Pow(p1.X - p2.X, 2) + Math.Pow(p1.Y - p2.Y, 2));

    var results = points.Select((p, i) => new { Point = p, Length = getLength(p, point), Index = i }).ToList();
    Double minLength = results.Min(r => r.Length);
    return results.First(r => (r.Length == minLength)).Index;
}
To find the nearest point in a set of unordered points, you have to iterate through them all and keep track of the minimum distance. This has a time complexity of O(n).
You could significantly improve this by maintaining the points in a more organized data structure (such as an R-tree). There are third-party libraries available if you'd rather not implement your own. Many databases already support the R-tree for spatial indices.
If you only want to search for the point with the nearest X-coordinate, this can be simplified further by storing the points in a sorted collection (such as a SortedList<TKey, TValue>) and performing a binary search (which SortedList<TKey, TValue>.IndexOfKey already implements).
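For instance, a minimal sketch of that idea, assuming you keep the X values in a sorted List<double> that mirrors the series (List<T>.BinarySearch returns the bitwise complement of the insertion point on a miss):

// Requires using System.Collections.Generic.
static int FindNearestXIndex(List<double> sortedX, double clickedX)
{
    int i = sortedX.BinarySearch(clickedX);
    if (i >= 0) return i;                  // exact hit
    i = ~i;                                // index of the first element greater than clickedX
    if (i == 0) return 0;                  // clicked left of all points
    if (i == sortedX.Count) return i - 1;  // clicked right of all points
    // otherwise pick whichever neighbour is closer along X
    return (clickedX - sortedX[i - 1] <= sortedX[i] - clickedX) ? i - 1 : i;
}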
/* My fuzzy binary search */
private int FindNearestId(System.Windows.Forms.DataVisualization.Charting.DataPointCollection p, uint ClickedX)
{
    int ret = 0;
    int low = 0;
    int high = p.Count - 2; // FindNearestId_Match() looks at p[id + 1]
    bool bLoop = true;
    while (bLoop && (low <= high))
    {
        ret = (low + high) / 2;
        switch (FindNearestId_Match(p, ClickedX, ret))
        {
            case 0: // ClickedX lies below p[ret].X - search the left half
                high = ret - 1;
                break;
            case 1: // bracketed: p[ret].X <= ClickedX < p[ret + 1].X
                bLoop = false;
                break;
            case 2: // ClickedX lies at or above p[ret + 1].X - search the right half
                low = ret + 1;
                break;
        }
    }
    return ret + 1; // return the right-hand neighbour of the bracketing pair
}

private int FindNearestId_Match(System.Windows.Forms.DataVisualization.Charting.DataPointCollection p, uint ClickedX, int id)
{
    uint id0 = Convert.ToUInt32(p[id].XValue);
    uint id1 = Convert.ToUInt32(p[id + 1].XValue);
    if ((id0 <= ClickedX) && (ClickedX < id1))
    {
        return 1;
    }
    else if (ClickedX >= id1)
    {
        return 2;
    }
    else
    {
        return 0;
    }
}
The solution can be made clearer.
(As noted above, you should use logarithmic complexity to access the item.)
A solution for double x-values:
double FindNearestPointYValueInSeries( System::Windows::Forms::DataVisualization::Charting::Series ^pxSeries, double dSearchedPosition )
{
    int i_min = 0;
    int i_max = pxSeries->Points->Count - 1;
    int i_mean = 0;
    double d;

    if ( i_max < 0 ) // not defined - at least one point is required
        return Double::NaN;

    while ( i_min <= i_max )
    {
        i_mean = ( i_max + i_min ) / 2;          // index of compared value in series
        d = pxSeries->Points[ i_mean ]->XValue;  // compared value
        if ( d > dSearchedPosition )             // greater - search in left part
            i_max = i_mean - 1;
        else if ( d < dSearchedPosition )        // lower - search in right part
            i_min = i_mean + 1;
        else                                     // equal - exact match
            break;
    }
    // delta is dSearchedPosition - pxSeries->Points[ i_mean ]->XValue
    // get Y value ( on index 0 )
    return pxSeries->Points[ i_mean ]->YValues[0];
}
I'm looking for an algorithm to convert a lotto ticket number to an integer value and back again.
Let's say a lotto number can be between 1 and 45 and a ticket contains 6 unique numbers. This means there are a maximum of 8145060 unique lotto tickets.
e.g.:
01-02-03-04-05-06 = 1
01-02-03-04-05-07 = 2
.
.
.
39-41-42-43-44-45 = 8145059
40-41-42-43-44-45 = 8145060
I'd like to have a function (C# preferred, but any language will do) that converts between a lotto ticket and an integer and back again. At the moment I use the quick-and-dirty method of pre-calculating everything, which needs a lot of memory.
For enumerating integer combinations, you need to use the combinatorial number system. Here's a basic implementation in C#:
using System;
using System.Numerics;
using System.Collections.Generic;
public class CombinatorialNumberSystem
{
// Helper functions for calculating values of (n choose k).
// These are not optimally coded!
// ----------------------------------------------------------------------
protected static BigInteger factorial(int n) {
BigInteger f = 1;
while (n > 1) f *= n--;
return f;
}
protected static int binomial(int n, int k) {
if (k > n) return 0;
return (int)(factorial(n) / (factorial(k) * factorial(n-k)));
}
// In the combinatorial number system, a combination {c_1, c_2, ..., c_k}
// corresponds to the integer value obtained by adding (c_1 choose 1) +
// (c_2 choose 2) + ... + (c_k choose k)
// NOTE: combination values are assumed to start from zero, so
// a combination like {1, 2, 3, 4, 5} will give a non-zero result
// ----------------------------------------------------------------------
public static int combination_2_index(int[] combo) {
int ix = 0, i = 1;
Array.Sort(combo);
foreach (int c in combo) {
if (c > 0) ix += binomial(c, i);
i++;
}
return ix;
}
// The reverse of this process is a bit fiddly. See Wikipedia for an
// explanation: https://en.wikipedia.org/wiki/Combinatorial_number_system
// ----------------------------------------------------------------------
public static int[] index_2_combination(int ix, int k) {
List<int> combo_list = new List<int>();
while (k >= 1) {
int n = k - 1;
if (ix == 0) {
combo_list.Add(n);
k--;
continue;
}
int b = 0;
while (true) {
// (Using a linear search here, but a binary search with
// precomputed binomial values would be faster)
int b0 = b;
b = binomial(n, k);
if (b > ix || ix == 0) {
ix -= b0;
combo_list.Add(n-1);
break;
}
n++;
}
k--;
}
int[] combo = combo_list.ToArray();
Array.Sort(combo);
return combo;
}
}
The calculations are simpler if you work with combinations of integers that start from zero, so for example:
00-01-02-03-04-05 = 0
00-01-02-03-04-06 = 1
.
.
.
38-40-41-42-43-44 = 8145058
39-40-41-42-43-44 = 8145059
You can play around with this code at ideone if you like.
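For instance, tying this back to the 1-based tickets in the question, a usage sketch of the class above (shift each number down by one before encoding, and up by one after decoding):

int[] ticket = { 1, 2, 3, 4, 5, 7 };                    // i.e. 01-02-03-04-05-07
int[] zeroBased = Array.ConvertAll(ticket, c => c - 1);
int index = CombinatorialNumberSystem.combination_2_index(zeroBased) + 1;      // => 2
int[] back = Array.ConvertAll(
    CombinatorialNumberSystem.index_2_combination(index - 1, 6), c => c + 1);  // => {1,2,3,4,5,7}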
There actually seem to be 45^6 distinct numbers; a simple way is to treat the ticket number as a base-45 number and convert it to base 10:
static ulong toDec(string input)
{
    ulong output = 0;
    var lst = input.Split('-').ToList();
    for (int ix = 0; ix < lst.Count; ix++)
    {
        output = output + ((ulong.Parse(lst[ix]) - 1) * (ulong)Math.Pow(45, 5 - ix));
    }
    return output;
}
examples:
01-01-01-01-01-01 => 0
01-01-01-01-01-02 => 1
01-01-01-01-02-01 => 45
45-45-45-45-45-45 => 8303765624
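The question also asks for the reverse direction; a sketch of the inverse under the same base-45 interpretation (the toTicket name is mine, not from the original) could look like this:

static string toTicket(ulong value)
{
    var parts = new string[6];
    for (int ix = 5; ix >= 0; ix--)
    {
        parts[ix] = ((value % 45) + 1).ToString("00"); // least significant digit maps to the last position
        value /= 45;
    }
    return string.Join("-", parts);
}
// toTicket(0) == "01-01-01-01-01-01", toTicket(45) == "01-01-01-01-02-01"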
I would like to find distinct random numbers within a range that sum up to a given number.
Note: I found similar questions on Stack Overflow; however, they do not address exactly this problem (i.e. they do not consider a negative lowerLimit for the range).
If I wanted the sum of my random numbers to equal 1, I would just generate the required random numbers, compute the sum, and divide each of them by the sum. Here, however, I need something a bit different: my random numbers must add up to something other than 1 and must still lie within a given range.
Example: I need 30 distinct random numbers (non-integers) between -50 and 50 where the sum of the 30 generated numbers is 300. I wrote the code below; however, it does not work when n is much larger than the range (upperLimit - lowerLimit): the function can return numbers outside the range [lowerLimit, upperLimit]. Any help to improve the current solution?
static void Main(string[] args)
{
var listWeights = GetRandomNumbersWithConstraints(30, 50, -50, 300);
}
private static List<double> GetRandomNumbersWithConstraints(int n, int upperLimit, int lowerLimit, int sum)
{
if (upperLimit <= lowerLimit || n < 1)
throw new ArgumentOutOfRangeException();
Random rand = new Random(Guid.NewGuid().GetHashCode());
List<double> weight = new List<double>();
for (int k = 0; k < n; k++)
{
//multiply by rand.NextDouble() to avoid duplicates
double temp = (double)rand.Next(lowerLimit, upperLimit) * rand.NextDouble();
if (weight.Contains(temp))
k--;
else
weight.Add(temp);
}
//divide each element by the sum
weight = weight.ConvertAll<double>(x => x / weight.Sum()); //here the sum of my weight will be 1
return weight.ConvertAll<double>(x => x * sum);
}
EDIT - to clarify
Running the current code generates the following 30 numbers, which add up to 300. However, these numbers are not all within -50 and 50:
-4.425315699
67.70219958
82.08592061
46.54014109
71.20352208
-9.554070146
37.65032717
-75.77280868
24.68786878
30.89874589
142.0796933
-1.964407284
9.831226893
-15.21652248
6.479463312
49.61283063
118.1853036
-28.35462683
49.82661159
-65.82706541
-29.6865969
-54.5134262
-56.04708803
-84.63783048
-3.18402453
-13.97935982
-44.54265204
112.774348
-2.911427266
-58.94098071
OK, here is how it could be done.
We will use the Dirichlet distribution, which is a distribution of random numbers x_i in the range [0...1] such that
Σ_i x_i = 1
So, after a linear rescaling, the condition on the sum is satisfied automatically. The Dirichlet distribution is parameterized by α_i, but we assume all the random numbers come from the same marginal distribution, so there is a single parameter α shared by every index.
For a reasonably large value of α, the mean of the sampled random numbers is 1/n and the variance is ~1/(n·α), so a larger α keeps the random values closer to the mean.
OK, now back to the rescaling:
v_i = A + B·x_i
And we have to find A and B. As #HansKesting rightfully noted, with only two free parameters we can satisfy only two constraints, but you have three. So we strictly satisfy the lower-bound constraint and the sum constraint, but may occasionally violate the upper-bound constraint. In that case we just throw the whole sample away and draw another one.
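Spelling the rescaling out: summing v_i = A + B·x_i over i and using Σ_i x_i = 1 gives sum = n·A + B, so B = sum - n·A. Pinning the lower bound with A = lo yields B = sum - n·lo = n·(mean - lo), which is exactly the lo + (mean - lo) * n * rn[k] rescaling used in the code below.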
Again, we have a knob to turn: a larger α means values stay closer to the mean and are less likely to hit the upper bound. With α = 1 I rarely get a good sample, but with α = 10 close to 40% of samples are good, and with α = 16 close to 80%.
Dirichlet sampling is done via the Gamma distribution, using code from MathDotNet.
Code, tested with .NET Core 2.1
using System;
using MathNet.Numerics.Distributions;
using MathNet.Numerics.Random;
class Program
{
static void SampleDirichlet(double alpha, double[] rn)
{
if (rn == null)
throw new ArgumentException("SampleDirichlet:: Results placeholder is null");
if (alpha <= 0.0)
throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");
int n = rn.Length;
if (n == 0)
throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");
var gamma = new Gamma(alpha, 1.0);
double sum = 0.0;
for(int k = 0; k != n; ++k) {
double v = gamma.Sample();
sum += v;
rn[k] = v;
}
if (sum <= 0.0)
throw new ApplicationException($"SampleDirichlet:: sum {sum} is non-positive");
// normalize
sum = 1.0 / sum;
for(int k = 0; k != n; ++k) {
rn[k] *= sum;
}
}
static bool SampleBoundedDirichlet(double alpha, double sum, double lo, double hi, double[] rn)
{
if (rn == null)
throw new ArgumentException("SampleDirichlet:: Results placeholder is null");
if (alpha <= 0.0)
throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");
if (lo >= hi)
throw new ArgumentException($"SampleDirichlet:: low {lo} is larger than high {hi}");
int n = rn.Length;
if (n == 0)
throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");
double mean = sum / (double)n;
if (mean < lo || mean > hi)
throw new ArgumentException($"SampleDirichlet:: mean value {mean} is not within [{lo}...{hi}] range");
SampleDirichlet(alpha, rn);
bool rc = true;
for(int k = 0; k != n; ++k) {
double v = lo + (mean - lo)*(double)n * rn[k];
if (v > hi)
rc = false;
rn[k] = v;
}
return rc;
}
static void Main(string[] args)
{
double[] rn = new double [30];
double lo = -50.0;
double hi = 50.0;
double alpha = 10.0;
double sum = 300.0;
for(int k = 0; k != 1_000; ++k) {
var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
Console.WriteLine($"Rng(BD), v = {q}");
double s = 0.0;
foreach(var r in rn) {
Console.WriteLine($"Rng(BD), r = {r}");
s += r;
}
Console.WriteLine($"Rng(BD), summa = {s}");
}
}
}
UPDATE
Usually, when people ask such a question, there is an implicit assumption/requirement: all random numbers shall be distributed in the same way. It means that if I plot the marginal probability density function (PDF) for the item at index 0 of the sampled array, I shall get the same distribution as for the last item in the array. People usually sample random arrays to pass them down to other routines to do some interesting stuff. If the marginal PDF for item 0 differed from the marginal PDF for the last item, then simply reversing the array would produce wildly different results in the code that consumes such random values.
Here I plotted the distributions of random numbers for item 0 and the last item (#29) for the original conditions ([-50...50], sum = 300), using my sampling routine. They look similar, don't they?
OK, and here is a picture from your sampling routine, with the same original conditions ([-50...50], sum = 300) and the same number of samples.
UPDATE II
The user is supposed to check the return value of the sampling routine and accept and use the sampled array if (and only if) the return value is true. This is an acceptance/rejection method. As an illustration, below is the code used to histogram the samples:
int[] hh = new int[100]; // histogram allocated
var s = 1.0; // step size
int k = 0; // good samples counter
for( ;; ) {
var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
if (q) // good sample, accept it
{
var v = rn[0]; // any index, 0 or 29 or ....
var i = (int)((v - lo) / s);
i = System.Math.Max(i, 0);
i = System.Math.Min(i, hh.Length-1);
hh[i] += 1;
++k;
if (k == 100000) // required number of good samples reached
break;
}
}
for(k = 0; k != hh.Length; ++k)
{
var x = lo + (double)k * s + 0.5*s;
var v = hh[k];
Console.WriteLine($"{x} {v}");
}
Here you go. It'll probably run for centuries before actually returning the list, but it'll comply :)
public List<double> TheThing(int qty, double lowest, double highest, double sumto)
{
if (highest * qty < sumto)
{
throw new Exception("Impossibru!");
// heresy
highest = sumto / 1 + (qty * 2);
lowest = -highest;
}
double rangesize = (highest - lowest);
Random r = new Random();
List<double> ret = new List<double>();
while (ret.Sum() != sumto)
{
if (ret.Count > 0)
ret.RemoveAt(0);
while (ret.Count < qty)
ret.Add((r.NextDouble() * rangesize) + lowest);
}
return ret;
}
I came up with this solution, which is fast. I am sure it could be improved, but for the moment it does the job.
n = the number of random numbers that I need to find
Constraints:
the n random numbers must add up to finalSum
the n random numbers must be within lowerLimit and upperLimit
The idea is to remove from the initial list of random numbers (which sums to finalSum) the numbers outside the range [lowerLimit, upperLimit].
Then count the numbers left in the list (call it nValid) and their sum (call it sumOfValid).
Now, iteratively search for (n - nValid) random numbers within the range [lowerLimit, upperLimit] whose sum is (finalSum - sumOfValid).
I tested it with several combinations of the input variables (including a negative sum) and the results look good.
static void Main(string[] args)
{
int n = 100;
int max = 5000;
int min = -500000;
double finalSum = -1000;
for (int i = 0; i < 5000; i++)
{
var listWeights = GetRandomNumbersWithConstraints(n, max, min, finalSum);
Console.WriteLine("=============");
Console.WriteLine("sum = " + listWeights.Sum());
Console.WriteLine("max = " + listWeights.Max());
Console.WriteLine("min = " + listWeights.Min());
Console.WriteLine("count = " + listWeights.Count());
}
}
private static List<double> GetRandomNumbersWithConstraints(int n, int upperLimit, int lowerLimit, double finalSum, int precision = 6)
{
if (upperLimit <= lowerLimit || n < 1) //todo improve here
throw new ArgumentOutOfRangeException();
Random rand = new Random(Guid.NewGuid().GetHashCode());
List<double> randomNumbers = new List<double>();
int adj = (int)Math.Pow(10, precision);
bool flag = true;
List<double> weights = new List<double>();
while (flag)
{
foreach (var d in randomNumbers.Where(x => x <= upperLimit && x >= lowerLimit).ToList())
{
if (!weights.Contains(d)) //only distinct
weights.Add(d);
}
if (weights.Count() == n && weights.Max() <= upperLimit && weights.Min() >= lowerLimit && Math.Round(weights.Sum(), precision) == finalSum)
return weights;
/* worst case - if the largest sum of the missing elements (ie we still need to find 3 elements,
* then the largest sum is 3*upperlimit) is smaller than (finalSum - sumOfValid)
*/
if (((n - weights.Count()) * upperLimit < (finalSum - weights.Sum())) ||
((n - weights.Count()) * lowerLimit > (finalSum - weights.Sum())))
{
weights = weights.Where(x => x != weights.Max()).ToList();
weights = weights.Where(x => x != weights.Min()).ToList();
}
int nValid = weights.Count();
double sumOfValid = weights.Sum();
int numberToSearch = n - nValid;
double sum = finalSum - sumOfValid;
double j = finalSum - weights.Sum();
if (numberToSearch == 1 && (j >= lowerLimit && j <= upperLimit)) // the last number must itself lie inside the range
{
weights.Add(finalSum - weights.Sum());
}
else
{
randomNumbers.Clear();
int min = lowerLimit;
int max = upperLimit;
for (int k = 0; k < numberToSearch; k++)
{
randomNumbers.Add((double)rand.Next(min * adj, max * adj) / adj);
}
if (sum != 0 && randomNumbers.Sum() != 0)
randomNumbers = randomNumbers.ConvertAll<double>(x => x * sum / randomNumbers.Sum());
}
}
return randomNumbers;
}
I have a situation where I need to evenly distribute N items across M slots. Each item has its own distribution percentage. For discussion purposes, say there are three items (a, b, c) with respective percentages of (50, 25, 25) to be distributed evenly across 20 slots. Hence 10 × a, 5 × b, and 5 × c need to be distributed. The outcome would be as follows:
1. a
2. a
3. c
4. b
5. a
6. a
7. c
8. b
9. a
10. a
11. c
12. b
13. a
14. a
15. c
16. b
17. a
18. a
19. c
20. b
The part that I am struggling with is that the number of slots, the number of items, and the percentages can all vary; of course, the percentages always total 100%. The code that I wrote produces the following output, which is always back-weighted in favour of the item with the highest percentage. Any ideas would be great.
1. a
2. b
3. c
4. a
5. b
6. c
7. a
8. b
9. c
10. a
11. c
12. b
13. a
14. b
15. c
16. a
17. a
18. a
19. a
20. a
Edit
This is what my code currently looks like. It results in the back-weighted distribution I mentioned earlier. For a little context, I am trying to evenly assign commercials across programs. Every run with the same inputs has to produce exactly the same output, which is what rules out the use of random numbers.
foreach (ListRecord spl in lstRecords){
string key = spl.AdvertiserName + spl.ContractNumber + spl.AgencyAssignmentCode;
if (!dictCodesheets.ContainsKey(key)){
int maxAssignmentForCurrentContract = weeklyList.Count(c => (c.AdvertiserName == spl.AdvertiserName) && (c.AgencyAssignmentCode == spl.AgencyAssignmentCode)
&& (c.ContractNumber == spl.ContractNumber) && (c.WeekOf == spl.WeekOf));
int tmpAssignmentCount = 0;
for (int i = 0; i < tmpLstGridData.Count; i++)
{
GridData gData = tmpLstGridData[i];
RotationCalculation commIDRotationCalc = new RotationCalculation();
commIDRotationCalc.commercialID = gData.commercialID;
commIDRotationCalc.maxAllowed = (int)Math.Round(((double)(maxAssignmentForCurrentContract * gData.rotationPercentage) / 100), MidpointRounding.AwayFromZero);
tmpAssignmentCount += commIDRotationCalc.maxAllowed;
if (tmpAssignmentCount > maxAssignmentForCurrentContract)
{
commIDRotationCalc.maxAllowed -= 1;
}
if (i == 0)
{
commIDRotationCalc.maxAllowed -= 1;
gridData = gData;
}
commIDRotationCalc.frequency = (int)Math.Round((double)(100/gData.rotationPercentage));
if (i == 1)
{
commIDRotationCalc.isNextToBeAssigned = true;
}
lstCommIDRotCalc.Add(commIDRotationCalc);
}
dictCodesheets.Add(key, lstCommIDRotCalc);
}else{
List<RotationCalculation> lstRotCalc = dictCodesheets[key];
for (int i = 0; i < lstRotCalc.Count; i++)
{
if (lstRotCalc[i].isNextToBeAssigned)
{
gridData = tmpLstGridData.Where(c => c.commercialID == lstRotCalc[i].commercialID).FirstOrDefault();
lstRotCalc[i].maxAllowed -= 1;
if (lstRotCalc.Count != 1)
{
if (i == lstRotCalc.Count - 1 && lstRotCalc[0].maxAllowed > 0)
{
//Debug.Print("In IF");
lstRotCalc[0].isNextToBeAssigned = true;
lstRotCalc[i].isNextToBeAssigned = false;
if (lstRotCalc[i].maxAllowed == 0)
{
lstRotCalc.RemoveAt(i);
}
break;
}
else
{
if (lstRotCalc[i + 1].maxAllowed > 0)
{
//Debug.Print("In ELSE");
lstRotCalc[i + 1].isNextToBeAssigned = true;
lstRotCalc[i].isNextToBeAssigned = false;
if (lstRotCalc[i].maxAllowed == 0)
{
lstRotCalc.RemoveAt(i);
}
break;
}
}
}
}
}
}
}
Edit 2
Trying to clear up my requirement here. Currently, because item 'a' is to be assigned 10 times, the highest count among the three items, slots 16-20 at the end of the distribution have all been assigned only 'a'. As has been asked in the comments, I am trying to achieve a distribution that "looks" more even.
One way to look at this problem is as a multi-dimensional line drawing problem. So I used Bresenham's line algorithm to create the distribution:
public static IEnumerable<T> GetDistribution<T>( IEnumerable<Tuple<T, int>> itemCounts )
{
    var groupCounts = itemCounts.GroupBy( pair => pair.Item1 )
                                .Select( g => new { Item = g.Key, Count = g.Sum( pair => pair.Item2 ) } )
                                .OrderByDescending( g => g.Count )
                                .ToList();

    int maxCount = groupCounts[0].Count;
    var errorValues = new int[groupCounts.Count];
    for( int i = 1; i < errorValues.Length; ++i )
    {
        errorValues[i] = 2 * groupCounts[i].Count - maxCount;
    }

    for( int i = 0; i < maxCount; ++i )
    {
        yield return groupCounts[0].Item;
        for( int j = 1; j < errorValues.Length; ++j )
        {
            if( errorValues[j] > 0 )
            {
                yield return groupCounts[j].Item;
                errorValues[j] -= 2 * maxCount;
            }
            errorValues[j] += 2 * groupCounts[j].Count;
        }
    }
}
The input is the actual number of each item you want. This has a couple advantages. First it can use integer arithmetic, which avoids any rounding issues. Also it gets rid of any ambiguity if you ask for 10 items and want 3 items evenly distributed (which is basically just the rounding issue again).
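For example, with the counts from the question (10 × a, 5 × b, 5 × c), a hypothetical call might look like this; the interleaving comes out as a, a, b, c repeated:

// Hypothetical usage of GetDistribution above with the question's counts.
var slots = GetDistribution(new[]
{
    Tuple.Create("a", 10),
    Tuple.Create("b", 5),
    Tuple.Create("c", 5),
});
Console.WriteLine(string.Join(", ", slots)); // a, a, b, c, a, a, b, c, ... (20 items)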
Here's one with no random numbers that gives the required output.
using System;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
// name, percentage
Dictionary<string, double> distribution = new Dictionary<string,double>();
// name, amount if one more were to be distributed
Dictionary<string, int> dishedOut = new Dictionary<string, int>();
//Initialize
int numToGive = 20;
distribution.Add("a", 0.50);
distribution.Add("b", 0.25);
distribution.Add("c", 0.25);
foreach (string name in distribution.Keys)
dishedOut.Add(name, 1);
for (int i = 0; i < numToGive; i++)
{
//find the type with the lowest weighted distribution
string nextUp = null;
double lowestRatio = double.MaxValue;
foreach (string name in distribution.Keys)
if (dishedOut[name] / distribution[name] < lowestRatio)
{
lowestRatio = dishedOut[name] / distribution[name];
nextUp = name;
}
//distribute it
dishedOut[nextUp] += 1;
Console.WriteLine(nextUp);
}
Console.ReadLine();
}
}
Instead of a truly random number generator, use a fixed seed, so that the program produces the same output every time it runs (for the same input). In the code below, the '0' is the seed, which means the 'random' numbers generated will always be the same each time the program is run.
Random r = new Random(0);
//AABC AABC…
int totalA = 10;
int totalB = 5;
int totalC = 5;
int totalItems = 20; //A+B+C
double frequencyA = (double)totalA / totalItems; //0.5
double frequencyB = (double)totalB / totalItems; //0.25
double frequencyC = (double)totalC / totalItems; //0.25

double filledA = frequencyA;
double filledB = frequencyB;
double filledC = frequencyC;

string output = String.Empty;
while (output.Length < totalItems)
{
    filledA += frequencyA;
    filledB += frequencyB;
    filledC += frequencyC;
    if (filledA >= 1)
    {
        filledA -= 1;
        output += "A";
        if (output.Length == totalItems) { break; }
    }
    if (filledB >= 1)
    {
        filledB -= 1;
        output += "B";
        if (output.Length == totalItems) { break; }
    }
    if (filledC >= 1)
    {
        filledC -= 1;
        output += "C";
        if (output.Length == totalItems) { break; }
    }
}
This answer was mostly stolen and lightly adapted for your use from here
My idea is to distribute your items in the simplest way possible without caring about order, then shuffle the list deterministically.
public static void ShuffleTheSameWay<T>(this IList<T> list)
{
Random rng = new Random(0);
int n = list.Count;
while (n > 1) {
n--;
int k = rng.Next(n + 1);
T value = list[k];
list[k] = list[n];
list[n] = value;
}
}
Fiddle here
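For instance, a sketch of how the pieces might fit together (assuming ShuffleTheSameWay is declared as an extension method in a static class, and using the counts from the question):

// Build the unshuffled list (10 a, 5 b, 5 c), then shuffle it deterministically.
// Requires using System.Collections.Generic and System.Linq.
var items = new List<string>();
items.AddRange(Enumerable.Repeat("a", 10));
items.AddRange(Enumerable.Repeat("b", 5));
items.AddRange(Enumerable.Repeat("c", 5));
items.ShuffleTheSameWay(); // fixed seed, so the order is identical on every run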
One of the requirements for Telegram authentication is decomposing a given number into two prime cofactors. In particular, P*Q = N, where N < 2^63.
How can we find the smaller prime cofactor, such that P < square_root(N)?
My suggestions:
1) Pre-compute primes from 3 to 2^31.5, then test whether N mod P = 0.
2) Find an algorithm to test for primes (but we still have to test N mod P = 0).
Is there an algorithm for primes that is well suited to this case?
Pollard's Rho Algorithm [VB.Net]
Finds P very fast, where P*Q = N, for N < 2^63
Dim rnd As New System.Random
Function PollardRho(n As BigInteger) As BigInteger
If n Mod 2 = 0 Then Return 2
Dim x As BigInteger = rnd.Next(1, 1000)
Dim c As BigInteger = rnd.Next(1, 1000)
Dim g As BigInteger = 1
Dim y = x
While g = 1
x = ((x * x) Mod n + c) Mod n
y = ((y * y) Mod n + c) Mod n
y = ((y * y) Mod n + c) Mod n
g = gcd(BigInteger.Abs(x - y), n)
End While
Return g
End Function
Function gcd(a As BigInteger, b As BigInteger) As BigInteger
Dim r As BigInteger
While b <> 0
r = a Mod b
a = b
b = r
End While
Return a
End Function
Richard Brent's Algorithm [VB.Net] This is even faster.
Function Brent(n As BigInteger) As BigInteger
If n Mod 2 = 0 Then Return 2
Dim y As BigInteger = rnd.Next(1, 1000)
Dim c As BigInteger = rnd.Next(1, 1000)
Dim m As BigInteger = rnd.Next(1, 1000)
Dim g As BigInteger = 1
Dim r As BigInteger = 1
Dim q As BigInteger = 1
Dim x As BigInteger = 0
Dim ys As BigInteger = 0
While g = 1
x = y
For i = 1 To r
y = ((y * y) Mod n + c) Mod n
Next
Dim k = New BigInteger(0)
While (k < r And g = 1)
ys = y
For i = 1 To BigInteger.Min(m, r - k)
y = ((y * y) Mod n + c) Mod n
q = q * (BigInteger.Abs(x - y)) Mod n
Next
g = gcd(q, n)
k = k + m
End While
r = r * 2
End While
If g = n Then
While True
ys = ((ys * ys) Mod n + c) Mod n
g = gcd(BigInteger.Abs(x - ys), n)
If g > 1 Then
Exit While
End If
End While
End If
Return g
End Function
Ugh! I just put this program in and then realized you had tagged your question C#. This is C++, a version of Pollard's Rho I wrote a couple of years ago and posted here on SO to help someone else understand it. It is many times faster at factoring semiprimes than trial division. As I said, I regret that it is C++ and not C#, but you should be able to understand the concept and port it pretty easily. As a bonus, the .NET library has a namespace for handling arbitrarily large integers, whereas my C++ implementation required a third-party library for them. Anyway, even in C#, the program below will break a semiprime on the order of 2^63 into 2 primes in less than 1 second. There are faster algorithms still, but they are much more complex.
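(For anyone who wants to stay in C#: below is a minimal, untested sketch of the same Pollard's Rho idea using System.Numerics.BigInteger. It follows the VB.Net version above rather than being a port of the full C++ program, and it assumes n is composite; it would loop forever on a prime.)

using System.Numerics;

static class PollardRhoSketch
{
    // Returns a non-trivial factor of a composite n.
    public static BigInteger Factor(BigInteger n)
    {
        if (n % 2 == 0) return 2;
        BigInteger x = 2, y = 2, d = 1;
        int c = 1; // constant of the pseudorandom map x -> x*x + c (mod n)
        while (d == 1 || d == n)
        {
            x = (x * x + c) % n;  // tortoise: one step
            y = (y * y + c) % n;  // hare: two steps
            y = (y * y + c) % n;
            d = BigInteger.GreatestCommonDivisor(BigInteger.Abs(x - y), n);
            if (d == n) { x = 2; y = 2; d = 1; c++; } // cycle failed; retry with a new map
        }
        return d;
    }
}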
#include <string>
#include <stdio.h>
#include <iostream>
#include "BigIntegerLibrary.hh"
typedef BigInteger BI;
typedef BigUnsigned BU;
using std::string;
using std::cin;
using std::cout;
BU pollard(BU &numberToFactor);
BU gcda(BU differenceBetweenCongruentFunctions, BU numberToFactor);
BU f(BU &x, BU &numberToFactor, int &increment);
void initializeArrays();
BU getNumberToFactor ();
void factorComposites();
bool testForComposite (BU &num);
BU primeFactors[1000];
BU compositeFactors[1000];
BU tempFactors [1000];
int primeIndex;
int compositeIndex;
int tempIndex;
int numberOfCompositeFactors;
bool allJTestsShowComposite;
int main ()
{
while(1)
{
primeIndex=0;
compositeIndex=0;
tempIndex=0;
initializeArrays();
compositeFactors[0] = getNumberToFactor();
cout<<"\n\n";
if (compositeFactors[0] == 0) return 0;
numberOfCompositeFactors = 1;
factorComposites();
}
}
void initializeArrays()
{
for (int i = 0; i<1000;i++)
{
primeFactors[i] = 0;
compositeFactors[i]=0;
tempFactors[i]=0;
}
}
BU getNumberToFactor ()
{
std::string s;
std::cout<<"Enter the number for which you want a prime factor, or 0 to quit: ";
std::cin>>s;
return stringToBigUnsigned(s);
}
void factorComposites()
{
while (numberOfCompositeFactors!=0)
{
compositeIndex = 0;
tempIndex = 0;
// This while loop finds non-zero values in compositeFactors.
// If they are composite, it factors them and puts one factor in tempFactors,
// then divides the element in compositeFactors by the same amount.
// If the element is prime, it moves it into tempFactors (zeros the element in compositeFactors)
while (compositeIndex < 1000)
{
if(compositeFactors[compositeIndex] == 0)
{
compositeIndex++;
continue;
}
if(testForComposite(compositeFactors[compositeIndex]) == false)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else
{
tempFactors[tempIndex] = pollard (compositeFactors[compositeIndex]);
compositeFactors[compositeIndex] /= tempFactors[tempIndex];
tempIndex++;
compositeIndex++;
}
}
compositeIndex = 0;
// This while loop moves all remaining non-zero values from compositeFactors into tempFactors
// When it is done, compositeFactors should be all 0 value elements
while (compositeIndex < 1000)
{
if (compositeFactors[compositeIndex] != 0)
{
tempFactors[tempIndex] = compositeFactors[compositeIndex];
compositeFactors[compositeIndex] = 0;
tempIndex++;
compositeIndex++;
}
else compositeIndex++;
}
compositeIndex = 0;
tempIndex = 0;
// This while loop checks all non-zero elements in tempIndex.
// Those that are prime are shown on screen and moved to primeFactors
// Those that are composite are moved to compositeFactors
// When this is done, all elements in tempFactors should be 0
while (tempIndex<1000)
{
if(tempFactors[tempIndex] == 0)
{
tempIndex++;
continue;
}
if(testForComposite(tempFactors[tempIndex]) == false)
{
primeFactors[primeIndex] = tempFactors[tempIndex];
cout<<primeFactors[primeIndex]<<"\n";
tempFactors[tempIndex]=0;
primeIndex++;
tempIndex++;
}
else
{
compositeFactors[compositeIndex] = tempFactors[tempIndex];
tempFactors[tempIndex]=0;
compositeIndex++;
tempIndex++;
}
}
compositeIndex=0;
numberOfCompositeFactors=0;
// This while loop just checks to be sure there are still one or more composite factors.
// As long as there are, the outer while loop will repeat
while(compositeIndex<1000)
{
if(compositeFactors[compositeIndex]!=0) numberOfCompositeFactors++;
compositeIndex ++;
}
}
return;
}
// The following method uses the Miller-Rabin primality test to prove with 100% confidence a given number is composite,
// or to establish with a high level of confidence -- but not 100% -- that it is prime
bool testForComposite (BU &num)
{
BU confidenceFactor = 101;
if (confidenceFactor >= num) confidenceFactor = num-1;
BU a,d,s, nMinusOne;
nMinusOne=num-1;
d=nMinusOne;
s=0;
while(modexp(d,1,2)==0)
{
d /= 2;
s++;
}
for (BI i = 2; i <= confidenceFactor; i++)
{
allJTestsShowComposite = true; // assume composite for this witness until we can prove otherwise (must be reset for each i)
if (modexp(i,d,num) == 1)
continue; // if this modulus is 1, then we cannot prove that num is composite with this value of i, so continue
if (modexp(i,d,num) == nMinusOne)
{
allJTestsShowComposite = false;
continue;
}
BU exponent(1);
for (BU j(0); j.toInt()<=s.toInt()-1;j++)
{
exponent *= 2;
if (modexp(i,exponent*d,num) == nMinusOne)
{
// a modulus of n-1 for any j means this witness cannot prove num composite, so stop testing j for this i
allJTestsShowComposite = false;
break;
}
}
if (allJTestsShowComposite == true) return true; // proven composite with 100% certainty, no need to continue testing
}
return false;
/* not proven composite in any test, so assume prime with a possibility of error =
(1/4)^(number of different values of i tested). This will be equal to the value of the
confidenceFactor variable, and the "witnesses" to the primality of the number being tested will be all integers from
2 through the value of confidenceFactor.
Note that this makes this primality test cryptographically less secure than it could be. It is theoretically possible,
if difficult, for a malicious party to pass a known composite number for which all of the lowest n integers fail to
detect that it is composite. A safer way is to generate random integers in the outer "for" loop and use those in place of
the variable i. Better still if those random numbers are checked to ensure no duplicates are generated.
*/
}
BU pollard(BU &n)
{
if (n == 4) return 2;
BU x = 2;
BU y = 2;
BU d = 1;
int increment = 1;
while(d==1||d==n||d==0)
{
x = f(x,n, increment);
y = f(y,n, increment);
y = f(y,n, increment);
if (y>x)
{
d = gcda(y-x, n);
}
else
{
d = gcda(x-y, n);
}
if (d==0)
{
x = 2;
y = 2;
d = 1;
increment++; // This changes the pseudorandom function we use to increment x and y
}
}
return d;
}
BU gcda(BU a, BU b)
{
if (a==b||a==0)
return 0; // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
// a factor. I think this is not proof of primality, so the process could be repeated with a new function.
// For example, by replacing x*x+1 with x*x+2, and so on. If many such functions fail, primality is likely.
BU currentGCD = 1;
while (currentGCD!=0) // This while loop is based on Euclid's algorithm
{
currentGCD = b % a;
b=a;
a=currentGCD;
}
return b;
}
BU f(BU &x, BU &n, int &increment)
{
return (x * x + increment) % n;
}
I have an array like this:
string[] input = new string[] { "bRad", "Charles", "sam", "lukE", "vIctor" };
Now I want to sort it according to the position of the first capital letter occurring in each string; the first occurrence is the only one that matters. If two strings have capitals at the same position, sort them alphabetically; the same applies to strings that do not have any capitals.
What I have done so far does not perform well enough. I have tried countless times to improve it, but with no luck. There is going to be a huge amount of data on which this is tested, so performance is of foremost importance. I'm using .NET 2.0 and I'm not allowed to use any higher version.
public static int q, p, i, s;
public static Dictionary<string, int> a = new Dictionary<string, int>();
Array.Sort(input, delegate (string x, string y) {
if (x == y)
return 0;
if (a.TryGetValue(x + "|" + y, out s))
return s;
if (a.TryGetValue(y + "|" + x, out s))
return -s;
q = x.Length;
p = y.Length;
for (i = 0; i < x.Length; i++)
{
if (x[i] < 91)
{
q = i;
break;
}
}
for (i = 0; i < y.Length; i++)
{
if (y[i] < 91)
{
p = i;
break;
}
}
if (q == x.Length && p == y.Length)
s = x.CompareTo(y);
else if (q > p)
s = 1;
else if (q < p)
s = -1;
else
s = x.CompareTo(y);
a.Add(x + "|" + y, s);
return s;
});
Removing the dictionary cache alone sped it up: my example (15000 values, with up to 500 chars per value) went from 2449.51 ms with the dictionary down to 58.72 ms after removing it.
I tried craigmj's idea of caching each value's capital position, which is faster than doing the concat, but with my random data having no cache at all was still faster.
Here is some code to test out... this runs in 30 ms, compared to 2559 ms for the original:
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
Array.Sort(input, delegate(string x, string y)
{
    if (x == y)
        return 0;
    int shortestLength = Math.Min(x.Length, y.Length);
    for (int i = 0; i < shortestLength; i++)
    {
        if (x[i] < 91 && y[i] < 91)
            return x.CompareTo(y);
        else if (x[i] < 91)
            return -1;
        else if (y[i] < 91)
            return 1;
    }
    return x.CompareTo(y);
});
stopWatch.Stop();
double ms = (stopWatch.ElapsedTicks * 1000.0) / Stopwatch.Frequency;
Debug.WriteLine("Optimized Time: " + ms);
Code that continues checking for a capital past the end of the shorter string:
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
Array.Sort(input, delegate(string x, string y)
{
    if (x == y)
        return 0;
    int xlen = x.Length;
    int ylen = y.Length;
    int longestLength = Math.Max(xlen, ylen);
    for (int i = 0; i < longestLength; i++)
    {
        if (i < xlen && i < ylen && x[i] < 91 && y[i] < 91)
            return x.CompareTo(y);
        else if (i < xlen && x[i] < 91)
            return -1;
        else if (i < ylen && y[i] < 91)
            return 1;
    }
    return x.CompareTo(y);
});
stopWatch.Stop();
double ms = (stopWatch.ElapsedTicks * 1000.0) / Stopwatch.Frequency;
Debug.WriteLine("Optimized Time: " + ms);
You need to think about your algorithm here :-)
How many elements are going to be in your dictionary? Well, since you're putting "{x}|{y}" into the dictionary for every pair of elements x, y in your array, that's n^2 entries for an n-element array. Not a good idea.
This isn't necessarily the best solution (I've not thought about that yet), but for a start:
Only store the position of the first capital in the dictionary for a particular word, not for combinations.
Now your delegate becomes:
delegate (string x, string y) {
    if (x == y) // NOT A GOOD IDEA -
                // the Sort method should not call the comparer with a string and itself,
                // and if this is doing a string comparison
                // (I'm rusty on C#), you're
                // wasting a comparison if there's a mismatch
        return 0;
    int xCapitalPos, yCapitalPos;
    if (!a.TryGetValue(x, out xCapitalPos)) {
        // compute xCapitalPos and add it to dictionary a
    }
    if (!a.TryGetValue(y, out yCapitalPos)) {
        // compute yCapitalPos and add it to dictionary a
    }
    int delta = xCapitalPos - yCapitalPos;
    if (0 != delta) {
        return delta;
    } else {
        return x.CompareTo(y);
    }
}
That's where I would start. See how you do, then consider how you might do better from there...
--- 5 minutes later, cup of coffee in hand
Ok, I've just thought of how I might improve it!
Don't use CompareTo, which does a plain string comparison. Write your own string comparison function that, given 2 strings, compares them while taking capital location into account. Then you can drop the dictionary and everything else: it won't be necessary, since the Sort method (which I presume is implemented as a QuickSort or a MergeSort or something efficient) will ensure you don't do more comparisons than necessary.
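To make that concrete, here is a minimal, untested sketch of such a comparer (my code, not from the original). It mirrors the question's `< 91` notion of "capital" (ASCII 'A'..'Z') and the OP's convention that a string without capitals uses its length as the position:

static int FirstCapitalPos(string s)
{
    for (int i = 0; i < s.Length; i++)
        if (s[i] >= 'A' && s[i] <= 'Z') return i;
    return s.Length; // no capital: use the length, as the OP's code does
}

static int CompareByCapital(string x, string y)
{
    int px = FirstCapitalPos(x);
    int py = FirstCapitalPos(y);
    bool xNone = (px == x.Length), yNone = (py == y.Length);
    if (xNone && yNone) return x.CompareTo(y); // neither has a capital: alphabetical
    if (px != py) return (px < py) ? -1 : 1;   // earlier first capital sorts first
    return x.CompareTo(y);                     // same position: alphabetical
}

// usage (works on .NET 2.0): Array.Sort(input, CompareByCapital);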
All the best,
C
I guess there are a couple of optimisations (it should already be damned fast).
Firstly, as suggested above, there's really no need for a dictionary, since the Sort algorithm itself optimises that need away.
Second, loop through x and y at the same time and break out as early as possible (that is, if one has a capital at a position and the other doesn't, exit early). Minor savings here.
That should just about do it (and be simpler). You're basically rewriting String.CompareTo, but efficiently, and relying on Array.Sort to do the rest.
Here's some (pseudo) code for the delegate that does a Capital-and-String position match:
delegate (string lhs, string rhs) {
    int llength = lhs.Length;
    int rlength = rhs.Length;
    // The value we will return
    // <0  => lhs < rhs
    // ==0 => lhs == rhs
    // >0  => lhs > rhs
    int ret = 0;
    Boolean uppercaseFoundAndEqual = false;
    int i;
    for (i = 0; i < llength; i++) {
        if (i >= rlength) {
            // We've exhausted the rhs, but not the lhs
            return (0 == ret) ? -1 : ret;
        }
        Char l = lhs[i];
        Char r = rhs[i];
        // We only worry about the case position if we've not yet found
        // an uppercase char in either string
        if (!uppercaseFoundAndEqual) {
            Boolean lUpper = (('A' <= l) && ('Z' >= l));
            Boolean rUpper = (('A' <= r) && ('Z' >= r));
            // If exactly one of the two has an uppercase here, the one with
            // the earlier first capital sorts first
            int delta = (rUpper ? 1 : 0) - (lUpper ? 1 : 0);
            if (0 != delta) return delta;
            if (lUpper) { // Both are upper case - by our delta comparison, we know
                          // lUpper == rUpper
                if (0 != ret) return ret; // Return based on previous char comparison
                // Otherwise we've found an uppercase in both; now we're just doing
                // standard string comparison
                uppercaseFoundAndEqual = true;
            }
        }
        if (0 == ret) { // If we're still equal to this point, standard char comparison
            ret = l - r;
        }
        if (uppercaseFoundAndEqual && (0 != ret)) {
            return ret;
        }
    }
    if (i < rlength) {
        // We've exhausted the lhs, but not the rhs
        return (0 == ret) ? 1 : ret;
    }
    return 0; // We exhausted both strings and they're identical
}
Please take careful note:
I've not tested this! (Hence the 'pseudo-code' comment.) I don't have C#, so it's based on a little Googling for C# syntax. Please correct it if I've made errors!
This does not handle internationalization. In other words, this is a very basic ASCII comparison. Yuck! You should look at using System.Globalization.StringInfo, but I'm afraid to code that based only on some Googling!
Your description doesn't say what happens when two strings differ in length without capitalization. For example, is 'aa' greater than or less than 'aaA'? I've implemented it so that capitalization outside the 'intersect' area is ignored, but it wouldn't be difficult to change that.
(For emphasis) I've not tested this! Mileage may vary, but I believe the general idea is good.