Bit counting arbitrarily large positive integers in C#

There are many implementations of bit counting out there but in my case, I need to test if an arbitrarily large number contains at most two set bits.
I wrote the following function that does the job and seems to be quite fast but I wanted to find out if it can be further optimized for C#. This function gets called in a loop a few million times.
public static byte [] BitCountLookupArray = new byte []
{
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8
};
// The parameter [number] will NEVER be negative.
public static bool HasSetBitCountOfLessThenThree (System.Numerics.BigInteger number)
{
int sum = 0;
byte [] bytes = null;
bytes = number.ToByteArray();
for (int i=0; i < bytes.Length; i++)
{
sum += BitCountLookupArray [bytes [i]];
}
return (sum < 3);
}
IMPORTANT: The argument [number] sent to the function will NEVER be negative.
Some points I thought of were:
Making the function static. Done.
Using a static lookup array. Done.
Using pointers instead of array indexes since the number of bytes often crosses 100,000. Not sure how much this would help.
Forcing an inline function which sadly cannot be guaranteed in .NET.
Open to other suggestions.

You can optimise it further by returning as soon as the result is known:
for (int i=0; i < bytes.Length; i++)
{
sum += BitCountLookupArray [bytes [i]];
if(sum >= 3)
{
// Stop early: once three set bits have been seen,
// the remaining bytes cannot change the result.
return false;
}
}
return true;

Since you only need to know whether you have fewer than 3 set bits, I would suggest this:
// remove two bits
number &= number - 1;
number &= number - 1;
// if number != 0, then there were 3 or more bits set
return number.IsZero;
Of course Rain's method works too, and I'm not sure which strategy will be faster.
Alternative:
//remove one bit
number &= number - 1;
// if the number of bits left is 0 or 1, there were < 3 bits set
return number.IsZero || number.IsPowerOfTwo;
It's probably faster to test first, and remove the bit later:
return number.IsZero || // zero bits?
number.IsPowerOfTwo || // one bit?
(number & (number - 1)).IsPowerOfTwo; // two bits?
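
The bit-clearing idea above can be sketched as a complete helper. This is only an illustration: the class name BitTests and the method name HasAtMostTwoSetBits are made up here and do not come from the original posts. It relies on the fact that n & (n - 1) clears the lowest set bit of a non-negative n.

```csharp
using System;
using System.Numerics;

public static class BitTests
{
    // Hypothetical helper: true when a non-negative value has at most two set bits.
    public static bool HasAtMostTwoSetBits(BigInteger number)
    {
        if (number.Sign < 0)
            throw new ArgumentOutOfRangeException(nameof(number), "Value must be non-negative.");

        // n & (n - 1) clears the lowest set bit; do that at most twice.
        for (int i = 0; i < 2 && !number.IsZero; i++)
        {
            number &= number - 1;
        }

        // Anything left over means there were three or more set bits.
        return number.IsZero;
    }
}
```

For example, HasAtMostTwoSetBits((BigInteger.One << 100000) + 1) returns true and HasAtMostTwoSetBits(7) returns false. Each subtraction and AND allocates a new BigInteger of the full size, so for inputs of 100,000 bytes or more it is worth benchmarking this against the early-exit byte loop rather than assuming either one wins.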

The most obvious optimisation is to drop out of the loop as soon as sum >= 3 (a single byte can add more than one bit, so sum may skip past 3), since any further matches past that point are immaterial.
There's also no need to set bytes twice; simply use byte [] bytes = number.ToByteArray();, but the benefit here is minuscule.

Related

Accord.NET throw "Index was outside the bounds of the array" [duplicate]

I was working with KNN from Accord.NET and ran into this error when I needed to test the model.
The error message (Index was outside the bounds of the array) didn't help at all, because the exception is thrown inside the library itself.
Simple code with random data:
using Accord.MachineLearning;
double[][] inputs =
{
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 5 ,1},
new double[] { 16, 2 ,0}, new double[] { 4, 15 ,1},
};
int[] outputs =
{
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 9
};
var knn = new KNearestNeighbors(k: 15);
knn.Learn(inputs, outputs);
//test
var t = new double[] { 16, 2, 0 };
int answer = knn.Decide(t);
This throws the exception quoted above.
I found a workaround, which I share below:

After many days, and after reducing the problem to this simple sample, I found that the output array should contain a contiguous range of class labels (encoded as 0, 1, 2, 3, ...).
So the 9 is what triggers the bug here 🙂
int[] outputs =
{
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0, 2
};
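
If the real labels are not already 0, 1, 2, ..., one way to get there is to re-encode them before training. The following is a minimal sketch in plain C# that does not call Accord.NET at all; LabelEncoder and ToContiguous are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;

public static class LabelEncoder
{
    // Hypothetical helper: maps arbitrary integer labels to 0, 1, 2, ...
    // in order of first appearance and reports the mapping it used.
    public static int[] ToContiguous(int[] labels, out Dictionary<int, int> mapping)
    {
        mapping = new Dictionary<int, int>();
        var encoded = new int[labels.Length];
        for (int i = 0; i < labels.Length; i++)
        {
            int code;
            if (!mapping.TryGetValue(labels[i], out code))
            {
                // First time this label is seen: give it the next free code.
                code = mapping.Count;
                mapping.Add(labels[i], code);
            }
            encoded[i] = code;
        }
        return encoded;
    }
}
```

With the labels { 0, 1, 0, 9 } this yields { 0, 1, 0, 2 }. Keep the mapping around so that a predicted class can be translated back to the original label.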

Shift values in array [C#]

I would like to shift values in an array to the corners.
This is my 2-dimensional array:
4, 8, 0, 2
0, 6, 1, 9
7, 0, 5, 3
I want to shift all 0 values to the corners so it looks like one of these:
0, 4, 8, 0
2, 6, 1, 9
0, 7, 5, 3
or
0, 4, 8, 0
2, 6, 1, 9
7, 5, 3, 0
or
0, 4, 8, 2
6, 1, 9, 7
0, 5, 3, 0
or
4, 8, 2, 0
6, 1, 9, 7
0, 5, 3, 0
Is there an easy way to do it? Thanks a lot!
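
One possible reading of the examples is that the non-zero values keep their row-major order while the zeros are parked in corner cells. Below is a minimal sketch under that assumption. The names CornerShift and ZerosToCorners, the fixed corner priority (top-left, top-right, bottom-left, bottom-right) and the limits of at most four zeros and at least a 2x2 grid are all choices made for this example, not part of the question.

```csharp
using System;
using System.Collections.Generic;

public static class CornerShift
{
    // Hypothetical helper: moves zeros into corner cells and keeps all other
    // values in their original row-major order.
    public static int[,] ZerosToCorners(int[,] grid)
    {
        int rows = grid.GetLength(0), cols = grid.GetLength(1);
        if (rows < 2 || cols < 2)
            throw new ArgumentException("Grid must be at least 2x2.");

        var values = new Queue<int>();
        int zeros = 0;
        foreach (int v in grid)   // a 2-D array enumerates in row-major order
        {
            if (v == 0) zeros++; else values.Enqueue(v);
        }
        if (zeros > 4)
            throw new ArgumentException("At most four zeros fit into the corners.");

        // Assumed priority: top-left, top-right, bottom-left, bottom-right.
        int[,] corners = { { 0, 0 }, { 0, cols - 1 }, { rows - 1, 0 }, { rows - 1, cols - 1 } };
        var zeroCell = new bool[rows, cols];
        for (int i = 0; i < zeros; i++)
        {
            zeroCell[corners[i, 0], corners[i, 1]] = true;
        }

        // Fill every non-corner-zero cell with the next value in order.
        var result = new int[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                result[r, c] = zeroCell[r, c] ? 0 : values.Dequeue();
            }
        }
        return result;
    }
}
```

For the array in the question this produces the first of the listed layouts; changing the order of the rows in corners selects one of the others.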

How can I get "thinner" graph for my coordinate system?

Following up with this, I have a bunch of coordinates and I draw them on a bitmap image as a coordinate system. Now, I would like to get rid of all the noise, and filter coordinates to give a "clearer" or "cleaner" path and "less" or "better" data to work on. To explain more, I will need to expose my awesome painting skills as follows:
Current:
Desired:
Notice:
I will need to delete coordinates
I might need to add coordinates
I might need to ignore shortest neighbor in some cases
The only thing I can think of is to use a shortest-path algorithm such as A* or Dijkstra: populate a data structure with the neighbors and costs for every node, then run the algorithm. I don't want to start on something that might be wrong or a waste of effort. I would love to see pseudocode, if possible, showing how I could solve such a problem.
P.S I am currently on Wpf C# but I am open to use C# or C++ for any task. Thanks

You are looking for an operation called thinning or skeletonization, possibly followed by some post-processing to remove small components. There are different algorithms for this that offer different properties. For example Guo and Hall's and Zhang and Suen's.
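
To give a feel for what such an algorithm looks like, here is a sketch of Zhang and Suen's thinning written from its usual textbook description, not taken from the answer above. The names Thinning, ZhangSuen and At are invented for this example, and it assumes the coordinates have already been rasterised into a bool[,] where true marks a point, with everything outside the array treated as background.

```csharp
using System.Collections.Generic;

public static class Thinning
{
    // 1 if (y, x) is inside the image and set, otherwise 0.
    static int At(bool[,] img, int y, int x)
    {
        return (y >= 0 && y < img.GetLength(0) && x >= 0 && x < img.GetLength(1) && img[y, x]) ? 1 : 0;
    }

    // Sketch of Zhang-Suen thinning; returns a thinned copy of the input.
    public static bool[,] ZhangSuen(bool[,] image)
    {
        int h = image.GetLength(0), w = image.GetLength(1);
        var img = (bool[,])image.Clone();
        bool changed;
        do
        {
            changed = false;
            for (int pass = 0; pass < 2; pass++)
            {
                var toClear = new List<int>();
                for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                {
                    if (!img[y, x]) continue;
                    // Neighbours clockwise from north: N, NE, E, SE, S, SW, W, NW.
                    int[] p =
                    {
                        At(img, y - 1, x), At(img, y - 1, x + 1), At(img, y, x + 1), At(img, y + 1, x + 1),
                        At(img, y + 1, x), At(img, y + 1, x - 1), At(img, y, x - 1), At(img, y - 1, x - 1)
                    };
                    int count = 0, transitions = 0;
                    for (int i = 0; i < 8; i++)
                    {
                        count += p[i];
                        if (p[i] == 0 && p[(i + 1) % 8] == 1) transitions++;
                    }
                    if (count < 2 || count > 6 || transitions != 1) continue;
                    // p[0] = N, p[2] = E, p[4] = S, p[6] = W
                    bool removable = pass == 0
                        ? p[0] * p[2] * p[4] == 0 && p[2] * p[4] * p[6] == 0
                        : p[0] * p[2] * p[6] == 0 && p[0] * p[4] * p[6] == 0;
                    if (removable) toClear.Add(y * w + x);
                }
                // Delete only after the whole pass has been evaluated.
                foreach (int idx in toClear) img[idx / w, idx % w] = false;
                if (toClear.Count > 0) changed = true;
            }
        } while (changed);
        return img;
    }
}
```

The loop repeats both sub-passes until neither removes a pixel, so the result only ever shrinks the original set of points. Small spurs and isolated points may still need the post-processing mentioned above.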

What you're after is a path finding application. There are several ways to approach this, but one of the simpler ways is to:
Pick a starting point, add to list
While True:
    For each border_pt bordering last point on list:
        Count number of points bordering border_pt
        If count > best_count:
            Mark border_pt as best
    If best is empty:
        break
    Add best to list
Here's some C# code that does just that, it generates a simple list based on your cloud:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Drawing;
using System.Linq;
using System.Threading.Tasks;
using System.Windows.Forms;
namespace WindowsFormsApplication1
{
class ExampleProgram : Form
{
const int GridWidth = 24;
const int GridHeight = 15;
List<Point> m_points = new List<Point>();
List<Point> m_trail = new List<Point>();
[STAThread]
static void Main()
{
Application.EnableVisualStyles();
Application.SetCompatibleTextRenderingDefault(false);
Application.Run(new ExampleProgram());
}
ExampleProgram()
{
// Simple little tool to add a bunch of points
AddPoints(
0, 4, 1, 3, 1, 4, 1, 5, 2, 4, 2, 5, 2, 6, 3, 4, 3, 5, 4, 5, 4, 6, 5, 5, 6, 5,
6, 4, 5, 4, 7, 4, 7, 3, 8, 3, 8, 4, 8, 5, 8, 6, 9, 6, 9, 5, 9, 4, 9, 3, 10, 2,
10, 3, 10, 4, 10, 5, 10, 6, 11, 5, 11, 4, 11, 3, 11, 2, 12, 4, 12, 5, 13, 5,
13, 6, 13, 8, 14, 8, 14, 7, 14, 6, 15, 7, 15, 8, 15, 9, 14, 9, 14, 10, 13, 10,
12, 10, 11, 10, 13, 11, 14, 11, 15, 11, 15, 12, 16, 12, 17, 12, 18, 12, 19,
12, 18, 11, 17, 11, 17, 10, 18, 10, 19, 10, 19, 9, 19, 8, 20, 8, 21, 8, 18,
7, 19, 7, 20, 7, 21, 7, 21, 6, 22, 6, 23, 6, 21, 5, 20, 5, 19, 5, 19, 4, 18,
4, 17, 4, 20, 3, 21, 3, 22, 3, 20, 2, 19, 2, 18, 2, 19, 1, 20, 1, 21, 1, 19,
0, 18, 0, 10, 0, 4, 1);
// Very basic form logic
ClientSize = new System.Drawing.Size(GridWidth * 20, GridHeight * 20);
DoubleBuffered = true;
Paint += ExampleProgram_Paint;
// Add a new point to the form (commented out)
// MouseUp += ExampleProgram_MouseUp_AddPoint;
// Draw the trail we find
MouseUp += ExampleProgram_MouseUp_AddTrail;
// Pick a starting point to start finding the trail from
// TODO: Left as an exercise for the reader to decide how to pick
// the starting point programmatically
m_trail.Add(new Point(0, 4));
}
IEnumerable<Point> Border(Point pt)
{
// Return all points that border a given point
if (pt.X > 0)
{
if (pt.Y > 0)
{
yield return new Point(pt.X - 1, pt.Y - 1);
}
yield return new Point(pt.X - 1, pt.Y);
if (pt.Y < GridHeight - 1)
{
yield return new Point(pt.X - 1, pt.Y + 1);
}
}
if (pt.Y > 0)
{
yield return new Point(pt.X, pt.Y - 1);
}
if (pt.Y < GridHeight - 1)
{
yield return new Point(pt.X, pt.Y + 1);
}
if (pt.X < GridWidth - 1)
{
if (pt.Y > 0)
{
yield return new Point(pt.X + 1, pt.Y - 1);
}
yield return new Point(pt.X + 1, pt.Y);
if (pt.Y < GridHeight - 1)
{
yield return new Point(pt.X + 1, pt.Y + 1);
}
}
}
void AddPoints(params int[] points)
{
// Helper to add a bunch of points to our list of points
for (int i = 0; i < points.Length; i += 2)
{
m_points.Add(new Point(points[i], points[i + 1]));
}
}
void ExampleProgram_MouseUp_AddTrail(object sender, MouseEventArgs e)
{
// Calculate the trail
while (true)
{
// Find the best point for the next point
int bestCount = 0;
Point best = new Point();
// At the current end point, test all the points around it
foreach (var pt in Border(m_trail[m_trail.Count - 1]))
{
// And for each point, see how many points this point borders
int count = 0;
if (m_points.Contains(pt) && !m_trail.Contains(pt))
{
foreach (var test in Border(pt))
{
if (m_points.Contains(test))
{
if (m_trail.Contains(test))
{
// This is a point both in the original cloud, and the current
// trail, so give it a negative weight
count--;
}
else
{
// We haven't visited this point, so give it a positive weight
count++;
}
}
}
}
if (count > bestCount)
{
// This point looks better than anything we've found, so
// it's the best one so far
bestCount = count;
best = pt;
}
}
if (bestCount <= 0)
{
// We either didn't find anything, or what we did find was bad, so
// break out of the loop, we're done
break;
}
m_trail.Add(best);
}
Invalidate();
}
void ExampleProgram_MouseUp_AddPoint(object sender, MouseEventArgs e)
{
// Just add the point, and dump it out
int x = (int)Math.Round((((double)e.X) - 10.0) / 20.0, 0);
int y = (int)Math.Round((((double)e.Y) - 10.0) / 20.0, 0);
m_points.Add(new Point(x, y));
Debug.WriteLine("m_points.Add(new Point(" + x + ", " + y + "));");
Invalidate();
}
void ExampleProgram_Paint(object sender, PaintEventArgs e)
{
// Simple drawing, just draw a grid, and the points
e.Graphics.Clear(Color.White);
for (int x = 0; x < GridWidth; x++)
{
e.Graphics.DrawLine(Pens.Black, x * 20 + 10, 0, x * 20 + 10, ClientSize.Height);
}
for (int y = 0; y < GridHeight; y++)
{
e.Graphics.DrawLine(Pens.Black, 0, y * 20 + 10, ClientSize.Width, y * 20 + 10);
}
foreach (var pt in m_points)
{
e.Graphics.FillEllipse(Brushes.Black, (pt.X * 20 + 10) - 5, (pt.Y * 20 + 10) - 5, 10, 10);
}
foreach (var pt in m_trail)
{
e.Graphics.FillEllipse(Brushes.Red, (pt.X * 20 + 10) - 6, (pt.Y * 20 + 10) - 6, 12, 12);
}
}
}
}

You might want to consider treating your coordinates as a binary image and applying some morphological techniques to the image.
Thinning might give you good results, but processing like this can be tricky to get working well in a wide range of cases.

Random number in list C#

Hello, I'm having trouble with part of my code. I have a dictionary where the key is a number and the value for each key is a list of random integers between 1 and 8. For some reason, after the first key and value are added, all the other keys' values are the same as the first. This is the current output, which is wrong:
1: 1, 8, 6, 1, 4, 7, 2 ,4
2: 1, 8, 6, 1, 4, 7, 2 ,4
3: 1, 8, 6, 1, 4, 7, 2 ,4
4: 1, 8, 6, 1, 4, 7, 2 ,4
5: 1, 8, 6, 1, 4, 7, 2 ,4
6: 1, 8, 6, 1, 4, 7, 2 ,4
7: 1, 8, 6, 1, 4, 7, 2 ,4
8: 1, 8, 6, 1, 4, 7, 2 ,4
I tried clearing the list once the 8 random numbers had been added, but then there were no values in the dictionary at all. Does anyone have any suggestions, please?

private List<int> randomList() {
for (int i = 1; i < 9; i++) {
if (randomNum.Count < 9) {
randomNum.Add(random.Next(1, 9));
}
}
return randomNum;
}
After the first call, randomNum already holds its numbers, so the randomNum.Count < 9 check soon stops anything new from being added, and every dictionary entry ends up referring to that same list. You can manually reset it like this:
for (int i = 1; i < 9; i++) {
dic.Add(i, randomList());
randomNum = new List<int>();
}

Try adding this line to the beginning of your randomList() method:
randomNum = new List<int>();

One point that wasn't mentioned: always think about where you declare your variables and why. Doing so will help you avoid such unintended consequences. In this case your randomNum list is only used inside your randomList method. If you declare it inside the method, it is recreated on each call and the problem disappears.
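
As a minimal sketch of that last suggestion, the list can be made a local variable so that every call returns a fresh instance. The surrounding class RandomLists and the Build method are hypothetical scaffolding added for this example; random and dic follow the names used in the question.

```csharp
using System;
using System.Collections.Generic;

public static class RandomLists
{
    // One shared generator, as the question's code appears to use.
    private static readonly Random random = new Random();

    // Builds a brand-new list on every call, so no state leaks between calls.
    public static List<int> RandomList()
    {
        var numbers = new List<int>();
        for (int i = 0; i < 8; i++)
        {
            numbers.Add(random.Next(1, 9));   // upper bound is exclusive: values are 1..8
        }
        return numbers;
    }

    // Fills keys 1..8, each with its own independent list.
    public static Dictionary<int, List<int>> Build()
    {
        var dic = new Dictionary<int, List<int>>();
        for (int key = 1; key < 9; key++)
        {
            dic.Add(key, RandomList());
        }
        return dic;
    }
}
```

Because each value is now a separate list object, changing or clearing one entry no longer affects the others.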

Log of a very large number

I'm dealing with the BigInteger class with numbers in the order of 2 raised to the power 10,000,000.
The BigInteger Log function is now the most expensive function in my algorithm and I am desperately looking for an alternative.
Since I only need the integral part of the log, I came across this answer which seems brilliant in terms of speed but for some reason I am not getting accurate values. I do not care about the decimal part but I do need to get an accurate integral part whether the value is floored or ceiled as long as I know which.
Here is the function I implemented:
public static double LogBase2 (System.Numerics.BigInteger number)
{
return (LogBase2(number.ToByteArray()));
}
public static double LogBase2 (byte [] bytes)
{
// Corrected based on [ronalchn's] answer.
return (System.Math.Log(bytes [bytes.Length - 1], 2) + ((bytes.Length - 1) * 8));
}
The values are now incredibly accurate except for corner cases. The values 7 to 7.99999, 15 to 15.9999, 23 to 23.9999 31 to 31.9999, etc. return -Infinity. The numbers seem to revolve around byte boundaries. Any idea what's going on here?
Example:
LogBase2( 1081210289) = 30.009999999993600 != 30.000000000000000
LogBase2( 1088730701) = 30.019999999613300 != 30.000000000000000
LogBase2( 2132649894) = 30.989999999389400 != 30.988684686772200
LogBase2( 2147483648) = 31.000000000000000 != -Infinity
LogBase2( 2162420578) = 31.009999999993600 != -Infinity
LogBase2( 4235837212) = 31.979999999984800 != -Infinity
LogBase2( 4265299789) = 31.989999999727700 != -Infinity
LogBase2( 4294967296) = 32.000000000000000 != 32.000000000000000
LogBase2( 4324841156) = 32.009999999993600 != 32.000000000000000
LogBase2( 545958373094) = 38.989999999997200 != 38.988684686772200
LogBase2( 549755813887) = 38.999999999997400 != 38.988684686772200
LogBase2( 553579667970) = 39.009999999998800 != -Infinity
LogBase2( 557430119061) = 39.019999999998900 != -Infinity
LogBase2( 561307352157) = 39.029999999998300 != -Infinity
LogBase2( 565211553542) = 39.039999999997900 != -Infinity
LogBase2( 569142910795) = 39.049999999997200 != -Infinity
LogBase2( 1084374326282) = 39.979999999998100 != -Infinity
LogBase2( 1091916746189) = 39.989999999998500 != -Infinity
LogBase2( 1099511627775) = 39.999999999998700 != -Infinity

Try this:
public static int LogBase2(byte[] bytes)
{
if (bytes[bytes.Length - 1] >= 128) return -1; // -ve bigint (invalid - cannot take log of -ve number)
int log = 0;
while ((bytes[bytes.Length - 1]>>log)>0) log++;
return log + bytes.Length*8-9;
}
The reason for the most significant byte being 0 is because the BigInteger is a signed integer. When the most significant bit of the high-order byte is 1, an extra byte is tacked on to represent the sign bit of 0 for positive integers.
Also changed from using the System.Math.Log function because if you only want the rounded value, it is much faster to use bit operations.
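
The sign-byte behaviour described above can also be handled explicitly. The following sketch takes the BigInteger directly and skips a trailing 0x00 byte before counting bits. FloorLog2 and BigIntegerLog are names invented for this example, and returning -1 for zero or negative input is an assumed convention.

```csharp
using System.Numerics;

public static class BigIntegerLog
{
    // Floor of log2 for a positive value, computed from the little-endian
    // byte array. Returns -1 for zero or negative input (assumed convention).
    public static int FloorLog2(BigInteger number)
    {
        if (number.Sign <= 0) return -1;

        byte[] bytes = number.ToByteArray();
        int top = bytes.Length - 1;

        // A positive value whose highest data bit is set gets an extra 0x00
        // sign byte appended; skip it so we never take the log of zero.
        if (bytes[top] == 0) top--;

        // Count the significant bits in the highest non-zero byte.
        int bits = 0;
        for (int b = bytes[top]; b > 0; b >>= 1) bits++;

        return top * 8 + bits - 1;
    }
}
```

For 2147483648 (2^31) ToByteArray returns five bytes, the last of which is 0x00, and FloorLog2 returns 31. Passing that final zero byte to System.Math.Log is what produced -Infinity in the question.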
If you have Microsoft Solver Foundation (download at http://msdn.microsoft.com/en-us/devlabs/hh145003.aspx), then you can use the BitCount() function:
public static double LogBase2(Microsoft.SolverFoundation.Common.BigInteger number)
{
return number.BitCount;
}
Or you can use the java library. Add a reference to the vjslib library (found in the .NET tab - this is the J# implementation of the java library).
You can now add "using java.math" in your code.
java.math.BigInteger has a bitLength() function

BigInteger bi = new BigInteger(128);
int log = bi.Log2();
public static class BigIntegerExtensions
{
static int[] PreCalc = new int[] { 8, 7, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
public static int Log2(this BigInteger bi)
{
byte[] buf = bi.ToByteArray();
int len = buf.Length;
return len * 8 - PreCalc[buf[len - 1]] - 1;
}
}

Years late but maybe this will help someone else...
.NET 5 added BigInteger.GetBitLength(), which is basically log2 (but one too high for positive values). Since it is built into .NET, I think this is about as fast as we can get.
// Create some example number
BigInteger myNum= (BigInteger)32;
// Get the Log2
BigInteger myLog2 = myNum.GetBitLength() - 1;
https://dotnetfiddle.net/7ggy4D
