I'm trying to calculate the product of digits of each number of a sequence of numbers, for example:
21, 22, 23 ... 98, 99 ..
would be:
2, 4, 6 ... 72, 81 ..
To reduce the complexity, I would consider only consecutive numbers of a limited digit length, such as from 001 to 999 or from 0001 to 9999.
However, when the sequence is large, for example 1000000000 numbers, repeatedly extracting the digits and then multiplying them for every number would be inefficient.
The basic idea is to skip the runs of consecutive zeros we encounter during the calculation, something like:
using System.Collections.Generic;
using System.Linq;
using System;
// note the digit product is not given with the iteration
// we would need to provide a delegate for the calculation
public static partial class NumericExtensions {
public static void NumberIteration(
this int value, Action<int, int[]> delg, int radix=10) {
var digits=DigitIterator(value, radix).ToArray();
var last=digits.Length-1;
var emptyArray=new int[] { };
var pow=(Func<int, int, int>)((x, y) => (int)Math.Pow(x, 1+y));
var weights=Enumerable.Repeat(radix, last-1).Select(pow).ToArray();
for(int complement=radix-1, i=value, j=i; i>0; i-=1)
if(i>j)
delg(i, emptyArray);
else if(0==digits[0]) {
delg(i, emptyArray);
var k=0;
for(; k<last&&0==digits[k]; k+=1)
;
var y=(digits[k]-=1);
if(last==k||0!=y) {
if(0==y) { // implied last==k
digits=new int[last];
last-=1;
}
for(; k-->0; digits[k]=complement)
;
}
else {
j=i-weights[k-1];
}
}
else {
// receives digits of a number which doesn't contain zeros
delg(i, digits);
digits[0]-=1;
}
delg(0, emptyArray);
}
static IEnumerable<int> DigitIterator(int value, int radix) {
if(-2<radix&&radix<2)
radix=radix<0?-2:2;
for(int remainder; 0!=value; ) {
value=Math.DivRem(value, radix, out remainder);
yield return remainder;
}
}
}
This only enumerates the numbers, so that numbers which contain zeros are not calculated in the first place; the digit products themselves are not yet produced by this code. Generating the digit products by providing a delegate to perform the calculation will still take time.
How to calculate the digit products of the consecutive numbers efficiently?
EDIT: The "start from anywhere, extended range" version...
This version has a significantly extended range, and therefore returns an IEnumerable<long> instead of an IEnumerable<int> - multiply enough digits together and you exceed int.MaxValue. It also goes up to 10,000,000,000,000,000 - not quite the full range of long, but pretty big :) You can start anywhere you like, and it will carry on from there to its end.
class DigitProducts
{
private static readonly int[] Prefilled = CreateFirst10000();
private static int[] CreateFirst10000()
{
// Inefficient but simple, and only executed once.
int[] values = new int[10000];
for (int i = 0; i < 10000; i++)
{
int product = 1;
foreach (var digit in i.ToString())
{
product *= digit -'0';
}
values[i] = product;
}
return values;
}
public static IEnumerable<long> GetProducts(long startingPoint)
{
if (startingPoint >= 10000000000000000L || startingPoint < 0)
{
throw new ArgumentOutOfRangeException();
}
int a = (int) (startingPoint / 1000000000000L);
int b = (int) ((startingPoint % 1000000000000L) / 100000000);
int c = (int) ((startingPoint % 100000000) / 10000);
int d = (int) (startingPoint % 10000);
for (; a < 10000; a++)
{
long aMultiplier = a == 0 ? 1 : Prefilled[a];
for (; b < 10000; b++)
{
long bMultiplier = a == 0 && b == 0 ? 1
: a != 0 && b < 1000 ? 0
: Prefilled[b];
for (; c < 10000; c++)
{
long cMultiplier = a == 0 && b == 0 && c == 0 ? 1
: (a != 0 || b != 0) && c < 1000 ? 0
: Prefilled[c];
long abcMultiplier = aMultiplier * bMultiplier * cMultiplier;
for (; d < 10000; d++)
{
long dMultiplier =
(a != 0 || b != 0 || c != 0) && d < 1000 ? 0
: Prefilled[d];
yield return abcMultiplier * dMultiplier;
}
d = 0;
}
c = 0;
}
b = 0;
}
}
}
EDIT: Performance analysis
I haven't looked at the performance in detail, but I believe at this point the bulk of the work is just simply iterating over a billion values. A simple for loop which just returns the value itself takes over 5 seconds on my laptop, and iterating over the digit products only takes a bit over 6 seconds, so I don't think there's much more room for optimization - if you want to go from the start. If you want to (efficiently) start from a different position, more tweaks are required.
Okay, here's an attempt which uses an iterator block to yield the results, and precomputes the first thousand results to make things a bit quicker.
I've tested it up to about 150 million, and it's correct so far. It only returns the first billion results - if you needed more than that, you could add another block at the end...
static IEnumerable<int> GetProductDigitsFast()
{
// First generate the first 1000 values to cache them.
int[] productPerThousand = new int[1000];
// Up to 9
for (int x = 0; x < 10; x++)
{
productPerThousand[x] = x;
yield return x;
}
// Up to 99
for (int y = 1; y < 10; y++)
{
for (int x = 0; x < 10; x++)
{
productPerThousand[y * 10 + x] = x * y;
yield return x * y;
}
}
// Up to 999
for (int x = 1; x < 10; x++)
{
for (int y = 0; y < 10; y++)
{
for (int z = 0; z < 10; z++)
{
int result = x * y * z;
productPerThousand[x * 100 + y * 10 + z] = x * y * z;
yield return result;
}
}
}
// Now use the cached values for the rest
for (int x = 0; x < 1000; x++)
{
int xMultiplier = x == 0 ? 1 : productPerThousand[x];
for (int y = 0; y < 1000; y++)
{
// We've already yielded the first thousand
if (x == 0 && y == 0)
{
continue;
}
// If x is non-zero and y is less than 100, we've
// definitely got a 0, so the result is 0. Otherwise,
// we just use the productPerThousand.
int yMultiplier = x == 0 || y >= 100 ? productPerThousand[y]
: 0;
int xy = xMultiplier * yMultiplier;
for (int z = 0; z < 1000; z++)
{
if (z < 100)
{
yield return 0;
}
else
{
yield return xy * productPerThousand[z];
}
}
}
}
}
I've tested this by comparing it with the results of an incredibly naive version:
static IEnumerable<int> GetProductDigitsSlow()
{
for (int i = 0; i < 1000000000; i++)
{
int product = 1;
foreach (var digit in i.ToString())
{
product *= digit -'0';
}
yield return product;
}
}
Hope this idea is of some use... I don't know how it compares to the others shown here in terms of performance.
EDIT: Expanding this slightly, to use simple loops where we know the results will be 0, we end up with fewer conditions to worry about, but for some reason it's actually slightly slower. (This really surprised me.) This code is longer, but possibly a little easier to follow.
static IEnumerable<int> GetProductDigitsFast()
{
// First generate the first 1000 values to cache them.
int[] productPerThousand = new int[1000];
// Up to 9
for (int x = 0; x < 10; x++)
{
productPerThousand[x] = x;
yield return x;
}
// Up to 99
for (int y = 1; y < 10; y++)
{
for (int x = 0; x < 10; x++)
{
productPerThousand[y * 10 + x] = x * y;
yield return x * y;
}
}
// Up to 999
for (int x = 1; x < 10; x++)
{
for (int y = 0; y < 10; y++)
{
for (int z = 0; z < 10; z++)
{
int result = x * y * z;
productPerThousand[x * 100 + y * 10 + z] = x * y * z;
yield return result;
}
}
}
// Use the cached values up to 999,999
for (int x = 1; x < 1000; x++)
{
int xMultiplier = productPerThousand[x];
for (int y = 0; y < 100; y++)
{
yield return 0;
}
for (int y = 100; y < 1000; y++)
{
yield return xMultiplier * productPerThousand[y];
}
}
// Now use the cached values for the rest
for (int x = 1; x < 1000; x++)
{
int xMultiplier = productPerThousand[x];
// Within each million, the first 100,000 values will all have
// a 0 right after the leading group, so we can just yield 0.
for (int y = 0; y < 100 * 1000; y++)
{
yield return 0;
}
for (int y = 100; y < 1000; y++)
{
int yMultiplier = productPerThousand[y];
int xy = xMultiplier * yMultiplier;
// Within each thousand, the first 100 values will all have
// an antepenultimate digit of 0, so we can just yield 0.
for (int z = 0; z < 100; z++)
{
yield return 0;
}
for (int z = 100; z < 1000; z++)
{
yield return xy * productPerThousand[z];
}
}
}
}
You can do this in a dp-like fashion with the following recursive formula:
a[n] = n                      if n <= 9
a[n] = a[n/10] * (n % 10)     if n >= 10
where a[n] is the product of the digits of n, and n/10 is integer division.
This leads to a simple O(n) algorithm: when calculating a[n], assuming you have already calculated a[·] for all smaller n, you can just take the result for all digits but the last and multiply it by the last digit.
a = list(range(10))
for i in range(10, 100):
    a.append(a[i // 10] * (i % 10))
You can get rid of the expensive multiplication by just computing a[n - 1] + a[n / 10] for numbers where the last digit isn't 0.
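As a rough C# sketch of the recurrence with the addition trick (the class and method names here are made up for illustration, and max is assumed to be non-negative and small enough for the table to fit in memory):

```csharp
using System;

public static class DigitProductTable
{
    // Fills a[0..max] with the digit product of each index, using
    // a[n] = a[n / 10] * (n % 10), replaced by an addition whenever
    // the last digit of n is non-zero.
    public static int[] Build(int max)
    {
        var a = new int[max + 1];
        for (int n = 1; n <= Math.Min(9, max); n++)
            a[n] = n;
        for (int n = 10; n <= max; n++)
        {
            // A last digit of 0 gives product 0; otherwise a[n] exceeds
            // a[n - 1] by exactly a[n / 10], because only the last digit
            // changed and it grew by one.
            a[n] = n % 10 == 0 ? 0 : a[n - 1] + a[n / 10];
        }
        return a;
    }
}
```

For instance, Build(345)[345] comes out as 60 (3 * 4 * 5), reached as a[344] + a[34] = 48 + 12.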
The key to efficiency is not to enumerate the numbers and extract the digits, but to enumerate digits and generate the numbers.
int[] GenerateDigitProducts( int max )
{
int sweep = 1;
var results = new int[max+1];
for( int i = 1; i <= 9; ++i ) results[i] = i;
// loop invariant: all values up to sweep * 10 are filled in
while (true) {
int prior = results[sweep];
if (prior > 0) {
for( int j = 1; j <= 9; ++j ) {
int k = sweep * 10 + j; // <-- the key, generating number from digits is much faster than decomposing number into digits
if (k > max) return results;
results[k] = prior * j;
// loop invariant: all values up to k are filled in
}
}
++sweep;
}
}
It's up to the caller to ignore the results which are less than min.
Demo: http://ideone.com/rMK7Sh
Here's a low space version using the branch-bound-prune technique:
static void VisitDigitProductsImpl(int min, int max, System.Action<int, int> visitor, int build_n, int build_ndp)
{
if (build_n >= min && build_n <= max) visitor(build_n, build_ndp);
// bound
int build_n_min = build_n;
int build_n_max = build_n;
do {
build_n_min *= 10;
build_n_max *= 10;
build_n_max += 9;
// prune
if (build_n_min > max) return;
} while (build_n_max < min);
int next_n = build_n * 10;
int next_ndp = 0;
// branch
// if you need to visit zeros as well: VisitDigitProductsImpl(min, max, visitor, next_n, next_ndp);
for( int i = 1; i <= 9; ++i ) {
next_n++;
next_ndp += build_ndp;
VisitDigitProductsImpl(min, max, visitor, next_n, next_ndp);
}
}
static void VisitDigitProducts(int min, int max, System.Action<int, int> visitor)
{
for( int i = 1; i <= 9; ++i )
VisitDigitProductsImpl(min, max, visitor, i, i);
}
Demo: http://ideone.com/AIal1L
Calculating a product from the previous one
Because the numbers are consecutive, in most cases you can generate one product from the previous one by inspecting only the units place.
For example:
12345 = 1 * 2 * 3 * 4 * 5 = 120
12346 = 1 * 2 * 3 * 4 * 6 = 144
But once you've calculated the value for 12345, you can calculate 12346 as (120 / 5) * 6.
Clearly this won't work if the previous product was zero. It does work when wrapping over from 9 to 10 because the new last digit is zero, but you could optimise that case anyway (see below).
If you're dealing with lots of digits, this approach adds up to quite a saving even though it involves a division.
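A minimal sketch of this idea, assuming non-negative inputs and a modest range (IncrementalDigitProduct, Direct and Range are illustrative names, not from any library). It falls back to a direct computation whenever the previous product was zero:

```csharp
using System;
using System.Collections.Generic;

public static class IncrementalDigitProduct
{
    // Direct computation by repeated division, used as the fallback.
    public static long Direct(long n)
    {
        long product = 1;
        do
        {
            product *= n % 10;
            n /= 10;
        } while (n > 0);
        return product;
    }

    // Yields the digit products of start, start + 1, ..., end.
    public static IEnumerable<long> Range(long start, long end)
    {
        long product = 0;
        for (long n = start; n <= end; n++)
        {
            long last = n % 10;
            if (n > start && product != 0)
            {
                // The previous number had no zero digit, so its units
                // digit was last - 1 (or 9 if we just wrapped to 0).
                product = last == 0 ? 0 : product / (last - 1) * last;
            }
            else
            {
                product = Direct(n);
            }
            yield return product;
        }
    }
}
```

Range(12345, 12346) yields 120 and then 144, the second value being computed as (120 / 5) * 6. Inside long runs of zero products this sketch still recomputes every value directly, which is where handling zeros separately pays off.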
Dealing with zeros
As you're looping through values to generate the products, as soon as you encounter a zero you know that the product will be zero.
For example, with four-digit numbers, once you get to 1000 you know that the products of 1000 through 1110 will all be zero (1111 is the next number without a zero digit), so there's no need to calculate these.
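One possible way to find where a run of zero products ends is to replace the most significant zero digit, and every digit below it, with 1. The helper below is only a sketch under that assumption (NextZeroFree is a made-up name; n is assumed to be at least 1 and well below long.MaxValue):

```csharp
public static class ZeroSkipping
{
    // Returns the smallest number >= n that contains no zero digit.
    public static long NextZeroFree(long n)
    {
        long result = n;
        long ones = 0;   // 1, 11, 111, ... one digit longer each round
        long place = 1;  // 10^position of the digit being inspected
        for (long rest = n; rest > 0; rest /= 10, place *= 10)
        {
            ones = ones * 10 + 1;
            if (rest % 10 == 0)
            {
                // Keep the digits above this position and fill this
                // position and everything below it with ones. A zero
                // found later (more significant) overrides this result.
                result = n - n % (place * 10) + ones;
            }
        }
        return result;
    }
}
```

NextZeroFree(1000) gives 1111 and NextZeroFree(1203) gives 1211; every number in between has a digit product of zero and can be emitted without any multiplication.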
The ultimate efficiency
Of course, if you're willing or able to generate and cache all the values up front then you can retrieve them in O(1). Further, as it's a one-off cost, the efficiency of the algorithm you use to generate them may be less important in this case.
I ended up with very simple code, as follows:
Code:
public delegate void R(
R delg, int pow, int rdx=10, int prod=1, int msd=0);
R digitProd=
default(R)!=(digitProd=default(R))?default(R):
(delg, pow, rdx, prod, msd) => {
var x=pow>0?rdx:1;
for(var call=(pow>1?digitProd:delg); x-->0; )
if(msd>0)
call(delg, pow-1, rdx, prod*x, msd);
else
call(delg, pow-1, rdx, x, x);
};
msd is the most significant digit; it is analogous to the most significant bit in binary.
The reason I didn't use the iterator pattern is that it takes more time than a method call. The complete code (with a test) is at the end of this answer.
Note that the line default(R)!=(digitProd=default(R))?default(R): ... exists only so that digitProd is assigned, since the delegate cannot be used before it is assigned. We can actually write it as:
Alternative syntax:
var digitProd=default(R);
digitProd=
(delg, pow, rdx, prod, msd) => {
var x=pow>0?rdx:1;
for(var call=(pow>1?digitProd:delg); x-->0; )
if(msd>0)
call(delg, pow-1, rdx, prod*x, msd);
else
call(delg, pow-1, rdx, x, x);
};
The disadvantage of this implementation is that it cannot start from a particular number, only from the maximum number with the full count of digits.
These are the simple ideas behind my solution:
Recursion
The delegate (Action) R is a recursive delegate definition which is used recursively, both for the algorithm and for the delegate which receives the resulting digit product.
The other ideas below explain why recursion is used.
No division
For consecutive numbers, using division to extract each digit is considered inefficient, so I chose to operate on the digits directly with recursion, counting down.
For example, the 3-digit number 123 is one of the 3-digit numbers reached when counting down from 999:
9 8 7 6 5 4 3 2 [1] 0 -- the first level of recursion
9 8 7 6 5 4 3 [2] 1 0 -- the second level of recursion
9 8 7 6 5 4 [3] 2 1 0 -- the third level of recursion
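The same count-down recursion can be written without the self-referencing delegate. The following is a simplified sketch and not the code of this answer: the names Visit and Descend are made up, and a started flag plays the role of msd so that leading zeros do not zero the product.

```csharp
using System;

public static class DigitRecursion
{
    // Passes the digit product of every number below radix^digits to
    // visitor, from the largest number down to 0. No number is ever
    // decomposed into digits by division.
    public static void Visit(int digits, Action<long> visitor, int radix = 10)
    {
        Descend(digits, 1, false, visitor, radix);
    }

    private static void Descend(int remaining, long product, bool started,
                                Action<long> visitor, int radix)
    {
        if (remaining == 0)
        {
            // A number made only of leading zeros is 0 itself.
            visitor(started ? product : 0);
            return;
        }
        for (int d = radix - 1; d >= 0; d--)
        {
            if (started || d > 0)
                Descend(remaining - 1, product * d, true, visitor, radix);
            else
                Descend(remaining - 1, product, false, visitor, radix); // leading zero
        }
    }
}
```

Calling DigitRecursion.Visit(2, p => Console.WriteLine(p)) prints the products for 99 down to 00, starting with 81. The partial product of the upper digits is computed once per prefix and shared by all ten numbers below it.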
Don't cache
This answer
How to multiply each digit in a number efficiently
suggests using a caching mechanism, but for consecutive numbers we don't need one, since the algorithm itself acts as the cache.
For the numbers 123, 132, 213, 231, 312, 321, the digit products are identical. For a cache, we could therefore reduce the number of stored items by treating numbers that consist of the same digits in a different order (permutations) as the same key.
However, sorting the digits also takes time. With a HashSet-based collection of keys, we pay more storage for more items; even after reducing the items, we still spend time on equality comparisons. There does not seem to be a better hash function than the value itself, and that value is just the result we are calculating. For example, excluding 0 and 1, there are only 36 combinations in the multiplication table of two digits.
Thus, as long as the calculation is efficient enough, we can regard the algorithm itself as a virtual cache that costs no storage.
Reducing the time spent on numbers that contain zero(s)
For the digit products of consecutive numbers, we will encounter:
1 zero per 10
10 consecutive zeros per 100
100 consecutive zeros per 1000
and so on. Note that besides each run of 10 consecutive zeros per 100, we still encounter 9 single zeros in that hundred, one per 10. The count of zeros can be calculated with the following code:
static int CountOfZeros(int n, int r=10) {
var powerSeries=n>0?1:0;
for(var i=0; n-->0; ++i) {
var geometricSeries=(1-Pow(r, 1+n))/(1-r);
powerSeries+=geometricSeries*Pow(r-1, 1+i);
}
return powerSeries;
}
Here n is one less than the count of digits (the test code calls CountOfZeros(power-1)) and r is the radix. The count is a power series built from geometric series, plus 1 for the number 0 itself.
For example, for numbers of up to 4 digits (n=3), the zeros we will encounter are:
(1)+(((1*9)+11)*9+111)*9 = (1)+(1*9*9*9)+(11*9*9)+(111*9) = 2620
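The figure 2620 can be cross-checked another way: out of the 10^4 numbers from 0 to 9999, subtract those that have no zero digit, which is 9 + 9^2 + 9^3 + 9^4 = 7380. The sketch below (ZeroCount, ClosedForm and BruteForce are illustrative names; here digits is the real count of digits, not n) compares that closed form against a plain loop:

```csharp
public static class ZeroCount
{
    // Numbers in [0, radix^digits) that contain at least one zero digit,
    // written without leading zeros; 0 itself counts as one of them.
    public static long ClosedForm(int digits, int radix = 10)
    {
        long total = 1;     // becomes radix^digits
        long power = 1;     // becomes (radix - 1)^k
        long zeroFree = 0;  // numbers of 1..digits digits without a zero
        for (int k = 1; k <= digits; k++)
        {
            total *= radix;
            power *= radix - 1;
            zeroFree += power;
        }
        return total - zeroFree;
    }

    // The same count obtained by inspecting every number.
    public static long BruteForce(int digits, int radix = 10)
    {
        long limit = 1;
        for (int k = 0; k < digits; k++) limit *= radix;
        long count = 0;
        for (long n = 0; n < limit; n++)
        {
            bool hasZero = n == 0;
            for (long rest = n; rest > 0 && !hasZero; rest /= radix)
                hasZero = rest % radix == 0;
            if (hasZero) count++;
        }
        return count;
    }
}
```

Both ClosedForm(4) and BruteForce(4) return 2620, in agreement with the worked example.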
In this implementation, we do not really skip the calculation of numbers that contain zeros. The reason is that the result from a shallower level of recursion is reused by the deeper levels, which we can regard as cached. A multiplication by a single zero could be detected and avoided before it is performed, passing a zero directly to the next level of recursion. However, simply multiplying does not have much of a performance impact.
The complete code:
public static partial class TestClass {
public delegate void R(
R delg, int pow, int rdx=10, int prod=1, int msd=0);
public static void TestMethod() {
var power=9;
var radix=10;
var total=Pow(radix, power);
var value=total;
var count=0;
R doNothing=
(delg, pow, rdx, prod, msd) => {
};
R countOnly=
(delg, pow, rdx, prod, msd) => {
if(prod>0)
count+=1;
};
R printProd=
(delg, pow, rdx, prod, msd) => {
value-=1;
countOnly(delg, pow, rdx, prod, msd);
Console.WriteLine("{0} = {1}", value.ToExpression(), prod);
};
R digitProd=
default(R)!=(digitProd=default(R))?default(R):
(delg, pow, rdx, prod, msd) => {
var x=pow>0?rdx:1;
for(var call=(pow>1?digitProd:delg); x-->0; )
if(msd>0)
call(delg, pow-1, rdx, prod*x, msd);
else
call(delg, pow-1, rdx, x, x);
};
Console.WriteLine("--- start --- ");
var watch=Stopwatch.StartNew();
digitProd(printProd, power);
watch.Stop();
Console.WriteLine(" total numbers: {0}", total);
Console.WriteLine(" zeros: {0}", CountOfZeros(power-1));
if(count>0)
Console.WriteLine(" non-zeros: {0}", count);
var seconds=(decimal)watch.ElapsedMilliseconds/1000;
Console.WriteLine("elapsed seconds: {0}", seconds);
Console.WriteLine("--- end --- ");
}
static int Pow(int x, int y) {
return (int)Math.Pow(x, y);
}
static int CountOfZeros(int n, int r=10) {
var powerSeries=n>0?1:0;
for(var i=0; n-->0; ++i) {
var geometricSeries=(1-Pow(r, 1+n))/(1-r);
powerSeries+=geometricSeries*Pow(r-1, 1+i);
}
return powerSeries;
}
static String ToExpression(this int value) {
return (""+value).Select(x => ""+x).Aggregate((x, y) => x+"*"+y);
}
}
In the code, doNothing, countOnly and printProd determine what to do when we get a digit product; we can pass any of them to digitProd, which implements the full algorithm. For example, digitProd(countOnly, power) would only increase count for the non-zero products, so total minus the final count should equal what CountOfZeros returns.
I'd create an array that represent the decimal digits of a number and then increase that number just as you would in real life (i.e. on an overflow increase the more significant digit).
From there I'd use an array of products that can be used as a tiny lookup table.
E.g.
the number 314 would result in the product array: 3, 3, 12
the number 345 would result in the product array: 3, 12, 60
Now if you increase the decimal number, you only need to recalculate the rightmost product by multiplying the new digit with the product to its left. When the second digit is modified, you only recalculate two products (the second from the right and the rightmost). This way you never calculate more than absolutely necessary, and you have a very tiny lookup table.
So if you start with the number 321 and increment then:
digits = 3, 2, 1 products = 3, 6, 6
Incrementing then changes the rightmost digit, and therefore only the rightmost product is recalculated:
digits = 3, 2, 2 products = 3, 6, 12
This goes up until the second digit is incremented:
digits = 3, 3, 0 products = 3, 9, 0 (two products recalculated)
Here is an example to show the idea (not very good code, but just as an example):
using System;
using System.Diagnostics;
namespace Numbers2
{
class Program
{
/// <summary>
/// Maximum of supported digits.
/// </summary>
const int MAXLENGTH = 20;
/// <summary>
/// Contains the number in a decimal format. Index 0 is the rightmost (least significant) digit.
/// </summary>
private static byte[] digits = new byte[MAXLENGTH];
/// <summary>
/// Contains the running products of the digits. Index 0 belongs to the rightmost digit. The leftmost product is equal to the digit at that position.
/// All products to the right (i.e. with lower index) are the product of the digit at that position multiplied by the product to the left.
/// E.g.
/// 234 will result in the product 2 (=first digit), 6 (=second digit * 2), 24 (=third digit * 6)
/// </summary>
private static long[] products = new long[MAXLENGTH];
/// <summary>
/// The length of the decimal number. Used for optimisation.
/// </summary>
private static int currentLength = 1;
/// <summary>
/// The start value for the calculations. This number will be used to start generated products.
/// </summary>
const long INITIALVALUE = 637926372435;
/// <summary>
/// The number of values to calculate.
/// </summary>
const int NROFVALUES = 10000;
static void Main(string[] args)
{
Console.WriteLine("Started at " + DateTime.Now.ToString("HH:mm:ss.fff"));
// set value and calculate all products
SetValue(INITIALVALUE);
UpdateProducts(currentLength - 1);
for (long i = INITIALVALUE + 1; i <= INITIALVALUE + NROFVALUES; i++)
{
int changedByte = Increase();
Debug.Assert(changedByte >= 0);
// update the current length (only increase because we're incrementing)
if (changedByte >= currentLength) currentLength = changedByte + 1;
// recalculate products that need to be updated
UpdateProducts(changedByte);
//Console.WriteLine(i.ToString() + " = " + products[0].ToString());
}
Console.WriteLine("Done at " + DateTime.Now.ToString("HH:mm:ss.fff"));
Console.ReadLine();
}
/// <summary>
/// Sets the value in the digits array (pretty blunt way but just for testing)
/// </summary>
/// <param name="value"></param>
private static void SetValue(long value)
{
var chars = value.ToString().ToCharArray();
for (int i = 0; i < MAXLENGTH; i++)
{
int charIndex = (chars.Length - 1) - i;
if (charIndex >= 0)
{
digits[i] = Byte.Parse(chars[charIndex].ToString());
currentLength = i + 1;
}
else
{
digits[i] = 0;
}
}
}
/// <summary>
/// Recalculate the products and store in products array
/// </summary>
/// <param name="changedByte">The index of the digit that was changed. All products up to this index will be recalculated. </param>
private static void UpdateProducts(int changedByte)
{
// calculate other products by multiplying the digit with the left product
bool previousProductWasZero = false;
for (int i = changedByte; i >= 0; i--)
{
if (previousProductWasZero)
{
products[i] = 0;
}
else
{
if (i < currentLength - 1)
{
products[i] = (int)digits[i] * products[i + 1];
}
else
{
products[i] = (int)digits[i];
}
if (products[i] == 0)
{
// apply 'zero optimisation'
previousProductWasZero = true;
}
}
}
}
/// <summary>
/// Increases the number and returns the index of the most significant byte that changed.
/// </summary>
/// <returns></returns>
private static int Increase()
{
digits[0]++;
for (int i = 0; i < MAXLENGTH - 1; i++)
{
if (digits[i] == 10)
{
digits[i] = 0;
digits[i + 1]++;
}
else
{
return i;
}
}
if (digits[MAXLENGTH - 1] == 10)
{
digits[MAXLENGTH - 1] = 0;
}
return MAXLENGTH - 1;
}
}
}
This way calculating the product for 1000 numbers in the billion range is nearly as fast as doing that for the numbers 1 to 1000.
By the way, I'm very curious what you're trying to use all this for?
Depending on the length of your numbers and the length of the sequence, I would go for some optimization.
As you can limit the maximum size of the number, you could iterate over the number itself via an increasing modulus.
Let's say you have the number 42:
var Input = 42;
var Product = 1;
var Result = 0;
// Iteration - step 1:
Result = Input % 10; // = 2
Input -= Result;
Product *= Result;
// Iteration - step 2:
Result = Input % 100 / 10; // = 4
Input -= Result * 10; // remove the tens digit, which is worth Result * 10
Product *= Result;
You can pack this operation into a nice loop which is probably small enough to fit in the processors caches and iterate over the whole number. As you avoid any function calls this is probably also quite fast.
If you want to treat a zero digit as an abort criterion, that is obviously quite easy to implement.
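Packed into a loop, the steps above might look like the sketch below. Note one swapped-in detail: it divides the remaining value by 10 on each pass instead of growing the modulus, which reads the same digits in the same order. DigitLoop and Product are made-up names, and the input is assumed to be non-negative.

```csharp
public static class DigitLoop
{
    // Multiplies the digits of input from right to left and stops at
    // the first zero digit, since the product can only be 0 after that.
    public static long Product(long input)
    {
        if (input == 0) return 0;
        long product = 1;
        while (input > 0)
        {
            long digit = input % 10;
            if (digit == 0) return 0; // abort criterion
            product *= digit;
            input /= 10;
        }
        return product;
    }
}
```

DigitLoop.Product(42) returns 8, and DigitLoop.Product(405) returns 0 after inspecting only two digits.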
As Matthew said already: Ultimate performance and efficiency will be gained with a lookup table.
The smaller the range of your sequence numbers is, the faster the lookup table is; because it will be retrieved from the cache and not from slow memory.
I currently have this method, which works fine:
private static List<long> GetPrimeNumbers(long number)
{
var result = new List<long>();
for (var i = 2; i <= number; i++) // start at 2: 0 and 1 are not prime
{
var isPrime = true;
for (var j = 2; j < i; j++)
{
if (i % j == 0)
{
isPrime = false;
break;
}
}
if (isPrime)
{
result.Add(i);
}
}
return result;
}
Is the above the best algorithm possible?
It's really slow when the number is above 100000.
I mean, what'd be the best, most performant algorithm to find the prime numbers less than or equal to a given number?
Sieve of Eratosthenes. This algorithm can generate all prime numbers up to n. Time complexity - O(nlog(n)), memory complexity - O(n)
BPSW primality test. This algorithm can check if n is pseudoprime. It was tested on first 10^15 numbers. Time complexity - O(log(n)).
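For reference, a compact Sieve of Eratosthenes could look like the sketch below. The names are made up for illustration, and n is assumed to be below int.MaxValue so that the flag array of n + 1 entries can be allocated.

```csharp
using System.Collections.Generic;

public static class EratosthenesSieve
{
    // Returns all primes <= n in ascending order.
    public static List<int> PrimesUpTo(int n)
    {
        var primes = new List<int>();
        if (n < 2) return primes;
        var composite = new bool[n + 1];
        for (int i = 2; i <= n; i++)
        {
            if (composite[i]) continue;
            primes.Add(i);
            // Smaller multiples of i were already crossed out by smaller
            // primes, so start at i * i. The long index avoids overflow.
            for (long j = (long)i * i; j <= n; j += i)
                composite[j] = true;
        }
        return primes;
    }
}
```

EratosthenesSieve.PrimesUpTo(30) returns 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.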
UPDATE:
I did some research and wrote a simple implementation of generating prime numbers in C#. The main idea: to check a number N for primality, we only need to check whether it is divisible by any prime number less than or equal to sqrt(N).
First implementation:
public static List<int> GeneratePrimes(int n)
{
var primes = new List<int>();
for(var i = 2; i <= n; i++)
{
var ok = true;
foreach(var prime in primes)
{
if (prime * prime > i)
break;
if (i % prime == 0)
{
ok = false;
break;
}
}
if(ok)
primes.Add(i);
}
return primes;
}
Test results:
10^6 - 0.297s
10^7 - 6.202s
10^8 - 141.860s
Second implementation using parallel computing:
1. Generate all primes up to sqrt(N)
2. Generate all primes from sqrt(N) + 1 to N using primes up to sqrt(N) using parallel computing.
public static List<int> GeneratePrimesParallel(int n)
{
var sqrt = (int) Math.Sqrt(n);
var lowestPrimes = GeneratePrimes(sqrt);
var highestPrimes = (Enumerable.Range(sqrt + 1, n - sqrt)
.AsParallel()
.Where(i => lowestPrimes.All(prime => i % prime != 0)));
return lowestPrimes.Concat(highestPrimes).ToList();
}
Test results:
10^6 - 0.276s
10^7 - 4.082s
10^8 - 78.624s
Probably the Sieve of Atkin is the most performant, although for all I know somebody has found a better one since.
Eratosthenes and Sundaram also have sieves of their own, which are considerably simpler to implement. Any of them kicks the stuffing out of separately looking for a factor in each number up to the limit.
All sieves use more working memory than factorizing one value at a time, but generally still less memory than the resulting list of primes.
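To give an idea of how simple the Sieve of Sundaram is, here is a minimal sketch (the class and method names are illustrative). It marks every m of the form i + j + 2ij; each unmarked m then corresponds to the odd prime 2m + 1.

```csharp
using System.Collections.Generic;

public static class SundaramSieve
{
    // Returns all primes <= n in ascending order.
    public static List<int> PrimesUpTo(int n)
    {
        var primes = new List<int>();
        if (n < 2) return primes;
        primes.Add(2);
        int k = (n - 1) / 2; // odd candidates are 2m + 1 for m = 1..k
        var marked = new bool[k + 1];
        for (long i = 1; i + i + 2 * i * i <= k; i++)
            for (long j = i; i + j + 2 * i * j <= k; j++)
                marked[i + j + 2 * i * j] = true; // 2m + 1 = (2i + 1)(2j + 1)
        for (int m = 1; m <= k; m++)
            if (!marked[m]) primes.Add(2 * m + 1);
        return primes;
    }
}
```

The flag array only covers the odd numbers, so it needs about half as many entries as a plain sieve over 0..n.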
You can substantially improve your algorithm by testing only whether n is a multiple of any integer between 2 and sqrt(n).
private static List<int> GetPrimeNumbers2(long number)
{
var result = new List<int>();
for (var i = 2; i <= number; i++) // start at 2: 0 and 1 are not prime
{
var isPrime = true;
var n = Math.Floor(Math.Sqrt(i));
for (var j = 2; j <= n; j++)
{
if (i % j == 0)
{
isPrime = false;
break;
}
}
if (isPrime)
{
result.Add(i);
}
}
return result;
}
This changes the complexity from O(N*N) to O(N*sqrt(N)).
The fastest known algorithm for testing the primality of general numbers is the Elliptic Curve Primality Proving (ECPP): http://en.wikipedia.org/wiki/Elliptic_curve_primality_proving
I guess that implementing it will be difficult, so do it only if you really need it. There are probably libraries that could help you here.
This will give you reasonable performance for the initial execution and then near to O(1) (it will be O(N) but very, very, small) performance for any repeated requests, and reasonable performance for values larger than the current max number seen.
private static List<ulong> KnownPrimes = new List<ulong>();
private static ulong LargestValue = 1UL;
private static List<ulong> GetFastestPrimeNumbers(ulong number)
{
var result = new List<ulong>();
lock (KnownPrimes)
{
result.AddRange(KnownPrimes.Where(c => c <= number).ToList()); // "<=" so that number itself is included when it is prime
if (number <= LargestValue)
{
return result;
}
result = KnownPrimes;
for (var i = LargestValue + 1; i <= number; i++)
{
var isPrime = true;
var n = Math.Floor(Math.Sqrt(i));
for (var j = 0; j < KnownPrimes.Count; j++)
{
var jVal = KnownPrimes[j];
if (jVal * jVal > i)
{
//isPrime = false;
break;
}
else if (i % jVal == 0)
{
isPrime = false;
break;
}
}
if (isPrime)
{
result.Add(i);
}
}
LargestValue = number;
}
return result;
}
Edit: Considerably faster using the Sieve of Atkin, which I adapted to know about the cached primes:
private static List<ulong> KnownPrimes = new List<ulong>();
private static ulong LargestValue = 1UL;
private unsafe static List<ulong> FindPrimes(ulong number)
{
var result = new List<ulong>();
var isPrime = new bool[number + 1];
var sqrt = Math.Sqrt(number);
lock (KnownPrimes)
{
fixed (bool* pp = isPrime)
{
bool* pp1 = pp;
result.AddRange(KnownPrimes.Where(c => c <= number).ToList()); // "<=" so that number itself is included when it is prime
if (number <= LargestValue)
{
return result;
}
result = KnownPrimes;
for (ulong x = 1; x <= sqrt; x++)
for (ulong y = 1; y <= sqrt; y++)
{
var n = 4 * x * x + y * y;
if (n <= number && (n % 12 == 1 || n % 12 == 5))
pp1[n] ^= true;
n = 3 * x * x + y * y;
if (n <= number && n % 12 == 7)
pp1[n] ^= true;
n = 3 * x * x - y * y;
if (x > y && n <= number && n % 12 == 11)
pp1[n] ^= true;
}
for (ulong n = 5; n <= sqrt; n++)
if (pp1[n])
{
var s = n * n;
for (ulong k = s; k <= number; k += s)
pp1[k] = false;
}
if (LargestValue < 3)
{
KnownPrimes.Add(2);
KnownPrimes.Add(3);
}
for (ulong n = 5; n <= number; n += 2)
if (pp1[n])
KnownPrimes.Add(n);
LargestValue = number;
}
}
return result;
}
Adapted from Source
This can easily be improved to get better performance when adding items, but I would suggest you save the previous KnownPrimes list to disk between executions, and load a pre-existing list of values such as the list from http://primes.utm.edu/lists/small/millions (credit goes to CodingBarfield).
I found this link:
http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm
According to your question, it seems that you are not interested in proving that a certain given number is probably (or certainly) a prime, nor in factoring large numbers. To find all prime numbers up to a given N, one can use the Sieve of Eratosthenes, but it seems that further optimizations were considered in the link above.
I think a pertinent question is "how big will the upper limit ever be". If the number is in a relatively small range (let's say 2^16), you could probably just precompute and save all the primes (below some limit) to a file, and then load them into memory where appropriate (and then potentially continue using one of the sieves listed below).
Ivan Benko and Steve Jessop above do state the two better-known fast methods (Eratosthenes, Atkin), although, Ivan, the complexity of the Sieve is O(n*log(log(n))).
The Sieve is relatively easy to implement and is very fast compared to your method.
The absolute most performant:
(Minimize the work to get the result).
Store the primes of all numbers in the domain in a hashtable with the number as key.
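One way to read this suggestion, sketched under my own assumptions: keep a single sorted array of primes, plus a hashtable that maps each number in the domain to how many primes are less than or equal to it. A lookup then returns a prefix of the shared array without copying. PrimeTable and its members are hypothetical names, and the table is filled by simple trial division because it is built only once.

```csharp
using System;
using System.Collections.Generic;

public sealed class PrimeTable
{
    private readonly int[] primes;
    // Key: a number in the domain. Value: how many primes are <= that number.
    private readonly Dictionary<int, int> countUpTo = new Dictionary<int, int>();

    // Precomputes the answers for every number in [0, max].
    public PrimeTable(int max)
    {
        var found = new List<int>();
        for (int i = 0; i <= max; i++)
        {
            bool isPrime = i >= 2;
            for (int j = 2; isPrime && (long)j * j <= i; j++)
                if (i % j == 0) isPrime = false;
            if (isPrime) found.Add(i);
            countUpTo[i] = found.Count;
        }
        primes = found.ToArray();
    }

    // One hashtable lookup; throws KeyNotFoundException outside [0, max].
    public ArraySegment<int> PrimesUpTo(int number)
    {
        return new ArraySegment<int>(primes, 0, countUpTo[number]);
    }
}
```

For example, new PrimeTable(100).PrimesUpTo(10) is a segment of four elements: 2, 3, 5, 7. The memory cost grows with the size of the domain, so this only makes sense when the upper limit is known and moderate.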