Why is BitArray faster than array of bools? - c#

I have this implementation of Sieve of Eratosthenes in C#:
public static BitArray Count()
{
const int halfSize = MaxSize / 2;
var mark = new BitArray(halfSize);
const int max = halfSize - 2;
var maxFactor = (int) Math.Sqrt(MaxSize + 1) / 2;
for (var i = 1; i <= maxFactor; ++i)
{
if (mark[i]) continue;
var p = i + i + 1;
var k = p * p >> 1;
for (; k <= max; k += p)
{
mark[k] = true;
}
}
return mark;
}
It gives results good enough for me. Nonetheless, I decided to test the same algorithm with an array of bools, expecting it to use more memory but run faster. To my surprise, it was the other way around: BenchmarkDotNet on .NET Core 3.1 shows the bool array to be more than two times slower than BitArray. Considering that the latter involves more method calls and produces much longer asm, how is that possible?
+---------------+----------+---------+---------+----------+----------+----------+-------+-----------+
| Method | Mean | Error | StdDev | Median | Min | Max | Op/s | Allocated |
+---------------+----------+---------+---------+----------+----------+----------+-------+-----------+
| SieveBool | 294.7 ms | 4.00 ms | 3.74 ms | 293.5 ms | 290.8 ms | 304.0 ms | 3.393 | 33.38 MB |
+---------------+----------+---------+---------+----------+----------+----------+-------+-----------+
| SieveBitArray | 130.2 ms | 1.03 ms | 0.97 ms | 130.3 ms | 128.5 ms | 132.1 ms | 7.680 | 4.17 MB |
+---------------+----------+---------+---------+----------+----------+----------+-------+-----------+
Results are similar when using fields instead of initializing arrays in methods (except there is no allocation of course).
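The bool-array variant is not shown in the question. Here is a minimal sketch of what it presumably looks like, assuming it mirrors the BitArray version line for line. The class name, the small MaxSize value and the CountPrimes helper are made up for illustration and are not from the original post:

```csharp
using System;

public static class SieveSketch
{
    // Hypothetical limit; the question does not state the value of MaxSize.
    public const int MaxSize = 100;

    // Same odd-only sieve as in the question, backed by bool[] instead of BitArray.
    // Index i stands for the odd number 2 * i + 1.
    public static bool[] SieveBool()
    {
        const int halfSize = MaxSize / 2;
        var mark = new bool[halfSize];
        const int max = halfSize - 2;
        var maxFactor = (int)Math.Sqrt(MaxSize + 1) / 2;
        for (var i = 1; i <= maxFactor; ++i)
        {
            if (mark[i]) continue;          // 2 * i + 1 is already known to be composite
            var p = i + i + 1;
            for (var k = p * p >> 1; k <= max; k += p)
            {
                mark[k] = true;             // mark odd multiples of p, starting at p * p
            }
        }
        return mark;
    }

    // Counts the prime 2 plus every unmarked index in the range the sieve covers
    // (1 .. Length - 2, matching the "max" bound above).
    public static int CountPrimes(bool[] mark)
    {
        var count = 1;
        for (var i = 1; i <= mark.Length - 2; ++i)
        {
            if (!mark[i]) ++count;
        }
        return count;
    }
}
```

The Allocated column is consistent with the storage difference between the two: bool[] spends one byte per flag and BitArray one bit, hence 33.38 MB versus 4.17 MB, a factor of eight.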

Related

Is there a way to reverse each 2 bytes of a file?

A basic summary of what I'm trying to achieve (as far as I know, there isn't a ready-made function for this):
I need to reverse every 2 bytes of a file, i.e. read the file's bytes and swap each pair.
Example: 05 04 82 FF
Output: 04 05 FF 82
I have some idea of it. But I know my attempts are WAY off.
To clarify:
I'm trying to take a bin file,
read the bytes inside the file,
reverse every 2 bytes inside that file, and close it.
If anyone can clear up this complicated task, that would be great.
There are many approaches you could take to achieve this.
Here is a fairly efficient streaming approach with low allocations, using all the characters we know and love: ArrayPool, Span<T>, and FileStream.
Note 1: Adjust the buffer size to something that suits your hardware if needed.
Note 2: This lacks basic sanity checks and fault tolerance; it will also die miserably if the size of the file isn't divisible by 2.
Given
private static ArrayPool<byte> pool = ArrayPool<byte>.Shared;
private const int BufferSize = 4096;
public static void Swap(string fileName)
{
var tempFileName = Path.ChangeExtension(fileName, "bob");
var buffer = pool.Rent(BufferSize);
try
{
var span = new Span<byte>(buffer);
using var oldFs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.None, BufferSize);
using var newFs = new FileStream(tempFileName, FileMode.Create, FileAccess.Write, FileShare.None, BufferSize);
var size = 0;
while ((size = oldFs.Read(span)) > 0)
{
for (var i = 0; i < size; i += 2)
{
var temp = span[i];
span[i] = span[i + 1];
span[i + 1] = temp;
}
newFs.Write(span.Slice(0,size));
}
}
finally
{
pool.Return(buffer);
}
File.Move(tempFileName, fileName,true);
}
Test
File.WriteAllText(@"D:\Test1.txt","1234567890abcdef");
Swap(@"D:\Test1.txt");
var result = File.ReadAllText(@"D:\Test1.txt");
Console.WriteLine(result == "2143658709badcfe");
Output
True
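Note 2 above says the method fails on files with an odd number of bytes, because the inner loop reads span[i + 1] past the end of the data. As a minimal sketch, here is the pair swap in isolation with a loop bound that leaves a trailing unpaired byte untouched. The class and method names are made up for illustration, and keeping the last byte as is is just one possible policy:

```csharp
using System;

public static class ByteSwapSketch
{
    // Swaps each adjacent pair of bytes in place.
    // If the length is odd, the final byte has no partner and is left unchanged.
    public static void SwapPairs(Span<byte> data)
    {
        for (var i = 0; i + 1 < data.Length; i += 2)
        {
            var temp = data[i];
            data[i] = data[i + 1];
            data[i + 1] = temp;
        }
    }
}
```

With the bytes from the question, 05 04 82 FF becomes 04 05 FF 82. In the streaming method, a read that returns an odd number of bytes in the middle of the file would still need its last byte carried over to the next chunk; this sketch does not handle that.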
Benchmarks
This is just a simple benchmark comparing the current solution with a plain array approach and a pointer-based one, varying the buffer size as you might do to increase HDD throughput. Technically it only benchmarks one run through a 10 MB data block; however, the allocations would skyrocket if the methods were run more than once.
Environment
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.1198 (1909/November2018Update/19H2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET Core SDK=5.0.100
[Host] : .NET Core 5.0.0 (CoreCLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT [AttachedDebugger]
.NET Core 5.0 : .NET Core 5.0.0 (CoreCLR 5.0.20.51904, CoreFX 5.0.20.51904), X64 RyuJIT
Job=.NET Core 5.0 Runtime=.NET Core 5.0
Results
| Method | N | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------- |------ |------------:|---------:|---------:|-------:|------:|------:|----------:|
| SwapSpanPool | 4094 | 25.89 ns | 0.078 ns | 0.069 ns | - | - | - | - |
| SwapArray | 4094 | 157.70 ns | 0.516 ns | 0.483 ns | 0.4923 | - | - | 4120 B |
| SwapUnsafe | 4094 | 154.71 ns | 0.293 ns | 0.274 ns | 0.4923 | - | - | 4120 B |
|------------- |------ |------------:|---------:|---------:|-------:|------:|------:|----------:|
| SwapSpanPool | 16384 | 25.82 ns | 0.048 ns | 0.043 ns | - | - | - | - |
| SwapArray | 16384 | 520.62 ns | 1.186 ns | 1.109 ns | 1.9569 | - | - | 16408 B |
| SwapUnsafe | 16384 | 518.82 ns | 1.361 ns | 1.273 ns | 1.9569 | - | - | 16408 B |
|------------- |------ |------------:|---------:|---------:|-------:|------:|------:|----------:|
| SwapSpanPool | 65536 | 25.81 ns | 0.049 ns | 0.043 ns | - | - | - | - |
| SwapArray | 65536 | 1,840.41 ns | 5.792 ns | 5.418 ns | 7.8106 | - | - | 65560 B |
| SwapUnsafe | 65536 | 1,846.57 ns | 3.715 ns | 3.475 ns | 7.8106 | - | - | 65560 B |
Setup
[MemoryDiagnoser]
[SimpleJob(RuntimeMoniker.NetCoreApp50)]
public class DumbTest
{
private static readonly ArrayPool<byte> pool = ArrayPool<byte>.Shared;
private MemoryStream ms1;
private MemoryStream ms2;
[Params(4094, 16384, 65536)] public int N;
[GlobalSetup]
public void Setup()
{
var data = new byte[10 * 1024 * 1024];
new Random(42).NextBytes(data);
ms1 = new MemoryStream(data);
ms2 = new MemoryStream(new byte[10 * 1024 * 1024]);
}
public void SpanPool()
{
var buffer = pool.Rent(N);
try
{
var span = new Span<byte>(buffer);
var size = 0;
while ((size = ms1.Read(span)) > 0)
{
for (var i = 0; i < size; i += 2)
{
var temp = span[i];
span[i] = span[i + 1];
span[i + 1] = temp;
}
ms2.Write(span.Slice(0, size));
}
}
finally
{
pool.Return(buffer);
}
}
public void Array()
{
var buffer = new byte[N];
var size = 0;
while ((size = ms1.Read(buffer)) > 0)
{
for (var i = 0; i < size; i += 2)
{
var temp = buffer[i];
buffer[i] = buffer[i + 1];
buffer[i + 1] = temp;
}
ms2.Write(buffer, 0, size);
}
}
public unsafe void Unsafe()
{
var buffer = new byte[N];
fixed (byte* p = buffer)
{
var size = 0;
while ((size = ms1.Read(buffer)) > 0)
{
for (var i = 0; i < size; i += 2)
{
var temp = p[i];
p[i] = p[i + 1];
p[i + 1] = temp;
}
ms2.Write(buffer, 0, size);
}
}
}
[Benchmark]
public void SwapSpanPool()
{
SpanPool();
}
[Benchmark]
public void SwapArray()
{
Array();
}
[Benchmark]
public void SwapUnsafe()
{
Unsafe();
}
}

Ways to improve string memory allocation

This question is more theoretical than practical, but still.
I've been looking for a chance to improve the following code from the string memory allocation standpoint:
/* Output for n = 3:
*
* '  #'
* ' ##'
* '###'
*
*/
public static string[] staircase(int n) {
string[] result = new string[n];
for(var i = 0; i < result.Length; i++) {
var spaces = string.Empty.PadLeft(n - i - 1, ' ');
var sharpes = string.Empty.PadRight(i + 1, '#');
result[i] = spaces + sharpes;
}
return result;
}
PadHelper is the method, that is eventually called under the hood twice per iteration.
So, correct me if I'm wrong, but it seems like memory is allocated at least 3 times per iteration.
Any code improvements will be highly appreciated.
How about:
result[i] = new string('#', i + 1).PadLeft(n);
Note that this still allocates two strings per iteration, but I honestly don't see that as a problem. The garbage collector will take care of it for you.
StringBuilder is always an answer when it comes to string allocations; I'm sure you know that so apparently you want something else. Well, since your strings are all the same length, you can declare a single char[] array, populate it every time (only requires changing one array element on each iteration) and then use the string(char[]) constructor:
public static string[] staircase(int n)
{
char[] buf = new char[n];
string[] result = new string[n];
for (var i = 0; i < n - 1; i++)
{
buf[i] = ' ';
}
for (var i = 0; i < n; i++)
{
buf[n - i - 1] = '#';
result[i] = new string(buf);
}
return result;
}
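In the same spirit, on .NET Core 2.1 or later string.Create lets each row be written directly into the new string's buffer, so only the result strings themselves are allocated. This is a minimal sketch under that runtime assumption and has not been benchmarked; the class name is made up for illustration:

```csharp
using System;

public static class StaircaseSketch
{
    public static string[] Staircase(int n)
    {
        var result = new string[n];
        for (var i = 0; i < n; i++)
        {
            // The state argument (i + 1) is the number of '#' characters in this row.
            result[i] = string.Create(n, i + 1, (span, hashes) =>
            {
                span.Fill(' ');                              // start with all spaces
                span.Slice(span.Length - hashes).Fill('#');  // overwrite the tail with '#'
            });
        }
        return result;
    }
}
```

For n = 3 this yields "  #", " ##" and "###", the same rows as the original method.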
You can save on both allocations and speed by starting with a string that contains all the Spaces and all the Sharpes you're ever going to need, and then taking substrings from that, as follows:
public string[] Staircase2()
{
string allChars = new string(' ', n - 1) + new string('#', n); // n-1 spaces + n sharpes
string[] result = new string[n];
for (var i = 0; i < result.Length; i++)
result[i] = allChars.Substring(i, n);
return result;
}
I used BenchmarkDotNet to compare Staircase1 (your original approach) with Staircase2 (my approach above) from n=2 up to n=8; see the results below.
It shows that Staircase2 is always faster (see the Mean column), and it allocates fewer bytes starting from n=3.
| Method | n | Mean | Error | StdDev | Allocated |
|----------- |-- |------------:|-----------:|-----------:|----------:|
| Staircase1 | 2 | 229.36 ns | 4.3320 ns | 4.0522 ns | 92 B |
| Staircase2 | 2 | 92.00 ns | 0.7200 ns | 0.6735 ns | 116 B |
| Staircase1 | 3 | 375.06 ns | 3.3043 ns | 3.0908 ns | 156 B |
| Staircase2 | 3 | 114.12 ns | 2.8933 ns | 3.2159 ns | 148 B |
| Staircase1 | 4 | 507.32 ns | 3.8995 ns | 3.2562 ns | 236 B |
| Staircase2 | 4 | 142.78 ns | 1.4575 ns | 1.3634 ns | 196 B |
| Staircase1 | 5 | 650.03 ns | 15.1515 ns | 25.7284 ns | 312 B |
| Staircase2 | 5 | 169.25 ns | 1.9076 ns | 1.6911 ns | 232 B |
| Staircase1 | 6 | 785.75 ns | 16.9353 ns | 15.8413 ns | 412 B |
| Staircase2 | 6 | 195.91 ns | 2.9852 ns | 2.4928 ns | 292 B |
| Staircase1 | 7 | 919.15 ns | 11.4145 ns | 10.6771 ns | 500 B |
| Staircase2 | 7 | 237.55 ns | 4.6380 ns | 4.9627 ns | 332 B |
| Staircase1 | 8 | 1,075.66 ns | 26.7013 ns | 40.7756 ns | 620 B |
| Staircase2 | 8 | 255.50 ns | 2.6894 ns | 2.3841 ns | 404 B |
This doesn't mean that Staircase2 is the absolute best possible, but certainly there is a way that is better than the original.
You can project your desired results using the Linq Select method. For example, something like this:
public static string[] staircase(int n) {
return Enumerable.Range(1, n).Select(i => new string('#', i).PadLeft(n)).ToArray();
}
Alternate approach using an int array:
public static string[] staircase(int n) {
return (new int[n]).Select((x,i) => new string('#', i+1).PadLeft(n)).ToArray();
}
HTH

Slow Regex Split

I am parsing a large quantity of data (over 2GB), and my regex search is quite slow. Is there any way to improve it?
Slow Code
string file_content = "4980: 01:06:59.140 - SomeLargeQuantityOfLogEntries";
List<string> split_content = Regex.Split(file_content, @"\s+(?=\d+: \d{2}:\d{2}:\d{2}\.\d{3} - )").ToList();
The way the program works is as follows:
Loads all the data into a string.
The above line of code is used to split the string into log entries and store each entry as an entry in a list. (This is the slow part that I would like to optimize)
Log entries are denoted by the Regex pattern shown above.
In the answer below I describe a couple of optimizations which you may use.
tl;dr: Speed up log parsing 6 times by iterating over the lines and using a custom parsing method (not Regex).
Measurements
Before we attempt any optimizations, I'd propose defining how we are going to measure their impact and value.
For benchmarking I'll use the BenchmarkDotNet framework. Create a console application:
static void Main(string[] args)
{
BenchmarkRunner.Run<LogReaderBenchmarks>();
BenchmarkRunner.Run<LogParserBenchmarks>();
BenchmarkRunner.Run<LogBenchmarks>();
Console.ReadLine();
return;
}
Run below command in PackageManagerConsole to add nuget package:
Install-Package BenchmarkDotNet -Version 0.11.5
Test data generator looks like this, run it once, and then just use that temp file all over your benchmarks:
public static class LogFilesGenerator {
public static void GenerateLogFile(string location)
{
var sizeBytes = 512*1024*1024; // 512MB
var line = new StringBuilder();
using (var f = new StreamWriter(location))
{
for (long z = 0; z < sizeBytes; z += line.Length)
{
line.Clear();
line.Append($"{z}: {DateTime.UtcNow.TimeOfDay.ToString(@"hh\:mm\:ss\.fff")} - ");
for (var l = -1; l < z % 3; l++)
line.AppendLine(Guid.NewGuid().ToString());
f.WriteLine(line);
}
f.Close();
}
}
}
Reading file
As commenters pointed out, reading the whole file into memory is very inefficient and the GC will be very unhappy, so let's read it line by line.
The simplest way to achieve this is the File.ReadLines() method, which returns a non-materialized enumerable: the file is read as you iterate over it.
You can also read the file asynchronously. This is a rather useless approach here, as I still merge everything into a single string, so I'm speculating a bit when commenting on the results :)
| Method | buffer | Mean | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------------------- |------- |--------:|------------:|-----------:|----------:|----------:|
| ReadFileToMemory | ? | 1.919 s | 181000.0000 | 93000.0000 | 6000.0000 | 2.05 GB |
| ReadFileEnumerating | ? | 1.881 s | 314000.0000 | - | - | 1.38 GB |
| ReadFileToMemoryAsync | 4096 | 9.254 s | 248000.0000 | 68000.0000 | 6000.0000 | 1.92 GB |
| ReadFileToMemoryAsync | 16384 | 5.632 s | 215000.0000 | 61000.0000 | 6000.0000 | 1.72 GB |
| ReadFileToMemoryAsync | 65536 | 3.499 s | 196000.0000 | 54000.0000 | 4000.0000 | 1.62 GB |
[RyuJitX64Job]
[MemoryDiagnoser]
[IterationCount(1), InnerIterationCount(1), WarmupCount(0), InvocationCount(1), ProcessCount(1)]
[StopOnFirstError]
public class LogReaderBenchmarks
{
string file = @"C:\Users\Admin\AppData\Local\Temp\tmp6483.tmp";
[GlobalSetup()]
public void Setup()
{
//file = Path.GetTempFileName(); <---- uncomment these lines to generate file first time.
//Console.WriteLine(file);
//LogFilesGenerator.GenerateLogFile(file);
}
[Benchmark(Baseline = true)]
public string ReadFileToMemory() => File.ReadAllText(file);
[Benchmark]
[Arguments(1024*4)]
[Arguments(1024 * 16)]
[Arguments(1024 * 64)]
public async Task<string> ReadFileToMemoryAsync(int buffer) => await ReadTextAsync(file, buffer);
[Benchmark]
public int ReadFileEnumerating() => File.ReadLines(file).Select(l => l.Length).Max();
private async Task<string> ReadTextAsync(string filePath, int bufferSize)
{
using (FileStream sourceStream = new FileStream(filePath,
FileMode.Open, FileAccess.Read, FileShare.Read,
bufferSize: bufferSize, useAsync: true))
{
StringBuilder sb = new StringBuilder();
byte[] buffer = new byte[bufferSize];
int numRead;
while ((numRead = await sourceStream.ReadAsync(buffer, 0, buffer.Length)) != 0)
{
string text = Encoding.Unicode.GetString(buffer, 0, numRead);
sb.Append(text);
}
return sb.ToString();
}
}
}
As you can see, ReadFileEnumerating is the fastest, if only slightly (1.881 s vs 1.919 s). It also allocates less than ReadFileToMemory (1.38 GB vs 2.05 GB), and all of it is in Gen 0, so the GC can collect it faster and peak memory consumption is much smaller.
Async read does not give any performance gain. If you need throughput, don't use it.
Split log entries
Regex is slow and memory hungry, and passing it a huge string makes the application slow. You can mitigate this by checking each line of the file against your Regex instead. You do need to reconstruct the whole log entry, though, if entries can span multiple lines.
You can also introduce a more efficient method for matching a line; see customParseMatch for an example. I don't claim it is the most efficient, and you may write a separate benchmark for the predicate, but it already compares well with Regex: it is about 10 times faster than SplitByRegex.
| Method | Mean | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------------------------- |---------:|------:|------------:|------------:|----------:|----------:|
| SplitByRegex | 24.191 s | 1.00 | 426000.0000 | 119000.0000 | 4000.0000 | 2.65 GB |
| SplitByRegexIterating | 16.302 s | 0.67 | 176000.0000 | 88000.0000 | 1000.0000 | 2.05 GB |
| SplitByCustomParseIterating | 2.385 s | 0.10 | 398000.0000 | - | - | 1.75 GB |
[RyuJitX64Job]
[MemoryDiagnoser]
[IterationCount(1), InnerIterationCount(1), WarmupCount(0), InvocationCount(1), ProcessCount(1)]
[StopOnFirstError]
public class LogParserBenchmarks
{
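// Static and public, because LogBenchmarks accesses these through the class name.
public static string file = @"C:\Users\Admin\AppData\Local\Temp\tmp6483.tmp";
string[] lines;
string text;
public static Regex split_regex = new Regex(@"\s+(?=\d+: \d{2}:\d{2}:\d{2}\.\d{3} - )");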
[GlobalSetup()]
public void Setup()
{
lines = File.ReadAllLines(file);
text = File.ReadAllText(file);
}
[Benchmark(Baseline = true)]
public string[] SplitByRegex() => split_regex.Split(text);
[Benchmark]
public int SplitByRegexIterating() =>
parseLogEntries(lines, split_regex.IsMatch).Count();
[Benchmark]
public int SplitByCustomParseIterating() =>
parseLogEntries(lines, customParseMatch).Count();
public static bool customParseMatch(string line)
{
var refinedLine = line.TrimStart();
var colonIndex = refinedLine.IndexOf(':');
if (colonIndex < 0) return false;
if (!int.TryParse(refinedLine.Substring(0,colonIndex), out var _)) return false;
if (refinedLine[colonIndex + 1] != ' ') return false;
if (!TimeSpan.TryParseExact(refinedLine.Substring(colonIndex + 2,12), @"hh\:mm\:ss\.fff", CultureInfo.InvariantCulture, out var _)) return false;
return true;
}
public static IEnumerable<string> parseLogEntries(IEnumerable<string> lines, Predicate<string> entryMatched)
{
StringBuilder builder = new StringBuilder();
foreach (var line in lines)
{
if (entryMatched(line) && builder.Length > 0)
{
yield return builder.ToString();
builder.Clear();
}
builder.AppendLine(line);
}
if (builder.Length > 0)
yield return builder.ToString();
}
}
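customParseMatch above allocates a new string for each TrimStart and Substring call, and it throws on lines that are too short after the colon. As a minimal sketch, assuming .NET Core 2.1 or later for the span-based parsing overloads, the same checks can run on a ReadOnlySpan<char> without intermediate strings. The class and method names are made up for illustration, and this variant has not been benchmarked against the numbers above:

```csharp
using System;
using System.Globalization;

public static class LogLineSketch
{
    // Returns true when the line starts a log entry such as
    // "4980: 01:06:59.140 - message" (leading whitespace is ignored).
    public static bool IsEntryStart(ReadOnlySpan<char> line)
    {
        line = line.TrimStart();
        var colon = line.IndexOf(':');
        if (colon <= 0) return false;

        // After the number we need ": ", a 12-character timestamp and " - ".
        if (line.Length < colon + 2 + 12 + 3) return false;

        // NumberStyles.None accepts digits only, like \d+ in the regex.
        if (!int.TryParse(line.Slice(0, colon), NumberStyles.None,
                CultureInfo.InvariantCulture, out _)) return false;
        if (line[colon + 1] != ' ') return false;

        if (!TimeSpan.TryParseExact(line.Slice(colon + 2, 12),
                @"hh\:mm\:ss\.fff".AsSpan(), CultureInfo.InvariantCulture, out _)) return false;

        return line.Slice(colon + 14, 3).SequenceEqual(" - ".AsSpan());
    }
}
```

A caller holding a string can pass line.AsSpan(). Unlike customParseMatch, this version also checks the " - " separator that the original regex requires.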
Parallelism
Splitting the work across threads is not a trivial task if your log entries can be multi-line, so I'd leave it to other members to provide the code.
Summary
So iterating over each line and using a custom parse function gives the best results so far. Let's run a benchmark and check how much we gained:
| Method | Mean | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------------------------- |---------:|------------:|------------:|----------:|----------:|
| ReadTextAndSplitByRegex | 29.070 s | 601000.0000 | 198000.0000 | 2000.0000 | 4.7 GB |
| ReadLinesAndSplitByFunction | 4.117 s | 713000.0000 | - | - | 3.13 GB |
[RyuJitX64Job]
[MemoryDiagnoser]
[IterationCount(1), InnerIterationCount(1), WarmupCount(0), InvocationCount(1), ProcessCount(1)]
[StopOnFirstError]
public class LogBenchmarks
{
[Benchmark(Baseline = true)]
public string[] ReadTextAndSplitByRegex()
{
var text = File.ReadAllText(LogParserBenchmarks.file);
return LogParserBenchmarks.split_regex.Split(text);
}
[Benchmark]
public int ReadLinesAndSplitByFunction()
{
var lines = File.ReadLines(LogParserBenchmarks.file);
var entries = LogParserBenchmarks.parseLogEntries(lines, LogParserBenchmarks.customParseMatch);
return entries.Count();
}
}
I'm not going to try to improve on Fenixil's excellent and thorough answer. I would like to point out that while regular expressions are great for some things, they aren't particularly efficient, as is already apparent. The RegEx Buddy tool can show how the regex you've given is resolved.
It takes a bit of work to match a regex. The article How a Regex Engine Works Internally explains the process further.

Parallel.For with BigInteger calculations output different than For loop

I have the following loop that runs a conversion from base95 to base10. I'm working with several thousand digit numbers, so BigIntegers are required. inst is the base95 string.
Parallel.For(0, inst.Length, x =>
{
result += BigInteger.Pow(95, x) * (inst[x] - 32);
});
If I work with base95 strings of about 200 characters or less, it works perfectly fine and outputs the same result that a normal for loop would.
However, once I start increasing the size of the base95 string, the parallel's output is thrown way off. I need to work with base95 strings with 1500+ characters, and even up to 30000. A regular for loop can calculate the result fine.
What could be causing this problem? Is there a better method to this than a Parallel.For loop that's still faster than a for loop?
It's just not thread safe. As to why it isn't corrupted with smaller strings, I'm not sure; possibly the TPL just doesn't consider the workload worthy of extra threads. I did verify your results, though: it does produce inconsistent results with larger strings.
The only solution is to make it thread safe. A cheap and nasty approach is to use lock. It would be better to use another thread-safe approach like Interlocked, but that won't work with BigInteger.
BigInteger result = 0;
object sync = new object();
Parallel.For(
0,
inst.Length,
x =>
{
var temp = BigInteger.Pow(95, x) * (inst[x] - 32);
lock (sync)
result += temp;
});
It's not perfect with all the locking, but it's still faster than a regular for loop on my PC.
Another approach is to use the Parallel.For overload with thread-local state; this way you only lock once per thread.
Parallel.For(
0,
inst.Length,
() => new BigInteger(0),
(x, state, subTotal) => subTotal + BigInteger.Pow(95, x) * (inst[x] - 32),
integer =>
{
lock (sync)
result += integer;
});
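Separately from the threading fix, the per-digit BigInteger.Pow call can be avoided altogether with Horner's scheme, a different technique from the ones in this answer: walk from the highest index down and multiply the running total by 95 at each step. This is a single-threaded sketch that has not been timed against the parallel versions; the class and method names are made up for illustration, and ToBase10Pow is only there as a reference for comparison:

```csharp
using System;
using System.Numerics;

public static class Base95Sketch
{
    // Horner's scheme: index 0 is the least significant digit, so start at the
    // last character and fold towards index 0. No BigInteger.Pow is needed.
    public static BigInteger ToBase10(string inst)
    {
        var result = BigInteger.Zero;
        for (var x = inst.Length - 1; x >= 0; x--)
        {
            result = result * 95 + (inst[x] - 32);
        }
        return result;
    }

    // Reference: the plain for loop from the question, summing 95^x * digit.
    public static BigInteger ToBase10Pow(string inst)
    {
        var result = BigInteger.Zero;
        for (var x = 0; x < inst.Length; x++)
        {
            result += BigInteger.Pow(95, x) * (inst[x] - 32);
        }
        return result;
    }
}
```

Both methods compute the same sum, so the second can serve as a check on the first for small inputs.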
Benchmarks
So I was bored; here are your benchmarks.
Tests were run 50 times each; GC.Collect and GC.WaitForPendingFinalizers are run before every test to give cleaner results. All results were checked against each other to prove they are accurate. Scale represents the size of the string as per your question.
Setup
----------------------------------------------------------------------------
Mode : Release (64Bit)
Test Framework : .NET Framework 4.7.1 (CLR 4.0.30319.42000)
----------------------------------------------------------------------------
Operating System : Microsoft Windows 10 Pro
Version : 10.0.17134
----------------------------------------------------------------------------
CPU Name : Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
Description : Intel64 Family 6 Model 58 Stepping 9
Cores (Threads) : 4 (8) : Architecture : x64
Clock Speed : 3901 MHz : Bus Speed : 100 MHz
L2Cache : 1 MB : L3Cache : 8 MB
----------------------------------------------------------------------------
Results
--- Random characters -----------------------------------------------------------------
| Value | Average | Fastest | Cycles | Garbage | Test | Gain |
--- Scale 10 ----------------------------------------------------------- Time 0.259 ---
| for | 5.442 µs | 4.968 µs | 21.794 K | 0.000 B | Base | 0.00 % |
| ParallelResult | 32.451 µs | 30.397 µs | 116.808 K | 0.000 B | Pass | -496.25 % |
| ParallelLock | 35.551 µs | 32.443 µs | 127.966 K | 0.000 B | Pass | -553.22 % |
| AsParallel | 141.457 µs | 118.959 µs | 398.676 K | 0.000 B | Pass | -2,499.13 % |
--- Scale 100 ---------------------------------------------------------- Time 0.298 ---
| ParallelResult | 93.261 µs | 80.085 µs | 329.450 K | 0.000 B | Pass | 11.36 % |
| ParallelLock | 103.912 µs | 84.470 µs | 366.599 K | 0.000 B | Pass | 1.23 % |
| for | 105.210 µs | 93.823 µs | 371.025 K | 0.000 B | Base | 0.00 % |
| AsParallel | 183.538 µs | 159.002 µs | 488.534 K | 0.000 B | Pass | -74.45 % |
--- Scale 1,000 -------------------------------------------------------- Time 4.191 ---
| AsParallel | 5.701 ms | 4.932 ms | 15.479 M | 0.000 B | Pass | 65.83 % |
| ParallelResult | 6.510 ms | 5.701 ms | 18.166 M | 0.000 B | Pass | 60.98 % |
| ParallelLock | 6.734 ms | 5.303 ms | 17.314 M | 0.000 B | Pass | 59.64 % |
| for | 16.685 ms | 15.640 ms | 58.183 M | 0.000 B | Base | 0.00 % |
--- Scale 10,000 ------------------------------------------------------ Time 34.805 ---
| AsParallel | 6.205 s | 4.767 s | 19.202 B | 0.000 B | Pass | 47.20 % |
| ParallelResult | 6.286 s | 5.891 s | 14.752 B | 0.000 B | Pass | 46.51 % |
| ParallelLock | 6.290 s | 5.202 s | 9.982 B | 0.000 B | Pass | 46.48 % |
| for | 11.752 s | 11.436 s | 41.136 B | 0.000 B | Base | 0.00 % |
---------------------------------------------------------------------------------------
ParallelLock
[Test("ParallelLock", "", true)]
public BigInteger Test1(string input, int scale)
{
BigInteger result = 0;
object sync = new object();
Parallel.For(
0,
input.Length,
x =>
{
var temp = BigInteger.Pow(95, x) * (input[x] - 32);
lock (sync)
result += temp;
});
return result;
}
ParallelResult
[Test("ParallelResult", "", false)]
public BigInteger Test2(string input, int scale)
{
BigInteger result = 0;
object sync = new object();
Parallel.For(
0,
input.Length,
() => new BigInteger(0),
(x, state, subTotal) => subTotal + BigInteger.Pow(95, x) * (input[x] - 32),
integer =>
{
lock (sync)
result += integer;
});
return result;
}
AsParallel as tendered by gdir
[Test("AsParallel", "", false)]
public BigInteger Test4(string input, int scale)
{
return Enumerable.Range(0, input.Length)
.AsParallel()
.Aggregate(
new BigInteger(0),
(subtotal, x) => subtotal + BigInteger.Pow(95, x) * (input[x] - 32),
(total, thisThread) => total + thisThread,
(finalSum) => finalSum);
}
for
[Test("for", "", false)]
public BigInteger Test3(string input, int scale)
{
BigInteger result = 0;
for (int i = 0; i < input.Length; i++)
{
result += BigInteger.Pow(95, i) * (input[i] - 32);
}
return result;
}
Input
public static string StringOfChar(int scale)
{
var list = Enumerable.Range(1, scale)
.Select(x => (char)(_rand.Next(32)+32))
.ToArray();
return string.Join("", list);
}
Validation
private static bool Validation(BigInteger result, BigInteger baseLine)
{
return result == baseLine;
}
Summary
Parallel will give you a performance boost. In theory, the less you lock the better, although there may be many factors behind why the results played out the way they did. The overload with a thread-local result seems to work well, but it performs much the same as the others on the larger workloads; I'm not really sure why. Note that I didn't play with the parallel options, and you might be able to tweak it a little more for your solution.
Anyway, good luck.

I want to find the maximum frequency of a common digit in a consecutive subset of integer array

The partial digit subsequence of an array A is a subsequence of integers in which each pair of consecutive integers has at least 1 digit in common.
I keep a dictionary keyed by the characters 0 to 9 with the running count for each digit. Then I loop through all values in the integer array, take each digit, and check my dictionary for the count of that digit.
public static void Main(string[] args)
{
Dictionary<char, int> dct = new Dictionary<char, int>
{
{ '0', 0 },
{ '1', 0 },
{ '2', 0 },
{ '3', 0 },
{ '4', 0 },
{ '5', 0 },
{ '6', 0 },
{ '7', 0 },
{ '8', 0 },
{ '9', 0 }
};
string[] arr = Console.ReadLine().Split(' ');
for (int i = 0; i < arr.Length; i++)
{
string str = string.Join("", arr[i].Distinct());
for (int j = 0; j < str.Length; j++)
{
int count = dct[str[j]];
if (count == i || (i > 0 && arr[i - 1].Contains(str[j])))
{
count++;
dct[str[j]] = count;
}
else dct[str[j]] = 1;
}
}
string s = dct.Aggregate((l, r) => l.Value > r.Value ? l : r).Key.ToString();
Console.WriteLine(s);
}
For example,
12 23 231
The answer would be 2, because it occurs 3 times.
The array can contain 10^18 elements.
Can someone help me with an optimal solution? This algorithm is not fit to handle large data in an array.
All the posted answers are wrong because all of them have ignored the most important part of the question:
The array can contain 10^18 elements.
This array is being read from disk? Supposing each element is two bytes, that's two million terabyte drives just for the array. I don't think that's going to fit into memory. You'll have to go with a streaming solution.
How long will the streaming solution take? If you can process a billion array items a second, which seems within reason, your program will take 32 years to execute.
Your requirements are not realistic, and so the problem cannot feasibly be solved with the resources of a single person. You'll need the resources of a large corporation or nation to attack this problem, and you'll need a lot of funding for hardware acquisition and management.
The linear algorithm is trivial; it's the size of the data that is the entire problem. Start building your data center somewhere with cheap power and friendly tax laws, because you are going to be importing a lot of disks.
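The two estimates in this answer can be reproduced with a few lines of arithmetic. The sketch below uses the answer's own assumptions (two bytes per element, one billion elements processed per second) and takes one terabyte as 10^12 bytes; the class and method names are made up for illustration:

```csharp
using System;

public static class ScaleSketch
{
    public const double Elements = 1e18;

    // Storage in terabytes (10^12 bytes) for a given element size.
    public static double TerabytesNeeded(double bytesPerElement) =>
        Elements * bytesPerElement / 1e12;

    // Wall-clock years to stream every element once at a given rate.
    public static double YearsToStream(double elementsPerSecond) =>
        Elements / elementsPerSecond / (365.25 * 24 * 3600);
}
```

TerabytesNeeded(2) gives 2,000,000 terabytes, and YearsToStream(1e9) gives roughly 31.7 years, which the answer rounds to 32.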
You shouldn't need to go through the array elements one by one; you can simply merge the entire string array into 1 string and go through the characters:
12 23 231 -> "1223231", loop through and count it.
It should be fast enough, O(n), and requires only 10 entries in your dictionary. How "fast" exactly do you need it to be?
I didn't use arrays; I'm not sure if you must use arrays. If not, check this solution.
static void Main(string[] args)
{
List<char> numbers = new List<char>();
Dictionary<char, int> dct = new Dictionary<char, int>()
{
{ '0',0 },
{ '1',0 },
{ '2',0 },
{ '3',0 },
{ '4',0 },
{ '5',0 },
{ '6',0 },
{ '7',0 },
{ '8',0 },
{ '9',0 },
};
string option;
do
{
Console.Write("Enter number: ");
string number = Console.ReadLine();
numbers.AddRange(number);
Console.Write("Enter 'X' if u want to finish work: ");
option = Console.ReadLine();
} while (option.ToLower() != "x");
foreach(char c in numbers)
{
if(dct.ContainsKey(c))
{
dct[c]++;
}
}
foreach(var keyValue in dct)
{
Console.WriteLine($"Char {keyValue.Key} was used {keyValue.Value} times");
}
Console.ReadKey(true);
}
Certainly not an efficient solution but this will work.
public class Program
{
public static int arrLength = 0;
public static string[] arr;
public static Dictionary<char, int> dct = new Dictionary<char, int>();
public static void Main(string[] args)
{
dct.Add('0', 0);
dct.Add('1', 0);
dct.Add('2', 0);
dct.Add('3', 0);
dct.Add('4', 0);
dct.Add('5', 0);
dct.Add('6', 0);
dct.Add('7', 0);
dct.Add('8', 0);
dct.Add('9', 0);
arr = Console.ReadLine().Split(' ');
arrLength = arr.Length;
foreach (string str in arr)
{
char[] ch = str.ToCharArray();
ch = ch.Distinct<char>().ToArray();
foreach (char c in ch)
{
Exists(c, Array.IndexOf(arr, str));
}
}
int val = dct.Values.Max();
foreach(KeyValuePair<char,int> v in dct.Where(x => x.Value == val))
{
Console.WriteLine("Common digit {0} with frequency {1} ",v.Key,v.Value+1);
}
Console.ReadLine();
}
public static bool Exists(char c, int pos)
{
int count = 0;
if (pos == arrLength - 1)
return false;
for (int i = pos; i < arrLength - 1; i++)
{
if (arr[i + 1].ToCharArray().Contains(c))
{
count++;
if (count > dct[c])
dct[c] = count;
}
else
break;
}
return true;
}
}
As somebody else pointed out, if you have 10^18 numbers then this is going to be a lot more data than you can fit into memory. So you need a streaming solution. You also don't want to spend a lot of time on memory allocation or converting strings to character arrays, calling functions to de-duplicate digits, etc. Ideally, you need a solution that looks at each character once.
The memory requirement of the program below is very small: just two small arrays of long integers.
The algorithm I developed maintains two arrays of counts per digit. One is the maximum number of consecutive occurrences of a digit, and the other is the most recent count of consecutive occurrences.
The code itself reads the file character-by-character, accumulating digits until it encounters a character that is not a digit, then it updates the current counts array for each digit encountered. If the current count exceeds the maximum count, then the max count for that digit is updated. If a digit doesn't appear in a number, then its current count is reset to 0.
The occurrence of individual digits in a number is maintained by setting bits in the digits variable. That way, a number like 1221 will not count the digits twice.
using (var input = File.OpenText("filename"))
{
var maxCounts = new long[]{0,0,0,0,0,0,0,0,0,0};
var latestCounts = new long[]{0,0,0,0,0,0,0,0,0,0};
char prevChar = ' ';
int digits = 0; // bit d is set when digit d occurs in the current number
while (!input.EndOfStream)
{
var c = (char)input.Read(); // Read() returns an int, so cast before using it as a char
// If the character is a digit, set the corresponding bit
if (char.IsDigit(c))
{
digits |= (1 << (c-'0'));
prevChar = c;
continue;
}
// test here to prevent resetting counts when there are multiple non-digit
// characters between numbers.
if (!char.IsDigit(prevChar))
{
continue;
}
prevChar = c;
// digits has a bit set for every digit
// that occurred in the number.
// Update the counts arrays.
// For each of the first 10 bits, update the corresponding count.
for (int i = 0; i < 10; ++i)
{
if ((digits & 1) == 1)
{
++latestCounts[i];
if (latestCounts[i] > maxCounts[i])
{
maxCounts[i] = latestCounts[i];
}
}
else
{
latestCounts[i] = 0;
}
// Shift the next bit into place.
digits >>= 1;
}
digits = 0;
}
}
This code minimizes the processing required, but the program's running time will be dominated by the speed at which you can read the file. There are optimizations you can make to increase the input speed, but ultimately you're limited to your system's data transfer speed.
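One detail of the loop above is that counts are only updated when a non-digit character follows a number, so a file that ends directly on a digit would leave its last number uncounted. A minimal sketch of the same idea that also handles that case is shown here. The class name `DigitRuns`, the method name `MaxConsecutiveRuns`, the use of a `TextReader` instead of a file, and the restriction to ASCII digits are assumptions made for illustration, not part of the answer above.

```csharp
using System;
using System.IO;

public static class DigitRuns
{
    // Hypothetical helper: for each digit 0-9, returns the longest run of
    // consecutive numbers in the input that all contain that digit.
    public static long[] MaxConsecutiveRuns(TextReader input)
    {
        var maxCounts = new long[10];
        var latestCounts = new long[10];
        int digits = 0;         // bit d is set when digit d occurs in the current number
        bool inNumber = false;  // true while inside a run of digit characters

        while (true)
        {
            int r = input.Read();                 // -1 signals end of input
            bool isDigit = r >= '0' && r <= '9';  // assumption: ASCII digits only
            if (isDigit)
            {
                digits |= 1 << (r - '0');
                inNumber = true;
            }
            else if (inNumber)
            {
                // A number just ended: extend or reset the run of every digit.
                for (int d = 0; d < 10; ++d)
                {
                    if ((digits & (1 << d)) != 0)
                    {
                        if (++latestCounts[d] > maxCounts[d])
                            maxCounts[d] = latestCounts[d];
                    }
                    else
                    {
                        latestCounts[d] = 0;
                    }
                }
                digits = 0;
                inNumber = false;
            }
            if (r < 0) break;  // stop only after the last number has been flushed
        }
        return maxCounts;
    }

    public static void Main()
    {
        var runs = MaxConsecutiveRuns(new StringReader("12 21 13 45"));
        Console.WriteLine(string.Join(",", runs)); // prints 0,3,2,1,1,1,0,0,0,0
    }
}
```

Treating end of input as just another separator avoids duplicating the update loop after the `while`. Passing `File.OpenText(...)` instead of a `StringReader` gives the streaming behaviour described above.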
I'll give you three versions.
I loaded up a list of random ints as strings (Scale is how many strings are in the list) and ran each version on both .NET Core and .NET Framework. Each test was run 10 times and averaged.
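For anyone who wants to reproduce this kind of input, a minimal sketch of a generator could look like the following. The names `BenchmarkInput` and `Generate`, the fixed seed, and the use of `Random.Next()` are assumptions for illustration only. The actual input code and the `Benchmark<,>` base class used below are not shown in this answer.

```csharp
using System;
using System.Collections.Generic;

public static class BenchmarkInput
{
    // Hypothetical generator: returns `scale` random non-negative ints as strings.
    // Random.Next() never returns a negative value, so every character is a digit,
    // which the pointer-based versions below rely on when they index with (*p - 48).
    public static List<string> Generate(int scale, int seed = 42)
    {
        var rand = new Random(seed);   // fixed seed so runs are comparable
        var list = new List<string>(scale);
        for (int i = 0; i < scale; i++)
        {
            list.Add(rand.Next().ToString());
        }
        return list;
    }

    public static void Main()
    {
        var input = Generate(10000);
        Console.WriteLine(input.Count); // prints 10000
    }
}
```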
Mine1
Uses Distinct
public unsafe class Mine : Benchmark<List<string>, char>
{
protected override char InternalRun()
{
var result = new int[10];
var asd = Input.Select(x => new string(x.Distinct().ToArray())).ToList();
var raw = string.Join("", asd);
fixed (char* pInput = raw)
{
var len = pInput + raw.Length;
for (var p = pInput; p < len; p++)
{
result[*p - 48]++;
}
}
return (char)(result.ToList().IndexOf(result.Max()) + '0');
}
}
Mine2
This uses a second array to mark which digits occur in the current number, then adds those marks to the totals at each separator
public unsafe class Mine2 : Benchmark<List<string>, char>
{
protected override char InternalRun()
{
var result = new int[10];
var current = new int[10];
var raw = string.Join(" ", Input) + " "; // trailing space so the last number is also added to result
fixed (char* pInput = raw)
{
var len = pInput + raw.Length;
for (var p = pInput; p < len; p++)
if (*p != ' ')
current[*p - 48] = 1;
else
for (var i = 0; i < 10; i++)
{
result[i] += current[i];
current[i] = 0;
}
}
return (char)(result.ToList().IndexOf(result.Max()) + '0');
}
}
Mine3
No Joins or string allocation
public unsafe class Mine3 : Benchmark<List<string>, char>
{
protected override char InternalRun()
{
var result = new int[10];
foreach (var item in Input)
fixed (char* pInput = item)
{
var current = new int[10];
var len = pInput + item.Length;
for (var p = pInput; p < len; p++)
current[*p - 48] = 1;
for (var i = 0; i < 10; i++)
{
result[i] += current[i];
current[i] = 0;
}
}
return (char)(result.ToList().IndexOf(result.Max()) + '0');
}
}
#Results .Net Framework 4.7.1
Mode : Release
Test Framework : .Net Framework 4.7.1
Benchmarks runs : 10 times (averaged)
Scale : 10,000
Name | Average | Fastest | StDv | Cycles | Pass | Gain
--------------------------------------------------------------------------
Mine3 | 0.533 ms | 0.431 ms | 0.10 | 1,751,372 | Base | 0.00 %
Mine2 | 0.994 ms | 0.773 ms | 0.38 | 3,100,896 | Yes | -86.63 %
Mine | 8.122 ms | 7.012 ms | 1.29 | 27,480,083 | Yes | -1,424.78 %
Original | 20.729 ms | 16.044 ms | 4.56 | 65,316,558 | No | -3,791.47 %
Scale : 100,000
Name | Average | Fastest | StDv | Cycles | Pass | Gain
------------------------------------------------------------------------------
Mine3 | 4.766 ms | 4.475 ms | 0.34 | 16,140,716 | Base | 0.00 %
Mine2 | 8.424 ms | 7.890 ms | 0.33 | 28,577,416 | Yes | -76.76 %
Mine | 96.650 ms | 93.066 ms | 3.35 | 327,615,266 | Yes | -1,927.94 %
Original | 163.342 ms | 154.070 ms | 12.61 | 550,038,934 | No | -3,327.32 %
Scale : 1,000,000
Name | Average | Fastest | StDv | Cycles | Pass | Gain
------------------------------------------------------------------------------------
Mine3 | 49.827 ms | 48.600 ms | 1.19 | 169,162,589 | Base | 0.00 %
Mine2 | 106.334 ms | 97.641 ms | 6.53 | 359,773,719 | Yes | -113.41 %
Mine | 1,051.600 ms | 1,000.731 ms | 35.75 | 3,511,515,189 | Yes | -2,010.51 %
Original | 1,640.385 ms | 1,588.431 ms | 65.50 | 5,538,915,638 | No | -3,192.18 %
#Results .Net Core 2.0
Mode : Release
Test Framework : Core 2.0
Benchmarks runs : 10 times (averaged)
Scale : 10,000
Name | Average | Fastest | StDv | Cycles | Pass | Gain
--------------------------------------------------------------------------
Mine3 | 0.476 ms | 0.353 ms | 0.12 | 1,545,995 | Base | 0.00 %
Mine2 | 0.554 ms | 0.551 ms | 0.00 | 1,883,570 | Yes | -16.23 %
Mine | 7.585 ms | 5.875 ms | 1.27 | 25,580,339 | Yes | -1,492.28 %
Original | 21.380 ms | 16.263 ms | 6.46 | 65,741,909 | No | -4,388.14 %
Scale : 100,000
Name | Average | Fastest | StDv | Cycles | Pass | Gain
------------------------------------------------------------------------------
Mine3 | 3.946 ms | 3.685 ms | 0.25 | 13,409,181 | Base | 0.00 %
Mine2 | 6.203 ms | 5.796 ms | 0.33 | 21,042,340 | Yes | -57.21 %
Mine | 72.975 ms | 68.599 ms | 4.13 | 246,471,960 | Yes | -1,749.41 %
Original | 161.400 ms | 145.664 ms | 19.37 | 544,703,761 | Yes | -3,990.40 %
Scale : 1,000,000
Name | Average | Fastest | StDv | Cycles | Pass | Gain
------------------------------------------------------------------------------------
Mine3 | 41.036 ms | 38.928 ms | 2.47 | 139,045,736 | Base | 0.00 %
Mine2 | 71.283 ms | 68.777 ms | 2.49 | 241,525,269 | Yes | -73.71 %
Mine | 749.250 ms | 720.809 ms | 27.79 | 2,479,171,863 | Yes | -1,725.84 %
Original | 1,517.240 ms | 1,477.321 ms | 48.94 | 5,142,422,700 | No | -3,597.35 %
#Summary
String allocation, Join, and Distinct all hurt performance badly. If you need more performance you could probably break the list up into work loads and process them in parallel.
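As a rough illustration of the parallel idea, here is a minimal sketch using `Parallel.ForEach` with per-thread partial counts. It was not benchmarked as part of the results above. The class and method names are hypothetical, and it assumes every string contains only ASCII digits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelDigits
{
    // Hypothetical helper: for each digit 0-9, counts how many strings contain it.
    public static int[] CountNumbersContainingDigit(IList<string> input)
    {
        var totals = new int[10];
        var gate = new object();

        Parallel.ForEach(
            input,
            () => new int[10],                 // per-thread partial counts
            (item, state, local) =>
            {
                int seen = 0;                  // bit d set when digit d occurs in item
                foreach (char c in item)
                {
                    seen |= 1 << (c - '0');    // assumption: ASCII digits only
                }
                for (int d = 0; d < 10; d++)
                {
                    if ((seen & (1 << d)) != 0) local[d]++;
                }
                return local;
            },
            local =>
            {
                lock (gate)                    // merge once per worker, not once per item
                {
                    for (int d = 0; d < 10; d++) totals[d] += local[d];
                }
            });

        return totals;
    }

    public static void Main()
    {
        var totals = CountNumbersContainingDigit(new List<string> { "1221", "13", "45", "31" });
        Console.WriteLine(string.Join(",", totals));                            // prints 0,3,1,2,1,1,0,0,0,0
        Console.WriteLine((char)(Array.IndexOf(totals, totals.Max()) + '0'));   // prints 1
    }
}
```

Keeping a private array per worker and merging under a lock only at the end keeps contention low. For small lists the scheduling overhead may well outweigh the gain, so this would need measuring before relying on it.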
