How to make this for loop faster - C#

This for loop takes a very long time to complete, and I'm looking for a faster way to do the same work.
ArrayList arrayList = new ArrayList();
byte[] encryptedBytes = null;
for (int i = 0; i < iterations; i++)
{
    encryptedBytes = Convert.FromBase64String(inputString.Substring(base64BlockSize * i,
        base64BlockSize));
    arrayList.AddRange(rsaCryptoServiceProvider.Decrypt(encryptedBytes, true));
}
The iterations variable is sometimes larger than 100,000, and that takes practically forever.

Did you consider running the decryption in a parallel loop? The input substrings have to be prepared first in a regular loop, but that's quick. Then you run the decryption in Parallel.For:
var inputs = new List<string>();
// Decrypt returns byte[], so collect one block of plaintext per iteration.
var result = new byte[iterations][];
// Create inputs from the input string.
for (int i = 0; i < iterations; ++i)
{
    inputs.Add(inputString.Substring(base64BlockSize * i, base64BlockSize));
}
Parallel.For(0, iterations, i =>
{
    var encryptedBytes = Convert.FromBase64String(inputs[i]);
    result[i] = rsaCryptoServiceProvider.Decrypt(encryptedBytes, true);
});
Note that Decrypt returns a byte[], so result is declared as a byte[][] with one slot per iteration; each parallel iteration writes only to its own index, so no synchronization is needed.
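If you still need the single flat byte sequence that the original ArrayList accumulated, you can concatenate the per-block results after the parallel loop finishes. A minimal sketch (assuming System.Linq is imported for Sum):
// Flatten the per-block plaintexts into one byte[]; order is preserved
// because each parallel iteration wrote to its own index.
int totalLength = result.Sum(block => block.Length);
var combined = new byte[totalLength];
int offset = 0;
foreach (var block in result)
{
    Buffer.BlockCopy(block, 0, combined, offset, block.Length);
    offset += block.Length;
}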


C# performance - pointer to span in a hot loop

I'm looking for a faster alternative to BitConverter, used inside a "hot loop":
// i_k_size = 8 bytes
while (fs.Read(ba_buf, 0, ba_buf.Length) > 0 && dcm_buf_read_ctr < i_buf_reads)
{
    Span<byte> sp_data = ba_buf.AsSpan();
    for (int i = 0; i < ba_buf.Length; i += i_k_size)
    {
        UInt64 k = BitConverter.ToUInt64(sp_data.Slice(i, i_k_size));
    }
}
My attempts to combine a pointer with the conversion made performance worse. Can a pointer be used with a span to make this faster?
Below is my benchmark: the pointer-to-array version is 2x faster.
This is the code I'd like to use instead of BitConverter:
public static int l_1gb = 1073741824;

static unsafe void Main(string[] args)
{
    Random rnd = new Random();
    Stopwatch sw1 = new();
    sw1.Start();
    byte[] k = new byte[8];
    fixed (byte* a2rr = &k[0])
    {
        for (int i = 0; i < 1000000000; i++)
        {
            rnd.NextBytes(k);
            //UInt64 p1 = BitConverter.ToUInt64(k);
            //time: 10203.824
            //time: 10508.981
            //time: 10246.784
            //time: 10285.889
            //UInt64* uint64ptr = (UInt64*)a2rr;
            //x2 performance !
            UInt64 p2 = *(UInt64*)a2rr;
            //time: 4609.814
            //time: 4588.157
            //time: 4634.494
        }
    }
    sw1.Stop(); // Stop before reading the elapsed time.
    Console.WriteLine($"time: {Math.Round(sw1.Elapsed.TotalMilliseconds, 3)}");
}
Assuming ba_buf is a byte[], a very easy and efficient way to run your loop is as follows:
foreach (var value in MemoryMarshal.Cast<byte, ulong>(ba_buf))
{
    // work with value here
}
If you need to finesse the buffer (for example, to cut off parts of it), use AsSpan(start, count) on it first.
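For example, here is a small sketch that skips a hypothetical 16-byte header and rounds the remainder down to whole ulongs (headerBytes is illustrative; MemoryMarshal lives in System.Runtime.InteropServices):
int headerBytes = 16; // Hypothetical header to skip.
int payloadBytes = (ba_buf.Length - headerBytes) / sizeof(ulong) * sizeof(ulong);
foreach (var value in MemoryMarshal.Cast<byte, ulong>(ba_buf.AsSpan(headerBytes, payloadBytes)))
{
    // work with value here
}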
You can optimise this quite a lot by initialising some spans outside the reading loop, then reading directly into a Span<byte> and accessing the data via a Span<ulong>, like so:
int buf_bytes = sizeof(ulong) * 1024; // Or whatever buffer size you need.
var ba_buf = new byte[buf_bytes];
var span_buf = ba_buf.AsSpan();
var data_span = MemoryMarshal.Cast<byte, ulong>(span_buf);
while (true)
{
    int count = fs.Read(span_buf) / sizeof(ulong);
    if (count == 0)
        break;
    for (int i = 0; i < count; i++)
    {
        // Do something with data_span[i]
        Console.WriteLine(data_span[i]); // Put your own processing here.
    }
}
This avoids memory allocation as much as possible. It terminates the reading loop when it runs out of data, and if the number of bytes returned is not a multiple of sizeof(ulong) it ignores the extra bytes.
It will always read all the available data, but if you want to terminate it earlier you can add code to do so.
As an example, consider this code which writes 2,000 ulong values to a file and then reads them back in using the code above:
using (var output = File.OpenWrite("x"))
{
    for (ulong i = 0; i < 2000; ++i)
    {
        output.Write(BitConverter.GetBytes(i));
    }
}

using var fs = File.OpenRead("x");

int buf_bytes = sizeof(ulong) * 1024; // Or whatever buffer size you need.
var ba_buf = new byte[buf_bytes];
var span_buf = ba_buf.AsSpan();
var data_span = MemoryMarshal.Cast<byte, ulong>(span_buf);
while (true)
{
    int count = fs.Read(span_buf) / sizeof(ulong);
    if (count == 0)
        break;
    for (int i = 0; i < count; i++)
    {
        // Do something with data_span[i]
        Console.WriteLine(data_span[i]); // Put your own processing here.
    }
}

C# - Code optimization to get all substrings from a string

I was working on a code snippet to get all substrings from a given string.
Here is the code I use:
var stringList = new List<string>();
for (int length = 1; length < mainString.Length; length++)
{
    for (int start = 0; start <= mainString.Length - length; start++)
    {
        var substring = mainString.Substring(start, length);
        stringList.Add(substring);
    }
}
It doesn't look great to me, with two for loops. Is there any other way I can achieve this with better time complexity?
I'm stuck on the point that, to enumerate every substring, I will surely need two loops. Is there another approach I could look into?
A string of length n has n(n+1)/2 non-empty substrings (10 for n = 4), so producing them all is necessarily O(n^2); one loop inside another is the best you can do. Your code structure is correct.
Here's how I would've phrased your code:
void Main()
{
    var stringList = new List<string>();
    string s = "1234";
    for (int i = 0; i < s.Length; i++)
        for (int j = i; j < s.Length; j++)
            stringList.Add(s.Substring(i, j - i + 1));
}
You do need two for loops:
var input = "asd sdf dfg";
var stringList = new List<string>();
for (int i = 0; i < input.Length; i++)
{
    for (int j = i; j < input.Length; j++)
    {
        var substring = input.Substring(i, j - i + 1);
        stringList.Add(substring);
    }
}
foreach (var item in stringList)
{
    Console.WriteLine(item);
}
Update
You cannot improve on the number of iterations.
However, you can improve performance by using fixed arrays and pointers.
In some cases you can significantly increase execution speed by reducing object allocations: here, by using a single char[] and ArraySegment<char> to represent substrings. This also uses less address space and decreases garbage-collector load.
Relevant excerpt from Using the StringBuilder Class in .NET page on Microsoft Docs:
The String object is immutable. Every time you use one of the methods in the System.String class, you create a new string object in memory, which requires a new allocation of space for that new object. In situations where you need to perform repeated modifications to a string, the overhead associated with creating a new String object can be costly.
Example implementation:
static List<ArraySegment<char>> SubstringsOf(char[] value)
{
    // Lengths 1..n-1 give n(n+1)/2 - 1 substrings in total.
    var substrings = new List<ArraySegment<char>>(capacity: value.Length * (value.Length + 1) / 2 - 1);
    for (int length = 1; length < value.Length; length++)
        for (int start = 0; start <= value.Length - length; start++)
            substrings.Add(new ArraySegment<char>(value, start, length));
    return substrings;
}
For more information, check the Fundamentals of Garbage Collection page on Microsoft Docs, the "what is the use of ArraySegment class?" discussion on Stack Overflow, the ArraySegment<T> Structure page on MSDN, and the List<T>.Capacity page on MSDN.
Well, O(n^2) time complexity is inevitable; however, you can try to improve space consumption. In many cases you don't want all the substrings materialized, say, as a List<string>:
public static IEnumerable<string> AllSubstrings(string value) {
    if (value == null)
        yield break; // Or throw ArgumentNullException
    for (int length = 1; length < value.Length; ++length)
        for (int start = 0; start <= value.Length - length; ++start)
            yield return value.Substring(start, length);
}
For instance, let's count all substrings in "abracadabra" that start with "a" and are longer than 3 characters. Notice that all we have to do is loop over the substrings without saving them into a list:
int count = AllSubstrings("abracadabra")
    .Count(item => item.StartsWith("a") && item.Length > 3);
If for any reason you want a List<string>, just add .ToList():
var stringList = AllSubstrings(mainString).ToList();

Parallel algorithm slower than sequential

I spent the last few days creating a parallel version of some code (college work), but I've hit a dead end (at least for me): the parallel version is nearly twice as slow as the sequential one, and I have no clue why. Here is the code:
Variables.GetMatrix();
int ThreadNumber = Environment.ProcessorCount / 2;
int SS = Variables.PopSize / ThreadNumber;
//GeneticAlgorithm GA = new GeneticAlgorithm();
Stopwatch stopwatch = new Stopwatch(), st = new Stopwatch(), st1 = new Stopwatch();
List<Thread> ThreadList = new List<Thread>();
//List<Task> TaskList = new List<Task>();
GeneticAlgorithm[] SubPop = new GeneticAlgorithm[ThreadNumber];
Thread t;
//Task t;
ThreadVariables Instance = new ThreadVariables();
stopwatch.Start();
st.Start();
PopSettings();
InitialPopulation();
st.Stop();
//Lots of attributions...
int SPos = 0, EPos = SS;
for (int i = 0; i < ThreadNumber; i++)
{
    int temp = i, StartPos = SPos, EndPos = EPos;
    t = new Thread(() =>
    {
        SubPop[temp] = new GeneticAlgorithm(Population, NumSeq, SeqSize, MaxOffset, PopFit, Child, Instance, StartPos, EndPos);
        SubPop[temp].RunGA();
        SubPop[temp].ShowPopulation();
    });
    t.Start();
    ThreadList.Add(t);
    SPos = EPos;
    EPos += SS;
}
foreach (Thread a in ThreadList)
    a.Join();
double BestFit = SubPop[0].BestSol;
string BestAlign = SubPop[0].TV.Debug;
for (int i = 1; i < ThreadNumber; i++)
{
    if (BestFit < SubPop[i].BestSol)
    {
        BestFit = SubPop[i].BestSol;
        BestAlign = SubPop[i].TV.Debug;
        Variables.ResSave = SubPop[i].TV.ResSave;
        Variables.NumSeq = SubPop[i].TV.NumSeq;
    }
}
Basically the code creates an array of the object type, instantiates and runs the algorithm in each position of the array, and collects the best value from the array at the end. This type of algorithm works on a three-dimensional data array, and in the parallel version I assign each thread one range of the array, avoiding concurrent access to the data. Still, I'm getting the slow timing... Any ideas?
I'm using a Core i5, which has four logical cores (two physical plus hyperthreading), but any number of threads greater than one makes the code run slower.
What I can explain about the code I'm running in parallel:
The second method called in the code I posted performs about 10,000 iterations, and in each iteration it calls one function. That function may in turn call others (spread across two different objects for each thread) and do lots of calculation; it depends on a bunch of factors particular to the algorithm. All of these methods for one thread work on an area of the data array that isn't accessed by the other threads.
With System.Linq (PLINQ) a lot of this can be made simpler:
int ThreadNumber = Environment.ProcessorCount / 2;
int SS = Variables.PopSize / ThreadNumber;
int numberOfTotalIterations = // I don't know what goes here.
var doneAlgorithms = Enumerable.Range(0, numberOfTotalIterations)
    .AsParallel() // Makes the whole thing run in parallel.
    .WithDegreeOfParallelism(ThreadNumber) // Omit this line if you want the system to manage the degree of parallelism.
    .Select(index => _runAlgorithmAndReturn(index, SS))
    .ToArray(); // Obsolete if you only enumerate doneAlgorithms once to determine the best one;
                // otherwise keep it to prevent multiple enumerations.
// Sort algorithms by BestSol descending and take the first one to determine the "best"
// (the original loop keeps the largest BestSol, so the sort must be descending).
// OrderByDescending causes a full enumeration, hence the above-mentioned obsoletion of the ToArray() statement.
GeneticAlgorithm best = doneAlgorithms.OrderByDescending(algo => algo.BestSol).First();
BestFit = best.BestSol;
BestAlign = best.TV.Debug;
Variables.ResSave = best.TV.ResSave;
Variables.NumSeq = best.TV.NumSeq;
And declare a method to make it a bit more readable:
/// <summary>
/// Runs a single algorithm and returns it.
/// </summary>
private GeneticAlgorithm _runAlgorithmAndReturn(int index, int SS)
{
    int startPos = index * SS;
    int endPos = startPos + SS;
    var algo = new GeneticAlgorithm(Population, NumSeq, SeqSize, MaxOffset, PopFit, Child, Instance, startPos, endPos);
    algo.RunGA();
    algo.ShowPopulation();
    return algo;
}
There is a big overhead in creating threads.
Instead of creating new threads, use the ThreadPool, as shown below:
Variables.GetMatrix();
int ThreadNumber = Environment.ProcessorCount / 2;
int SS = Variables.PopSize / ThreadNumber;
//GeneticAlgorithm GA = new GeneticAlgorithm();
Stopwatch stopwatch = new Stopwatch(), st = new Stopwatch(), st1 = new Stopwatch();
List<WaitHandle> WaitList = new List<WaitHandle>();
//List<Task> TaskList = new List<Task>();
GeneticAlgorithm[] SubPop = new GeneticAlgorithm[ThreadNumber];
//Task t;
ThreadVariables Instance = new ThreadVariables();
stopwatch.Start();
st.Start();
PopSettings();
InitialPopulation();
st.Stop();
//Lots of attributions...
int SPos = 0, EPos = SS;
for (int i = 0; i < ThreadNumber; i++)
{
    int temp = i, StartPos = SPos, EndPos = EPos;
    ManualResetEvent wg = new ManualResetEvent(false);
    WaitList.Add(wg);
    ThreadPool.QueueUserWorkItem((unused) =>
    {
        SubPop[temp] = new GeneticAlgorithm(Population, NumSeq, SeqSize, MaxOffset, PopFit, Child, Instance, StartPos, EndPos);
        SubPop[temp].RunGA();
        SubPop[temp].ShowPopulation();
        wg.Set();
    });
    SPos = EPos;
    EPos += SS;
}
WaitHandle.WaitAll(WaitList.ToArray());
double BestFit = SubPop[0].BestSol;
string BestAlign = SubPop[0].TV.Debug;
for (int i = 1; i < ThreadNumber; i++)
{
    if (BestFit < SubPop[i].BestSol)
    {
        BestFit = SubPop[i].BestSol;
        BestAlign = SubPop[i].TV.Debug;
        Variables.ResSave = SubPop[i].TV.ResSave;
        Variables.NumSeq = SubPop[i].TV.NumSeq;
    }
}
Note that instead of using Join to wait for thread completion, I'm using WaitHandles.
You're creating the threads yourself, so there's some extreme overhead there. Parallelise as the comments suggested. Also make sure the time a single work unit takes is long enough; a single thread/work unit should be alive for at least ~20 ms.
Pretty basic things, really. I'd suggest you read up on how multithreading in .NET works.
I see you don't create too many threads. But the optimal thread count can't be determined just from the processor count. The built-in Parallel class has advanced algorithms to reduce the overall time.
Partitioning and threading are pretty complex things that require a lot of knowledge to get right, so unless you REALLY know what you're doing, rely on the Parallel class to handle it for you.
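As a sketch of that advice applied to the question's code (same names as the question; RunGA and ShowPopulation are the OP's methods):
GeneticAlgorithm[] SubPop = new GeneticAlgorithm[ThreadNumber];
Parallel.For(0, ThreadNumber, i =>
{
    // Each iteration gets its own disjoint range and its own array slot.
    int StartPos = i * SS;
    int EndPos = StartPos + SS;
    SubPop[i] = new GeneticAlgorithm(Population, NumSeq, SeqSize, MaxOffset, PopFit, Child, Instance, StartPos, EndPos);
    SubPop[i].RunGA();
    SubPop[i].ShowPopulation();
});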

How to initialize integer array in C# [duplicate]

Possible Duplicate:
c# Leaner way of initializing int array
Basically I would like to know if there is a more efficient way than the code shown below:
private static int[] GetDefaultSeriesArray(int size, int value)
{
    int[] result = new int[size];
    for (int i = 0; i < size; i++)
    {
        result[i] = value;
    }
    return result;
}
where size can vary from 10 to 150,000. For small arrays it's not an issue, but there should be a better way to do the above.
I am using VS2010 (.NET 4.0).
C#/CLR does not have a built-in way to initialize an array with non-default values.
Your code is as efficient as it can get if you measure in operations per item.
You can potentially get faster initialization if you initialize chunks of a huge array in parallel. This approach needs careful tuning due to the non-trivial cost of multithreaded operations.
Much better results can be obtained by analyzing your needs and potentially removing the initialization altogether. I.e., if the array normally contains a constant value, you can implement some sort of COW (copy-on-write) approach: your object initially has no backing array and simply returns the constant value, and on a write to an element it creates a (potentially partial) backing array for the modified segment.
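A minimal sketch of that COW idea (illustrative, not a drop-in type; bounds checks omitted): the backing array is allocated only on the first write.
class CowArray
{
    private readonly int _length;
    private readonly int _defaultValue;
    private int[] _backing; // Stays null until the first write.

    public CowArray(int length, int defaultValue)
    {
        _length = length;
        _defaultValue = defaultValue;
    }

    public int this[int index]
    {
        get { return _backing == null ? _defaultValue : _backing[index]; }
        set
        {
            if (_backing == null)
            {
                // First write: materialize the backing array filled with the default.
                _backing = new int[_length];
                for (int i = 0; i < _length; i++)
                    _backing[i] = _defaultValue;
            }
            _backing[index] = value;
        }
    }
}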
Slower but more compact code (that is potentially easier to read) would use Enumerable.Repeat. Note that ToArray causes a significant amount of memory to be allocated for large arrays (which may also end up allocating on the LOH); see High memory consumption with Enumerable.Range?.
var result = Enumerable.Repeat(value, size).ToArray();
One way that you can improve speed is by utilizing Array.Copy. It's able to work at a lower level, bulk-assigning larger sections of memory.
By batching the assignments you can end up copying the array from one section of itself to another.
On top of that, the batches themselves can be parallelized quite effectively.
Here is my initial attempt. On my machine (which has only two cores), with a sample array of 10 million items, I was getting a 15% or so speedup. You'll need to play around with the batch size (try to stay in multiples of your page size to keep it efficient) to tune it to the size of items that you have. For smaller arrays it'll end up almost identical to your code, as it won't get past filling up the first batch, but it also won't be (noticeably) worse in those cases either.
private const int batchSize = 1048576;

private static int[] GetDefaultSeriesArray2(int size, int value)
{
    int[] result = new int[size];
    // Fill the first batch normally.
    int end = Math.Min(batchSize, size);
    for (int i = 0; i < end; i++)
    {
        result[i] = value;
    }
    int numBatches = size / batchSize;
    Parallel.For(1, numBatches, batch =>
    {
        Array.Copy(result, 0, result, batch * batchSize, batchSize);
    });
    // Handle the partial leftover batch.
    for (int i = numBatches * batchSize; i < size; i++)
    {
        result[i] = value;
    }
    return result;
}
Another way to improve performance is with a pretty basic technique: loop unrolling.
I have written some code to initialize an array of 20 million items; this is done 100 times and an average is calculated. Without unrolling the loop, it takes about 44 ms. With the loop unrolled by a factor of 10, the process finishes in about 23 ms.
private void Looper()
{
    int repeats = 100;
    float avg = 0;
    ArrayList times = new ArrayList();
    for (int i = 0; i < repeats; i++)
        times.Add(Time());
    Console.WriteLine(GetAverage(times)); // 44
    times.Clear();
    for (int i = 0; i < repeats; i++)
        times.Add(TimeUnrolled());
    Console.WriteLine(GetAverage(times)); // 23
}

private float GetAverage(ArrayList times)
{
    long total = 0;
    foreach (var item in times)
    {
        total += (long)item;
    }
    // Cast before dividing so the average isn't truncated by integer division.
    return (float)total / times.Count;
}

private long Time()
{
    Stopwatch sw = new Stopwatch();
    int size = 20000000;
    int[] result = new int[size];
    sw.Start();
    for (int i = 0; i < size; i++)
    {
        result[i] = 5;
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
    return sw.ElapsedMilliseconds;
}

private long TimeUnrolled()
{
    Stopwatch sw = new Stopwatch();
    int size = 20000000;
    int[] result = new int[size];
    sw.Start();
    for (int i = 0; i < size; i += 10)
    {
        result[i] = 5;
        result[i + 1] = 5;
        result[i + 2] = 5;
        result[i + 3] = 5;
        result[i + 4] = 5;
        result[i + 5] = 5;
        result[i + 6] = 5;
        result[i + 7] = 5;
        result[i + 8] = 5;
        result[i + 9] = 5;
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
    return sw.ElapsedMilliseconds;
}
Enumerable.Repeat(value, size).ToArray();
From what I've read, Enumerable.Repeat is about 20 times slower than the OP's standard for loop, and the only thing I found that might improve its speed is
private static int[] GetDefaultSeriesArray(int size, int value)
{
    int[] result = new int[size];
    for (int i = 0; i < size; ++i)
    {
        result[i] = value;
    }
    return result;
}
NOTE: i++ is changed to ++i. i++ copies i, increments i, and returns the original value; ++i just returns the incremented value.
As someone already mentioned, you can leverage parallel processing like this:
int[] result = new int[size];
Parallel.ForEach(result, x => x = value); // BUG: assigns only the local copy x; the array is never modified (see edit below).
return result;
Sorry, I had no time to do performance testing on this (I don't have VS installed on this machine), but if you can do it and share the results that would be great.
EDIT: As per the comment, the ForEach version above assigns only to the lambda's local copy x and never writes to the array, so use the parallel for loop instead:
Parallel.For(0, size, i => result[i] = value);
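For reference, on newer runtimes the framework does this directly (Array.Fill exists since .NET Core 2.0, so it's not available on the .NET 4.0 the question targets):
int[] result = new int[size];
Array.Fill(result, value); // Fills every element with value.
return result;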

Taking a large number (1 million) of substring reads (100 characters wide) from a long string (3 million characters)

How can I efficiently take 1 million substrings from a string of more than 3 million characters in C#? I have written a program which involves reading random DNA reads (substrings from random positions) of length 100 from a string with 3 million characters. There are 1 million such reads. Currently I run a while loop 1 million times and read a 100-character substring from the 3-million-character string each time. This is taking a long time. What can I do to complete it faster?
Here's my code. len is the length of the original string, 3 million in this case; it may be as low as 50, which is why the check is in the while loop.
while (i < 1000000 && len - 100 > 0) // len is 3000000
{
    int randomPos = _random.Next() % (len - ReadLength);
    readString += all.Substring(randomPos, ReadLength) + Environment.NewLine;
    i++;
}
Using a StringBuilder to assemble the string gets you roughly a 600x speedup, as it avoids creating a new string object every time you append.
before loop (initialising capacity avoids recreating the backing array in StringBuilder):
StringBuilder sb = new StringBuilder(1000000 * ReadLength);
in loop:
sb.Append(all.Substring(randomPos, ReadLength) + Environment.NewLine);
after loop:
readString = sb.ToString();
Using a char array instead of a string to extract the values yields another 30% improvement, as you avoid the object creation incurred by calling Substring():
before loop:
char[] chars = all.ToCharArray();
in loop:
sb.Append(chars, randomPos, ReadLength);
sb.AppendLine();
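Putting those pieces together, a consolidated sketch (assuming all, len, ReadLength, and _random are defined as in the question):
char[] chars = all.ToCharArray();
StringBuilder sb = new StringBuilder(1000000 * (ReadLength + Environment.NewLine.Length));
int i = 0;
while (i < 1000000 && len - 100 > 0)
{
    int randomPos = _random.Next() % (len - ReadLength);
    sb.Append(chars, randomPos, ReadLength); // No intermediate string allocated.
    sb.AppendLine();
    i++;
}
string readString = sb.ToString();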
Edit (final version, which does not use StringBuilder and executes in 300 ms):
char[] chars = all.ToCharArray();
var iterations = 1000000;
char[] results = new char[iterations * (ReadLength + 1)];
GetRandomStrings(len, iterations, ReadLength, chars, results, 0);
string s = new string(results);

private static void GetRandomStrings(int len, int iterations, int ReadLength, char[] chars, char[] result, int resultIndex)
{
    Random random = new Random();
    int i = 0, index = resultIndex;
    while (i < iterations && len - 100 > 0) // len is 3000000
    {
        var i1 = len - ReadLength;
        int randomPos = random.Next() % i1;
        Array.Copy(chars, randomPos, result, index, ReadLength);
        index += ReadLength;
        // Note: this appends only the first character of Environment.NewLine
        // ('\r' on Windows), matching the single extra char reserved per read.
        result[index] = Environment.NewLine[0];
        index++;
        i++;
    }
}
I think better solutions will come, but .NET's StringBuilder is faster than repeated String concatenation because it appends into a mutable buffer instead of allocating a new string for every operation.
You can split the data into pieces and use the .NET Task Parallel Library for multithreading and parallelism.
Edit: Assign fixed values to variables outside the loop to avoid recalculation:
int x = len - 100;
int y = len - ReadLength;
Use
StringBuilder readString = new StringBuilder(ReadLength * numberOfSubStrings);
readString.AppendLine(all.Substring(randomPos, ReadLength));
For parallelism you should split your input into pieces, run the operations on the pieces in separate threads, and then combine the results; a sketch follows.
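A sketch of that splitting idea using the TPL on .NET 4.0 (illustrative; all and ReadLength as in the question, and each chunk gets its own Random and StringBuilder so nothing is shared between threads):
int totalReads = 1000000, chunks = 4, perChunk = totalReads / chunks;
var parts = new string[chunks];
Parallel.For(0, chunks, c =>
{
    var rnd = new Random(Environment.TickCount ^ (c * 397)); // Per-thread RNG.
    var sb = new StringBuilder(perChunk * (ReadLength + 2));
    for (int i = 0; i < perChunk; i++)
        sb.AppendLine(all.Substring(rnd.Next(all.Length - ReadLength), ReadLength));
    parts[c] = sb.ToString(); // Each chunk writes only its own slot.
});
string readString = string.Concat(parts);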
Important: in my previous experience these operations ran faster on .NET v2.0 than on v4.0, so you may want to change your project's target framework version; but you can't use the Task Parallel Library with .NET v2.0, so you'd have to use multithreading the old-school way, like
Thread newThread ......
How long is a long time? It shouldn't be that long.
var file = new StreamReader(@"E:\Temp\temp.txt");
var s = file.ReadToEnd();
var r = new Random();
var sw = new Stopwatch();
sw.Start();
var range = Enumerable.Range(0, 1000000);
var results = range.Select(i => s.Substring(r.Next(s.Length - 100), 100)).ToList();
sw.Stop();
sw.ElapsedMilliseconds.Dump(); // Dump() is LINQPad's output helper.
s.Length.Dump();
So on my machine the result was 807 ms, and the string is 4,055,442 chars.
Edit: I just noticed that you want a string as a result, so my above solution just changes to...
var results = string.Join(Environment.NewLine,range.Select( i => s.Substring(r.Next(s.Length - 100),100)).ToArray());
That adds about 100 ms, so still under a second in total.
Edit: I abandoned the idea of using memcpy, and I think the result is great:
I've broken a 3-million-character string into 30k strings of length 100 each in 43 milliseconds.
private static unsafe string[] Scan(string hugeString, int subStringSize)
{
    var results = new string[hugeString.Length / subStringSize];
    // Pin the string so we can walk its characters with a raw pointer.
    var gcHandle = GCHandle.Alloc(hugeString, GCHandleType.Pinned);
    var currAddress = (char*)gcHandle.AddrOfPinnedObject();
    for (var i = 0; i < results.Length; i++)
    {
        results[i] = new string(currAddress, 0, subStringSize);
        currAddress += subStringSize;
    }
    // Release the pin so the GC can move/collect the string again.
    gcHandle.Free();
    return results;
}
To use the method for the case shown in the question:
const int size = 3000000;
const int subSize = 100;
var stringBuilder = new StringBuilder(size);
var random = new Random();
for (var i = 0; i < size; i++)
{
    stringBuilder.Append((char)random.Next(30, 80));
}
var hugeString = stringBuilder.ToString();
var stopwatch = Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
{
    var strings = Scan(hugeString, subSize);
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedMilliseconds / 1000); // 43
