C# Collections vs Arrays: Longest Increasing Subsequence Benefits

C# Collections vs Arrays: Longest Increasing Subsequence Benefits - c#

What's the performance penalty that I can expect if I'm using Lists over Arrays to solve the Longest Increasing Subsequence?
Will the dynamic nature of Lists improve average performance because we're not dealing with sizes we won't actually use?
PS: Any tips on improving performance while still maintaining some readability?
public static int Run(int[] nums)
{
var length = nums.Length;
List<List<int>> candidates = new List<List<int>>();
candidates.Add(new List<int> { nums[0] });
for (int i = 1; i < length; i++)
{
var valueFromArray = nums[i];
var potentialReplacements = candidates.Where(t => t[t.Count-1] > valueFromArray);
foreach (var collection in potentialReplacements)
{
var collectionCount = collection.Count;
if ((collection.Count > 1 && collection[collectionCount - 2] < valueFromArray) || (collectionCount == 1))
{
collection.RemoveAt(collectionCount - 1);
collection.Add(valueFromArray);
}
}
if (!candidates.Any(t => t[t.Count - 1] >= valueFromArray))
{
var newList = new List<int>();
foreach(var value in candidates[candidates.Count - 1])
{
newList.Add(value);
}
newList.Add(nums[i]);
candidates.Add(newList);
}
}
return candidates[candidates.Count - 1].Count;
}

Depending on the solution the results may vary. Arrays are faster when compared with lists of the same size. How more fast? Lets take a look at the c# solution below. This is a simple O(n^2) solution. I coded a version with arrays only and another one with lists only. I'm running it 1000 times and recording the values for both. Then I just print the average improvement of the array version over the list version. I'm getting over 50% improvement on my computer.
Notice that this solution uses arrays and lists with the same sizes always. Than means I never created an array bigger than the size the lists are gonna grow to in the lists version. Once you start creating arrays with a Max size that may not be filled the comparison stops to be fair.
C# code below:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
namespace hashExample
{
class Program
{
static int RunArray(int[] array)
{
int[] dp = new int[array.Length];
dp[0] = 1;
for (int i = 1; i < array.Length; i++)
{
dp[i] = 1;
for (int j = 0; j < i; j++)
if (array[i] > array[j] && dp[i] < dp[j] + 1)
dp[i] = dp[j] + 1;
}
return dp.Max();
}
static int RunList(List<int> array)
{
List<int> dp = new List<int>(array.Count);
dp.Add(1);
for (int i = 1; i < array.Count; i++)
{
dp.Add(1);
for (int j = 0; j < i; j++)
if (array[i] > array[j] && dp[i] < dp[j] + 1)
dp[i] = dp[j] + 1;
}
return dp.Max();
}
static void Main(string[] args)
{
int arrayLen = 1000;
Random r = new Random();
List<double> values = new List<double>();
Stopwatch clock = new Stopwatch();
Console.WriteLine("Running...");
for (int i = 0; i < 100; i++)
{
List<int> list = new List<int>();
int[] array = new int[arrayLen];
for (int j = 0; j < arrayLen;j++)
{
int e = r.Next();
array[j] = e;
list.Add(e);
}
clock.Restart();
RunArray(array);
clock.Stop();
double timeArray = clock.ElapsedMilliseconds;
clock.Restart();
RunList(list);
clock.Stop();
double timeList = clock.ElapsedMilliseconds;
//Console.WriteLine(Math.Round(timeArray/timeList*100,2) + "%");
values.Add(timeArray / timeList);
}
Console.WriteLine("Arrays are " + Math.Round(values.Average()*100,1) + "% faster");
Console.WriteLine("Done");
}
}
}

Related

Why is bubble sort running faster than selection sort?

I am working on a project that compares the time bubble and selection sort take. I made two separate programs and combined them into one and now bubble sort is running much faster than selection sort. I checked to make sure that the code wasn't just giving me 0s because of some conversion error and was running as intended. I am using System.Diagnostics; to measure the time. I also checked that the machine was not the problem, I ran it on Replit and got similar results.
{
class Program
{
public static int s1 = 0;
public static int s2 = 0;
static decimal bubblesort(int[] arr1)
{
int n = arr1.Length;
var sw1 = Stopwatch.StartNew();
for (int i = 0; i < n - 1; i++)
{
for (int j = 0; j < n - i - 1; j++)
{
if (arr1[j] > arr1[j + 1])
{
int tmp = arr1[j];
// swap tmp and arr[i] int tmp = arr[j];
arr1[j] = arr1[j + 1];
arr1[j + 1] = tmp;
s1++;
}
}
}
sw1.Stop();
// Console.WriteLine(sw1.ElapsedMilliseconds);
decimal a = Convert.ToDecimal(sw1.ElapsedMilliseconds);
return a;
}
static decimal selectionsort(int[] arr2)
{
int n = arr2.Length;
var sw1 = Stopwatch.StartNew();
// for (int e = 0; e < 1000; e++)
// {
for (int x = 0; x < arr2.Length - 1; x++)
{
int minPos = x;
for (int y = x + 1; y < arr2.Length; y++)
{
if (arr2[y] < arr2[minPos])
minPos = y;
}
if (x != minPos && minPos < arr2.Length)
{
int temp = arr2[minPos];
arr2[minPos] = arr2[x];
arr2[x] = temp;
s2++;
}
}
// }
sw1.Stop();
// Console.WriteLine(sw1.ElapsedMilliseconds);
decimal a = Convert.ToDecimal(sw1.ElapsedMilliseconds);
return a;
}
static void Main(string[] args)
{
Console.WriteLine("Enter the size of n");
int n = Convert.ToInt32(Console.ReadLine());
Random rnd = new System.Random();
decimal bs = 0M;
decimal ss = 0M;
int s = 0;
int[] arr1 = new int[n];
int tx = 1000; //tx is a variable that I can use to adjust sample size
decimal tm = Convert.ToDecimal(tx);
for (int i = 0; i < tx; i++)
{
for (int a = 0; a < n; a++)
{
arr1[a] = rnd.Next(0, 1000000);
}
ss += selectionsort(arr1);
bs += bubblesort(arr1);
}
bs = bs / tm;
ss = ss / tm;
Console.WriteLine("Bubble Sort took " + bs + " miliseconds");
Console.WriteLine("Selection Sort took " + ss + " miliseconds");
}
}
}
What is going on? What is causing bubble sort to be fast or what is slowing down Selection sort? How can I fix this?
I found that the problem was that the Selection Sort was looping 1000 times per method run in addition to the 1000 runs for sample size, causing the method to perform significantly worse than bubble sort. Thank you guys for help and thank you TheGeneral for showing me the benchmarking tools. Also, the array that was given as a parameter was a copy instead of a reference, as running through the loop manually showed me that the bubble sort was doing it's job and not sorting an already sorted array.

To solve your initial problem you just need to copy your arrays, you can do this easily with ToArray():
Creates an array from a IEnumerable.
ss += selectionsort(arr1.ToArray());
bs += bubblesort(arr1.ToArray());
However let's learn how to do a more reliable benchmark with BenchmarkDotNet:
BenchmarkDotNet Nuget
Official Documentation
Given
public class Sort
{
public static void BubbleSort(int[] arr1)
{
int n = arr1.Length;
for (int i = 0; i < n - 1; i++)
{
for (int j = 0; j < n - i - 1; j++)
{
if (arr1[j] > arr1[j + 1])
{
int tmp = arr1[j];
// swap tmp and arr[i] int tmp = arr[j];
arr1[j] = arr1[j + 1];
arr1[j + 1] = tmp;
}
}
}
}
public static void SelectionSort(int[] arr2)
{
int n = arr2.Length;
for (int x = 0; x < arr2.Length - 1; x++)
{
int minPos = x;
for (int y = x + 1; y < arr2.Length; y++)
{
if (arr2[y] < arr2[minPos])
minPos = y;
}
if (x != minPos && minPos < arr2.Length)
{
int temp = arr2[minPos];
arr2[minPos] = arr2[x];
arr2[x] = temp;
}
}
}
}
Benchmark code
[SimpleJob(RuntimeMoniker.Net50)]
[MemoryDiagnoser()]
public class SortBenchmark
{
private int[] data;
[Params(100, 1000)]
public int N;
[GlobalSetup]
public void Setup()
{
var r = new Random(42);
data = Enumerable
.Repeat(0, N)
.Select(i => r.Next(0, N))
.ToArray();
}
[Benchmark]
public void Bubble() => Sort.BubbleSort(data.ToArray());
[Benchmark]
public void Selection() => Sort.SelectionSort(data.ToArray());
}
Usage
static void Main(string[] args)
{
BenchmarkRunner.Run<SortBenchmark>();
}
Results
Method
N
Mean
Error
StdDev
Bubble
100
8.553 us
0.0753 us
0.0704 us
Selection
100
4.757 us
0.0247 us
0.0231 us
Bubble
1000
657.760 us
7.2581 us
6.7893 us
Selection
1000
300.395 us
2.3302 us
2.1796 us
Summary
What have we learnt? Your bubble sort code is slower ¯\_(ツ)_/¯

It looks like you're passing in the sorted array into Bubble Sort. Because arrays are passed by reference, the sort that you're doing on the array is editing the same contents of the array that will be eventually passed into bubble sort.
Make a second array and pass the second array into bubble sort.

Selection-sort algorithm sorting wrong

Okay I'm having a problem with the selection sort algorithm. It'll sort ints just fine but when I try to use it for doubles it starts to sort randomly.
here's my code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Sorter.ListSort;
using System.Collections;
namespace ConsoleApp20
{
class Program
{
static void Main(string[] args)
{
var x = new List<double>();
x.Add(23.1);
x.Add(1.5);
x.Add(3);
x.Add(15.23);
x.Add(101.2);
x.Add(23.35);
var sorted = selectionSort(x);
foreach (double s in sorted)
Console.WriteLine(s);
Console.ReadLine();
}
public static List<double> selectionSort(List<double> data)
{
int count = data.Count;
// Console.WriteLine(count);
for (int i = 0; i < count - 1; i++)
{
int min = i;
for (int j = i + 1; j < count; j++)
{
if (data[j] < data[min])
min = j;
double temp = data[min];
data[min] = data[i];
data[i] = temp;
}
}
return data;
}
}
}
Now this is what the algorithm is returning
As we can see 3 is NOT bigger than 15.23, what's going on?

Like MoreON already mentioned in the comments, you should swap the elements after you've found the min value.
So it should look like this
public static List<double> selectionSort(List<double> data)
{
int count = data.Count;
// Console.WriteLine(count);
for (int i = 0; i < count - 1; i++)
{
int min = i;
for (int j = i + 1; j < count; j++)
{
if (data[j] < data[min])
min = j;
}
double temp = data[min];
data[min] = data[i];
data[i] = temp;
}
return data;
}
But if you don't want to reinvent the wheel, you could also use:
var sorted = x.OrderBy(o => o).ToList();

You need to change place for int min
for (int i = 0; i < count - 1; i++)
{
for (int j = i + 1; j < count; j++)
{
int min = i;
if (data[j] < data[min])
min = j;
double temp = data[min];
data[min] = data[i];
data[i] = temp;
}
}

Add a new method called swap and use the following selection sort code.
void swap(int *a,int*b){
int temp=*a;
*a=*b;
*b=temp;
}
void selectionSort(int arr[],int n){
int min_index;
for(int i=0;i<n-1;i++)
{
min_index=i;
for(int j=i+1;j<n;j++)
{
if(arr[j]<arr[min_index])
min_index=j;
}
swap(&arr[min_index],&arr[i]);
}
}

Optimizing my image search routine

My routine is as follows (the SpeedyBitmap class just locks each bitmap for faster pixel data access):
private static bool TestRange(int number, int lower, int upper) {
return number >= lower && number <= upper;
}
public static List<Point> Search(Bitmap toSearch, Bitmap toFind, double percentMatch = 0.85, byte rTol = 2, byte gTol = 2, byte bTol = 2) {
List<Point> points = new List<Point>();
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
int findArea = toFind.Width * toFind.Height;
int allowedMismatches = findArea - (int)(findArea * percentMatch);
int mismatches = 0;
using (SpeedyBitmap speedySearch = new SpeedyBitmap(toSearch)) {
using (SpeedyBitmap speedyFind = new SpeedyBitmap(toFind)) {
for (int i = 0; i < speedySearch.Height - speedyFind.Height + 1; i++) {
for (int j = 0; j < speedySearch.Width - speedyFind.Width + 1; j++) {
for (int k = 0; k < speedyFind.Height; k++) {
for (int l = 0; l < speedyFind.Width; l++) {
Color searchColor = speedySearch[j + l, i + k];
Color findColor = speedyFind[l, k];
if (!TestRange(searchColor.R, findColor.R - rTol, findColor.R + rTol) ||
!TestRange(searchColor.G, findColor.G - gTol, findColor.G + gTol) ||
!TestRange(searchColor.B, findColor.B - bTol, findColor.B + bTol)) {
mismatches++;
if (mismatches > allowedMismatches) {
mismatches = 0;
goto notFound;
}
}
}
}
points.Add(new Point(j, i));
continue;
notFound:
;
}
}
}
}
Console.WriteLine(stopwatch.ElapsedMilliseconds);
return points;
}
Searching a moderately sized image (1000x1000) takes over 20 seconds. Removing the percent match takes it down to a few hundred milliseconds, so I think I've established where the major bottleneck is.
How can I make this run faster? Perhaps I could flatten the two-dimensional arrays down to one-dimensional arrays and then apply some sort of sequence check on the two to possibly match up the longest sequence of where the toFind bitmap data appears with the respective tolerances and match percent. If this would be a good solution, how would I begin to implement it?

fastest way to make an element in a list as the first element

I have list and I want a short and fast way to make one of its element as the first.
I have in mind to use the code below to select the 10th element and make it first. But looking for a better soln
tempList.Insert(0, tempList[10]);
tempList.RemoveAt(11);

if you don't mind the ordering of the rest, you actually could swap the two items at position 0 and 10, believe it's better than doing insertion and deletion:
var other = tempList[0];
tempList[0]=tempList[10];
tempList[10] = other;
and you can even make this an extension of List for ease of use, something like:
public static void Swap<T>(this List<T> list, int oldIndex, int newIndex)
{
// place the swap code here
}

There are some special cases when you could get better performance (for the sake of simplicity, I'm assuming that the value is always inserted before the position where it was taken from):
class Program
{
const int Loops = 10000;
const int TakeLoops = 10;
const int ItemsCount = 100000;
const int Multiplier = 500;
const int InsertAt = 0;
static void Main(string[] args)
{
var tempList = new List<int>();
var tempDict = new Dictionary<int, int>();
for (int i = 0; i < ItemsCount; i++)
{
tempList.Add(i);
tempDict.Add(i, i);
}
var limit = 0;
Stopwatch
sG = new Stopwatch(),
s1 = new Stopwatch(),
s2 = new Stopwatch();
TimeSpan
t1 = new TimeSpan(),
t2 = new TimeSpan();
for (int k = 0; k < TakeLoops; k++)
{
var takeFrom = k * Multiplier + InsertAt;
s1.Restart();
for (int i = 0; i < Loops; i++)
{
tempList.Insert(InsertAt, tempList[takeFrom]);
tempList.RemoveAt(takeFrom + 1);
}
s1.Stop();
t1 += s1.Elapsed;
s2.Restart();
for (int i = 0; i < Loops; i++)
{
var e = tempDict[takeFrom];
for (int j = takeFrom - InsertAt; j > InsertAt; j--)
{
tempDict[InsertAt + j] = tempDict[InsertAt + j - 1];
}
tempDict[InsertAt] = e;
}
s2.Stop();
t2 += s2.Elapsed;
if (s2.Elapsed > s1.Elapsed || limit == 0)
limit = takeFrom;
}
sG.Start();
for (int k = 0; k < TakeLoops; k++)
{
var takeFrom = k * Multiplier + InsertAt;
if (takeFrom >= limit)
{
for (int i = 0; i < Loops; i++)
{
tempList.Insert(InsertAt, tempList[takeFrom]);
tempList.RemoveAt(takeFrom + 1);
}
}
else
{
for (int i = 0; i < Loops; i++)
{
var e = tempDict[takeFrom];
for (int j = takeFrom - InsertAt; j > InsertAt; j--)
{
tempDict[InsertAt + j] = tempDict[InsertAt + j - 1];
}
tempDict[InsertAt] = e;
}
}
}
sG.Stop();
Console.WriteLine("List: {0}", t1);
Console.WriteLine("Dictionary: {0}", t2);
Console.WriteLine("Optimized: {0}", sG.Elapsed);
/***************************
List: 00:00:11.9061505
Dictionary: 00:00:08.9502043
Optimized: 00:00:08.2504321
****************************/
}
}
In the example above, a Dictionary<int,int> is being used to store the index of each element. You will get better results if the gap between insertAt and takeFrom is smaller. As this interval increases, the performance will degrade. I guess you might want to evaluate this gap and take the optimal branch based on it's value.

Need to optimise counting positive and negative values

I need to optimise code that counts pos/neg values and remove non-qualified values by time.
I have queue of values with time-stamp attached.
I need to discard values which are 1ms old and count negative and positive values. here is pseudo code
list<val> l;
v = q.dequeue();
deleteold(l, v.time);
l.add(v);
negcount = l.count(i => i.value < 0);
poscount = l.count(i => i.value >= 0);
if(negcount == 10) return -1;
if(poscount == 10) return 1;
I need this code in c# working with max speed. No need to stick to the List. In fact arrays separated for neg and pos values are welcome.
edit: probably unsafe arrays will be the best. any hints?
EDIT: thanks for the heads up.. i quickly tested array version vs list (which i already have) and the list is faster: 35 vs 16 ms for 1 mil iterations...
Here is the code for fairness sake:
class Program
{
static int LEN = 10;
static int LEN1 = 9;
static void Main(string[] args)
{
Var[] data = GenerateData();
Stopwatch sw = new Stopwatch();
for (int i = 0; i < 30; i++)
{
sw.Reset();
ArraysMethod(data, sw);
Console.Write("Array: {0:0.0000}ms ", sw.ElapsedTicks / 10000.0);
sw.Reset();
ListMethod(data, sw);
Console.WriteLine("List: {0:0.0000}ms", sw.ElapsedTicks / 10000.0);
}
Console.ReadLine();
}
private static void ArraysMethod(Var[] data, Stopwatch sw)
{
int signal = 0;
int ni = 0, pi = 0;
Var[] n = new Var[LEN];
Var[] p = new Var[LEN];
for (int i = 0; i < LEN; i++)
{
n[i] = new Var();
p[i] = new Var();
}
sw.Start();
for (int i = 0; i < DATALEN; i++)
{
Var v = data[i];
if (v.val < 0)
{
int x = 0;
ni = 0;
// time is not sequential
for (int j = 0; j < LEN; j++)
{
long diff = v.time - n[j].time;
if (diff < 0)
diff = 0;
// too old
if (diff > 10000)
x = j;
else
ni++;
}
n[x] = v;
if (ni >= LEN1)
signal = -1;
}
else
{
int x = 0;
pi = 0;
// time is not sequential
for (int j = 0; j < LEN; j++)
{
long diff = v.time - p[j].time;
if (diff < 0)
diff = 0;
// too old
if (diff > 10000)
x = j;
else
pi++;
}
p[x] = v;
if (pi >= LEN1)
signal = 1;
}
}
sw.Stop();
}
private static void ListMethod(Var[] data, Stopwatch sw)
{
int signal = 0;
List<Var> d = new List<Var>();
sw.Start();
for (int i = 0; i < DATALEN; i++)
{
Var v = data[i];
d.Add(new Var() { time = v.time, val = v.val < 0 ? -1 : 1 });
// delete expired
for (int j = 0; j < d.Count; j++)
{
if (v.time - d[j].time < 10000)
d.RemoveAt(j--);
else
break;
}
int cnt = 0;
int k = d.Count;
for (int j = 0; j < k; j++)
{
cnt += d[j].val;
}
if ((cnt >= 0 ? cnt : -cnt) >= LEN)
signal = 9;
}
sw.Stop();
}
static int DATALEN = 1000000;
private static Var[] GenerateData()
{
Random r = new Random(DateTime.Now.Millisecond);
Var[] data = new Var[DATALEN];
Var prev = new Var() { val = 0, time = DateTime.Now.TimeOfDay.Ticks};
for (int i = 0; i < DATALEN; i++)
{
int x = r.Next(20);
data[i] = new Var() { val = x - 10, time = prev.time + x * 1000 };
}
return data;
}
class Var
{
public int val;
public long time;
}
}

To get negcount and poscount, you are traversing the entire list twice.
Instead, traverse it once (to compute negcount), and then poscount = l.Count - negcount.

Some ideas:
Only count until max(negcount,poscount) becomes 10, then quit (no need to count the rest). Only works if 10 is the maximum count.
Count negative and positive items in 1 go.
Calculate only negcount and infer poscount from count-negcount which is easier to do than counting them both.
Whether any of them are faster than what you have now, and which is fastest, depends among other things on what the data typically looks like. Is it long? Short?
Some more about 3:
You can use trickery to avoid branches here. You don't have to test whether the item is negative, you can add its negativity to a counter. Supposing the item is x and it is an int, x >> 31 is 0 for positive x and -1 for negative x. So counter -= x >> 31 will give negcount.
Edit: unsafe arrays can be faster, but shouldn't be in this case, because the loop would be of the form
for (int i = 0; i < array.Length; i++)
do something with array[i];
Which is optimized by the JIT compiler.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

C# Collections vs Arrays: Longest Increasing Subsequence Benefits - c#

Related

Why is bubble sort running faster than selection sort?

Selection-sort algorithm sorting wrong

Optimizing my image search routine

fastest way to make an element in a list as the first element

Need to optimise counting positive and negative values

Categories

Resources