Stopwatch displaying wrong times? - c#

public void GnomeSort<T>(IList<T> list, IComparer<T> comparer)
{
sortTimer = new Stopwatch();
sortTimer.Start();
bool stillGoing = true;
while (stillGoing)
{
stillGoing = false;
for (int i = 1; i < list.Count; )
{
T x = list[i - 1];
T y = list[i];
if (comparer.Compare(x, y) <= 0)
i++;
else
{
list[i - 1] = y;
list[i] = x;
i--;
if (i == 0)
i = 1;
stillGoing = true;
}
}
}
sortTimer.Stop();
richTextBox1.Text += "Gnome Sorting completed, total time taken " + sortTimer.Elapsed + "\n";
}
If I run this twice, using the same unsorted randomly generated array here:
randomArray = randomizedArray
(Convert.ToInt32(textBox1.Text), Convert.ToInt32(textBox2.Text));
randomArrayGnome = randomArray;
randomArrayBubble = randomArray;
randomArrayInsertion = randomArray;
GnomeSort(randomArray);
BubbleSort(randomArrayBubble);
But it outputs something close to this:
Gnome Sorting completed, total time taken 00:00:02.5419864
Bubble Sorting completed, total time taken 00:00:00.0003556
but if I switch the call order, the times are drastically different, instead Bubble Sorting may take 6 seconds. What is happening here? How come it isn't sorting them correctly?

Your problem is with your initialisation of the arrays. As shown here:
randomArray = randomizedArray
(Convert.ToInt32(textBox1.Text), Convert.ToInt32(textBox2.Text));
randomArrayGnome = randomArray;
randomArrayBubble = randomArray;
randomArrayInsertion = randomArray;
The above code creates four variables that all reference the same array. So what happens is that the first sort algorithm sorts the array, subsequent ones will encounter an array that is already sorted so will execute very fast.
As simple solution is to use a Linq - ToList to clone the array:
randomArray = randomizedArray
(Convert.ToInt32(textBox1.Text), Convert.ToInt32(textBox2.Text));
randomArrayGnome = randomArray.ToList();

randomArray & randomArrayGnome both holds the reference to randomizedArray.
When you call
GnomeSort(randomArray);
BubbleSort(randomArrayBubble);
the reference array is already sorted, BubbleSort is working on a already sorted array!
You can use Array.Clone() so that four different reference are created!

Related

Why is List<string>.Sort() slow?

So I noticed that a treeview took unusually long to sort, first I figured that most of the time was spent repainting the control after adding each sorted item. But eitherway I had a gut feeling that List<T>.Sort() was taking longer than reasonable so I used a custom sort method to benchmark it against. The results were interesting, List<T>.Sort() took ~20 times longer, that's the biggest disappointment in performance I've ever encountered in .NET for such a simple task.
My question is, what could be the reason for this? My guess is the overhead of invoking the comparison delegate, which further has to call String.Compare() (in case of string sorting). Increasing the size of the list appears to increase the performance gap. Any ideas? I'm trying to use .NET classes as much as possible but in cases like this I just can't.
Edit:
static List<string> Sort(List<string> list)
{
if (list.Count == 0)
{
return new List<string>();
}
List<string> _list = new List<string>(list.Count);
_list.Add(list[0]);
int length = list.Count;
for (int i = 1; i < length; i++)
{
string item = list[i];
int j;
for (j = _list.Count - 1; j >= 0; j--)
{
if (String.Compare(item, _list[j]) > 0)
{
_list.Insert(j + 1, item);
break;
}
}
if (j == -1)
{
_list.Insert(0, item);
}
}
return _list;
}
Answer: It's not.
I ran the following benchmark in a simple console app and your code was slower:
static void Main(string[] args)
{
long totalListSortTime = 0;
long totalCustomSortTime = 0;
for (int c = 0; c < 100; c++)
{
List<string> list1 = new List<string>();
List<string> list2 = new List<string>();
for (int i = 0; i < 5000; i++)
{
var rando = RandomString(15);
list1.Add(rando);
list2.Add(rando);
}
Stopwatch watch1 = new Stopwatch();
Stopwatch watch2 = new Stopwatch();
watch2.Start();
list2 = Sort(list2);
watch2.Stop();
totalCustomSortTime += watch2.ElapsedMilliseconds;
watch1.Start();
list1.Sort();
watch1.Stop();
totalListSortTime += watch1.ElapsedMilliseconds;
}
Console.WriteLine("totalListSortTime = " + totalListSortTime);
Console.WriteLine("totalCustomSortTime = " + totalCustomSortTime);
Console.ReadLine();
}
Result:
I haven't had the time to fully test it because I had a blackout (writing from phone now), but it would seem your code (from Pastebin) is sorting several times an already ordered list, so it would seem that your algorithm could be faster to...sort an already sorted list. In case the standard .NET implementation is a Quick Sort, this would be natural since QS has its worst case scenario on already sorted lists.

How to populate two separate arrays from one comma-delimited list?

I have a comma delimited text file that contains 20 digits separated by commas. These numbers represent earned points and possible points for ten different assignments. We're to use these to calculate a final score for the course.
Normally, I'd iterate through the numbers, creating two sums, divide and be done with it. However, our assignment dictates that we load the list of numbers into two arrays.
so this:
10,10,20,20,30,35,40,50,45,50,45,50,50,50,20,20,45,90,85,85
becomes this:
int[10] earned = {10,20,30,40,45,50,20,45,85};
int[10] possible = {10,20,35,50,50,50,20,90,85};
Right now, I'm using
for (x=0;x<10;x++)
{
earned[x] = scores[x*2]
poss [x] = scores[(x*2)+1]
}
which gives me the results I want, but seems excessively clunky.
Is there a better way?
The following should split each alternating item the list into the other two lists.
int[20] scores = {10,10,20,20,30,35,40,50,45,50,45,50,50,50,20,20,45,90,85,85};
int[10] earned;
int[10] possible;
int a = 0;
for(int x=0; x<10; x++)
{
earned[x] = scores[a++];
possible[x] = scores[a++];
}
You can use LINQ here:
var arrays = csv.Split(',')
.Select((v, index) => new {Value = int.Parse(v), Index = index})
.GroupBy(g => g.Index % 2,
g => g.Value,
(key, values) => values.ToArray())
.ToList();
and then
var earned = arrays[0];
var possible = arrays[1];
Get rid of the "magic" multiplications and illegible array index computations.
var earned = new List<int>();
var possible = new List<int>();
for (x=0; x<scores.Length; x += 2)
{
earned.Add(scores[x + 0]);
possible.Add(scores[x + 1]);
}
This has very little that would need a text comment. This is the gold standard for self-documenting code.
I initially thought the question was a C question because of all the incomprehensible indexing. It looked like pointer magic. It was too clever.
In my codebases I usually have an AsChunked extension available that splits a list into chunks of the given size.
var earned = new List<int>();
var possible = new List<int>();
foreach (var pair in scores.AsChunked(2)) {
earned.Add(pair[0]);
possible.Add(pair[1]);
}
Now the meaning of the code is apparent. The magic is gone.
Even shorter:
var pairs = scores.AsChunked(2);
var earned = pairs.Select(x => x[0]).ToArray();
var possible = pairs.Select(x => x[1]).ToArray();
I suppose you could do it like this:
int[] earned = new int[10];
int[] possible = new int[10];
int resultIndex = 0;
for (int i = 0; i < scores.Count; i = i + 2)
{
earned[resultIndex] = scores[i];
possible[resultIndex] = scores[i + 1];
resultIndex++;
}
You would have to be sure that an equal number of values are stored in scores.
I would leave your code as is. You are technically expressing very directly what your intent is, every 2nd element goes into each array.
The only way to improve that solution is to comment why you are multiplying. But I would expect someone to quickly recognize the trick, or easily reproduce what it is doing. Here is an excessive example of how to comment it. I wouldn't recommend using this directly.
for (x=0;x<10;x++)
{
//scores contains the elements inline one after the other
earned[x] = scores[x*2] //Get the even elements into earned
poss [x] = scores[(x*2)+1] //And the odd into poss
}
However if you really don't like the multiplication, you can track the scores index separately.
int i = 0;
for (int x = 0; x < 10; x++)
{
earned[x] = scores[i++];
poss [x] = scores[i++];
}
But I would probably prefer your version since it does not depend on the order of the operations.
var res = grades.Select((x, i) => new {x,i}).ToLookup(y=>y.i%2, y=>y.x)
int[] earned = res[0].ToArray();
int[] possible = res[1].ToArray();
This will group all grades into two buckets based on index, then you can just do ToArray if you need result in array form.
here is an example of my comment so you do not need to change the code regardless of the list size:
ArrayList Test = new ArrayList { "10,10,20,20,30,35,40,50,45,50,45,50,50,50,20,20,45,90,85,85" };
int[] earned = new int[Test.Count / 2];
int[] Score = new int[Test.Count / 2];
int Counter = 1; // start at one so earned is the first array entered in to
foreach (string TestRow in Test)
{
if (Counter % 2 != 0) // is the counter even
{
int nextNumber = 0;
for (int i = 0; i < Score.Length; i++) // this gets the posistion for the next array entry
{
if (String.IsNullOrEmpty(Convert.ToString(Score[i])))
{
nextNumber = i;
break;
}
}
Score[nextNumber] = Convert.ToInt32(TestRow);
}
else
{
int nextNumber = 0;
for (int i = 0; i < earned.Length; i++) // this gets the posistion for the next array entry
{
if (String.IsNullOrEmpty(Convert.ToString(earned[i])))
{
nextNumber = i;
break;
}
}
earned[nextNumber] = Convert.ToInt32(TestRow);
}
Counter++
}

remaining values from HashSet into secondary out

We've been asked to create a program which takes 2 input (which I have parsed) for seats and passengers in a plane and randomly place passengers on seats in a plane in one output as well as placing the remaining seats in a secondary output.
I was wondering if there is a simple way to replace the remaining values in HashSet, into the listBoxLedige.
As it works now, the seats are being distributing but the values in the secondary output arent related the the first output.
if(passengers > seats)
{
MessageBox.Show("For mange passagerer");
}
else
{
HashSet<int> check = new HashSet<int>();
for(int i = 0; i <= passengers - 1; i++)
{
int resultat = rnd.Next(1, seats + 1);
while(check.Contains(resultat))
{
resultat = rnd.Next(1, seats + 1);
}
check.Add(resultat);
int[] passagerer01 = new int[passengers];
passagerer01[i] = i+1;
listBoxFulde.Items.Add("Passager #" + passagerer01[i] + "på sæde #" + resultat);
}
HashSet<int> ledige01 = new HashSet<int>();
for(int i = 0; i <= (seats - passengers - 1); i++)
{
int tilbage = rnd.Next(1, seats + 1);
while(ledige01.Contains(tilbage))
{
ledige01.Add(tilbage);
}
listBoxLedige.Items.Add("Sæde #" + tilbage);
I'm not exactly sure I understand your problem, but have you taken a look at the Except LINQ extension method? Judging by your wording ("remaining values"), it might be the right method for you.
Edit Here's how it's done:
IEnumerable<int> empty = allSeats.Except(check);
Note how empty is now a deferred enumerator (unless you do a .ToArray(), .ToList() or similar on it).
Here's what I'd do (see Range and ExceptWith):
HashSet<int> ledige01 = new HashSet<int>(
Enumerable.Range(1, seats));
ledige01.ExceptWith(taken);
Note in generating the seeds you can remove the trial and error by simply shuffling the seats and taking the first N:
var taken = HashSet<int>(Enumerable.Range(1,seats).Shuffle().Take(passengers));
For hints on how to do shuffle, see e.g. Optimal LINQ query to get a random sub collection - Shuffle
As an aside:
int[] passagerer01 = new int[passengers];
passagerer01[i] = i+1;
listBoxFulde.Items.Add("Passager #" + passagerer01[i] + "på sæde #" + resultat);
looks to be something other than you need :) But I'm assuming it's unfinished and you're likely aware of that
A 'fully' edited take:
if(passengers > seats)
{
MessageBox.Show("For mange passagerer");
}
else
{
HashSet<int> taken = new HashSet<int>();
for(int i = 0; i <= passengers - 1; i++)
{
int resultat;
do {
resultat = rnd.Next(1, seats + 1);
} while(taken.Contains(resultat));
taken.Add(resultat);
listBoxFulde.Items.Add("Passager #" + (i+1) + "på sæde #" + resultat);
}
HashSet<int> ledige01 = new HashSet<int>(
Enumerable.Range(1, seats));
ledige01.ExceptWith(taken);
if(passengers > seats)
{
MessageBox.Show("For mange passagerer");
}
else
{
HashSet<int> taken = new HashSet<int>();
for(int i = 0; i <= passengers - 1; i++)
{
int resultat;
do {
resultat = rnd.Next(1, seats + 1);
} while(taken.Contains(resultat));
taken.Add(resultat);
listBoxFulde.Items.Add("Passager #" + (i+1) + "på sæde #" + resultat);
}
HashSet<int> ledige01 = new HashSet<int>(Enumerable.Range(1, seats));
ledige01.ExceptWith(taken);
foreach(var tilbage in ledige01)
listBoxLedige.Items.Add("Sæde #" + tilbage);
Not a hundred percent sure I understand your problem or solution, but are you aware that you declare and initialize passagerer01 once for each passenger, then deterministically (not randomly) assign seat i+1 to passenger i and afterwards throw away that array? If you wanted to keep the information, you'd have to declare the array outside the for loop.
Also, it doesn't really seem like you're doing anything meaningful in that second part of your code. To determine the empty seats, it would make sense to go through numbers 1 to passengers, check if they are in the set check and, if not, add them to set ledige01. Or, of course, to do something equivalent using a library method as suggested by sehe.
As a last note, in computer science you usually start counting from zero. Thus, you would usually have seat numbers 0 to seats-1 and choose seats randomly like so: rnd.Next(0, seats). And you would usually loop like this: for(int i = 0; i < passengers; i++) instead of for(int i = 0; i <= passengers - 1; i++).

Detecting an empty array without looping

How can i find out an array is empty or not, without looping?!
is there any method or anything else?
I mean, in some code like this:
string[] names = new string[5];
names[0] = "Scott";
names[1] = "jack";
names[2] = null;
names[3] = "Jones";
names[4] = "Mesut";
// or
int[] nums = new int[4];
nums[0] = 1;
// nums[1] = 2;
nums[2] = 3;
nums[3] = 4;
or some code like this:
using System;
class Example
{
static void Main()
{
int size = 10;
int counter;
string[] str = new string[size];
for (counter = 0; counter < size; counter++)
{
str[counter] = "A" + counter;
}
str[3] = null;
if (counter == size)
Console.WriteLine("Our array is full!");
if(counter < size)
Console.WriteLine("Our array is not full");
for (int i = 0; i < size; i++)
{
Console.WriteLine(str[i]);
}
}
}
is there anything else for detecting an empty array without looping?
An array just contains a number of elements. There's no concept of an array being "empty" just because each element happens to contain the default value (0, null, or whatever).
If you want a dynamically sized collection, you should use something like List<T> instead of an array.
If you want to detect whether any element of a collection (whether that's a list, an array or anything else) is a non-default value, you have to do that via looping. (You don't have to loop in your source code, but there'll be looping involved somewhere...)
There is no other way than looping through, even LINQ also does the looping automatically.
Instead, use a list<> and check if (listName!=null && listName.Length!=0)
Hope it helps :)
You can use LINQ for that, to check if any element is empty in the array you just can do:
var hasNulls = myArray.Any( a => a == null );
Or if you want to select the ones with values:
var notNulls = myArray.Where( a => a != null );

How to sort a part of an array with int64 indicies in C#?

The .Net framework has an Array.Sort overload that allows one to specify the starting and ending indicies for the sort to act upon. However these parameters are only 32 bit. So I don't see a way to sort a part of a large array when the indicies that describe the sort range can only be specified using a 64-bit number. I suppose I could copy and modify the the framework's sort implementation, but that is not ideal.
Update:
I've created two classes to help me around these and other large-array issues. One other such issue was that long before I got to my memory limit, I start getting OutOfMemoryException's. I'm assuming this is because the requested memory may be available but not contiguous. So for that, I created class BigArray, which is a generic, dynamically sizable list of arrays. It has a smaller memory footprint than the framework's generic list class, and does not require that the entire array be contiguous. I haven't tested the performance hit, but I'm sure its there.
public class BigArray<T> : IEnumerable<T>
{
private long capacity;
private int itemsPerBlock;
private int shift;
private List<T[]> blocks = new List<T[]>();
public BigArray(int itemsPerBlock)
{
shift = (int)Math.Ceiling(Math.Log(itemsPerBlock) / Math.Log(2));
this.itemsPerBlock = 1 << shift;
}
public long Capacity
{
get
{
return capacity;
}
set
{
var requiredBlockCount = (value - 1) / itemsPerBlock + 1;
while (blocks.Count > requiredBlockCount)
{
blocks.RemoveAt(blocks.Count - 1);
}
while (blocks.Count < requiredBlockCount)
{
blocks.Add(new T[itemsPerBlock]);
}
capacity = (long)itemsPerBlock * blocks.Count;
}
}
public T this[long index]
{
get
{
Debug.Assert(index < capacity);
var blockNumber = (int)(index >> shift);
var itemNumber = index & (itemsPerBlock - 1);
return blocks[blockNumber][itemNumber];
}
set
{
Debug.Assert(index < capacity);
var blockNumber = (int)(index >> shift);
var itemNumber = index & (itemsPerBlock - 1);
blocks[blockNumber][itemNumber] = value;
}
}
public IEnumerator<T> GetEnumerator()
{
for (long i = 0; i < capacity; i++)
{
yield return this[i];
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
}
And getting back to the original issue of sorting... What I really needed was a way to act on each element of an array, in order. But with such large arrays, it is prohibitive to copy the data, sort it, act on it and then discard the sorted copy (the original order must be maintained). So I created static class OrderedOperation, which allows you to perform an arbitrary operation on each element of an unsorted array, in a sorted order. And do so with a low memory footprint (trading memory for execution time here).
public static class OrderedOperation
{
public delegate void WorkerDelegate(int index, float progress);
public static void Process(WorkerDelegate worker, IEnumerable<int> items, int count, int maxItem, int maxChunkSize)
{
// create a histogram such that a single bin is never bigger than a chunk
int binCount = 1000;
int[] bins;
double binScale;
bool ok;
do
{
ok = true;
bins = new int[binCount];
binScale = (double)(binCount - 1) / maxItem;
int i = 0;
foreach (int item in items)
{
bins[(int)(binScale * item)]++;
if (++i == count)
{
break;
}
}
for (int b = 0; b < binCount; b++)
{
if (bins[b] > maxChunkSize)
{
ok = false;
binCount *= 2;
break;
}
}
} while (!ok);
var chunkData = new int[maxChunkSize];
var chunkIndex = new int[maxChunkSize];
var done = new System.Collections.BitArray(count);
var processed = 0;
var binsCompleted = 0;
while (binsCompleted < binCount)
{
var chunkMax = 0;
var sum = 0;
do
{
sum += bins[binsCompleted];
binsCompleted++;
} while (binsCompleted < binCount - 1 && sum + bins[binsCompleted] <= maxChunkSize);
Debug.Assert(sum <= maxChunkSize);
chunkMax = (int)Math.Ceiling((double)binsCompleted / binScale);
var chunkCount = 0;
int i = 0;
foreach (int item in items)
{
if (item < chunkMax && !done[i])
{
chunkData[chunkCount] = item;
chunkIndex[chunkCount] = i;
chunkCount++;
done[i] = true;
}
if (++i == count)
{
break;
}
}
Debug.Assert(sum == chunkCount);
Array.Sort(chunkData, chunkIndex, 0, chunkCount);
for (i = 0; i < chunkCount; i++)
{
worker(chunkIndex[i], (float)processed / count);
processed++;
}
}
Debug.Assert(processed == count);
}
}
The two classes can work together (that's how I use them), but they don't have to. I hope someone else finds them useful. But I'll admit, they are fringe case classes. Questions welcome. And if my code sucks, I'd like to hear tips, too.
One final thought: As you can see in OrderedOperation, I'm using ints and not longs. Currently that is sufficient for me despite the original question I had (the application is in flux, in case you can't tell). But the class should be able to handle longs as well, should the need arise.
You'll find that even on the 64-bit framework, the maximum number of elements in an array is int.MaxValue.
The existing methods that take or return Int64 just cast the long values to Int32 internally and, in the case of parameters, will throw an ArgumentOutOfRangeException if a long parameter isn't between int.MinValue and int.MaxValue.
For example the LongLength property, which returns an Int64, just casts and returns the value of the Length property:
public long LongLength
{
get { return (long)this.Length; } // Length is an Int32
}
So my suggestion would be to cast your Int64 indicies to Int32 and then call one of the existing Sort overloads.
Since Array.Copy takes Int64 params, you could pull out the section you need to sort, sort it, then put it back. Assuming you're sorting less than 2^32 elements, of course.
Seems like if you are sorting more than 2^32 elements then it would be best to write your own, more efficient, sort algorithm anyway.

Categories