I am trying to implement a generic Mergesort algorithm in C#, but I am having difficulty with the Constraints. I have searched many references but I can't find any that are implementing the algorithm like I am.
MergeSort algorithm in C#
Generic Implementation of Sorting Algorithms
Anyways, I am trying to provide an implementation that only allows the user to Mergesort a dataset that inherits from the IComparable interface.
Below is what I have so far:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace SortUtil
{
class Program
{
static void Main(string[] args)
{
List<int> testList = new List<int> { 1, 5, 2, 7, 3, 9, 4, 6 };
Mergesort.mergeSort<int>(testList); // Compiler Error at this Line.
}
}
class Mergesort
{
public static void mergeSort<T>(ref List<T> inputData)
where T: IComparable<T>
{
mergeSort(ref inputData, 0, inputData.Count - 1);
}
private static void mergeSort<T>(ref List<T> inputData, int firstIndex, int lastIndex)
where T: IComparable<T>
{
// If firstIndex is greater than or equal to lastIndex, the recursion
// has divided the problem down to at most a single item. Return back up
// the call stack.
if (firstIndex >= lastIndex)
return;
int midIndex = (firstIndex + lastIndex) / 2;
// Recursively divide the first and second halves of the inputData into
// its two separate parts.
mergeSort(ref inputData, firstIndex, midIndex);
mergeSort(ref inputData, midIndex + 1, lastIndex);
// Merge the two remaining halves after dividing them in half.
merge(ref inputData, firstIndex, midIndex, lastIndex);
}
private static void merge<T>(ref List<T> inputData, int firstIndex, int midIndex, int lastIndex)
where T: IComparable<T>
{
int currentLeft = firstIndex;
int currentRight = midIndex + 1;
T[] tempData = new T[(lastIndex - firstIndex) + 1];
int tempPos = 0;
// Check the items at the leftmost index of the two halves and compare
// them. Add the items in ascending order into the tempData array.
while (currentLeft <= midIndex && currentRight <= lastIndex)
if (inputData.ElementAt(currentLeft).CompareTo(inputData.ElementAt(currentRight)) < 0)
{
tempData[tempPos++] = inputData.ElementAt(currentLeft++);
}
else
{
tempData[tempPos++] = inputData.ElementAt(currentRight++);
}
// If there are any remaining items to be added to the tempData array,
// add them.
while (currentLeft <= midIndex)
{
tempData[tempPos++] = inputData.ElementAt(currentLeft++);
}
while (currentRight <= lastIndex)
{
tempData[tempPos++] = inputData.ElementAt(currentRight++);
}
// Now that the items have been sorted, copy them back into the inputData
// reference that was passed to this function.
tempPos = 0;
for (int i = firstIndex; i <= lastIndex; i++) {
inputData.Insert(firstIndex, tempData.ElementAt(tempPos));
}
}
}
}
My issue: I am getting a Compiler error in the Main method of the Program class; however, shouldn't I have to supply the mergeSort function the parametrized type when I call it statically?
I am getting the error "The best overloaded method match for... has some invalid arguments."
I would greatly appreciate any implementation suggestions and/or any way of correcting this error. Note, I am most comfortable in Java, and since C# doesn't directly support wildcards this approach is foreign to me. Any explanations on this would be appreciated as well.
You could remove ref from all of your parameters since you do not seem to be using its functionality.
You also do not need to supply the generic type argument in most cases, because the compiler will infer the type for you. So this should work (assuming you've removed ref from the parameters):
Mergesort.mergeSort(testList);
Also List<T> and arrays have indexers so you can get at specific elements via inputData[index] instead of ElementAt. It's just less typing that way.
mergeSort requires a ref parameter, so the call site needs the ref keyword as well. This should work:
Mergesort.mergeSort<int>(ref testList);
The ref keyword causes an argument to be passed by reference, not by
value. The effect of passing by reference is that any change to the
parameter in the method is reflected in the underlying argument
variable in the calling method. The value of a reference parameter is
always the same as the value of the underlying argument variable.
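Combining the suggestions above (no ref, indexers instead of ElementAt, inferred type argument), a minimal sketch of the sort could look like the following. The class and method names are placeholders chosen for this example, not the asker's. It also overwrites elements when copying back, whereas the question's final loop calls Insert, which would grow the list.

```csharp
using System;
using System.Collections.Generic;

public static class MergeSortSketch
{
    // Sorts the list in place. No ref is needed: List<T> is a reference type,
    // so the method already works on the caller's list object.
    public static void Sort<T>(List<T> data) where T : IComparable<T>
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        Sort(data, 0, data.Count - 1);
    }

    private static void Sort<T>(List<T> data, int first, int last) where T : IComparable<T>
    {
        if (first >= last) return;             // zero or one element: nothing to do
        int mid = first + (last - first) / 2;  // same midpoint, without overflow risk
        Sort(data, first, mid);
        Sort(data, mid + 1, last);
        Merge(data, first, mid, last);
    }

    private static void Merge<T>(List<T> data, int first, int mid, int last) where T : IComparable<T>
    {
        var temp = new T[last - first + 1];
        int left = first, right = mid + 1, pos = 0;

        // Take the smaller head element; <= keeps equal items in their original order.
        while (left <= mid && right <= last)
            temp[pos++] = data[left].CompareTo(data[right]) <= 0 ? data[left++] : data[right++];

        while (left <= mid) temp[pos++] = data[left++];
        while (right <= last) temp[pos++] = data[right++];

        // Copy back by overwriting the original positions.
        for (int i = 0; i < temp.Length; i++)
            data[first + i] = temp[i];
    }
}
```

With this version the call from Main becomes MergeSortSketch.Sort(testList); and the compiler infers T as int.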
Let's say we want to implement a sum algorithm. I use C# as an illustration here:
// Iterative
int sum(int[] array) {
int result = 0;
foreach(int item in array) {
result += item;
}
return result;
}
which is equivalent to
// Recursive
int sum(int[] array) {
if(array.Length == 0) {
return 0;
}
// suppose there is a SubArray function here
return array[0] + sum(array.SubArray(1));
}
However, if we want to add a condition to the algorithm where we don't want to add the integer at index 2 to our result, we only need to add one conditional statement to our first (iterative) implementation.
Q: Is there any adaptation to our recursive one to make it work?
The recursive version is inefficient due to the repeated SubArray calls, making the time complexity O(n^2). You can rewrite this function to accept an additional index parameter, which also happens to be how you can implement skipping a particular index (or set of indices, if you choose).
In C#:
private static int SumSkipIndex(int[] arr, int skip, int i)
{
if (i >= arr.Length) return 0;
return (i == skip ? 0 : arr[i]) + SumSkipIndex(arr, skip, i + 1);
}
If you don't like the added i parameter which changes the function header, just write a separate private recursive "helper" function that can be called from the wrapper with your preferred header.
I'm also assuming you don't wish to hardcode index 2 into the algorithm (if you do, remove the skip parameter and replace i == skip with i == 2).
using System;
class MainClass
{
private static int SumSkipIndex(int[] arr, int skip, int i)
{
if (i >= arr.Length) return 0;
return (i == skip ? 0 : arr[i]) + SumSkipIndex(arr, skip, i + 1);
}
public static int SumSkipIndex(int[] arr, int skip)
{
return SumSkipIndex(arr, skip, 0);
}
public static void Main(string[] args)
{
Console.WriteLine(SumSkipIndex(new int[]{16, 11, 23, 3}, 1)); // => 42
}
}
Lastly, bear in mind that recursion is a terrible choice for this sort of algorithm (summing an array), even with the index version. We have to call a new function just to handle one number, meaning we have a lot of call overhead (allocating stack frames) and can easily blow the stack if the list is too long. But I'm assuming this is just a learning exercise.
A concise solution can be written in C# 8 using array slices.
public static int SumArray(int[] arr, int exclude){
if(arr.Length == 0){
return 0;
}
return (exclude==0?0:arr[0]) + SumArray(arr[1..], exclude-1);
}
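The ternary operator checks whether the exclude index is 0 and, if it is, contributes 0 instead of the current element; the index is decremented on every recursive call. The array is reduced using the slice. Note that slicing an array with a range (arr[1..]) allocates a new array and copies the remaining elements, so it is unlikely to be more performant than SubArray and the overall cost is still O(n^2).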
EDIT: As the other answer has suggested, this causes a bloating of stack frames because the recursive call is not in tail position. The solution below puts the call in tail position by passing the running sum as a parameter. Where the runtime applies tail call optimisation, the recursive call can reuse the same stack frame rather than creating a new one to await the return value before completing the sum. The C# compiler does not guarantee this, so very long arrays can still overflow the stack.
public static int SumArray(int[] arr, int exclude, int sum=0){
if(arr.Length == 0){
return sum;
}
return SumArray(arr[1..], exclude-1, sum + (exclude==0?0: arr[0]));
}
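If the goal is to keep the slicing style without copying the array on every call, one option is to recurse over a ReadOnlySpan<int> instead. This is only a sketch and assumes a runtime where System.ReadOnlySpan<T> is available (for example .NET Core 2.1 or later); the class and method names are made up for this example.

```csharp
using System;

public static class SpanSum
{
    // Sums every element except the one at position 'exclude'
    // (counted from the start of the span passed in).
    public static int SumSkipIndex(ReadOnlySpan<int> values, int exclude)
    {
        if (values.IsEmpty) return 0;

        int head = exclude == 0 ? 0 : values[0];

        // Slice(1) is a view over the same memory; no elements are copied.
        return head + SumSkipIndex(values.Slice(1), exclude - 1);
    }
}
```

An int[] converts implicitly to ReadOnlySpan<int>, so SpanSum.SumSkipIndex(new[] { 16, 11, 23, 3 }, 1) can be called directly. The recursion depth is still one frame per element, so the stack concern above applies unchanged.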
I created a class that wraps a double[] array, but I cannot change the elements of the underlying array through it. These are the tests in question:
public void SetCorrectly ()
public void IndexerDoesNotCopyArray ()
The problem statement is as follows. Write the class Indexer, which is created as a wrapper over a double[] array and gives access to a subarray of some length, starting at some element. Your solution must pass the tests contained in the project. As always, you must ensure the integrity of the data in Indexer.
Here is my code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Incapsulation.Weights
{
public class Indexer
{
double[] array;
int start;
int length;
public int Length
{
get { return length; }
}
public Indexer(double[] array, int start, int length)
{
if (start < 0 || start >= array.Length) throw new ArgumentException();
this.start = start;
if (length < start || length > array.Length) throw new ArgumentException();
this.length = length;
this.array = array.Skip(start).Take(length).ToArray();
}
public double this[int index]
{
get { return array[index]; }
set { array[index] = value; }
}
}
}
These are the tests:
using NUnit.Framework;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Incapsulation.Weights
{
[TestFixture]
public class Indexer_should
{
double[] array = new double[] { 1, 2, 3, 4 };
[Test]
public void HaveCorrectLength()
{
var indexer = new Indexer(array, 1, 2);
Assert.AreEqual(2, indexer.Length);
}
[Test]
public void GetCorrectly()
{
var indexer = new Indexer(array, 1, 2);
Assert.AreEqual(2, indexer[0]);
Assert.AreEqual(3, indexer[1]);
}
[Test]
public void SetCorrectly()
{
var indexer = new Indexer(array, 1, 2);
indexer[0] = 10;
Assert.AreEqual(10, array[1]);
}
[Test]
public void IndexerDoesNotCopyArray()
{
var indexer1 = new Indexer(array, 1, 2);
var indexer2 = new Indexer(array, 0, 2);
indexer1[0] = 100500;
Assert.AreEqual(100500, indexer2[1]);
}
}
}
The problem you are facing here is that the last test requires that your wrapper provides access to the underlying array. In other words, whatever number of Indexers are created, they all point to the same underlying array.
Your line this.array = array.Skip(start).Take(length).ToArray(); violates this requirement by creating a new array instance. Because of this, the value changed through the first indexer is not reflected in the second one: they point to different memory areas.
To fix this, instead of creating a new array using LINQ, simply store the original array passed to the constructor. Your this[] indexer must then account for the start offset and the length itself, adding start to the index and checking the bounds manually.
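A minimal sketch of that fix follows. The exception types are assumptions, since the tests shown in the question do not reveal which exceptions the hidden tests expect, and the range check is one reasonable reading of "integrity of the data".

```csharp
using System;

public class Indexer
{
    private readonly double[] array; // the caller's array: shared, never copied
    private readonly int start;

    public int Length { get; }

    public Indexer(double[] array, int start, int length)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));

        // The window [start, start + length) has to lie inside the array.
        if (start < 0 || length < 0 || start > array.Length - length)
            throw new ArgumentException("start and length must describe a range inside the array.");

        this.array = array;
        this.start = start;
        Length = length;
    }

    public double this[int index]
    {
        get { CheckIndex(index); return array[start + index]; }
        set { CheckIndex(index); array[start + index] = value; }
    }

    // Indexes are relative to the window, so they run from 0 to Length - 1.
    private void CheckIndex(int index)
    {
        if (index < 0 || index >= Length)
            throw new IndexOutOfRangeException();
    }
}
```

Because every Indexer keeps a reference to the same array, a write through one instance is visible through any other instance whose window covers that element, which is what IndexerDoesNotCopyArray checks.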
All LINQ extension methods create a new enumeration; they do not mutate or return the one the method is called on:
var newArray = array.Skip(...).ToArray();
ReferenceEquals(array, newArray); //returns false
Any change you might make in an element of newArray will not change anything whatsoever in array.
Your SetCorrectly test is comparing indexer and array and it will always fail. Your other test also fails because indexer1 and indexer2 reference two different arrays.
However, because LINQ is lazy, a modification of array can show up in the result of the LINQ extension method, depending on when you materialize the enumeration. This can happen:
var skipped = array.Skip(1); // deferred execution
array[1] = 42; // some different value
var newArray = skipped.ToArray(); // Skip is materialized here!
bool same = newArray[0] == array[1]; // true! newArray[0] is the element that was at index 1
As input I have an object that implements IDataRecord (a row of some abstract table), so it has an indexer, and by giving it an integer I can retrieve an object of some type. As output, my code must return a range of cells in that row as an array of objects of a given type.
So I've written this method (yes, I know it can easily be converted to an extension method, but I don't need that, and I also don't want this method visible outside of my class):
private static T[] GetRange<T>(IDataRecord row, int start, int length)
{
var result = new List<T>();
for (int i = start; i < (start + length); i++)
{
result.Add((T)row[i]);
}
return result.ToArray();
}
It works fine, but this method's logic seems like something very common. So, is there any method in the .NET Framework FCL/BCL that gives the same (or almost the same) result?
Use Skip and Take.
var rangeList = result.Skip(start - 1).Take(length);
No, it is not in the BCL.
You should however not create a List<> first and then copy that to the array. Either return the List<> itself (and construct it with the appropriate initial capacity), or create the array immediately like this:
private static T[] GetRange<T>(IDataRecord row, int start, int length)
{
var result = new T[length];
for (int i = 0; i < length; i++)
{
result[i] = (T)row[start + i];
}
return result;
}
Here is an alternative (for all you LINQ lovers):
// NB! Lazy enumeration
private static IEnumerable<T> GetRange<T>(IDataRecord row, int start, int length)
{
return Enumerable.Range(start, length).Select(i => (T)row[i]);
}
We repeat here what was stated in the comments to the question: The interface System.Data.IDataRecord (in System.Data.dll assembly) does not inherit IEnumerable<> or IEnumerable.
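To try the array-based GetRange without a database, one option is to read a row from an in-memory DataTable, whose DataTableReader implements IDataRecord. The sketch below is only an illustration: the table layout, the Demo method and the class name are assumptions made for this example, and GetRange is made public so that it can be called from outside.

```csharp
using System;
using System.Data;

public static class DataRecordRange
{
    // Same logic as the array-based GetRange shown above.
    public static T[] GetRange<T>(IDataRecord row, int start, int length)
    {
        var result = new T[length];
        for (int i = 0; i < length; i++)
        {
            result[i] = (T)row[start + i];
        }
        return result;
    }

    // Builds a one-row table with three int columns and reads cells 1 and 2.
    public static int[] Demo()
    {
        var table = new DataTable();
        table.Columns.Add("a", typeof(int));
        table.Columns.Add("b", typeof(int));
        table.Columns.Add("c", typeof(int));
        table.Rows.Add(10, 20, 30);

        using (DataTableReader reader = table.CreateDataReader())
        {
            reader.Read();                      // position the reader on the first row
            return GetRange<int>(reader, 1, 2); // zero-based start index, two cells
        }
    }
}
```

Note that the cast (T)row[i] only succeeds when the cell's runtime type is exactly T (or a reference type convertible to it); a column of long values cannot be unboxed as int this way.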
If you want to 'TakeARange' you should have a collection as the input parameter.
Here you don't have one.
You just have an IDataRecord (i.e. a single row) that has an indexer.
You should expose a property called Cells that returns the list you work with in the indexer implementation.
Your method should look like this:
private static T[] TakeRange<T>(IEnumerable<T> cells, int start, int length)
{
// Skip(start) treats start as a zero-based index, matching the loop in the question.
return cells.Skip(start).Take(length).ToArray();
}
Well, it seems there is no such method in the FCL/BCL.
I ran into what was to me an unexpected result when testing a simple ForEach extension method.
ForEach method
public static void ForEach<T>(this IEnumerable<T> list, Action<T> action)
{
if (action == null) throw new ArgumentNullException("action");
foreach (T element in list)
{
action(element);
}
}
Test method
[TestMethod]
public void BasicForEachTest()
{
int[] numbers = new[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
numbers.ForEach(num =>
{
num = 0;
});
Assert.AreEqual(0, numbers.Sum());
}
Why would numbers.Sum() be equal to 55 and not 0?
num is the copy of the value of the current element you are iterating over. So you are just changing the copy.
What you do is basically this:
foreach(int num in numbers)
{
num = 0;
}
Surely you do not expect this to change the content of the array?
Edit: What you want is this:
for (int i = 0; i < numbers.Length; i++)
{
numbers[i] = 0;
}
In your specific case you could maintain an index in your ForEach extension method and pass that as second argument to the action and then use it like this:
numbers.ForEachWithIndex((num, index) => numbers[index] = 0);
However, in general, creating LINQ-style extension methods which modify the collection they are applied to is bad style (IMO). If you write an extension method which cannot be applied to an IEnumerable<T>, you should think hard about whether you really need it (especially when you write it with the intention of modifying the collection). You have not much to gain but much to lose (like unexpected side effects). I'm sure there are exceptions, but I stick to that rule and it has served me well.
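The answer names ForEachWithIndex without showing it. A minimal sketch of such a helper could look like this; it is a hypothetical method, not part of the BCL, and the containing class name is made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableIndexExtensions
{
    // Hypothetical helper: calls the action with each element and its zero-based position.
    public static void ForEachWithIndex<T>(this IEnumerable<T> source, Action<T, int> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));

        int index = 0;
        foreach (T element in source)
        {
            action(element, index);
            index++;
        }
    }
}
```

The element passed to the action is still a copy; the array only changes because the lambda writes through numbers[index]. This is fine for arrays, but writing to a List<T> while it is being enumerated may throw an InvalidOperationException, which is one more reason to prefer a plain for loop when the intent is to modify the collection.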
Because num is a copy.
It's as if you were doing this:
int i = numbers[0];
i = 0;
You wouldn't expect that to change numbers[0], would you?
Because int is a value type and each element is passed to your action as a value parameter. Thus the lambda receives a copy of the element, and assigning to num only changes that copy. The values stored in the numbers array that is initialized in the BasicForEachTest method are never modified.
Check this article by Jon Skeet to read more on value types and value parameters.
I am not claiming that the code in this answer is useful, but (it works and) I think it illustrates what you need in order to make your approach work. The argument must be marked ref. The BCL does not have a delegate type with ref, so just write your own (not inside any class):
public delegate void MyActionRef<T>(ref T arg);
With that, your method becomes:
public static void ForEach2<T>(this T[] list, MyActionRef<T> actionRef)
{
if (actionRef == null)
throw new ArgumentNullException("actionRef");
for (int idx = 0; idx < list.Length; idx++)
{
actionRef(ref list[idx]);
}
}
Now, remember to use the ref keyword in your test method:
numbers.ForEach2((ref int num) =>
{
num = 0;
});
This works because it is OK to pass an array entry ByRef (ref).
If you want to extend IList<> instead, you have to do:
public static void ForEach3<T>(this IList<T> list, MyActionRef<T> actionRef)
{
if (actionRef == null)
throw new ArgumentNullException("actionRef");
for (int idx = 0; idx < list.Count; idx++)
{
var temp = list[idx];
actionRef(ref temp);
list[idx] = temp;
}
}
Hope this helps your understanding.
Note: I had to use for loops. In C#, in foreach (var x in Yyyy) { /* ... */ }, it is not allowed to assign to x (which includes passing x ByRef (with ref or out)) inside the loop body.
Back from an interview. I am sharing the question with you; a good and precise answer is welcome.
The task: you have a static method that receives an IList<int>. You have
to get back the values that are divisible by 3, and write the code.
Constraint:
The original list (in Main) has a reference on the stack and its values on the heap.
The result must be returned (it's a void method) in the same space (on the heap) as the original list. The solution shown here is not correct because inside the method a new pointer
on the stack and a new object on the heap are created in the method's scope. Solution?
Bonus: how would you change the code to accept not only int but also float, double, ...?
static void Main(string[] args)
{
IList<int> list = new List<int>() { 9, 3, 10, 6, 14, 16, 20};
CanBeDivedByThree(list);
}
static void CanBeDivedByThree(IList<int> list)
{
list = (from p in list
where p % 3 == 0
orderby p descending
select p).ToList<int>();
}
That's meaningless, as the internal storage of an IList is not under your control. Adding (or possibly removing) items might re-allocate the internal data structures.
It is especially meaningless as the list in your sample contains value types which are copied anyway when you access them.
Last but not least it's basically the whole point of using a managed language that you don't have to worry about memory (al)locations. Such things are implementation details of the platform.
To take up on your bonus question: There is no simple way to achieve that. One could think that using generics with a type constraint would solve the problem here (something like static void CanBeDivedByThree<T>(IList<T> list) where T : struct), but the problem is that C# does not (yet?) have support for generic arithmetic. C# doesn't have a modulo operator that can take a generic parameter of type 'T' and 'int'.
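Later language versions did add generic arithmetic: C# 11 with .NET 7 introduced the System.Numerics.INumber<T> interface. The following is only a sketch assuming such a runtime; it does not apply to the versions this answer was written against, and the class and method names are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class GenericDivisibility
{
    // Removes, in place, every element that is not divisible by three.
    // Requires C# 11 / .NET 7 or later for INumber<T>.
    public static void KeepDivisibleByThree<T>(IList<T> list) where T : INumber<T>
    {
        T three = T.CreateChecked(3);

        // Walk backwards so RemoveAt does not shift items still to be visited.
        for (int i = list.Count - 1; i >= 0; --i)
        {
            if (list[i] % three != T.Zero)
                list.RemoveAt(i);
        }
    }
}
```

The same method then accepts IList<int>, IList<double>, IList<decimal> and so on, although for floating-point types an exact remainder of zero is a fragile test.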
list.RemoveAll(n => n % 3 != 0);
or
for (int i = list.Count - 1; i >= 0; --i)
{
if (list[i] % 3 != 0)
list.RemoveAt(i);
}
The first approach works only for List<T>.
One could make it a generic method, but the remainder operation doesn't make much sense on floats.
Unfortunately, only List<T> implements RemoveAll; IList<T> does not. So I first implement it as an extension method.
public static int RemoveAll<T>(this IList<T> list, Predicate<T> match)
{
if (match == null)
throw new ArgumentNullException("match");
int destIndex=0;
int srcIndex;
for(srcIndex=0;srcIndex<list.Count;srcIndex++)
{
if(!match(list[srcIndex]))
{
//if(srcIndex!=destIndex)//Small optimization, can be left out
list[destIndex]=list[srcIndex];
destIndex++;
}
}
for(int removeIndex=list.Count-1;removeIndex>=destIndex;removeIndex--)
{
list.RemoveAt(removeIndex);
}
return srcIndex-destIndex;
}
Then you can use:
list.RemoveAll(n => n % 3 != 0);
You can then use overloads for other types. Unfortunately you can't (easily) make it generic since generics don't work with operator overloading.
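The original attempt also ordered the result descending, which IList<T> cannot do directly because it has no Sort method. A sketch that filters and orders inside the caller's list instance, using only the IList<int> members, might look like this; the class and method names are made up for this example, and the insertion sort is chosen for brevity rather than speed.

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceFilter
{
    // Keeps only multiples of three, sorted descending, in the same list object.
    public static void KeepDivisibleByThreeDescending(IList<int> list)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));

        // Walk backwards so RemoveAt does not shift items still to be visited.
        for (int i = list.Count - 1; i >= 0; --i)
        {
            if (list[i] % 3 != 0)
                list.RemoveAt(i);
        }

        // Insertion sort through the indexer (descending order).
        for (int i = 1; i < list.Count; i++)
        {
            int current = list[i];
            int j = i - 1;
            while (j >= 0 && list[j] < current)
            {
                list[j + 1] = list[j];
                j--;
            }
            list[j + 1] = current;
        }
    }
}
```

Since the method never assigns to its list parameter, the variable in Main still refers to the same object afterwards and sees the filtered, ordered contents.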
Others have covered the list part - this is just for the bonus bit.
You can't do this in a statically typed way using C# generics, but if you're using C# 4 you can do it with dynamic typing. For example:
using System;
using System.Collections.Generic;
class Test
{
static void Main()
{
ShowDivisibleBy3(new List<int> { 1, 3, 6, 7, 9 });
ShowDivisibleBy3(new List<decimal> { 1.5m, 3.3m, 6.0m, 7m, 9.00m });
}
static void ShowDivisibleBy3<T>(IEnumerable<T> source)
{
foreach (dynamic item in source)
{
if (item % 3 == 0)
{
Console.WriteLine(item);
}
}
}
}