How can I dynamically generate and populate a multi-dimensional array - c#

I'm working on a serializer and have run into a real wall with multi-dimensional arrays. If I use Activator.CreateInstance() it creates a one dimensional array just fine, but it fails to work when used as follows:
var indices = new[] { 2, 2 };
Type type = typeof(int[,]);
var result = Activator.CreateInstance(type, indices) as Array;
If I instead use Array.CreateInstance() to generate my array, it works for single and multi-dimensional arrays alike. However, all my calls to the SetValue() method on the array, which I use to dynamically set values, generate an exception, whereas SetValue() works fine on the single-dimensional arrays I created using Activator.CreateInstance(). I'm really struggling to find a viable solution that allows me to dynamically create an array of any dimension/size and then populate the array with values. I'm hoping someone with more reflection experience can shed some light on this.
When trying to create a multi-dimensional array with Activator I get the exception:
Constructor on type 'System.Int32[,]' not found.
When I instead use Array.CreateInstance() and then call SetValue() I get the following exception from the SetValue() call:
Object cannot be stored in an array of this type.
Which frankly makes no sense to me since the value is an int and the array is an int[,].
I am using the 4.5 framework for my project though I recreated the problem with 4.6 as well.

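Array.CreateInstance expects the element type, not the array type. If you pass typeof(int[,]), you get an array whose elements are themselves int[,], which would explain why storing an int fails. Call it with the actual element type, which is int in this case: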
var indices = new[] { 2, 3 };
var arr = Array.CreateInstance(typeof(int), indices);
Then you can populate the array with SetValue without any exception. For example:
var value = 1;
for (int i = 0; i < indices[0]; i++)
{
    for (int j = 0; j < indices[1]; j++)
    {
        arr.SetValue(value++, new[] { i, j });
    }
}
// arr = [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]
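The nested loops above are tied to two dimensions. For a serializer that has to handle any rank, one possible approach is to walk the index vector like an odometer. The sketch below is only an illustration: ArrayFiller, CreateAndFill and the valueAt callback are hypothetical names, not part of the .NET API, and it assumes every dimension starts at index 0.

```csharp
using System;

public static class ArrayFiller
{
    // Hypothetical helper: creates an array with the given element type and
    // dimension lengths, then fills it in row-major order. valueAt receives
    // the running element number (0, 1, 2, ...) and returns the value to store.
    public static Array CreateAndFill(Type elementType, int[] lengths, Func<int, object> valueAt)
    {
        Array arr = Array.CreateInstance(elementType, lengths);
        var index = new int[lengths.Length]; // starts at [0, 0, ..., 0]

        for (int n = 0; n < arr.Length; n++)
        {
            arr.SetValue(valueAt(n), index);

            // Advance the index vector like an odometer: bump the last
            // dimension and carry into earlier ones when it overflows.
            for (int d = index.Length - 1; d >= 0; d--)
            {
                if (++index[d] < lengths[d])
                    break;
                index[d] = 0;
            }
        }
        return arr;
    }
}
```

Calling ArrayFiller.CreateAndFill(typeof(int), new[] { 2, 3 }, n => n + 1) yields the same contents as the two nested loops. If only the array type is known (for example typeof(int[,])), Type.GetElementType() and Type.GetArrayRank() give the element type and the number of dimensions.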

Related

Overload resolution on generic method doesn't work as expected

Last year I asked how to traverse and print jagged arrays, without having to write an overloaded function for each dimension that gets added. Generic printing of jagged arrays.
I picked up the problem again and was able to solve it like this. It is similar to one of the answers I got, but not quite the same.
static string Print<T>(T[] array)
{
    string str = "[ ";
    for (int i = 0; i < array.Length; i++)
    {
        str += array[i];
        if (i < array.Length - 1)
            str += ", ";
    }
    return str + " ]\n";
}
static string Print<T>(T[][] array)
{
    string str = "";
    for (int i = 0; i < array.Length; i++)
    {
        var sub = array[i];
        if (sub.Length != 0 && sub[0] is Array)
            str += PrintDynamic(sub);
        else
            str += Print(sub);
    }
    return str + "\n";
}
private static string PrintDynamic(dynamic array)
{
    return Print(array);
}
It works fine and I get the correct output:
var twoDim = new int[][]
{
    new int[] { 0, 1, 2, 3 },
    new int[] { 0, 1, 2 },
    new int[] { 0 }
};
var threeDim = new int[][][] { twoDim, twoDim };
Console.WriteLine(Print(threeDim));
// Output:
// [ 0, 1, 2, 3 ]
// [ 0, 1, 2 ]
// [ 0 ]
//
// [ 0, 1, 2, 3 ]
// [ 0, 1, 2 ]
// [ 0 ]
But I'm still not satisfied, because it would be a lot nicer if I didn't need PrintDynamic() and if I could just write
str += Print(sub);
instead of
str += PrintDynamic(sub);
That's where my question comes from. If I change that one line, I do not get any errors, but the output becomes
// [ System.Int32[], System.Int32[], System.Int32[], System.Int32[]]
// [ System.Int32[], System.Int32[], System.Int32[]]
// [ System.Int32[] ]
//
// [ System.Int32[], System.Int32[], System.Int32[], System.Int32[]]
// [ System.Int32[], System.Int32[], System.Int32[]]
// [ System.Int32[] ]
because Print<T>(T[] array) gets called instead of Print<T>(T[][] array). How does the compiler know which Print<T>() to use, when it's called from PrintDynamic(dynamic array), but doesn't when it's called from within Print<T>()?
To answer your original question: when you call
str += Print(sub)
and the original object for the method was int[][][], then the <T> for the method is int[]. So you are calling Print(sub) with T[], where T is int[].
Therefore, the T[] overload of Print is selected, with T as int[] - and everything follows as expected. This is compile time resolved, and is the best the compiler can do with the information it has.
Remember, a generic method is only compiled to IL once - you don't get 'different versions' for different ways it happens to be called (unlike with C++ templates). The generic method's behaviour must be valid for all possible inputs. So if the method receives T[][], and you extract sub-elements internally, it can only consider the sub-object type to be T[]. It can't detect at runtime that 'oh, actually, T is an int[], so I'll call a different overload'. The method will only ever call the T[] overload of Print(sub), no matter what the input.
However, in the case where you use dynamic, you're ignoring all the generic type information baked in at compile time, and saying 'what is this type actually NOW, at runtime, using reflection. Which method is the best match now? Use that one!'. This behaviour has a significant overhead, and therefore must be explicitly requested by using the dynamic keyword.
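The difference between the two bindings can be made visible with a small experiment. This is a minimal sketch, not the asker's code: OverloadDemo, Which, FirstStatic and FirstDynamic are made-up names, and each overload simply reports which one was chosen and what T was inferred to be.

```csharp
using System;

public static class OverloadDemo
{
    // Two overloads that only report which one ran.
    public static string Which<T>(T[] a) { return "T[] with T = " + typeof(T).Name; }
    public static string Which<T>(T[][] a) { return "T[][] with T = " + typeof(T).Name; }

    // Bound at compile time: inside this method outer[0] is only known
    // to be a T[], so the T[] overload is chosen for every caller.
    public static string FirstStatic<T>(T[][] outer)
    {
        return Which(outer[0]);
    }

    // Bound at run time: the cast to dynamic makes the runtime binder
    // redo overload resolution against the actual type of outer[0].
    public static string FirstDynamic<T>(T[][] outer)
    {
        return Which((dynamic)outer[0]);
    }
}
```

With an int[][][] argument, FirstStatic reports the T[] overload with T = Int32[], while FirstDynamic reports the T[][] overload with T = Int32, which mirrors the behaviour seen with Print and PrintDynamic.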
"How does the compiler know which Print() to use, when it's called from PrintDynamic(dynamic array)"
Virtual function tables are part of the picture, but they are not what selects the overload. Since you are working with C#, all types inherit from the class "object". A plain call to Print() from inside the generic method binds to the T[] overload, which prints the element objects themselves and not their content. Why? Because the ToString() method is called on each element, and arrays do not override it, so you get the type name. ToString() is a virtual method: each object is a reference to its data structure (generally on the heap), and the first entry of this data structure is a pointer to the virtual function table for that object's type. A virtual function table is essentially an array of function pointers, one for each virtual function that the class supports. This mechanism picks the right override for the runtime type, but it does not choose between overloads; that choice is normally made by the compiler from the static types. When you go through PrintDynamic, the dynamic argument makes the runtime binder repeat overload resolution using the actual type of the object, so the more specific overload can be called. This is a high-level description of the process. Virtual dispatch works similarly in languages such as C++. I hope this helps you understand a bit more about what happens behind the scenes. I'd recommend some academic reading or perhaps the following link for some more details on virtual method tables.
https://en.wikipedia.org/wiki/Virtual_method_table
If I were you, as this is a problem that can't really be resolved at compile time for arbitrary dimensions, I'd avoid using generics altogether:
public static string Print(Array array)
{
    string str = "[ ";
    for (int i = 0; i < array.Length; i++)
    {
        var element = array.GetValue(i);
        if (element is Array)
            str += Print(element as Array);
        else
        {
            str += element;
            if (i < array.Length - 1)
                str += ", ";
        }
    }
    return str + " ]";
}
This produces nested output, which I think is nicer, and it will nest arbitrarily as the depth increases.
[ [ [ 0, 1, 2, 3 ][ 0, 1, 2 ][ 0 ] ][ [ 0, 1, 2, 3 ][ 0, 1, 2 ][ 0 ] ] ]
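In the output above, nested arrays are written back to back without a separator. A possible variant, shown here only as a sketch, puts a comma between every pair of siblings and uses a StringBuilder instead of repeated string concatenation. NestedPrinter and Format are hypothetical names, and like the method above it assumes jagged (rank-1) arrays, since Array.GetValue(int) throws for multi-dimensional arrays.

```csharp
using System;
using System.Text;

public static class NestedPrinter
{
    // Recursively formats a jagged array, separating all siblings
    // (values and nested arrays alike) with ", ".
    public static string Format(Array array)
    {
        var sb = new StringBuilder("[ ");
        for (int i = 0; i < array.Length; i++)
        {
            if (i > 0)
                sb.Append(", ");

            object element = array.GetValue(i);
            var nested = element as Array;
            sb.Append(nested != null ? Format(nested) : Convert.ToString(element));
        }
        return sb.Append(" ]").ToString();
    }
}
```

For new int[][] { new[] { 0, 1 }, new[] { 2 } } this returns "[ [ 0, 1 ], [ 2 ] ]".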

What is the structure of an Array ?

Well, I know that an array in C# is an object, but a bit of code has confused me:
int[] numbers = {4, 5, 6, 1, 2, 3, -2, -1, 0};
foreach (int i in numbers)
    Console.WriteLine(i);
To access an arbitrary property of an arbitrary object you write int value = obj.property;
This loop is kind of accessing the properties too, but how?
And what is the property itself here? How are the elements organized?
How data is stored
Basically an array is a blob of data. An int is a value type: a 32-bit signed integer.
Identifiers in C# are either references (pointers) to objects, or the actual values. In the case of reference types they are real pointers; in the case of value types (e.g. int, float, etc.) they are the actual data. int is a value type, int[] (array of integers) is a reference type.
The reason it works like this is basically "efficiency": the overhead of copying 4 or 8 bytes for a value type is very small, while the overhead of copying an entire array can be quite extensive.
If you have an array containing N integers, it's little more than a blob of N*4 bytes (plus a small header holding the type and length), with the variable pointing to it. There is no name for each element in the blob.
E.g.:
int[] foo = new int[10]; // allocates 40 bytes, naming the whole thing 'foo'
int f = foo[2]; // allocates variable 'f', copies the value of 'foo[2]' into 'f'.
Access the data
As for foreach... In C#, collections implement an interface named IEnumerable (usually the generic IEnumerable<T>), and foreach works on anything that can be enumerated this way. In this case the compiler notices that it's an integer array and simply loops through all the elements. In other words:
foreach (int f in foo) // copy value foo[0] into f, then foo[1] into f, etc
{
    // code
}
is (in the case of arrays!) the exact same thing as:
for (int i = 0; i < foo.Length; ++i)
{
    int f = foo[i];
    // code
}
Note that I explicitly put "in the case of arrays" here. Arrays are an exceptional case for the C# compiler. If you're not working with arrays (but, for example, with a List, a Dictionary or something more complex), it works a bit differently, namely by using an enumerator and IDisposable. Note that this is just a compiler optimization; arrays are perfectly capable of handling IEnumerable.
For those interested, basically it'll generate this for non-arrays and non-strings:
var e = myEnumerable.GetEnumerator();
try
{
    while (e.MoveNext())
    {
        var f = e.Current;
        // code
    }
}
finally
{
    IDisposable d = e as IDisposable;
    if (d != null)
    {
        d.Dispose();
    }
}
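To check that the expansion above behaves like foreach, the two forms can be put side by side. This is a small sketch with made-up names (EnumerationDemo, SumWithForeach, SumWithEnumerator); the using statement stands in for the try/finally with Dispose.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationDemo
{
    // The form you normally write.
    public static int SumWithForeach(IEnumerable<int> source)
    {
        int total = 0;
        foreach (int value in source)
            total += value;
        return total;
    }

    // Roughly what the compiler produces for non-array sources:
    // get an enumerator, call MoveNext until it returns false,
    // and dispose the enumerator at the end.
    public static int SumWithEnumerator(IEnumerable<int> source)
    {
        int total = 0;
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
                total += e.Current;
        }
        return total;
    }
}
```

Both methods return the same result for any sequence, for example 6 for a List<int> containing 1, 2 and 3.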
If you want a name
You probably need a Dictionary.
An array is just a collection, or IEnumerable<T> where T represents a particular type; in your case int (System.Int32)
Memory allocation...
An int is 32 bits long, and you want an array of 10 items
int[] numbers = new int[10];
32 x 10 = 320 bits of memory allocated
Memory access...
Say your array starts at 0000:0000 and you want to access index n [0-9] of the array...
Pseudo code...
index = n
addressof(n) = 0000:0000 + (sizeof(int) * index)   // sizeof(int) is 4 bytes (32 bits)
or
index = 2
addressof(n) = 0000:0000 + (4 * 2)
This is a simplified example of what happens when you access each indexed item within the array (your foreach loop sort of does this).
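The same offset arithmetic can be tried from safe code. The sketch below is only an illustration of the pseudo code: ByteOffsetDemo and ReadAtByteOffset are hypothetical names, and it relies on Buffer.BlockCopy taking its offsets in bytes rather than in elements.

```csharp
using System;

public static class ByteOffsetDemo
{
    // Reads element 'index' by computing its byte offset by hand:
    // offset = sizeof(int) * index, i.e. 4 bytes per element.
    public static int ReadAtByteOffset(int[] numbers, int index)
    {
        var bytes = new byte[sizeof(int)];
        Buffer.BlockCopy(numbers, index * sizeof(int), bytes, 0, sizeof(int));
        return BitConverter.ToInt32(bytes, 0);
    }
}
```

ReadAtByteOffset(numbers, 2) returns the same value as numbers[2]; the indexer does this arithmetic for you, along with a bounds check.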
A foreach loop works by visiting each element in the array in turn and copying its value into the loop variable (in your case, the variable is called i).
Why you are NOT accessing by property...
In an array, items are stored by index, not by name...
... so you can't...
numbers.1
numbers.2
...but you can...
numbers[1]
numbers[2]
Since every object derives from object, you can access arbitrary members of the particular type...
numbers[1].GetHashCode();
Example:
// Define an array of integers, shorthand (see memory allocation)...
int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
// For each element in the array (under the hood, compute the element's address in memory and copy its value into a variable called i)...
foreach (int i in numbers)
{
    // Call GetHashCode() on the value held in i and pass the result (by value) to the Console.WriteLine method.
    Console.WriteLine(i.GetHashCode());
}

Generically accessing multidimensional arrays in C#

C# allows creating and populating multidimensional arrays, here is a simple example:
public static void Main(String[] args)
{
    var arr = (int[,])CreateArray(new [] {2, 3}, 8);
    Console.WriteLine("Value: " + arr[0,0]);
}
// Creates a multidimensional array with the given dimensions, and assigns the
// given x to the first array element
public static Array CreateArray<T>(int[] dimLengths, T x)
{
    var arr = Array.CreateInstance(typeof(T), dimLengths);
    var indices = new int[dimLengths.Length];
    for (var i = 0; i < indices.Length; i++)
        indices[i] = 0;
    arr.SetValue(x, indices); // Does boxing/unboxing
    return arr;
}
This works well. However, for some reason there is no generic version of Array.SetValue(), so the code above does boxing/unboxing, which I'd like to avoid. I was wondering if I missed something or if this is an omission in the .NET API?
No, you are not missing anything: Array does not have an option that sets the value without boxing and unboxing.
You do have an alternative to this with LINQ expression trees, but it is probably going to be slower than boxing/unboxing for a single element, because compiling a dynamic lambda would "eat up" the potential benefits:
// Requires: using System.Linq.Expressions;
public static Array CreateArray<T>(int[] dimLengths, T x) {
    var arr = Array.CreateInstance(typeof(T), dimLengths);
    var p = Expression.Parameter(typeof(object), "arr");
    // One constant 0 index per dimension
    var ind = new Expression[dimLengths.Length];
    for (var i = 0; i < dimLengths.Length; i++) {
        ind[i] = Expression.Constant(0);
    }
    var v = Expression.Variable(arr.GetType(), "cast");
    var block = Expression.Block(
        new[] {v}
        , new Expression[] {
            // cast = (T[,...])arr; cast[0,...,0] = x; return null;
            Expression.Assign(v, Expression.Convert(p, arr.GetType()))
            , Expression.Assign(Expression.ArrayAccess(v, ind), Expression.Constant(x))
            , Expression.Constant(null, typeof(object))
        }
    );
    Expression.Lambda<Func<object, object>>(block, p).Compile()(arr);
    return arr;
}
If you wanted to set all elements in a loop, you could modify the above to compile a dynamically created lambda with multiple nested loops. In this case, you could get an improvement on having to perform multiple boxing and unboxing in a series of nested loops.
"for some reason there is no generic version of Array.SetValue()"
While it is definitely possible to write a generic method similar to SetValue in the Array class, it may not be desirable. A generic method on a non-generic class would give a false promise of compile-time type safety, which cannot be guaranteed, because the compiler does not know the runtime type of the Array object.
I didn't find any generic way to set a value into an Array instance either, so I guess the only workaround is to use an unsafe context to avoid boxing.
However, there can be no generic version, now that I think of it. See, when you define a generic method method<T>()..., you define the parameter for the method as ...<T>(T[] a)..., where you have to be specific about the number of dimensions, which is one. To declare a two-dimensional parameter, you define it like this: ...<T>(T[,] a)..., and so on.
As you can see, with the current syntax of C#, you simply cannot write a generic method that accepts an array of any number of dimensions.
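Since each rank needs its own static type, one middle ground is to special-case the common ranks and keep SetValue only as a fallback. The following is a sketch of that idea, not an API that exists in .NET: ArrayFactory is a made-up class name, and it assumes every dimension has at least one element, just like the original CreateArray. In the typed branches the assignment goes through T[], T[,] or T[,,] directly, so x is not boxed there.

```csharp
using System;

public static class ArrayFactory
{
    // Creates an array with the given dimension lengths and stores x in
    // its first element, using typed access for ranks 1 to 3.
    public static Array CreateArray<T>(int[] dimLengths, T x)
    {
        Array arr = Array.CreateInstance(typeof(T), dimLengths);

        var rank1 = arr as T[];
        if (rank1 != null) { rank1[0] = x; return arr; }

        var rank2 = arr as T[,];
        if (rank2 != null) { rank2[0, 0] = x; return arr; }

        var rank3 = arr as T[,,];
        if (rank3 != null) { rank3[0, 0, 0] = x; return arr; }

        // Higher ranks: fall back to the boxing path.
        // A new int[] is already filled with zeros.
        arr.SetValue(x, new int[dimLengths.Length]);
        return arr;
    }
}
```

Usage is unchanged from the question: var arr = (int[,])ArrayFactory.CreateArray(new[] { 2, 3 }, 8);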

Checking if two arrays are equal: returns incorrect result

I am trying to check if two arrays of equal size contain the same integers at the same indexes. If any elements are not equal I want to return true, and otherwise return false.
public bool multipleSolutions(int[,] grid)
{
    int[,] front = new int[9, 9];
    front = grid;
    int[,] back = new int[9, 9];
    back = grid;
    front = solve(front);
    back = solveBackwards(back);
    for (int r = 0; r < 9; r++)
    {
        for (int c = 0; c < 9; c++)
        {
            if (back[r, c] != front[r, c])
            {
                return true;
            }
        }
    }
    return false;
}
When tested separately, solve and solveBackwards give two different arrays, but when I try multipleSolutions it still gives me false (since they are two different arrays, I want it to return true).
Since the test logic is correct, the most likely cause of this error is that the implementations of solve and solveBackwards change the array passed in and return that same array.
For both the call to solve and the call to solveBackwards, the array identified by the parameter grid is passed in. So if solve changes the passed-in array, the input for solveBackwards has already been changed by the first run, which might affect solveBackwards. The results wouldn't differ, though, because under the above assumptions, once solveBackwards is done, both front and back refer to the result of solveBackwards.
Assumptions:
solve and solveBackwards alter the array passed in
the return values of solve and solveBackwards are the array passed in
EDIT
Seeing that the assumptions are correct, you could insert this as the first lines in both solve and solveBackwards:
var temp = new int[9, 9];
Array.Copy(grid, 0, temp, 0, grid.Length);
and then use temp throughout the implementation of solve and solveBackwards.
Alternatively, you could do the same for front and back before passing them in as arguments.
However, you should then change the return type of the two methods. Specifying a return type suggests that a different object is returned and that the argument is not mutated. This is also why I'd prefer the first option (copying the array inside the method).
Even better, in my opinion, would be to iterate over the input array and construct an enumerable with the result.
The problem is that you put the grid array in both the front and back variables. When you assign an array to an array variable, it doesn't copy the data from one array to another; it copies the reference to the array object. All your references end up pointing to the same array object.
To copy the data from one array to another you can use the Copy method:
int[,] front = new int[9, 9];
Array.Copy(grid, front, grid.Length);
int[,] back = new int[9, 9];
Array.Copy(grid, back, grid.Length);
Now you have two new array objects that contain the data from the original grid.
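The difference between assigning a reference and copying the data is easy to observe. The sketch below uses hypothetical helper names (GridCopyDemo, CopyGrid, Differ); Differ repeats the comparison loop from the question, but sized from the arrays instead of a fixed 9, and assumes both arrays have the same dimensions.

```csharp
using System;

public static class GridCopyDemo
{
    // Returns a new array with the same dimensions and contents as grid.
    public static int[,] CopyGrid(int[,] grid)
    {
        var copy = new int[grid.GetLength(0), grid.GetLength(1)];
        Array.Copy(grid, copy, grid.Length);
        return copy;
    }

    // True if any element differs between the two equally sized arrays.
    public static bool Differ(int[,] a, int[,] b)
    {
        for (int r = 0; r < a.GetLength(0); r++)
            for (int c = 0; c < a.GetLength(1); c++)
                if (a[r, c] != b[r, c])
                    return true;
        return false;
    }
}
```

If you write var alias = grid; and then change grid[0, 0], Differ(grid, alias) is still false because both names refer to one object, whereas a copy made earlier with CopyGrid keeps the old value and Differ(grid, copy) becomes true.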

Why wouldn't `new int[x]{}` be valid?

In MonoDevelop I have the following code which compiles:
int[] row = new int[indices.Count]{};
However, at run-time, I get:
Matrix.cs(53,53): Error CS0150: A constant value is expected (CS0150) (testMatrix)
I know what this error means, and it forces me to then resize the array:
int[] row = new int[indices.Count]{};
Array.Resize(ref row, rowWidth);
Is this something I just have to deal with because I am using MonoDevelop on Linux? I was certain that under .Net 3.5 I was able to initialize an array with a variable containing the width of the array. Can anyone confirm that this is isolated? If so, I can report the bug to bugzilla.
You can't combine a variable array size with an array initializer: when both are present, the size must be a constant that matches the number of elements. Remove the { }.
When you write:
int[] row = new int[indices.Count];
You are creating a new array of size indices.Count initialized to default values.
When you write:
int[] row = new int[] { 1, 2, 3, 4 };
You are creating an array and then initializing its content to the values [1,2,3,4]. The size of the array is inferred from the number of elements. It's shorthand for:
int[] row = new int[4];
row[0] = 1;
row[1] = 2;
row[2] = 3;
row[3] = 4;
The array is still first initialized to defaults; this syntax just provides a shorthand to avoid having to write those extra assignments yourself.
The following code fails to compile for the same reason on Windows/.NET/LINQPad:
void Main()
{
    int[] row = new int[indices.Count]{};
    row[2] = 10;
    row.Dump();
}
// Define other methods and classes here
public class indices {
    public static int Count = 5;
}
However, removing the initializer ({}) from the declaration makes it work.
In C#, if you want to declare an empty array the syntax should be:
int[] row = new int[indices.Count];
Because when you use array initialization syntax AND specify the size of the array
int[] arr = new int[5]{1,2,3,4,5};
the size of the array is superfluous information. The compiler can infer the size from the initialization list. As others have said, you either create an empty array:
int[] arr = new int[5];
or use the initialization list:
int[] arr = {1,2,3,4,5};
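Putting the valid forms next to each other may help. This is a short sketch with made-up names (ArrayInitDemo and its methods); the commented-out line is the form from the question, which the compiler rejects with CS0150 because count is not a constant.

```csharp
using System;

public static class ArrayInitDemo
{
    // Variable size, no initializer: every element starts at default(int), i.e. 0.
    public static int[] Zeroed(int count)
    {
        // int[] bad = new int[count] { };  // CS0150: a constant value is expected
        return new int[count];
    }

    // Constant size plus initializer: allowed because 5 is a constant
    // and matches the number of elements.
    public static int[] SizedLiteral()
    {
        return new int[5] { 1, 2, 3, 4, 5 };
    }

    // Initializer only: the size (3) is inferred from the list.
    public static int[] InferredLiteral()
    {
        return new[] { 1, 2, 3 };
    }
}
```

Zeroed(indices.Count) is therefore the direct replacement for the line in the question, with no Array.Resize needed.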
