What is the structure of an Array ? - c#

Well I know that Array in C# is an object
but a bit of code has confused me actually
int[] numbers = {4, 5, 6, 1, 2, 3, -2, -1, 0};
foreach (int i in numbers)
Console.WriteLine(i);
To access a property of an object, you normally write int value = obj.property;
This loop seems to be accessing properties as well, but how?
And what is the property itself here? How are the elements organized?

How data is stored
Basically an array is a blob of data. An int is a value type: a 32-bit signed integer.
Identifiers in C# are either pointers to objects, or the actual values. In the case of reference types, they are real pointers; in the case of value types (e.g. int, float, etc.) they are the actual data. int is a value type, int[] (array of integers) is a reference type.
The reason it works like this is basically "efficiency": the overhead of copying 4 bytes or 8 bytes for a value type is very small, while the overhead of copying an entire array can be quite extensive.
If you have an array containing N integers, its data is essentially a blob of N*4 bytes (plus a small header that records the element type and the length), with the variable referring to the whole block. There is no name for each element in the blob.
E.g.:
int[] foo = new int[10]; // allocates 40 bytes, naming the whole thing 'foo'
int f = foo[2]; // allocates variable 'f', copies the value of 'foo[2]' into 'f'.
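The difference between copying a value and copying a reference can be illustrated with a minimal sketch. The class and method names here (CopySemanticsDemo, WriteThroughAlias) are made up for illustration and are not part of any library.

```csharp
using System;

public static class CopySemanticsDemo
{
    // Hypothetical helper: writes through a second variable that refers
    // to the same array, then reads the element back via the original.
    public static int WriteThroughAlias(int[] original)
    {
        int[] alias = original; // copies the reference, not the element data
        alias[0] = 99;          // the write is visible through 'original' too
        return original[0];
    }

    public static void Main()
    {
        int a = 1;
        int b = a; // copies the 4-byte value
        b = 2;     // 'a' is unaffected
        Console.WriteLine(a);                              // 1
        Console.WriteLine(WriteThroughAlias(new int[10])); // 99
    }
}
```

Assigning an int duplicates the data, while assigning an int[] only duplicates the pointer to the blob.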
Access the data
As for foreach... In C#, practically all collections implement an interface named IEnumerable<T> (or its non-generic counterpart IEnumerable). If you use foreach here, the compiler will notice that it's an integer array and loop through all elements by index. In other words:
foreach (int f in foo) // copy value foo[0] into f, then foo[1] into f, etc
{
// code
}
is (in the case of arrays!) the exact same thing as:
for (int i=0; i<foo.Length; ++i)
{
int f = foo[i];
// code
}
Note that I explicitly put "in the case of arrays" here. Arrays are an exceptional case for the C# compiler. If you're not working with arrays (but, for example, with a List, a Dictionary or something more complex), it works a bit differently, namely by using the enumerator and IDisposable. Note that this is just a compiler optimization; arrays are perfectly capable of handling IEnumerable.
For those interested, basically it'll generate this for non-arrays and non-strings:
var e = myEnumerable.GetEnumerator();
try
{
while (e.MoveNext())
{
var f = e.Current;
// code
}
}
finally
{
IDisposable d = e as IDisposable;
if (d != null)
{
d.Dispose();
}
}
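As a rough check of the expansion above, the following sketch iterates the same sequence once with foreach and once with a hand-written enumerator loop. The names (EnumeratorDemo, SumWithForeach, SumWithEnumerator) are hypothetical, and the using statement stands in for the try/finally with Dispose.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorDemo
{
    // Sums the sequence with an ordinary foreach loop.
    public static int SumWithForeach(IEnumerable<int> source)
    {
        int sum = 0;
        foreach (int f in source)
            sum += f;
        return sum;
    }

    // Sums the sequence by driving the enumerator by hand.
    public static int SumWithEnumerator(IEnumerable<int> source)
    {
        int sum = 0;
        // 'using' disposes the enumerator, like the try/finally block does.
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
                sum += e.Current;
        }
        return sum;
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        Console.WriteLine(SumWithForeach(list));    // 6
        Console.WriteLine(SumWithEnumerator(list)); // 6
    }
}
```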
If you want a name
You probably need a Dictionary.

An array is just a collection, or IEnumerable<T> where T represents a particular type; in your case int (System.Int32)
Memory allocation...
An int is 32 bits long, and you want an array of 10 items
int[] numbers = new int[10];
32 x 10 = 320 bits (40 bytes) of memory allocated
Memory access...
Say your array starts at 0000:0000 and you want to access index n [0-9] of the array...
Pseudo code...
index = n
addressof(n) = 0000:0000 + (sizeof(int) * index)
or
index = 2
addressof(n) = 0000:0000 + (4 * 2) // sizeof(int) is 4 bytes (32 bits)
This is a simplified example of what happens when you access each indexed item within the array (your foreach loop sort of does this)
A foreach loop works by visiting each element in the array in turn and copying its value into the loop variable (in your case, the variable is called i).
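The offset arithmetic above can be tried out without unsafe code. This is only a sketch: OffsetDemo and ReadAt are hypothetical names, and Buffer.BlockCopy is used here simply because it takes byte offsets into the element data of a primitive array.

```csharp
using System;

public static class OffsetDemo
{
    // Reads element 'index' by computing its byte offset by hand
    // instead of using the indexer.
    public static int ReadAt(int[] source, int index)
    {
        int byteOffset = sizeof(int) * index; // 4 bytes per element
        int[] destination = new int[1];
        Buffer.BlockCopy(source, byteOffset, destination, 0, sizeof(int));
        return destination[0];
    }

    public static void Main()
    {
        int[] numbers = { 4, 5, 6, 1, 2, 3, -2, -1, 0 };
        Console.WriteLine(ReadAt(numbers, 2)); // 6
        Console.WriteLine(numbers[2]);         // 6
    }
}
```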
Why you are NOT accessing by property...
In an array, items are stored by index, not by name...
... so you can't...
numbers.1
numbers.2
...but you can...
numbers[1]
numbers[2]
Since every type derives from object, you can call the members of object on any element...
numbers[1].GetHashCode();
Example:
// Define an array of integers, shorthand (see memory allocation above)...
int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
// For each element in the array (under the hood, read the element at the next index and copy its value into a local variable called i)...
foreach (int i in numbers)
{
    // Call GetHashCode() on the copy held in i and pass the resulting int (by value) to the Console.WriteLine method.
    Console.WriteLine(i.GetHashCode());
}

Related

Getting the number of elements in a generic array

For a school project, I need to return the number of elements in a generic array (a T[] array).
In the constructor I set the array like this:
T[] arr = new T[100];
arr.add(3);
arr.add(2);
arr.add(1);
To find the size of the array I tried array.Length; however, this returns the capacity, which is 100, but it should be 3.
How could I find the correct answer 3?
EDIT:
The Add function simply checks whether size is smaller than DEFAULT_CAPACITY and, if so, adds the data to the array. The size variable is crucial, and the function expects the total number of elements in the array.
public void Add(T x)
{
if(size < DEFAULT_CAPACITY)
{
array[size] = x;
}
}
Here the array does not have a capacity of 100: it has 100 items.
An array has no capacity; it has Length items.
For example, an array of 100 integers has 100 boxes initialized to 0.
And an array has no Add method.
You may use a List<T> instead, which gives you a Count property and an Add method.
List<T> is more convenient than an array, at a small cost in performance.
If the assignment requires an array initialized this way, you can follow @itsme86's advice in the question comments.
But what is the add method in your code?
You can use your intended array like that:
public class GenericArray<T>
{
public readonly T[] arr = new T[100];
}
var myArray = new GenericArray<int>();
myArray.arr[0] = 3;
myArray.arr[1] = 2;
myArray.arr[10] = 1;
And you still have 100 items: myArray.arr.Length is 100.
You can use a generic list like that:
public class GenericList<T>
{
public readonly List<T> list = new List<T>(100);
}
var myList = new GenericList<int>();
myList.list.Add(3);
myList.list.Add(2);
myList.list.Add(1);
And here you have 3 items: myList.list.Count is 3.
The list here has a capacity of 100, which means you can add up to 100 items without the internal array being resized.
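If the assignment really requires a raw T[] of 100 slots, one possible approach is to track the number of used slots yourself. The sketch below is an assumption about what the class might look like; FixedBuffer, DefaultCapacity and Count are illustrative names, not taken from the question.

```csharp
using System;

// Hypothetical fixed-capacity wrapper that counts how many slots are in use.
public class FixedBuffer<T>
{
    private const int DefaultCapacity = 100;
    private readonly T[] items = new T[DefaultCapacity];

    // Number of elements added so far (items.Length is always 100).
    public int Count { get; private set; }

    public void Add(T x)
    {
        if (Count >= DefaultCapacity)
            throw new InvalidOperationException("Buffer is full.");
        items[Count] = x;
        Count++; // without this increment the count would stay at 0
    }
}

public static class FixedBufferDemo
{
    public static void Main()
    {
        var buffer = new FixedBuffer<int>();
        buffer.Add(3);
        buffer.Add(2);
        buffer.Add(1);
        Console.WriteLine(buffer.Count); // 3
    }
}
```

This is essentially what List<T> does internally, minus the automatic resizing.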

Why isn't assigning to the iteration variable in Array.ForEach an error?

Given an array of int numbers like:
int[] arr = new int[] { 0, 1, 2, 3, 4, 5 };
If we want to increment every number by 1, the best choice would be:
for(int i = 0; i < arr.Length; i++)
{
arr[i]++;
}
If we try to do it using foreach
foreach(int n in arr)
{
n++;
}
as expected, we meet the error:
Cannot assign to 'n' because it is a 'foreach iteration variable'
Why if we use this approach:
Array.ForEach(arr, (n) => {
n++;
});
which looks equivalent to the foreach above, does neither Visual Studio nor the compiler complain? The code compiles and simply has no effect at runtime, without throwing an exception.
foreach(int n in arr)
{
n++;
}
This is a language construct; the compiler knows exactly what a foreach loop is supposed to do and what n is. It can therefore prevent you from changing the iteration variable n.
Array.ForEach(arr, (n) => {
n++;
});
This is a regular function call passing in a lambda. It is perfectly valid to modify local variables in a function (or lambda), so changing n is okay. While the compiler could warn you that the increment has no effect as it's never been used afterwards, it's valid code, and just because the function is called ForEach and actually does something similar to the foreach-loop doesn't change the fact that this is a regular function and a regular lambda.
As pointed out by @tkausl, n with ForEach is a local variable. Therefore:
static void Main()
{
int[] arr = new int[] { 0, 1, 2, 3, 4, 5 };
Console.WriteLine(string.Join(" ",arr));
Array.ForEach(arr, (n) => {
n++;
});
Console.WriteLine(string.Join(" ",arr));
}
will output:
0 1 2 3 4 5
0 1 2 3 4 5
Meaning you don't change the values of arr.
Array.ForEach is not identical to a foreach loop. It's a static method on the Array class that iterates over an array and performs an action on each of its elements.
Array.ForEach(arr, (n) => {
n++;
});
however won't modify the actual collection; it just assigns a new value to n, which has no relation to the underlying value in the array, because int is a value type and is copied into the anonymous method. So whatever you do with the parameter in your anonymous method isn't reflected back to the ForEach method and thus has no effect on your array. This is why you can do this.
The same would hold even if you had an array of reference types: you would simply assign a new instance to the provided parameter, which again has no effect on the underlying array.
Take a look at this simplified example:
class MyClass
{
    void ForEach(Action<Item> a)
    {
        foreach (var e in myList)
            a(e); // invoke the delegate; e is passed by value
    }
}
In your case the action looks like this:
x => x++
which simply assigns a new value to x. As x is passed by value, however, this won't have any effect on the calling method and thus on myList.
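The distinction between re-assigning the lambda parameter and mutating the object it refers to can be seen in a small sketch. Box and the three helper methods are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical reference type with one mutable field.
public class Box
{
    public int Value;
}

public static class ForEachDemo
{
    // Increments only the lambda's own copy of each int.
    public static int[] IncrementCopies(int[] arr)
    {
        Array.ForEach(arr, n => { n++; });
        return arr;
    }

    // Mutates the objects that the array elements refer to.
    public static Box[] IncrementMembers(Box[] boxes)
    {
        Array.ForEach(boxes, b => { b.Value++; });
        return boxes;
    }

    // Re-assigns the parameter; the array still holds the old objects.
    public static Box[] ReplaceParameters(Box[] boxes)
    {
        Array.ForEach(boxes, b => { b = new Box { Value = -1 }; });
        return boxes;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", IncrementCopies(new[] { 0, 1, 2 }))); // 0 1 2

        var boxes = new[] { new Box { Value = 0 } };
        Console.WriteLine(IncrementMembers(boxes)[0].Value);  // 1
        Console.WriteLine(ReplaceParameters(boxes)[0].Value); // 1
    }
}
```

Only the second helper changes anything observable, because it writes through the copied reference rather than replacing the copy.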
Both are two different things.
First we need to be clear about what we need. If the requirement is to mutate the existing values, use a for loop. The iteration variable of a foreach loop is read-only, which is why the first foreach loop produces the error.
So one approach could be if mutating is the intention:
for(int i=0; i< arr.Length; i++)
{
arr[i] = arr[i] +1;
}
Secondly, if the intention is to get a new collection with the updated values, consider the LINQ Select method (from System.Linq), which returns a new sequence of int; call ToArray() to materialize it as an array.
var incrementedArray = arr.Select(x => x + 1).ToArray();
EDIT:
The key difference is that in the first example we try to modify the values of the collection while enumerating it, whereas Array.ForEach invokes a delegate that receives each element as a local variable.
The foreach statement executes a statement or a block of statements for each element in an instance of a type that implements the System.Collections.IEnumerable or System.Collections.Generic.IEnumerable<T> interface. You cannot modify the iterated value because the iteration variable is read-only inside the loop body; the enumerator only hands out the current element, not a writable slot in the collection.
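If you want to modify the values inside a foreach, C# 7.3 and later allow a ref iteration variable when iterating over a Span<T> (which needs a runtime that provides Span<T>, for example .NET Core 2.1 or later):
foreach (ref int n in arr.AsSpan())
{
    n++;
}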
Updated
Array.ForEach is a method that performs the specified action on each element of the specified array. It executes immediately and works only on data already held in memory. Internally it uses a for loop to iterate over the array.
foreach and Array.ForEach look similar but work differently.

How to extract a fragment of an element in an existing array to generate a new array

I have an array:
string[] arr = new string[2];
arr[0] = "a=01";
arr[1] = "b=02";
How can I extract those numbers and store them in a new array? What I am expecting is:
int[] newArr = new int[2];
Inside newArr there are 2 elements, '01' and '02', both taken from arr.
Another way besides Substring to get the desired result is to use String.Split on the = character. This is assuming the string will always have the format of letters and numbers, separated by a =, with no other = characters in the input string.
for (var i = 0; i < arr.Length; i++)
{
// Split the array item on the `=` character.
// This results in an array of two items ("a" and "01" for the first item)
var tmp = arr[i].Split('=');
// If there are fewer than 2 items in the array, there was not a =
// character to split on, so continue to the next item.
if (tmp.Length < 2)
{
continue;
}
// Try to parse the second item in the tmp array (which is the number
// in the provided example input) as an Int32.
int num;
if (Int32.TryParse(tmp[1], out num))
{
// If the parse is succesful, assign the int to the corresponding
// index of the new array.
newArr[i] = num;
}
}
This can be shortened to a single lambda expression, as in the other answer:
var newArr = arr.Select(x => Int32.Parse(x.Split('=')[1])).ToArray();
Though doing it with Int32.Parse can result in an exception if the provided string is not an integer. This also assumes that there is a = character, with only numbers to the right of it.
Take a substring and then parse as int.
var newArr = arr.Select(x=>Int32.Parse(x.Substring(2))).ToArray();
As other answers have noted, it's quite compact to use linq. PM100 wrote:
var newArr = arr.Select(x=>Int32.Parse(x.Substring(2))).ToArray();
You asked what x was.. that linq statement there is conceptually the equivalent of something like:
List<int> nums = new List<int>();
foreach(string x in arr)
    nums.Add(Int32.Parse(x.Substring(2)));
var newArr = nums.ToArray();
It's not exactly the same; internally LINQ probably doesn't use a List, but it embodies the same concept: for each element (called x) in the string array, cut the start off it, parse the result as an int, add it to a collection, then convert the collection to an array.
Sometimes I think LINQ is overused. Here, some efficiency could probably be gained by declaring an int array the same size as the string array and filling it directly, rather than adding to a List or other collection that is later turned into an int array. Proponents of either style are easy to find. LINQ is compact and makes relatively trivial work of longhand constructs such as loops within loops within loops. Though not necessarily easy to read for those unfamiliar with it, it does make code somewhat self-documenting, because it uses English words like Any, Where and Distinct. These convey a concept more quickly than reading a loop that exits early when a test returns true (Any) or builds a dictionary/hashset from all elements and returns it (Distinct).
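The direct-fill alternative mentioned above might look like the following sketch. DirectFillDemo and ParseSuffixes are hypothetical names, and the code assumes, like the Substring answers, that every string has exactly two characters before the number.

```csharp
using System;

public static class DirectFillDemo
{
    // Parses the part after the first two characters of each string
    // straight into a pre-sized int array, with no intermediate list.
    public static int[] ParseSuffixes(string[] arr)
    {
        int[] result = new int[arr.Length];
        for (int i = 0; i < arr.Length; i++)
        {
            result[i] = Int32.Parse(arr[i].Substring(2));
        }
        return result;
    }

    public static void Main()
    {
        string[] arr = { "a=01", "b=02" };
        Console.WriteLine(string.Join(", ", ParseSuffixes(arr))); // 1, 2
    }
}
```

Whether this is measurably faster than the LINQ version depends on the input size; for two elements the difference is unlikely to matter.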

Is the memory used by an array of ints equivalent to that of an array of structs having just one int?

Consider the following struct...
struct Cell
{
int Value;
}
and the following matrix definitions
var MatrixOfInts = new int[1000,1000];
var MatrixOfCells = new Cell[1000,1000];
which one of the matrices will use less memory space? or are they equivalent (byte per byte)?
Both are the same size because structs are treated like any of the other value type and allocated in place in the heap.
long startMemorySize2 = GC.GetTotalMemory(true);
var MatrixOfCells = new Cell[1000, 1000];
long matrixOfCellSize = GC.GetTotalMemory(true);
long startMemorySize = GC.GetTotalMemory(true);
var MatrixOfInts = new int[1000, 1000];
long matrixOfIntSize = GC.GetTotalMemory(true);
Console.WriteLine("Int Matrix Size:{0}. Cell Matrix Size:{1}",
matrixOfIntSize - startMemorySize, matrixOfCellSize - startMemorySize2);
Here's some fun reading from Jeffrey Richter on how arrays are allocated: http://msdn.microsoft.com/en-us/magazine/cc301755.aspx
By using the sizeof operator in C# and executing the following code (under Mono 3.10.0) I get the following results:
struct Cell
{
int Value;
}
public static void Main(string[] args)
{
unsafe
{
// result is: 4
var intSize = sizeof(int);
// result is: 4
var structSize = sizeof(Cell);
}
}
So it looks like an integer and a struct storing an integer consume the same amount of memory; I would therefore assume that arrays of them also require an equal amount of memory.
In an array with value-type elements, all of the elements are required to be of the exact same type. The object holding the array needs to store information about the type of elements contained therein, but that information is only stored once per array, rather than once per element.
Note that because arrays receive special handling in the .NET Framework (compared to other collection types) arrays of a structure type will allow elements of the structures contained therein to be acted upon "in-place". As a consequence, if one can limit oneself to storing a structure within an array (rather than some other collection type) and can minimize unnecessary copying of struct instances, it is possible to operate efficiently with structures of almost any size. If one needs to hold a collection of things, each of which will have associated with it four Int64 values and four Int32 values (a total of 48 bytes), using an array of eight-element exposed-field structures may be more efficient and semantically cleaner than representing each thing using four elements from an Int64[] and four elements from an Int32[], or using an array of references to unshared mutable class objects.
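The "in-place" behaviour described above can be contrasted with List<T> in a short sketch. The Cell struct mirrors the one in the question but with a public field so it can be written to; InPlaceDemo and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the question's struct, with the field made public.
public struct Cell
{
    public int Value;
}

public static class InPlaceDemo
{
    // An array element is a variable, so its field can be written directly.
    public static int UpdateInArray()
    {
        var cells = new Cell[2];
        cells[0].Value = 5; // writes into the array's own storage
        return cells[0].Value;
    }

    // A List<T> indexer returns a copy of the struct, so the element
    // has to be copied out, changed, and stored back.
    public static int UpdateInList()
    {
        var list = new List<Cell> { new Cell(), new Cell() };
        // list[0].Value = 5; // would not compile (CS1612)
        Cell copy = list[0];
        copy.Value = 5;
        list[0] = copy;
        return list[0].Value;
    }

    public static void Main()
    {
        Console.WriteLine(UpdateInArray()); // 5
        Console.WriteLine(UpdateInList());  // 5
    }
}
```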

Why wouldn't `new int[x]{}` be valid?

In MonoDevelop I have the following code which compiles:
int[] row = new int[indices.Count]{};
However, at run-time, I get:
Matrix.cs(53,53): Error CS0150: A constant value is expected (CS0150) (testMatrix)
I know what this error means and forces me to then resize the array:
int[] row = new int[indices.Count]{};
Array.Resize(ref row, rowWidth);
Is this something I just have to deal with because I am using MonoDevelop on Linux? I was certain that under .Net 3.5 I was able to initialize an array with a variable containing the width of the array. Can anyone confirm that this is isolated? If so, I can report the bug to bugzilla.
You can't combine a non-constant array size with an initializer: when both are given, the size must be a constant that matches the number of elements. Remove the { }.
When you write:
int[] row = new int[indices.Count];
You are creating a new array of size indices.Count initialized to default values.
When you write:
int[] row = new int[] { 1, 2, 3, 4 };
You are creating an array and then initializing its content to the values [1,2,3,4]. The size of the array is inferred from the number of elements. It's shorthand for:
int[] row = new int[4];
row[0] = 1;
row[1] = 2;
row[2] = 3;
row[3] = 4;
The array is still first initialized to defaults; this syntax just provides a shorthand to avoid having to write those extra assignments yourself.
The following code fails to compile for the same reason on Windows/.NET/LINQPad:
void Main()
{
int[] row = new int[indices.Count]{};
row[2] = 10;
row.Dump();
}
// Define other methods and classes here
public class indices {
public static int Count = 5;
}
However, removing the object initialisation from the declaration ({}) makes it work.
In C#, if you want to declare an empty array the syntax should be:
int[] row = new int[indices.Count];
When you use array initialization syntax AND specify the size of the array:
int[] arr = new int[5]{1,2,3,4,5};
the size of the array is superfluous information: the compiler can infer the size from the initialization list. As others have said, you either create an empty array:
int[] arr = new int[5];
or use the initialization list:
int[] arr = {1,2,3,4,5};
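The accepted forms can be put side by side in a small sketch. ArrayCreationDemo and its method names are hypothetical; the third method shows that an explicit size together with an initializer is accepted only when the size is a compile-time constant matching the element count.

```csharp
using System;

public static class ArrayCreationDemo
{
    // Size known only at run time: no initializer, elements default to 0.
    public static int[] CreateSized(int count)
    {
        return new int[count];
    }

    // Size inferred from the initializer list.
    public static int[] CreateWithInitializer()
    {
        return new int[] { 1, 2, 3, 4, 5 };
    }

    // Explicit size plus initializer: the size must be a constant
    // and must match the number of elements.
    public static int[] CreateWithConstantSize()
    {
        const int Size = 5;
        return new int[Size] { 1, 2, 3, 4, 5 };
    }

    public static void Main()
    {
        Console.WriteLine(CreateSized(3).Length);            // 3
        Console.WriteLine(CreateWithInitializer().Length);   // 5
        Console.WriteLine(CreateWithConstantSize().Length);  // 5
    }
}
```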
