Looking for usable immutable bool array in C#


I have a class which has a bool array member. If I modify an element of this array, a new modified copy of the instance should be created. Sounds like a perfect opportunity for using an immutable type. Googling around showed that Microsoft provides a new library, Immutable Collections, which works quite well for another use case, but not for the aforementioned bool array member.
The seemingly fitting type ImmutableArray has been removed for the time being, and the documentation didn't seem to show an indexer either. The potential replacement ImmutableList doesn't work with structs. I'm loath to introduce another third-party library, so I'm wondering what options I have and which I should choose.
I could create a class Bool to satisfy the reference type requirement. Or I could use BitArray, but trying to use it like this fails with a compile error:
IReadOnlyList<BitArray> test = new IReadOnlyList<BitArray>(new BitArray());
So any ideas what I should do?

Note that this is perfectly valid:
var ba = new BitArray(10);
ba.SetAll(true);
IImmutableList<bool> test = ba.Cast<bool>().ToImmutableList();
Your problem is that the immutable item type is bool, not BitArray! Also, BitArray is from the pre-generics era, so it doesn't implement IEnumerable<bool>, ICollection<bool> or IList<bool>, and you can't use it directly (hence the .Cast<bool>() above).
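The copy-on-modify behaviour asked for in the question could be sketched as follows. This is a minimal sketch, assuming the System.Collections.Immutable package is referenced; the class name Flags and the methods FromBitArray and With are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Immutable;
using System.Linq;

// Hypothetical wrapper: every "modification" returns a new instance.
public sealed class Flags
{
    private readonly IImmutableList<bool> bits;

    private Flags(IImmutableList<bool> bits) { this.bits = bits; }

    // BitArray only implements the non-generic IEnumerable, hence Cast<bool>().
    public static Flags FromBitArray(BitArray source)
    {
        return new Flags(source.Cast<bool>().ToImmutableList());
    }

    public bool this[int index] { get { return bits[index]; } }

    public int Count { get { return bits.Count; } }

    // SetItem leaves the original list untouched and returns a modified copy.
    public Flags With(int index, bool value)
    {
        return new Flags(bits.SetItem(index, value));
    }
}

public static class FlagsDemo
{
    public static void Main()
    {
        Flags original = Flags.FromBitArray(new BitArray(10)); // all false
        Flags changed = original.With(3, true);
        Console.WriteLine(original[3]); // False
        Console.WriteLine(changed[3]);  // True
    }
}
```

Keeping the constructor private means the only way to obtain a changed instance is through With, so existing instances never observe a modification.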

Related

ILNumerics: ILArray<T> as instance variables

I am using ILNumerics to represent some time series.
Ideally I would like to have all data encapsulated in an object-oriented fashion and, therefore, to use instance variables and instance methods to process them.
I have several questions, but all related to what is the best way to implement ILArray in a class, in an efficient way and, possibly, as instance variables. I have gone through the relevant documentation and checked previous SO examples, but none seems to explicitly address these issues.
First: the example proposed on the website for 'Array Utilization Class'
[source: http://ilnumerics.net/ClassRules.html] does not seem to compile, at least with ILNumerics trial edition and VS 2013 professional (.net 4.5). Am I missing something?
Or is it because this part of the code:
public ILRetArray<double> A
{
get
{
// lazy initialization
if (m_a.IsEmpty)
{
m_a.a = ILMath.rand(100,100);
}
}
set { m_a.a = value; }
does not have a return statement?
In the mentioned example then the m_a array may be modified through the following instance method:
public void Do()
{
using (ILScope.Enter())
{
// assign via .a property only!
m_a.a = m_a + 2;
}
}
How can one access a specific component of the vector: suppose we want something like
m_a[0] = 2.2; would this get in the way of the memory management?
As a general observation, it seems to me that the natural way of using ILNumerics is through static methods, as one would write the code in Fortran (or possibly in R/Matlab): this is how I have used it so far. Am I right, or should a class with ILArray instance variables and corresponding methods be just as efficient and straightforward?
Alternatively, would you recommend adopting System arrays as instance variables and then importing/exporting to ILArray only through static methods to perform array operations? I would tend to avoid this path, or at least keep it as confined as possible.

The documentation section 'ILArray and Classes' has been updated. As you stated, there was a mistake in the sample code.
Modifying ILArray instances as Class Member
By following the rules described in the documentation, all array members will be of type ILArray (or ILLogical or ILCell). These types are mutable types. You can alter them freely during their lifetime. m_a[0] = 2.2; works as expected. You may also decide to replace the array completely:
m_a.a = ILMath.rand(2,3,5);
Just keep in mind not to assign to the array variable directly; use the .a property or the .Assign() method on the array instead. The compiler will prevent you from assigning by mistake anyway, since you have declared your array as readonly.
Such alteration does work with the memory management smoothly.
Mixing Static Methods and Class Instances
As long as you keep an eye on the rules for both functions (ILScope blocks, distinct input parameter array types, assignments via the .a property) and classes (readonly ILArray<T> declaration, ILMath.localMember<T> initialization), you can freely mix both schemes. It will work both ways, and all memory that is no longer needed is reused immediately.
Mixing intensive use of System.Array with ILArray<T>, on the other hand, may lead to disadvantageous allocation patterns. In general, it is easy to create an ILArray from a System.Array. The System.Array will be used directly by the ILArray if it fits into the storage scheme (i.e. if it is one-dimensional). But the other way around is not very efficient. It generally involves a copy of the data, and the ILNumerics memory management cannot work efficiently either.
That's why we recommend to stay with ILArray and the like. As you see, there are some rules to keep in mind, but usually you will internalize them very quickly.

When should `Object` be used in C# 2.0 and newer? Do Generics replace all occurrences of Object?

My coworker made the claim that there is never a need to use Object when declaring variables, return parameters, etc in .NET 2.0 and newer.
He went further and said in all such cases, a Generic should be used as the alternative.
Is there any validity to this claim? Off the top of my head I use Object for locking concurrent threads...

Generics do trump object in a lot of cases, but only where the type is known.
There are still times when you don't know the type - object, or some other relevant base type is the answer in those instances.
For example:
// Type.GetType resolves the type by name at run time; the string is a placeholder.
object o = Activator.CreateInstance(Type.GetType("Some type in an unreferenced assembly"));
You won't be able to cast that result or maybe even know what the type is at compile time, so object is a valid use.
Your co-worker is generalising too much - perhaps point him at this question. Generics are great, give him that much, but they do not "replace" object.

object is perfect for a lock. Generics allow you to keep it typed appropriately. You can even constrain it to an interface or base class. You can't do that with object.
Consider this:
void DoSomething(object foo)
{
foo.DoFoo();
}
That won't work without any casting. But with generics...
void DoSomething<T>(T foo) where T : IHasDoFoo
{
foo.DoFoo();
}
With C# 4.0 and dynamic, you could defer this to runtime, but I really haven't seen a need.
void DoSomething(dynamic foo)
{
foo.DoFoo();
}

When using interop with COM, you don't always have a choice... Generics don't really cater for the issues of interop.
Object is also the most lightweight option for a lock, as @Daniel A. White mentioned in his answer.

Yes there is validity. A good breakdown has already been made here.
However, I cannot confirm that there is no case at all where you would need object. Personally I do not use it, and even before generics I avoided boxing/unboxing.

There are lots of counterexamples, including the one you mentioned, using an object for synchronisation.
Another example is the DataSource property used in databinding, which can be set to one of a variety of different object types.

Broad counterexample: The System.Collections namespace is alive and well in .NET 4, no sign of deprecation or warning against its use on MSDN. The methods you find there take and return Objects.

Inherent in the question are actually two questions:
When should storage locations of type `Object` be used
When should instances of type `Object` be used
Storage locations of type Object must obviously be used in any circumstance where it will be necessary to hold references to instances of that type (since references to such instances cannot be held in any other type). Beyond that, they should be used in cases where they will hold references to objects which have no single useful common base type. This is obviously true in many scenarios using Reflection (where the type of an object may depend upon a string computed at run-time), but can also apply to certain varieties of collection which are populated with things whose type is known at compile time. As a simple example, one could represent a hierarchical collection of string indexed by sequences of int by having each node be of type Object, and having it hold either a String or an Object[]. Reading out items from such a collection would be somewhat clunky, since one would have to examine each item and determine whether it was an instance of Object[] or String, but such a method of storage would be extremely memory-efficient, since the only object instances would be those which either held the strings or the arrays. One could define a Node type with a field of type String and one of type Node[], or even define an abstract Node type with derived types StringNode (including a field of type String) and ArrayNode (with a field of type Node[]) but such approaches would increase the number of heap objects used to hold a given set of data.
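The hierarchical collection described above, where each node is either a String or an Object[], could be read out roughly like this. It is only a sketch: the class name ObjectTree and the method Lookup are hypothetical, and the error handling is one possible choice.

```csharp
using System;

public static class ObjectTree
{
    // Walks a tree whose nodes are either a string (leaf) or an object[] (branch).
    public static string Lookup(object root, params int[] path)
    {
        object node = root;
        foreach (int index in path)
        {
            object[] children = node as object[];
            if (children == null)
                throw new InvalidOperationException("Path continues past a leaf.");
            node = children[index];
        }

        string leaf = node as string;
        if (leaf == null)
            throw new InvalidOperationException("Path ends on a branch, not a string.");
        return leaf;
    }

    public static void Main()
    {
        object root = new object[]
        {
            "alpha",
            new object[] { "beta", "gamma" }
        };
        Console.WriteLine(Lookup(root, 0));    // alpha
        Console.WriteLine(Lookup(root, 1, 1)); // gamma
    }
}
```

The type tests on every step are the clunkiness the paragraph mentions; in exchange, no wrapper objects are allocated beyond the strings and arrays themselves.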
Note that in general it's better to design collections so that the type of an object to be retrieved won't depend upon what's been shoved into the collection (perhaps using "parallel collections" for different types) but not everything works out that way semantically.
With regard to instances of type Object, I'm not sure there's any role they can fill which wouldn't be just as well satisfied by a sealed type called something like TokenObject which inherits from Object. There are a number of situations where it is useful to have an object instance whose sole purpose is to be a unique token. Conceptually, it might have been nicer to say:
TokenObject myLock = new TokenObject();
than to say
Object myLock = new Object();
since the former declaration would make clear that the declared variable was never going to be used to hold anything other than a token object. Nonetheless, common practice is to use instances of type Object in cases where the only thing that matters about the object is that its reference will be unique throughout the lifetime of the program.
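A minimal sketch of that common practice, using a plain object purely as a lock token. The Counter class and its members are hypothetical and exist only for illustration.

```csharp
using System;
using System.Threading.Tasks;

public sealed class Counter
{
    // The object carries no data; only its unique reference matters.
    private readonly object gate = new object();
    private int count;

    public void Increment()
    {
        lock (gate) { count++; }
    }

    public int Value
    {
        get { lock (gate) { return count; } }
    }
}

public static class CounterDemo
{
    public static void Main()
    {
        var counter = new Counter();
        // Many threads increment concurrently; the lock keeps the count exact.
        Parallel.For(0, 1000, i => counter.Increment());
        Console.WriteLine(counter.Value); // 1000
    }
}
```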

Are there drawbacks to creating a class that encapsulates Generic Collection?

A part of my (C# 3.0 .NET 3.5) application requires several lists of strings to be maintained. I declare them, unsurprisingly, as List<string> and everything works, which is nice.
The strings in these Lists are actually (and always) Fund IDs. I'm wondering if it might be more intention-revealing to be more explicit, e.g.:
public class FundIdList : List<string> { }
... and this works as well. Are there any obvious drawbacks to this, either technically or philosophically?

I would start by going in the other direction: wrapping the string up into a class/struct called FundId. The advantage of doing so, I think, is greater than that of a specialised list over a generic one.
1. Your code becomes type-safe: there is a lot less scope for you to pass a string representing something else into a method that expects a fund identifier.
2. You can constrain the strings that are valid in the constructor of FundId, i.e. enforce a maximum length, check that the code is in the expected format, etc.
3. You have a place to add methods/functions relating to that type. For example, if fund codes starting with 'I' are internal funds, you could add a property called IsInternal that formalises that.
As for FundIdList, the advantage of having such a class is similar to point 3 above for FundId: you have a place to hook in methods/functions that operate on the list of FundIds (i.e. aggregate functions). Without such a place, you'll find that static helper methods start to crop up throughout the code or in some static helper class.

List<> has no virtual or protected members - such classes should almost never be subclassed. Also, although it's possible you need the full functionality of List<string>, if you do - is there much point to making such a subclass?
Subclassing has a variety of downsides. If you declare your local type to be FundIdList, then you won't be able to assign to it by e.g. using linq and .ToList since your type is more specific. I've seen people decide they need extra functionality in such lists, and then add it to the subclassed list class. This is problematic, because the List implementation ignores such extra bits and may violate your constraints - e.g. if you demand uniqueness and declare a new Add method, anyone that simply (legally) upcasts to List<string> for instance by passing the list as a parameter typed as such will use the default list Add, not your new Add. You can only add functionality, never remove it - and there are no protected or virtual members that require subclassing to exploit.
So you can't really add any functionality you couldn't with an extension method, and your types aren't fully compatible anymore which limits what you can do with your list.
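The upcasting problem described above can be reproduced in a few lines. The subclass UniqueFundIdList is a hypothetical example of a list that tries to enforce uniqueness by hiding Add.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical subclass that tries to enforce uniqueness by hiding Add.
public class UniqueFundIdList : List<string>
{
    public new void Add(string id)
    {
        if (!Contains(id)) base.Add(id);
    }
}

public static class HidingDemo
{
    public static void Main()
    {
        var ids = new UniqueFundIdList();
        ids.Add("F001");
        ids.Add("F001");              // rejected by the hiding Add
        Console.WriteLine(ids.Count); // 1

        List<string> plain = ids;     // perfectly legal upcast
        plain.Add("F001");            // binds to List<string>.Add, no check
        Console.WriteLine(ids.Count); // 2
    }
}
```

Because List<string>.Add is not virtual, the call is bound by the static type of the variable, so the constraint holds only as long as nobody refers to the list through its base type.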
I prefer declaring a struct FundId containing a string and implementing whatever guarantees concerning that string you need there, and then working with a List<FundId> rather than a List<string>.
Finally, do you really mean List<>? I see many people use List<> for things for which IEnumerable<> or plain arrays are more suitable. Exposing your internal List in an api is particularly tricky since that means any API user can add/remove/change items. Even if you copy your list first, such a return value is still misleading, since people might expect to be able to add/remove/change items. And if you're not exposing the List in an API but merely using it for internal bookkeeping, then it's not nearly as interesting to declare and use a type that adds no functionality, only documentation.
Conclusion
Only use List<> for internals, and don't subclass it if you do. If you want some explicit type-safety, wrap string in a struct (not a class, since a struct is more efficient here and has better semantics: there's no confusion between a null FundId and a null string, and object equality and hashcode work as expected with structs but need to be manually specified for classes). Finally, expose IEnumerable<> if you need to support enumeration, or if you need indexing as well use the simple ReadOnlyCollection<> wrapper around your list rather than let the API client fiddle with internal bits. If you really need a mutable list API, ObservableCollection<> at least lets you react to changes the client makes.
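A rough sketch of that conclusion: a FundId struct wrapping the string, kept in an internal List<FundId> that is only exposed through a read-only wrapper. The validation rule, the Portfolio class and its members are assumptions made for illustration; the explicit IEquatable<FundId> implementation is optional, since the default struct equality would also compare the wrapped string.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public struct FundId : IEquatable<FundId>
{
    private readonly string id;

    public FundId(string id)
    {
        // Assumed rule, for illustration only: non-empty, at most 10 characters.
        if (string.IsNullOrEmpty(id) || id.Length > 10)
            throw new ArgumentException("Invalid fund id.", "id");
        this.id = id;
    }

    public string Value { get { return id; } }

    public bool Equals(FundId other)
    {
        return string.Equals(id, other.id, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return obj is FundId && Equals((FundId)obj);
    }

    // default(FundId) holds a null string, so guard against it here.
    public override int GetHashCode() { return id == null ? 0 : id.GetHashCode(); }

    public override string ToString() { return id ?? string.Empty; }
}

// Hypothetical owner of the list: the List<FundId> stays internal.
public sealed class Portfolio
{
    private readonly List<FundId> funds = new List<FundId>();

    public void Add(FundId id) { funds.Add(id); }

    // Callers get indexing and enumeration, but cannot add or remove items.
    public ReadOnlyCollection<FundId> Funds { get { return funds.AsReadOnly(); } }
}

public static class FundIdDemo
{
    public static void Main()
    {
        var portfolio = new Portfolio();
        portfolio.Add(new FundId("I123"));
        Console.WriteLine(portfolio.Funds[0]);                             // I123
        Console.WriteLine(new FundId("I123").Equals(new FundId("I123"))); // True
    }
}
```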

Personally I would leave it as a List<string>, or possibly create a FundId class that wraps a string and then store a List<FundId>.
The List<FundId> option would enforce type correct-ness and allow you to put some validation on FundIds.

Just leave it as a List<string>; your variable name is enough to tell others that it's storing FundIDs.
var fundIDList = new List<string>();
When do I need to inherit List<T>?
Inherit it if you have really special actions/operations to do to a fund id list.
public class FundIdList : List<string>
{
public void SpecialAction()
{
//can only do with a fund id list
//sorry I can't give an example :(
}
}

Unless I was going to want someone to do everything they could to List<string>, without any intervention on the part of FundIdList I would prefer to implement IList<string> (or an interface higher up the hierarchy if I didn't care about most of that interface's members) and delegate calls to a private List<string> when appropriate.
And if I did want someone to have that degree of control, I'd probably just give them a List<string> in the first place. Presumably you have something to make sure such strings actually are "Fund IDs", which you can't guarantee any more when you publicly use inheritance.
Actually, this sounds (and often does with List<T>) like a natural case for private inheritance. Alas, C# doesn't have private inheritance, so composition is the way to go.
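Composition along those lines might look like the sketch below. It implements IEnumerable<string>, an interface higher up the hierarchy than IList<string>, and delegates to a private List<string>. The IsFundId check is a placeholder assumption, not a real validation rule.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class FundIdList : IEnumerable<string>
{
    // The wrapped list is never exposed, so every write goes through Add.
    private readonly List<string> items = new List<string>();

    public void Add(string fundId)
    {
        if (!IsFundId(fundId))
            throw new ArgumentException("Not a fund id.", "fundId");
        items.Add(fundId);
    }

    public int Count { get { return items.Count; } }

    public string this[int index] { get { return items[index]; } }

    public IEnumerator<string> GetEnumerator() { return items.GetEnumerator(); }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    // Placeholder rule (assumption): any non-empty string counts as a fund id.
    private static bool IsFundId(string candidate)
    {
        return !string.IsNullOrEmpty(candidate);
    }
}

public static class FundIdListDemo
{
    public static void Main()
    {
        var list = new FundIdList { "F001", "F002" }; // collection initializer calls Add
        foreach (string id in list) Console.WriteLine(id);
    }
}
```

Unlike the inheriting version, there is no base type to upcast to that would bypass the check in Add.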

In C# and also Java, what's the relationship between Object[] and String[]?

I recently started to think of this problem and I can't find the answer.
The following code compiles and executes as expected
object[] test = new string[12];
However, I don't know why.
I mean, should we consider string[] as the derived class of object[]?
I think in C#, every array is an instance of the Array class. If Array were generic, it would be Array<T>, and assigning an Array<string> to an Array<object> would not make sense to me; as far as I remember, only interfaces can use the in/out keywords.
And in Java, I'm not sure, but still feel weird. Why different types of references can be possibly assigned to each other when they don't have super-sub class relationship?
Can somebody explain a little?
Thanks a lot!

It's because reference type arrays support covariance in both Java and C#. It also means that every write into a reference type array has to be checked at execution time, to make sure you don't write the wrong type of element into it :(
Don't forget that both Java and C# (and .NET in general) started off without generics. If they had had generics to start with, life could have been somewhat different.
Note that both Java and C# support generic variance now, but in rather different ways. So for example in C# 4 you can write:
IEnumerable<string> strings = // Get some string sequence here
IEnumerable<object> objects = strings;
but you can't write
IList<string> strings = // Get some string list here
// Compile-time error: IList<T> isn't covariant in T
IList<object> objects = strings;
This wouldn't be safe, because you can add to an IList<T> as well as taking items from it.
This is a big topic - for more details, see Eric Lippert's blog series.

In C# there is (and always has been) covariance of arrays of reference types. It still is a string[], but you can legally cast it to an object[] (and access values as you would expect).
But try putting in an int (or any other non-string value) and you'll see that it still behaves appropriately (i.e. doesn't let you).
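A short sketch of that behaviour: the assignment to object[] compiles, the array remains a string[] underneath, and the runtime check rejects a non-string element.

```csharp
using System;

public static class CovarianceDemo
{
    public static void Main()
    {
        object[] test = new string[2]; // allowed: array covariance
        test[0] = "fine";              // a string fits the underlying string[]

        try
        {
            test[1] = 42;              // compiles, but the runtime check rejects it
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("ArrayTypeMismatchException");
        }

        // The runtime type is still string[], regardless of the variable's type.
        Console.WriteLine(test.GetType() == typeof(string[])); // True
    }
}
```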

This is because object is the parent (or the superclass) for all other classes. Search for boxing/ unboxing for more data.

Since all the really smart guys are talking about covariance and contravariance and I couldn't for the life of me explain (or understand) this stuff, listen to Eric Lippert:
Covariance and Contravariance FAQ

(Deep) comparison of an object to a reference in unit tests (C#)

In a unit test (in Visual Studio 2008) I want to compare the content of a large object (a list of custom types, to be precise) with a stored reference of this object. The goal is to make sure that any later refactoring of the code produces the same object content.
Discarded Idea:
A first thought was to serialize to XML, and then compare the hardcoded strings or a file content. This would allow for easy finding of any difference. However since my types are not XML serializable without a hack, I must find another solution. I could use binary serialization but this will not be readable anymore.
Is there a simple and elegant solution to this?
EDIT: Following Marc Gravell's proposal, I now do it like this:
using (MemoryStream stream = new MemoryStream())
{
//create actual graph using only comparable properties
List<NavigationResult> comparableActual = (from item in sparsed
select new NavigationResult
{
Direction = item.Direction,
/*...*/
VersionIndication = item.VersionIndication
}).ToList();
(new BinaryFormatter()).Serialize(stream, comparableActual);
string base64encodedActual = System.Convert.ToBase64String(stream.GetBuffer(), 0, (int)stream.Length);//base64 encoded binary representation of this
string base64encodedReference = @"AAEAAAD....";//this reference is the expected value
Assert.AreEqual(base64encodedReference, base64encodedActual, "The comparable part of the sparsed set is not equal to the reference.");
}
In essence I do select the comparable properties first, then encode the graph, then compare it to a similarly encoded reference.
Encoding enables deep comparison in a simple way. The reason I use base64 encoding is that I can easily store the reference in a string variable.

I would still be inclined to use serialization. But rather than having to know the binary, just create an expected graph, serialize that. Now serialize the actual graph and compare bytes. This is only useful to tell you that there is a difference; you'd need inspection to find what, which is a pain.

I would use the hack to do the XML comparison. Or you could use reflection to automatically traverse the object's properties (but this will traverse ALL of them, including some you may not want).

I would make each custom type implement IComparable and provide equality methods that compare the custom types, and make the main class IComparable as well. You can then simply compare the two objects (if you have them in memory when running the unit tests). If not, I would suggest either serializing or defining constants which you expect the refactored object to have.
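A sketch of the equality-based approach. It swaps in IEquatable<T> rather than IComparable, since only equality is needed here, and compares the lists with SequenceEqual. The property names come from the question; their types (string and int) are assumptions, as the question does not show them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Property types are assumptions; the question does not show them.
public sealed class NavigationResult : IEquatable<NavigationResult>
{
    public string Direction { get; set; }
    public int VersionIndication { get; set; }

    public bool Equals(NavigationResult other)
    {
        return other != null
            && Direction == other.Direction
            && VersionIndication == other.VersionIndication;
    }

    public override bool Equals(object obj) { return Equals(obj as NavigationResult); }

    public override int GetHashCode()
    {
        return (Direction == null ? 0 : Direction.GetHashCode()) ^ VersionIndication;
    }
}

public static class ComparisonDemo
{
    public static void Main()
    {
        var expected = new List<NavigationResult>
        {
            new NavigationResult { Direction = "N", VersionIndication = 1 }
        };
        var actual = new List<NavigationResult>
        {
            new NavigationResult { Direction = "N", VersionIndication = 1 }
        };

        // SequenceEqual compares element by element using IEquatable<T>.
        Console.WriteLine(expected.SequenceEqual(actual)); // True
    }
}
```

In a unit test, the SequenceEqual result could be passed to Assert.IsTrue; unlike a byte comparison of serialized graphs, a failing element can then be located by comparing items individually.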
