Initializing a collection so the user doesn't have to - c#

This might be a stupid question, but is there any common practice for initializing collection properties for a user, so they don't have to new up a new concrete collection before using it in a class?
Are any of these preferred over the other?
Option 1:
public class StringHolderNotInitialized
{
    // Force user to assign an object to MyStrings before using
    public IList<string> MyStrings { get; set; }
}
Option 2:
public class StringHolderInitializedRightAway
{
    // Initialize a default concrete object at construction
    private IList<string> myStrings = new List<string>();
    public IList<string> MyStrings
    {
        get { return myStrings; }
        set { myStrings = value; }
    }
}
Option 3:
public class StringHolderLazyInitialized
{
    private IList<string> myStrings = null;
    public IList<string> MyStrings
    {
        // If user hasn't set a collection, create one now
        // (forces a null check each time, but doesn't create object if it's never used)
        get
        {
            if (myStrings == null)
            {
                myStrings = new List<string>();
            }
            return myStrings;
        }
        set
        {
            myStrings = value;
        }
    }
}
Option 4:
Any other good options for this?

In this case, I don't see the reason for the lazy loading, so I would go with option 2. If you are creating a ton of these objects, then the number of allocations and GCs that result would be an issue, but that's not something to consider really unless it proves to be a problem later.
Additionally, for things like this, I would typically not allow the assignment of the IList to the class. I would make this read-only. In not controlling the implementation of the IList, you open yourself up to unexpected implementations.
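A minimal sketch of that read-only variant of option 2, assuming nothing beyond the classes in the question. The names StringHolderReadOnly and ReadOnlyDemo are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of option 2 without a setter.
public class StringHolderReadOnly
{
    // Created once at construction; callers can add or remove items
    // but cannot swap in their own IList<string> implementation.
    private readonly IList<string> myStrings = new List<string>();

    public IList<string> MyStrings
    {
        get { return myStrings; }
    }
}

public static class ReadOnlyDemo
{
    public static int Run()
    {
        var holder = new StringHolderReadOnly();
        holder.MyStrings.Add("first");  // the caller never has to create a list
        // holder.MyStrings = null;     // would not compile: there is no setter
        return holder.MyStrings.Count;
    }
}
```

Without a setter the property can only become null if the class itself assigns null, which the readonly field rules out after construction.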

For Option 1: if you want to force user to initialize something before using your class then best place to force it is in constructor.
For Option 2: If you're not forcing the user then you really have to initialize an empty collection yourself in the constructor/initializer.
For Option 3: Lazy initialization only makes sense if creating the object involves a lot of work or is a slow/bulky operation.
My vote goes for option 2.

The only real reason for using the lazy loading solution is for an optimization. And the first rule of optimization is "don't optimize unless you've measured" :)
Based on that I would go with the solution least likely to cause an error. In this case that would be solution #2. Setting it in an initializer virtually eliminates the chance of a null ref here. The only way it will occur is if the user explicitly sets it to null.

Related

smarter way of protecting foreach loop against null

Is there a smarter way of protecting foreach loops against NullReference exceptions than this:
if (G_Locatie.OverdrachtFormulierList != null)
{
foreach (OverdrachtFormulier otherform in G_Locatie.OverdrachtFormulierList)
{
...
}
}
I use a lot of foreach loops, often nested, and a lot of variables where e.g. G_Locatie certainly exists, but the data member .OverdrachtFormulierList may not have been assigned a list using new yet.
Dear friends, thanks for all your comments. After getting the idea of your suggestions, while having a lot of trouble understanding exactly, after digging through the Lasagna code I got to work on, and after some experimentation, I found that the easiest and cleanest way is to simply avoid having the NULL, by proper initialization. While I kind of resist having to initialize the OverdrachtFormulierList in my code, with the risk of forgetting one instance, I found the proper place for initialization, namely in the original class definition.
For simplicity, look at this code:
class MyClass
{
public List<string> items = new List<string>();
public IEnumerator<string> GetEnumerator()
{
return items.GetEnumerator();
}
}
class MyComplexClass
{
private MyClass _itemlist /*= new MyClass()*/;
public MyClass itemlist
{
get { return _itemlist; }
set { _itemlist = value; }
}
}
void Sandbox()
{
MyClass mc /*= new MyClass()*/;
foreach (string Sb in mc.items)
{
string x = Sb;
}
MyComplexClass mcc = new MyComplexClass();
foreach (string Mb in mcc.itemlist) // <--- NullReferenceException
{
string x = Mb;
}
return;
}
The fun thing is that C# seems to protect you from a lot of buggy mistakes. This code will not build if you do not uncomment the initialization in Sandbox(), so the first foreach will not get a NullReferenceException.
However, you'd better uncomment the init in MyComplexClass to avoid the exception in the second foreach. C# will build with and without this initialization.
So it turns out that in my real code I just have to add a simple initialization in the Class definition of G_Locatie.
The only issue now is that I always wanted to simplify the above code with {get; set;} but that would not be possible with the initialization as described. I will have to live with that minor issue.
In fact, on object-type properties, you don't really need the setter.
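For readers on C# 6 or later, the {get; set;} limitation described above no longer applies, because an auto-property can carry its own initializer. A short sketch reusing the MyClass and MyComplexClass names from the example; AutoPropertyDemo is a made-up helper.

```csharp
using System.Collections.Generic;

class MyClass
{
    public List<string> items = new List<string>();
    public IEnumerator<string> GetEnumerator()
    {
        return items.GetEnumerator();
    }
}

class MyComplexClass
{
    // C# 6 or later: initialized in place, and no setter is needed.
    public MyClass itemlist { get; } = new MyClass();
}

static class AutoPropertyDemo
{
    public static int CountItems()
    {
        var mcc = new MyComplexClass();
        int count = 0;
        foreach (string s in mcc.itemlist) // itemlist is never null here
        {
            count++;
        }
        return count;
    }
}
```

Leaving out the setter also matches the remark that object-type properties rarely need one.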
Finally, I realized that I could not find a proper title for my problem. So far, every problem I had was already answered in this forum, and I feel that I had to post today only because I could not find posts similar to this one. Perhaps someone can come up with title and tags that make this solution better findable.
Yes, your collection properties should return empty collections rather than null. One way you can ensure this is by using a backing field and assigning a new list in the getter:
private List<string> overdrachtFormulierList;
public List<string> OverdrachtFormulierList
{
get
{
return this.overdrachtFormulierList ??
(this.overdrachtFormulierList = new List<string>());
}
set
{
this.overdrachtFormulierList = value;
}
}
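You can also use Enumerable.Empty<T> if your types are IEnumerable<T>.
One option would be to create an extension method:
// Extension methods must live in a static class; the class name is arbitrary.
public static class EnumerableExtensions
{
    public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> source)
    {
        return source ?? Enumerable.Empty<T>();
    }
}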
Then:
foreach (var otherform in G_Locatie.OverdrachtFormulierList.EmptyIfNull())
{
...
}
It would still be preferable to always use an empty collection instead of a null reference, mind you.

Pros/Cons on Lists with subsidiary objects

I'm again in the position to figure a way out to handle lists with subsidiary objects on our business objects.
Actually, our code often looks like this:
public class Object
{
private List<SubsidiaryObject> subsidiaryObjects = null;
public List<SubsidiaryObject> SubsidiaryObjects
{
get
{
if (this.subsidiaryObjects == null)
{
this.subsidiaryObjects = DBClass.LoadListFromDatabase();
}
return this.subsidiaryObjects;
}
set
{
this.subsidiaryObjects = value;
}
}
}
The Con on this:
The property is referenced in the presentation layer and used for data binding. Releasing the reference to the actual list and replacing it with a new one leaves the GUI bound to a stale list that no longer has any connection to the list on the object.
The Pro on this:
Easy way of reloading the list (just set the reference to null and then get it again).
I developed another class that uses the following pattern:
public class Object2
{
private readonly List<SubsidiaryObject> subsidiaryObjects = new List<SubsidiaryObject>();
public List<SubsidiaryObject> SubsidiaryObjects
{
get
{
return this.subsidiaryObjects;
}
}
public void ReloadSubsidiaryObjects()
{
this.SubsidiaryObjects.Clear();
this.SubsidiaryObjects.AddRange(DBClass.LoadListFromDatabase());
}
}
Pro on this:
The reference is continuous.
The Con on this:
Reloading the list is more difficult, since it just cannot be replaced, but must be cleared/filled with reloaded items.
What is your preferred way, for what situations?
What do you see as Pro/Con for either of these two patterns?
Since this is only a general question, not for a specific problem, every answer is welcome.
Do you need the caller to be able to modify the list? If not, you should consider returning IEnumerable<T> or ReadOnlyCollection<T> instead. And even if you do, you will probably be better off providing wrapper methods for Add/Remove so you can intercept modifications. Handing out a reference to internal state is not a good idea IMO.
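One way that suggestion might look, as a sketch only. Object3 and the Reload parameter are hypothetical, and the database call from the question is replaced by a plain sequence argument so the example stands alone.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class SubsidiaryObject { }

// Hypothetical third variant: stable reference, read-only view, wrapper methods.
public class Object3
{
    private readonly List<SubsidiaryObject> subsidiaryObjects = new List<SubsidiaryObject>();
    private readonly ReadOnlyCollection<SubsidiaryObject> readOnlyView;

    public Object3()
    {
        // AsReadOnly wraps the list; the wrapper reflects later changes to it.
        readOnlyView = subsidiaryObjects.AsReadOnly();
    }

    // Callers can enumerate and index, but not modify.
    public ReadOnlyCollection<SubsidiaryObject> SubsidiaryObjects
    {
        get { return readOnlyView; }
    }

    // All modifications go through the owning class.
    public void AddSubsidiaryObject(SubsidiaryObject item)
    {
        subsidiaryObjects.Add(item);
    }

    public bool RemoveSubsidiaryObject(SubsidiaryObject item)
    {
        return subsidiaryObjects.Remove(item);
    }

    // Stand-in for reloading from the database: same list instance, new contents.
    public void Reload(IEnumerable<SubsidiaryObject> freshItems)
    {
        subsidiaryObjects.Clear();
        subsidiaryObjects.AddRange(freshItems);
    }
}
```

The reference handed out stays the same across reloads, as in the second pattern. Note that ReadOnlyCollection<T> does not raise change notifications on its own, so bound controls would still need to be refreshed by other means.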
A third option would be to go with option 2, but to create a new instance of the Object2 type each time you need to repopulate the list. Without additional context for the question, that is the option I would select, but there may be reasons why you would want to hold on to the original instance.

C#: Encapsulation of for example collections

I am wondering which one of these would be considered the cleanest or best to use and why.
One of them exposes the list of passengers, which lets the user add, remove, etc. The other hides the list and only lets the user enumerate the passengers and add them using a special method.
Example 1
class Bus
{
    public IEnumerable<Passenger> Passengers { get { return passengers; } }
    private List<Passenger> passengers;
    public Bus()
    {
        passengers = new List<Passenger>();
    }
    public void AddPassenger(Passenger passenger)
    {
        passengers.Add(passenger);
    }
}
var bus = new Bus();
bus.AddPassenger(new Passenger());
foreach (var passenger in bus.Passengers)
    Console.WriteLine(passenger);
Example 2
class Bus
{
    public List<Passenger> Passengers { get; private set; }
    public Bus()
    {
        Passengers = new List<Passenger>();
    }
}
var bus = new Bus();
bus.Passengers.Add(new Passenger());
foreach (var passenger in bus.Passengers)
    Console.WriteLine(passenger);
The first class, I would say, is better encapsulated. And in this exact case, that might be the better approach (since you should probably make sure there is space left on the bus, etc.). But I guess there might be cases where the second class may be useful as well? Like if the class doesn't really care what happens to that list as long as it has one. What do you think?
In example one, it is possible to mutate your collection.
Consider the following:
var passengers = (List<Passenger>)bus.Passengers;
// Now I have control of the list!
passengers.Add(...);
passengers.Remove(...);
To fix this, you might consider something like this:
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

class Bus
{
    private List<Passenger> passengers = new List<Passenger>();
    // Never expose the original collection
    public IEnumerable<Passenger> Passengers
    {
        get { return passengers.Select(p => p); }
    }
    // Or expose the original collection as read only
    public ReadOnlyCollection<Passenger> ReadOnlyPassengers
    {
        get { return passengers.AsReadOnly(); }
    }
    public void AddPassenger(Passenger passenger)
    {
        passengers.Add(passenger);
    }
}
In most cases I would consider example 2 to be acceptable provided that the underlying type was extensible and/or exposed some form of onAdded/onRemoved events so that your internal class can respond to any changes to the collection.
In this case List<T> isn't suitable, as there is no way for the class to know if something has been added. Instead you should use Collection<T>, because it has several protected virtual members (InsertItem, RemoveItem, SetItem, ClearItems) that can be overridden to raise events that notify the wrapping class.
(You do also have to be aware that users of the class can modify the items in the list/collection without the parent class knowing about it, so make sure that you don't rely on the items being unchanged - unless they are immutable obviously - or you can provide onChanged style events if you need to.)
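A sketch of the Collection<T> approach just described. PassengerCollection, ObservedBus, the two events and the Boarded counter are illustrative names, not part of the original examples.

```csharp
using System;
using System.Collections.ObjectModel;

public class Passenger { }

// Hypothetical collection type that reports insertions and removals.
public class PassengerCollection : Collection<Passenger>
{
    public event Action<Passenger> PassengerAdded;
    public event Action<Passenger> PassengerRemoved;

    // Add and Insert both end up here.
    protected override void InsertItem(int index, Passenger item)
    {
        base.InsertItem(index, item);
        PassengerAdded?.Invoke(item);
    }

    // Remove and RemoveAt both end up here.
    protected override void RemoveItem(int index)
    {
        Passenger removed = this[index];
        base.RemoveItem(index);
        PassengerRemoved?.Invoke(removed);
    }
}

public class ObservedBus
{
    public PassengerCollection Passengers { get; } = new PassengerCollection();

    // Kept in sync by the events rather than by wrapper methods.
    public int Boarded { get; private set; }

    public ObservedBus()
    {
        Passengers.PassengerAdded += p => Boarded++;
        Passengers.PassengerRemoved += p => Boarded--;
    }
}
```

This sketch leaves SetItem and ClearItems untouched, so replacing an item by index or calling Clear would bypass the events; a complete version would override those as well.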
Run your respective examples through FxCop and that should give you a hint about the risks of exposing List<T>
I would say it all comes down to your situation. I would normally go for option 2 as it is the simplest, unless you have a business reason to add tighter controls to it.
Option 2 is the simplest, but it lets other classes add and remove elements in the collection, which can be dangerous.
I think a good heuristic is to consider what the wrapper methods do. If your AddPassenger (or Remove, or others) method is simply relaying the call to the collection, then I would go for the simpler version. If you have to check the elements before inserting them, then option 1 is basically unavoidable. If you have to keep track of the elements inserted/deleted, you can go either way. With option 2 you have to register events on the collection to get notifications, and with option 1 you have to create wrappers for every operation on the list that you want to use (e.g. if you want Insert as well as Add), so I guess it depends.
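To make the "check the elements before inserting" case concrete, here is a sketch of option 1 with a capacity check. LimitedBus, the seats parameter and TryAddPassenger are assumptions introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Passenger { }

// Hypothetical bus with a fixed number of seats.
public class LimitedBus
{
    private readonly List<Passenger> passengers = new List<Passenger>();
    private readonly int seats;

    public LimitedBus(int seats)
    {
        if (seats < 0) throw new ArgumentOutOfRangeException("seats");
        this.seats = seats;
    }

    // Enumeration only: the iterator hides the underlying List<T>,
    // so callers cannot cast the result back to a list and mutate it.
    public IEnumerable<Passenger> Passengers
    {
        get
        {
            foreach (Passenger p in passengers)
                yield return p;
        }
    }

    // Returns false instead of adding when the bus is full.
    public bool TryAddPassenger(Passenger passenger)
    {
        if (passenger == null) throw new ArgumentNullException("passenger");
        if (passengers.Count >= seats) return false;
        passengers.Add(passenger);
        return true;
    }
}
```

With a public List<T> there would be no place to put the capacity check, which is why the wrapper method is hard to avoid in this situation.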

using two different public properties to "get" the same private variable with different return types

I've got a Customer class that has a List<string> Roles property. Much of the time I want to access that property as a list of strings, but on occasion I want to see it as a comma-delimited list.
I could certainly do that in a new method, and if I anticipated wanting to get the value of the variable in different formats (comma-delimited, tab-delimited, etc.) I would certainly do so. However, I'm toying with the idea of using two different properties to access the variable value, something along the lines of
public List<string> Roles
{
get { return this._Roles; }
set { this._Roles = value; }
}
and
public string RolesToString
{
get { /* do some work here to comma-delimit the list */ }
}
Essentially I want to override the ToString() method of this particular list. Are there compelling reasons for doing 1 over the other? Is using two different properties to return the same variable value sufficiently non-standard to cause red flags?
I would make your second "property" a method. It's doing additional processing on your list, and returning something that isn't a direct "property" of the object, but more a processed version of the object's property. This seems like a reasonable method candidate.
My preference would be:
public List<string> Roles
{
get { return this._Roles; }
set { this._Roles = value; }
}
public string GetRolesAsString()
{
// Do processing on Roles
}
As Reed says, it should probably be a method, but that's kind of subjective.
Note that you don't need much code to do it - just a call to Join()
public string RolesAsString()
{
return String.Join(", ", this._Roles);
}
And given that string joining is so easy in .NET, do you really need a method/property for it?
I have no problem with what you propose. Except I would name it RolesString.
But... Why only a getter? If I can set the Roles property, why could I not set the RolesString property?
Additional processing does not necessarily mean a method should be used.

Best way to protect a backing field from mistaken use in C#

I have a class (Foo) which lazy loads a property named (Bar). What is your preferred way to protect against mistaken use (due to intellisense or inexperienced staff) of the uninitialized backing field?
I can think of 3 options:
class Foo {
// option 1 - Easy to use this.bar by mistake.
string bar;
string Bar {
get {
// logic to lazy load bar
return bar;
}
}
// option 2 - Harder to use this._bar by mistake. It is more obscure.
string _bar2;
string Bar2 {
get {
// logic to lazy load bar2
return _bar2;
}
}
//option 3 - Very hard to use the backing field by mistake.
class BackingFields {
public string bar;
}
BackingFields fields = new BackingFields();
string Bar3 {
get {
// logic to lazy load bar
return fields.bar;
}
}
}
Keep in mind, the only place I want people mucking around with the backing field bar is in setter and getter of the property. Everywhere else in the class they should always use this.Bar
Update
I am currently using the following Lazy implementation (not for all properties with backing fields, but for select ones that require lazy loading, synchronization and notification). It could be extended to support futures as well (force evaluation in a separate thread in a later time)
Note: My implementation locks on read, because it supports an external set.
Also, I would like to mention that I think this is a language limitation which can be overcome in Ruby for example.
You can implement lazy in this way.
x = lazy do
puts "<<< Evaluating lazy value >>>"
"lazy value"
end
puts x
# <<< Evaluating lazy value >>>
# lazy value
How about use of ObsoleteAttribute and #pragma - hard to miss it then!
void Test1()
{
_prop = ""; // warning given
}
public string Prop
{
#pragma warning disable 0618
get { return _prop; }
set { _prop = value; }
#pragma warning restore 0618
}
[Obsolete("This is the backing field for lazy data; do not use!!")]
private string _prop;
void Test2()
{
_prop = ""; // warning given
}
Option 5: Lazy<T> works quite nicely in several situations, though option 1 should really be just fine for most projects so long as the developers aren't idiots.
Adding [EditorBrowsable(EditorBrowsableState.Never)] to the field won't help if it is private since this logic only kicks in for intellisense generated from metadata rather than the current code (current project and anything done via project references rather than dlls).
Note: Lazy<T> is not thread safe (this is good; there's no point locking if you don't need to). If you require thread safety, either use one of the thread-safe ones from Joe Duffy or the Parallel Extensions CTP.
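For comparison, a sketch using the System.Lazy<T> type that ships with .NET Framework 4 and later. Unlike the Lazy<T> in the note above, which presumably refers to an earlier hand-written type, the framework class is thread safe by default (LazyThreadSafetyMode.ExecutionAndPublication). LoadBar and LoadCount are made-up stand-ins for the real lazy-load logic.

```csharp
using System;

public class Foo
{
    // The only field is the Lazy<string> wrapper, so there is no raw
    // string backing field that could be read before it is loaded.
    private readonly Lazy<string> bar;

    // Counts how often the load actually ran (for demonstration only).
    public int LoadCount { get; private set; }

    public Foo()
    {
        bar = new Lazy<string>(LoadBar);
    }

    public string Bar
    {
        get { return bar.Value; }
    }

    // Hypothetical stand-in for the expensive load.
    private string LoadBar()
    {
        LoadCount++;
        return "loaded";
    }
}
```

Code elsewhere in the class can still touch the bar field, but the only useful thing it can do with it is read Value, which goes through the same lazy-load path as the property.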
I usually go for option 2, as it is easier to spot mistakes later on, although option 1 would pass a code review. Option 3 seems convoluted and whilst it may work, it's not going to be nice code to revisit 6 months down the line whilst trying to refactor/fix a bug/etc.
Option 1, coupled with some education.
Rationale: software is meant to be read more often than written, so optimize for the common case and keep it readable.
Code reviews will catch misuse so just go with the most readable. I dislike attempts to work around bad programmers in code, because 1) they don't work, 2) they make it harder for smart programmers to get their work done, and 3) it addresses the symptom rather than the cause of the problem.
I usually just go for option 1. Because it is a private field I don't think it really an issue, and using something like the wrapper class as in your option 3 only makes code difficult to read and understand.
I would just put a large comment block on the top of the class that would look like that:
/************************************************************
* Note: When updating this class, please take care of using *
* only the accessors to access member data because of *
* ... (state the reasons / your name, so they can ask *
* questions) *
*************************************************************/
Usually, just a note like that should be enough, but if this is the same for all the classes in the project, you might prefer to put it in a simple document that you give to programmers working on the project, and every time you see code that doesn't conform, you point them to the document.
Automatic properties:
public int PropertyName { get; set; }
will prevent access to the backing field. But if you want to put code in there (e.g. for lazy loading on first access) this clearly won't help.
The simplest route is likely to be a helper type which does the lazy loading, and to have a private field of that type, with the public property calling the correct property/method of the helper type.
E.g.
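public class Foo {
    private class LazyLoader {
        private someType theValue;
        public someType Value {
            get {
                // Do lazy load
                return theValue;
            }
        }
    }
    // Create the helper up front so the property never dereferences null
    private readonly LazyLoader theValue = new LazyLoader();
    // "Value" is a placeholder property name
    public someType Value {
        get { return theValue.Value; }
    }
}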
This has the advantage that the backing field is harder to use than the property.
(Another case of an extra level of indirection to solve problems.)
// option 4
class Foo
{
public int PublicProperty { get; set; }
public int PrivateSetter { get; private set; }
}
C# 3.0 feature, the compiler will generate anonymous private backing fields which can't be accessed by mistake, well unless you use reflection...
EDIT: Lazy instantiation
You can have laziness like this:
// I changed this with respect to ShuggyCoUk's answer (Kudos!)
class LazyEval<T>
{
T value;
Func<T> eval;
public LazyEval(Func<T> eval) { this.eval = eval; }
public T Eval()
{
if (eval == null)
return value;
value = eval();
eval = null;
return value;
}
public static implicit operator T(LazyEval<T> lazy) // maybe explicit
{
return lazy.Eval();
}
public static implicit operator LazyEval<T>(Func<T> eval)
{
return new LazyEval<T>(eval);
}
}
Those implicit conversion make the syntax tidy...
// option 5
class Foo
{
public LazyEval<MyClass> LazyProperty { get; private set; }
public Foo()
{
LazyProperty = (Func<MyClass>)(() => new MyClass()); // cast to Func<T> so the implicit conversion applies
}
}
And closures can be used to carry scope:
// option 5
class Foo
{
public int PublicProperty { get; private set; }
public LazyEval<int> LazyProperty { get; private set; }
public Foo()
{
LazyProperty = (Func<int>)(() => this.PublicProperty); // cast to Func<T> so the implicit conversion applies
}
public void DoStuff()
{
var lazy = LazyProperty; // type is inferred as LazyEval`1, no eval
PublicProperty = 7;
int i = lazy; // same as lazy.Eval()
Console.WriteLine(i); // WriteLine(7)
}
}
Currently, I'm in a similar situation.
I have a field which should only be used by its property accessor.
I can't use automatic properties, since I need to perform additional logic when the property is set. (The property is not lazy loaded either.)
Wouldn't it be great if a next version of C# would allow something like this:
public class MyClass
{
public int MyProperty
{
int _backingField;
get
{
return _backingField;
}
set
{
if( _backingField != value )
{
_backingField = value;
// Additional logic
...
}
}
}
}
With such construct, the _backingField variable's scope is limited to the property.
I would like to see a similar language construction in a next version of C# :)
But, I'm afraid this feature will never be implemented:
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=381625
This might be overly simple, but why not move all the lazy loading into a base class?
public class LazyFoo{
private string bar;
public string Bar{
get{
// lazy load and return
}
set {
// set
}
}
}
public class Foo : LazyFoo{
// only access the public properties here
}
I could see the argument that it is unnecessary abstraction, but it is the simplest way I can see to eliminate all access to backing fields.
This seems like trying to design-out mistakes that might not happen in the first place, and basically it's worrying about the wrong thing.
I would go with option 1 + comments:
///<summary>use Bar property instead</summary>
string bar;
///<summary>Lazy gets the value of Bar and stores it in bar</summary>
string Bar {
get {
// logic to lazy load bar
return bar;
}
}
If you do get a developer who keeps using the backing variable then I'd worry about their technical competence.
By all means design to make your code easier to maintain, but try to keep it simple - any rule that you make for yourself here is going to be more hassle than it's worth.
And if you're still really worried about it create an FxCop (or whatever you're using) rule to check for this sort of thing.
Option 6:
Makes it very dumb indeed if you use it.
string doNotUseThisBackingField_bar6;
string Bar6 {
get {
// logic to lazy load
return doNotUseThisBackingField_bar6;
}
}
Option 4 (a new solution):
If the question is really about how to prevent people from using an uninitialized variable, then initialize it with a KNOWN INVALID value.
I would say something like:
string str = "SOMETHING_WRONG_HERE";
Whoever is using 'str' will get some sort of warning.
Otherwise, go with Option 3 if preventing users from using 'str' is more important than readability etc.
