I have a function that identifies coordinates on a page, and I am returning them as a
Dictionary<int, Collection<Rectangle>> GetDocumentCoordinates(int DocumentId)
However, later I need information about each page: whether it was validated, what the page resolution is, whether it is color or black-and-white, etc. I could create another function that runs through pretty much the same result set as the previous function and returns that information.
Dictionary<int, PageInfo> GetDocumentAttributes(int DocumentId)
Another alternative would be to add a ref parameter so I can get these values back.
Dictionary<int, Collection<Rectangle>> GetCoordinates(int DocumentId, ref Dictionary<int, PageInfo> PageAttributes)
Yet another alternative is to create an encompassing class that contains the Dictionary and the page information:
class DocumentInfo
{
    public Dictionary<int, Collection<Rectangle>> Coordinates { get; set; }
    public Dictionary<int, PageInfo> PageAttributes { get; set; }
}
and then define:
DocumentInfo GetDocumentInfo(int DocumentId);
I'm leaning towards the last option, but your insights are very much appreciated.
The last option is definitely the best. I've found that, when taking or returning complex data with multiple meanings, creating a complex type to encapsulate this data is the best practice for a number of reasons.
First, your return data probably will change as your design changes. Encapsulating this data in an object allows you to alter what it carries and how your methods operate on this data without altering the interfaces of your objects. Obviously, your data object shouldn't implement an interface; at most, have a base class with the minimum interface and then pass references to the base around.
Second, you may find your data gets complex to the point where you will need to perform validation on it. Rather than have this validation in all the methods of your classes where you act upon this data, you can easily wrap this up in the data class. Single responsibility, etc.
It seems like you need a lot of data out. The last option should be fine, and it is extensible. If you wanted to simplify the Dictionary<,> usage, you could encapsulate things a bit more, but because C# doesn't directly support named indexed properties, you'd need a few classes, unless you just wrap with methods like:
class DocumentInfo {
    Dictionary<int, Collection<Rectangle>> rectangles = ...
    public Collection<Rectangle> GetRectangles(int index) {
        // might want to clone to protect against mutation
        return rectangles[index];
    }
    Dictionary<int, PageInfo> pages = ...
    public PageInfo GetPageInfo(int index) {
        return pages[index];
    }
}
I'm not quite clear what the int is, so I can't say whether this is sensible (so I've just left it alone).
Also, with the ref-parameter option, you probably wouldn't need ref; it would be sufficient to use out, since the method only hands the dictionary back and never reads the caller's value.
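To make the comparison concrete, here is a minimal sketch of the encapsulating class next to the out-parameter variant. The Rectangle struct, the members of PageInfo, the DocumentReader class, and the sample data are all assumptions invented for illustration; real code would fill both dictionaries during the same pass over the actual result set.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical stand-ins for the types mentioned in the question.
public struct Rectangle
{
    public int X, Y, Width, Height;
}

public class PageInfo
{
    public bool Validated { get; set; }
    public int ResolutionDpi { get; set; }
    public bool IsColor { get; set; }
}

public class DocumentInfo
{
    public Dictionary<int, Collection<Rectangle>> Coordinates { get; } =
        new Dictionary<int, Collection<Rectangle>>();
    public Dictionary<int, PageInfo> PageAttributes { get; } =
        new Dictionary<int, PageInfo>();
}

public static class DocumentReader
{
    // Encapsulating-class option: one pass over the data, one return value.
    public static DocumentInfo GetDocumentInfo(int documentId)
    {
        var info = new DocumentInfo();
        // Made-up data standing in for the real result set (two pages).
        for (int page = 1; page <= 2; page++)
        {
            info.Coordinates[page] = new Collection<Rectangle>
            {
                new Rectangle { X = 0, Y = 0, Width = 100, Height = 50 }
            };
            info.PageAttributes[page] = new PageInfo
            {
                Validated = true, ResolutionDpi = 300, IsColor = false
            };
        }
        return info;
    }

    // Parameter option, written with out instead of ref:
    // the caller does not have to initialize pageAttributes first.
    public static Dictionary<int, Collection<Rectangle>> GetCoordinates(
        int documentId, out Dictionary<int, PageInfo> pageAttributes)
    {
        DocumentInfo info = GetDocumentInfo(documentId);
        pageAttributes = info.PageAttributes;
        return info.Coordinates;
    }
}

public static class Program
{
    public static void Main()
    {
        DocumentInfo info = DocumentReader.GetDocumentInfo(42);
        Console.WriteLine(info.Coordinates.Count);
        Console.WriteLine(info.PageAttributes[1].ResolutionDpi);
    }
}
```

With the class, adding a third piece of per-page data later means adding one property; with the out variant it means changing the signature at every call site.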
I need to return 2 values (a string and a point) from a method, and I don't really want to use ref/out as the values should stay together.
I was thinking of using a Dictionary<string, Point>.
My question is: Is dictionary a good choice of data structure if it only has one KeyValuePair? Or are there any other suitable options?
If you don't want to create a named class, you can use a Tuple to return more than one value:
Tuple<int, Point> tuple =
    new Tuple<int, Point>(1, new Point());
return tuple;
You can create your own class. But Tuple<T1, T2> may be convenient. It's just for that sort of thing, when you need to pass around an object containing a few different types.
I'd lean toward creating a class unless it's extremely clear what the tuple is for just by the definition. That way you can give it a name that improves readability. And it can also save a maintenance nuisance if you later determine that there are more than two values. You can just maintain one class instead of replacing Tuple<int, Point> with Tuple<int, Point, Something> in multiple places.
I wouldn't use KeyValuePair because someone looking at it would reasonably assume that there's a dictionary somewhere in the picture, so it would create some confusion. If there are just two values and no dictionary then there is no key.
I'd use a class or a struct to store both values. I prefer this because it keeps the code maintainable and lets you extend the system in the future.
public class MyData {
    public string MyString { get; set; }
    public Point MyPoint { get; set; }
}
public class Storage {
    public MyData retrieveMyData() {
        MyData data = new MyData();
        return data;
    }
}
Whenever you see this you should pause and think "Should this be an object?"
If it's a one off you can use Tuple but most times you'll come up with a situation where the same parameters are used in conjunction again. By that point you'll wish that you had created an object the first time.
By creating an object for the set of parameters you can give it a name, which will increase readability. If you encapsulate the parameters into properties of an object you can also create getter and setter methods, which will allow you to further control access to them and add more functionality if the need comes up in the future.
The main thing is readability. I might call it a NamedPoint which tells anyone reading my code why I've paired the string and the point together. I could later add validation to the name if I wanted it to be a certain length or not start with a number or any number of other things.
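As a rough illustration of the NamedPoint idea, here is a minimal sketch. The Point struct is a hypothetical stand-in for whichever point type the project already uses, and the two validation rules are only the examples mentioned above, not requirements taken from the question.

```csharp
using System;

// Hypothetical stand-in for the project's own Point type.
public struct Point
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }
}

// Pairs a name with a point and keeps the validation in one place.
public sealed class NamedPoint
{
    public string Name { get; }
    public Point Location { get; }

    public NamedPoint(string name, Point location)
    {
        // Example rules only: non-empty and not starting with a digit.
        if (string.IsNullOrEmpty(name))
            throw new ArgumentException("Name must not be empty.", nameof(name));
        if (char.IsDigit(name[0]))
            throw new ArgumentException("Name must not start with a digit.", nameof(name));

        Name = name;
        Location = location;
    }
}

public static class Program
{
    // The method returns both values together, with no ref/out parameters.
    public static NamedPoint FindOrigin()
    {
        return new NamedPoint("origin", new Point(0, 0));
    }

    public static void Main()
    {
        NamedPoint p = FindOrigin();
        Console.WriteLine(p.Name + " at (" + p.Location.X + ", " + p.Location.Y + ")");
    }
}
```

If a third value is needed later, it becomes one more property on NamedPoint rather than a change to every Tuple<...> declaration.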
I am building web services for many different clients to connect to a database of automotive parts. These parts have a wide variety of properties. Different clients will need different subsets of properties to 'do their thing.'
All clients will need at least an ID, a part number, and a name. Some might need prices, some might need URLs to images, etc. The next client might be written years from now and require yet another subset of properties. I'd rather not send more than they need.
I have been building separate 'PartDTO's' with subsets of properties for each of these requirements, and serving them up as separate webservice methods to return the same list of parts but with different properties for each one. Rather than build this up for each client and come up with logical names for the DTO's and methods, I'd like a way for the client to specify what they want. I'm returning JSON, so I was thinking about the client passing me a JSON object listing the properties they want in the result-set:
ret = { ImageUrl: true, RetailPrice: true, ... }
First off, does this make sense?
Second, what I'd rather not lose here is the nice syntax of returning an IEnumerable<DTO> and letting the JSON tools serialize it. I could certainly build up a 'JSON' string and return that, but that seems pretty kludgy.
Suggestions? C# 'dynamic'?
This is a very good candidate for the Entity-Attribute-Value model. Basically you have a table of ID, Name, Value and you allow each customer/facet to store whatever they want... Then when they query you return their name-value pairs and let them use them as they please.
PROS: super flexible. Good for situations where a strong schema adds tons of complexity vs value. Single endpoint for multiple clients.
CONS: Generally disliked pattern, very hard to select from efficiently and also hard to index. However, if all you do is store and return collections of name-value, it should be fine.
I ended up going the dictionary-route. I defined a base class:
public abstract class DictionaryAsDTO<T> : IReadOnlyDictionary<string, object>
{
    protected DictionaryAsDTO(T t, string listOfProperties)
    {
        // Populate an internal dictionary with subset of t's props based on string
    }
}
Then a DTO for Part like so:
public class PartDTO : DictionaryAsDTO<Part>
{
    public PartDTO(Part p, string listOfProperties) : base(p, listOfProperties) {}
    // Override method to populate base's dictionary with Part properties based on
    // listOfProperties
}
Then I wrote a JSON.NET converter for DictionaryAsDTO which emits JSON-y object-properties instead of key-value-pairs.
The web service builds an IEnumerable<PartDTO> based on queries that return IEnumerable<Part> and serializes it.
Voilà!
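For readers who want to see the property-subset step spelled out, here is a minimal sketch using reflection. It is not the DictionaryAsDTO code described above: the Part class, the PropertyProjector name, and the list of always-included properties are assumptions made for illustration, and the JSON serialization step is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical entity with a few of the properties mentioned in the question.
public class Part
{
    public int Id { get; set; }
    public string PartNumber { get; set; }
    public string Name { get; set; }
    public decimal RetailPrice { get; set; }
    public string ImageUrl { get; set; }
}

public static class PropertyProjector
{
    // Properties every client receives, whether requested or not (assumed list).
    private static readonly string[] Required = { "Id", "PartNumber", "Name" };

    // Copies the required properties plus the requested ones into a dictionary.
    // Requested names that do not match a public property are silently ignored.
    public static Dictionary<string, object> Project<T>(T source, IEnumerable<string> requested)
    {
        var names = new HashSet<string>(Required, StringComparer.OrdinalIgnoreCase);
        names.UnionWith(requested);

        var result = new Dictionary<string, object>();
        foreach (PropertyInfo prop in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (names.Contains(prop.Name))
                result[prop.Name] = prop.GetValue(source, null);
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var part = new Part
        {
            Id = 7, PartNumber = "BP-100", Name = "Brake pad",
            RetailPrice = 19.99m, ImageUrl = "brake-pad.png"
        };

        Dictionary<string, object> dto =
            PropertyProjector.Project(part, new[] { "RetailPrice" });

        foreach (KeyValuePair<string, object> pair in dto)
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```

In a real service the PropertyInfo lookups would probably be cached per type, since reflecting over every object in a large result set is comparatively slow.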
If it helps, the following question is in the context of a game I am building.
In a few different places I have the following scenario. There exists a parent class, for this example called Skill, and I have a number of sub-classes implementing the methods from the parent class. There also exists another parent class that we will call Vocation. The skills need to be listed in different sub-classes of Vocation. However, those skills need to be available for anything in the game that uses any given vocation.
My current setup is to have an Enum called Skill.Id, so that Vocation contains a collection of values from that Enum and when an entity in the game takes on that Vocation the collection is passed into another class, called SkillFactory. Skill.Id needs a new entry every time I create a new Skill sub-class, as well as a case in the switch block for the new sub-classes' constructor.
i.e.:
// Skill.Id
enum Id { FireSkill, WaterSkill /* etc. */ }
// SkillFactory
public static Skill Create(Skill.Id id)
{
    switch (id)
    {
        case Skill.Id.FireSkill:
            return new FireSkill();
        // etc.
        default:
            throw new ArgumentOutOfRangeException(nameof(id));
    }
}
This works perfectly fine, but using the enum and switch block as a go-between feels like more overhead than I need to solve this problem. Is there a more elegant way to create instances of these Skill sub-classes that still allows Vocation to contain a collection identifying the skills it can use?
Edit: I am fine throwing out the enum and associated switch block, so long as Vocation can contain a collection that allows arbitrary instantiation of the Skill sub-classes.
You can make a Dictionary<Skill.Id, Func<Skill>> and use it to instantiate.
In a static constructor (the dictionary must be static because Create is static):
private static readonly Dictionary<Skill.Id, Func<Skill>> creationMethods = new Dictionary<Skill.Id, Func<Skill>>();
static SkillFactory()
{
    creationMethods.Add(Skill.Id.FireSkill, () => new FireSkill());
    creationMethods.Add(Skill.Id.WaterSkill, () => new WaterSkill());
}
Then, your Create method becomes:
public static Skill Create(Skill.Id id)
{
return creationMethods[id]();
}
Granted, this isn't much better - except that it does allow you to extend this to other functionality that's per ID without duplicating the switch block if that becomes a requirement. (Just put more into the value side of the Dictionary.)
That being said, in the long run, getting rid of the enum entirely can be a good benefit for extensibility. This will require a more elaborate change, however. For example, if you used MEF, you could import a set of SkillFactory types at runtime and associate them to a name (via metadata) via a single ImportMany. This would allow you to add new Skill subclasses without changing your factory, and refer to them by name or some other mechanism.
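To show what dropping the enum could look like without bringing in MEF, here is a minimal hand-rolled sketch: a registry keyed by name, with Vocation holding a list of those names. This is a plain substitute for the MEF approach, not MEF itself, and SkillRegistry, Register, Describe, and SkillNames are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;

public abstract class Skill
{
    public abstract string Describe();
}

public class FireSkill : Skill
{
    public override string Describe() { return "fire"; }
}

public class WaterSkill : Skill
{
    public override string Describe() { return "water"; }
}

// Maps a skill name to a delegate that creates a new instance.
public static class SkillRegistry
{
    private static readonly Dictionary<string, Func<Skill>> Factories =
        new Dictionary<string, Func<Skill>>(StringComparer.Ordinal);

    // Adding a new Skill subclass needs one Register call, not a new enum entry and case.
    public static void Register<T>(string name) where T : Skill, new()
    {
        Factories[name] = () => new T();
    }

    public static Skill Create(string name)
    {
        Func<Skill> factory;
        if (!Factories.TryGetValue(name, out factory))
            throw new ArgumentException("Unknown skill: " + name, "name");
        return factory();
    }
}

// A vocation only stores identifiers; instances are created on demand.
public class Vocation
{
    public List<string> SkillNames { get; } = new List<string>();
}

public static class Program
{
    public static void Main()
    {
        SkillRegistry.Register<FireSkill>("fire");
        SkillRegistry.Register<WaterSkill>("water");

        var mage = new Vocation();
        mage.SkillNames.Add("fire");
        mage.SkillNames.Add("water");

        foreach (string name in mage.SkillNames)
            Console.WriteLine(SkillRegistry.Create(name).Describe());
    }
}
```

The trade-off is that a misspelled name is only caught at runtime, whereas the enum version fails at compile time.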
If this creation function is going to be called so often that a "case" produces noticeable overhead, a dictionary with enum keys will generate a lot of garbage.
In the context of an XNA game, it can be worse than the "case".
"If you use an enum type as a dictionary key, internal dictionary operations will cause boxing. You can avoid this by using integer keys, and casting your enum values to ints before adding them to the dictionary." Extracted from here
You can use a simple array and cast the enum to int for indexing:
enum Id { FireSkill = 0, WaterSkill = 1 /* etc. */ }
Func<Skill>[] CreationMethods = new Func<Skill>[]
{
    () => new FireSkill(),
    () => new WaterSkill(),
};
// Usage: Skill skill = CreationMethods[(int)id]();
Suppose I have a table in my database that is made up of the following columns, 3 of which uniquely identify the row:
CREATE TABLE [dbo].[Lines]
(
[Attr1] [nvarchar](10) NOT NULL,
[Attr2] [nvarchar](10) NOT NULL,
[Attr3] [nvarchar](10) NOT NULL,
PRIMARY KEY (Attr1, Attr2, Attr3)
)
Now, I have an object in my application that represents one of those lines. It has three properties on it that correspond to the three Attr columns in the database.
public class Line
{
    public Line(string attr1, string attr2, string attr3)
    {
        this.Attr1 = attr1;
        this.Attr2 = attr2;
        this.Attr3 = attr3;
    }
    public string Attr1 { get; private set; }
    public string Attr2 { get; private set; }
    public string Attr3 { get; private set; }
}
There's a second object in the application that stores a collection of these line objects.
Here's the question: What is the most appropriate design when referencing an individual line in this collection (from a caller's perspective)? Should the caller be responsible for tracking the index of the line he's changing and then just use that index to modify a line directly in the collection? Or...should there be method(s) on the object that says something to the effect of:
public Line GetLine(string attr1, string attr2, string attr3)
{
    // return the line from the collection
}
public void UpdateLine(Line line)
{
    // update the line in the collection
}
We're having a debate on our team, because some of us think that it makes more sense to reference a line using its internal index in the collection, and others think there's no reason to introduce another internal key when we can already uniquely identify a line based on the three attributes.
Thoughts?
Your object model should be designed so that it makes sense to an object consumer. It should not be tied to the data model to the greatest extent practical.
It sounds like it is more intuitive for the object consumer to think in terms of the three attributes. If there are no performance concerns that speak to the contrary, I would let the object consumer work with those attributes and not concern him with the internal workings of data storage (i.e. not require them to know or care about an internal index).
I think the base question you are encountering is how much control the user of your API should have over your data, and what exactly you expose. This varies wildly depending on what you want to do, and either can be appropriate.
The question is who is responsible for the information you wish to update. From what you have posted, it appears that the Line object is responsible for the information, and thus I would advocate a syntax such as Collection.GetLine(attr1, attr2, attr3).UpdateX(newX) and so forth.
However, it may be that the collection actually has a greater responsibility for that information, in which case Collection.UpdateX(line, newX) would make more sense (alternatively, replace the 'line' arg with 'attr1, attr2, attr3').
Thirdly, it is possible, though unlikely (and rarely the best design IMHO), that the API user is most responsible for the information, in which case the approach you mentioned, where the user tracks Line indices and directly modifies the information, would be appropriate.
You do not want the calling object to "track the index of the line he's changing" - ever. This makes your design way too interdependent, pushes object-level implementation decisions off onto the users of the object, makes testing more difficult, and can result in difficult to diagnose bugs when you accidentally update one object (due to key duplications) when you meant to update another.
Go back to OO discipline: the Line object that you are returning from the GetLine method should be acting like a real, first class "thing."
The complication, of course, comes if you change one of the fields in the line object that is used as part of your index. If you change one of these fields, you won't be able to find the original in the database when you go to do your update. Well, that is what data hiding in objects is all about, no?
Here is my suggestion: have three untouchable fields in the object that correspond to its state in the database ("originalAttr1", "originalAttr2", "originalAttr3"). Also, have three properties ("attr1", "attr2", "attr3") that start out with the same values as the originals but that are settable. Your getters and setters will work on the attr properties only. When you "Update" (or perform other actions that go back to the underlying source), use the originalAttrX values as your keys (along with uniqueness checks, etc.).
This might seem like a bit of work but it is nothing compared to the mess that you'll get into if you push all of these implementation decisions off on the consumer of the object! Then you'll have all of the various consumers trying to (redundantly) apply the correct logic in a consistent manner - along with many more paths to test.
One more thing: this kind of stuff is done all the time in data access libraries and so is a quite common coding pattern.
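The original-versus-current key idea can be sketched roughly as follows. OriginalKey, KeyChanged, and AcceptChanges are hypothetical helper names introduced for this example, and the database update itself is left out; the sketch only shows how the object keeps track of the key it was loaded with.

```csharp
using System;

public class Line
{
    // Key values as they currently exist in the database; callers cannot set these.
    private string originalAttr1;
    private string originalAttr2;
    private string originalAttr3;

    // Editable values that callers work with.
    public string Attr1 { get; set; }
    public string Attr2 { get; set; }
    public string Attr3 { get; set; }

    public Line(string attr1, string attr2, string attr3)
    {
        originalAttr1 = Attr1 = attr1;
        originalAttr2 = Attr2 = attr2;
        originalAttr3 = Attr3 = attr3;
    }

    // Hypothetical helper: the key a data layer would use to locate the stored row.
    public Tuple<string, string, string> OriginalKey
    {
        get { return Tuple.Create(originalAttr1, originalAttr2, originalAttr3); }
    }

    // True when any part of the key has been edited since load or last save.
    public bool KeyChanged
    {
        get
        {
            return Attr1 != originalAttr1
                || Attr2 != originalAttr2
                || Attr3 != originalAttr3;
        }
    }

    // Hypothetical helper: call after a successful update so the originals catch up.
    public void AcceptChanges()
    {
        originalAttr1 = Attr1;
        originalAttr2 = Attr2;
        originalAttr3 = Attr3;
    }
}

public static class Program
{
    public static void Main()
    {
        var line = new Line("A", "B", "C");
        line.Attr2 = "X";                           // caller edits part of the key
        Console.WriteLine(line.KeyChanged);         // the edit is detected
        Console.WriteLine(line.OriginalKey.Item2);  // the stored row is still found via the old value
        line.AcceptChanges();                       // after the update succeeds
        Console.WriteLine(line.KeyChanged);
    }
}
```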
"What is the most appropriate design when referencing an individual line in this collection (from a caller's perspective)?"
If the caller is 'thinking' in terms of the three attributes, I would consider adding an indexer to your collection class that's keyed on the three attributes, something like:
public Line this[string attr1, string attr2, string attr3] {
get {
// code to find the appropriate line...
}
}
Indexers are the go-to spot for "How Do I Fetch Data From This Collection" and, IMO, are the most intuitive accessor to any collection.
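One way the body of such an indexer could be filled in is with a dictionary keyed on the three attributes. This is a sketch under assumptions: LineCollection, Add, and TryGetLine are hypothetical names, and the lookup compares strings ordinally, which may not match the collation the database uses for the nvarchar columns.

```csharp
using System;
using System.Collections.Generic;

public class Line
{
    public string Attr1 { get; private set; }
    public string Attr2 { get; private set; }
    public string Attr3 { get; private set; }

    public Line(string attr1, string attr2, string attr3)
    {
        Attr1 = attr1;
        Attr2 = attr2;
        Attr3 = attr3;
    }
}

public class LineCollection
{
    // Tuple compares its items for equality, so it works as a composite dictionary key.
    private readonly Dictionary<Tuple<string, string, string>, Line> lines =
        new Dictionary<Tuple<string, string, string>, Line>();

    // Throws ArgumentException if a line with the same three attributes already exists.
    public void Add(Line line)
    {
        lines.Add(Tuple.Create(line.Attr1, line.Attr2, line.Attr3), line);
    }

    // Throws KeyNotFoundException when no line matches.
    public Line this[string attr1, string attr2, string attr3]
    {
        get { return lines[Tuple.Create(attr1, attr2, attr3)]; }
    }

    // Non-throwing variant for callers that expect misses.
    public bool TryGetLine(string attr1, string attr2, string attr3, out Line line)
    {
        return lines.TryGetValue(Tuple.Create(attr1, attr2, attr3), out line);
    }
}

public static class Program
{
    public static void Main()
    {
        var collection = new LineCollection();
        collection.Add(new Line("A", "B", "C"));

        Line found = collection["A", "B", "C"];
        Console.WriteLine(found.Attr3);

        Line missing;
        Console.WriteLine(collection.TryGetLine("A", "B", "Z", out missing));
    }
}
```

The caller never sees a positional index, so the collection is free to change how it stores lines internally.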
I always prefer to use a single ID column even if there is a composite key that could be used. I would just add an identity column to the table and use that for lookups instead. It would also be faster, because a query on a single int column performs better than one on a key spanning three text columns.
Having a user maintain some sort of line index to look up a line doesn't seem very good to me. So if I had to pick between the two options you posed, I would use the composite key.
If the client is retrieving the Line object using three string values, then that's what you pass to the getter method. From that point on, everything necessary to update the object in the database (such as a unique row ID) should be hidden within the Line object itself.
That way all the gory details are hidden from the client, which prevents the client from damaging them and also shields the client from any future changes you might make to the DB access within the Line object.
So if I have a method of parsing a text file and returning a list of a list of key value pairs, and want to create objects from the kvps returned (each list of kvps represents a different object), what would be the best method?
The first method that pops into mind is pretty simple, just keep a list of keywords:
private const string NAME = "name";
private const string PREFIX = "prefix";
and check against the keys I get for the constants I want, defined above. This is a fairly core piece of the project I'm working on though, so I want to do it well; does anyone have any more robust suggestions (not saying there's anything inherently un-robust about the above method - I'm just asking around)?
Edit:
More details have been asked for. I'm working on a little game in my spare time, and I am building up the game world with configuration files. There are four: one defines all creatures, another defines all areas (and their locations in a map), another all objects, and a final one defines various configuration options and things that don't fit elsewhere. With the first three configuration files, I will be creating objects based on the content of the files. It will be quite text-heavy, so there will be a lot of strings: names, plurals, prefixes, that sort of thing. The configuration values are all like so:
-
key: value
key: value
-
key: value
key: value
-
Where the '-' line denotes a new section/object.
Take a deep look at the XmlSerializer. Even if you are constrained to not use XML on-disk, you might want to copy some of its features. This could then look like this:
public class DataObject {
[Column("name")]
public string Name { get; set; }
[Column("prefix")]
public string Prefix { get; set; }
}
Be careful though to include some kind of format version in your files, or you will be in hell's kitchen come the next format change.
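The attribute-driven mapping shown in the DataObject example could be wired up roughly like this. XmlSerializer itself is not used here; ColumnAttribute and KeyValueMapper are hypothetical types written for this sketch, and only string properties are handled.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical attribute that ties a property to a key in the config file.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
    public string Name { get; }
    public ColumnAttribute(string name) { Name = name; }
}

public class DataObject
{
    [Column("name")]
    public string Name { get; set; }
    [Column("prefix")]
    public string Prefix { get; set; }
}

public static class KeyValueMapper
{
    // Builds one object from one section's key/value pairs.
    // Only string properties are supported in this sketch.
    public static T Map<T>(IEnumerable<KeyValuePair<string, string>> pairs) where T : new()
    {
        // Index the annotated properties by their configured key.
        var byKey = new Dictionary<string, PropertyInfo>(StringComparer.OrdinalIgnoreCase);
        foreach (PropertyInfo prop in typeof(T).GetProperties())
        {
            var attr = (ColumnAttribute)Attribute.GetCustomAttribute(prop, typeof(ColumnAttribute));
            if (attr != null)
                byKey[attr.Name] = prop;
        }

        var result = new T();
        foreach (KeyValuePair<string, string> pair in pairs)
        {
            PropertyInfo target;
            if (!byKey.TryGetValue(pair.Key, out target))
                throw new ArgumentException("Unknown key: " + pair.Key);
            target.SetValue(result, pair.Value, null);
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var section = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("name", "goblin"),
            new KeyValuePair<string, string>("prefix", "a")
        };

        DataObject obj = KeyValueMapper.Map<DataObject>(section);
        Console.WriteLine(obj.Prefix + " " + obj.Name);
    }
}
```

Unknown keys throw here so that typos in hand-written files surface early; ignoring them instead would be an equally defensible choice.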
Making a lot of unwarranted assumptions, I think that the best approach would be to create a Factory that will receive the list of key value pairs and return the proper object or throw an exception if it's invalid (or create a dummy object, or whatever is better in the particular case).
private class Factory {
    // Named Create because a member cannot share its enclosing type's name.
    public static IConfigurationObject Create(List<string> keyValuePair) {
        switch (keyValuePair[0]) {
            case "x":
                return new x(keyValuePair[1]);
            /* etc. */
            default:
                throw new ArgumentException("Wrong parameter in the file");
        }
    }
}
The strongest assumption here is that all your objects can be treated, at least partly, the same way (i.e., they implement the same interface (IConfigurationObject in the example) or belong to the same inheritance tree).
If they don't, then it depends on your program flow and what you are doing with them. But nonetheless, they should :)
EDIT: Given your explanation, you could have one Factory per file type. The switch in it would be the authoritative source on the allowed types per file type, and the types probably share something in common. Reflection is possible, but it's riskier because it's less obvious and less self-documenting than this approach.
What do you need objects for? The way you describe it, you'll use them as some kind of (key-wise) restricted map anyway. If you do not need some kind of inheritance, I'd simply wrap a map-like structure in an object like this:
[java-inspired pseudo-code:]
class RestrictedKVDataStore {
const ALLOWED_KEYS = new Collection('name', 'prefix');
Map data = new Map();
void put(String key, Object value) {
if (ALLOWED_KEYS.contains(key))
data.put(key, value)
}
Object get(String key) {
return data.get(key);
}
}
You could create an interface that matched the column names, and then use the Reflection.Emit API to create a type at runtime that gave access to the data in the fields.
EDIT:
Scratch that; this still applies, but I think what you're doing is reading a configuration file and parsing it into this:
List<List<KeyValuePair<String,String>>> itemConfig =
new List<List<KeyValuePair<String,String>>>();
In this case, we can still use a reflection factory to instantiate the objects, I'd just pass in the nested inner list to it, instead of passing each individual key/value pair.
OLD POST:
Here is a clever little way to do this using reflection:
The basic idea:
Use a common base class for each Object class.
Put all of these classes in their own assembly.
Put this factory in that assembly too.
Pass in the KeyValuePair that you read from your config, and in return it finds the class that matches KV.Key and instantiates it with KV.Value
public class KeyValueToObjectFactory
{
    private Dictionary<string, Type> _kvTypes = new Dictionary<string, Type>();
    public KeyValueToObjectFactory()
    {
        // Preload the Types into a dictionary so we can look them up later.
        // Obviously, you want to reuse the factory to minimize overhead, so don't
        // do something stupid like instantiate a new factory in a loop.
        foreach (Type type in typeof(KeyValueToObjectFactory).Assembly.GetTypes())
        {
            if (type.IsSubclassOf(typeof(KVObjectBase)))
            {
                _kvTypes[type.Name.ToLower()] = type;
            }
        }
    }
    public KVObjectBase CreateObjectFromKV(KeyValuePair<string, string> kv)
    {
        // KeyValuePair is a struct, so check the key rather than the pair itself.
        if (kv.Key != null)
        {
            // Type names were stored lower-cased, so normalize the key the same way.
            string kvName = kv.Key.ToLower();
            // If the Type information is in our Dictionary, instantiate a new instance of that class.
            Type kvType;
            if (_kvTypes.TryGetValue(kvName, out kvType))
            {
                return (KVObjectBase)Activator.CreateInstance(kvType, kv.Value);
            }
            else
            {
                throw new ArgumentException("Unrecognized KV Pair");
            }
        }
        else
        {
            return null;
        }
    }
}
#David:
I already have the parser (and most of these will be hand written, so I decided against XML). But that looks like a really nice way of doing it; I'll have to check it out. Excellent point about versioning too.
#Argelbargel:
That looks good too. :')
"...This is a fairly core piece of the project I'm working on though..."
Is it really?
It's tempting to just abstract it and provide a basic implementation with the intention of refactoring later on.
Then you can get on with what matters: the game.
Just a thought
Is it really?
Yes; I have thought this out. Far be it from me to do more work than necessary. :')