MongoDb and self referencing objects - c#

I am just starting to learn about MongoDB and was wondering if I am doing something wrong. I have two objects:
public class Part
{
public Guid Id;
public IList<Material> Materials;
}
public class Material
{
public Guid MaterialId;
public Material ParentMaterial;
public IList<Material> ChildMaterials;
public string Name;
}
When I try to save this particular object graph I receive a stack overflow error because of the circular reference. My question is, is there a way around this? In WCF I am able to set "IsReference" to true on the DataContract attribute and it serializes just fine.

What driver are you using?
In NoRM you can create a DbReference like so
public DbReference<Material> ParentMaterial;
Mongodb-csharp does not offer strongly typed DbReferences, but you can still use them.
public DBRef ParentMaterial;
You can follow the reference with Database.FollowReference(ParentMaterial).
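The idea behind a database reference can also be sketched without any driver: store the parent and children by id rather than as nested objects, so the serializer never meets a cycle. This is only an illustration under that assumption. `MaterialDocument`, `MaterialStore` and `FindParent` are hypothetical names, and the dictionary stands in for a real collection lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical storage shape: the tree is linked by ids,
// so there is no object cycle to serialize.
public class MaterialDocument
{
    public Guid MaterialId { get; set; }
    public Guid? ParentMaterialId { get; set; }   // null for a root material
    public List<Guid> ChildMaterialIds { get; set; } = new List<Guid>();
    public string Name { get; set; }
}

public static class MaterialStore
{
    // Stand-in for a collection lookup; a real driver call would go here.
    public static MaterialDocument FindParent(
        IDictionary<Guid, MaterialDocument> collection, MaterialDocument material)
    {
        if (material.ParentMaterialId == null) return null;
        return collection.TryGetValue(material.ParentMaterialId.Value, out var parent)
            ? parent
            : null;
    }
}
```

The trade-off is an extra lookup whenever the parent or a child is actually needed, which is what following a reference amounts to.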

Just for future reference, things like references between objects which are not embedded within a sub-document structure are handled extremely well by a NoSQL ODB, which is generally designed to deal with transparent relations in arbitrarily complex object models.
If you are familiar with Hibernate, imagine that without any mapping file AT ALL and orders of magnitude faster performance because there is no runtime JOIN behind the scenes, all relations are resolved with the speed of a b-tree lookup.
Here is a video from Versant (disclosure - I work for them), so you can see how it works.
This is a little boring in the beginning, but shows every single step to take a Java application and make it persistent in an ODB... then make it fault tolerant, distributed, do some parallel queries, optimize cache load, etc...
If you want to skip to the cool part, jump about 20 minutes in; you will skip the building of the application and just see how easy it is to dynamically evolve the schema and add distribution and fault tolerance to any existing application.

If you want to store object graphs with relationships between them requiring multiple 'joins' to get to the answer you are probably better off with a SQL-style database. The document-centric approach of MongoDB and others would probably structure this rather differently.
Take a look at MongoDB nested sets which suggests some ways to represent data like this.

I was able to accomplish exactly what I needed by using a modified driver from NoRM mongodb.

Related

How to store information offline. C#, Unity

I'm working on an application in unity that solves chemical problems. I need to store information about each chemical element offline. For example: hydrogen [mass 1, group 1...], oxygen[mass 16, group 6...] and so on. What do I need to use?
Probably the simplest solution would be to use a serialization library, like Json.NET. These can convert your objects to a serialized stream that can be saved to a file. Attributes can typically be used to control how the object will be serialized.
The other major option is to use a database, either a stand-alone database like Postgres, or an in-process database like SQLite. The latter makes things like deployment easier, but introduces some limitations, like not supporting multiple concurrent applications. In either case you would typically use an "Object Relational Mapper" (ORM), like Entity Framework. This is able to convert your objects directly to database tables.
Files are typically simpler to use, and suitable if you want to store a few larger blobs of data that rarely change. Databases are more suitable if you have many smaller objects that you want to search among, or when persisting data more frequently.
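As a rough sketch of the file-based option, the following saves and loads a list of elements as JSON. It uses `System.Text.Json` from the .NET base library rather than Json.NET, purely so the example has no external dependency. `ElementData` and `ElementFile` are hypothetical names, and whether this library is available in a given Unity version is an assumption to verify.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Plain data holder for one chemical element.
public class ElementData
{
    public string Name { get; set; }
    public int Mass { get; set; }
    public int Group { get; set; }
}

public static class ElementFile
{
    // Serializes the whole list to a JSON text file.
    public static void Save(string path, List<ElementData> elements)
    {
        File.WriteAllText(path, JsonSerializer.Serialize(elements));
    }

    // Reads the file back into a list of elements.
    public static List<ElementData> Load(string path)
    {
        return JsonSerializer.Deserialize<List<ElementData>>(File.ReadAllText(path));
    }
}
```

Since element data never changes at runtime, the file could be written once during development and only loaded by the application.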
Note that this is general advice, Unity might have some built in persistence that might or might not be suitable for your particular case.
ScriptableObjects are a great fit for this situation:
[CreateAssetMenu]
public class Element : ScriptableObject
{
[SerializeField]
private int mass;
[SerializeField]
private int group;
public int Mass => mass;
public int Group => group;
}
You can create an asset to hold information about each element.
The workflow is: create the scriptable object, add an asset for it from the menu, then set the desired data on the element.

Adding side notes to each property in MongoDB document

I have a collection of documents. Each document has around 20 different properties with different data types (e.g. Int, Double, String).
I am searching for an efficient way or the appropriate way to add side notes to each property.
My thought (I am using C# to model the document structure) is, for each property, instead of:
public int PageRank {get; set; }
to use:
public Dictionary<int, string> PageRank {get; set;}
This means that each item in the document is a collection of both the value and the string for the side note.
The side notes will be seen at the front-end by the user.
Any better implementation?
Idan, for performance reasons, you should consider your use case from the MongoDB point of view -- not from the object-oriented language point of view. The way it ends up looking in C# is an afterthought -- it's the DB performance that counts. So, when querying your documents, if the side notes are mostly not needed, it will probably be better to place them into a separate collection, thus reducing the size of each document and enabling MongoDB to read more of them into the available memory. If the user does need to look at the side notes, you would fetch them with a separate query. You know your usage scenario better, so it's up to you to decide how to do this, but it's these kinds of design decisions that you need to concern yourself with -- and the C# code will be shaped according to your schema.
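To make the separate-collection idea concrete, here is one possible shape for the two document types. It is a sketch only: `PageDocument`, `PageNotesDocument`, `Score` and `NoteFor` are hypothetical names, and how the two collections are queried depends on the driver in use.

```csharp
using System.Collections.Generic;

// Main document: only the values that most queries need.
public class PageDocument
{
    public string Id { get; set; }
    public int PageRank { get; set; }
    public double Score { get; set; }
}

// Hypothetical companion document, stored in its own collection
// and loaded only when the user asks to see the notes.
public class PageNotesDocument
{
    public string PageId { get; set; }

    // Key: property name on PageDocument; value: the side note shown to the user.
    public Dictionary<string, string> Notes { get; set; } = new Dictionary<string, string>();

    // Returns the note for a property, or null when none was recorded.
    public string NoteFor(string propertyName)
    {
        return Notes.TryGetValue(propertyName, out var note) ? note : null;
    }
}
```

Keeping the values as plain `int` and `double` in the main document also means they can be indexed and queried directly, which a value-plus-note dictionary per property would make harder.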

.NET refactoring, DRY. dual inheritance, data access and separation of concerns

Back story:
So I've been stuck on an architecture problem for the past couple of nights on a refactor I've been toying with. Nothing important, but it's been bothering me. It's actually an exercise in DRY, and an attempt to take it to such an extreme that the DAL architecture is completely DRY. It's a completely philosophical/theoretical exercise.
The code is based in part on one of @JohnMacIntyre's refactorings which I recently convinced him to blog about at http://whileicompile.wordpress.com/2010/08/24/my-clean-code-experience-no-1/. I've modified the code slightly, as I tend to, in order to take the code one level further - usually, just to see what extra mileage I can get out of a concept... anyway, my reasons are largely irrelevant.
Part of my data access layer is based on the following architecture:
abstract public class AppCommandBase : IDisposable { }
This contains basic stuff, like creation of a command object and cleanup after the AppCommand is disposed of. All of my command base objects derive from this.
abstract public class ReadCommandBase<T, ResultT> : AppCommandBase
This contains basic stuff that affects all read-commands - specifically in this case, reading data from tables and views. No editing, no updating, no saving.
abstract public class ReadItemCommandBase<T, FilterT> : ReadCommandBase<T, T> { }
This contains some more basic generic stuff - like definition of methods that will be required to read a single item from a table in the database, where the table name, key field name and field list names are defined as required abstract properties (to be defined by the derived class).
public class MyTableReadItemCommand : ReadItemCommandBase<MyTableClass, int?> { }
This contains specific properties that define my table name, the list of fields from the table or view, the name of the key field, a method to parse the data out of the IDataReader row into my business object and a method that initiates the whole process.
Now, I also have this structure for my ReadList...
abstract public class ReadListCommandBase<T> : ReadCommandBase<T, IEnumerable<T>> { }
public class MyTableReadListCommand : ReadListCommandBase<MyTableClass> { }
The difference being that the List classes contain properties that pertain to list generation (i.e. PageStart, PageSize, Sort and returns an IEnumerable) vs. return of a single DataObject (which just requires a filter that identifies a unique record).
Problem:
I'm hating that I've got a bunch of properties in my MyTableReadListCommand class that are identical in my MyTableReadItemCommand class. I've thought about moving them to a helper class, but while that may centralize the member contents in one place, I'll still have identical members in each of the classes, that instead point to the helper class, which I still dislike.
My first thought was dual inheritance would solve this nicely, even though I agree that dual inheritance is usually a code smell - but it would solve this issue very elegantly. So, given that .NET doesn't support dual inheritance, where do I go from here?
Perhaps a different refactor would be more suitable... but I'm having trouble wrapping my head around how to sidestep this problem.
If anyone needs a full code base to see what I'm harping on about, I've got a prototype solution on my DropBox at http://dl.dropbox.com/u/3029830/Prototypes/Prototype%20-%20DAL%20Refactor.zip. The code in question is in the DataAccessLayer project.
P.S. This isn't part of an ongoing active project, it's more a refactor puzzle for my own amusement.
Thanks in advance folks, I appreciate it.
Separate the result processing from the data retrieval. Your inheritance hierarchy is already more than deep enough at ReadCommandBase.
Define an interface IDatabaseResultParser. Implement ItemDatabaseResultParser and ListDatabaseResultParser, both with a constructor parameter of type ReadCommandBase ( and maybe convert that to an interface too ).
When you call IDatabaseResultParser.Value() it executes the command, parses the results and returns a result of type T.
Your commands focus on retrieving the data from the database and returning them as tuples of some description (actual Tuples, an array of arrays, etc.); your parser focuses on converting the tuples into objects of whatever type you need. See NHibernate's IResultTransformer for an idea of how this can work (and it's probably a better name than IDatabaseResultParser too).
Favor composition over inheritance.
Having looked at the sample I'll go even further...
Throw away AppCommandBase - it adds no value to your inheritance hierarchy as all it does is check that the connection is not null and open and creates a command.
Separate query building from query execution and result parsing - now you can greatly simplify the query execution implementation as it is either a read operation that returns an enumeration of tuples or a write operation that returns the number of rows affected.
Your query builder could all be wrapped up in one class to include paging / sorting / filtering, however it may be easier to build some form of limited structure around these so you can separate paging and sorting and filtering. If I was doing this I wouldn't bother building the queries, I would simply write the sql inside an object that allowed me to pass in some parameters ( effectively stored procedures in c# ).
So now you have IDatabaseQuery / IDatabaseCommand / IResultTransformer and almost no inheritance =)
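A minimal sketch of that split might look like the following. The interface names come from the answer, but every member signature here is an assumption, and `FakeQuery` is an in-memory stand-in for real query execution. Note that this is not NHibernate's actual `IResultTransformer` interface, only the same idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Executes a query and yields raw rows; knows nothing about business objects.
public interface IDatabaseQuery
{
    IEnumerable<object[]> Execute();
}

// Turns raw rows into whatever result shape the caller wants.
public interface IResultTransformer<TResult>
{
    TResult Transform(IEnumerable<object[]> rows);
}

public class ListTransformer<T> : IResultTransformer<IEnumerable<T>>
{
    private readonly Func<object[], T> mapRow;
    public ListTransformer(Func<object[], T> mapRow) { this.mapRow = mapRow; }
    public IEnumerable<T> Transform(IEnumerable<object[]> rows) => rows.Select(mapRow).ToList();
}

public class ItemTransformer<T> : IResultTransformer<T>
{
    private readonly Func<object[], T> mapRow;
    public ItemTransformer(Func<object[], T> mapRow) { this.mapRow = mapRow; }
    // A single item is the first row, or default(T) when nothing matched.
    public T Transform(IEnumerable<object[]> rows) => rows.Select(mapRow).FirstOrDefault();
}

// In-memory stand-in for a real query, for illustration only.
public class FakeQuery : IDatabaseQuery
{
    private readonly List<object[]> rows;
    public FakeQuery(params object[][] rows) { this.rows = rows.ToList(); }
    public IEnumerable<object[]> Execute() => rows;
}
```

The row-mapping function is written once per table and handed to either transformer, so the item and list cases no longer need parallel class hierarchies.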
I think the short answer is that, in a system where multiple inheritance has been outlawed "for your protection", strategy/delegation is the direct substitute. Yes, you still end up with some parallel structure, such as the property for the delegate object. But it is minimized as much as possible within the confines of the language.
But let's step back from the simple answer and take a wide view....
Another big alternative is to refactor the larger design structure such that you inherently avoid this situation where a given class consists of the composite of behaviors of multiple "sibling" or "cousin" classes above it in the inheritance tree. To put it more concisely, refactor to an inheritance chain rather than an inheritance tree. This is easier said than done. It usually requires abstracting very different pieces of functionality.
The challenge you'll have in taking this tack that I'm recommending is that you've already made a concession in your design: You're optimizing for different SQL in the "item" and "list" cases. Preserving this as is will get in your way no matter what, because you've given them equal billing, so they must by necessity be siblings. So I would say that your first step in trying to get out of this "local maximum" of design elegance would be to roll back that optimization and treat the single item as what it truly is: a special case of a list, with just one element. You can always try to re-introduce an optimization for single items again later. But wait till you've addressed the elegance issue that is vexing you at the moment.
But you have to acknowledge that any optimization for anything other than the elegance of your C# code is going to put a roadblock in the way of design elegance for the C# code. This trade-off, just like the "memory-space" conjugate of algorithm design, is fundamental to the very nature of programming.
As is mentioned by Kirk, this is the delegation pattern. When I do this, I usually construct an interface that is implemented by the delegator and the delegated class. This reduces the perceived code smell, at least for me.
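A rough sketch of that arrangement, with the delegator and the delegated class sharing one interface, could look like this. The command classes are simplified stand-ins for the ones in the question (no base classes, no data access), and `ITableDefinition`, `MyTableDefinition` and `BuildSql` are hypothetical names.

```csharp
using System.Collections.Generic;

// Shared by the delegate and by every command that forwards to it.
public interface ITableDefinition
{
    string TableName { get; }
    string KeyFieldName { get; }
    IReadOnlyList<string> FieldNames { get; }
}

// The table metadata is written once, here.
public class MyTableDefinition : ITableDefinition
{
    public string TableName => "MyTable";
    public string KeyFieldName => "Id";
    public IReadOnlyList<string> FieldNames { get; } = new[] { "Id", "Name" };
}

// Both command types forward to the same definition instead of repeating its contents.
public class MyTableReadItemCommand : ITableDefinition
{
    private readonly ITableDefinition table = new MyTableDefinition();
    public string TableName => table.TableName;
    public string KeyFieldName => table.KeyFieldName;
    public IReadOnlyList<string> FieldNames => table.FieldNames;

    public string BuildSql() =>
        $"SELECT {string.Join(", ", FieldNames)} FROM {TableName} WHERE {KeyFieldName} = @key";
}

public class MyTableReadListCommand : ITableDefinition
{
    private readonly ITableDefinition table = new MyTableDefinition();
    public string TableName => table.TableName;
    public string KeyFieldName => table.KeyFieldName;
    public IReadOnlyList<string> FieldNames => table.FieldNames;

    public string BuildSql() =>
        $"SELECT {string.Join(", ", FieldNames)} FROM {TableName}";
}
```

The forwarding members are still repeated in each command, which is the residual duplication the answers describe, but the actual table name and field list live in one place.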
I think the simple answer is... Since .NET doesn't support Multiple Inheritance, there is always going to be some repetition when creating objects of a similar type. .NET simply does not give you the tools to re-use some classes in a way that would facilitate perfect DRY.
The not-so-simple answer is that you could use code generation tools, instrumentation, code dom, and other techniques to inject the objects you want into the classes you want. It still creates duplication in memory, but it would simplify the source code (at the cost of added complexity in your code injection framework).
This may seem unsatisfying like the other solutions, however if you think about it, that's really what languages that support MI are doing behind the scenes, hooking up delegation systems that you can't see in source code.
The question comes down to, how much effort are you willing to put into making your source code simple. Think about that, it's rather profound.
I haven't looked deeply at your scenario, but I have some thoughts on the dual-hierarchy problem in C#. To share code in a dual hierarchy, we need a different construct in the language: either a mixin, a trait or a role (as in Perl 6). C# makes it very easy to share code with inheritance (which is not the right operator for code reuse), and very laborious to share code via composition (you know, you have to write all that delegation code by hand).
There are ways to get a kind of mixin in C#, but it's not ideal.
The Oxygene language (an Object Pascal for .NET) also has an interesting feature for interface delegation that can be used to create all that delegating code for you.

How to handle multiple object types when creating a new Type

Been tasked to write some asset tracking software...
Want to try to do this the right way. So I thought that a lot of assets had common fields.
For instance, a computer has a model and a manufacturer which a mobile phone also has.
I would want to store computers, monitors, mobile phones, etc. So I thought the common stuff can be taken into account using an abstract base class. The other properties that do not relate to one another would be stored in the actual class itself.
For instance,
public abstract class Asset {
private string manufacturer;
public string Manufacturer { get; set; }
//more common fields
}
public class Computer : Asset {
public string OS { get; set; }
//more fields pertinent to a PC, but inherit those public properties of Asset base
}
public class Phone : Asset {
//etc etc
}
But I have 2 concerns:
1) If I have a web form asking someone to add an asset, I want to give them, say, a radio button selection of the type of asset they are creating. Something to the effect of:
What are you creating
[]computer
[]phone
[]monitor
[OK] [CANCEL]
And they would select one, but I don't want to end up with code like this:
pseudocode:
select case(RadioButtonControl.Text)
{
case "Computer": Computer c = new Computer(args);
break;
case "Phone": Phone p = new Phone(args);
break;
....
}
This could get ugly....
Problem 2) I want to store this information in one database table with a TypeID field that way when an Insert into the database is done this value becomes the typeid of the row (distinguishes whether it is a computer, a monitor, a phone, etc). Should this typeid field be declared inside the base abstract class as some sort of enum?
Thanks
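For the first concern, one common way to avoid a growing switch statement is a lookup table from the selected label to a creation function. The sketch below assumes parameterless constructors and a hypothetical `AssetFactory` class. It is one possible approach, not something proposed in the thread.

```csharp
using System;
using System.Collections.Generic;

public abstract class Asset
{
    public string Manufacturer { get; set; }
}

public class Computer : Asset
{
    public string OS { get; set; }
}

public class Phone : Asset { }

public static class AssetFactory
{
    // Maps the label shown on the form to a function that creates the matching asset.
    private static readonly Dictionary<string, Func<Asset>> Creators =
        new Dictionary<string, Func<Asset>>(StringComparer.OrdinalIgnoreCase)
        {
            ["Computer"] = () => new Computer(),
            ["Phone"] = () => new Phone(),
        };

    public static Asset Create(string typeName)
    {
        if (!Creators.TryGetValue(typeName, out var create))
            throw new ArgumentException($"Unknown asset type: {typeName}", nameof(typeName));
        return create();
    }
}
```

Adding a new asset type then means adding one dictionary entry, and the same keys could be used to populate the radio buttons.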
My advice is to avoid this general design altogether. Don't use inheritance at all. Object orientation works well when different types of objects have different behavior. For asset tracking, none of the objects really has any behavior at all -- you're storing relatively "dumb" data, none of which does (or should) really do anything at all.
Right now, you seem to be approaching this as an object oriented program with a database as a backing store (so to speak). I'd reverse that: it's a database with a front-end that is (or at least might be) object oriented.
Then again, unless you have some really specific and unusual needs in your asset tracking, chances are that you shouldn't do this at all. There are literally dozens of perfectly reasonable asset tracking packages already on the market. Unless your needs really are pretty unusual, reinventing this particular wheel won't accomplish much.
Edit: I don't intend to advise against using OOP within the application itself at all. Quite the contrary, MVC (for example) works quite well, and I'd almost certainly use it for almost any kind of task like this.
Where I'd avoid OOP would be in the design of the data being stored. Here, you benefit far more from using something like an SQL-based database via something like OLE DB, ODBC, or JDBC.
Using a semi-standard component for this will give you things like scalability and incremental backup nearly automatically, and is likely to make future requirements (e.g. integration with other systems) considerably easier, as you'll have a standardized, well understood layer for access to the data.
Edit2: As far as when to use (or not use) inheritance, one hint (though I'll admit it's no more than that) is to look at behaviors, and whether the hierarchy you're considering really reflects behaviors that are important to your program. In some cases, the data you work with are relatively "active" in the program -- i.e. the behavior of the program itself revolves around the behavior of the data. In such a case, it makes sense (or at least can make sense) to have a relatively tight relationship between the data and the code.
In other cases, however, the behavior of the code is relatively unaffected by the data. I would posit that asset tracking is such a case. To the asset tracking program, it doesn't make much (if any) real difference whether the current item is a telephone, or a radio, or a car. There are a few (usually much broader) classes you might want to take into account -- at least for quite a few businesses, it matters whether assets are considered "real estate", "equipment", "office supplies", etc. These classifications lead to differences in things like how the asset has to be tracked, taxes that have to be paid on it, and so on.
At the same time, two items that fall under office supplies (e.g. paper clips and staples) don't have significantly different behaviors -- each has a description, cost, location, etc. Depending on what you're trying to accomplish, each might have things like a trigger when the quantity falls below a certain level, to let somebody know that it's time to re-order.
One way to summarize that might be to think in terms of whether the program can reasonably work with data for which it wasn't really designed. For asset tracking, there's virtually no chance that you can (or would want to) create a class for every kind of object somebody might decide to track. You need to plan from the beginning on the fact that it's going to be used for all kinds of data you didn't explicitly account for in the original design. Chances are that for the majority of items, you need to design your code to be able to just pass data through, without knowing (or caring) much about most of the content.
Modeling the data in your code makes sense primarily when/if the program really needs to know about the exact properties of the data, and can't reasonably function without it.
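As an illustration of the data-centric view argued above, a single record type with a type label and a bag of pass-through attributes might be enough. `AssetRecord` and its members are hypothetical, and the attribute bag is only one of several ways to store type-specific values.

```csharp
using System.Collections.Generic;

// One shape for every kind of asset; the type is data, not a class.
public class AssetRecord
{
    public int Id { get; set; }
    public string TypeName { get; set; }       // e.g. "Computer", "Phone"
    public string Manufacturer { get; set; }
    public string Model { get; set; }

    // Type-specific values, passed through without the program interpreting them.
    public Dictionary<string, string> Attributes { get; } = new Dictionary<string, string>();
}
```

A new kind of asset is then a new `TypeName` value plus whatever attributes it needs, with no code change.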

Best way to get a list of differences between 2 of the same objects

I would like to generate a list of differences between 2 instances of the same object. Object in question:
public class Step
{
[DataMember]
public StepInstanceInfo InstanceInfo { get; set; }
[DataMember]
public Collection<string> AdHocRules { get; set; }
[DataMember]
public Collection<StepDoc> StepDocs
{...}
[DataMember]
public Collection<StepUsers> StepUsers
{...}
}
What I would like to do is find an intelligent way to return an object that lists the differences between the two instances (for example, let me know that 2 specific StepDocs were added, 1 specific StepUser was removed, and one rule was changed from "Go" to "Stop"). I have been looking into using an MD5 hash, but I can't find any good examples of traversing an object like this and returning a manifest of the specific differences (not just indicating that they are different).
Additional Background: the reason that I need to do this is the API that I am supporting allows clients to SaveStep(Step step)...this works great for persisting the Step object to the db using entities and repositories. I need to raise specific events (like this user was added, etc) from this SaveStep method, though, in order to alert another system (workflow engine) that a specific element in the step has changed.
Thank you.
You'll need a separate object, like StepDiff with collections for removed and added items. The easiest way to do something like this is to copy the collections from each of the old and new objects, so that StepDiff has collectionOldStepDocs and collectionNewStepDocs.
Grab the shorter collection and iterate through it and see if each StepDoc exists in the other collection. If so, delete the StepDoc reference from both collections. Then when you're finished iterating, collectionOldStepDocs contains stepDocs that were deleted and collectionNewStepDocs contains the stepDocs that were added.
From there you should be able to build your manifest in whatever way necessary.
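The added/removed bookkeeping described above can be sketched generically. This version uses hash sets instead of deleting matched items from copies of the collections, which gives the same added and removed lists as long as duplicates within a collection do not matter. `CollectionDiff` and `Differ` are hypothetical names, and equality is whatever the element type's `Equals`/`GetHashCode` (or a supplied comparer) defines.

```csharp
using System.Collections.Generic;

public class CollectionDiff<T>
{
    public List<T> Added { get; } = new List<T>();
    public List<T> Removed { get; } = new List<T>();
}

public static class Differ
{
    // Items only in newItems were added; items only in oldItems were removed.
    public static CollectionDiff<T> Diff<T>(
        IEnumerable<T> oldItems, IEnumerable<T> newItems, IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        var oldSet = new HashSet<T>(oldItems, comparer);
        var newSet = new HashSet<T>(newItems, comparer);

        var diff = new CollectionDiff<T>();
        foreach (var item in newSet)
            if (!oldSet.Contains(item)) diff.Added.Add(item);
        foreach (var item in oldSet)
            if (!newSet.Contains(item)) diff.Removed.Add(item);
        return diff;
    }
}
```

A `StepDiff` could then hold one `CollectionDiff` per collection on `Step` (rules, docs, users), and each entry in `Added` or `Removed` maps naturally to one event raised from `SaveStep`.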
Implementing the IComparable interface in your object may provide you with the functionality you need. This will provide you a custom way to determine differences between objects without resorting to checksums, which really won't help you track what the differences are in usable terms. Otherwise, there's no way to determine equality between two user objects in .NET that I know of. There are some decent examples of the usage of this interface in the help file for Visual Studio. You might be able to glean some directives from the examples on clean ways to compare the properties and store the values in some usable manner for tracking purposes (perhaps a collection, or dictionary object?).
Hope this helps,
Greg
