When using LINQ to SQL, everything automagically works. In my experience, though, going with the flow is not always the best approach; it is better to understand how something works internally so you can use the technique optimally.
So, my question is about LINQ to SQL.
If I do a query and get some database objects, or I create a new one, somehow the DataContext object keeps references to these objects. If something changes in one of the objects, the context object 'knows' what has changed and what needs updating.
If my references to the object are set to null, does this mean that the context object also removes its link to this object? Or is the context object slowly getting filled with tons of references, keeping my database objects from being garbage collected?
If not, how does this work?
Also, isn't it very slow for the context object to always go through the entire list to see what has changed and to update it?
Any insight into how this works would be excellent!
Thanks.
Yes, the context keeps references to the loaded objects. That's one of the reasons it isn't meant to be used as a single instance shared across different requests.
It keeps lists for the inserts/deletes. I am not sure whether it captures updates by adding them to a list, or loops over everything at the end. But you shouldn't be loading large sets of data at a time anyway, because that alone would be a bigger performance hit than any final check it might do on the list.
The DataContext hooks your objects' change-notification events (the entities generated from the .dbml raise PropertyChanging/PropertyChanged) to know when an object is modified. At that point it clones the original object and keeps the clone so it can compare the two versions later, when you do your SubmitChanges().
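A rough sketch of that tracking pattern, purely as illustration (this is not LINQ to SQL's actual implementation, and the names below are made up):

using System.Collections.Generic;
using System.ComponentModel;
using System.Reflection;

// Illustrative sketch only; not LINQ to SQL's real change tracker.
class ChangeTracker
{
    // Tracked entity -> snapshot of its original values.
    private readonly Dictionary<object, object> originals =
        new Dictionary<object, object>();

    public void Track(INotifyPropertyChanging entity)
    {
        entity.PropertyChanging += (sender, e) =>
        {
            // Snapshot before the first modification, so the original
            // values survive for the diff at SubmitChanges() time.
            if (!originals.ContainsKey(sender))
                originals[sender] = ShallowCopy(sender);
        };
    }

    private static object ShallowCopy(object entity)
    {
        // Shallow member-wise copy via reflection, for illustration.
        return typeof(object)
            .GetMethod("MemberwiseClone", BindingFlags.Instance | BindingFlags.NonPublic)
            .Invoke(entity, null);
    }
}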
If my references to the object are set to null, does this mean that the context object also removes its link to this object?
Edit: No. Sorry about my original answer; I had misinterpreted what you had written. In that case the DataContext still has a reference to both objects, but it will remove the relationship between those two objects on the next SubmitChanges().
Be careful though: if you created your own objects instead of using the ones generated from the .dbml, the "magic" that the DataContext performs might not work properly.
Related
I have a private List<Experience> experiences; that tracks generic experiences and experience-specific information. I am using JSON Serialize and Deserialize to save and load my list. When the application starts, the List populates itself with the currently saved information automatically, and when a new experience is added to the list, the new list is saved to file.
A concern that popped into my head, and that I would like to get ahead of: there is nothing to stop the user from, at any point, doing something like experiences = new List<Experience>(); and then adding new experiences to that. Saving this would result in the loss of all previous data, since right now the file is overwritten with each save. In an ideal world this wouldn't happen, but I would like to figure out how to structure my code to guard against it. Essentially, I want to disallow removing items from the List, or setting the List to a new list, after it has already been populated from load.
I have toyed with the idea of just appending the newest addition to the file, but I also want to cover the case where you change properties of an existing item in the List; and given that the list will never be all that large, I thought overwriting would be the simplest approach, as the cost isn't a concern.
Any help in figuring out the best approach is greatly appreciated.
Edit: Looked into the repository pattern (https://www.infoworld.com/article/3107186/application-development/how-to-implement-the-repository-design-pattern-in-c.html) and this seems like a potential approach.
I'm making an assumption that your user in this case is a code-level consumer of your API and that they'll be using the results inside the same memory stack, which is making you concerned about reference mutation.
In this situation, I'd return a copy of the list rather than the list itself on read operations, and on writes allow only add and remove, as maccettura recommends in the comments. You could keep the references to the items in the list intact if you want the consumer to be able to mutate them, but I'd think carefully about whether that's appropriate for your use case, and consider instead requiring the consumer to call an update function (which could be the same as your add function, à la HTTP PUT).
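For example, a minimal sketch of that copy-on-read approach (the class and member names here are mine, not from the question):

using System.Collections.Generic;

public class ExperienceStore
{
    private readonly List<Experience> experiences = new List<Experience>();

    // Readers get a copy; mutating it cannot corrupt the store's list.
    public List<Experience> GetExperiences()
    {
        return new List<Experience>(experiences);
    }

    // Writes go through the store, which persists after every change.
    public void Add(Experience experience)
    {
        experiences.Add(experience);
        Save();
    }

    private void Save()
    {
        // JSON-serialize 'experiences' to file, as in the question.
    }
}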
Sometimes, when you want to make clear that your collection should not be modified, exposing it as an IEnumerable instead of a List may be enough; but if you are writing a serious API, something like the repository pattern seems to be a good solution.
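A minimal sketch of that, assuming the same private list as above (AsReadOnly() is safer than simply typing the return value as IEnumerable<Experience>, since callers could cast the bare list back to List<Experience>):

// Callers can enumerate but cannot Add, Remove, or reassign.
public IEnumerable<Experience> Experiences
{
    get { return experiences.AsReadOnly(); }
}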
I am trying to figure out what happens by tuning my db4o instance but this is really driving me crazy: it simply does not make sense to me.
Essentially I am creating two objects and storing the second (a Device) in an ArrayList held by the first (a User). Then I want to remove the device both from the whole database and from the list where I initially stored it.
Here is a basic list of the operations I am running.
User user = new User("user");
Device device = new Device("device");
user.getDeviceList().add(device); // the device lives in the user's list (accessor name illustrative)
objectContainer.ext().store(user, 5); // object storing depth
objectContainer.commit();
objectContainer.delete(device);
//objectContainer.close();
//objectContainer = new ...
At this point, if I close and reopen the objectContainer, the user's deviceList contains a null entry; if I don't close the container (which a normally running application would avoid doing anyway), the device object is still inside the user object, even though it is no longer in the database.
I just want the object to be removed from both the list and the database, without any null entry left in place. Is this possible? I have tried tuning the configuration a lot (weakReferences, activations, constraints, ...) but without any success.
Why is the object still in the list before reopening, but a null entry after?
If you delete an object, it is deleted from the database.
However, db4o doesn't modify any in-memory content. So before reopening, the collection is the 'old' in-memory representation of that collection; it still contains the reference to the object, and db4o won't remove it.
After reopening, the collection is loaded from the database. Since the object has been removed from the database, a 'null' reference is used for the object that no longer exists.
db4o won't 'magically' remove objects from in-memory collections for you. You have to ensure that the object model is in a consistent state, like any other in-memory object graph.
Here are some tips: http://community.versant.com/documentation/reference/db4o-8.1/net/reference/Content/best_practises/managing_relations.htm
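A minimal sketch of keeping the graph consistent, written against db4o's .NET API (the DeviceList accessor is illustrative):

// Detach the device from the in-memory list first, then persist
// the updated user, and only then delete the device itself.
user.DeviceList.Remove(device);
objectContainer.Ext().Store(user, 5); // store with depth so the updated list is saved
objectContainer.Delete(device);
objectContainer.Commit();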
I need to assign a GUID to objects for managing state at app startup and shutdown.
It looks like I can store the lookup values in a Dictionary<int, Guid> using
dictionary.Add(instance.GetHashCode(), myGUID());
Are there any potential issues to be aware of here?
NOTE: This does NOT need to persist between execution runs; only the GUID does. The flow is:
- create the object
- call GetHashCode() and associate the hash with a new or existing GUID
- before the app terminates, call GetHashCode() again and look up the GUID, to update() or insert() into the persistence engine USING the GUID
The only assumption is that GetHashCode() remains consistent while the process is running. Also, GetHashCode() is always called on the same object type (derived from Window).
Update 2 - here is the bigger picture:
- Create a state machine to store info about WPF user controls (referred to below as UCs) between runs.
- The types of user controls can change over time (added/removed).
- On the very first run there is no prior state; the user interacts with a subset of UCs and modifies their state, which needs to be recreated when the app restarts.
- This state snapshot is taken when the app shuts down normally.
- There can also be multiple instances of a UC type.
- At shutdown, each instance is assigned a GUID and saved along with the type info and the state info.
- All these GUIDs are also stored in a collection.
- At restart, for each GUID: create the object, store the ref/GUID, and restore the state per instance so the app looks exactly as before.
- The user may add or remove UC instances/types and otherwise interact with the system.
- At shutdown, the state is saved again.
- The choices at this point are to remove/delete all prior state and insert the new state info into the persistence layer (a SQL db).
- With observation/analysis over time, it turns out that a lot of instances remain consistent/static and do not change - so their state need not be deleted/inserted again, as the state info is now quite large and stored in a non-local db.
- So only the change delta is persisted.
- To compute the delta, I need to track reference lifetimes.
- These are currently stored as a List<WeakReference> at startup.
- On shutdown, iterate through this list and the actual UCs present on screen, and add/update/delete keys accordingly (see the sketch below).
- Send the delta over to persistence.
Hope the above makes it clear.
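A minimal sketch of that WeakReference bookkeeping, with hypothetical names:

using System;
using System.Collections.Generic;

// Hedged sketch of the tracking described above; not the actual app code.
class ControlTracker
{
    // Each entry pairs a weakly-held control with its assigned GUID,
    // so the tracker itself never keeps a control alive.
    private readonly List<KeyValuePair<WeakReference, Guid>> tracked =
        new List<KeyValuePair<WeakReference, Guid>>();

    public Guid Register(object control)
    {
        var id = Guid.NewGuid();
        tracked.Add(new KeyValuePair<WeakReference, Guid>(new WeakReference(control), id));
        return id;
    }

    // At shutdown: live controls get their state updated/inserted,
    // dead ones (already collected) get their state deleted.
    public void ComputeDelta(Action<Guid> persistLive, Action<Guid> deleteDead)
    {
        foreach (var entry in tracked)
        {
            if (entry.Key.IsAlive)
                persistLive(entry.Value);
            else
                deleteDead(entry.Value);
        }
    }
}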
So now the question is: why not just store the hash code (of the UserControl only) instead of a WeakReference, and eliminate the test for null references while iterating through the list?
Update 3 - thanks all, going to use WeakReference after all.
Use GetHashCode to balance a hash table. That's what it's for. Do not use it for some other purpose that it was not designed for; that's very dangerous.
You appear to be assuming that a hash code will be unique. Hash codes don't work like that. See Eric Lippert's blog post on Guidelines and rules for GetHashCode for more details, but basically you should only ever make the assumptions which are guaranteed for well-behaved types - namely that if two objects have different hash codes, they're definitely unequal; if they have the same hash code, they may be equal, but may not be.
EDIT: As noted, you also shouldn't persist hash codes between execution runs. There's no guarantee they'll be stable in the face of restarts. It's not really clear exactly what you're doing, but it doesn't sound like a good idea.
EDIT: Okay, you've now noted that it won't be persistent, so that's a good start - but you still haven't dealt with the possibility of hash code collisions. Why do you want to call GetHashCode() at all? Why not just add the reference to the dictionary?
The quick and easy fix seems to be
var dict = new Dictionary<InstanceType, Guid>();
dict.Add(instance, myGUID());
Of course you need to implement InstanceType.Equals correctly if it isn't already. (Or implement IEquatable<InstanceType>.)
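For instance, a minimal sketch of what that could look like (InstanceType and its Name property are stand-ins, not from the question):

using System;

public sealed class InstanceType : IEquatable<InstanceType>
{
    public string Name { get; private set; }

    public InstanceType(string name) { Name = name; }

    public bool Equals(InstanceType other)
    {
        return other != null && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as InstanceType);
    }

    // Must agree with Equals: equal objects must return equal hashes.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}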
Possible issues I can think of:
- Hash code collisions could give you duplicate dictionary keys (Dictionary<TKey, TValue>.Add throws when the key already exists)
- Different objects' hash algorithms could give you the same hash code for two functionally different objects; you wouldn't know which object you're working with
- This implementation is prone to ambiguity (as described above); you may need to store more information about your objects than just their hash codes
Note - Jon said this more elegantly (see above)
Since this is for WPF controls, why not just add the Guid as a dependency property? You seem to already be iterating through the user controls in order to get their hash codes, so this would probably be a simpler method.
If you want to capture that a control was removed and which Guid it had, some manager object that subscribes to closing/removed events and stores the Guid and a few other details would be a good idea. Then you would also have an easier time capturing more details for analysis if you need them.
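A minimal sketch of carrying the Guid as an attached dependency property (the names are illustrative):

using System;
using System.Windows;

public static class ControlId
{
    // Attached property, so any DependencyObject can carry a Guid.
    public static readonly DependencyProperty IdProperty =
        DependencyProperty.RegisterAttached(
            "Id", typeof(Guid), typeof(ControlId),
            new PropertyMetadata(Guid.Empty));

    public static void SetId(DependencyObject element, Guid value)
    {
        element.SetValue(IdProperty, value);
    }

    public static Guid GetId(DependencyObject element)
    {
        return (Guid)element.GetValue(IdProperty);
    }
}

// Usage: ControlId.SetId(someUserControl, Guid.NewGuid());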
I've been reading Rockford Lhotka's "Expert C# 2008 Business Objects", where there is such a thing as a data portal which nicely abstracts where the data comes from. When using DataPortal.Update(this), which as you might guess persists 'this' to the database, an object is returned - the persisted 'this' with any changes the db made to it, e.g. a timestamp.
Lhotka has often written, quite casually, that you have to make sure to update all references to the old object so they point at the new returned object. Makes sense, but is there an easy way to find all references to the old object and change them? Obviously the GC tracks references; is it possible to tap into that?
Cheers
There are profiling APIs to do this, but nothing for general consumption. One possible solution, and one which I've used myself, is to implement in a base class a tracking mechanism where each instance of the object adds a WeakReference to itself to a static collection.
I have this conditionally compiled for DEBUG builds but it probably wouldn't be a good idea to rely on this in a release build.
// Simplified example - do not use as-is; the O(n) sweep on every
// construction would make performance suck.
abstract class MyCommonBaseClass {
    static readonly List<WeakReference> instances = new List<WeakReference>();

    protected MyCommonBaseClass() {
        lock (instances) {
            RemoveDeadOnes();
            instances.Add(new WeakReference(this));
        }
    }

    // Drop entries whose targets have already been garbage collected.
    static void RemoveDeadOnes() {
        instances.RemoveAll(wr => !wr.IsAlive);
    }
}
The GC doesn't actually track the references to the objects. Instead, at collection time it calculates which objects are reachable, starting from the roots (globals and stack references) and executing some variant of a "flood fill" algorithm.
Specifically for your problem: why not just have a proxy holding the reference to the "real" object? That way you need to update only one place.
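A minimal sketch of that proxy idea (the names are mine):

// Consumers hold the proxy; after DataPortal.Update returns the new
// instance, only the proxy's target has to be swapped.
public class EntityProxy<T> where T : class
{
    public T Target { get; private set; }

    public EntityProxy(T target) { Target = target; }

    public void Replace(T updated) { Target = updated; }
}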
There isn't a simple way to do this directly; however, Son of Strike (SoS) has this capability. It allows you to delve into all object references tracked by the CLR and look at what objects are referencing any specific object, etc.
Here is a good tutorial for learning CLR debugging via SoS.
If you are passing object references around and those references remain unchanged, then any changes made to the object in a persistence layer will be instantly visible to any other consumers of the object. However, if your object crosses a service boundary, then the code on each side of the boundary will be viewing different objects that are just carbon copies. Also, if you have made clones of the object, or have created anonymous types that incorporate properties from the original object, those will be tough to track down - and of course, to the GC these are new objects that have no tie-in to the original object.
If you have some sort of key or ID in the object then this becomes easier. The key doesn't have to be a database ID, it can be a GUID that is new'ed up when the object is instantiated, and does not get changed for the entire lifecycle of the object (i.e. it is a property that has a getter but no setter) - as it is a property it will persist across service boundaries, so your object will still be identifiable. You can then use LINQ or even old-fashioned loops (icky!) to iterate through any collection that is likely to hold a copy of the updated object, and if one is found you can then merge the changes back in.
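A minimal sketch of that identity idea (the Order class and its members are invented for illustration):

using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    // Assigned once at construction; no public setter, so it never changes.
    public Guid Id { get; private set; }
    public decimal Total { get; set; }

    public Order() { Id = Guid.NewGuid(); }
}

// Merging an update back into any stale copies found in a collection:
// foreach (var copy in orders.Where(o => o.Id == updated.Id && !ReferenceEquals(o, updated)))
//     copy.Total = updated.Total;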
Having said this, I wouldn't think that you have too many copies floating around. If you do, then the places where these copies live should be very localized. Ensuring that your object implements INotifyPropertyChanged will also help propagate change notifications if you hold a list in one spot which is then bound to, directly or indirectly, in several other spots.
Some background information:
I am working on a C#/WPF application, which basically is about creating, editing, saving and loading some data model.
The data model consists of a hierarchy of various objects. There is a "root" object of class A, which has a list of objects of class B, which each have a list of objects of class C, etc. Around 30 classes are involved in total.
Now my problem is that I want to prompt the user with the usual "you have unsaved changes, save?" dialog, if he tries to exit the program. But how do I know if the data in current loaded model is actually changed?
There are of course ways to solve this, e.g. reloading the model from file and comparing it against the one in memory value by value, or making every UI control set a flag indicating the model has been changed. Instead, I want to create a hash value based on the model state on load, generate a new value when the user tries to exit, and compare the two.
Now the question:
Inspired by that, I was wondering whether there exists some way to generate a hash value from the (value) state of an arbitrarily complex object? Preferably in a generic way, e.g. with no need to apply attributes to each involved class/field.
One idea could be to use some of .NET's serialization functionality (assuming it works out of the box in this case) and apply a hash function to the content of the resulting file. However, I guess there are more suitable approaches.
Thanks in advance.
Edit:
Point taken about the hashing and possible collisions. Instead, I am going for deep comparison, value by value. I am already using the XML serializer for persistence, so I am just going to serialize and compare the strings. Not pretty, but it does the trick in this case.
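A minimal sketch of that serialize-and-compare approach (ModelRoot stands in for the "root" class A):

using System.IO;
using System.Xml.Serialization;

static string Snapshot(ModelRoot model)
{
    var serializer = new XmlSerializer(typeof(ModelRoot));
    using (var writer = new StringWriter())
    {
        serializer.Serialize(writer, model);
        return writer.ToString();
    }
}

// On load:  var baseline = Snapshot(model);
// On exit:  bool dirty = Snapshot(model) != baseline;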
OK, you can use reflection and some sort of recursive function, of course.
But keep in mind that every object is a model of a particular thing. I mean, there may be a lot of "unimportant" fields and properties.
And, thanks to @compie!
You can create a hash function just for your domain. But this requires strong mathematical skills.
And you can try to use classic hash functions like SHA. Just treat your object as a string or byte array.
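A minimal sketch of that, assuming the object has already been serialized to a string (e.g. with the XML serializer mentioned above):

using System.Security.Cryptography;
using System.Text;

static byte[] StateHash(string serializedState)
{
    // Hash the serialized bytes; equal state gives equal hashes,
    // but distinct states can still collide in principle.
    using (var sha = SHA256.Create())
    {
        return sha.ComputeHash(Encoding.UTF8.GetBytes(serializedState));
    }
}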
Because this is a WPF app, it may be easier than you think to be notified of changes as they happen. The event architecture of WPF allows you to create event handlers at a level somewhere above where the event actually originates. So, you could create event handlers for the various "change" events of your UI elements in the root window of your interface and set the "changed" flag at that scope.
WPF Routed Events Overview
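A minimal sketch of that, assuming text boxes are the main source of edits (other control types would need their own bubbling events handled the same way):

using System.Windows;
using System.Windows.Controls;
using System.Windows.Controls.Primitives;

public partial class MainWindow : Window
{
    private bool isDirty;

    public MainWindow()
    {
        InitializeComponent();

        // TextChanged bubbles, so one handler at window scope catches
        // edits in every TextBox anywhere in the visual tree.
        AddHandler(TextBoxBase.TextChangedEvent,
                   new TextChangedEventHandler((s, e) => isDirty = true));
    }
}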
I would advise against this. Different objects can have the same hash. It's not safe to rely on this for checking whether changes have to be saved.