Using XmlDocument.Save() Effectively

I'm working with an XML document in C#: I select nodes from it, add nodes to it, and delete nodes from it many, many times over the course of my code.
All of the XML editing of this document is contained within a class, which other classes call into.
Since the Data Access class has no way of telling if the classes using it are done with editing the document, it has no logic as to if/when to save.
I could save after every modification of the document, but I'm concerned with performance issues.
Alternatively I could just assume/hope that it will be saved by the other classes that use it (I created a one-line public method to save the document, so another class can request a save).
The second option concerns me, as I feel I should have saving globally enforced in some manner, so that the class cannot be used without the modifications being committed. So far there is no case where a rollback is needed; any change is a change that should be committed.
Does .NET (or coding design) have a way to balance performance and safety in such a situation?

If you always want to save the changes (just don't know when) then you could add the save command to the class destructor. This way you know the changes will always be saved.
If you need additional help or want an example please leave a comment, otherwise select an answer as correct.
Update: It has been brought to my attention that the class destructor may fire after other objects (like a FileStream) have already been disposed.
I recommend that you test for this condition in your destructor and also that you implement and use the IDisposable interface. You can then subscribe to either the Application.Exit event or the Application.ApplicationExit event and call Dispose there.
Be sure to keep the code in the destructor (but make sure you have it in a try block) in case the program crashes or there is some other, unexpected exit.
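To make the IDisposable suggestion concrete, here is a minimal sketch of a data access class that tracks a "dirty" flag and saves on Dispose. The class name XmlDataAccess, the AddItem method and the "item" element are hypothetical and only stand in for your real editing methods; the sketch deliberately omits the destructor, since a finalizer is not guaranteed to run at a convenient time.

```csharp
using System;
using System.Xml;

// Hypothetical data access class: every edit marks the document dirty,
// and Dispose() commits pending changes exactly once.
public sealed class XmlDataAccess : IDisposable
{
    private readonly XmlDocument _doc = new XmlDocument();
    private readonly string _path;
    private bool _dirty;
    private bool _disposed;

    public XmlDataAccess(string path)
    {
        _path = path;
        _doc.Load(path);
    }

    // Stand-in for a real editing method; it only changes the in-memory document.
    public void AddItem(string name)
    {
        XmlElement item = _doc.CreateElement("item");
        item.SetAttribute("name", name);
        _doc.DocumentElement.AppendChild(item);
        _dirty = true;
    }

    // Writes to disk only if something has changed since the last save.
    public void Save()
    {
        if (!_dirty) return;
        _doc.Save(_path);
        _dirty = false;
    }

    public void Dispose()
    {
        if (_disposed) return;
        Save();
        _disposed = true;
    }
}
```

A caller would then write something like using (var data = new XmlDataAccess(path)) { data.AddItem("a"); data.AddItem("b"); } and the file is written once, when the using block ends.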

Basically your question says it all: you need to save, but you don't know when, because the knowledge about the save points lies outside your class.
My recommendation is to wrap your calls. Assuming you have something like public void MyClass.SomeEditing(int foo), create a wrapper like public void MyClass.SomeEditing(int foo, bool shouldSave), with shouldSave defaulting to true.
This way, a consumer of your class can decide whether he wants an immediate save or not, choosing false if he knows that an immediately following edit will cause the save. Existing code that calls the "old" API is protected by the default of "save immediately".
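A rough sketch of that wrapper, using a C# optional parameter. DocumentEditor and the two counters are hypothetical; they replace the real XML edit and the real XmlDocument.Save call so that the effect of the flag is easy to see.

```csharp
using System;

// Hypothetical editor class; the counters stand in for real work.
public class DocumentEditor
{
    public int EditCount { get; private set; }
    public int SaveCount { get; private set; }

    // Callers that pass nothing get the safe default: save immediately.
    public void SomeEditing(int foo, bool shouldSave = true)
    {
        EditCount++;               // stand-in for the real edit
        if (shouldSave) Save();
    }

    public void Save()
    {
        SaveCount++;               // stand-in for XmlDocument.Save(...)
    }
}
```

A caller doing three edits in a row can pass false for the first two and let the last call (or an explicit Save()) commit everything in one write.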

Related

Call methods from set accessor or force the user to do manually

When a property is updated is it good practice to change other properties based on this or should you force the user to call a method directly? For example:
someObject.TodaysTotalSales = 1234.56;
Would it be OK to have the set accessor update another value, say ThisYearsTotalSales, or should you force the end user to do it manually?
someObject.TodaysTotalSales = 1234.56;
someObject.UpdateThisYearsTotal();

I think the best practice is to recalculate the year's total only when it is accessed. Otherwise, if you update the TodaysTotalSales property very often, you will compute the year's total for nothing.
More generally, when you call a property setter, you don't expect a complex operation. By convention, getters and setters are expected to return almost immediately.
If your calculation is too expensive to run on every access, you can cache the value to avoid recalculating it at each call; you invalidate the cached value when one of its inputs has changed.
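As an illustration of the cache-and-invalidate idea, here is a small sketch. SalesRecord, the earlierDaysTotal constructor argument and RecalculationCount are assumptions made up for the example; the "expensive" calculation is just an addition.

```csharp
using System;

// Hypothetical class: the setter only invalidates the cache,
// and the year total is recomputed lazily on the next read.
public class SalesRecord
{
    private readonly decimal _earlierDaysTotal;
    private decimal _todaysTotalSales;
    private decimal? _cachedYearTotal;   // null means "needs recalculation"

    public SalesRecord(decimal earlierDaysTotal)
    {
        _earlierDaysTotal = earlierDaysTotal;
    }

    // Exposed only to show how often the calculation actually runs.
    public int RecalculationCount { get; private set; }

    public decimal TodaysTotalSales
    {
        get { return _todaysTotalSales; }
        set
        {
            _todaysTotalSales = value;
            _cachedYearTotal = null;     // cheap: no calculation in the setter
        }
    }

    public decimal ThisYearsTotalSales
    {
        get
        {
            if (_cachedYearTotal == null)
            {
                RecalculationCount++;
                _cachedYearTotal = _earlierDaysTotal + _todaysTotalSales;
            }
            return _cachedYearTotal.Value;
        }
    }
}
```

Setting TodaysTotalSales many times in a row costs nothing extra, and the total is always consistent when read, so no public UpdateThisYearsTotal() is needed.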

It depends.
Does he need to know the TotalYearsOfSales even after he updated TodaysSales?
Yes -> Provide an additional method, someObject.UpdateThisYearsTotal(), and at the same time flag that he has updated TodaysSales without updating the year's total, so you can throw an error at the end of the process if needed.
No -> Auto-update the other properties whose values are not needed prior to updating TodaysSales.

TL;DR: it depends
I assume you have public interface of a class in mind.
If you follow OOP Encapsulation principle to the limit, then someObject's externally visible state should be consistent with every public access, i.e. you shouldn't need any public UpdateState methods. So in this case someObject.UpdateThisYearsTotal() is a no-no. What happens internally: be it lazy recalculation, caching, private UpdateAllInternal - would not matter.
But OOP is not an icon/idol - so for performance reasons you may design program flow as you see fit. For example: deferred bulk data processing, game loop, Entity Component System design, ORMs - those systems clearly state in their docs (rarely in code contracts) the way they are supposed to be used.

Programmatically check for a change in a class in C#

Is there a way to check for the size of a class in C#?
My reason for asking is:
I have a routine that stores a class's data in a file, and a different routine that loads this object (class) from that same file. Each attribute is stored in a specific order, and if you change this class you have to be reminded that these export/import routines need changing.
An example in C++ (no matter how clumsy or bad programming this might be) would be the following:
#define PERSON_CLASS_SIZE 8
class Person
{
    char *firstName;
};
...
bool ExportPerson(Person p)
{
    // Fails as soon as the layout of Person no longer matches the recorded size
    if (sizeof(Person) != PERSON_CLASS_SIZE)
    {
        CatastrophicAlert("You have changed the Person class and not fixed this export routine!");
        return false;
    }
    return true;
}
Thus, before compile time, you need to know the size of Person and modify the export/import routines accordingly.
Is there a way to do something similar to this in C#, or are there other ways of "making sure" a different developer changes the import/export routines if he changes a class?
... Apart from the obvious "just comment this in the class, this guarantees that a developer never screws things up"-answer.
Thanks in advance.

"Each attribute is stored in a specific order, and if you change this class you have to be reminded that these export/import routines need changing."
It sounds like you're writing your own serialization mechanism. If that's the case, you should probably include some sort of "fingerprint" of the expected properties in the right order, and validate that at read time. You can then include the current fingerprint in a unit test, which will then fail if a property is added. The appropriate action can then be taken (e.g. migrating existing data) and the unit test updated.
Just checking the size of the class certainly wouldn't find all errors - if you added one property and deleted one of the same size in the same change, you could break data without noticing it.
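One possible way to build such a fingerprint is with reflection, sketched below. The Fingerprint helper and the example Person class are hypothetical. Note that reflection does not guarantee the order in which properties are returned, so this sketch sorts them by name; it detects added, removed, renamed or retyped properties, but the storage order itself still has to be defined by your export routine.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Example class whose shape we want to guard.
public class Person
{
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class Fingerprint
{
    // Builds a stable, human-readable description of the public instance
    // properties of a type: "Name:Type" pairs, sorted by name.
    public static string Of(Type type)
    {
        var parts = type
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .OrderBy(p => p.Name, StringComparer.Ordinal)
            .Select(p => p.Name + ":" + p.PropertyType.Name);
        return string.Join(";", parts);
    }
}
```

A unit test can then compare Fingerprint.Of(typeof(Person)) against the string recorded when the export/import routines were last updated; the test fails as soon as somebody changes the class without revisiting those routines.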

Apart from the fact that this is probably not the best way to achieve what you need,
I think the fastest way is to use Cecil. You can get the IL body of the entire class.

C# detect if calls were in the same UI action

I have some nice, working edit-undo functionality in my winforms application. It works using a CommandStack class, which is two Stack<IStateCommand>s (one for undo, one for redo). Each command has an Execute and an Undo method, and the CommandStack object itself has an event that is fired when the stacks are changed.
The CommandStack also works out whether the LogCommand method is called from its own Undo function, and if so adds the command to the redo stack rather than the undo stack. This is done by simply adding the current ManagedThreadId to a List<int> object, then removing it after the Undo command is completed (as opposed to using the stack trace, which I believe would be much slower and a bit dirty).
There is a lot of different commands within my application so this formula is sort of set in stone as it'll take me a few days to redo all those IStateCommands implementations.
The only problem with this is that, currently, some UI events also call other UI events, both of which log an IStateCommand to the undo history. Is there any way in C# to detect whether the LogCommand function has already been called from the same UI event (Click, DragDrop, SelectedIndexChanged, TextChanged, etc.), so that I can combine the commands into one command (using my CommandList class, which also inherits IStateCommand)?
I've thought of saving the current time when the undo event was called, then combining commands in the history if the next one is logged less than x milliseconds later, but this seems a bit sloppy. I've also considered searching the stack trace, but I don't really know what to look for to find the root UI event, nor do I know how I would tell the difference between one button click and a different click on the same button.
It may also be helpful to know that all of these commands are being called from the UI thread from event handlers (mostly from events from custom user controls). The only part of my application that uses another thread runs after most UI events, after the undo history is logged.
Thanks!
Short Version
The same method is being called twice from the same UI event (eg, MouseUp, DragDrop). The second time this method is called, how do I check that it has already been called once by the same UI event?
Edit: The solution (sort of)
It's a bit of a dirty one as I don't have the time to completely re-write this system. However I've implemented it in such a way that gives the option not to be so dirty in the future.
The solution is based on one of Erno's comments on his answer (so I will mark his answer as accepted), where he suggests adding a parameter. I added another overload to my LogCommand(IStateCommand) method in the CommandStack class, LogCommand(IStateCommand, string). The string is the actionId, which is stored for each command, and if this string is the same as the last one, the commands are combined. This gives the option to go through each event and give it a unique ID.
However, the dirty part: to get it working before we have to show the client, the actionId defaults to System.Windows.Forms.Cursor.Position.ToString(), ouch!! Since the cursor position does not change while the UI thread is executing, this combines each command. It actually even combines TextChanged commands (as long as the user doesn't move the mouse!)

It might be an option to add a local stack of called-commands to a command.
When a command executes other commands add the command to the local stack so you can undo the commands on this local stack when the command must be undone or redone.
EDIT
I am not quite sure what you don't understand.
I would simply add a CommandList property to the StateCommand. Every time the StateCommand invokes/triggers another StateCommand, it should add the new StateCommand to its CommandList. So the global CommandList keeps track of the commands that can be undone from the UI, and each StateCommand keeps track of the StateCommands it invoked (so these are not added to the global undo CommandList).
EDIT 2
If you can't or do not want to change to that setup you would have to pass a parameter to the execution of the commands that links them together.

Did you try inspecting the call stack and analyzing it method by method?
    using System.Diagnostics;  // StackTrace, StackFrame
    using System.Reflection;   // MethodBase
    StackTrace st = new StackTrace();
    for (int i = 0; i < st.FrameCount; i++)
    {
        StackFrame sf = st.GetFrame(i);
        MethodBase mb = sf.GetMethod();
        // do whatever you want
    }

I don't know what you need exactly to achieve, but I implemented something similar, maybe you can get some ideas...
In summary, you can store some information in a ThreadStatic variable. Then, any time you want to log a command, inspect the thread static variable to find out the context in which you are logging the command. If it's empty, you are starting a new command logging sequence. If not, you are inside a sequence.
Maybe you can store the entry event (e.g. Click, DragDrop,...), or the command itself... It depends on your needs.
When the initial event callback is completed, clean the static variable to signal that the sequence has been completed.
I successfully implemented a similar strategy to track commands executed upon an object model. I encapsulated the logic within an IDisposable class that also implemented reference counting to handle nested usings. The first using started the sequence; subsequent using statements increased and decreased the reference count to know when the sequence was completed. Disposing the outermost context fired an event containing all the nested commands. In my specific case it has worked perfectly; I don't know if it will fulfill your needs...
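A minimal sketch of that idea follows; it is not my original code. CommandScope, Begin, Log and SequenceCompleted are names invented for the example, and commands are plain strings standing in for IStateCommand objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scope object: the outermost using starts a sequence,
// nested usings only adjust a counter, and disposing the outermost
// scope raises one event with every command logged in between.
public sealed class CommandScope : IDisposable
{
    [ThreadStatic] private static CommandScope _current;

    private readonly List<string> _commands = new List<string>();
    private int _depth;

    public static event Action<IList<string>> SequenceCompleted;

    private CommandScope() { }

    public static CommandScope Begin()
    {
        if (_current == null) _current = new CommandScope();
        _current._depth++;
        return _current;
    }

    // Inside a scope the command is collected; outside, it is its own sequence.
    public static void Log(string command)
    {
        if (_current != null) _current._commands.Add(command);
        else Raise(new List<string> { command });
    }

    public void Dispose()
    {
        if (_depth == 0) return;      // sequence already completed
        if (--_depth > 0) return;     // still inside an outer scope
        _current = null;
        Raise(_commands);
    }

    private static void Raise(IList<string> commands)
    {
        var handler = SequenceCompleted;
        if (handler != null) handler(commands);
    }
}
```

Each UI event handler would wrap its body in using (CommandScope.Begin()) { ... }. If that handler triggers another handler that does the same, both sets of commands arrive in a single SequenceCompleted notification and can be pushed onto the undo stack as one combined command.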

How to enforce the use of a method's return value in C#?

I have a piece of software written with fluent syntax. The method chain has a definitive "ending", before which nothing useful is actually done in the code (think NBuilder, or Linq-to-SQL's query generation not actually hitting the database until we iterate over our objects with, say, ToList()).
The problem I am having is there is confusion among other developers about proper usage of the code. They are neglecting to call the "ending" method (thus never actually "doing anything")!
I am interested in enforcing the usage of the return value of some of my methods so that we can never "end the chain" without calling that "Finalize()" or "Save()" method that actually does the work.
Consider the following code:
using System.Collections.Generic;
//The "factory" class the user will be dealing with
public class FluentClass
{
    //The entry point for this software
    public IntermediateClass<T> Init<T>()
    {
        return new IntermediateClass<T>();
    }
}
//The class that actually does the work
public class IntermediateClass<T>
{
    private List<T> _values;
    //The user cannot call this constructor
    internal IntermediateClass()
    {
        _values = new List<T>();
    }
    //Once generated, they can call "setup" methods such as this
    public IntermediateClass<T> With(T value)
    {
        var instance = new IntermediateClass<T>() { _values = _values };
        instance._values.Add(value);
        return instance;
    }
    //Picture "lazy loading" - you have to call this method to
    //actually do anything worthwhile
    public void Save()
    {
        var itemCount = _values.Count;
        // . . . save to database, write a log, do some real work
    }
}
As you can see, proper usage of this code would be something like:
new FluentClass().Init<int>().With(-1).With(300).With(42).Save();
The problem is that people are using it this way (thinking it achieves the same as the above):
new FluentClass().Init<int>().With(-1).With(300).With(42);
So pervasive is this problem that, with entirely good intentions, another developer once actually changed the name of the "Init" method to indicate that THAT method was doing the "real work" of the software.
Logic errors like these are very difficult to spot, and, of course, it compiles, because it is perfectly acceptable to call a method with a return value and just "pretend" it returns void. Visual Studio doesn't care if you do this; your software will still compile and run (although in some cases I believe it throws a warning). This is a great feature to have, of course. Imagine a simple "InsertToDatabase" method that returns the ID of the new row as an integer - it is easy to see that there are some cases where we need that ID, and some cases where we could do without it.
In the case of this piece of software, there is definitively never any reason to eschew that "Save" function at the end of the method chain. It is a very specialized utility, and the only gain comes from the final step.
I want somebody's software to fail at the compiler level if they call "With()" and not "Save()".
It seems like an impossible task by traditional means - but that's why I come to you guys. Is there an Attribute I can use to prevent a method from being "cast to void" or some such?
Note: The alternate way of achieving this goal that has already been suggested to me is writing a suite of unit tests to enforce this rule, and using something like http://www.testdriven.net to bind them to the compiler. This is an acceptable solution, but I am hoping for something more elegant.

I don't know of a way to enforce this at a compiler level. It's often requested for objects which implement IDisposable as well, but isn't really enforceable.
One potential option which can help, however, is to set up your class, in DEBUG only, to have a finalizer that logs/throws/etc. if Save() was never called. This can help you discover these runtime problems while debugging instead of relying on searching the code, etc.
However, make sure that, in release mode, this is not used, as it will incur a performance overhead since the addition of an unnecessary finalizer is very bad on GC performance.
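A rough sketch of what such a DEBUG-only finalizer might look like. ValueBatch is a hypothetical, simplified stand-in for the fluent class (here With returns the same instance, so only one object is ever finalizable), and the sketch logs to the error stream rather than throwing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical fluent builder with a DEBUG-only "you forgot Save()" check.
public sealed class ValueBatch<T>
{
    private readonly List<T> _values = new List<T>();

    public int SavedCount { get; private set; }

    public ValueBatch<T> With(T value)
    {
        _values.Add(value);
        return this;
    }

    public void Save()
    {
        SavedCount = _values.Count;   // stand-in for the real work
#if DEBUG
        // Save() was called, so the reminder below is no longer needed.
        GC.SuppressFinalize(this);
#endif
    }

#if DEBUG
    // Runs at some unspecified time after the object becomes unreachable,
    // and only in DEBUG builds, so release builds pay no finalizer cost.
    ~ValueBatch()
    {
        Console.Error.WriteLine("ValueBatch was discarded without calling Save().");
    }
#endif
}
```

Because the garbage collector decides when (and whether) the finalizer runs, this is a debugging aid rather than an enforcement mechanism.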

You could require specific methods to use a callback like so:
new FluentClass().Init<int>(x =>
{
    x.Save(y =>
    {
        y.With(-1);
        y.With(300);
    });
});
The with method returns some specific object, and the only way to get that object is by calling x.Save(), which itself has a callback that lets you set up your indeterminate number of with statements. So the init takes something like this:
public T Init<T>(Func<MyInitInputType, MySaveResultType> initSetup)

I can think of a few solutions, none of them ideal.
AIUI what you want is a function which is called when the temporary variable goes out of scope (as in, when it becomes available for garbage collection, but will probably not be garbage collected for some time yet). (See: The difference between a destructor and a finalizer?) This hypothetical function would say "if you've constructed a query in this object but not called save, produce an error". C++/CLI calls this RAII, and in C++/CLI there is a concept of a "destructor" when the object isn't used any more, and a "finaliser" which is called when it's finally garbage collected. Very confusingly, C# has only a so-called destructor, but this is only called by the garbage collector (it would be valid for the framework to call it earlier, as if it were partially cleaning the object immediately, but AFAIK it doesn't do anything like that). So what you would like is a C++/CLI destructor. Unfortunately, AIUI this maps onto the concept of IDisposable, which exposes a dispose() method which can be called when a C++/CLI destructor would be called, or when the C# destructor is called -- but AIUI you still have to call "dispose" manually, which defeats the point?
Refactor the interface slightly to convey the concept more accurately. Call the init function something like "prepareQuery" or "AAA" or "initRememberToCallSaveOrThisWontDoAnything". (The last is an exaggeration, but it might be necessary to make the point).
This is more of a social problem than a technical problem. The interface should make it easy to do the right thing, but programmers do have to know how to use code! Get all the programmers together. Explain this simple fact once and for all. If necessary, have them all sign a piece of paper saying they understand, and that if they wilfully continue to write code which doesn't do anything they're worse than useless to the company and will be fired.
Fiddle with the way the operators are chained, e.g. have each of the IntermediateClass functions assemble an aggregate IntermediateClass object containing all of the parameters (you mostly do it this way already (?)), but require an init-like function of the original class to take that as an argument, rather than have them chained after it. Then you can have Save and the other functions return two different class types (with essentially the same contents), and have init only accept a class of the correct type.
The fact that it's still a problem suggests that either your coworkers need a helpful reminder, or they're rather sub-par, or the interface wasn't very clear (perhaps its perfectly good, but the author didn't realise it wouldn't be clear if you only used it in passing rather than getting to know it), or you yourself have misunderstood the situation. A technical solution would be good, but you should probably think about why the problem occurred and how to communicate more clearly, probably asking someone senior's input.

After great deliberation and trial and error, it turns out that throwing an exception from the Finalize() method was not going to work for me. Apparently, you simply can't do that; the exception gets eaten up, because garbage collection operates non-deterministically. I was unable to get the software to call Dispose() automatically from the destructor either. Jack V.'s comment explains this well; here was the link he posted, for redundancy/emphasis:
The difference between a destructor and a finalizer?
Changing the syntax to use a callback was a clever way to make the behavior foolproof, but the agreed-upon syntax was fixed, and I had to work with it. Our company is all about fluent method chains. I was also a fan of the "out parameter" solution to be honest, but again, the bottom line is the method signatures simply could not change.
Helpful information about my particular problem includes the fact that my software is only ever to be run as part of a suite of unit tests - so efficiency is not a problem.
What I ended up doing was use Mono.Cecil to Reflect upon the Calling Assembly (the code calling into my software). Note that System.Reflection was insufficient for my purposes, because it cannot pinpoint method references, but I still needed(?) to use it to get the "calling assembly" itself (Mono.Cecil remains underdocumented, so it's possible I just need to get more familiar with it in order to do away with System.Reflection altogether; that remains to be seen....)
I placed the Mono.Cecil code in the Init() method, and the structure now looks something like:
public IntermediateClass<T> Init<T>()
{
    ValidateUsage(Assembly.GetCallingAssembly());
    return new IntermediateClass<T>();
}
void ValidateUsage(Assembly assembly)
{
    // 1) Use Mono.Cecil to inspect the codebase inside the assembly
    var assemblyLocation = assembly.CodeBase.Replace("file:///", "");
    var monoCecilAssembly = AssemblyFactory.GetAssembly(assemblyLocation);
    // 2) Retrieve the list of Instructions in the calling method
    var methods = monoCecilAssembly.Modules...Types...Methods...Instructions
    // (It's a little more complicated than that...
    // if anybody would like more specific information on how I got this,
    // let me know... I just didn't want to clutter up this post)
    // 3) Those instructions refer to OpCodes and Operands....
    // Defining "invalid method" as a method that calls "Init" but not "Save"
    var methodCallingInit = method.Body.Instructions.Any
        (instruction => instruction.OpCode.Name.Equals("callvirt")
            && instruction.Operand is IMethodReference
            && instruction.Operand.ToString().Equals(INITMETHODSIGNATURE));
    var methodNotCallingSave = !method.Body.Instructions.Any
        (instruction => instruction.OpCode.Name.Equals("callvirt")
            && instruction.Operand is IMethodReference
            && instruction.Operand.ToString().Equals(SAVEMETHODSIGNATURE));
    var methodInvalid = methodCallingInit && methodNotCallingSave;
    // Note: this is partially pseudocode;
    // It doesn't 100% faithfully represent either Mono.Cecil's syntax or my own
    // There are actually a lot of annoying casts involved, omitted for sanity
    // 4) Obviously, if the method is invalid, throw
    if (methodInvalid)
    {
        throw new Exception(String.Format("Bad developer! BAD! {0}", method.Name));
    }
}
Trust me, the actual code is even uglier looking than my pseudocode.... :-)
But Mono.Cecil just might be my new favorite toy.
I now have a method that refuses to run its main body unless the calling code "promises" to also call a second method afterwards. It's like a strange kind of code contract. I'm actually thinking about making this generic and reusable. Would any of you have a use for such a thing? Say, if it were an attribute?

What if you made it so Init and With don't return objects of type FluentClass? Have them return, e.g., UninitializedFluentClass, which wraps a FluentClass object. Then calling .Save() on the UninitializedFluentClass object calls it on the wrapped FluentClass object and returns it. If they don't call Save, they don't get a FluentClass object.
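Here is one way that idea could look, as a sketch only. UnsavedBatch, SavedBatch and Reports are hypothetical names, not the types from the question. The point is that anything useful downstream accepts only the type that Save() returns.

```csharp
using System.Collections.Generic;

// Only Save() can produce this type, so holding one proves Save() ran.
public sealed class SavedBatch<T>
{
    internal SavedBatch(IList<T> values)
    {
        Values = values;
    }

    public IList<T> Values { get; private set; }
}

// The fluent "setup" type: it can collect values but exposes nothing useful.
public sealed class UnsavedBatch<T>
{
    private readonly List<T> _values = new List<T>();

    public UnsavedBatch<T> With(T value)
    {
        _values.Add(value);
        return this;
    }

    public SavedBatch<T> Save()
    {
        // The real work (database write, logging, ...) would happen here.
        return new SavedBatch<T>(_values.AsReadOnly());
    }
}

public static class Reports
{
    // Downstream code asks for SavedBatch<T>, so passing an UnsavedBatch<T>
    // is a compile-time error.
    public static int CountItems<T>(SavedBatch<T> batch)
    {
        return batch.Values.Count;
    }
}
```

This only helps when the caller needs the result afterwards; a chain whose return value is simply discarded still compiles.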

In Debug mode, besides implementing IDisposable, you can set up a timer that will throw an exception after 1 second if the result method has not been called.

Use an out parameter! All the outs must be used.
Edit: I am not sure if it will help, though...
It would break the fluent syntax.

Hashing the state of a complex object in .NET

Some background information:
I am working on a C#/WPF application, which basically is about creating, editing, saving and loading some data model.
The data model consists of a hierarchy of various objects. There is a "root" object of class A, which has a list of objects of class B, each of which has a list of objects of class C, etc. Around 30 classes are involved in total.
Now my problem is that I want to prompt the user with the usual "you have unsaved changes, save?" dialog if he tries to exit the program. But how do I know if the data in the currently loaded model has actually changed?
There are of course ways to solve this, e.g. reloading the model from file and comparing it against the one in memory value by value, or making every UI control set a flag indicating that the model has been changed. Instead, I want to create a hash value based on the model state on load, generate a new value when the user tries to exit, and compare the two.
Now the question:
So, inspired by that, I was wondering if there exists some way to generate a hash value from the (value) state of some arbitrary complex object? Preferably in a generic way, i.e. with no need to apply attributes to each involved class/field.
One idea could be to use some of .NET's serialization functionality (assuming it will work out-of-the-box in this case) and apply a hash function to the content of the resulting file. However, I guess there exist some more suitable approach.
Thanks in advance.
Edit:
Point taken about the hashing and possible collisions. Instead, I am going for deep comparing value by value. I am already using the XML serializer for persistence, so I am just going to serialize and compare chars. Not pretty, but it does the trick in this case.
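For anyone curious, the serialize-and-compare approach can be sketched roughly as below. Root, Child and ModelSnapshot are placeholder names for this post, not my real model classes; XmlSerializer needs public types with parameterless constructors.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Placeholder model classes standing in for the real hierarchy.
public class Child
{
    public string Name { get; set; }
}

public class Root
{
    public List<Child> Children { get; set; } = new List<Child>();
}

public static class ModelSnapshot
{
    // Serializes the whole object graph to an XML string.
    // Two snapshots are equal exactly when the serialized state is equal.
    public static string Take<T>(T model)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, model);
            return writer.ToString();
        }
    }
}
```

Take a snapshot right after loading and another when the user tries to exit; if the two strings differ, prompt to save. Only state that the serializer writes out is compared, so anything excluded from serialization is ignored.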

Ok you can use reflection and some sort of recursive function of course.
But keep in mind that every object is a model of a particular thing. I mean there maybe a lot of "unimportant" fields and properties.
And, thanks to @compie!
You can create a hash function just for your domain. But this requires strong mathematical skills.
And you can try to use classic hash functions like SHA. Just assume that your object is a string or byte array.

Because this is a WPF app, it may be easier than you think to be notified of changes as they happen. The event architecture of WPF allows you to create event handlers at a level somewhere above where the event actually originates. So, you could create event handlers for the various "change" events of your UI elements in the root window of your interface and set the "changed" flag at that scope.
WPF Routed Events Overview

I would advise against this. Different objects can have the same hash. It's not safe to rely on this for checking if changes have to be saved.
