I have been working on a project where I need to read and display a hierarchy from a big database (~100,000 folders, even more files), but it takes 5 minutes to load the data from the database into the project, and I want to reduce that.
My implementation uses a tree to represent this data (because it made sense at the beginning), with recursion to navigate down the tree to place new folders and items (the items are not being used right now; I tried to add them, but populating the tree took so long that I just removed them). Are there better structures for holding such a hierarchy? Otherwise, is there a better way to traverse the tree? It seems that most of the time is spent climbing up and down it, due to the recursion and the fact that the database is rather deep.
I am using C#, but I believe an agnostic answer would be good enough.
You don't need to load everything, because the user can't see it all on the monitor anyway. You can use:
Lazy loading: load a folder's contents only when the user asks for them, for example by double-clicking that folder.
UI virtualization: create UI elements only for the folders that are currently visible.
For an example of user-interface virtualization, you can read about and watch the video:
Code Canvas
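As a framework-agnostic sketch of the lazy-loading option: a node only queries the database for its children the first time it is expanded (e.g. on double-click) and caches the result afterwards. The `LazyNode` class and the query delegate below are illustrative names, not from the original posts.

```csharp
using System;
using System.Collections.Generic;

// Sketch of lazy loading: children are fetched from the database only on
// first expansion and cached afterwards. The queryChildren delegate stands
// in for your real data-access call.
public class LazyNode
{
    public string Path { get; set; }
    private List<LazyNode> children;            // null until first expanded

    public IReadOnlyList<LazyNode> Expand(Func<string, IEnumerable<string>> queryChildren)
    {
        if (children == null)                   // hit the database only once
        {
            children = new List<LazyNode>();
            foreach (var name in queryChildren(Path))
                children.Add(new LazyNode { Path = Path + "/" + name });
        }
        return children;
    }
}
```

With ~100,000 folders this means the initial load touches only the root level; the cost of the deep hierarchy is paid incrementally as the user drills down.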
So I am trying to develop an application in C# right now (for practice), a simple file synchronization desktop program where the user can choose a folder to monitor, then whenever a change occurs in said directory, it is copied to another directory.
I'm still in school and just finished my data structures course, so I'm still a bit new to this. What I was thinking is that the best solution would be a tree, right? Then I could use breadth-first search to compare, and if a node doesn't match I would copy the node from the original tree to the duplicate tree. However, that seems like it might be inefficient, because I would be searching the entire tree every time.
I'm possibly considering a linked list too. I really don't know where to go with this. What I've accomplished so far is the directory monitoring, so I can save to a log file every time something changes. So that's good. But I feel like this is the toughest part. Can anyone offer any guidance?
Use a hash table (e.g., Dictionary<string, FileInfo>). One of the properties of a FileInfo is the absolute path to the file: use that as the key.
Hash-table lookups are cheap (and fast).
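A minimal sketch of this idea, assuming a one-way sync keyed on relative paths (the `SyncSketch` and `Index` names are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sketch: index both directories in a Dictionary keyed by relative path,
// then copy anything that is missing or newer in the source.
public static class SyncSketch
{
    public static Dictionary<string, FileInfo> Index(string root)
    {
        return new DirectoryInfo(root)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .ToDictionary(f => f.FullName.Substring(root.Length), f => f);
    }

    public static void Sync(string source, string target)
    {
        var src = Index(source);
        var dst = Index(target);

        foreach (var pair in src)
        {
            FileInfo existing;
            bool needsCopy = !dst.TryGetValue(pair.Key, out existing)
                             || existing.LastWriteTimeUtc < pair.Value.LastWriteTimeUtc;
            if (needsCopy)
            {
                var destPath = target + pair.Key;
                Directory.CreateDirectory(Path.GetDirectoryName(destPath));
                pair.Value.CopyTo(destPath, overwrite: true);
            }
        }
    }
}
```

Each lookup in the target index is O(1), so a full comparison pass is linear in the number of files, with no tree traversal needed.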
I have "successfully" implemented a non-recombining trinomial tree to price certain fixed-income derivatives. (Something like what is shown in the picture below, but with three branches that don't reconnect.)
Unfortunately, it turned out that the number of nodes I can use is severely limited by the available memory. If I build a tree with 20 time steps, this results in 3^19 nodes (about 1.1 billion).
The nodes of each time step are saved in a List<Node>, and these lists are stored in a Dictionary<double, List<Node>>.
Each node is instantiated via new Node(...). I also instantiate each of the lists and the dictionary via new. Perhaps this is the source of my error.
Also, the System.OutOfMemoryException isn't thrown because the Dictionary/List object is too large (as is often the case) but because I seem to have too many Nodes: after a while, new Node(...) can't allocate any more memory. Eventually the 2 GB maximum List capacity will also kick in, I think, seeing as each List grows exponentially larger with each time step.
Perhaps my data-structure is too wasteful or not really suited for the task at hand.
A possible solution could be to save the tree to a file, thus avoiding the memory problem completely. This would, however, necessitate a huge workaround.
Edit:
To add some more background: I need the tree to price path-dependent products, which unfortunately means I have to access all the nodes. What's more, after the tree has been built, I start from the leaves and go backwards in time to determine the price. I also already generate only the nodes I need.
Edit2:
I have given the topic some thought and also considered the various responses. Could it be that I just need to serialize the respective tree levels to the hard drive? So basically, I create one time step (List<Node>), write it to disk, etc. Later on, when I start from the leaves, I just have to load the levels in reverse order.
You basically have two choices: evaluate only the branches you care about (Andrew's yield answer) and don't store results, or build up your tree, save it to disk, and implement a custom collection interface on top of it that accesses the right part of the disk. In the second case you still keep only a minimal amount of data in process memory and rely on the OS to do proper disk caching to make access fast. If you start working with large data sets, that second option is a good tool to have in your tool belt, so you should probably write it with reuse in mind.
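A minimal sketch of the disk-backed variant, along the lines of the question's Edit2: each time step is flushed to its own file as soon as it is built, so only one level ever lives in memory, and backward induction later reads the files in reverse order. Node is reduced here to a single double per node for brevity; the class and file names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Sketch of "one file per time step": write each level to disk as it is
// built, then read levels back in reverse order during backward induction.
public static class LevelStore
{
    public static void WriteLevel(string dir, int step, IReadOnlyList<double> values)
    {
        using (var w = new BinaryWriter(File.Create(Path.Combine(dir, $"level{step}.bin"))))
        {
            w.Write(values.Count);
            foreach (var v in values) w.Write(v);
        }
    }

    public static double[] ReadLevel(string dir, int step)
    {
        using (var r = new BinaryReader(File.OpenRead(Path.Combine(dir, $"level{step}.bin"))))
        {
            var values = new double[r.ReadInt32()];
            for (int i = 0; i < values.Length; i++) values[i] = r.ReadDouble();
            return values;
        }
    }
}
```

Peak memory is then bounded by the size of the two largest adjacent levels rather than the whole tree, and the OS file cache does the rest.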
What we have here is a classic problem of doing an enormous amount of processing up front... and then storing EVERYTHING into memory to be processed at a later time.
While simple, given harsh enough conditions (like having a billion entries), it will eat up all the memory.
Now, the OP didn't really specify what the intention of the tree was or how it was going to be used... but I would propose that instead of building it all at once... build it as you need it.
Lazy Evaluation with yield
Instead of doing everything all at once and having to store it... it might be ideal to do it ONLY when you actually require it. Check out this post for more info and examples of using yield.
This won't work great, though, if you need to traverse the tree many times... but it might still allow you to reach a greater depth than you currently can.
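As a sketch of the yield approach, with illustrative up/mid/down factors: the leaves of a non-recombining trinomial tree can be enumerated on demand, so no level is ever materialized in memory at once.

```csharp
using System;
using System.Collections.Generic;

// Sketch of lazy evaluation with yield: leaf values of a trinomial tree are
// produced one at a time during a depth-first walk, so memory use is O(depth)
// rather than O(3^depth). The factors 1.1 / 1.0 / 0.9 are placeholders.
public static class LazyTrinomial
{
    public static IEnumerable<double> Leaves(double value, int steps)
    {
        if (steps == 0)
        {
            yield return value;
            yield break;
        }
        foreach (var factor in new[] { 1.1, 1.0, 0.9 })      // up, mid, down
            foreach (var leaf in Leaves(value * factor, steps - 1))
                yield return leaf;
    }
}
```

The trade-off is exactly the one noted above: every enumeration re-walks the tree, so this suits single-pass computations, not repeated traversals.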
I don't think serializing to disk will help much. For one, when you attempt to deserialize the list you will still run out of memory (as, to the best of my knowledge, there is no way to partially deserialize an object).
Have you considered changing your data structure into a relational database model and storing it in a SQLEXPRESS database?
This would give you the added benefit of performing queries with indexes instead of your custom tree traversal logic.
I have done some research already as to how I can achieve the title of this question. The app I am working on has been under development for a couple of years or so (slow progress though, you all know how it is in the real world). It is now a requirement for me to put in Undo/Redo multiple level functionality. It's a bit late to say "you should have thought about this before you started" ... well, we did think about it - and we did nothing about it and now here it is. From searching around SO (and external links) I can see that the two most common methods appear to be ...
Command Pattern
Memento Pattern
The command pattern looks like it would be a hell of a lot of work, I can only imagine it throwing up thousands of bugs in the process too so I don't really fancy that one.
The Memento pattern is actually a lot like what I had in my head for this. I was thinking that if there were some way to quickly take a snapshot of the object model currently in memory, I would be able to store it somewhere (maybe in memory, maybe in a file). It seems like a great idea; the only problem I can see is how it will integrate with what we have already written. You see, the app as we have it draws images in a big panel (potentially hundreds) and then allows the user to manipulate them either via the UI or via a custom-built properties grid. The entire app is linked up with a big observer pattern. The second anything changes, events are fired and everything that needs to update does. This is nice, but I can't help thinking that if a user is entering text into a textfield on the properties grid there will be a bit of delay before the UI catches up (since every time the user presses a key, a new snapshot would be added to the undo list). So my questions to you are:
Do you know of any good alternatives to the Memento pattern that might work?
Do you think the Memento pattern will fit in here, or will it slow the app down too much?
If the Memento pattern is the way to go, what is the most efficient way to make a snapshot of the object model (I was thinking of serialising it or something)?
Should the snapshots be stored in memory or is it possible to put them into files?
If you have got this far, thank you kindly for reading. Any input you have will be valuable and very much appreciated.
Well, here are my thoughts on this problem.
1- You need multi-level undo/redo functionality, so you need to store the user actions performed, which can be kept in a stack.
2- Your second problem is identifying what has been changed by an operation; I think doing that through the Memento pattern is quite a challenge. Memento is all about storing the initial object state in memory.
Alternatively, you need to store what is changed by each operation, so that you can use this information to undo it.
The Command pattern is designed for undo/redo functionality, and I would say that, late as it is, it is worthwhile to implement the design that has been used for years and works for most applications.
If performance allows it you could serialize your domain before each action. A few hundred objects is not much if the objects aren't big themselves.
Since your object graph is probably non-trivial (i.e., uses inheritance, cycles, ...), the built-in XmlSerializer and JsonSerializers are out of the question. Json.NET supports these, but does lossy conversions on some types (local DateTimes, numbers, ...), so it's bad too.
I think the protobuf serializers need either some form of DTD (.proto file) or decoration of all properties with attributes mapping their names to numbers, so it might not be optimal.
BinaryFormatter can serialize most stuff, you just need to decorate all classes with the [Serializable] attribute. But I haven't used it myself, so there might be pitfalls I'm not aware of. Perhaps related to Singletons or events.
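One way to keep the undo machinery independent of whichever serializer you settle on is to make the snapshot type generic, as in this sketch (all names here are illustrative):

```csharp
using System;
using System.Collections.Generic;

// Sketch of a memento-style undo stack that is agnostic about how snapshots
// are taken: the caller captures state however it likes (deep clone, JSON,
// BinaryFormatter, ...) and hands the result in as TSnapshot.
public class UndoStack<TSnapshot>
{
    private readonly Stack<TSnapshot> undo = new Stack<TSnapshot>();
    private readonly Stack<TSnapshot> redo = new Stack<TSnapshot>();

    // Call just before mutating the document.
    public void Record(TSnapshot snapshot)
    {
        undo.Push(snapshot);
        redo.Clear();                       // a new edit invalidates redo history
    }

    public bool TryUndo(TSnapshot current, out TSnapshot previous)
    {
        if (undo.Count == 0) { previous = default(TSnapshot); return false; }
        redo.Push(current);
        previous = undo.Pop();
        return true;
    }

    public bool TryRedo(TSnapshot current, out TSnapshot next)
    {
        if (redo.Count == 0) { next = default(TSnapshot); return false; }
        undo.Push(current);
        next = redo.Pop();
        return true;
    }
}
```

Because the stack never inspects the snapshot, you can start with whole-document serialisation and later swap in per-object diffs without touching the undo logic.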
The critical things for undo/redo are
knowing what state you need to save and restore
knowing when you need to save the state
Adding undo/redo after the fact is always a painful thing to do - (I know this comment is of no use to you now, but it's always best to design support into the application framework before you start, as it helps people use undo-friendly patterns throughout development).
Possibly the simplest approach will be a memento-based one:
Locate all the data that makes up your "document". Can you unify this data in some way so that it forms a coherent whole? Usually, if you can serialise your document structure to a file, the logic you need is in the serialisation system, so that gives you a way in. The downside to using this directly is that you will usually have to serialise everything, so your undo will be huge and slow. If possible, refactor the code so that:
(a) there is a common serialisation interface used throughout the application, so any and every part of your data can be saved/restored using a generic call;
(b) every sub-system is encapsulated so that modifications to the data have to go through a common interface (rather than lots of code modifying member variables directly, it should call an API provided by the object to request that it makes changes to itself);
(c) every sub-portion of the data keeps a "version number" that is incremented every time an alteration is made through the interface in (b).
This approach means you can now scan your entire document, use the version numbers to find just the parts that have changed since you last looked, and then serialise the minimal amount needed to save and restore the changed state.
Provide a mechanism whereby a single undo step can be recorded. This means allowing multiple systems to make changes to the data structure, and then, when everything has been updated, triggering an undo recording. Working out when to do this may be tricky, but it can usually be accomplished by scanning your document for changes (see above) in your message loop, once your UI has finished processing each input event.
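A minimal sketch of points (b) and (c) above, with illustrative names: each part of the document mutates itself only through its own API and bumps a version counter, so the undo system can scan for just the parts that changed since the last snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch: every sub-object exposes a version number that advances only when
// its own mutation API is used, letting a change scanner find the minimal
// set of parts to serialise for an undo step.
public abstract class VersionedPart
{
    public int Version { get; private set; }
    protected void MarkChanged() { Version++; }
}

public class Shape : VersionedPart
{
    public double X { get; private set; }
    public void MoveTo(double x) { X = x; MarkChanged(); }   // all edits go through the API
}

public static class ChangeScanner
{
    // Returns the parts whose version advanced past the versions we last saw.
    public static IEnumerable<VersionedPart> ChangedSince(
        IEnumerable<VersionedPart> parts, IDictionary<VersionedPart, int> lastSeen)
    {
        return parts.Where(p => !lastSeen.ContainsKey(p) || lastSeen[p] < p.Version);
    }
}
```

Running the scan once per processed input event, as suggested above, turns an arbitrary burst of mutations into one coherent undo step.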
Beyond that, I'd advise going for a command based approach, because there are many benefits to it besides undo/redo.
You may find the Monitored Undo Framework to be useful. http://muf.codeplex.com/
It uses something similar to the memento pattern, by monitoring for changes as they happen and allows you to put delegates on the undo stack that will reverse / redo the change.
I considered an approach that would serialise/deserialise the document but was concerned about the overhead. Instead, I monitor for changes in the model (or view model) on a property-by-property basis. Then, as needed, I use the MUF library to "batch" related changes so that they undo/redo as a single unit of change.
The fact that you have your UI setup to react to changes in the underlying model is good. It sounds like you could inject the undo / redo logic there and the changes would bubble up to the UI.
I don't think that you'd see much lag or performance degradation. I have a similar application, with a diagram that we render based on the data in the model. We've had good results with this so far.
You can find more info and documentation on the codeplex site at http://muf.codeplex.com/. The library is also available via NuGet, with support for .NET 3.5, 4.0, SL4 and WP7.
Which one is better from a performance point of view: a user control or a custom control?
Right now I am using a user control, and in a specific scenario I am creating around 200 (approx.) different instances of this control, but it is a bit slow to load and I need to wait at least 20-30 seconds for the operation to complete. What should I do to increase the performance?
Edit:
The scenario is:
In my Window I have a TreeView, each item of which represents a different user-defined type, so I have defined a DataTemplate for each type. These DataTemplates use user controls, and those user controls are bound to properties of the user-defined types. Simply put, the TreeView maps a hierarchical data structure of user-defined types. I read from XML, create the hierarchical structure, and assign it to the TreeView, and it takes a lot of time to load. Any help?
I have an application that loads around 500 small controls. We originally built these as user controls, but loading the BAML seems to cause the controls to load slowly (each one is really fast, but by the time we get to around 300, the total of all of them together adds up). The user controls also seem to use a good amount of memory. We switched these to custom controls, and the app launches almost twice as fast and takes up about 1/3 the RAM. Not saying this will always be the case, but custom controls made a big difference for us.
FYI: Here's a link on using a VirtualizingPanel with the TreeView: http://msdn.microsoft.com/en-us/library/cc716882.aspx
Make sure to SuspendLayout while adding controls en masse. Try to completely configure the control before adding it to any container.
Here is the follow-up article on my issues with WPF's VirtualizingStackPanel and TreeView. I hope this helps you.
http://lucisferre.net/2010/04/21/virtualizing-stack-panel-wpf-part-duex/
Long story short: it is possible to do the navigation with the current VSP, but it is a bit of a hack. The current VSP design needs a rework, as the way it currently virtualizes the View breaks the coupling between the View and the ViewModel, which in turn breaks the whole concept of MVVM.
I worked at Microsoft and was not allowed to use the UserControl because of its poor performance. We always created controls in C#. I'm not sure about the performance of DataTemplates, but I am interested in knowing whether it is better. I suspect that it is.
I'm trying to create a menu system that allows you to go backward and forward while returning the final selected data to the calling method.
Take, for example, an orderFood() method that displays a menu of the types of food that can be ordered. If someone selects seafood, a seafood() method runs, queries what types of seafood are available to order, and displays them.
If the user then selects fishsticks, fishsticks would be returned to the method that called orderFood(). Likewise, this menu system would allow the user to go back to the previous menu.
I'm thinking (Using C#) I'd have to use reflection and unsafe code (pointers) to get this sort of effect but I am positive that there is a simpler way to do this. Any suggestions?
Thanks,
Michael
Instead of thinking about the menus as a stack, try thinking of them like a tree.
If you do this, it should be fairly easy to "walk" up and down the tree to get the back-and-forward behaviour you were planning to implement with a stack.
This would be fairly easy to read from a file or database (very easy from XML, in particular), and also shouldn't be too tough to walk up and down.
There isn't really anything in this that should require unsafe code or reflection - it can all be done with standard collections in C#.
You could easily do what you are describing without unsafe code, provided you know the menus at compile time, by making it data-driven. Instead of thinking of menus as routines that do these things, think of menus as a class of objects that do these things. Ommm.
Even if you don't know everything at compile time (say you need to read the options from a file) you could still do it by building the nest of objects which represent your menus at run time, based on the contents of the file.
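A minimal sketch of that data-driven approach, with illustrative names: menus are plain objects in a tree, going back is just following the Parent link, and selecting a leaf returns its label to the caller. No reflection or unsafe code is involved.

```csharp
using System;
using System.Collections.Generic;

// Sketch of a data-driven menu tree: each node knows its parent and children,
// so back navigation is a pointer hop and a leaf selection bubbles its label
// back to whoever started the menu loop.
public class MenuNode
{
    public string Label;
    public MenuNode Parent;
    public List<MenuNode> Children = new List<MenuNode>();

    public MenuNode Add(string label)
    {
        var child = new MenuNode { Label = label, Parent = this };
        Children.Add(child);
        return child;
    }

    // Drives the menu with a sequence of choices: an index picks a child,
    // -1 goes back to the parent menu. Returns the label of the chosen leaf.
    public string Run(IEnumerable<int> choices)
    {
        var current = this;
        foreach (var c in choices)
        {
            current = (c == -1) ? (current.Parent ?? current) : current.Children[c];
            if (current.Children.Count == 0) return current.Label;   // leaf selected
        }
        return null;
    }
}
```

In a real application the choices would come from user input rather than a list, and the tree itself could be built from a file or database, as suggested in the answers above.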