How would I replicate a file hierarchy from template? - c#

I need to restore a certain file hierarchy in a folder, after it gets deleted (yeah, don't ask).
For now I imagine it as a simple application that gets run by Windows Task Scheduler. While there are some ways to achieve that effect, I wanted to create a simple single exe.
So I put my structure into my project and set every file's Build Action to "Embedded Resource". I can sort of access them through Assembly.GetExecutingAssembly().GetManifestResourceNames() and Assembly.GetExecutingAssembly().GetManifestResourceStream(name); however, I don't see any simple way to preserve the hierarchy that way.
While I'm solving a real problem, the question is more academic in nature - I don't need anything convoluted, like parsing resource name to determine what hierarchy it resulted from. My file structure actually is just 4 files in two folders, if push comes to shove I can just write everything out manually.
I'm sure there should be a simple way to just say "Hey, here's how those files should be arranged, repeat". Maybe resources are a wrong mechanism?

You're right, you can't, at least not directly. Resource names are identifiers, so some characters are replaced during the build, and you can't store empty folders either. You can, however, build a Zip file from the structure, embed that single file, and extract it when needed, with a single call if you like. The framework has built-in zip support (System.IO.Compression). Please note, however, that a ZipArchive can't be accessed concurrently, only sequentially, as there is only a single inner state.
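A minimal sketch of that approach. The resource name "MyApp.Template.zip" is a placeholder; check GetManifestResourceNames() for the actual name your build produces:

```csharp
using System.IO;
using System.IO.Compression;

public static class TemplateRestorer
{
    // Extracts a zipped folder template from the given stream into targetDir,
    // recreating the full directory hierarchy stored in the archive.
    public static void RestoreTemplate(Stream zipStream, string targetDir)
    {
        using var archive = new ZipArchive(zipStream, ZipArchiveMode.Read);
        archive.ExtractToDirectory(targetDir, overwriteFiles: true);
    }
}
```

From Main you would pass `Assembly.GetExecutingAssembly().GetManifestResourceStream("MyApp.Template.zip")` as the stream. ExtractToDirectory recreates subfolders automatically, which sidesteps the resource-name-mangling problem entirely.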

Related

What would be the most effective data structure for storing and comparing directories in C#?

So I am trying to develop an application in C# right now (for practice), a simple file synchronization desktop program where the user can choose a folder to monitor, then whenever a change occurs in said directory, it is copied to another directory.
I'm still in school and just finished my data structures course, so I'm still a bit new to this. What I was thinking is that the best solution would be a tree, right? Then I could use breadth-first search to compare, and if a node doesn't match I would copy the node from the original tree to the duplicate tree. However, that seems like it might be inefficient, because I would be searching the entire tree every time.
Possibly considering a linked list too. I really don't know where to go with this. What I've got accomplished so far is the directory monitoring, so I can save to a log file every time something is changed. So that's good. But I feel like this is the toughest part. Can anyone offer any guidance?
Use a hash table (e.g., Dictionary<string, FileInfo>). One of the properties of a FileInfo is the absolute path to the file: use that as the key.
Hash-table lookups are cheap (and fast).
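As a sketch of that idea (the method names Snapshot and FilesToCopy are illustrative, not a standard API): snapshot each directory into a Dictionary keyed by path, then diff the snapshots; each lookup is O(1) instead of a tree search:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DirDiff
{
    // Builds a path-keyed snapshot of every file under root.
    public static Dictionary<string, FileInfo> Snapshot(string root)
    {
        var map = new Dictionary<string, FileInfo>(StringComparer.OrdinalIgnoreCase);
        foreach (var path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
            map[path] = new FileInfo(path);
        return map;
    }

    // Yields source files that are missing from, or newer than, the mirror.
    public static IEnumerable<string> FilesToCopy(
        Dictionary<string, FileInfo> source, string sourceRoot,
        Dictionary<string, FileInfo> target, string targetRoot)
    {
        foreach (var (path, info) in source)
        {
            var mirrored = Path.Combine(targetRoot, Path.GetRelativePath(sourceRoot, path));
            if (!target.TryGetValue(mirrored, out var other)
                || other.LastWriteTimeUtc < info.LastWriteTimeUtc)
                yield return path;
        }
    }
}
```

Combined with the FileSystemWatcher you already have, you only need to re-snapshot when a change event fires, rather than polling the whole tree.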

How to get access to the files stored on another repository in code

I want to read text files that are located in another repository. Is that possible at all?
We faced the problem of preserving the history of big files when we place them in the same repo. Every commit saves another copy of these files in the history, which leads to very understandable issues. So we decided to create another repo and store them there. But I have no experience with how to access it from the code inside the current solution.
It'd be nice to get the file path of these files from the current solution, so I can read and process them.
If you want to reference something, it either needs to be placed alongside your project, or you need a build step that retrieves it and places it somewhere your project can reference.
If these are actual text files you're wanting to read at runtime, those text files need to be discoverable by some means... The fact they're in another repository doesn't help, because that's just another file path that you aren't aware of.
I'd recommend building/publishing your other repository to some discoverable location that your main project can reference at build time or run time.
You can use a git clone operation and just download the files into your project. In your main project, add rules to .gitignore to keep those big files out of the main repo.
You should take a step back and revisit the original problem - large files bogging down the repo. As I noted in comments, what you say (that each such file is copied in every commit) is not accurate; but it is true that large files - especially large binary files - can cause problems in git repos.
And the standard tool to solve those problems is Git LFS (Large File Storage). It creates a separate LFS store and manages its relationship to the base repo automatically, which means questions about how to manually read files from a different repo can be avoided entirely.

Advice loading resources within a visual c# project

I have been asked to create a project which involves loading profiles into a UI where the user can edit the values. I need to be able to load data from a file within the project, allow the user to make changes, and save back to that file.
All of this has to be contained within an executable, but I am unsure of the best way to approach this. I was thinking of using an XML file, or a text file and just string-splitting it, or even a resources file and just calling out to it.
I thought I would put my problem up on here and see what the community suggest, thanks!
Embedded resources are not intended to be changed at runtime. A database is really easy to auto-create these days using code-first EF, but a file containing XML or JSON would also be a good option (as there are third-party libraries to help you parse the result). Hand-rolling your own string.Split solution is not recommended, because if requirements get more complex in the future, your code may become unmanageable.
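For the JSON route, a minimal load/save sketch with System.Text.Json (the Profile shape and names here are made-up placeholders, not something from your project):

```csharp
using System.IO;
using System.Text.Json;

// Placeholder profile shape; adjust the properties to whatever the UI edits.
public record Profile(string Name, int Age);

public static class ProfileStore
{
    static readonly JsonSerializerOptions Options = new() { WriteIndented = true };

    // Reads a profile from a JSON file on disk.
    public static Profile Load(string path) =>
        JsonSerializer.Deserialize<Profile>(File.ReadAllText(path))!;

    // Writes the (possibly edited) profile back to the same file.
    public static void Save(string path, Profile profile) =>
        File.WriteAllText(path, JsonSerializer.Serialize(profile, Options));
}
```

The user edits the loaded Profile in the UI and Save writes it back to a file next to the executable; unlike an embedded resource, a plain file on disk is writable at runtime.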

How to project any folder changes to a new folder and leave original untouched?

In my program I am calling methods that do lots of changes to a content of a folder, including:
deleting files/folders,
changing files/folders,
adding files/folders,
adding/deleting symbolic links/junctions.
That is no problem so far. But I came up with the idea of optionally projecting the final state of the folder (after all the operations are done) to another folder, so that the original folder remains untouched.
Just copying the folder before applying the operations is not appropriate, because the operations might delete large chunks of data that would have been copied unnecessarily. It occurred to me that a professional programmer would certainly not approach it this way.
Ideally I would write something like this (pseudo code):
originalFolder.Delete(lots of files).Add(Some other stuff, maybe change some permissions etc).ProjectTo(newFolder)
Is there some kind of design pattern or other way I could achieve something like this? Maybe some virtual file system I can do stuff on before materializing it into a separate folder?
I know how to write extension methods and I have already written lots of trivial ones, but I really need to be put on the right path on how to achieve something like this.
If the adding and deleting are done through YOUR APIs, then you can modify the list of files in memory without touching the physical files, and when you are done, apply the changes while copying into the final folder.
Of course, that assumes you don't need to read the new structure back through the file system before committing; I mean that it would stay entirely within your application.
On Linux I would have suggested another solution: hard-link the files into a second folder, so you can actually do whatever you want with the first folder without touching the second. NTFS does support hard links too (via CreateHardLink or mklink /H), though only within the same volume.
If all you want is to delay changes to the original folder until you are certain that you want to commit them, then a Unit of Work pattern might do the trick. Store all operations that are to be applied to the folder in a container, and then commit them sequentially.
This sounds a bit dangerous though, since changes to the original folder before the operations are committed can easily mess things up. In that case you would have to implement some sort of concurrency check to be as certain as possible that all operations will succeed.
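A minimal sketch of that unit-of-work idea (the class and method names are illustrative): queue the operations against an in-memory view, and only touch the file system when projecting into the target folder. Deleted files are skipped during the copy, so they are never copied at all:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class FolderProjection
{
    readonly HashSet<string> _deleted = new(StringComparer.OrdinalIgnoreCase);
    readonly Dictionary<string, string> _added = new(StringComparer.OrdinalIgnoreCase);

    // Queue operations; nothing touches the disk yet.
    public FolderProjection Delete(string relativePath) { _deleted.Add(relativePath); return this; }
    public FolderProjection Add(string relativePath, string contents) { _added[relativePath] = contents; return this; }

    // Materialize the final state into targetRoot, leaving sourceRoot untouched.
    public void ProjectTo(string sourceRoot, string targetRoot)
    {
        foreach (var file in Directory.EnumerateFiles(sourceRoot, "*", SearchOption.AllDirectories))
        {
            var rel = Path.GetRelativePath(sourceRoot, file);
            if (_deleted.Contains(rel)) continue; // deleted files are never copied
            var dest = Path.Combine(targetRoot, rel);
            Directory.CreateDirectory(Path.GetDirectoryName(dest)!);
            File.Copy(file, dest, overwrite: true);
        }
        foreach (var (rel, contents) in _added)
        {
            var dest = Path.Combine(targetRoot, rel);
            Directory.CreateDirectory(Path.GetDirectoryName(dest)!);
            File.WriteAllText(dest, contents);
        }
    }
}
```

This matches the fluent pseudocode in the question (`originalFolder.Delete(...).Add(...).ProjectTo(newFolder)`); a real version would also need operation types for renames, permission changes, and links.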

How can I determine when a file was most recently renamed?

I have a program that compares files in two folders. I want to detect if a file has been renamed, determine the newest file (most recently renamed), and update the name on the old file to match.
To accomplish this, I would check to see if the newest file is bit by bit identical to the old one, and if it is, simply rename the old file to match the new one.
The problem is, I have nothing to key on to tell me which file was most recently renamed.
I would love some property like FileInfo.LastModified, but for files that have been renamed.
I've already looked at solutions like FileSystemWatcher, and that is not really what I'm looking for. I would like to be able to run my synchronizer whenever I want, without having to worry about some dedicated process tracking a folder's state.
Any ideas?
A: At least on NTFS, you can attach alternate data streams to a file.
On your first sync, you can just attach a GUID in an ADS to the source files to tag them.
B: If you don't have write access to the source, store hashes of the files you synced in your target repository. When the source changes, you only have to hash the source files, and only compare bit-by-bit when the hashes match. Depending on the quality and speed of your hash function, this will save you a lot of time.
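A sketch of the hashing step from option B, using SHA-256; with a cryptographic hash, matching digests can in practice be treated as identical content, so the bit-by-bit check becomes a belt-and-braces step:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHash
{
    // Streams the file through SHA-256 and returns the digest as hex,
    // so large files can be compared without loading them into memory.
    public static string Sha256(string path)
    {
        using var sha = SHA256.Create();
        using var stream = File.OpenRead(path);
        return Convert.ToHexString(sha.ComputeHash(stream));
    }
}
```

Storing these digests alongside file names and timestamps gives you a durable record to compare against on the next sync run, with no dedicated watcher process.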
If you are running on an NTFS drive you can enable the change journal which you can then query for things like rename events. However you need to be an admin to enable it to begin with and it will use disk space. Unfortunately I don't know of any specific C# implementations of reading the journal.
You could possibly create a config file that holds a list of all expected names within the folder, and then, if a file in the folder is not a member of the expected list of names, determine that the file has then been renamed. This would, however, add another layer of work considering you'd have to change the list every time you wish to add a new file to the folder.
Filesystems generally do not track this.
Since you seem to be on Windows, you can use GetFileInformationByHandle(). (Sorry, I don't know the C# equivalent.) You can use the "file index" fields in the struct returned to see if files have the same index as something you've seen before. Keep in mind that hardlinks will also have the same index.
Alternatively you could hash file contents somehow.
I don't know precisely what you're trying to do, so I can't tell you whether either of these points makes sense. It could be that the most reasonable answer is, "no, you can't do that."
I would compute a CRC (e.g. CRC example) of (all?) the files in the two directories, storing the last update time together with the CRC value, file name, etc. After that, iterate through the lists finding matches by CRC, and then use the date values to decide what to do.