Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 1 year ago.
Here is my scenario:
A customer complains that files of a specific type appear in a specific folder when saving files with our system. Our system works together with another system that was not developed by us. I want to find out which system is creating these files.
This is what makes it more difficult:
I suspect these files are temporary, so in my small development setup it is nearly impossible to spot them. But when working with many thousands of files, I expect the number of temporary files to grow and, given the limits of the customer's hardware, each one to exist for much longer.
So what I am looking for is a tool that traces all changes to the contents of a specific folder. Ideally, I could filter for a specific type of file. It should work on Windows 10.
My questions:
Does anybody know such a tool, or could you give me a suitable keyword for searching?
Or is this too specific, so I have to make my own tool?
In the 2nd case I usually prefer C#/.NET. Is there anything suitable available, which I can extend or change or should use? (e.g. a tool or framework or NuGet, e.g. extending a tool such as Everything)
The namespace System.IO has a class that allows file and folder monitoring: FileSystemWatcher.
From the documentation:
Listens to the file system change notifications and raises events when
a directory, or file in a directory, changes.
See https://learn.microsoft.com/en-us/dotnet/api/system.io.filesystemwatcher?view=net-6.0.
The documentation above gives a good code example, as well as many explanations of how changes are tracked and of possible limitations.
You can use this class to log each change in a target folder, and then use this log to understand what happens.
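As a rough, language-neutral illustration of such a change log, here is a minimal polling sketch in Python using only the standard library. It is not FileSystemWatcher: it compares directory snapshots, so a file that is created and deleted between two polls will be missed, and like FileSystemWatcher it cannot tell you which process made the change. The function names, the `.tmp` suffix and the one-second interval are assumptions made for the example.

```python
import os
import time


def snapshot(folder, suffix):
    """Map file name -> (size, mtime_ns) for files in `folder` whose
    lower-cased name ends with `suffix` (pass the suffix in lower case)."""
    found = {}
    with os.scandir(folder) as entries:
        for entry in entries:
            try:
                if entry.is_file() and entry.name.lower().endswith(suffix):
                    info = entry.stat()
                    found[entry.name] = (info.st_size, info.st_mtime_ns)
            except FileNotFoundError:
                # The file vanished between listing and stat; typical for temp files.
                continue
    return found


def diff_snapshots(old, new):
    """Return a list of (kind, name) events describing how `new` differs from `old`."""
    events = [("created", n) for n in sorted(new.keys() - old.keys())]
    events += [("deleted", n) for n in sorted(old.keys() - new.keys())]
    events += [("changed", n) for n in sorted(old.keys() & new.keys())
               if old[n] != new[n]]
    return events


def watch(folder, suffix=".tmp", interval=1.0, rounds=None):
    """Poll `folder` and print one timestamped line per detected event.
    Runs forever when `rounds` is None."""
    previous = snapshot(folder, suffix)
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        current = snapshot(folder, suffix)
        for kind, name in diff_snapshots(previous, current):
            print(time.strftime("%H:%M:%S"), kind, name)
        previous = current
        done += 1
```

A shorter interval catches more short-lived files at the cost of more directory scans; an event-based API such as FileSystemWatcher avoids that trade-off.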
If needed, you could then narrow down the issue using a tool like Sysinternals Process Monitor. Assuming you have gathered enough information to reproduce the problem, or to predict roughly when it could happen again, you could use Process Monitor to record system events.
Process Monitor records system events while capture is enabled (the Capture button). You can narrow the events with its Filter mechanism. For instance (this is a simple case), you can filter to show only events from a specific PID (process ID); you can find the PID of your target on the Details tab of Task Manager. If you do not yet know which process is responsible, filtering on the Path of the target folder instead shows which process touched each file. This way you will likely find which process created which file.
Related
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
I'm writing a Web app that allows a user to upload and store files; however I'd like to add a simple version history feature to the files they upload based on the file name.
Is there an existing framework/module I can integrate for the version history part or is it better for me to write it up myself? I feel like there could be a lot of plumbing that's already been done in a framework. I couldn't find any and most of my Google searches turned up actual project version control software.
I'm looking at using .NET and C# to make this Web app.
I don't know of any libraries off the top of my head, but this is something I would probably roll myself anyway. The solution is simple. Take a SHA-1 (or other appropriate) hash of the file bytes, and use that as the filename/primary key in your backing store for that version of the file. This is called 'content-addressable', and is a simplified version of what git does.
One possible benefit of this is that if 2 users upload identical versions of a file, you only have to store it once.
Then you just need a list somewhere that tracks which hashes go in which order for a given user filename.
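A minimal in-memory sketch of that idea, in Python for brevity. The class and method names are made up for illustration, SHA-256 stands in for "other appropriate hash", and a real app would keep the blobs and the per-filename list in its backing store rather than in dictionaries.

```python
import hashlib


class VersionStore:
    """Content-addressable blob store plus an ordered hash list per filename."""

    def __init__(self):
        self.blobs = {}    # hex digest -> file bytes (each distinct content stored once)
        self.history = {}  # user-visible filename -> list of digests, oldest first

    def add(self, filename, data):
        """Store `data` as the newest version of `filename` and return its digest."""
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        versions = self.history.setdefault(filename, [])
        # Re-uploading identical content does not create a new version.
        if not versions or versions[-1] != digest:
            versions.append(digest)
        return digest

    def get(self, filename, version=-1):
        """Return the bytes of a version; index 0 is the oldest, -1 the latest."""
        return self.blobs[self.history[filename][version]]


if __name__ == "__main__":
    store = VersionStore()
    store.add("alice/report.doc", b"first draft")
    store.add("alice/report.doc", b"second draft")
    store.add("bob/copy.doc", b"first draft")  # same bytes, no extra blob
    print(len(store.blobs), store.history["alice/report.doc"])
```

Keying the history by a per-user name (as in the hypothetical `alice/report.doc`) keeps each user's version list separate while the blobs are still shared.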
EDIT:
It's also worth noting that if you were not dealing with blobs, but with structured data or your app objects, you might get much of this functionality from your data store via SQL triggers, or the RavenDB versioning bundle, for example.
I would use a version control system, like Subversion. It is reliable and easy to integrate, and it offers a detailed history and the ability to download any past version. Bonus: you can even diff two versions (obviously this makes sense only for text-based file types).
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
As part of the Discovery process for an upcoming project, I am trying to find a way of taking a representative sample of the PPT files on our network. So far, I have collected & organized all of the PPT files that we have, however I've realized that there is an overwhelming volume of documents, such that I need to find a way to reduce it down. To this end, I was thinking that it'd be helpful to delete all "duplicate" files.
Our company does not have any sort of version control system for files on our network. As such, users often create copies of files in order to make small minor changes. This has led to a high volume of "duplicate" files with no real naming convention, etc. Ideally, I'd be able to make a best-guess as to which files are "duplicates" and keep the most recent version. Since I just need a representative sample, I do not need to be 100% accurate regarding the save/delete decision and it's also ok if I lose a chunk of the files due to (there are currently 135K files, and I expect to end up with 3-5K). I am not sure how to go about this, as tools like http://www.easyduplicatefinder.com/ seem to look for truly identical documents, as opposed to a more nuanced difference.
Here are a couple of additional details:
File names do not follow any standard convention
I think it's fair to assume that many of the PPT properties would remain unchanged across versions
Versions of files are always located in the same folder, however other PPT files may also exist in the same folder
I'm open to addressing this problem in any of the following languages/technologies: C#, VB, Ruby, Python, IronPython, PowerShell
I would approach it like:
extract all visible text strings from each .ppt file
dump the strings into text files, one per .ppt
run diff across all pairs of text files (in the same directory?) to get min edit distance
run the resulting distance matrix through a clustering algorithm
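The last two steps could be sketched as follows in Python, assuming the text of each .ppt has already been extracted into a string (extraction itself is not shown). Two substitutions are made for brevity and should be read as assumptions: `difflib.SequenceMatcher.ratio()` from the standard library stands in for a diff-based edit distance, and a simple threshold-based single-linkage grouping stands in for "a clustering algorithm". The function name and the 0.8 threshold are illustrative only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def cluster_by_similarity(texts, threshold=0.8):
    """Group indices of `texts` whose pairwise similarity ratio is >= threshold.
    Groups are merged transitively (single linkage) using a small union-find."""
    parent = list(range(len(texts)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(texts)), 2):
        if SequenceMatcher(None, texts[i], texts[j]).ratio() >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    extracted = [
        "Quarterly sales review for the north region",
        "Quarterly sales review for the south region",
        "Onboarding checklist for new engineers",
    ]
    print(cluster_by_similarity(extracted))
```

Comparing all pairs is quadratic, so with 135K files it only stays practical if pairs are restricted to files in the same folder, which matches the observation that versions always live together. Within each resulting group you would then keep the file with the latest modification time.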
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have a question about changing a variable of one application from another application.
For example: If in 1.exe I have defined string a="a", how will I be able to change a="a" to a="b" by using another application?
Do I have to get the memory address of the string and then change its content to b? Or is there an easier way?
You can set up a shared resource for the two applications and read the values from there. It could be a database, a cache, or even a simple text file.
Refresh the variables from the shared resource when appropriate.
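A minimal sketch of the simple-text-file variant, in Python for brevity. The file format (JSON), the function names and the file name are assumptions for illustration. The writer replaces the file in one step so that a reader never sees a half-written file.

```python
import json
import os
import tempfile


def write_shared(path, values):
    """Write `values` (a dict) to `path` via a temp file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(values, handle)
    os.replace(tmp_path, path)  # swaps in the new content in a single step


def read_shared(path, default=None):
    """Return the dict stored at `path`, or `default` if nothing was written yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default


if __name__ == "__main__":
    shared = os.path.join(tempfile.gettempdir(), "shared_values.json")  # hypothetical location
    write_shared(shared, {"a": "b"})           # done by the second application
    print(read_shared(shared, default={}))     # done by 1.exe when it refreshes
```

This only works when both applications agree to use the file; it does not change a variable inside a program you do not control.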
Given the scenario you have mentioned (i.e. you do not control the code of the first application), the general idea applies: open the target process with admin privileges, find the memory location you want to update, and then update it.
However, be warned that it will generally not be that simple. For example,
It can be extremely hard to predict how many copies of the variable the application's logic maintains, and where.
Without disassembling the code (by no means a trivial task; none of this is), scanning for the value and guessing the memory location is the only option that comes to mind. But it carries the risk of guessing wrong and corrupting the entire process.
PS - There is freely available software that attempts to do exactly what I've described above. I'd advise that you examine how it works (and which scenarios it supports) to get a better idea of what you are trying to accomplish.
PPS - Also be careful what you download. Applications like these, if downloaded from unreliable sites, can be damaging or a security risk.
I think the easiest way is to communicate over network sockets on localhost via UDP or TCP. This gives you a good event mechanism, so you can handle incoming data without frequently polling for changes, and it does not matter how many applications are communicating with each other at the same time. Other solutions such as shared memory will be hard to control, especially when you are running three or more apps.
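A minimal sketch of the UDP variant on localhost, in Python for brevity. The `name=value` message format and the function names are assumptions for illustration; both applications have to agree on them, and the receiving application must contain this listening code.

```python
import socket


def open_receiver(host="127.0.0.1"):
    """Bind a UDP socket on a free local port; the receiving app keeps it open."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))   # port 0 lets the OS pick a free port
    sock.settimeout(2.0)   # recvfrom raises socket.timeout if nothing arrives
    return sock


def send_update(address, name, value):
    """Send one 'name=value' datagram to the receiver's (host, port) address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(f"{name}={value}".encode("utf-8"), address)


def receive_update(sock):
    """Block until one datagram arrives and return it as a (name, value) pair."""
    data, _sender = sock.recvfrom(4096)
    name, _, value = data.decode("utf-8").partition("=")
    return name, value


if __name__ == "__main__":
    receiver = open_receiver()                         # inside the first application
    send_update(receiver.getsockname(), "a", "b")      # inside the second application
    print(receive_update(receiver))
    receiver.close()
```

In practice the receiver would use a fixed, agreed port and call `receive_update` from a background thread or event loop, then assign the new value to its own variable.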
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
There has been a lot of discussion on SO about using blobs vs. files to store binaries, but the current issue I'm facing involves virus scanning. There are likely a lot of APIs that can be used to scan files saved to a file system. Are there any for blobs? Are there APIs that can be given streams or byte[]s and told to scan them for viruses and malware? If so, does anybody have any recommendations? Or is this yet another reason to steer clear of blobs?
FYI - I'm using C# and MongoDb right now for my blobs.
I needed the kind of solution this question asks about. I evaluated a lot of things and came to the conclusion that there was really not one good .NET library for this. So I made my own.
The library is called nClam, and it connects to a ClamAV server. It is an open-source (Apache License 2.0) library with a very simple API. You can obtain it here: https://github.com/tekmaven/nClam. There is also a NuGet package: nClam. I also have some instructions on how to set up the ClamAV server on my blog, here: http://architectryan.com/2011/05/19/nclam-a-dotnet-library-to-virus-scan/.
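To show why a ClamAV server suits blobs, here is a rough Python sketch of sending an in-memory byte buffer to the clamd daemon with its INSTREAM command, so no file has to be written. This is not nClam's API; the function names are made up, and the host and port (127.0.0.1, 3310) are assumptions about where your clamd listens. The framing follows the clamd documentation: the command, then chunks prefixed with a 4-byte big-endian length, then a zero-length chunk.

```python
import socket
import struct


def frame_instream(data, chunk_size=8192):
    """Yield the byte sequences that make up one INSTREAM request for `data`."""
    yield b"zINSTREAM\0"                              # null-terminated command
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        yield struct.pack("!I", len(chunk)) + chunk   # 4-byte length, then the chunk
    yield struct.pack("!I", 0)                        # zero-length chunk ends the stream


def scan_bytes(data, host="127.0.0.1", port=3310):
    """Send `data` to a running clamd and return its raw reply line as text.
    Requires a reachable clamd; host and port here are assumptions."""
    with socket.create_connection((host, port), timeout=10) as sock:
        for part in frame_instream(data):
            sock.sendall(part)
        reply = sock.recv(4096)
    return reply.rstrip(b"\0").decode("utf-8", "replace")
```

A caller would pass the blob read from the database to `scan_bytes` and inspect the reply, which clamd ends with `OK` for clean data or `FOUND` when a signature matches.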
I don't know if APIs exist for scanning in-memory data (I haven't found any), but you can always write your binary data to a temporary file, scan the file (by calling an external command-line program) and delete it when the scan is done.
Certainly Sophos's API (SAVI) can scan arbitrary memory buffers - you can provide call-backs for accessing the data, so it can be any data you can access.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I'm working on a project which generates (composite) Microsoft Word documents which are comprised of one or more child documents. There are tens of thousands of permutations of the composite documents. Far too many for users to easily manage. Users will need to view/edit the child documents through the app which hides all of the nasty implementation details. A requirement of the system is that the child documents must be version controlled. That is what has been tripping me up.
I've been torn between using an off-the-shelf solution or rolling my own. At a minimum, the system needs to support get latest, get specific version, add new, rename and possibly delete. I’ve whiteboarded it enough to realize it won’t be a trivial task to create my own. As far as commercial systems I have VSS and TFS at my disposal. I've played with the TFS API some, but it isn’t as intuitive or well documented as I had hoped. I'm not averse to an open source solution (e.g. SVN), but I have less familiarity with them.
Which approach or tool would you recommend? Why? Do you have any links to API documentation you would recommend?
Environment: C#, VS2008, SQL Server 2005/2008, low volume (a few hundred operations per day)
SharePoint does a pretty good job of document management, with versioning, etc. It also has plenty of APIs and is a much more modern approach than using the COM layer for VSS. SP would be a good solution if you are writing this as an enterprise solution (dedicated server, etc), but not so good for a desktop or small-business/SOHO app.
It's actually pretty easy to get rolling with document versioning in SharePoint. If you set up a new list, you will be able to define version options for attachments and list items right in the SP list settings.
You can also get much more detailed control over versioning by using the SP web services. If you're planning on doing all of your document access from within your application, and don't want to push users into the SharePoint site, I would use this approach. Here is a good tutorial to get started with SP versioning
Give Plastic SCM a try. It's distributed, has a great GUI, can also work in a centralized mode, and you'll find plenty of .NET assemblies to hook your code into.