This is likely a long shot, but I thought I'd ask anyway. I'm using a document management system's API. They provide a "WriteFile" method to save a given document to disk. However, the library does not have a way to simply read a document into memory. My only option, it seems, is to write to disk, then read it back in again. I'm wondering if there is a better way to work around this obvious limitation.
The method takes a string for the resulting file path. Method signature:
void ImageInfo.WriteFile(string Filename);
Theoretically, it is possible to intercept the WriteFile Win32 API calls of any process, be it .NET, C++, etc., using a technique called Import Address Table (IAT) hooking, which is also a valuable tool for software testing on Windows.
Basically, you could overwrite the WriteFile (kernel32.dll) entry in the Import Address Table so that it points to your own method, and then intercept the bytes the library attempts to write.
There are probably other ways in the layers above, such as in .NET, where you could possibly rewrite the IL of the third-party DLL, or supply your own versions of some .NET DLLs that replace the standard .NET classes.
Practically, it might not really be worth it. For example, if the API does not explicitly flush the file to disk, your subsequent reads might end up coming from the OS file cache and won't be that big a perf problem. You could probably achieve this by creating the file and keeping it open before calling WriteFile (just a guess).
Of course, I suppose you have profiled and measured it already.
You'd need a Windows API hooking library that can call a managed code callback. EasyHook is one such library. Beware that you might find out that you haven't gained anything after you're done: the file system cache already provides direct memory access to file data.
It sounds like the API does not provide the reading part because they can't provide a better (more performant) manner than what is already available in the .NET framework.
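For what it's worth, the write-then-read workaround can at least be wrapped so that the temporary file never outlives the call. This is only a minimal sketch, assuming the library is happy to overwrite an existing (empty) file at the path it is given. TempFileBridge and ReadViaTempFile are made-up names, and the Action<string> parameter stands in for ImageInfo.WriteFile:

```csharp
using System;
using System.IO;

static class TempFileBridge
{
    // Calls a "write to this path" method, reads the result back into memory,
    // and deletes the temporary file even if the write or the read throws.
    public static byte[] ReadViaTempFile(Action<string> writeFile)
    {
        // GetTempFileName creates a unique zero-byte file and returns its path.
        string path = Path.GetTempFileName();
        try
        {
            writeFile(path);
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Usage would be something like byte[] data = TempFileBridge.ReadViaTempFile(imageInfo.WriteFile); where imageInfo is your ImageInfo instance.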
Is it possible to use either File.Delete or File.Encrypt to shred files? Or do both functions leave the actual content on disk untouched?
And if they do overwrite it, does this also work with the wear leveling of SSDs and similar techniques used by other storage devices? Or is there another function that I should use instead?
I'm trying to improve an open source project which currently stores credentials in plaintext within a file. For some reason they are always written to that file (I don't know why Ansible does this, but for now I don't want to touch that part of the code, since there may be a valid reason for it), and all I can do is delete that file afterwards. So is using File.Delete or File.Encrypt the right approach to purge that information off the disk?
Edit: If it is only possible using a native API and P/Invoke, I'm also fine with that. I'm not limited to .NET only, but I am limited to C#.
Edit2: To provide some context: The plaintext credentials are saved by the ansible internals as they are passed as a variable for the modules that get executed on the target windows host. This file is responsible for retrieving the variables again: https://github.com/ansible/ansible/blob/devel/lib/ansible/module_utils/powershell/Ansible.ModuleUtils.Legacy.psm1#L287
https://github.com/ansible/ansible/blob/devel/lib/ansible/module_utils/csharp/Ansible.Basic.cs#L373
There's a possibility that File.Encrypt would do more to help shred data than File.Delete (which definitely does nothing in that regard), but it won't be a reliable approach.
There's a lot going on at both the Operating System and Hardware level that's a couple of abstraction layers separated from the .NET code. For example, your file system may randomly decide to move the location where it's storing your file physically on the disk, so overwriting the place where you currently think the file is might not actually remove traces from where the file was stored previously. Even if you succeed in overwriting the right parts of the file, there's often residual signal on the disk itself that could be picked up by someone with the right equipment. Some file systems don't truly overwrite anything: they just add information every time a change happens, so you can always find out what the disk's contents were at any given point in time.
So if you legitimately cannot prevent a file getting saved, any attempt to truly erase it is going to be imperfect. If you're willing to accept imperfection and only want to mitigate the potential for problems somewhat, you can use a strategy like the ones you've found to try to overwrite the file with garbage data several times and hope for the best.
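To make the "overwrite with garbage and hope for the best" strategy concrete, here is a rough sketch. It is best effort only: for all the reasons above (wear leveling, journaling or copy-on-write file systems, relocated blocks) it cannot guarantee the old bytes are gone. BestEffortShredder, OverwriteAndDelete, and the default of three passes are my own choices, not anything prescribed by .NET or Ansible:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class BestEffortShredder
{
    // Overwrites the file's current length with random bytes a few times,
    // flushes each pass through to the device, then deletes the file.
    public static void OverwriteAndDelete(string path, int passes = 3)
    {
        long length = new FileInfo(path).Length;
        byte[] buffer = new byte[64 * 1024];

        using (var rng = RandomNumberGenerator.Create())
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Write, FileShare.None))
        {
            for (int pass = 0; pass < passes; pass++)
            {
                stream.Position = 0;
                long remaining = length;
                while (remaining > 0)
                {
                    int chunk = (int)Math.Min(buffer.Length, remaining);
                    rng.GetBytes(buffer);
                    stream.Write(buffer, 0, chunk);
                    remaining -= chunk;
                }
                // Flush(true) asks the OS to write its cached data to disk as well.
                stream.Flush(true);
            }
        }

        File.Delete(path);
    }
}
```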
But I wouldn't be too quick to give up on solving the problem at its source. For example, Ansible's docs mention:
A great alternative to the password lookup plugin, if you don’t need to generate random passwords on a per-host basis, would be to use Vault in playbooks. Read the documentation there and consider using it first, it will be more desirable for most applications.
Imagine there's a mission-critical process that'll be used in a business which handles sensitive information (think of Credit Card, social security, patient records...etc). I would think this unit ideally should do whatever it has to do on-the-fly, meaning it won't intentionally write files to disk containing sensitive information. The idea here is that if the computer that runs this process is compromised, no sensitive information can be leaked, at least not by means of files.
What approaches could be taken to, say, come up with a unit test that will fail if the unit under test tries to write any file to disk?
There is the FileSystemWatcher (http://www.c-sharpcorner.com/uploadfile/puranindia/filesystemwatcher-in-C-Sharp/); however, this requires you to know a specific directory. In your case this probably isn't very helpful, since the program could write anything to disk anywhere. This introduces a unique problem. However, I have also found something called Detours from Microsoft, which appears to intercept all native Win32 API calls: http://research.microsoft.com/en-us/projects/detours/ The issue with this is that it's kind of hard to test, and integrating it into unit testing will be a challenge.
When you have to treat your software as "untrusted" in the sense that you need to prove it doesn't do something, testing becomes a complex task that requires you to run them on very controlled environments. When hooking in to the Win32 API, you will be deluged with API calls that need to be processed quickly. This can result in unintentional side effects because the application is not running in a truly native environment.
My suggestion to you (having worked several years doing software testing for Pharma automation to the exacting standards of the FDA) is to create a controlled environment, e.g. a virtual machine, that has a known starting state. This can be accomplished by never actually saving vmdk changes to disk. You then have to take a snapshot of the file system. You can do this by writing a C# app to enumerate all files on the virtual drive, getting their size, some timestamps, and maybe even a hash of each file. Hashing can be time consuming, so you may want (or be able) to skip it. Create some sort of report; the easiest would be to drop the results into a CSV or XML export. You then run your software under normal circumstances for a set period of time. Once this is complete, you run the file system analysis again and compare the results. There are some good apps out there for comparing file contents (like WinMerge). When taking these snapshots, the best way to do it would be to mount the vmdk as a drive in the host OS. This will bypass any file locks the guest OS might have.
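The snapshot-and-compare step could look roughly like the sketch below. It records only size and last-write time (no hashing), and the names FileSystemSnapshot, Take, and Diff are mine. Note that enumerating a whole drive can throw UnauthorizedAccessException on protected folders, which this sketch does not handle:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class FileSystemSnapshot
{
    // Maps every file path under root to a "size|lastWriteTicks" signature.
    public static Dictionary<string, string> Take(string root)
    {
        var snapshot = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            var info = new FileInfo(path);
            snapshot[path] = info.Length + "|" + info.LastWriteTimeUtc.Ticks;
        }
        return snapshot;
    }

    // Lists files that were added, changed, or removed between two snapshots.
    public static List<string> Diff(Dictionary<string, string> before, Dictionary<string, string> after)
    {
        var changes = new List<string>();
        foreach (KeyValuePair<string, string> pair in after)
        {
            string old;
            if (!before.TryGetValue(pair.Key, out old))
                changes.Add("added: " + pair.Key);
            else if (old != pair.Value)
                changes.Add("changed: " + pair.Key);
        }
        foreach (string path in before.Keys)
        {
            if (!after.ContainsKey(path))
                changes.Add("removed: " + path);
        }
        return changes;
    }
}
```

You would call Take on the mounted drive before and after the run; an empty Diff result means no file was created, modified, or deleted as far as size and timestamp can tell.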
This method is time intensive but quite thorough. If you don't need something of this depth, you can use something like Process Monitor and write the output to a file and run a report against that. However in my work I would have to prove that Process Monitor shows all IO before I could use it which can be just as hard as the method I spoke of above.
Just my 2 cents.
UPDATE:
I've been thinking about it, and you might be able to achieve fairly reliable results if you remove all references to System.IO from your code. Write a library to wrap around System.IO that either does not implement a write method, or only implements one that also writes to a log file. In this case, you simply have to validate that every time a write occurs using your library, it gets logged. Then validate using reflection that you don't reference System.IO outside of this new wrapper library. Your tests can then simply look at this log file to make sure only approved writes are occurring. You could make use of a SQL database instead of a flat log file to help avoid cases of tampering or contaminated results. This should be much easier to validate than trying to script a virtual machine setup like I described above. This, of course, all requires you to have access to the source code of the "untrusted" application, although since you are unit testing it, I assume you do.
1st option:
Maybe you could use Code Access Security, but "Deny" is obsolete in .NET 4 (it should still work in previous versions):
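using System.Security.Permissions;

// Unrestricted = true makes the denied permission cover all file I/O;
// without it the attribute describes an empty permission and denies nothing.
[FileIOPermission(SecurityAction.Deny, Unrestricted = true)]
public class MyClass
{
...
}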
You may reactivate this behavior in .NET 4 using NetFx40_LegacySecurityPolicy.
2nd option:
Reducing the privilege level may also work; as far as I know, a downloaded app can't write to the disk and must use a special storage area instead.
3rd option:
Remove any reference to System.IO and replace it with an interface that your code must use to write data to disk.
Then write an implementation that uses System.IO (in a separate project).
In the NUnit test, mock this interface and throw an exception when a method is called.
The problem is ensuring that developers will not call System.IO directly anymore. You can try to do this by enforcing coding rules using FxCop (or other similar tools).
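A small sketch of that third option, written without a mocking framework so it stands on its own. All the names here (IFileStore, DiskFileStore, ForbiddenFileStore, CardProcessor) are invented for illustration, and in a real NUnit test you would wrap the call in a [Test] method:

```csharp
using System;
using System.IO;

// The only way production code is allowed to write to disk.
public interface IFileStore
{
    void WriteAllBytes(string path, byte[] data);
}

// Real implementation; the one place that references System.IO.
public sealed class DiskFileStore : IFileStore
{
    public void WriteAllBytes(string path, byte[] data)
    {
        File.WriteAllBytes(path, data);
    }
}

// Test double: any attempt to write fails the test immediately.
public sealed class ForbiddenFileStore : IFileStore
{
    public void WriteAllBytes(string path, byte[] data)
    {
        throw new InvalidOperationException("Unexpected disk write: " + path);
    }
}

// Hypothetical unit under test that is supposed to work purely in memory.
public sealed class CardProcessor
{
    private readonly IFileStore _files;

    public CardProcessor(IFileStore files)
    {
        _files = files;
    }

    public string Mask(string cardNumber)
    {
        if (cardNumber == null || cardNumber.Length < 4)
            throw new ArgumentException("Card number must have at least 4 characters.", "cardNumber");
        // In-memory only: _files is never touched here.
        return new string('*', cardNumber.Length - 4) + cardNumber.Substring(cardNumber.Length - 4);
    }
}
```

The test then constructs new CardProcessor(new ForbiddenFileStore()) and exercises it; if any code path tries to write, the exception makes the test fail.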
I've run into a bit of a stupid problem today:
In my project I have to use a library (that I can't replace). The problem is that I'm using MemoryStream instead of frequently saving to the HDD (because there are many files, and they are small in size, so it's perfect for MemoryStream), but the library API is built around filesystem access, and one of its functions accepts only a direct path to a file.
How can I still send a string (path) to the method, which creates a new FileStream, without actually touching the hard drive?
For example "\MEMORY\myfile.bin"?
Well, that's a tough one.
Basically, you have three possible solutions:
You can use a reflector to modify the given library.
You can inspect the appropriate method and then, using some reflection magic, you might be able to modify the object at runtime (strongly discouraged).
You can play around with system calls and the API and, by going into low-level ring0 assembly, modify kernel32.dll to redirect I/O requests for your path to memory (maybe that's possible without ring0 access, I am not sure).
Obviously, the most recommended option is to use a reflector to modify the given library. Otherwise, I can't see a solution for you.
In response to the first comment, you can:
use RAMDrive (a program which allocates small chunks of system memory and shows them as a partition)
If the file must exist on the disk (and only disk paths are accepted), then the main option is a virtual filesystem which lets you expose custom data as a filesystem. There exist several options, such as now-dead Dokan, our Solid File System OS Edition and Callback File System (see description of our Virtual Storage product line) and maybe Pismo File Mount would work (never looked at it closely).
It all depends on how the library is constructed.
If it's a 100% managed library that uses a FileStream, you are probably stuck.
If it takes the provided filename and calls the native Win32 CreateFile function, it's possible to give it something other than a file, such as a named pipe.
To test quickly whether it's possible, pass @"\\.\pipe\random_name" to the method: if it responds by saying explicitly that it can't open pipes and filenames beginning with \\.\, well, sorry. On the other hand, if it says it can't find the file, you have a chance to make it work.
You can then create a NamedPipeServerStream and use the same name for your library method call prepended with \\.\pipe\.
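The server side of that idea might look like the following. This is only a sketch under the assumption that the library really does open the path through CreateFile and reads it sequentially; PipeFeeder and ServeOnce are names I made up, and the pipe is created before the method returns so the library cannot race ahead of it:

```csharp
using System.IO.Pipes;
using System.Threading.Tasks;

static class PipeFeeder
{
    // Creates a one-shot named pipe and, once a reader connects,
    // streams the in-memory payload to it in the background.
    public static Task ServeOnce(string pipeName, byte[] payload)
    {
        // Created synchronously, so the pipe exists when this method returns.
        var server = new NamedPipeServerStream(pipeName, PipeDirection.Out);
        return Task.Run(() =>
        {
            using (server)
            {
                server.WaitForConnection();
                server.Write(payload, 0, payload.Length);
                // Block until the reader has consumed everything (Windows only).
                server.WaitForPipeDrain();
            }
        });
    }
}
```

You would call PipeFeeder.ServeOnce("random_name", memoryStream.ToArray()) first, then hand @"\\.\pipe\random_name" to the library method, and finally wait on the returned task.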
You can't "represent" it as a file, but you could "convert" it to a file using a StreamWriter class.
I am not sure the best way to explain this so please leave comments if you do not understand.
Basically, I have a few libraries for various tasks to work with different programs - notification is just one example.
Now, I am building a new program, and I want it to be as lightweight as possible. Whilst I would like to include my notification engine, I do not think many people would actually use its functionality, so, I would rather not include it by default - just as an optional download.
How would I program this?
With unmanaged DLLs and P/Invoke, I can basically wrap the whole lot in a try/catch block, but I am not sure about the managed version.
So far, the best way I can think of is to check if the DLL file exists upon startup then set a field bool or similar, and every time I would like a notification to be fired, I could do an if/check the bool and fire...
I have seen from the debug window that DLL files are only loaded as they are needed. The program would obviously compile as all components will be visible to the project, but would it run on the end users machine without the DLL?
More importantly, is there a better way of doing this?
I would ideally like to have nothing about notifications in my application and somehow have it so that if the DLL file is downloaded, it adds this functionality externally. It really is not the end of the world to have a few extra bytes calling notification("blabla"); (or similar), but I am thinking a lot further down the line when I have much bigger intentions and just want to know best practices for this sort of thing.
I do not think many people would actually use its functionality, so, I would rather not include it by default - just as an optional download.
Such things are typically described as plugins (or add-ons, or extensions).
Since .NET 4, the standard way to do that is with the Managed Extensibility Framework (MEF). It is included in the framework as the System.ComponentModel.Composition assembly and namespace. To get started, it is best to read the MSDN article and the MEF programming guide.
You can use System.Reflection.Assembly and its LoadFile method to dynamically load a DLL. You can then use the methods in Assembly to get Classes, types etc. embedded in the DLL and call them.
If you just check if the .dll exists or load every .dll in a plugin directory you can get what you want.
To your question of whether the program will run on the user's machine without the DLLs being present: yes, the program would run. As long as you don't do something that needs the runtime to load the classes defined in the DLL, it does not matter if the DLL is missing from the machine. As for loading the DLL on demand, I think you are well off using some sort of configuration and reflection (either directly or through some IoC strategy).
Try to load the plugin at startup.
Instead of checking a boolean all over the place, you can create a delegate field for the notification and initialize it to a no-op function. If loading the plugin succeeds, assign the delegate to the plugin implementation. Then everywhere the event occurs can just call the delegate, without worrying about the fact that the plugin might or might not be available.
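Here is roughly what the no-op delegate approach could look like. The names Notifications, Notify, and TryLoadPlugin are mine, and the plugin is assumed to expose a public static method taking a single string. I used Assembly.LoadFrom rather than LoadFile because it also resolves dependencies sitting next to the plugin DLL:

```csharp
using System;
using System.IO;
using System.Reflection;

static class Notifications
{
    // Defaults to a no-op, so callers never need to check whether a plugin is present.
    public static Action<string> Notify = message => { };

    // Tries to bind Notify to a public static void Method(string) in the plugin DLL.
    // Returns false (and leaves the no-op in place) if anything is missing or mismatched.
    public static bool TryLoadPlugin(string dllPath, string typeName, string methodName)
    {
        if (!File.Exists(dllPath))
            return false;
        try
        {
            Assembly assembly = Assembly.LoadFrom(dllPath);
            Type type = assembly.GetType(typeName, true);
            MethodInfo method = type.GetMethod(methodName, new[] { typeof(string) });
            if (method == null || !method.IsStatic)
                return false;
            Notify = (Action<string>)Delegate.CreateDelegate(typeof(Action<string>), method);
            return true;
        }
        catch (Exception)
        {
            // Bad image, missing type, wrong signature, etc.: keep the no-op.
            return false;
        }
    }
}
```

At startup you call TryLoadPlugin once with whatever path and names your plugin uses; everywhere else you just call Notifications.Notify("blabla").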
Is there a way to hook into the Windows File Copy API from C#? I'm aware this would require unmanaged code, but a code sample or starter would be helpful. I've already seen the C++ code, but it's all Greek to me.
UPDATE: I apologize, I should have been more clear about my intentions. I wish to actually change the copy feature of Windows to be more rigid (e.g. allow queuing, scheduling, handling restarts, pauses, etc.). When I said hook, I meant an API hook, so that when someone starts a copy I get the sources and destinations and can handle it to my heart's desire. I'm old school and used to hook the Mac OS API a lot to do these things, so I assumed that in the C++ WINAPI world there was some type of equivalent.
Update:
As others have stated, why not just use System.IO.File.Copy(...)? It calls this same underlying API. As Michael G points out, perhaps you intend to call the CopyFileEx API, which allows you to hook progress-indication callbacks(???) That's really the only reason to P/Invoke file-copy stuff in .NET. Details on how to call CopyFileEx can be found here: http://pinvoke.net/default.aspx/kernel32/CopyFileEx.html
Original answer: (which you really shouldn't use...)
Code snippet removed because you really shouldn't use it...
If you're hell-bent on making busted code, you can find out how to use it at http://pinvoke.net/default.aspx/kernel32/CopyFile.html
I wish to actually change the copy feature of Windows to be more rigid
You shouldn't do that in managed code, because of the same reasons you should not write managed shell extensions.
You can do so by calling System.IO.File.Copy. Its internal implementation already uses the Windows API.
Edit: File.Copy also handles permissions correctly and has the benefit of throwing an exception with meaningful data if something fails, so you don't have to manually check and analyze the return status.
You can use Deviare API Hook that lets you intercept any API from .NET and read parameters using VARIANT types. There is a full example very easy to follow in C#.
The other benefit of using unmanaged Copy File API is the ability to have a progress callback.
Note: as stated in other answers, I would use the managed version of File.Copy as it's safer, and can usually do everything you require.
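For completeness, here is a trimmed-down sketch of calling CopyFileEx with a progress callback. The P/Invoke declaration follows the documented Win32 signature; the wrapper names NativeCopy and Copy and the Action<long, long> callback shape are my own, and the copy flags are left at 0 (overwrite allowed). Windows only, of course:

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class NativeCopy
{
    private const uint PROGRESS_CONTINUE = 0;

    // Matches the Win32 LPPROGRESS_ROUTINE callback.
    private delegate uint CopyProgressRoutine(
        long totalFileSize, long totalBytesTransferred,
        long streamSize, long streamBytesTransferred,
        uint streamNumber, uint callbackReason,
        IntPtr sourceFile, IntPtr destinationFile, IntPtr data);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool CopyFileEx(
        string existingFileName, string newFileName,
        CopyProgressRoutine progressRoutine, IntPtr data,
        ref int cancel, uint copyFlags);

    // Copies a file and reports (bytesTransferred, totalBytes) as the copy proceeds.
    public static void Copy(string source, string destination, Action<long, long> onProgress)
    {
        CopyProgressRoutine routine =
            (total, transferred, streamSize, streamTransferred, streamNumber, reason, src, dst, data) =>
            {
                onProgress(transferred, total);
                return PROGRESS_CONTINUE;
            };

        int cancel = 0;
        // The delegate is passed as an argument, so it stays alive for the duration of the call.
        if (!CopyFileEx(source, destination, routine, IntPtr.Zero, ref cancel, 0))
            throw new Win32Exception(Marshal.GetLastWin32Error());
    }
}
```

Example: NativeCopy.Copy(src, dst, (done, total) => Console.WriteLine(done + " / " + total));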