I have several managed (.NET) processes communicating over a ring buffer that is held in shared memory via the MemoryMappedFile class (memory only, no backing file). I know from the SafeBuffer reference source that writing a struct to that memory is guarded by a CER (Constrained Execution Region), but what if the writing process is abnormally terminated by the OS while doing so? Can this lead to the struct being written only partially?
struct MyStruct
{
public int A;
public int B;
public float C;
}
static void Main(string[] args)
{
var mappedFile = MemoryMappedFile.CreateOrOpen("MyName", 10224);
var accessor = mappedFile.CreateViewAccessor(0, 1024);
MyStruct myStruct;
myStruct.A = 10;
myStruct.B = 20;
myStruct.C = 42f;
// Assuming the process gets terminated during the following write operation.
// Is that even possible? If it is possible what are the guarantees
// in regards to data consistency? Transactional? Partially written?
accessor.Write(0, ref myStruct);
DoOtherStuff(); ...
}
It is hard to simulate or test whether this problem really exists, since writing to memory is extremely fast. However, it would certainly lead to a severe inconsistency in my shared memory layout and would force me to guard against it with, for example, checksums or some sort of page flipping.
Update:
Looking at Line 1053 in
https://referencesource.microsoft.com/#mscorlib/system/io/unmanagedmemoryaccessor.cs,7632fe79d4a8ae4c
it basically comes down to the question of whether a process is protected from abnormal termination while executing code in a CER block (with the Consistency.WillNotCorruptState flag set).
Yes, a process can be stopped at any moment.
The SafeBuffer<T>.Write method finally calls into
[MethodImpl(MethodImplOptions.InternalCall)]
[ResourceExposure(ResourceScope.None)]
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
private static extern void StructureToPtrNative(/*ref T*/ TypedReference structure, byte* ptr, uint sizeofT);
which basically does a memcpy(ptr, structure, sizeofT). Since unaligned writes are never atomic (except for single bytes), you will run into issues if your process is terminated in the middle of writing a value.
When a process is terminated the hard way, via TerminateProcess or an unhandled exception, no CERs or anything related are executed. There is no graceful managed shutdown in that case, and your application can be stopped right in the middle of an important transaction. Your shared memory data structures will be left in an orphaned state, and for any locks you might have taken, the next waiter's WaitForSingleObject call will return WAIT_ABANDONED. That is how Windows tells you that a process died while it held the lock and that you need to recover the changes made by the last writer.
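To make such a torn write detectable, you can frame each record with a sequence number that the writer bumps before the payload and repeats after it (a seqlock-style commit marker). The following is only a minimal sketch of that idea against the MyStruct and view accessor from the question; the layout ([int seq1][MyStruct payload][int seq2] at a given offset) and the method names are my own assumptions, not part of SafeBuffer or the real code.
// requires System.IO.MemoryMappedFiles and System.Runtime.InteropServices
static void WriteFramed(MemoryMappedViewAccessor accessor, long offset, MyStruct value)
{
    int payloadSize = Marshal.SizeOf(typeof(MyStruct));
    int seq = accessor.ReadInt32(offset) + 1;
    accessor.Write(offset, seq);                    // 1) announce "write in progress"
    accessor.Write(offset + 4, ref value);          // 2) payload; may be torn if we die here
    accessor.Write(offset + 4 + payloadSize, seq);  // 3) commit marker
}
static bool TryReadFramed(MemoryMappedViewAccessor accessor, long offset, out MyStruct value)
{
    int payloadSize = Marshal.SizeOf(typeof(MyStruct));
    int seq1 = accessor.ReadInt32(offset);
    accessor.Read(offset + 4, out value);
    int seq2 = accessor.ReadInt32(offset + 4 + payloadSize);
    return seq1 == seq2;                            // false => the writer died (or is still writing)
}
A real implementation would also have to think about memory ordering and about how a recovering reader rolls back, but it shows how a checksum or sequence scheme can detect the partially written struct the question worries about.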
When I query
fileSystemWatcher.InternalBufferSize
it returns the total internal buffer size allocated to the watcher. But I want to know (during debugging) how much of the watcher's buffer is left and can still be used. When I use the statement above in the event handler method (say, for a write operation), it always gives me the total buffer size allocated to the watcher. Is there any way to obtain the remaining size of the buffer?
Other Questions:
From this answer, it is clear that the event is handled on a separate thread from the thread that received the event. Suppose many concurrent events arrive for a single watcher that is watching a file. What I think (correct me if I am wrong) is that the main thread which received the event information will spawn a new thread for each event, and the events will be processed on those different threads. So I want to ask:
Will the main thread wait to finish the processing of all the events?
Which thread will clear the internal buffer associated with the Watcher and when?
I have read in lots of places that the handler method should take as little time as possible, or we can get an InternalBufferOverflowException.
So, is it safe to assume that the internal buffer for the watcher is only cleaned up when the thread(s) (I can't say one or all, but that is part of my question) processing the handler method have finished running it?
No, you can't know how much buffer is left.
It is an implementation detail hidden in an internal class called FSWAsyncResult. Even if you get hold of an instance of that class and the byte array buffer it contains, you still can't reliably answer how much space is left, because that byte array only acts as reserved memory for the result of a call to ReadDirectoryChangesW.
At the bottom of this answer you'll find a stripped-down, reverse-engineered version of watching a folder for file changes. Its logic and code match what you'll find in the real FileSystemWatcher. I didn't bother to replace the magic constants with their proper meaning; it just works™. Don't forget to enable the unsafe build setting, as the code fiddles with pointers and native structures a lot. And I stripped out all error handling ...
If you follow the code below, you'll notice that there is only one place where the byte[] buffer is created, and that only happens once. That same buffer is re-used. From reading the documentation, blogs and worker and I/O threads, I understand that ReadDirectoryChangesW is used to issue a callback in an I/O-completion fashion. That doesn't matter much for the managed world; it is just another thread.
The callback is scheduled on a managed thread-pool thread. Sometimes you'll get the same managed thread id you had before; when the pool is busy you'll get several different ones. On that thread CompletionStatusChanged is executed, and that method is responsible for processing all the events present in the current byte buffer. Notice that I included a sizeused variable so you can see the actual amount of valid data in the buffer. For each event it finds, it raises/calls the subscribers of the events synchronously (so on the same thread). Once that is complete it calls Monitor again with the same byte[] buffer it just processed. Any file changes that happen while CompletionStatusChanged is executing are kept by the OS and sent the next time CompletionStatusChanged is called.
tl;dr;
Here is a recap of the answers to your questions:
... I want to know (during debugging) how much of the watcher's buffer is left and can still be used
There is only one buffer in use, and it makes no sense to ask how much is used or how much is left. Once your event handlers have been called, the buffer is reset and starts at 0 again. An exception is raised when there are more events than the byte buffer can hold.
Will the main thread wait to finish the processing of all the events?
The OS will issue an asynchronous callback via an I/O completion port, but that manifests itself as a normal managed thread-pool thread. That thread handles all the events in the current buffer and calls the event handlers.
Which thread will clear the internal buffer associated with the Watcher and when?
The thread that executes the CompletionStatusChanged method. Note that in my testing the buffer was never cleared (as in filled with zeroes); data was simply overwritten.
I have read in lots of places that the handler method should take as little time as possible or we can get an InternalBufferOverflowException. So, is it safe to assume that the internal buffer for the watcher is only cleaned up when the thread(s) processing the handler method have finished running it?
You should keep your processing as short as possible, because there is only one thread that calls all the event handlers, and in the end it has to call ReadDirectoryChangesW again. During that time the OS keeps track of file changes. When those file-change events don't fit in the buffer, an InternalBufferOverflowException is raised the next time the completion method is called.
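As an illustration only (not part of the FileSystemWatcher internals shown below), a common way to keep the handler that short is to merely enqueue the event on the callback thread and do the heavy lifting elsewhere; DoExpensiveWork and the folder path are placeholders:
// requires System.IO, System.Collections.Concurrent, System.Threading.Tasks
var queue = new BlockingCollection<FileSystemEventArgs>();
var watcher = new FileSystemWatcher(@"c:\temp\delete");
watcher.Changed += (s, e) => queue.Add(e);     // cheap: just hand the event off
watcher.Created += (s, e) => queue.Add(e);
watcher.EnableRaisingEvents = true;

Task.Run(() =>
{
    foreach (var e in queue.GetConsumingEnumerable())
        DoExpensiveWork(e);                    // slow work happens off the callback thread
});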
Setup
A simple console app, with a ReadLine to keep it running while waiting for events.
static object instance = new object(); // HACK
static SafeFileHandle hndl; // holds our filehandle (directory in this case)
static void Main(string[] args)
{
// the folder to watch
hndl = NativeMethods.CreateFile(@"c:\temp\delete", 1, 7, IntPtr.Zero, 3, 1107296256, new SafeFileHandle(IntPtr.Zero, false));
// this selects IO completion threads in the ThreadPool
ThreadPool.BindHandle(hndl);
// this starts the actual listening
Monitor(new byte[4096]);
Console.ReadLine();
}
Monitor
This method is responsible for creating the native structures and an instance of a helper class that acts as the IAsyncResult implementation.
This method also calls ReadDirectoryChangesW and chooses the combination of parameters that sets it up for asynchronous completion with I/O completion ports. More background on those options can be found in Understanding ReadDirectoryChangesW - Part 1.
static unsafe void Monitor(byte[] buffer)
{
Overlapped overlapped = new Overlapped();
// notice how the buffer goes here as instance member on AsyncResult.
// Arrays are still Reference types.
overlapped.AsyncResult = new AsyncResult { buffer = buffer };
// CompletionStatusChanged is the method that will be called
// when filechanges are detected
NativeOverlapped* statusChanged = overlapped.Pack(new IOCompletionCallback(CompletionStatusChanged), buffer);
fixed (byte* ptr2 = buffer)
{
int num;
// this is where the magic starts
NativeMethods.ReadDirectoryChangesW(hndl,
new HandleRef(instance, (IntPtr)((void*)ptr2)),
buffer.Length,
1,
(int)(NotifyFilters.FileName | NotifyFilters.LastAccess | NotifyFilters.LastWrite | NotifyFilters.Attributes),
out num,
statusChanged,
new HandleRef(null, IntPtr.Zero));
}
}
CompletionStatusChanged
The CompletionStatusChanged method is called by the OS as soon as a file change is detected. In the Overlapped structure we find, after unpacking, our earlier AsyncResult instance with a filled buffer. The remainder of the method then decodes the byte array, reading the offset of any following event in the array as well as the flags and the filename.
// this gets called by a ThreadPool IO Completion thread
static unsafe void CompletionStatusChanged(uint errorCode, uint numBytes, NativeOverlapped* overlappedPointer)
{
var sb = new StringBuilder();
Overlapped overlapped = Overlapped.Unpack(overlappedPointer);
var result = (AsyncResult) overlapped.AsyncResult;
var position = 0;
int offset;
int flags;
int sizeused = 0;
string file;
// read the buffer,
// that can contain multiple events
do
{
fixed (byte* ptr = result.buffer)
{
// process FILE_NOTIFY_INFORMATION
// see https://msdn.microsoft.com/en-us/library/windows/desktop/aa364391(v=vs.85).aspx
offset = ((int*)ptr)[position / 4];
flags = ((int*)ptr + position / 4)[1];
int len = ((int*)ptr + position / 4)[2];
file = new string((char*)ptr + position / 2 + 6, 0, len / 2);
sizeused = position + len + 14;
}
sb.AppendFormat("#thread {0}, event: {1}, {2}, {3}, {4}\r\n", Thread.CurrentThread.ManagedThreadId, position, offset, flags, file);
// in the real FileSystemWatcher the several events are raised here,
// so that happens on the same thread this code runs on.
position += offset;
} while (offset != 0);
// my own logging
sb.AppendFormat(" === buffer used: {0} ==== ", sizeused);
Console.WriteLine(sb);
// start again, reusing the same buffer:
Monitor(result.buffer);
}
Helper methods
The AsyncResult class implements IAsyncResult (all members empty) and holds the byte array buffer as a member.
The NativeMethods class is exactly what its name says: entry points for native calls into the Win32 API.
class AsyncResult : IAsyncResult
{
internal byte[] buffer;
// default implementation of the interface removed for brevity
}
static class NativeMethods
{
[DllImport("kernel32.dll", BestFitMapping = false, CharSet = CharSet.Auto)]
public static extern SafeFileHandle CreateFile(string lpFileName, int dwDesiredAccess, int dwShareMode, IntPtr lpSecurityAttributes, int dwCreationDisposition, int dwFlagsAndAttributes, SafeFileHandle hTemplateFile);
[DllImport("kernel32.dll", CharSet = CharSet.Auto, SetLastError = true)]
public unsafe static extern bool ReadDirectoryChangesW(SafeFileHandle hDirectory, HandleRef lpBuffer, int nBufferLength, int bWatchSubtree, int dwNotifyFilter, out int lpBytesReturned, NativeOverlapped* overlappedPointer, HandleRef lpCompletionRoutine);
}
I have a C# program calling a DLL method written in C++ to get a pointer to memory allocated on my graphics card using cudaMalloc. Later I pass this pointer to some CUDA method of the same DLL. This works fine for data up to 2 GB, but as soon as I try to keep pointers to more than two 1 GB data chunks, the program terminates without any error message:
char* test1 = CudaDllWrapper.getDeviceCharPointerTo1GBData(filename);
char* test2 = CudaDllWrapper.getDeviceCharPointerTo1GBData(filename);
char* test3 = CudaDllWrapper.getDeviceCharPointerTo1GBData(filename); // program terminates in this line
The CUDA DLL code is this:
char* getDeviceCharPointerTo1GBData (const char* a_pcFileName) {
char* pcLargeData = ReadPreRasteredImageAsChar(a_pcFileName);
char* pcPrerasteredImage_dyn = NULL;
unsigned long long iSourceImageSize_byte = getFileSize(a_pcFileName);
size_t freeMem, total;
cudaMemGetInfo(&freeMem, &total);
if (freeMem > iSourceImageSize_byte)
cudasafe(cudaMalloc((void **)&pcPrerasteredImage_dyn, iSourceImageSize_byte), "Original image allocation ", __FILE__, __LINE__);
else
return NULL;
}
As you can see, I check that sufficient memory is left on the graphics card. There still seems to be enough memory, so the DLL method calls cudaMalloc, which seems to cause the program to be terminated. When I skip the cudaMalloc on the third call to getDeviceCharPointerTo1GBData (by passing a bool), the program no longer terminates.
I am running Windows 7, and now I am wondering whether WDDM is making my life difficult with its 2 GB limit. I expected cudaMalloc to simply fail, not the whole calling C# application to be terminated. Can it be that Windows 7 terminates my program when it tries to allocate graphics card memory past this 2 GB limit? And how can I prevent such a crash and return a null pointer instead?
The termination was caused by a __debugbreak() being hit when there was too little memory, which caused an unhandled exception and thus program termination.
I have the following code that uses new .NET 4.5 multi-threading functionality.
Action2 is a call to a Windows API library (MLang) through interop.
BlockingCollection<int> _blockingCollection = new BlockingCollection<int>();
[Test]
public void Do2TasksWithThreading()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
var tasks = new List<Task>();
for (int i = 0 ; i < Environment.ProcessorCount; i++)
{
tasks.Add((Task.Factory.StartNew(() => DoAction2UsingBlockingCollection(i))));
}
for (int i = 1; i < 11; i++)
{
DoAction1(i);
_blockingCollection.Add(i);
}
_blockingCollection.CompleteAdding();
Task.WaitAll(tasks.ToArray());
stopwatch.Stop();
Console.WriteLine("Total time: " + stopwatch.ElapsedMilliseconds + "ms");
}
private void DoAction2UsingBlockingCollection(int taskIndex)
{
WriteToConsole("Started wait for Action2 Task: " + taskIndex);
int index;
while (_blockingCollection.Count > 0 || !_blockingCollection.IsAddingCompleted)
{
if (_blockingCollection.TryTake(out index, 10))
DoAction2(index);
}
WriteToConsole("Ended wait for Action2 Task: " + taskIndex);
}
private void DoAction2(int index)
{
... Load File bytes
//Call to MLang through interop
Encoding[] detected = EncodingTool.DetectInputCodepages(bytes, 1);
... Save results in concurrent dictionary
}
I did some testing with this code, and increasing the number of threads from 1 to 2 to 3, etc. doesn't make the process run any faster. It looks like the threads are waiting for the interop call to finish, which makes me think it is using a single thread for some reason.
Here is the definition of the interop method:
namespace MultiLanguage
{
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Security;
[ComImport, InterfaceType((short) 1), Guid("DCCFC164-2B38-11D2-B7EC-00C04F8F5D9A")]
public interface IMultiLanguage2
[MethodImpl(MethodImplOptions.InternalCall, MethodCodeType=MethodCodeType.Runtime)]
void DetectInputCodepage([In] MLDETECTCP flags, [In] uint dwPrefWinCodePage,
[In] ref byte pSrcStr, [In, Out] ref int pcSrcSize,
[In, Out] ref DetectEncodingInfo lpEncoding,
[In, Out] ref int pnScores);
Is there anything that can be done to make this use multiple threads? The only thing I noticed that would require a single thread is MethodImplOptions.Synchronized, but that is not being used in this case.
The code for EncodingTools.cs was taken from here:
http://www.codeproject.com/Articles/17201/Detect-Encoding-for-In-and-Outgoing-Text
... Load File bytes
Threads can speed up your program when your machine has multiple processor cores, which is easy to get these days. Your program is, however, liable to spend a good bit of time in this invisible code: disk I/O is very slow compared to the raw processing speed of a modern processor. And you still have only a single disk, so there is no concurrency there at all; threads will just wait their turn to read data from the disk.
[ComImport, InterfaceType((short) 1), Guid("DCCFC164-2B38-11D2-B7EC-00C04F8F5D9A")]
public interface IMultiLanguage2
This is a COM interface, implemented by the CMultiLanguage coclass. You can find it in the registry with Regedit.exe; the HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\{275C23E2-3747-11D0-9FEA-00AA003F8646} key contains the configuration for this coclass. Threading is not a detail left up to the client programmer in COM; a COM coclass declares what kind of threading it supports with the ThreadingModel key.
The value for CMultiLanguage is "Both". That is good news, but it now greatly matters exactly how you created the object. If the object is created on an STA thread, the default for the main thread in a Winforms or WPF project, then COM ensures all the code stays thread-safe by marshaling interface method calls from your worker threads to the STA thread. That causes a loss of concurrency: the threads take turns entering the single-threaded apartment.
You can only get concurrency when the object is created on an MTA thread, the kind you get from a thread-pool thread or from your own Thread without a call to its SetApartmentState() method. An obvious approach to ensure this is to create the CMultiLanguage object on the worker thread itself and to avoid having these worker threads share the same object.
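As a quick, purely illustrative sanity check you can print the apartment state from inside the worker; thread-pool threads and a plain new Thread report MTA unless SetApartmentState(ApartmentState.STA) was called:
// run this inside the worker, before creating the CMultiLanguage object
Console.WriteLine(Thread.CurrentThread.GetApartmentState());   // expect: MTA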
Before you start fixing that, you first need to identify the bottleneck in the program. Focus on the file loading first and make sure you get a realistic measurement; avoid running your test program on the same set of files over and over again, since that gives unrealistically good results because the file data is read from the file system cache. Only the first test after a reboot or a file system cache reset gives you a reliable measurement. The SysInternals RamMap utility is very useful for this; use its Empty + Empty Standby List menu command before you start a test to be able to compare apples to apples.
If that shows that the file loading is the bottleneck, then you are done; only better hardware can solve that. If however you measure that the IMultiLanguage2 calls are the bottleneck, then focus on the usage of the CMultiLanguage object. Unless it guarantees otherwise, a COM server typically provides thread-safety by taking care of the locking for you, and such hidden locking can ruin your odds of getting concurrency. The only way to get ahead then is to have the file reading in one thread overlap with the parsing in another.
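A rough sketch of that overlap, reusing the BlockingCollection pattern from the question; EncodingTool comes from the linked CodeProject code, while the input folder, the bounded capacity and the per-thread handling of the MLang object are assumptions of mine:
var fileData = new BlockingCollection<byte[]>(boundedCapacity: 4);

// one reader thread owns the disk
var reader = Task.Factory.StartNew(() =>
{
    foreach (var path in Directory.EnumerateFiles(@"c:\input"))
        fileData.Add(File.ReadAllBytes(path));
    fileData.CompleteAdding();
});

// the parsers overlap with the reader; ideally each one creates and keeps its own MLang object
var parsers = Enumerable.Range(0, Environment.ProcessorCount)
    .Select(_ => Task.Factory.StartNew(() =>
    {
        foreach (var bytes in fileData.GetConsumingEnumerable())
            EncodingTool.DetectInputCodepages(bytes, 1);
    }))
    .ToArray();

Task.WaitAll(parsers);
reader.Wait();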
Try running nunit-console with parameter /apartment=MTA
I have a question about remote threads. I've read Mike Stall's article available here: <Link>
I would like to create a remote thread that executes a delegate in another process, just as Mike Stall does. However, he declares the delegate in the target process, obtains a memory address for it and then creates the remote thread from another process using that address. The code of the target process CANNOT be modified.
So, I cannot use his example, unless I can allocate memory in the target process and then WriteProcessMemory() using my delegate.
I have tried using VirtualAllocEx() to allocate space in the target process but it always returns 0.
This is how it looks so far.
Console.WriteLine("Pid {0}:Started Child process", pid);
uint pidTarget= uint.Parse(args[0]);
IntPtr targetPid= new IntPtr(pidTarget);
// Create delegate I would like to call.
ThreadProc proc = new ThreadProc(MyThreadProc);
Console.WriteLine("Delegate created");
IntPtr fproc = Marshal.GetFunctionPointerForDelegate(proc);
Console.WriteLine("Fproc:"+fproc);
uint allocSize = 512;
Console.WriteLine("AllocSize:" + allocSize.ToString());
IntPtr hProcess = OpenProcess(PROCESS_ALL_ACCESS, false, pidParent);
Console.WriteLine("Process Opened: " + hProcess.ToString());
IntPtr allocatedPtr = VirtualAllocEx(targetPid, IntPtr.Zero, allocSize, AllocationType.Commit, MemoryProtection.ExecuteReadWrite);
Console.WriteLine("AllocatedPtr: " + allocatedPtr.ToString());
Now my questions are:
In the code above, why does VirtualAllocEx() not work? It has been imported using DllImport from kernel32. The allocatedPtr is always 0.
How can I calculate the alloc size? Is there a way to see how much space the delegate might need, or should I just leave it as a large constant?
How do I call WriteProcessMemory() after all of this to get my delegate into the other process?
Thank you in advance.
That blog post is of very questionable value. It is impossible to make this work in the general case. It only works because:
the CLR is known to be available
the address of the method to execute is known
it doesn't require injecting a DLL in the target process
Windows security is unlikely to stop this particular approach
It achieves this by handing the client process everything it needs to get that thread started. The far more typical usage of CreateRemoteThread is when the target process does not cooperate. In other words, you don't have the CLR, you have to inject a DLL with the code, that code can't be managed, you have to deal with the DLL getting relocated, and Windows will balk at all of this.
Anyhoo, addressing your question: you don't check for any errors so you don't know what is going wrong. Make sure your [DllImport] declarations have SetLastError=true, check the return value for failure (IntPtr.Zero here) and use Marshal.GetLastWin32Error() to retrieve the error code.
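For illustration, the pattern looks roughly like this; the VirtualAllocEx declaration is the usual pinvoke.net form, and the numeric flags are the standard MEM_COMMIT | MEM_RESERVE and PAGE_EXECUTE_READWRITE values:
// requires System.ComponentModel and System.Runtime.InteropServices
[DllImport("kernel32.dll", SetLastError = true)]
static extern IntPtr VirtualAllocEx(IntPtr hProcess, IntPtr lpAddress, uint dwSize,
    uint flAllocationType, uint flProtect);

static IntPtr AllocInTarget(IntPtr hProcess, uint size)   // hProcess: the handle from OpenProcess, not a PID
{
    IntPtr ptr = VirtualAllocEx(hProcess, IntPtr.Zero, size,
        0x3000 /* MEM_COMMIT | MEM_RESERVE */, 0x40 /* PAGE_EXECUTE_READWRITE */);
    if (ptr == IntPtr.Zero)
        throw new Win32Exception(Marshal.GetLastWin32Error());   // now you know *why* it failed
    return ptr;
}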
I'm developing an entity that allows the user to copy multiple files asynchronously, with the ability to cancel (and with progress reporting as well). Obviously the copy process runs on a thread different from the one where CopyAsync was called.
My first implementation uses FileStream.BeginRead/BeginWrite with a buffer and reporting progress against number of usages of that buffer.
Later, for educational purposes, I tried to implement the same thing through the Win32 CopyFileEx function. Eventually I stumbled upon the following: this function takes a pointer to a BOOL value which is treated as a cancellation indicator. According to MSDN this value is examined multiple times by Win32 during the copy operation; when the caller sets it to TRUE, the copy operation is cancelled.
The real problem for me is how to create that boolean value, pass it to Win32, and make it accessible to the external user so they have the ability to cancel the copy operation. The user will call CancelAsync(object taskId), so my question is how to get access to that boolean value on another thread from my CancelAsync implementation.
My first attempt was to use a Dictionary where the key is the identifier of the async operation and the value points to a memory slot allocated for the boolean value. When the user calls the CancelAsync(object taskId) method, my class retrieves the pointer to that allocated memory from the dictionary and writes 1 there.
Yesterday I came up with another solution, based on creating a local bool variable in my copy method and holding the address of that variable in the dictionary until the copy operation completes. This approach is illustrated by the following lines of code (very simple and rough, just to show the idea):
class Program
{
// dictionary for storing operation identifiers
public Dictionary<string, IntPtr> dict = new Dictionary<string,IntPtr>();
static void Main(string[] args)
{
Program p = new Program();
p.StartTheThread(); // start the copying operation, in my
// implementation it will be a thread pool thread
}
ManualResetEvent mre;
public void StartTheThread()
{
Thread t = new Thread(ThreadTask);
mre = new ManualResetEvent(false);
t.Start(null);
GC.Collect(); // just to ensure that such solution works :)
GC.Collect();
mre.WaitOne();
unsafe // cancel the copying operation
{
IntPtr ptr = dict["one"];
bool* boolPtr = (bool*)ptr; // obtaining a pointer
// to a local variable on another thread's stack
(*boolPtr) = false;
}
}
public void ThreadTask(object state)
{
// In this thread the Win32 call to CopyFileEx will be made
bool var = true;
unsafe
{
dict["one"] = (IntPtr)(&var); // fill a dictionary
// with cancellation identifier
}
mre.Set();
// The actual Win32 CopyFileEx call would go here
while(true)
{
Console.WriteLine("Dict:{0}", dict["one"]);
Console.WriteLine("Var:{0}", var);
Console.WriteLine("============");
Thread.Sleep(1000);
}
}
}
Actually I'm a bit new to P/Invoke and all the unsafe stuff, so I'm hesitant about the latter approach of holding a pointer to a local variable in a dictionary and exposing it to another thread.
Any other thoughts on how to expose that pointer to a boolean in order to support cancellation of the copy operation?
Ah, so that's what that other thread was about. There's a much better way to accomplish this: CopyFileEx() also supports a progress callback. That callback allows you to update the UI to show progress, and it allows you to cancel the copy; just return PROGRESS_CANCEL from the callback.
Visit pinvoke.net for the callback delegate declaration you'll need.
If your goal is to support being able to cancel a file copy operation in progress, I recommend using a CopyProgressRoutine. This gets called regularly during the copy, and allows you to cancel the operation with a return code. It will let you cancel the operation asynchronously without having to deal with pointers directly.
private class FileCopy
{
private bool cancel = false;
public void Copy(string existingFile, string newFile)
{
if (!CopyFileEx(existingFile, newFile,
CancelableCopyProgressRoutine, IntPtr.Zero, IntPtr.Zero, 0))
{
throw new Win32Exception();
}
}
public void Abort()
{
cancel = true;
}
private CopyProgressResult CancelableCopyProgressRoutine(
long TotalFileSize,
long TotalBytesTransferred,
long StreamSize,
long StreamBytesTransferred,
uint dwStreamNumber,
CopyProgressCallbackReason dwCallbackReason,
IntPtr hSourceFile,
IntPtr hDestinationFile,
IntPtr lpData)
{
return cancel ? CopyProgressResult.PROGRESS_CANCEL :
CopyProgressResult.PROGRESS_CONTINUE;
}
// Include p/invoke definitions from
// http://www.pinvoke.net/default.aspx/kernel32.copyfileex here
}
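Possible usage, purely illustrative (the paths are placeholders and the class would need to be accessible from the calling code): run the copy on a worker thread and flip the flag from wherever the user cancels. Note that when the callback returns PROGRESS_CANCEL, CopyFileEx returns FALSE, so Copy will throw the Win32Exception shown above.
var copy = new FileCopy();
var worker = Task.Run(() => copy.Copy(@"c:\temp\big.bin", @"d:\backup\big.bin"));
// ... later, e.g. from a Cancel button:
copy.Abort();   // the next progress callback returns PROGRESS_CANCEL and the copy stops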
If you do want to use the pbCancel argument, then manually allocating unmanaged memory as you are already doing is probably the safest way to do it. Taking the address of a local variable is a little dangerous because the pointer will no longer be valid once the variable goes out of scope.
You could also use a boolean field in an object rather than a boolean local variable, but you will need to pin it in memory to prevent the garbage collector from moving it. You can do this either using the fixed statement or using GCHandle.Alloc.
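A minimal sketch of the unmanaged-memory option, assuming a CopyFileEx declaration whose pbCancel parameter is typed as IntPtr (rather than the ref int form shown on pinvoke.net); dict and taskId stand in for the Dictionary<string, IntPtr> from the question:
IntPtr pbCancel = Marshal.AllocHGlobal(sizeof(int));
Marshal.WriteInt32(pbCancel, 0);            // 0 = keep copying
dict[taskId] = pbCancel;                    // same dictionary idea as in the question

// worker thread: pass pbCancel to CopyFileEx; the address stays valid for the whole copy

// CancelAsync(taskId), callable from any thread:
Marshal.WriteInt32(dict[taskId], 1);        // nonzero = cancel

// once the copy has returned (cancelled or not):
Marshal.FreeHGlobal(pbCancel);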
Perhaps I'm missing something, but why couldn't you just use the definition already at http://pinvoke.net/default.aspx/kernel32/CopyFileEx.html and then set the ref int (pbCancel) to 1 at cancel time?