Proper way of passing a pointer for P/Invoke function - c#

I'm developing a component that lets the user copy multiple files asynchronously, with cancellation support and progress reporting. The copying runs on a different thread from the one where CopyAsync was called.
My first implementation uses FileStream.BeginRead/BeginWrite with a buffer and reports progress based on how many times that buffer has been used.
Later, for educational purposes, I tried to implement the same thing through the Win32 CopyFileEx function. I then stumbled on the following: this function takes a pointer to a BOOL value that is treated as a cancellation indicator. According to MSDN, Win32 examines this value multiple times during the copy operation. When the user sets this value to TRUE, the copy operation is cancelled.
The real problem for me is how to create a boolean value, pass it to Win32, and make it accessible to an external caller so that they can cancel the copy operation. The user will call CancelAsync(object taskId), so my question is how to get access to that boolean value from another thread in my CancelAsync implementation.
My first attempt was to use a Dictionary where the key is the identifier of the async operation and the value points to a memory slot allocated for the boolean value. When the user calls CancelAsync(object taskId), my class retrieves the pointer to that allocated memory from the dictionary and writes 1 there.
Yesterday I developed another solution, based on creating a bool local variable in my copying method and holding the address of that variable in the dictionary until the copy operation completes. The approach is illustrated by the following code (very simple and rough, just to show the idea):
class Program
{
// dictionary for storing operation identifiers
public Dictionary<string, IntPtr> dict = new Dictionary<string,IntPtr>();
static void Main(string[] args)
{
Program p = new Program();
p.StartTheThread(); // start the copying operation, in my
// implementation it will be a thread pool thread
}
ManualResetEvent mre;
public void StartTheThread()
{
Thread t = new Thread(ThreadTask);
mre = new ManualResetEvent(false);
t.Start(null);
GC.Collect(); // just to ensure that such solution works :)
GC.Collect();
mre.WaitOne();
unsafe // cancel the copying operation
{
IntPtr ptr = dict["one"];
bool* boolPtr = (bool*)ptr; // obtaining a reference
// to local variable in another thread
(*boolPtr) = false;
}
}
public void ThreadTask(object state)
{
// In this thread the Win32 call to CopyFileEx will be made
bool var = true;
unsafe
{
dict["one"] = (IntPtr)(&var); // fill a dictionary
// with cancellation identifier
}
mre.Set();
// Actually Win32 CopyFileEx call will be here
while(true)
{
Console.WriteLine("Dict:{0}", dict["one"]);
Console.WriteLine("Var:{0}", var);
Console.WriteLine("============");
Thread.Sleep(1000);
}
}
}
I'm a bit new to P/Invoke and unsafe code, so I'm hesitant about the latter approach of holding a reference to a local variable in a dictionary and exposing it to another thread.
Any other thoughts on how to expose that pointer to boolean in order to support cancellation of copying operation?

Ah, so that's what that other thread was about. There's a much better way to accomplish this: CopyFileEx() also supports a progress callback. That callback allows you to update the UI to show progress, and it allows you to cancel the copy: just return PROGRESS_CANCEL from the callback.
Visit pinvoke.net for the callback delegate declaration you'll need.

If your goal is to support being able to cancel a file copy operation in progress, I recommend using a CopyProgressRoutine. This gets called regularly during the copy, and allows you to cancel the operation with a return code. It will let you cancel the operation asynchronously without having to deal with pointers directly.
private class FileCopy
{
private volatile bool cancel = false; // set by Abort() on another thread, read in the callback
public void Copy(string existingFile, string newFile)
{
if (!CopyFileEx(existingFile, newFile,
CancelableCopyProgressRoutine, IntPtr.Zero, IntPtr.Zero, 0))
{
throw new Win32Exception();
}
}
public void Abort()
{
cancel = true;
}
private CopyProgressResult CancelableCopyProgressRoutine(
long TotalFileSize,
long TotalBytesTransferred,
long StreamSize,
long StreamBytesTransferred,
uint dwStreamNumber,
CopyProgressCallbackReason dwCallbackReason,
IntPtr hSourceFile,
IntPtr hDestinationFile,
IntPtr lpData)
{
return cancel ? CopyProgressResult.PROGRESS_CANCEL :
CopyProgressResult.PROGRESS_CONTINUE;
}
// Include P/Invoke definitions from
// http://www.pinvoke.net/default.aspx/kernel32.copyfileex here
}
If you do want to use the pbCancel argument, then manually allocating unmanaged memory as you are already doing is probably the safest way to do it. Taking the address of a local variable is a little dangerous because the pointer will no longer be valid once the variable goes out of scope.
You could also use a boolean field in an object rather than a boolean local variable, but you will need to pin it in memory to prevent the garbage collector from moving it. You can do this either using the fixed statement or using GCHandle.Alloc.
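A minimal sketch of the unmanaged-memory approach described above, assuming the flag is a 4-byte Win32 BOOL. The CancelFlag class and its member names are hypothetical and are not part of any library.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: owns a 4-byte Win32 BOOL in unmanaged memory.
// The garbage collector never moves or frees this memory, so the
// address stays valid until Dispose is called.
public sealed class CancelFlag : IDisposable
{
    private readonly object _gate = new object();
    private IntPtr _ptr;

    public CancelFlag()
    {
        _ptr = Marshal.AllocHGlobal(sizeof(int));
        Marshal.WriteInt32(_ptr, 0); // FALSE: do not cancel
    }

    // Address to hand to a native function that expects an LPBOOL.
    public IntPtr Pointer
    {
        get { lock (_gate) { return _ptr; } }
    }

    public bool IsSet
    {
        get { lock (_gate) { return _ptr != IntPtr.Zero && Marshal.ReadInt32(_ptr) != 0; } }
    }

    // Safe to call from any thread; does nothing once the flag is disposed.
    public void Cancel()
    {
        lock (_gate)
        {
            if (_ptr != IntPtr.Zero) Marshal.WriteInt32(_ptr, 1); // TRUE: cancel
        }
    }

    // Call only after the native call that uses Pointer has returned.
    public void Dispose()
    {
        lock (_gate)
        {
            if (_ptr != IntPtr.Zero)
            {
                Marshal.FreeHGlobal(_ptr);
                _ptr = IntPtr.Zero;
            }
        }
    }
}
```

One way to use it would be to keep a Dictionary of task identifiers to CancelFlag instances, pass flag.Pointer to the native call, and have CancelAsync look up the flag and call Cancel().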

Perhaps I'm missing something, but why couldn't you just use the definition already at http://pinvoke.net/default.aspx/kernel32/CopyFileEx.html and then set the ref int (pbCancel) to 1 at cancel time?
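For reference, a sketch of declarations along those lines, written from the documented Win32 signature rather than copied from pinvoke.net. The NativeMethods class name is an assumption, and passing dwCopyFlags as a plain uint is a simplification. As far as I know, a blittable ref int is passed to native code by address and pinned for the duration of the call, so a field written from another thread should be visible to CopyFileEx, but treat that as an assumption to verify.

```csharp
using System;
using System.Runtime.InteropServices;

// Return values of the progress callback (values from WinBase.h).
public enum CopyProgressResult : uint
{
    PROGRESS_CONTINUE = 0,
    PROGRESS_CANCEL = 1,
    PROGRESS_STOP = 2,
    PROGRESS_QUIET = 3
}

// Why the callback was invoked.
public enum CopyProgressCallbackReason : uint
{
    CALLBACK_CHUNK_FINISHED = 0,
    CALLBACK_STREAM_SWITCH = 1
}

// Managed counterpart of LPPROGRESS_ROUTINE.
public delegate CopyProgressResult CopyProgressRoutine(
    long totalFileSize, long totalBytesTransferred,
    long streamSize, long streamBytesTransferred,
    uint dwStreamNumber, CopyProgressCallbackReason dwCallbackReason,
    IntPtr hSourceFile, IntPtr hDestinationFile, IntPtr lpData);

// Hypothetical container class for the import.
public static class NativeMethods
{
    // pbCancel is declared as ref int here; setting the referenced
    // variable to 1 (TRUE) during the copy asks Win32 to cancel it.
    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool CopyFileEx(
        string lpExistingFileName, string lpNewFileName,
        CopyProgressRoutine lpProgressRoutine, IntPtr lpData,
        ref int pbCancel, uint dwCopyFlags);
}
```

The declaration only binds to kernel32.dll when CopyFileEx is first called, so it compiles anywhere but can only be invoked on Windows.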

Related

Beyond "honor code", is there a difference between using a dedicated "lock object" and locking data directly?

I have two threads: one that feeds updates and one that writes them to disk. Only the most recent update matters, so I don't need a PC queue.
In a nutshell:
The feeder thread drops the latest update into a buffer, then sets a flag to indicate a new update.
The writer thread checks the flag, and if it indicates new content, writes the buffered update to disk and disables the flag again.
I'm currently using a dedicated lock object to ensure that there's no inconsistency, and I'm wondering how that differs from locking the flag and buffer directly. The only difference I'm aware of is that a dedicated lock object requires trust that everyone who wants to manipulate the flag and buffer uses the lock.
Relevant code:
private object cacheStateLock = new object();
string textboxContents;
bool hasNewContents;
private void MainTextbox_TextChanged(object sender, TextChangedEventArgs e)
{
lock (cacheStateLock)
{
textboxContents = MainTextbox.Text;
hasNewContents = true;
}
}
private void WriteCache() // running continually in a thread
{
string toWrite;
while (true)
{
lock (cacheStateLock)
{
if (!hasNewContents)
continue;
toWrite = textboxContents;
hasNewContents = false;
}
File.WriteAllText(cacheFilePath, toWrite);
}
}
First of all, if you use a bool flag in this manner, you should mark it as volatile (which isn't recommended at all, yet is better than your code).
Second, the lock statement is syntactic sugar for the Monitor class methods, so even if you could pass it a value type (which is a compile error, by the way), the value would be boxed into a new object each time, so two threads would each lock on their own copy of the flag and the lock would be useless. You must therefore give the lock statement a reference type.
Third, strings are immutable in C#, so assigning new text replaces the reference, and some method could still be holding the old string reference and lock on the wrong object. The string taken from MainTextbox.Text could also be null in your case, which would throw at run time, whereas a private lock object never changes (you should mark it as readonly, by the way).
So introducing a dedicated object for synchronization is the easiest and most natural way to separate locking from the actual logic.
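A small sketch of that idea, assuming a single-slot buffer in which only the most recent value matters. LatestValue and its members are hypothetical names chosen for illustration.

```csharp
using System;

// Hypothetical single-slot buffer: keeps only the most recent value.
public sealed class LatestValue<T> where T : class
{
    // Private, readonly, reference-type lock object: outside code cannot
    // lock on it, and it can never be reassigned or become null.
    private readonly object _gate = new object();
    private T _value;
    private bool _hasNew;

    // Called by the feeder thread; overwrites any value not yet taken.
    public void Publish(T value)
    {
        lock (_gate)
        {
            _value = value;
            _hasNew = true;
        }
    }

    // Called by the writer thread; returns true and clears the flag
    // when a value has been published since the last successful take.
    public bool TryTake(out T value)
    {
        lock (_gate)
        {
            bool taken = _hasNew;
            value = taken ? _value : null;
            _hasNew = false;
            return taken;
        }
    }
}
```

Because the flag and the buffer are private to the class and only touched inside the lock, callers no longer need to be trusted to take the lock themselves.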
As for your original code, it has a problem: MainTextbox_TextChanged can overwrite text that has not yet been written to disk. You can add some extra synchronization logic or use a library here. @Aron suggested Rx; I personally prefer TPL Dataflow, but either works.
You can add the BroadcastBlock linked to ActionBlock<string>(WriteCache), which will remove the infinite loop from WriteCache method and the lock from both of your methods:
var broadcast = new BroadcastBlock<string>(s => s);
var consumer = new ActionBlock<string>(s => WriteCache(s));
broadcast.LinkTo(consumer);
// fire and forget
private async void MainTextbox_TextChanged(object sender, TextChangedEventArgs e)
{
await broadcast.SendAsync(MainTextbox.Text);
}
// running continually in a thread without a loop
private void WriteCache(string toWrite)
{
File.WriteAllText(cacheFilePath, toWrite);
}

Detect Boolean value changes inside Thread

I have a C++ DLL function that I want to run inside a C# thread.
Sometimes I need to cancel that thread, and here is the issue: Thread.Abort() is evil, according to the multitude of articles I've read on the topic.
The only way to do that was to use a bool and check its value periodically.
My problem is that even when I set this value to true, it doesn't change and is still false in the C++ code. However, when I show a MessageBox, the value changes and it works fine.
Any ideas why the value changes only when the MessageBox is shown? Please tell me how to fix this issue.
C#
public void AbortMesh()
{
if (currMeshStruct.Value.MeshThread != null && currMeshStruct.Value.MeshThread.IsAlive)
{
//here is my c++ Object and cancel mesh used to set bool to true;
MeshCreator.CancelMesh();
}
}
C++
STDMETHODIMP MeshCreator::CancelMesh(void)
{
this->m_StopMesh = TRUE;
return S_OK;
}
When I test the boolean value:
if (m_StopMesh)
return S_FALSE;
The value here is always false, even after I call AbortMesh():
if (m_StopMesh)
return S_FALSE;
MessageBox(NULL,aMessage,L"Test",NULL);
if (m_StopMesh) // here the value is changed to true
return S_FALSE;
Non-deterministic thread abortion (as with Thread.Abort) is a really bad practice. The problem is that it is the only practice that allows you to stop a job when the job does not know that it could be stopped.
There is no library or framework in .NET that I know of that lets you run an arbitrary task and abort it at any time without dire consequences.
So you were completely right when you decided to abort manually using some synchronization technique.
Solutions:
1) The simplest one is to use a volatile Boolean variable, as was already suggested:
C#
public void AbortMesh()
{
if (currMeshStruct.Value.MeshThread != null && currMeshStruct.Value.MeshThread.IsAlive)
{
MeshCreator.CancelMesh();
}
}
C++/CLI
public ref class MeshCreator
{
private:
volatile System::Boolean m_StopMesh;
...
}
STDMETHODIMP MeshCreator::CancelMesh(void)
{
this->m_StopMesh = TRUE;
return S_OK;
}
void MeshCreator::ProcessMesh(void)
{
Int32 processedParts = 0;
while(processedParts != totalPartsToProcess)
{
ContinueProcessing(processedParts);
processedParts++;
if (this->m_StopMesh)
{
this->MakeCleanup();
MessageBox(NULL,aMessage,L"Test",NULL);
return; // stop processing once cancellation has been requested
}
}
}
Such code should not require any synchronization if you do not make any assumptions on completion of thread after the CancelMesh call - it is not instantaneous and may take variable amount of time to happen.
I don't know why volatile didn't help you, but there are a few things you could check:
Are you sure that the MeshCreator.CancelMesh() call actually happens?
Are you sure that m_StopMesh is properly initialized before the actual processing begins?
Are you sure that you check the variable inside ProcessMesh often enough to get a decent response time from your worker, and that you are not expecting something instantaneous?
2) Also, if you use .NET 4 or higher, you could try the CancellationToken/CancellationTokenSource model. It was initially designed to work with Tasks, but it works well with standard threads too. It won't really simplify your code, but given the asynchronous nature of your processing code, it may simplify future integration with the TPL:
CancellationTokenSource cancTokenSource = new CancellationTokenSource();
CancellationToken cancToken = cancTokenSource.Token;
Thread thread = new Thread(() =>
{
Int32 iteration = 0;
while (true)
{
Console.WriteLine("Iteration {0}", iteration);
iteration++;
Thread.Sleep(1000);
if (cancToken.IsCancellationRequested)
break;
}
});
thread.Start();
Console.WriteLine("Press any key to cancel...");
Console.ReadKey();
cancTokenSource.Cancel();
3) You may want to read about the Interlocked class, Monitor locks, AutoResetEvent and other synchronization primitives, but they are not actually needed in this application.
EDIT:
Well, I don't know why it didn't help (it is not the best idea, but it should work for such a scenario), so I'll try later to mock up your app and check the issue. Possibly it has something to do with how MSVC and CSC handle the volatile specifier.
For now try to use Interlocked reads and writes in your app:
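public ref class MeshCreator
{
private:
// Interlocked::Read takes a 64-bit integer, not a Boolean,
// so the flag is kept as an Int32: 0 = keep running, 1 = stop.
System::Int32 m_StopMesh;
...
}
STDMETHODIMP MeshCreator::CancelMesh(void)
{
Interlocked::Exchange(m_StopMesh, 1);
return S_OK;
}
void MeshCreator::ProcessMesh(void)
{
Int32 processedParts = 0;
while(processedParts != totalPartsToProcess)
{
ContinueProcessing(processedParts);
processedParts++;
// CompareExchange(x, 0, 0) acts as an atomic read: it never changes the value.
if (Interlocked::CompareExchange(m_StopMesh, 0, 0) != 0)
{
this->MakeCleanup();
MessageBox(NULL,aMessage,L"Test",NULL);
return; // stop processing once cancellation has been requested
}
}
}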
P.S.: Can you post the code that actually processes the data and checks the variable (I don't mean your full mesh calculation method, just its main stages and elements)?
EDIT: AT LEAST IT'S CLEAR WHAT THE SYSTEM IS ABOUT
It is possible that your child processes are just not exterminated quick enough. Read this SO thread about process killing.
P.S.: And edit your question to more clearly describe your system and problem. It is difficult to get the right answer to a wrong or incomplete question.
Try putting volatile before the field m_StopMesh:
volatile BOOL m_StopMesh;
I launched the c++ process using a thread and it worked fine.
If you want to communicate across process boundaries, you will need to use some sort of cross-process communication.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365574(v=vs.85).aspx
I find Named Pipes convenient and easy to use.
UPDATE
Your comment clarifies that the C++ code is running in-process.
I would suggest a ManualResetEvent. For a great overview of thread synchronization (and threads in general) check out http://www.albahari.com/threading/
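A minimal managed sketch of that suggestion, assuming the work can be split into steps that poll the event. StoppableWorker and its members are hypothetical names; for the C++ side, one option might be to pass the event's handle to native code and poll it with WaitForSingleObject using a zero timeout, but that part is not shown here.

```csharp
using System;
using System.Threading;

// Hypothetical worker that checks a ManualResetEvent between work steps.
public sealed class StoppableWorker
{
    private readonly ManualResetEvent _stop = new ManualResetEvent(false);
    private int _steps;

    public int CompletedSteps
    {
        get { return Volatile.Read(ref _steps); }
    }

    // Safe to call from any thread; the event stays set until Reset().
    public void RequestStop()
    {
        _stop.Set();
    }

    public void Run(int totalSteps)
    {
        for (int i = 0; i < totalSteps; i++)
        {
            // WaitOne(0) polls the event without blocking.
            if (_stop.WaitOne(0)) return;

            Thread.Sleep(10); // stands in for one unit of real work
            Interlocked.Increment(ref _steps);
        }
    }
}
```

Unlike a plain bool field, the event is a kernel synchronization object, so the question of whether a write becomes visible to the other thread does not arise.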

.NET Interop call is limited to single thread?

I have the following code that uses new .NET 4.5 multi-threading functionality.
Action2 is a call to a windows API library MLang through Interop.
BlockingCollection<int> _blockingCollection= new BlockingCollection<int>();
[Test]
public void Do2TasksWithThreading()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
var tasks = new List<Task>();
for (int i = 0 ; i < Environment.ProcessorCount; i++)
{
int taskIndex = i; // copy: the lambda must not capture the shared loop variable
tasks.Add(Task.Factory.StartNew(() => DoAction2UsingBlockingCollection(taskIndex)));
}
for (int i = 1; i < 11; i++)
{
DoAction1(i);
_blockingCollection.Add(i);
}
_blockingCollection.CompleteAdding();
Task.WaitAll(tasks.ToArray());
stopwatch.Stop();
Console.WriteLine("Total time: " + stopwatch.ElapsedMilliseconds + "ms");
}
private void DoAction2UsingBlockingCollection(int taskIndex)
{
WriteToConsole("Started wait for Action2 Task: " + taskIndex);
int index;
while (_blockingCollection.Count > 0 || !_blockingCollection.IsAddingCompleted)
{
if (_blockingCollection.TryTake(out index, 10))
DoAction2(index);
}
WriteToConsole("Ended wait for Action2 Task: " + taskIndex);
}
private void DoAction2()
{
... Load File bytes
//Call to MLang through interop
Encoding[] detected = EncodingTool.DetectInputCodepages(bytes[], 1);
... Save results in concurrent dictionary
}
I did some testing with this code, and increasing the number of threads from 1 to 2 to 3, etc. doesn't make the process run any faster. It looks like the threads are waiting for the interop call to finish, which makes me think that it is using a single thread for some reason.
Here is the definition of Interop method:
namespace MultiLanguage
{
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Security;
[ComImport, InterfaceType((short) 1), Guid("DCCFC164-2B38-11D2-B7EC-00C04F8F5D9A")]
public interface IMultiLanguage2
[MethodImpl(MethodImplOptions.InternalCall, MethodCodeType=MethodCodeType.Runtime)]
void DetectInputCodepage([In] MLDETECTCP flags, [In] uint dwPrefWinCodePage,
[In] ref byte pSrcStr, [In, Out] ref int pcSrcSize,
[In, Out] ref DetectEncodingInfo lpEncoding,
[In, Out] ref int pnScores);
Is there anything that can be done to make this use multiple threads? The only thing I noticed that would require a single thread is MethodImplOptions.Synchronized, but that's not being used in this case.
The code for EncodingTools.cs was taken from here:
http://www.codeproject.com/Articles/17201/Detect-Encoding-for-In-and-Outgoing-Text
... Load File bytes
Threads can speed up your program when your machine has multiple processor cores, which are easy to get these days. However, your program is liable to spend a good bit of its time in this invisible code: disk I/O is very slow compared to the raw processing speed of a modern processor. And you still have only a single disk, so there is no concurrency at all; the threads will just wait their turn to read data from the disk.
[ComImport, InterfaceType((short) 1), Guid("DCCFC164-2B38-11D2-B7EC-00C04F8F5D9A")]
public interface IMultiLanguage2
This is a COM interface, implemented by the CMultiLanguage coclass. You can find it in the registry with Regedit.exe: the HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\{275C23E2-3747-11D0-9FEA-00AA003F8646} key contains the configuration for this coclass. Threading is not a detail left up to the client programmer in COM; a COM coclass declares what kind of threading it supports with the ThreadingModel key.
The value for CMultiLanguage is "Both". Which is good news, but it now greatly matters exactly how you created the object. If the object is created on an STA thread, the default for the main thread in a Winforms or WPF project, then COM ensures all the code stays thread-safe by marshaling interface method calls from your worker thread to the STA thread. That will cause loss of concurrency, the threads take their turn entering the single-threaded apartment.
You can only get concurrency when the object was created on an MTA thread: the kind you get from a threadpool thread, or from your own Thread without a call to its SetApartmentState() method. An obvious approach to ensure this is to create the CMultiLanguage object on the worker thread itself and avoid having these worker threads share the same object.
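A rough sketch of the one-object-per-worker arrangement. Detector is a hypothetical stand-in for whatever wrapper creates the CMultiLanguage object; the real COM calls are omitted and its Detect method is a placeholder.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the COM wrapper. It records which thread
// created it and what apartment state that thread reported.
public sealed class Detector
{
    public int OwnerThreadId { get; } = Thread.CurrentThread.ManagedThreadId;
    public ApartmentState Apartment { get; } = Thread.CurrentThread.GetApartmentState();

    public int Detect(byte[] data)
    {
        return data.Length; // placeholder for the real detection call
    }
}

public static class PerWorkerDemo
{
    // Starts one task per worker; each task creates and uses its own
    // Detector, so no instance is shared between worker threads.
    public static Detector[] Run(int workers)
    {
        var created = new ConcurrentBag<Detector>();
        Task[] tasks = Enumerable.Range(0, workers).Select(n => Task.Run(() =>
        {
            var detector = new Detector(); // created on the worker thread itself
            created.Add(detector);
            detector.Detect(new byte[16]);
        })).ToArray();
        Task.WaitAll(tasks);
        return created.ToArray();
    }
}
```

Inspecting the Apartment property of the returned objects is one way to check which apartment the creating threads actually report on a given machine.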
Before you start fixing that, you first need to identify the bottleneck in the program. Focus on the file loading first and make sure you get a realistic measurement, avoid running your test program on the same set of files over and over again. That gives unrealistically good results since the file data will be read from the file system cache. Only the first test after a reboot or file system cache reset gives you a reliable measurement. The SysInternals' RamMap utility is very useful for this, use its Empty + Empty Standby List menu command before you start a test to be able to compare apples to apples.
If that shows that file loading is the bottleneck, then you are done; only improved hardware can solve that. If, however, you measure that the IMultiLanguage2 calls are the bottleneck, then focus on the usage of the CMultiLanguage object. There is no guarantee that you can get ahead, though: a COM server typically provides thread-safety by taking care of the locking for you, and such hidden locking can ruin your odds of getting concurrency. The only way to get ahead then is to get the file reading in one thread to overlap with the parsing in another.
Try running nunit-console with parameter /apartment=MTA

how do I handle messages asynchronously, throwing away any new messages while processing?

I have a C# app that subscribes to a topic on our messaging system for value updates. When a new value comes in, I do some processing and then carry on. The problem is, the updates can come faster than the app can process them. What I want to do is to just hold on to the latest value, so I don't want a queue. For example, the source publishes value "1" and my app receives it; while processing, the source publishes the sequence (2, 3, 4, 5) before my app is done processing; my app then processes value "5", with the prior values thrown away.
It's kind of hard to post a working code sample since it's based on proprietary messaging libraries, but I would think this is a common pattern, I just can't figure out what it's called...It seems like the processing function has to run on a separate thread than the messaging callback, but I'm not sure how to organize this, e.g. how that thread is notified of a value change. Any general tips on what I need to do?
A very simple way could be something like:
private IMessage _next;
public void ReceiveMessage(IMessage message)
{
Interlocked.Exchange(ref _next, message);
}
public void Process()
{
IMessage next = Interlocked.Exchange(ref _next, null);
if (next != null)
{
//...
}
}
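To address the "how is that thread notified" part, here is one possible extension of the snippet above: a sketch that pairs the Interlocked.Exchange slot with an AutoResetEvent and a dedicated worker thread. LatestOnlyProcessor and its members are hypothetical names, and the handler delegate stands in for the real processing code.

```csharp
using System;
using System.Threading;

// Hypothetical processor: keeps only the newest posted message and
// handles it on its own thread; older unprocessed messages are dropped.
public sealed class LatestOnlyProcessor<T> : IDisposable where T : class
{
    private readonly AutoResetEvent _signal = new AutoResetEvent(false);
    private readonly Action<T> _handler;
    private readonly Thread _worker;
    private T _next;
    private volatile bool _stopping;

    public LatestOnlyProcessor(Action<T> handler)
    {
        _handler = handler;
        _worker = new Thread(Loop) { IsBackground = true };
        _worker.Start();
    }

    // Called from the messaging callback: overwrite the slot, wake the worker.
    public void Post(T message)
    {
        Interlocked.Exchange(ref _next, message);
        _signal.Set();
    }

    private void Loop()
    {
        while (true)
        {
            // Atomically take whatever is in the slot and leave it empty.
            T message = Interlocked.Exchange(ref _next, null);
            if (message != null)
            {
                _handler(message);
                continue; // check for a newer message before sleeping
            }
            if (_stopping) return;
            _signal.WaitOne(); // sleep until Post or Dispose signals
        }
    }

    // Lets the worker finish the pending message, then stops it.
    public void Dispose()
    {
        _stopping = true;
        _signal.Set();
        _worker.Join();
        _signal.Dispose();
    }
}
```

With this arrangement the messaging callback never blocks on processing; it only swaps a reference and sets an event.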
Generally speaking one uses a messaging system to prevent losing messages. My initial reaction for a solution would be a thread to receive the inbound data which tries to pass it to your processing thread, if the processing thread is already running then you drop the data and wait for the next element and repeat.
Obviously the design of the messaging library can influence the best way to handle this problem. How I've done it in the past with somewhat similar functioning libraries, is I have a thread that listens for events, and places them into a Queue, and then I have Threadpool workers that dequeue the messages and process them.
You can read up on multithreaded asynchronous job queues:
Mutlithreaded Job Queue
Work Queue Threading
A simple way is to use a member variable to hold the last value received, and wrap it with a lock. Another way is to push incoming values onto a stack. When you're ready for a new value, call Stack.Pop() and then Stack.Clear():
public static class Incoming
{
private static object locker = new object();
private static object lastMessage = null;
public static object GetMessage()
{
lock (locker)
{
object tempMessage = lastMessage;
lastMessage = null;
return tempMessage;
}
}
public static void SetMessage(object messageArg)
{
lock (locker)
{
lastMessage = messageArg;
}
}
private static Stack<object> messageStack = new Stack<object>();
public static object GetMessageStack()
{
lock (locker)
{
object tempMessage = messageStack.Count > 0 ? messageStack.Pop() : null;
messageStack.Clear();
return tempMessage;
}
}
public static void SetMessageStack(object messageArg)
{
lock (locker)
{
messageStack.Push(messageArg);
}
}
}
Putting the processing functions on a separate thread is a good idea. Either use a callback method from the processing thread to signal that it's ready for another message, or have it signal that it's done and then have the main thread start a new processor thread when a message is received (via the above SetMessage...).
This is not a "pattern", but you could use a shared data structure to hold the value. If there is only one value received from the messaging library, then a simple object would do. Otherwise you might be able to use a hashtable to store multiple message values (if required).
For example, on the message receive thread: when a message comes in, add/update the data structure with its value. On the thread side, you could periodically check this data structure to make sure you still have the same value. If you do not, then discard any processing you have already done and re-process with the new value.
Of course, you will need to ensure the data structure is properly synchronized between threads.

C# program (process) will not unload

I have a C# program that uses a class from another assembly, and this class calls an unmanaged DLL to do some processing. Here is a snippet:
public class Util
{
const string dllName = "unmanaged.dll";
[DllImport(dllName, EntryPoint = "ExFunc")]
unsafe static extern bool ExFunc(StringBuilder path, uint field);
public bool Func(string path, uint field)
{
return ExFunc(new StringBuilder(path), field);
}
...
}
Util util = new Util();
bool val = util.Func("/path/to/something/", 1);
The problem I'm having is that if I call "Func" my main C# program will not unload. When I call Close() inside my main form the process will still be there if I look in Task Manager. If I remove the call to "Func" the program unloads fine. I have done some testing and the programs Main function definitely returns so I'm not sure what's going on here.
It looks like your unmanaged library is spawning a thread for asynchronous processing.
Odds are it supports a cancel function of some sort; I suggest that you attempt to call that at program shutdown. If your program is just completing before the asynchronous call happens to complete, look for a "wait for completion" function and call that before returning from your "Func" method.
It might dispatch a non background thread that is not letting go when your main application closes. Can't say for sure without seeing the code but that is what I would assume.
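For the managed-thread case, the difference can be sketched as follows. BackgroundThreadDemo is a hypothetical name, and whether unmanaged.dll does anything like this is unknown; the sketch only shows how a foreground thread differs from a background one.

```csharp
using System;
using System.Threading;

public static class BackgroundThreadDemo
{
    // Starts a thread that never finishes on its own.
    public static Thread StartWorker(bool background)
    {
        var thread = new Thread(() => Thread.Sleep(Timeout.Infinite))
        {
            // Foreground threads (the default, IsBackground = false) keep the
            // process alive after Main returns; background threads are
            // stopped automatically when the last foreground thread ends.
            IsBackground = background
        };
        thread.Start();
        return thread;
    }
}
```

Calling StartWorker(false) from Main and then returning would leave the process visible in Task Manager, which matches the symptom described in the question.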
It's probably less than ideal, but if you need a workaround you could probably use:
System.Diagnostics.Process.GetCurrentProcess().Kill();
This will end your app at the process level and kill all threads that are spawned through the process.
Do you have the source code for unmanaged.dll? It must be doing something: either starting another thread and not exiting, or blocking in its DllMain, etc.
